PDF Tools

Feature Information

Feature ID: FEAT-033
Created: 2026-03-01
Last Updated: 2026-03-01
Status: Draft
Priority: P1 (Should Have)
Owner: TBD
Related RFC: RFC-033

User Story

As an AI agent using OneClaw, I want tools that read PDF files by extracting metadata, extracting text content, and rendering pages as images, so that I can help users understand, summarize, and analyze PDF documents on their device.

Typical Scenarios

A user shares a research paper PDF and asks the agent to summarize it. The agent calls pdf_info to check the page count, then calls pdf_extract_text to get the text content, and produces a summary.
A user has a scanned invoice PDF with no text layer. The agent calls pdf_extract_text, gets the “no text content” response, then calls pdf_render_page to render the page as an image for visual inspection.
A user asks the agent to extract specific pages from a long PDF report (e.g., “summarize pages 10-15”). The agent uses pdf_extract_text with the pages parameter to extract only the requested range.
A user wants to know the author and creation date of a document. The agent calls pdf_info to retrieve the document metadata.

Feature Description

Overview

FEAT-033 adds three PDF-related tools to OneClaw as Kotlin built-in tools, enabling AI agents to work with PDF files stored on the device. The tools are ported from the OneClaw 1.0 lib-pdf plugin and adapted to OneClaw’s tool architecture.

The three tools are:

pdf_info – Retrieve PDF metadata (page count, file size, title, author, etc.)
pdf_extract_text – Extract text content from PDF files with optional page range selection
pdf_render_page – Render a PDF page to a PNG image for visual inspection

Architecture Overview

AI Model
    | tool call: pdf_info(path="...") / pdf_extract_text(...) / pdf_render_page(...)
    v
ToolExecutionEngine  (unchanged)
    |
    v
ToolRegistry
    |
    v
PdfInfoTool / PdfExtractTextTool / PdfRenderPageTool  [NEW - Kotlin built-in tools]
    |
    +-- PDFBox Android (text extraction, metadata reading)
    |
    +-- Android PdfRenderer (page rendering to bitmap)
    |
    +-- PdfToolUtils [NEW - shared path resolution, page range parsing]

Tool Definitions

pdf_info

Field	Value
Name	`pdf_info`
Description	Get metadata and info about a PDF file
Parameters	`path` (string, required): Path to the PDF file
Required Permissions	`READ_EXTERNAL_STORAGE`
Timeout	15 seconds
Returns	Text with page count, file size, title, author, and other document properties

pdf_extract_text

Field	Value
Name	`pdf_extract_text`
Description	Extract text content from a PDF file
Parameters	`path` (string, required): Path to the PDF file
	`pages` (string, optional): Page range (e.g., “1-5”, “3”, “1,3,5-7”). Omit for all pages
	`max_chars` (integer, optional): Maximum characters to return. Default: 50000
Required Permissions	`READ_EXTERNAL_STORAGE`
Timeout	30 seconds
Returns	Extracted text with header showing filename, page range, and total page count
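
The pages parameter accepts single pages, ranges, and comma-separated combinations. A minimal sketch of how PdfToolUtils might parse it is shown here; the function name parsePageRange, the tolerance for spaces around the hyphen, and the de-duplicated sorted output are assumptions for illustration, not confirmed behavior.

```kotlin
/**
 * Parses a page spec such as "1-5", "3", or "1,3,5-7" into sorted,
 * de-duplicated 1-based page numbers.
 * Hypothetical helper: the name and exact behavior are assumptions.
 */
fun parsePageRange(spec: String, pageCount: Int): List<Int> {
    fun invalid(): Nothing = throw IllegalArgumentException("Invalid page range: $spec")

    val pages = sortedSetOf<Int>()
    for (part in spec.split(",")) {
        // "5-7" -> ["5", "7"], "3" -> ["3"]; surrounding spaces are tolerated (assumption).
        val bounds = part.split("-").map { it.trim() }
        if (bounds.size > 2) invalid()
        val start = bounds[0].toIntOrNull() ?: invalid()
        val end = if (bounds.size == 2) (bounds[1].toIntOrNull() ?: invalid()) else start
        // Rejects non-positive pages and reversed ranges such as "5-2".
        if (start < 1 || start > end) invalid()
        if (end > pageCount) {
            throw IllegalArgumentException("Page $end out of range (document has $pageCount pages)")
        }
        pages.addAll(start..end)
    }
    return pages.toList()
}
```

For a 10-page document, parsePageRange("1,3,5-7", 10) yields pages 1, 3, 5, 6, and 7, while "abc" and "5-2" raise an exception that a tool could map to the invalid_page_range error.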

pdf_render_page

Field	Value
Name	`pdf_render_page`
Description	Render a PDF page to a PNG image
Parameters	`path` (string, required): Path to the PDF file
	`page` (integer, required): Page number to render (1-based)
	`dpi` (integer, optional): Render resolution in DPI (default 150, range 72-300)
Required Permissions	`READ_EXTERNAL_STORAGE`
Timeout	30 seconds
Returns	Text with output file path, resolution, and file size
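
The dpi parameter determines the bitmap size. Android's PdfRenderer reports page dimensions in points (1/72 inch), so the pixel size is the point size scaled by dpi / 72. The sketch below illustrates that arithmetic together with the documented default and clamping range; the function and constant names are assumptions.

```kotlin
import kotlin.math.roundToInt

const val DEFAULT_DPI = 150
const val MIN_DPI = 72
const val MAX_DPI = 300
const val POINTS_PER_INCH = 72.0

/** Applies the default when dpi is omitted and clamps to the 72-300 range. */
fun clampDpi(requested: Int?): Int = (requested ?: DEFAULT_DPI).coerceIn(MIN_DPI, MAX_DPI)

/** Converts a page size in PDF points (1/72 inch) to a bitmap size in pixels. */
fun bitmapSize(widthPt: Int, heightPt: Int, dpi: Int): Pair<Int, Int> {
    val scale = dpi / POINTS_PER_INCH
    return Pair((widthPt * scale).roundToInt(), (heightPt * scale).roundToInt())
}
```

A US Letter page (612 x 792 points) at the default 150 DPI comes out at 1275 x 1650 pixels, and at the 300 DPI ceiling at 2550 x 3300 pixels, which is why the upper clamp matters for memory use.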

User Interaction Flow

User: "Help me summarize this PDF"
   (User has shared a PDF via file attachments or the file is in accessible storage)
AI calls pdf_info(path="/sdcard/Documents/report.pdf")
pdf_info returns: "Pages: 42, Title: Annual Report 2025, ..."
AI calls pdf_extract_text(path="/sdcard/Documents/report.pdf", max_chars=80000)
pdf_extract_text returns: extracted text content
AI produces a summary based on the extracted text

User: "What's on page 3 of this scanned PDF?"
AI calls pdf_extract_text(path="...", pages="3")
Returns: "No text content found. This may be a scanned document."
AI calls pdf_render_page(path="...", page=3, dpi=200)
Returns: "Page 3 rendered and saved to: pdf-renders/document-page3.png"
AI uses the rendered image for visual analysis

Acceptance Criteria

Must pass (all required):

Optional (nice to have):

pdf_extract_text supports extracting text from password-protected PDFs (with a password parameter)
pdf_render_page supports rendering multiple pages at once

UI/UX Requirements

This feature has no new UI. The tools are registered in the tool system and available to AI agents:

Tool names appear in the tool management screen (FEAT-017)
Tool call results are displayed in the chat view (FEAT-001)

Feature Boundary

Included

Three Kotlin built-in tools: PdfInfoTool, PdfExtractTextTool, PdfRenderPageTool
Shared utility class: PdfToolUtils (path resolution, page range parsing)
PDFBox Android dependency addition to build.gradle.kts
PDFBox initialization in OneclawApplication or Koin module
Update to ToolModule to register the three tools
PNG output directory management for rendered pages

Not Included (V1)

PDF creation or editing
PDF annotation
OCR for scanned PDFs (rendered images can be processed by vision-capable models)
PDF form filling
PDF signing or encryption
Password-protected PDF support
PDF-to-Markdown conversion (beyond raw text extraction)

Business Rules

All three tools require a valid file path to an existing PDF file
pdf_extract_text defaults to all pages if the pages parameter is omitted
pdf_extract_text defaults to a 50,000-character limit if max_chars is not specified
pdf_render_page uses 1-based page numbering (page 1 is the first page)
pdf_render_page DPI is clamped to the range 72-300 regardless of input
Rendered PNG files are saved to the app’s internal storage (filesDir/pdf-renders/)
Rendered PNG filenames follow the pattern {basename}-page{N}.png
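
Two of the rules above can be sketched as small pure functions: the output location for rendered pages and the max_chars limit. The function names and the wording of the truncation notice are assumptions; only the directory, filename pattern, and default limit come from the rules.

```kotlin
import java.io.File

const val DEFAULT_MAX_CHARS = 50_000

/** Builds filesDir/pdf-renders/{basename}-page{N}.png for a given source PDF. */
fun renderOutputFile(filesDir: File, pdfPath: String, page: Int): File {
    val baseName = File(pdfPath).nameWithoutExtension
    val outputDir = File(filesDir, "pdf-renders")
    return File(outputDir, "$baseName-page$page.png")
}

/** Truncates extracted text to maxChars; the notice text is a placeholder. */
fun truncateText(text: String, maxChars: Int = DEFAULT_MAX_CHARS): String {
    val limit = maxChars.coerceAtLeast(0) // treat negative limits as 0 (assumption)
    if (text.length <= limit) return text
    return text.take(limit) + "\n[Truncated at $limit characters]"
}
```

A caller would still need to create the pdf-renders directory (for example with mkdirs()) before writing the PNG; the sketch only computes the path.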

Non-Functional Requirements

Performance

pdf_info: < 500ms for typical PDFs (< 50MB)
pdf_extract_text: < 2s for typical PDFs (< 100 pages)
pdf_render_page: < 3s per page at 150 DPI
Memory: PDDocument objects are closed immediately after use to prevent leaks
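
One way to guarantee the close-after-use rule is Kotlin's use extension, which closes a Closeable even when the block throws. The sketch below uses a stand-in FakeDocument class so it runs without PDFBox; PDDocument also implements Closeable, so the same helper would presumably apply to it. The helper name withDocument is hypothetical.

```kotlin
import java.io.Closeable

// Stand-in for PDDocument so the sketch is self-contained (assumption).
class FakeDocument(val numberOfPages: Int) : Closeable {
    var closed = false
        private set

    override fun close() {
        closed = true
    }
}

/** Opens a document, runs the block, and closes the document in all cases. */
fun <D : Closeable, T> withDocument(open: () -> D, block: (D) -> T): T =
    open().use(block)
```

With PDFBox, the open lambda would be something like { PDDocument.load(file) }, and the block would read the page count or extract text before the document is released.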

Security

File path validation prevents access to restricted system directories
PDF rendering output stays within the app’s internal storage
No network access – all operations are local file operations
PDFBox parses document structure only and does not execute embedded scripts (e.g., JavaScript actions), which limits exposure to malicious PDFs
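
The path validation rule could be sketched as a canonical-path check against a deny-list. The spec does not enumerate the restricted directories, so the list below is purely an assumption, as is the function name; canonicalization is used so that ".." segments and symlinks cannot bypass the check.

```kotlin
import java.io.File

// Assumed deny-list; the actual restricted directories are not specified.
val RESTRICTED_PREFIXES = listOf("/system", "/proc", "/sys", "/dev")

/** Returns false when the resolved path falls inside a restricted directory. */
fun isPathAllowed(path: String): Boolean {
    // canonicalPath collapses "." and ".." and resolves symlinks where they exist.
    val canonical = File(path).canonicalPath
    return RESTRICTED_PREFIXES.none { prefix ->
        canonical == prefix || canonical.startsWith("$prefix/")
    }
}
```

Matching on prefix plus a trailing slash avoids rejecting unrelated paths that merely share a prefix, such as /systematic-review.pdf.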

Compatibility

Requires Android API 21+ (PdfRenderer was added in API 21)
PDFBox Android is compatible with all supported Android versions
Works with both text-based and scanned PDFs (scanned PDFs fall back gracefully to page rendering)

Dependencies

Depends On

FEAT-004 (Tool System): Tool interface, registry, execution engine
FEAT-025 (File Browsing): File system access patterns
FEAT-026 (File Attachments): Users can share PDF files to the app

Depended On By

No other features currently depend on FEAT-033

External Dependencies

PDFBox Android (com.tom-roush:pdfbox-android:2.0.27.0): PDF text extraction and metadata reading. Apache 2.0 license.
Android PdfRenderer: Built into Android framework (API 21+). Used for page-to-bitmap rendering.

Error Handling

Error Scenarios

File not found
- Cause: Path points to a nonexistent file
- Handling: Return ToolResult.error("file_not_found", "File not found: <path>")
Not a valid PDF
- Cause: File exists but is not a valid PDF document
- Handling: Return ToolResult.error("invalid_pdf", "Failed to read PDF: <message>")
Page out of range
- Cause: Requested page number is less than 1 or exceeds the document page count
- Handling: Return ToolResult.error("invalid_page", "Page N out of range (document has M pages)")
Invalid page range
- Cause: Malformed page range string (e.g., “abc”, “5-2”)
- Handling: Return ToolResult.error("invalid_page_range", "Invalid page range: <spec>")
Permission denied
- Cause: App lacks READ_EXTERNAL_STORAGE permission
- Handling: ToolExecutionEngine requests permission before execution
Out of memory
- Cause: Very large PDF or high DPI rendering
- Handling: Caught by the exception handler; returns an error suggesting a lower DPI
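
The error scenarios above can be sketched as small validation helpers. ToolResult here is a hypothetical stand-in modeled on the ToolResult.error(code, message) calls listed above, not the actual OneClaw class, and the "out_of_memory" code is an assumption because the spec does not name one.

```kotlin
import java.io.File

// Hypothetical stand-in for OneClaw's ToolResult type.
sealed class ToolResult {
    data class Success(val text: String) : ToolResult()
    data class Error(val code: String, val message: String) : ToolResult()

    companion object {
        fun error(code: String, message: String): ToolResult = Error(code, message)
    }
}

/** Returns an error result if the file is missing, or null if it is fine. */
fun checkFile(file: File): ToolResult? =
    if (file.isFile) null
    else ToolResult.error("file_not_found", "File not found: ${file.path}")

/** Returns an error result if the 1-based page is outside the document. */
fun checkPage(page: Int, pageCount: Int): ToolResult? =
    if (page in 1..pageCount) null
    else ToolResult.error("invalid_page", "Page $page out of range (document has $pageCount pages)")

/** Converts an OutOfMemoryError during rendering into an error result. */
inline fun renderGuarded(render: () -> ToolResult): ToolResult =
    try {
        render()
    } catch (e: OutOfMemoryError) {
        // Error code is an assumption; the spec only requires suggesting a lower DPI.
        ToolResult.error("out_of_memory", "Not enough memory to render; try a lower DPI")
    }
```

Returning null for the passing case lets a tool chain the checks with the elvis operator and return the first failure before opening the document.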

Test Points

Functional Tests

Verify pdf_info returns correct page count for a known PDF
Verify pdf_info returns available metadata fields (title, author, etc.)
Verify pdf_extract_text extracts correct text from a text-based PDF
Verify pdf_extract_text with pages="1-3" extracts only pages 1-3
Verify pdf_extract_text with pages="2" extracts only page 2
Verify pdf_extract_text with pages="1,3,5-7" extracts the specified pages
Verify pdf_extract_text truncates at max_chars with truncation notice
Verify pdf_extract_text returns scanned-document message for image-only PDFs
Verify pdf_render_page creates a PNG file at the expected path
Verify pdf_render_page respects DPI parameter
Verify pdf_render_page clamps DPI to 72-300 range

Edge Cases

PDF with 0 pages (corrupted)
PDF with 1000+ pages
PDF with no metadata fields
PDF larger than 100MB
PDF with complex Unicode text (CJK, Arabic, etc.)
Page range “1-1” (single-page range)
Page range with spaces (“1 - 5”)
max_chars set to 0
DPI set to 0 or negative
DPI set to 1000 (should clamp to 300)
File path with spaces and special characters

Change History

Date	Version	Changes	Owner
2026-03-01	0.1	Initial version	-