PDF Tools
PDF Tools
Feature Information
- Feature ID: FEAT-033
- Created: 2026-03-01
- Last Updated: 2026-03-01
- Status: Draft
- Priority: P1 (Should Have)
- Owner: TBD
- Related RFC: RFC-033
User Story
As an AI agent using OneClaw, I want tools to read PDF files – extracting metadata, text content, and rendering pages as images, so that I can help users understand, summarize, and analyze PDF documents on their device.
Typical Scenarios
- A user shares a research paper PDF and asks the agent to summarize it. The agent calls
pdf_infoto check the page count, then callspdf_extract_textto get the text content, and produces a summary. - A user has a scanned invoice PDF with no text layer. The agent calls
pdf_extract_text, gets the “no text content” response, then callspdf_render_pageto render the page as an image for visual inspection. - A user asks the agent to extract specific pages from a long PDF report (e.g., “summarize pages 10-15”). The agent uses
pdf_extract_textwith thepagesparameter to extract only the requested range. - A user wants to know the author and creation date of a document. The agent calls
pdf_infoto retrieve the document metadata.
Feature Description
Overview
FEAT-033 adds three PDF-related tools to OneClaw as Kotlin built-in tools, enabling AI agents to work with PDF files stored on the device. The tools are ported from the OneClaw 1.0 lib-pdf plugin and adapted to OneClaw’s tool architecture.
The three tools are:
pdf_info– Retrieve PDF metadata (page count, file size, title, author, etc.)pdf_extract_text– Extract text content from PDF files with optional page range selectionpdf_render_page– Render a PDF page to a PNG image for visual inspection
Architecture Overview
AI Model
| tool call: pdf_info(path="...") / pdf_extract_text(...) / pdf_render_page(...)
v
ToolExecutionEngine (unchanged)
|
v
ToolRegistry
|
v
PdfInfoTool / PdfExtractTextTool / PdfRenderPageTool [NEW - Kotlin built-in tools]
|
+-- PDFBox Android (text extraction, metadata reading)
|
+-- Android PdfRenderer (page rendering to bitmap)
|
+-- PdfToolUtils [NEW - shared path resolution, page range parsing]
Tool Definitions
pdf_info
| Field | Value |
|---|---|
| Name | pdf_info |
| Description | Get metadata and info about a PDF file |
| Parameters | path (string, required): Path to the PDF file |
| Required Permissions | READ_EXTERNAL_STORAGE |
| Timeout | 15 seconds |
| Returns | Text with page count, file size, title, author, and other document properties |
pdf_extract_text
| Field | Value |
|---|---|
| Name | pdf_extract_text |
| Description | Extract text content from a PDF file |
| Parameters | path (string, required): Path to the PDF file |
pages (string, optional): Page range (e.g., “1-5”, “3”, “1,3,5-7”). Omit for all pages |
|
max_chars (integer, optional): Maximum characters to return. Default: 50000 |
|
| Required Permissions | READ_EXTERNAL_STORAGE |
| Timeout | 30 seconds |
| Returns | Extracted text with header showing filename, page range, and total page count |
pdf_render_page
| Field | Value |
|---|---|
| Name | pdf_render_page |
| Description | Render a PDF page to a PNG image |
| Parameters | path (string, required): Path to the PDF file |
page (integer, required): Page number to render (1-based) |
|
dpi (integer, optional): Render resolution in DPI (default 150, range 72-300) |
|
| Required Permissions | READ_EXTERNAL_STORAGE |
| Timeout | 30 seconds |
| Returns | Text with output file path, resolution, and file size |
User Interaction Flow
1. User: "Help me summarize this PDF"
(User has shared a PDF via file attachments or the file is in accessible storage)
2. AI calls pdf_info(path="/sdcard/Documents/report.pdf")
3. pdf_info returns: "Pages: 42, Title: Annual Report 2025, ..."
4. AI calls pdf_extract_text(path="/sdcard/Documents/report.pdf", max_chars=80000)
5. pdf_extract_text returns: extracted text content
6. AI produces a summary based on the extracted text
1. User: "What's on page 3 of this scanned PDF?"
2. AI calls pdf_extract_text(path="...", pages="3")
3. Returns: "No text content found. This may be a scanned document."
4. AI calls pdf_render_page(path="...", page=3, dpi=200)
5. Returns: "Page 3 rendered and saved to: pdf-renders/document-page3.png"
6. AI uses the rendered image for visual analysis
Acceptance Criteria
Must pass (all required):
- TEST-033-01:
pdf_infois registered as a Kotlin built-in tool inToolRegistry - TEST-033-02:
pdf_extract_textis registered as a Kotlin built-in tool inToolRegistry - TEST-033-03:
pdf_render_pageis registered as a Kotlin built-in tool inToolRegistry - TEST-033-04:
pdf_inforeturns page count, file size, and available metadata fields - TEST-033-05:
pdf_extract_textreturns text content from a text-based PDF - TEST-033-06:
pdf_extract_textsupports page range selection (“1-5”, “3”, “1,3,5-7”) - TEST-033-07:
pdf_extract_textreturns a helpful message for scanned PDFs with no text layer - TEST-033-08:
pdf_extract_texttruncates output atmax_charswith a truncation notice - TEST-033-09:
pdf_render_pagerenders the specified page as a PNG image - TEST-033-10:
pdf_render_pagesaves the output to the app’s internalpdf-renders/directory - TEST-033-11:
pdf_render_pagerespects thedpiparameter (clamped to 72-300) - TEST-033-12: All three tools return appropriate errors for missing files, invalid parameters, and out-of-range pages
- TEST-033-13: All Layer 1A tests pass
Optional (nice to have):
pdf_extract_textsupports extracting text from password-protected PDFs (with apasswordparameter)pdf_render_pagesupports rendering multiple pages at once
UI/UX Requirements
This feature has no new UI. The tools are registered in the tool system and available to AI agents:
- Tool names appear in the tool management screen (FEAT-017)
- Tool call results are displayed in the chat view (FEAT-001)
Feature Boundary
Included
- Three Kotlin built-in tools:
PdfInfoTool,PdfExtractTextTool,PdfRenderPageTool - Shared utility class:
PdfToolUtils(path resolution, page range parsing) - PDFBox Android dependency addition to
build.gradle.kts - PDFBox initialization in
OneclawApplicationor Koin module - Update to
ToolModuleto register the three tools - PNG output directory management for rendered pages
Not Included (V1)
- PDF creation or editing
- PDF annotation
- OCR for scanned PDFs (rendered images can be processed by vision-capable models)
- PDF form filling
- PDF signing or encryption
- Password-protected PDF support
- PDF-to-Markdown conversion (beyond raw text extraction)
Business Rules
- All three tools require a valid file path to an existing PDF file
pdf_extract_textdefaults to all pages ifpagesparameter is omittedpdf_extract_textdefaults to 50,000 character limit ifmax_charsis not specifiedpdf_render_pageuses 1-based page numbering (page 1 is the first page)pdf_render_pageDPI is clamped to the range 72-300 regardless of input- Rendered PNG files are saved to the app’s internal storage (
filesDir/pdf-renders/) - Rendered PNG filenames follow the pattern
{basename}-page{N}.png
Non-Functional Requirements
Performance
pdf_info: < 500ms for typical PDFs (< 50MB)pdf_extract_text: < 2s for typical PDFs (< 100 pages)pdf_render_page: < 3s per page at 150 DPI- Memory: PDDocument objects are closed immediately after use to prevent leaks
Security
- File path validation prevents access to restricted system directories
- PDF rendering output stays within the app’s internal storage
- No network access – all operations are local file operations
- PDFBox parser is safe against malicious PDFs (no script execution)
Compatibility
- Requires Android API 21+ (PdfRenderer requires API 21)
- PDFBox Android is compatible with all supported Android versions
- Works with both text-based and scanned PDFs (graceful fallback for scanned)
Dependencies
Depends On
- FEAT-004 (Tool System): Tool interface, registry, execution engine
- FEAT-025 (File Browsing): File system access patterns
- FEAT-026 (File Attachments): Users can share PDF files to the app
Depended On By
- No other features currently depend on FEAT-033
External Dependencies
- PDFBox Android (
com.tom-roush:pdfbox-android:2.0.27.0): PDF text extraction and metadata reading. Apache 2.0 license. - Android PdfRenderer: Built into Android framework (API 21+). Used for page-to-bitmap rendering.
Error Handling
Error Scenarios
- File not found
- Cause: Path points to a nonexistent file
- Handling: Return
ToolResult.error("file_not_found", "File not found: <path>")
- Not a valid PDF
- Cause: File exists but is not a valid PDF document
- Handling: Return
ToolResult.error("invalid_pdf", "Failed to read PDF: <message>")
- Page out of range
- Cause: Requested page number exceeds document page count
- Handling: Return
ToolResult.error("invalid_page", "Page N out of range (document has M pages)")
- Invalid page range
- Cause: Malformed page range string (e.g., “abc”, “5-2”)
- Handling: Return
ToolResult.error("invalid_page_range", "Invalid page range: <spec>")
- Permission denied
- Cause: App lacks READ_EXTERNAL_STORAGE permission
- Handling: ToolExecutionEngine requests permission before execution
- Out of memory
- Cause: Very large PDF or high DPI rendering
- Handling: Caught by exception handler; returns error with suggestion to use lower DPI
Test Points
Functional Tests
- Verify
pdf_inforeturns correct page count for a known PDF - Verify
pdf_inforeturns available metadata fields (title, author, etc.) - Verify
pdf_extract_textextracts correct text from a text-based PDF - Verify
pdf_extract_textwithpages="1-3"extracts only pages 1-3 - Verify
pdf_extract_textwithpages="2"extracts only page 2 - Verify
pdf_extract_textwithpages="1,3,5-7"extracts the specified pages - Verify
pdf_extract_texttruncates atmax_charswith truncation notice - Verify
pdf_extract_textreturns scanned-document message for image-only PDFs - Verify
pdf_render_pagecreates a PNG file at the expected path - Verify
pdf_render_pagerespects DPI parameter - Verify
pdf_render_pageclamps DPI to 72-300 range
Edge Cases
- PDF with 0 pages (corrupted)
- PDF with 1000+ pages
- PDF with no metadata fields
- PDF larger than 100MB
- PDF with complex Unicode text (CJK, Arabic, etc.)
- Page range “1-1” (single page range)
- Page range with spaces “1 - 5”
max_charsset to 0- DPI set to 0 or negative
- DPI set to 1000 (should clamp to 300)
- File path with spaces and special characters
Change History
| Date | Version | Changes | Owner |
|---|---|---|---|
| 2026-03-01 | 0.1 | Initial version | - |