RFC-033: PDF Tools

Document Information

RFC ID: RFC-033
Related PRD: FEAT-033 (PDF Tools)
Related Architecture: RFC-000 (Overall Architecture)
Related RFC: RFC-004 (Tool System)
Created: 2026-03-01
Last Updated: 2026-03-01
Status: Draft
Author: TBD

Overview

Background

AI agents frequently need to work with PDF documents – summarizing reports, extracting data from invoices, reading research papers, or analyzing scanned forms. Currently, OneClaw has no built-in capability to read PDF files. Users can attach files (FEAT-026) and browse the file system (FEAT-025), but agents cannot extract content from PDFs.

OneClaw 1.0 has a mature PDF tools implementation in lib-pdf that provides three tools: pdf_info, pdf_extract_text, and pdf_render_page. This RFC ports that functionality to OneClaw’s tool architecture as Kotlin built-in tools, adapting the code to use OneClaw’s Tool interface, ToolResult, and ToolDefinition data types.

Goals

Implement three Kotlin built-in tools: PdfInfoTool, PdfExtractTextTool, PdfRenderPageTool
Create shared utility PdfToolUtils for PDFBox initialization and page range parsing
Add PDFBox Android dependency to the project
Initialize PDFBox in the application startup
Register all three tools in ToolModule
Add unit tests for all tools and utilities

Non-Goals

PDF creation, editing, or annotation
OCR for scanned PDFs (vision-capable models can analyze rendered images)
Password-protected PDF support (deferred to future iteration)
PDF form interaction
PDF-to-Markdown conversion (beyond raw text extraction)

Technical Design

Architecture Overview

┌──────────────────────────────────────────────────────────────┐
│                     Chat Layer (RFC-001)                      │
│  SendMessageUseCase                                          │
│       │                                                      │
│       │  tool call: pdf_info / pdf_extract_text /            │
│       │             pdf_render_page                           │
│       v                                                      │
├──────────────────────────────────────────────────────────────┤
│                   Tool Execution Engine (RFC-004)             │
│  executeTool(name, params, availableToolIds)                 │
│       │                                                      │
│       v                                                      │
│  ┌────────────────────────────────────────────────────────┐  │
│  │                    ToolRegistry                         │  │
│  │                                                        │  │
│  │  ┌─────────────┐  ┌──────────────────┐  ┌───────────┐ │  │
│  │  │  pdf_info    │  │ pdf_extract_text │  │pdf_render │ │  │
│  │  │(PdfInfoTool) │  │(PdfExtractText  │  │  _page    │ │  │
│  │  │             │  │        Tool)     │  │(PdfRender │ │  │
│  │  │             │  │                  │  │ PageTool) │ │  │
│  │  └──────┬──────┘  └────────┬─────────┘  └─────┬─────┘ │  │
│  │         │                  │                   │       │  │
│  │         v                  v                   v       │  │
│  │  ┌─────────────────────────────────────────────────┐   │  │
│  │  │                  PdfToolUtils                    │   │  │
│  │  │  - initPdfBox(context)                          │   │  │
│  │  │  - parsePageRange(spec, totalPages)             │   │  │
│  │  └─────────────────────────────────────────────────┘   │  │
│  │         │                  │                   │       │  │
│  │         v                  v                   v       │  │
│  │  ┌──────────────┐  ┌──────────────┐  ┌─────────────┐  │  │
│  │  │ PDFBox       │  │ PDFBox       │  │ Android     │  │  │
│  │  │ PDDocument   │  │ PDFText      │  │ PdfRenderer │  │  │
│  │  │ .docInfo     │  │ Stripper     │  │ + Bitmap    │  │  │
│  │  └──────────────┘  └──────────────┘  └─────────────┘  │  │
│  └────────────────────────────────────────────────────────┘  │
└──────────────────────────────────────────────────────────────┘

Core Components

New:

PdfInfoTool – Kotlin built-in tool that reads PDF metadata
PdfExtractTextTool – Kotlin built-in tool that extracts text from PDFs
PdfRenderPageTool – Kotlin built-in tool that renders PDF pages as PNG images
PdfToolUtils – Shared utility for PDFBox initialization and page range parsing

Modified:

ToolModule – Register the three PDF tools
build.gradle.kts – Add PDFBox Android dependency

Detailed Design

Directory Structure (New & Changed Files)

app/src/main/
├── kotlin/com/oneclaw/shadow/
│   ├── tool/
│   │   ├── builtin/
│   │   │   ├── PdfInfoTool.kt              # NEW
│   │   │   ├── PdfExtractTextTool.kt       # NEW
│   │   │   ├── PdfRenderPageTool.kt        # NEW
│   │   │   ├── WebfetchTool.kt             # unchanged
│   │   │   ├── BrowserTool.kt              # unchanged
│   │   │   ├── LoadSkillTool.kt            # unchanged
│   │   │   ├── CreateScheduledTaskTool.kt  # unchanged
│   │   │   └── CreateAgentTool.kt          # unchanged
│   │   └── util/
│   │       ├── PdfToolUtils.kt             # NEW
│   │       └── HtmlToMarkdownConverter.kt  # unchanged
│   └── di/
│       └── ToolModule.kt                   # MODIFIED

app/src/test/kotlin/com/oneclaw/shadow/
    └── tool/
        ├── builtin/
        │   ├── PdfInfoToolTest.kt           # NEW
        │   ├── PdfExtractTextToolTest.kt    # NEW
        │   └── PdfRenderPageToolTest.kt     # NEW
        └── util/
            └── PdfToolUtilsTest.kt          # NEW

PdfToolUtils

/**
 * Located in: tool/util/PdfToolUtils.kt
 *
 * Shared utilities for PDF tools: PDFBox initialization
 * and page range parsing.
 */
object PdfToolUtils {

    private const val TAG = "PdfToolUtils"
    @Volatile
    private var initialized = false

    /**
     * Initialize PDFBox resource loader. Must be called once
     * before any PDFBox operations. Safe to call multiple times
     * and from multiple threads.
     */
    fun initPdfBox(context: Context) {
        if (initialized) return
        synchronized(this) {
            if (!initialized) {
                PDFBoxResourceLoader.init(context.applicationContext)
                initialized = true
                Log.i(TAG, "PDFBox initialized")
            }
        }
    }

    /**
     * Parse a page range specification string.
     *
     * Supported formats:
     * - Single page: "3"
     * - Range: "1-5"
     * - Comma-separated: "1,3,5-7" (collapsed to the enclosing range, here 1-7)
     *
     * @param spec Page range specification string
     * @param totalPages Total number of pages in the document
     * @return Pair of (startPage, endPage), 1-based inclusive, or null if invalid.
     *         Comma-separated specs return the overall minimum and maximum, so
     *         pages between the listed parts are included.
     */
    fun parsePageRange(spec: String, totalPages: Int): Pair<Int, Int>? {
        val trimmed = spec.trim()

        // Comma-separated: find overall min and max
        if (trimmed.contains(",")) {
            val parts = trimmed.split(",").map { it.trim() }
            var min = Int.MAX_VALUE
            var max = Int.MIN_VALUE
            for (part in parts) {
                val range = parsePageRange(part, totalPages) ?: return null
                min = minOf(min, range.first)
                max = maxOf(max, range.second)
            }
            return Pair(min, max)
        }

        // Range: "start-end"
        if (trimmed.contains("-")) {
            val parts = trimmed.split("-", limit = 2)
            val start = parts[0].trim().toIntOrNull() ?: return null
            val end = parts[1].trim().toIntOrNull() ?: return null
            if (start < 1 || end > totalPages || start > end) return null
            return Pair(start, end)
        }

        // Single page
        val page = trimmed.toIntOrNull() ?: return null
        if (page < 1 || page > totalPages) return null
        return Pair(page, page)
    }
}
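
Because parsePageRange collapses a comma-separated spec to its bounding range, "1,3,5-7" selects pages 1 through 7, including pages 2 and 4. If exact page selection is ever needed, a variant could return the individual pages instead. The following is a minimal sketch under that assumption, not part of the ported lib-pdf code; the name parsePageList is hypothetical.

```kotlin
/**
 * Hypothetical variant of parsePageRange: returns every selected page
 * (1-based, sorted, de-duplicated) instead of a single bounding range.
 * Returns null if any part is malformed or out of bounds.
 */
fun parsePageList(spec: String, totalPages: Int): List<Int>? {
    val pages = sortedSetOf<Int>()
    for (part in spec.split(",").map { it.trim() }) {
        // "5-7" -> [5, 7]; "3" -> [3]; non-numeric pieces become null
        val bounds = part.split("-", limit = 2).map { it.trim().toIntOrNull() }
        val start = bounds[0] ?: return null
        val end = if (bounds.size == 2) (bounds[1] ?: return null) else start
        if (start < 1 || end > totalPages || start > end) return null
        pages.addAll(start..end)
    }
    return pages.toList()
}
```

A caller would then extract each listed page individually (for example by setting the stripper's startPage and endPage to the same value per page) rather than one contiguous range.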

PdfInfoTool

/**
 * Located in: tool/builtin/PdfInfoTool.kt
 *
 * Reads PDF metadata: page count, file size, title, author,
 * subject, creator, producer, and creation date.
 */
class PdfInfoTool(
    private val context: Context
) : Tool {

    companion object {
        private const val TAG = "PdfInfoTool"
    }

    override val definition = ToolDefinition(
        name = "pdf_info",
        description = "Get metadata and info about a PDF file. " +
            "Returns page count, file size, title, author, and other document properties.",
        parametersSchema = ToolParametersSchema(
            properties = mapOf(
                "path" to ToolParameter(
                    type = "string",
                    description = "Path to the PDF file"
                )
            ),
            required = listOf("path")
        ),
        requiredPermissions = emptyList(),
        timeoutSeconds = 15
    )

    override suspend fun execute(parameters: Map<String, Any?>): ToolResult {
        val path = parameters["path"]?.toString()
            ?: return ToolResult.error("validation_error", "Parameter 'path' is required")

        val file = File(path)
        if (!file.exists()) {
            return ToolResult.error("file_not_found", "File not found: $path")
        }
        if (!file.canRead()) {
            return ToolResult.error("permission_denied", "Cannot read file: $path")
        }

        PdfToolUtils.initPdfBox(context)

        return try {
            val doc = PDDocument.load(file)
            val result = try {
                val info = doc.documentInformation
                val lines = mutableListOf<String>()
                lines.add("File: ${file.name}")
                lines.add("Path: $path")
                lines.add("Pages: ${doc.numberOfPages}")
                lines.add("File size: ${file.length()} bytes")

                info.title?.takeIf { it.isNotBlank() }?.let {
                    lines.add("Title: $it")
                }
                info.author?.takeIf { it.isNotBlank() }?.let {
                    lines.add("Author: $it")
                }
                info.subject?.takeIf { it.isNotBlank() }?.let {
                    lines.add("Subject: $it")
                }
                info.creator?.takeIf { it.isNotBlank() }?.let {
                    lines.add("Creator: $it")
                }
                info.producer?.takeIf { it.isNotBlank() }?.let {
                    lines.add("Producer: $it")
                }
                info.creationDate?.time?.let {
                    lines.add("Created: $it")
                }

                ToolResult.success(lines.joinToString("\n"))
            } finally {
                doc.close()
            }
            result
        } catch (e: Exception) {
            Log.e(TAG, "Failed to read PDF info: $path", e)
            ToolResult.error("pdf_error", "Failed to read PDF info: ${e.message}")
        }
    }
}

PdfExtractTextTool

/**
 * Located in: tool/builtin/PdfExtractTextTool.kt
 *
 * Extracts text content from PDF files using PDFBox's PDFTextStripper.
 * Supports page range selection and output truncation.
 */
class PdfExtractTextTool(
    private val context: Context
) : Tool {

    companion object {
        private const val TAG = "PdfExtractTextTool"
        private const val DEFAULT_MAX_CHARS = 50_000
    }

    override val definition = ToolDefinition(
        name = "pdf_extract_text",
        description = "Extract text content from a PDF file. " +
            "Supports page range selection. For scanned PDFs with no text layer, " +
            "use pdf_render_page to get page images instead.",
        parametersSchema = ToolParametersSchema(
            properties = mapOf(
                "path" to ToolParameter(
                    type = "string",
                    description = "Path to the PDF file"
                ),
                "pages" to ToolParameter(
                    type = "string",
                    description = "Page range to extract (e.g. \"1-5\", \"3\", \"1,3,5-7\"). " +
                        "A comma-separated list is collapsed to its enclosing range " +
                        "(\"1,3,5-7\" extracts pages 1-7). Omit to extract all pages."
                ),
                "max_chars" to ToolParameter(
                    type = "integer",
                    description = "Maximum characters to return (default 50000)"
                )
            ),
            required = listOf("path")
        ),
        requiredPermissions = emptyList(),
        timeoutSeconds = 30
    )

    override suspend fun execute(parameters: Map<String, Any?>): ToolResult {
        val path = parameters["path"]?.toString()
            ?: return ToolResult.error("validation_error", "Parameter 'path' is required")
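        val maxChars = (parameters["max_chars"] as? Number)?.toInt()
            ?: DEFAULT_MAX_CHARS
        // text.take() throws on a negative count, so reject bad limits up front
        if (maxChars < 1) {
            return ToolResult.error("validation_error", "Parameter 'max_chars' must be positive")
        }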
        val pagesArg = parameters["pages"]?.toString()

        val file = File(path)
        if (!file.exists()) {
            return ToolResult.error("file_not_found", "File not found: $path")
        }
        if (!file.canRead()) {
            return ToolResult.error("permission_denied", "Cannot read file: $path")
        }

        PdfToolUtils.initPdfBox(context)

        return try {
            val doc = PDDocument.load(file)
            val result = try {
                val stripper = PDFTextStripper()
                val totalPages = doc.numberOfPages

                if (pagesArg != null) {
                    val range = PdfToolUtils.parsePageRange(pagesArg, totalPages)
                        ?: return ToolResult.error(
                            "invalid_page_range",
                            "Invalid page range: $pagesArg (document has $totalPages pages)"
                        )
                    stripper.startPage = range.first
                    stripper.endPage = range.second
                }

                val text = stripper.getText(doc)

                if (text.isBlank()) {
                    ToolResult.success(
                        "No text content found in PDF. This may be a scanned document. " +
                            "Use pdf_render_page to render pages as images for visual inspection."
                    )
                } else {
                    val truncated = if (text.length > maxChars) {
                        text.take(maxChars) +
                            "\n\n[Truncated at $maxChars characters. " +
                            "Total text length: ${text.length}. " +
                            "Use 'pages' parameter to extract specific pages.]"
                    } else {
                        text
                    }

                    val header = "Extracted text from ${file.name}" +
                        (if (pagesArg != null) " (pages: $pagesArg)" else "") +
                        " [$totalPages total pages]:\n\n"

                    ToolResult.success(header + truncated)
                }
            } finally {
                doc.close()
            }
            result
        } catch (e: Exception) {
            Log.e(TAG, "Failed to extract PDF text: $path", e)
            ToolResult.error("pdf_error", "Failed to extract PDF text: ${e.message}")
        }
    }
}
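
The truncation step in PdfExtractTextTool can be isolated into a pure function, which makes the testExecute_truncation case easy to cover without a PDF fixture. This is a sketch only; truncateWithNotice is a hypothetical name, and the notice text is copied from the tool code above.

```kotlin
/**
 * Hypothetical helper mirroring the truncation step in PdfExtractTextTool:
 * keeps at most maxChars characters and appends a notice when text is cut.
 * The notice itself is not counted against maxChars.
 */
fun truncateWithNotice(text: String, maxChars: Int): String {
    require(maxChars >= 0) { "maxChars must not be negative" }
    if (text.length <= maxChars) return text
    return text.take(maxChars) +
        "\n\n[Truncated at $maxChars characters. " +
        "Total text length: ${text.length}. " +
        "Use 'pages' parameter to extract specific pages.]"
}
```

Note that the limit counts Kotlin (UTF-16) characters, so the returned string is always somewhat longer than max_chars once the notice is appended.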

PdfRenderPageTool

/**
 * Located in: tool/builtin/PdfRenderPageTool.kt
 *
 * Renders a PDF page to a PNG image using Android's PdfRenderer.
 * Saves the output to the app's internal pdf-renders/ directory.
 */
class PdfRenderPageTool(
    private val context: Context
) : Tool {

    companion object {
        private const val TAG = "PdfRenderPageTool"
        private const val DEFAULT_DPI = 150
        private const val MIN_DPI = 72
        private const val MAX_DPI = 300
    }

    override val definition = ToolDefinition(
        name = "pdf_render_page",
        description = "Render a PDF page to a PNG image. " +
            "Useful for scanned PDFs or pages with complex layouts, charts, or images. " +
            "The rendered image is saved to pdf-renders/ in the app's storage.",
        parametersSchema = ToolParametersSchema(
            properties = mapOf(
                "path" to ToolParameter(
                    type = "string",
                    description = "Path to the PDF file"
                ),
                "page" to ToolParameter(
                    type = "integer",
                    description = "Page number to render (1-based)"
                ),
                "dpi" to ToolParameter(
                    type = "integer",
                    description = "Render resolution in DPI (default 150, min 72, max 300)"
                )
            ),
            required = listOf("path", "page")
        ),
        requiredPermissions = emptyList(),
        timeoutSeconds = 30
    )

    override suspend fun execute(parameters: Map<String, Any?>): ToolResult {
        val path = parameters["path"]?.toString()
            ?: return ToolResult.error("validation_error", "Parameter 'path' is required")
        val pageNum = (parameters["page"] as? Number)?.toInt()
            ?: return ToolResult.error("validation_error", "Parameter 'page' is required")
        val dpi = ((parameters["dpi"] as? Number)?.toInt() ?: DEFAULT_DPI)
            .coerceIn(MIN_DPI, MAX_DPI)

        val file = File(path)
        if (!file.exists()) {
            return ToolResult.error("file_not_found", "File not found: $path")
        }
        if (!file.canRead()) {
            return ToolResult.error("permission_denied", "Cannot read file: $path")
        }

        return try {
            val fd = ParcelFileDescriptor.open(file, ParcelFileDescriptor.MODE_READ_ONLY)
            try {
                val renderer = PdfRenderer(fd)
                try {
                    // Read pageCount while the renderer is open; it throws once closed.
                    val pageCount = renderer.pageCount
                    val pageIndex = pageNum - 1
                    if (pageIndex < 0 || pageIndex >= pageCount) {
                        return ToolResult.error(
                            "invalid_page",
                            "Page $pageNum out of range (document has $pageCount pages)"
                        )
                    }

                    val scale = dpi / 72f
                    val page = renderer.openPage(pageIndex)
                    val bitmap = try {
                        Bitmap.createBitmap(
                            (page.width * scale).toInt(),
                            (page.height * scale).toInt(),
                            Bitmap.Config.ARGB_8888
                        ).also {
                            it.eraseColor(Color.WHITE)
                            page.render(it, null, null, PdfRenderer.Page.RENDER_MODE_FOR_DISPLAY)
                        }
                    } finally {
                        page.close()
                    }

                    try {
                        // Save PNG to app's internal storage
                        val outputDir = File(context.filesDir, "pdf-renders").also { it.mkdirs() }
                        val outputFile = File(outputDir, "${file.nameWithoutExtension}-page${pageNum}.png")
                        FileOutputStream(outputFile).use { out ->
                            bitmap.compress(Bitmap.CompressFormat.PNG, 100, out)
                        }

                        ToolResult.success(
                            "Page $pageNum rendered and saved to: ${outputFile.absolutePath}\n" +
                                "Resolution: ${bitmap.width}x${bitmap.height} (${dpi} DPI)\n" +
                                "File size: ${outputFile.length()} bytes"
                        )
                    } finally {
                        bitmap.recycle()
                    }
                } finally {
                    renderer.close()
                }
            } finally {
                // PdfRenderer closes fd on close(); this also covers a failed constructor.
                fd.close()
            }
        } catch (e: Exception) {
            Log.e(TAG, "Failed to render PDF page: $path page $pageNum", e)
            ToolResult.error("pdf_error", "Failed to render PDF page: ${e.message}")
        }
    }
}
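
PdfRenderer reports page dimensions in PDF points (1/72 inch), so the output bitmap size follows from the clamped DPI alone. The sketch below isolates that arithmetic so testExecute_dpiClamping can be checked without Android classes; RenderSize and renderSize are hypothetical names, and the defaults mirror MIN_DPI and MAX_DPI above.

```kotlin
/** Pixel dimensions of a rendered page. */
data class RenderSize(val width: Int, val height: Int)

/**
 * Hypothetical helper: converts a page size in PDF points (1/72 inch)
 * to pixels at the requested DPI, clamping DPI to [minDpi, maxDpi].
 * Fractional pixels are truncated, as in PdfRenderPageTool.
 */
fun renderSize(
    widthPt: Int,
    heightPt: Int,
    dpi: Int,
    minDpi: Int = 72,
    maxDpi: Int = 300
): RenderSize {
    val scale = dpi.coerceIn(minDpi, maxDpi) / 72f
    return RenderSize((widthPt * scale).toInt(), (heightPt * scale).toInt())
}
```

For a US Letter page (612 x 792 points), renderSize(612, 792, 144) yields 1224 x 1584 pixels, and a request for 600 DPI produces the same size as 300 DPI.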

PDFBox Android Dependency

Add to app/build.gradle.kts:

dependencies {
    // ... existing dependencies ...

    // PDF tools: PDFBox for text extraction and metadata
    implementation("com.tom-roush:pdfbox-android:2.0.27.0")
}

PDFBox Android:

Size: ~2.5MB (includes fonts and resources)
License: Apache 2.0
Transitive dependencies: none significant
Compatible with Android API 21+
Used by OneClaw 1.0 (lib-pdf module)

ToolModule Changes

// In ToolModule.kt, add imports:
import com.oneclaw.shadow.tool.builtin.PdfInfoTool
import com.oneclaw.shadow.tool.builtin.PdfExtractTextTool
import com.oneclaw.shadow.tool.builtin.PdfRenderPageTool
import com.oneclaw.shadow.tool.util.PdfToolUtils

val toolModule = module {
    // ... existing declarations ...

    // RFC-033: PDF tools
    single {
        PdfToolUtils.initPdfBox(androidContext())
        PdfInfoTool(androidContext())
    }
    single {
        PdfExtractTextTool(androidContext())
    }
    single {
        PdfRenderPageTool(androidContext())
    }

    single {
        ToolRegistry().apply {
            // ... existing tool registrations ...

            // RFC-033: PDF tools
            try {
                register(get<PdfInfoTool>(), ToolSourceInfo.BUILTIN)
            } catch (e: Exception) {
                Log.e("ToolModule", "Failed to register pdf_info: ${e.message}")
            }
            try {
                register(get<PdfExtractTextTool>(), ToolSourceInfo.BUILTIN)
            } catch (e: Exception) {
                Log.e("ToolModule", "Failed to register pdf_extract_text: ${e.message}")
            }
            try {
                register(get<PdfRenderPageTool>(), ToolSourceInfo.BUILTIN)
            } catch (e: Exception) {
                Log.e("ToolModule", "Failed to register pdf_render_page: ${e.message}")
            }

            // ... JS tool loading (unchanged) ...
        }
    }

    // ... rest of module unchanged ...
}

Implementation Plan

Phase 1: Dependencies and Utilities

Add com.tom-roush:pdfbox-android:2.0.27.0 to app/build.gradle.kts
Create PdfToolUtils.kt in tool/util/
Create PdfToolUtilsTest.kt with tests for parsePageRange()
Verify build compiles successfully

Phase 2: PdfInfoTool

Create PdfInfoTool.kt in tool/builtin/
Create PdfInfoToolTest.kt with unit tests
Register in ToolModule.kt
Verify ./gradlew test passes

Phase 3: PdfExtractTextTool

Create PdfExtractTextTool.kt in tool/builtin/
Create PdfExtractTextToolTest.kt with unit tests
Register in ToolModule.kt
Verify ./gradlew test passes

Phase 4: PdfRenderPageTool

Create PdfRenderPageTool.kt in tool/builtin/
Create PdfRenderPageToolTest.kt with unit tests
Register in ToolModule.kt
Verify ./gradlew test passes

Phase 5: Integration Testing

Run full Layer 1A test suite (./gradlew test)
Run Layer 1B tests if emulator available
Manual testing with real PDF files on device
Write test report

Data Model

No data model or database changes. The tools operate on files and return string results through the existing ToolResult type.

API Design

Tool Interfaces

Tool: pdf_info
Parameters:
  - path: string (required) -- Path to the PDF file
Returns on success:
  Multi-line text with file info, page count, and metadata fields
Returns on error:
  ToolResult.error with error type and message

Tool: pdf_extract_text
Parameters:
  - path: string (required) -- Path to the PDF file
  - pages: string (optional) -- Page range specification
  - max_chars: integer (optional, default: 50000) -- Output character limit
Returns on success:
  Header line + extracted text content
Returns on error:
  ToolResult.error with error type and message

Tool: pdf_render_page
Parameters:
  - path: string (required) -- Path to the PDF file
  - page: integer (required) -- Page number (1-based)
  - dpi: integer (optional, default: 150) -- Render resolution (72-300)
Returns on success:
  Text with output file path, resolution, and file size
Returns on error:
  ToolResult.error with error type and message
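
The integer parameters above (page, dpi, max_chars) are read with an `as? Number` cast, so a value that arrives as the string "3" is treated as absent and pdf_render_page reports that 'page' is required. If the tool-call decoder can deliver numbers as strings or doubles, a more lenient reader may help. This is a sketch under that assumption; intParam is a hypothetical helper, not an existing OneClaw API.

```kotlin
/**
 * Hypothetical helper: reads an integer tool parameter that may arrive
 * as any Number (Int, Long, Double) or as a numeric String.
 * Returns null when the key is absent or the value is not a whole number
 * that fits in an Int.
 */
fun intParam(parameters: Map<String, Any?>, key: String): Int? =
    when (val raw = parameters[key]) {
        is Number -> raw.toDouble().let { d ->
            val whole = d % 1.0 == 0.0 // false for NaN and infinities
            if (whole && d >= Int.MIN_VALUE && d <= Int.MAX_VALUE) d.toInt() else null
        }
        is String -> raw.trim().toIntOrNull()
        else -> null
    }
```

With this helper, mapOf("path" to "/data/report.pdf", "page" to 3.0) and the same map with "page" to "3" both resolve to page 3, while 2.5 is rejected instead of being silently truncated to 2.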

Error Handling

Error Type	Cause	Response
`validation_error`	Missing or invalid required parameter	`ToolResult.error("validation_error", "Parameter 'X' is required")`
`file_not_found`	File does not exist at given path	`ToolResult.error("file_not_found", "File not found: <path>")`
`permission_denied`	Cannot read the file	`ToolResult.error("permission_denied", "Cannot read file: <path>")`
`invalid_page`	Page number out of document range	`ToolResult.error("invalid_page", "Page N out of range (document has M pages)")`
`invalid_page_range`	Malformed or out-of-bounds page range	`ToolResult.error("invalid_page_range", "Invalid page range: <spec> (document has N pages)")`
`pdf_error`	PDFBox or PdfRenderer exception	`ToolResult.error("pdf_error", "Failed to ...: <exception message>")`

All errors follow the existing ToolResult.error(errorType, errorMessage) pattern used by other built-in tools.
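
All three tools repeat the same path checks before opening the file. They could be shared, as in the following sketch. ToolError is a hypothetical stand-in for the (errorType, errorMessage) pair passed to ToolResult.error, and checkPdfPath is a hypothetical name; the error types and messages are the ones from the table above.

```kotlin
import java.io.File

/** Hypothetical stand-in for the arguments of ToolResult.error. */
data class ToolError(val type: String, val message: String)

/**
 * Runs the shared pre-checks on the 'path' parameter.
 * Returns null when the file exists and is readable,
 * or the matching error otherwise.
 */
fun checkPdfPath(path: String?): ToolError? {
    if (path.isNullOrBlank()) {
        return ToolError("validation_error", "Parameter 'path' is required")
    }
    val file = File(path)
    return when {
        !file.exists() -> ToolError("file_not_found", "File not found: $path")
        !file.canRead() -> ToolError("permission_denied", "Cannot read file: $path")
        else -> null
    }
}
```

Each tool's execute() could then begin by returning ToolResult.error(err.type, err.message) whenever checkPdfPath yields a non-null err.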

Security Considerations

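File access: Tools are intended to operate on file paths within app-private storage (context.filesDir), enforced via FsBridge allowlist validation, so no external storage permissions are required. The execute() methods shown in this RFC pass the path directly to File(path) and do not call FsBridge themselves, so that validation must run before a tool opens the file.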
Resource management: PDDocument, PdfRenderer, PdfRenderer.Page, ParcelFileDescriptor, and Bitmap objects must be closed or recycled on every exit path, including exceptions and early returns, using try-finally blocks to prevent resource leaks.
Memory safety: PDFBox may consume significant memory for large PDFs. The tools do not load the entire document into memory at once – PDFTextStripper processes pages sequentially. PdfRenderer renders one page at a time with the bitmap recycled after saving.
Output isolation: Rendered PNG files are saved to the app’s internal storage (context.filesDir/pdf-renders/), not to shared external storage.
No network access: All operations are local file operations. No data is sent externally.
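
The FsBridge API is not shown in this RFC, so the following is only an illustration of what a confinement check of this kind typically looks like, not the actual FsBridge implementation. isInsideDir is a hypothetical name. Comparing canonical paths means ".." segments and symlinks are resolved before the prefix test.

```kotlin
import java.io.File

/**
 * Hypothetical confinement check (not the actual FsBridge API):
 * true only if candidate resolves to baseDir itself or to a path
 * beneath it, after ".." segments and symlinks are resolved.
 */
fun isInsideDir(baseDir: File, candidate: File): Boolean {
    val base = baseDir.canonicalPath
    val target = candidate.canonicalPath
    // Append the separator so "/data/files-evil" does not match "/data/files"
    return target == base || target.startsWith(base + File.separator)
}
```

A tool would call something like isInsideDir(context.filesDir, File(path)) before opening the file and return permission_denied when it is false.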

Performance

Operation	Expected Time	Notes
`pdf_info` (typical PDF)	< 500ms	Opens file, reads metadata, closes
`pdf_extract_text` (10 pages)	< 1s	PDFTextStripper sequential processing
`pdf_extract_text` (100 pages)	< 5s	Linear scaling with page count
`pdf_render_page` (150 DPI)	< 2s	Render + PNG compression
`pdf_render_page` (300 DPI)	< 5s	4x pixels vs 150 DPI

Memory usage:

PDDocument: ~2-10MB for typical PDFs (closed after use)
Bitmap at 150 DPI (A4 page, ARGB_8888): ~8.7MB (recycled after saving)
Bitmap at 300 DPI (A4 page, ARGB_8888): ~35MB (recycled after saving)
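
Bitmap memory can be estimated directly from the page size and DPI, since ARGB_8888 uses 4 bytes per pixel. For an A4 page (595 x 842 points) this gives roughly 8.7MB at 150 DPI and 35MB at 300 DPI. A small sketch of the calculation, with argb8888Bytes as a hypothetical helper name:

```kotlin
/**
 * Estimated size in bytes of an ARGB_8888 bitmap (4 bytes per pixel)
 * for a page of the given size in PDF points (1/72 inch), rendered at dpi.
 * Uses the same scale and truncation as PdfRenderPageTool.
 */
fun argb8888Bytes(widthPt: Int, heightPt: Int, dpi: Int): Long {
    val scale = dpi / 72f
    val widthPx = (widthPt * scale).toInt()
    val heightPx = (heightPt * scale).toInt()
    return widthPx.toLong() * heightPx * 4L
}
```

Doubling the DPI quadruples the pixel count, which is why the 300 DPI figure is about four times the 150 DPI one. The estimate covers only the pixel buffer, not PNG compression buffers or PdfRenderer's own working memory.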

Testing Strategy

Unit Tests

PdfToolUtilsTest.kt:

testParsePageRange_singlePage – “3” returns (3, 3)
testParsePageRange_range – “1-5” returns (1, 5)
testParsePageRange_commaSeparated – “1,3,5-7” returns (1, 7)
testParsePageRange_rangeWithSpaces – “1 - 5” returns (1, 5)
testParsePageRange_invalidRange – “5-2” returns null
testParsePageRange_outOfBounds – “0” and “11” for 10-page doc return null
testParsePageRange_nonNumeric – “abc” returns null
testParsePageRange_singlePageRange – “1-1” returns (1, 1)

PdfInfoToolTest.kt:

testDefinition – Tool definition has correct name, parameters, permissions
testExecute_missingPath – Returns validation error
testExecute_fileNotFound – Returns file_not_found error
testExecute_validPdf – Returns page count, file size, metadata (using test PDF resource)

PdfExtractTextToolTest.kt:

testDefinition – Tool definition has correct name, parameters, permissions
testExecute_missingPath – Returns validation error
testExecute_fileNotFound – Returns file_not_found error
testExecute_extractAllPages – Returns full text (using test PDF resource)
testExecute_extractPageRange – Returns text from specified pages
testExecute_invalidPageRange – Returns invalid_page_range error
testExecute_truncation – Returns truncated text with notice when exceeding max_chars
testExecute_defaultMaxChars – Default max_chars is 50000

PdfRenderPageToolTest.kt:

testDefinition – Tool definition has correct name, parameters, permissions
testExecute_missingPath – Returns validation error
testExecute_missingPage – Returns validation error
testExecute_fileNotFound – Returns file_not_found error
testExecute_pageOutOfRange – Returns invalid_page error
testExecute_dpiClamping – DPI values outside 72-300 are clamped

Test PDF Resources

Place test PDF files in app/src/test/resources/:

test-document.pdf – A simple text-based PDF with known content (2-3 pages)
Created programmatically in test setup using PDFBox, or checked in as a small test fixture

Integration Tests (Manual)

Install app on device with PDFs in /sdcard/Documents/
Ask agent to summarize a text-based PDF
Ask agent to render a page from a scanned PDF
Ask agent to extract specific pages from a long document
Verify tool calls appear correctly in chat history

Alternatives Considered

1. Single PdfTool class with mode parameter

Approach: One PdfTool class with a mode parameter (“info”, “extract_text”, “render_page”), similar to BrowserTool.

Rejected because: The three operations have very different parameter sets. Combining them into one tool with a mode parameter would make the parameter schema complex and confusing for AI models. Separate tools with focused parameter schemas produce better AI tool-calling behavior.

2. JavaScript-based PDF tools

Approach: Implement PDF tools as JS tools using a PDF library in QuickJS.

Rejected because: QuickJS has no DOM APIs and limited file I/O. PDFBox Android and PdfRenderer are Java/Android APIs and cannot run inside QuickJS, so a JS tool would need a separate pure-JavaScript PDF library or a bridge back to Kotlin. The Kotlin built-in approach is the more direct fit.

3. Use only Android PdfRenderer (no PDFBox)

Approach: Use Android’s built-in PdfRenderer for everything, including text extraction (by rendering pages and using OCR).

Rejected because: PdfRenderer can only render pages as bitmaps. It cannot extract text directly. Using OCR for text extraction would be slow, inaccurate, and add unnecessary complexity. PDFBox provides direct text extraction via PDFTextStripper.

Dependencies

External Dependencies

Dependency	Version	Size	License
com.tom-roush:pdfbox-android	2.0.27.0	~2.5MB	Apache 2.0
Android PdfRenderer	Built-in (API 21+)	0	Android Framework

Internal Dependencies

Tool interface from tool/engine/
ToolResult, ToolDefinition, ToolParametersSchema, ToolParameter from core/model/
ToolRegistry, ToolSourceInfo from tool/engine/ and core/model/
Context from Android framework (injected via Koin androidContext())

Future Extensions

Password-protected PDFs: Add optional password parameter to pdf_extract_text and pdf_info
Batch page rendering: Add pages parameter to pdf_render_page to render multiple pages at once
PDF search: Add a pdf_search tool to search for text within a PDF and return page numbers and context
PDF table extraction: Specialized table extraction using layout heuristics built on PDFBox text positions (PDFBox itself does not provide table detection)
PDF bookmark/outline extraction: Extract the document outline/table of contents

Change History

Date	Version	Changes	Owner
2026-03-01	0.1	Initial version	-