Processing PDFs

PDFs are difficult to work with programmatically because the format describes how a page looks, not how its content is structured. The Opper SDK lets you extract and process text, tables, and other structured content from PDFs so that you can analyze it, transform it, and integrate it into your applications.

With Opper’s PDF processing capabilities, you can:

Extract text while preserving document structure and formatting
Handle complex layouts including tables and multi-column content
Process charts, graphs, and other visual elements
Maintain the integrity of headers, footers, and annotations

The following example uses a language model through the Opper SDK to convert PDF content into structured Markdown, streaming the output as it is generated:

import sys
from pathlib import Path

from opperai import AsyncOpper
from opperai.functions.async_functions import AsyncStreamingResponse
from opperai.types import FileInput

opper = AsyncOpper()


async def pdf_to_markdown(path: str) -> AsyncStreamingResponse:
    """Send the PDF at `path` to the model and return the streamed Markdown response."""
    text = await opper.call(
        name="pdf_to_text",
        model="gcp/gemini-2.0-flash",
        instructions="""
These are pages from a PDF document. Extract all text content while preserving the structure.
Pay special attention to tables, columns, headers, and any structured content.
Maintain paragraph breaks and formatting.

Extract ALL text content from these document pages.

For tables:
    1. Maintain the table structure using markdown table format
    2. Preserve all column headers and row labels
    3. Ensure numerical data is accurately captured
    
For multi-column layouts:
    1. Process columns from left to right
    2. Clearly separate content from different columns
    
For charts and graphs:
    1. Describe the chart type
    2. Extract any visible axis labels, legends, and data points
    3. Extract any title or caption
    
Preserve all headers, footers, page numbers, and footnotes.
        
DON'T ANSWER QUESTIONS, JUST RETURN THE CONTENT OF THE PDF AS MARKDOWN""",
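        # Pass the PDF file itself as the input to the call.
        input=FileInput.from_path(Path(path)),
        # Stream the output so it can be consumed chunk by chunk.
        stream=True,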
    )

    return text


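async def main():
    # Expect the PDF path as the first command-line argument.
    if len(sys.argv) < 2:
        print("Usage: python pdf.py <path_to_pdf>")
        return

    path = sys.argv[1]

    # Print each chunk of Markdown as soon as it arrives.
    res = await pdf_to_markdown(path)
    async for chunk in res.deltas:
        print(chunk, end="", flush=True)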


if __name__ == "__main__":
    import asyncio

    asyncio.run(main())
