OLPDF
Word for PDFs.

A structure-first operating system for PDFs. Reconstruct semantic layouts from raw coordinates and edit with AI. Completely free.


document_v2.pdf

AI Active

The Future of Documents

This document was once a static picture of text. Now it is an editable, semantic structure.

Just like a Word processor, you can now seamlessly edit PDFs using AI.

Rephrasing...

Parsed Table

Semantic Layout: Preserved flawlessly

Try the Magic AI

Upload a PDF and tell the AI what to do. It chains multiple operations automatically.

Magic Agent

Agentic Orchestration Loop


Mission

PDFs shouldn't be read-only forever.

Most PDFs are digital paper: unstructured, unsearchable, and impossible to edit without destroying the layout. OLPDF turns every document into a machine-readable, AI-editable structure that can always be re-exported to the exact format it came from.

Open Source

The extraction heuristics, block model specification, and export pipeline are all public. Audit the logic, fork it, run it yourself. No black boxes, no paywalls on the core engine.

Structure First

We don't OCR and call it done. Every page is classified before extraction runs — native text, scanned, table-heavy — and each block is assigned a type, confidence score, and column index.
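As a rough illustration of the idea above, the sketch below classifies a page from a few summary statistics and builds a block record carrying a type, confidence score, and column index. The field names, thresholds, and function names are assumptions made for this example; they are not taken from the OLPDF block model specification.

```javascript
// Hypothetical block record; field names are illustrative, not OLPDF's spec.
function makeBlock(type, text, confidence, columnIndex) {
  if (confidence < 0 || confidence > 1) {
    throw new RangeError('confidence must be between 0 and 1');
  }
  return { type, text, confidence, columnIndex };
}

// Classify a page before extraction runs. The thresholds are guesses
// chosen only to make the example concrete.
function classifyPage({ textCharCount, imageAreaRatio, ruledLineCount }) {
  if (textCharCount < 20 && imageAreaRatio > 0.8) return 'scanned';
  if (ruledLineCount >= 10) return 'table-heavy';
  return 'native-text';
}

// Pick the last column whose left boundary is at or before the block's x.
// columnStarts must be sorted in ascending order.
function columnIndexFor(x, columnStarts) {
  let index = 0;
  for (let i = 0; i < columnStarts.length; i++) {
    if (x >= columnStarts[i]) index = i;
  }
  return index;
}
```

A two-column page with columns starting at x = 50 and x = 300 would place a block at x = 320 in column 1 under this scheme.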

AI as a Tool

The model is never given a blank document and told to rewrite it. Every AI action goes through a validated tool call. The diff is logged, reversible, and requires explicit user acceptance.
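One way to picture this flow is sketched below: a proposed tool call is validated, applied to a copy of the document, and turned into a diff that only takes effect once it is explicitly accepted. The tool name `replace_text`, the document shape, and every function name here are hypothetical and serve only to make the pattern concrete.

```javascript
// Hypothetical tool registry; the tool name and argument shape are assumptions.
const TOOLS = new Map([
  ['replace_text', {
    validate: (args) =>
      typeof args.blockId === 'string' && typeof args.newText === 'string',
    // Returns a new document; the input is never mutated.
    apply: (doc, args) => ({
      ...doc,
      blocks: doc.blocks.map((b) =>
        b.id === args.blockId ? { ...b, text: args.newText } : b
      ),
    }),
  }],
]);

// Validate a proposed call and compute a pending change with its diff.
// Assumes tools keep the number and order of blocks unchanged.
function proposeEdit(doc, call) {
  const tool = TOOLS.get(call.name);
  if (!tool) throw new Error(`Unknown tool: ${call.name}`);
  if (!tool.validate(call.args)) {
    throw new Error(`Invalid arguments for ${call.name}`);
  }
  const after = tool.apply(doc, call.args);
  const diff = doc.blocks
    .map((b, i) => ({ id: b.id, before: b.text, after: after.blocks[i].text }))
    .filter((d) => d.before !== d.after);
  return { call, diff, before: doc, after };
}

// Only an explicit acceptance commits the change; the log entry keeps
// the previous state so the edit can be reverted.
function acceptEdit(history, pending) {
  history.push(pending);
  return pending.after;
}

// Revert the most recent accepted edit, or return null if there is none.
function undoLast(history) {
  const last = history.pop();
  return last ? last.before : null;
}
```

Because `proposeEdit` never mutates its input, rejecting a suggestion is simply a matter of discarding the pending object.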

Editor Canvas

Edit PDFs like Word documents.

We extract absolute bounding boxes and transform them into an editable DOM. You can ask Gemini 1.5 Pro to rewrite paragraphs, adjust formatting, or redact sensitive vectors directly on the canvas.

1. Keep exact original layouts
2. Visual diff before accepting edits
3. Auditable, non-destructive history
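To make the bounding-box-to-DOM step more concrete, here is a minimal sketch of the coordinate conversion involved. It assumes boxes are given in default PDF user space (origin at the bottom-left, units in points) and maps them to CSS absolute positioning (origin at the top-left). The function name and the box shape `{ x0, y0, x1, y1 }` are assumptions for this example, not OLPDF's actual data format.

```javascript
// Convert a PDF-space box (origin bottom-left) into CSS positioning
// values (origin top-left). `scale` converts points to pixels.
function bboxToStyle({ x0, y0, x1, y1 }, pageHeight, scale = 1) {
  return {
    position: 'absolute',
    left: `${x0 * scale}px`,
    // The top edge in CSS is the distance from the page top to the box's upper edge.
    top: `${(pageHeight - y1) * scale}px`,
    width: `${(x1 - x0) * scale}px`,
    height: `${(y1 - y0) * scale}px`,
  };
}
```

In a browser, the result can be applied to an element with `Object.assign(element.style, bboxToStyle(box, pageHeight))`, which keeps each editable node pinned to its original position.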

annual_report_draft.pdf

Financial Overview

Rewriting formal tone

AI Suggestion

Replaced casual phrasing with corporate terminology. Layout metrics preserved perfectly.

Ask Gemini to format the document...

I. Introduction

II. Methodology

III. Analysis

Compiling Volume

Cross-referencing entities...

Book Maker

Publish with structural consistency.

Compile multiple individual documents into a single cohesive book or report. Our engine runs global consistency checks across all chapters to ensure character names, terminology, and font hierarchies remain uniform.
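A simplified version of one such check, terminology consistency, might look like the sketch below. It flags terms that appear in more than one spelling across chapters (for example "e-mail" and "email"). The input shape and function names are hypothetical, and the actual engine may work quite differently.

```javascript
// Collapse spelling variants (case, spaces, hyphens, underscores) to one key.
function normaliseTerm(term) {
  return term.toLowerCase().replace(/[\s\-_]+/g, '');
}

// chapters: [{ title, terms: [string] }] is a hypothetical input shape.
// Returns one report entry per term that has more than one spelling.
function findInconsistentTerms(chapters) {
  const variants = new Map(); // key -> Map(variant -> Set of chapter titles)
  for (const { title, terms } of chapters) {
    for (const term of terms) {
      const key = normaliseTerm(term);
      if (!variants.has(key)) variants.set(key, new Map());
      const seen = variants.get(key);
      if (!seen.has(term)) seen.set(term, new Set());
      seen.get(term).add(title);
    }
  }
  const report = [];
  for (const [key, seen] of variants) {
    if (seen.size > 1) {
      report.push({
        key,
        variants: [...seen].map(([variant, titles]) => ({
          variant,
          chapters: [...titles],
        })),
      });
    }
  }
  return report;
}
```

Each report entry lists the competing spellings and the chapters where they occur, so an editor can pick one and apply it everywhere.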

EPUB3 Export
PDF/A Archival

Pipeline

The Extraction Loop

We reconstruct the semantic DOM from raw coordinates.

Extract

Raw PDF binaries are parsed to extract untagged text and coordinates.

Reconstruct

A Python heuristics engine rebuilds paragraphs, tables, and lists.

AI Edit

Gemini edits the document through precise JSON tree mutations.

Export

Compiled back into a pristine PDF/A or EPUB3 document.
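The Reconstruct step can be illustrated with a small coordinate-grouping sketch: text spans that share roughly the same baseline become a line, and a large vertical gap starts a new paragraph. This is written in JavaScript to match the other examples on this page, whereas the engine itself is described as Python. The span shape, tolerances, and function names are assumptions, and real heuristics would also need to handle columns, tables, and lists.

```javascript
// spans: [{ text, x, y }] with y increasing down the page (hypothetical shape).
// Spans within yTolerance of a line's first span are merged into that line.
function groupIntoLines(spans, yTolerance = 2) {
  const sorted = [...spans].sort((a, b) => a.y - b.y || a.x - b.x);
  const lines = [];
  for (const span of sorted) {
    const last = lines[lines.length - 1];
    if (last && Math.abs(span.y - last.y) <= yTolerance) {
      last.spans.push(span);
    } else {
      lines.push({ y: span.y, spans: [span] });
    }
  }
  // Order each line left to right and join its text.
  return lines.map((line) => ({
    y: line.y,
    text: line.spans.sort((a, b) => a.x - b.x).map((s) => s.text).join(' '),
  }));
}

// Start a new paragraph when the gap between consecutive lines exceeds
// gapFactor times the nominal line height.
function groupIntoParagraphs(lines, lineHeight = 12, gapFactor = 1.5) {
  const paragraphs = [];
  let current = [];
  let prevY = null;
  for (const line of lines) {
    if (prevY !== null && line.y - prevY > lineHeight * gapFactor) {
      paragraphs.push(current.join(' '));
      current = [];
    }
    current.push(line.text);
    prevY = line.y;
  }
  if (current.length) paragraphs.push(current.join(' '));
  return paragraphs;
}
```

Calling `groupIntoParagraphs(groupIntoLines(spans))` turns a flat list of positioned spans into paragraph strings, which is the kind of structure the later AI Edit and Export steps would operate on.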

Public API

Integrate in Minutes

Skip the UI entirely. Hook into our public API with a free rate-limited key and parse documents from your own backend.

Semantic block extraction
AI-powered rewrite endpoint
PDF → EPUB3 conversion

Read Full API Docs

extract.sh

curl -X POST https://api.olpdf.xyz/v1/extract \
  -H "Authorization: Bearer free_beta_key" \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://example.com/invoice.pdf",
    "mode": "semantic"
  }'
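For a JavaScript backend, the same request can be sketched with the standard fetch API. The endpoint, headers, and body fields are copied from the curl example above. The helper names are made up for this example, and the assumption that the API responds with JSON is a guess, since the response format is not shown here.

```javascript
// Build the request described by the curl example. Kept separate from the
// network call so it can be inspected or tested offline.
function buildExtractRequest(pdfUrl, apiKey, mode = 'semantic') {
  return {
    endpoint: 'https://api.olpdf.xyz/v1/extract',
    options: {
      method: 'POST',
      headers: {
        Authorization: `Bearer ${apiKey}`,
        'Content-Type': 'application/json',
      },
      body: JSON.stringify({ url: pdfUrl, mode }),
    },
  };
}

// Send the request with fetch (built into browsers and Node 18+).
async function extract(pdfUrl, apiKey) {
  const { endpoint, options } = buildExtractRequest(pdfUrl, apiKey);
  const response = await fetch(endpoint, options);
  if (!response.ok) {
    throw new Error(`Extract failed: HTTP ${response.status}`);
  }
  return response.json(); // assumption: the API returns a JSON body
}
```

A call such as `extract('https://example.com/invoice.pdf', 'free_beta_key')` would then mirror the shell command, with HTTP errors surfaced as exceptions.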

Embed SDK

Embed OLPDF anywhere.

Drop a full AST-driven PDF editor into any web app in three lines. Works with every major framework.

$ npm install @olpdf/embed

index.js

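import { OlPDFEmbed } from '@olpdf/embed';

// `container`, `userToken`, and `myDB` are supplied by the host application:
// the element to mount into, the user's auth token, and your own storage layer.
const editor = new OlPDFEmbed(container, {
  host: 'https://olpdf.xyz',
  documentId: 'doc_abc123',
  token: userToken,
});

// Persist the document model each time the editor reports a change.
editor.on('MODEL_UPDATE', ({ documentModel }) => {
  myDB.save(documentModel);
});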

Zero dependencies
postMessage bridge
Full AST events

View Embed Docs

100% Free Forever

No Paywalls. Just Documents.

Document intelligence should be a public good. Free for individuals and open-source projects, always.

Unlimited Projects

No cap on documents or books. Create without limits.

Full AI Access

Gemini-powered structural editing, no subscription needed.

Open API

Integrate our extraction engine into your own apps for free.

How do we survive?

“Supported by infrastructure grants and contributors. We don't want your credit card — we want your feedback and pull requests.”

Open Source & Community Driven

Built by developers, for developers.

Check out our good first issues, sponsor the project, or build your own custom extraction plugins.

Star on GitHub
Contribution Guide
API Docs
Plugin Marketplace

Built on open infrastructure — no black boxes

Next.js

FastAPI

Supabase

Cloudflare

Gemini

Docker

Rust

The core extraction engine, block model spec, and export pipeline are open source. Audit it, fork it, self-host it.
