PDF extraction API · x402 payments

Turn any PDF into
clean markdown,
instantly.

A minimal HTTP API built for AI agents and developers. Send a PDF URL, receive structured markdown. Pay $0.001 per page via USDC — no accounts, no subscriptions, no friction.

docpull API
# 1. Check cost before paying
curl "https://docpull.ai/probe?url=https://example.com/report.pdf"
{ "pageCount": 12, "costUSDC": "0.012000" }
# 2. Extract (an unpaid request returns 402; the response below is what you get after x402 payment)
curl -X POST https://docpull.ai/extract \
  -H "Content-Type: application/json" \
  -d '{"url": "https://example.com/report.pdf"}'
{ "success": true, "pageCount": 12, "markdown": "# Annual Report..." }

How it works

Three steps.
Zero setup.

01
🔍

Probe the document

Call /probe with any PDF URL. Get back the page count and exact cost before committing a payment.
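PDF URLs that carry their own query string need to be percent-encoded before they are passed as the `url` parameter. The sketch below is one way to build the probe URL; the helper name `buildProbeUrl` is hypothetical, and it assumes the server decodes the query parameter in the standard way.

```typescript
// Base URL taken from the API reference on this page.
const DOCPULL_BASE = "https://docpull.ai";

// Hypothetical helper: builds a /probe URL with the PDF address
// percent-encoded as the `url` query parameter.
function buildProbeUrl(pdfUrl: string): string {
  const probe = new URL("/probe", DOCPULL_BASE);
  probe.searchParams.set("url", pdfUrl);
  return probe.toString();
}

// A PDF URL with its own query string survives the round trip intact.
console.log(buildProbeUrl("https://example.com/report.pdf?version=2&lang=en"));
```

Without the encoding step, `&lang=en` in the example would be read as a second parameter of the probe request rather than as part of the PDF URL.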

02

Pay per page

Send a POST to /extract. The x402 protocol handles micropayment settlement in USDC on Base — no API keys required.

03
📄

Receive markdown

Get clean, structured markdown back. Headings, lists, and paragraphs are detected automatically from the PDF's layout.
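Because the output keeps its headings, it can be cut into sections on the client side, for example before indexing. The sketch below is a minimal illustration, not part of docpull: `splitByHeadings` and the `Section` shape are hypothetical, and it does not special-case `#` lines inside fenced code blocks.

```typescript
interface Section {
  heading: string; // heading text without the leading #'s ("" for text before the first heading)
  level: number;   // 1-6 for headings, 0 for text before the first heading
  body: string;    // everything up to the next heading, trimmed
}

// Hypothetical helper: splits markdown into sections at ATX headings (#, ##, ...).
function splitByHeadings(markdown: string): Section[] {
  const sections: Section[] = [];
  let current: Section = { heading: "", level: 0, body: "" };
  const flush = () => {
    // Skip the empty placeholder section when the document starts with a heading.
    if (current.heading !== "" || current.body.trim() !== "") sections.push(current);
  };
  for (const line of markdown.split("\n")) {
    const match = /^(#{1,6})\s+(.*)$/.exec(line);
    if (match) {
      flush();
      current = { heading: match[2].trim(), level: match[1].length, body: "" };
    } else {
      current.body += line + "\n";
    }
  }
  flush();
  return sections.map((s) => ({ heading: s.heading, level: s.level, body: s.body.trim() }));
}

const sections = splitByHeadings("# Title\n\nIntro text.\n\n## Section\n\n- a\n- b\n");
console.log(sections.map((s) => `${s.level}: ${s.heading}`));
```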

04
🤖

Agent-native

No subscriptions, no sessions, no credential management. Any agent with a Base wallet can call docpull autonomously.


API reference

Simple endpoints.
Predictable responses.

Base URL: https://docpull.ai

GET /health · No auth · Service status check
// Response
{ "status": "ok", "service": "docpull", "version": "1.0.0" }
GET /probe · No auth · Get page count + cost estimate
// Query params
?url=https://example.com/document.pdf

// Response
{
  "pageCount": 12,
  "costUSDC": "0.012000",
  "pricePerPage": "0.001 USDC"
}
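The cost in the response follows from the page count and the stated price of 0.001 USDC per page. A client can recompute it as a sanity check; the sketch below does so in integer micro-USDC to avoid floating-point rounding. The function name is hypothetical, and the six-decimal string format is inferred from the example response above.

```typescript
// 0.001 USDC per page, expressed in micro-USDC (six decimal places).
const MICRO_USDC_PER_PAGE = 1000;

// Hypothetical helper: returns the expected cost as a six-decimal string.
function estimateCostUSDC(pageCount: number): string {
  if (!Number.isInteger(pageCount) || pageCount < 0) {
    throw new RangeError("pageCount must be a non-negative integer");
  }
  const micro = pageCount * MICRO_USDC_PER_PAGE;
  const whole = Math.floor(micro / 1000000);
  // Left-pad the fractional part to exactly six digits.
  const fraction = ("000000" + String(micro % 1000000)).slice(-6);
  return `${whole}.${fraction}`;
}

console.log(estimateCostUSDC(12)); // "0.012000", matching the probe response above
```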
POST /extract · x402 payment · Extract PDF to markdown
// Request body
{ "url": "https://example.com/document.pdf" }

// Without payment → 402 Payment Required
{
  "x402Version": 1,
  "error": "Payment required",
  "accepts": [{ "scheme": "exact", "network": "base", ... }]
}

// With x402 payment → 200 OK
{
  "success": true,
  "pageCount": 12,
  "charCount": 18432,
  "markdown": "# Title\n\n## Section..."
}
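A client that handles the 402 itself first has to pick a payment option from `accepts`. The sketch below types only the fields shown in the example above (the remaining fields are elided there and are left untyped here); `pickPaymentOption` is a hypothetical helper, not part of any x402 library.

```typescript
// Only `scheme` and `network` appear in the 402 example; other fields are left open.
interface PaymentOption {
  scheme: string;
  network: string;
  [field: string]: unknown;
}

interface PaymentRequiredBody {
  x402Version: number;
  error: string;
  accepts: PaymentOption[];
}

// Hypothetical helper: returns the first "exact" option on the requested network, if any.
function pickPaymentOption(
  body: PaymentRequiredBody,
  network: string
): PaymentOption | undefined {
  return body.accepts.find((o) => o.scheme === "exact" && o.network === network);
}

const body: PaymentRequiredBody = {
  x402Version: 1,
  error: "Payment required",
  accepts: [{ scheme: "exact", network: "base" }],
};
console.log(pickPaymentOption(body, "base") !== undefined); // true
```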
POST /extract · x402-fetch example · Agent client integration
// Using x402-fetch (agent client)
import { wrapFetchWithPayment } from "x402-fetch";
import { createWalletClient } from "viem";

const wallet = createWalletClient({ ... }); // Base wallet
const fetch402 = wrapFetchWithPayment(fetch, wallet);

const res = await fetch402("https://docpull.ai/extract", {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({ url: "https://example.com/doc.pdf" }),
});

const { markdown, pageCount } = await res.json();
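An autonomous agent may want a spending cap between the probe and the paid call. The sketch below compares the `costUSDC` string from /probe against a budget, using integer micro-USDC; both helper names are hypothetical, and the decimal-string format is assumed from the probe example.

```typescript
// Hypothetical helper: parses a decimal USDC string (up to 6 decimals) into micro-USDC.
function toMicroUSDC(amount: string): number {
  const match = /^(\d+)(?:\.(\d{1,6}))?$/.exec(amount);
  if (!match) {
    throw new Error(`Not a USDC amount with up to 6 decimals: ${amount}`);
  }
  const whole = Number(match[1]);
  // Right-pad the fractional digits to six places, e.g. "05" -> "050000".
  const fraction = Number(((match[2] || "") + "000000").slice(0, 6));
  return whole * 1000000 + fraction;
}

// Hypothetical guard: true when the probed cost does not exceed the budget.
function withinBudget(costUSDC: string, maxUSDC: string): boolean {
  return toMicroUSDC(costUSDC) <= toMicroUSDC(maxUSDC);
}

// Using the probe response shown earlier and an assumed cap of 0.05 USDC.
console.log(withinBudget("0.012000", "0.05")); // true
```

An agent would call /probe, run this check, and only then issue the paid POST to /extract.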

Pricing

Pay only for
what you use.

No subscriptions. No minimums. USDC on Base settles instantly via x402 — no accounts needed.

How x402 works

The x402 protocol uses the HTTP 402 status code to request payment. Your agent receives the payment requirements, settles USDC on Base, and retries the request — all automatically with an x402-compatible client.
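The request, pay, retry cycle described above can be reduced to one decision per response. The sketch below models only that control flow; `nextAction` is a hypothetical function, and signing and attaching the payment are left to an x402-compatible client.

```typescript
type NextAction = "done" | "pay-and-retry" | "fail";

// Hypothetical decision step: what a client does after each HTTP response.
// `paymentAttached` says whether the request that produced `status` carried a payment.
function nextAction(status: number, paymentAttached: boolean): NextAction {
  if (status >= 200 && status < 300) return "done";               // content delivered
  if (status === 402 && !paymentAttached) return "pay-and-retry"; // settle, then resend once
  return "fail";                                                  // paid and still refused, or another error
}

console.log(nextAction(402, false)); // "pay-and-retry"
console.log(nextAction(200, true));  // "done"
```

Treating a second 402 as a failure keeps a misconfigured client from paying in a loop.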

Learn about x402 →

Testnet available

Use Base Sepolia to test your integration before going to mainnet. Set network=base-sepolia in your client config.

Get testnet USDC →


Why docpull

Unlike every other
PDF extraction tool.

BlazeDocs, pdfRest, LandingAI, and Docling all require accounts, API keys, or subscriptions — none of which an AI agent can set up autonomously. docpull uses x402 v2: any agent with a Base wallet can call it immediately, no human required.

vs BlazeDocs

No account or subscription. Agents call docpull immediately with zero setup.

vs pdfRest

Simpler and cheaper for text PDFs. No API key management. pdfRest wins for OCR.

vs Docling

Hosted API — no infrastructure to run. Docling wins for self-hosted ML accuracy.

vs LandingAI

Fraction of the cost for standard PDFs. LandingAI wins for scanned / visual docs.

Full comparison →

From the blog

Technical guides for
agent builders.

Tutorial · 8 min read

How to Extract PDFs in AI Agent Pipelines Using x402

Build autonomous document ingestion into your agent stack — pay per page via x402 v2.

Deep Dive · 6 min read

PDF to Markdown: Why Structured Output Matters for RAG Pipelines

Why Markdown — not raw text — is the optimal format for feeding PDFs into retrieval systems.

Architecture · 7 min read

Building Agent-Native APIs with x402 Micropayments

How x402 removes every friction point between AI agents and paid APIs.

View all posts →