How to Extract PDFs in AI Agent Pipelines Using x402
AI agents that can autonomously ingest, understand, and act on PDF documents unlock an enormous surface area of real-world use cases: contract review, research synthesis, financial report analysis, and more. But building PDF extraction into an agent pipeline has historically required human setup — API key registration, billing configuration, and credential management that breaks the autonomous loop.
x402 v2 changes this. Combined with a purpose-built extraction API like docpull, an agent can discover, pay for, and use PDF extraction without any human intervention.
This guide walks through the complete pattern, from discovery to extraction.
The problem with traditional PDF APIs
Most PDF extraction services — whether cloud APIs or self-hosted libraries — share a common assumption: a human developer is setting them up. This creates three friction points for autonomous agents:
- Account creation — requires email verification, billing info, and often a manual approval step
- API key management — keys must be provisioned, stored, and rotated by a human
- Subscription billing — monthly charges that require a credit card on file
An AI agent can do none of these things. If your agent hits a paywalled PDF API without pre-configured credentials, the pipeline fails and waits for human intervention.
The x402 solution
The x402 v2 protocol uses the HTTP 402 status code to request payment inline. Instead of "you need an account," the server says "send $0.001 USDC to this address on Base mainnet and retry." An x402-compatible client handles payment settlement automatically, with no human in the loop.
The flow looks like this:
- Agent calls POST /extract with a PDF URL
- Server returns 402 Payment Required with a PAYMENT-REQUIRED header (base64 JSON with payment instructions)
- x402 client decodes the requirements, signs a USDC transfer on Base mainnet via CDP facilitator
- Client retries the request with an X-PAYMENT header
- Server verifies payment and returns the extracted Markdown
The entire sequence takes under a second and costs $0.001 per page.
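The second step of the flow can be sketched as follows. This is a minimal illustration that assumes only what is stated above: the PAYMENT-REQUIRED header carries base64-encoded JSON. The helper name decodePaymentRequired and the sample payload fields are hypothetical, and the payload schema is left untyped rather than guessed. In normal use the x402 client does this decoding for you, so a helper like this is mainly useful for logging or debugging.

```typescript
// Decode a base64-encoded JSON header value into a plain object.
// The result is typed as `unknown` on purpose: the real field names are
// defined by the x402 protocol and are not assumed in this sketch.
function decodePaymentRequired(headerValue: string): unknown {
  const binary = atob(headerValue.trim());
  // atob yields one character per byte; re-decode the bytes as UTF-8
  // so that non-ASCII characters in the JSON survive.
  const bytes = Uint8Array.from(binary, (ch) => ch.charCodeAt(0));
  return JSON.parse(new TextDecoder().decode(bytes));
}

// Usage with a made-up payload (hypothetical fields, illustration only).
const sampleHeader = btoa(JSON.stringify({ note: "example", amount: "1000" }));
console.log(decodePaymentRequired(sampleHeader));
```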
Setting up the x402 client
First, install the required packages:
npm install @x402/fetch @x402/evm viem
Your agent needs a wallet with USDC on Base mainnet. For testing, use Base Sepolia with test USDC from faucet.circle.com.
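import { x402Client, wrapFetchWithPayment } from "@x402/fetch";
import { ExactEvmScheme } from "@x402/evm/exact/client";
import { privateKeyToAccount } from "viem/accounts";

// Load the agent wallet's private key from the environment; never hard-code it.
const privateKey = process.env.EVM_PRIVATE_KEY;
if (!privateKey) throw new Error("EVM_PRIVATE_KEY is not set");
const signer = privateKeyToAccount(privateKey);

// Register the "exact" EVM payment scheme for any EIP-155 chain.
const client = new x402Client()
  .register("eip155:*", new ExactEvmScheme(signer));
// Drop-in fetch replacement that settles payment when a 402 is returned.
const fetchWithPayment = wrapFetchWithPayment(fetch, client);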
The probe-then-extract pattern
Before paying, agents should probe the PDF to check its page count and exact cost. docpull exposes a free /probe endpoint for this:
// Step 1: Check cost before paying (pdfUrl is the document to extract)
const probeRes = await fetch(
  `https://docpull.ai/probe?url=${encodeURIComponent(pdfUrl)}`
);
const { pageCount, costUSDC } = await probeRes.json();
// { pageCount: 12, costUSDC: "0.012000", pricePerPage: "0.001 USDC" }

// Step 2: Decide whether to proceed (maxBudget is your per-document cap in USDC)
if (parseFloat(costUSDC) > maxBudget) {
  return { error: "Document too expensive", costUSDC };
}

// Step 3: Extract with automatic payment
const res = await fetchWithPayment("https://docpull.ai/extract", {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({ url: pdfUrl }),
});
const { markdown, pageCount: pages } = await res.json();
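The budget check in step 2 compares floating-point values. One alternative, sketched here as an assumption rather than anything docpull prescribes, is to convert the decimal strings to integer micro-USDC (USDC has 6 decimal places) before comparing. The helper names usdcToMicros and withinBudget are hypothetical, and the sketch assumes costUSDC is a plain decimal string with at most six fractional digits, as in the sample probe response.

```typescript
const MICROS_PER_USDC = 1000000; // USDC uses 6 decimal places

// Convert a decimal USDC string such as "0.012000" to integer micro-USDC.
function usdcToMicros(amount: string): number {
  const match = /^(\d+)(?:\.(\d{1,6}))?$/.exec(amount.trim());
  if (!match) throw new Error(`Unrecognized USDC amount: ${amount}`);
  // Right-pad the fractional part to exactly six digits ("10" -> "100000").
  const fraction = ((match[2] || "") + "000000").slice(0, 6);
  return Number(match[1]) * MICROS_PER_USDC + Number(fraction);
}

// True when the probed cost does not exceed the configured budget.
function withinBudget(costUSDC: string, maxBudgetUSDC: string): boolean {
  return usdcToMicros(costUSDC) <= usdcToMicros(maxBudgetUSDC);
}

console.log(usdcToMicros("0.012000")); // 12000
console.log(withinBudget("0.012000", "0.10")); // true
```

Integer comparison avoids edge cases where two decimal amounts that should be equal differ slightly once parsed as floats.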
Integrating with agent frameworks
LangChain / LangGraph
Wrap the extraction logic as a tool:
import { tool } from "@langchain/core/tools";
import { z } from "zod";

const extractPdfTool = tool(
  async ({ url, maxCostUSDC = 0.10 }) => {
    const probe = await fetch(
      `https://docpull.ai/probe?url=${encodeURIComponent(url)}`
    ).then(r => r.json());
    if (parseFloat(probe.costUSDC) > maxCostUSDC) {
      return `Document too large: ${probe.pageCount} pages, costs ${probe.costUSDC} USDC`;
    }
    const res = await fetchWithPayment("https://docpull.ai/extract", {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({ url }),
    });
    const { markdown } = await res.json();
    return markdown;
  },
  {
    name: "extract_pdf",
    description: "Extract text from a PDF URL as Markdown. Costs $0.001 USDC per page.",
    schema: z.object({
      url: z.string().url().describe("Public URL of the PDF"),
      maxCostUSDC: z.number().optional().describe("Max spend in USDC"),
    }),
  }
);
Via MCP
docpull also exposes an MCP server at https://docpull.ai/mcp. Add it to Claude, ChatGPT, or any other MCP-compatible client:
{
  "type": "streamable-http",
  "url": "https://docpull.ai/mcp"
}
The MCP server exposes probe_pdf, extract_pdf, and health_check tools that agents can call natively.
Production considerations
- Wallet funding: keep a buffer of USDC on Base. At $0.001 per page, $1 covers 1,000 pages, so a $10 balance covers roughly 10,000 pages
- Error handling: handle 402 responses explicitly. If payment fails, a blind retry will fail too, so surface the error instead of looping
- PDF accessibility: docpull fetches PDFs directly, so URLs must be publicly accessible and not behind authentication
- Timeout: large PDFs can take up to 30 seconds. Set your agent's request timeout accordingly
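The error-handling and timeout points can be combined in one wrapper. The following is a minimal sketch under stated assumptions, not docpull's or x402's own API: extractWithTimeout, FetchLike, and the minimal response and init types are hypothetical names introduced here, and the 30-second default simply mirrors the figure above. It accepts any fetch-compatible function, so fetchWithPayment from the setup section can be passed in.

```typescript
// Minimal shapes so the sketch works with fetchWithPayment or a test stub.
interface MinimalResponse {
  status: number;
  ok: boolean;
  json(): Promise<unknown>;
}
interface ExtractInit {
  method: string;
  headers: Record<string, string>;
  body: string;
  signal: AbortSignal;
}
type FetchLike = (url: string, init: ExtractInit) => Promise<MinimalResponse>;

async function extractWithTimeout(
  fetchImpl: FetchLike,
  pdfUrl: string,
  timeoutMs = 30000, // large PDFs can take up to 30 seconds
): Promise<string> {
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), timeoutMs);
  try {
    const res = await fetchImpl("https://docpull.ai/extract", {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({ url: pdfUrl }),
      signal: controller.signal,
    });
    if (res.status === 402) {
      // Still 402 after the payment-aware client ran: payment did not settle.
      throw new Error("Payment failed (HTTP 402); check the wallet balance");
    }
    if (!res.ok) throw new Error(`Extraction failed: HTTP ${res.status}`);
    const { markdown } = (await res.json()) as { markdown: string };
    return markdown;
  } finally {
    clearTimeout(timer); // always release the timer, success or failure
  }
}

// Usage (inside an async function):
// const markdown = await extractWithTimeout(fetchWithPayment, pdfUrl);
```

Throwing on a residual 402 rather than retrying keeps the agent from looping on a payment that cannot succeed.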
Discovery via CDP Bazaar
docpull is indexed in the CDP Bazaar — the discovery layer for x402-enabled APIs. Agents that query the Bazaar for "PDF extraction" will find docpull autonomously and can integrate it without any manual configuration:
# Search the Bazaar
curl "https://api.cdp.coinbase.com/platform/v2/x402/discovery/search?query=pdf+extraction"
This means your agent infrastructure can self-configure — given a task that requires PDF extraction, an agent can discover, evaluate, and integrate docpull on its own.
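For agents that build the discovery request in code rather than shelling out to curl, the same URL can be constructed as below. This is a small sketch: buildBazaarSearchUrl is a hypothetical helper, the endpoint and query parameter are taken from the curl command above, and the response format is not assumed here.

```typescript
const BAZAAR_SEARCH_URL =
  "https://api.cdp.coinbase.com/platform/v2/x402/discovery/search";

// Build the Bazaar search URL for a free-text query.
// URLSearchParams form-encodes the value, so spaces become "+".
function buildBazaarSearchUrl(query: string): string {
  const params = new URLSearchParams({ query });
  return `${BAZAAR_SEARCH_URL}?${params.toString()}`;
}

console.log(buildBazaarSearchUrl("pdf extraction"));
// https://api.cdp.coinbase.com/platform/v2/x402/discovery/search?query=pdf+extraction
```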
The /probe endpoint is always free. Start there.