Tutorial

How to Extract PDFs in AI Agent Pipelines Using x402

May 2026 · 8 min read · docpull

AI agents that can autonomously ingest, understand, and act on PDF documents unlock an enormous surface area of real-world use cases: contract review, research synthesis, financial report analysis, and more. But building PDF extraction into an agent pipeline has historically required human setup — API key registration, billing configuration, and credential management that breaks the autonomous loop.

x402 v2 changes this. Combined with a purpose-built extraction API like docpull, an agent can discover, pay for, and use PDF extraction without any human intervention.

This guide walks through the complete pattern, from discovery to extraction.

The problem with traditional PDF APIs

Most PDF extraction services, whether cloud APIs or self-hosted libraries, share a common assumption: a human developer is setting them up. This creates three friction points for autonomous agents:

  1. API key registration
  2. Billing configuration
  3. Credential management

An AI agent can do none of these things on its own. If your agent hits a paywalled PDF API without pre-configured credentials, the pipeline fails and waits for human intervention.

The x402 solution

The x402 v2 protocol uses the HTTP 402 status code to request payment inline. Instead of "you need an account," the server says "send $0.001 USDC to this address on Base mainnet and retry." An x402-compatible client handles payment settlement automatically, with no human in the loop.

The flow looks like this:

  1. Agent calls POST /extract with a PDF URL
  2. Server returns 402 Payment Required with a PAYMENT-REQUIRED header (base64 JSON with payment instructions)
  3. x402 client decodes the requirements and signs a USDC transfer on Base mainnet via the CDP facilitator
  4. Client retries the request with a PAYMENT-SIGNATURE header (the x402 v2 name for the v1 X-PAYMENT header)
  5. Server verifies payment and returns the extracted Markdown

The entire sequence typically completes in under a second, and extraction costs $0.001 USDC per page.
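The five steps above can be sketched as a single retry wrapper. This is a simplified illustration of the flow, not how `@x402/fetch` is implemented: `fetchFn`, `signPayment`, and `paymentHeaderName` are hypothetical injected parameters, and the actual payment signing is assumed to happen inside `signPayment`.

```javascript
// Decode the base64 JSON carried in the PAYMENT-REQUIRED header.
function decodePaymentRequired(headerValue) {
  const bytes = Uint8Array.from(atob(headerValue), (ch) => ch.charCodeAt(0));
  return JSON.parse(new TextDecoder().decode(bytes));
}

// One paid round trip: call, pay on 402, retry once.
// fetchFn, signPayment and paymentHeaderName are assumptions for illustration.
async function fetchWithSinglePaidRetry(url, init, options) {
  const { fetchFn, signPayment, paymentHeaderName } = options;

  // Steps 1-2: first attempt; anything other than 402 is returned untouched.
  const first = await fetchFn(url, init);
  if (first.status !== 402) return first;

  // Step 3: read the payment instructions and produce a signed payment payload.
  const encoded = first.headers.get("PAYMENT-REQUIRED");
  if (!encoded) throw new Error("402 response without a PAYMENT-REQUIRED header");
  const requirements = decodePaymentRequired(encoded);
  const paymentHeaderValue = await signPayment(requirements);

  // Step 4: retry the same request with the payment header attached.
  // Step 5 (verification) happens on the server before it responds.
  return fetchFn(url, {
    ...init,
    headers: { ...(init?.headers ?? {}), [paymentHeaderName]: paymentHeaderValue },
  });
}
```

In practice the x402 client library does this work for you; the sketch is only meant to make the request and retry sequence concrete.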

Setting up the x402 client

First, install the required packages:

npm install @x402/fetch @x402/evm viem

Your agent needs a wallet with USDC on Base mainnet. For testing, use Base Sepolia with test USDC from faucet.circle.com.

import { x402Client, wrapFetchWithPayment } from "@x402/fetch";
import { ExactEvmScheme } from "@x402/evm/exact/client";
import { privateKeyToAccount } from "viem/accounts";

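// Load the agent's signing key from the environment; never hard-code it.
const signer = privateKeyToAccount(process.env.EVM_PRIVATE_KEY);
// Register the EVM payment scheme for every EIP-155 network.
const client = new x402Client()
  .register("eip155:*", new ExactEvmScheme(signer));

// fetchWithPayment behaves like fetch but handles 402 responses automatically.
const fetchWithPayment = wrapFetchWithPayment(fetch, client);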
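Because the agent runs unattended, it can help to fail fast when the key is missing or malformed rather than at the first payment. The helper below is a minimal sketch; `readPrivateKey` is a hypothetical name, and the only assumption about the key format is that it is 32 bytes of hex, optionally prefixed with `0x`.

```javascript
// Hypothetical helper: validate the private key before handing it to the signer.
// The error messages deliberately never include the key itself.
function readPrivateKey(env, name = "EVM_PRIVATE_KEY") {
  const value = env[name];
  if (typeof value !== "string" || value.trim().length === 0) {
    throw new Error(`${name} is not set`);
  }
  const trimmed = value.trim();
  const key = trimmed.startsWith("0x") ? trimmed : `0x${trimmed}`;
  if (!/^0x[0-9a-fA-F]{64}$/.test(key)) {
    throw new Error(`${name} must be 32 bytes of hex`);
  }
  return key;
}
```

With this in place, the signer line could become `privateKeyToAccount(readPrivateKey(process.env))`.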

The probe-then-extract pattern

Before paying, agents should probe the PDF to check its page count and exact cost. docpull exposes a free /probe endpoint for this:

// Step 1: Check cost before paying
const probeRes = await fetch(
  `https://docpull.ai/probe?url=${encodeURIComponent(pdfUrl)}`
);
const { pageCount, costUSDC } = await probeRes.json();
// { pageCount: 12, costUSDC: "0.012000", pricePerPage: "0.001 USDC" }

// Step 2: Decide whether to proceed
// maxBudget is your own per-document spending cap in USDC, for example 0.10
if (parseFloat(costUSDC) > maxBudget) {
  return { error: "Document too expensive", costUSDC };
}

// Step 3: Extract with automatic payment
const res = await fetchWithPayment("https://docpull.ai/extract", {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({ url: pdfUrl }),
});

const { markdown, pageCount: pages } = await res.json();

The probe step is important for budget management. Without it, an agent can unknowingly extract a 200-page document and incur a $0.20 charge in a single call. Always probe first.
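The same pattern can be packaged as one function with the budget check done in integer micro-USDC (USDC has six decimal places), which avoids floating-point comparisons. This is a sketch under assumptions: `toMicroUSDC`, `withinBudget`, and `extractWithinBudget` are hypothetical names, the two fetch functions are injected, and the response fields are the ones shown in the snippet above.

```javascript
// Convert a decimal USDC amount such as "0.012000" to integer micro-USDC.
function toMicroUSDC(amount) {
  const match = /^(\d+)(?:\.(\d{1,6}))?$/.exec(String(amount).trim());
  if (!match) throw new Error(`Not a USDC amount: ${amount}`);
  const whole = BigInt(match[1]);
  const fraction = BigInt((match[2] ?? "").padEnd(6, "0"));
  return whole * 1000000n + fraction;
}

function withinBudget(costUSDC, maxBudgetUSDC) {
  return toMicroUSDC(costUSDC) <= toMicroUSDC(maxBudgetUSDC);
}

// Probe first, pay only if the quoted cost fits the budget.
// probeFetch is a plain fetch; paidFetch is the payment-wrapped fetch.
async function extractWithinBudget(pdfUrl, maxBudgetUSDC, deps) {
  const { probeFetch, paidFetch, baseUrl = "https://docpull.ai" } = deps;

  const probeRes = await probeFetch(`${baseUrl}/probe?url=${encodeURIComponent(pdfUrl)}`);
  if (!probeRes.ok) {
    return { ok: false, reason: `probe failed with HTTP ${probeRes.status}` };
  }
  const { pageCount, costUSDC } = await probeRes.json();

  if (!withinBudget(costUSDC, maxBudgetUSDC)) {
    return { ok: false, reason: "over budget", pageCount, costUSDC };
  }

  const res = await paidFetch(`${baseUrl}/extract`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ url: pdfUrl }),
  });
  if (!res.ok) {
    return { ok: false, reason: `extract failed with HTTP ${res.status}`, pageCount, costUSDC };
  }
  const { markdown } = await res.json();
  return { ok: true, markdown, pageCount, costUSDC };
}
```

A call might look like `extractWithinBudget(pdfUrl, "0.10", { probeFetch: fetch, paidFetch: fetchWithPayment })`. Returning a result object instead of throwing lets the agent decide what to do with an over-budget document.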

Integrating with agent frameworks

LangChain / LangGraph

Wrap the extraction logic as a tool:

import { tool } from "@langchain/core/tools";
import { z } from "zod";

const extractPdfTool = tool(
  async ({ url, maxCostUSDC = 0.10 }) => {
    const probe = await fetch(
      `https://docpull.ai/probe?url=${encodeURIComponent(url)}`
    ).then(r => r.json());

    if (parseFloat(probe.costUSDC) > maxCostUSDC) {
      return `Document too large: ${probe.pageCount} pages, costs ${probe.costUSDC} USDC`;
    }

    const res = await fetchWithPayment("https://docpull.ai/extract", {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({ url }),
    });

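    // Report failures to the model as text instead of throwing on a non-JSON error body.
    if (!res.ok) {
      return `Extraction failed: HTTP ${res.status}`;
    }

    const { markdown } = await res.json();
    return markdown;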
  },
  {
    name: "extract_pdf",
    description: "Extract text from a PDF URL as Markdown. Costs $0.001 USDC per page.",
    schema: z.object({
      url: z.string().url().describe("Public URL of the PDF"),
      maxCostUSDC: z.number().optional().describe("Max spend in USDC"),
    }),
  }
);
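The per-call `maxCostUSDC` limit does not cap what an agent spends across many tool calls in one run. One possible addition, sketched here with hypothetical names (`SpendLedger`, `costForPages`), is a small in-memory ledger that the tool consults before paying. Amounts are integer micro-USDC, so the $0.001 per-page price is 1000 units.

```javascript
// $0.001 USDC per page, expressed in micro-USDC (6 decimal places).
const PRICE_PER_PAGE_MICRO_USDC = 1000n;

function costForPages(pageCount) {
  return BigInt(pageCount) * PRICE_PER_PAGE_MICRO_USDC;
}

// Hypothetical in-memory ledger that caps total spend for one agent run.
class SpendLedger {
  constructor(capMicroUSDC) {
    this.cap = BigInt(capMicroUSDC);
    this.spent = 0n;
  }

  // Reserve the cost if it fits under the cap; return whether it was reserved.
  tryReserve(costMicroUSDC) {
    const cost = BigInt(costMicroUSDC);
    if (cost < 0n) throw new Error("cost must not be negative");
    if (this.spent + cost > this.cap) return false;
    this.spent += cost;
    return true;
  }

  remaining() {
    return this.cap - this.spent;
  }
}
```

Inside the tool, a check such as `ledger.tryReserve(costForPages(probe.pageCount))` could sit next to the existing per-call comparison. The ledger is not persisted, so it only protects a single process.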

Via MCP

docpull also exposes an MCP server at https://docpull.ai/mcp. Add it to Claude, ChatGPT, or any other MCP-compatible client:

{
  "type": "streamable-http",
  "url": "https://docpull.ai/mcp"
}

The MCP server exposes probe_pdf, extract_pdf, and health_check tools that agents can call natively.
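Under the hood, an MCP client invokes a tool by sending a JSON-RPC 2.0 request with the method `tools/call`. The sketch below builds such a request for `probe_pdf`. The `buildToolCall` helper is hypothetical, and the `url` argument name is an assumption; the server's own tool listing is the authority on each tool's input schema.

```javascript
// Build the JSON-RPC 2.0 message an MCP client sends to call a tool.
function buildToolCall(id, toolName, args) {
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: { name: toolName, arguments: args },
  };
}

// Assumed argument name: "url" (check the tool's published input schema).
const probeRequest = buildToolCall(1, "probe_pdf", {
  url: "https://example.com/report.pdf",
});
```

Most MCP clients construct this message for you, so the sketch is mainly useful for debugging or for reading server logs.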

Production considerations

Discovery via CDP Bazaar

docpull is indexed in the CDP Bazaar — the discovery layer for x402-enabled APIs. Agents that query the Bazaar for "PDF extraction" will find docpull autonomously and can integrate it without any manual configuration:

# Search the Bazaar
curl "https://api.cdp.coinbase.com/platform/v2/x402/discovery/search?query=pdf+extraction"

This means your agent infrastructure can self-configure — given a task that requires PDF extraction, an agent can discover, evaluate, and integrate docpull on its own.
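An agent issuing the same search from code needs to encode the query safely. The sketch below only builds the URL shown in the curl command; `bazaarSearchUrl` is a hypothetical helper name, and the shape of the response is not covered here.

```javascript
// Build the Bazaar search URL from the curl example, encoding the query term.
function bazaarSearchUrl(query) {
  const url = new URL(
    "https://api.cdp.coinbase.com/platform/v2/x402/discovery/search"
  );
  url.searchParams.set("query", query);
  return url.toString();
}
```

Calling `bazaarSearchUrl("pdf extraction")` yields the same URL as the curl command, with the space encoded as `+`.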

docpull is live at docpull.ai. The /probe endpoint is always free. Start there.