PaperPipe API

Extract structured data from PDF documents with a simple REST API

Auto-detection | Webhooks | JSON API

POST https://paperpipe.app/api/v1/developer/extract

Quick Start

Request
Send a PDF URL and get structured JSON data back
bash
curl -X POST "https://paperpipe.app/api/v1/developer/extract" \
  -H "X-API-Key: YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "document_url": "https://example.com/invoice.pdf"
  }'
Response
Get structured data with auto-detected document type
json
{
  "success": true,
  "detected_type": "invoice",
  "confidence": 0.95,
  "data": {
    "invoice_number": "INV-2024-001",
    "vendor": "Acme Corp",
    "total": 1250.00,
    "date": "2024-01-15",
    "line_items": [
      {
        "description": "Professional Services",
        "quantity": 10,
        "unit_price": 125.00,
        "amount": 1250.00
      }
    ]
  },
  "pages_processed": 1,
  "processing_time_ms": 847,
  "credits_used": 1
}
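
As a rough illustration of consuming this response, the sketch below parses the sample body and checks that the line item amounts add up to the invoice total. The helper name `summarize_invoice` is hypothetical, and the field names are taken only from the sample above, so other document types may not carry `total` or `line_items`.

```python
import json
from decimal import Decimal

# Sample body copied from the response shown above.
SAMPLE_RESPONSE = """
{
  "success": true,
  "detected_type": "invoice",
  "confidence": 0.95,
  "data": {
    "invoice_number": "INV-2024-001",
    "vendor": "Acme Corp",
    "total": 1250.00,
    "date": "2024-01-15",
    "line_items": [
      {"description": "Professional Services", "quantity": 10,
       "unit_price": 125.00, "amount": 1250.00}
    ]
  },
  "pages_processed": 1,
  "processing_time_ms": 847,
  "credits_used": 1
}
"""


def summarize_invoice(raw: str) -> dict:
    """Parse an extract response and cross-check line items against the total.

    Hypothetical helper: it assumes the invoice fields shown in the sample.
    """
    # parse_float=Decimal keeps currency amounts exact instead of binary floats.
    body = json.loads(raw, parse_float=Decimal)
    if not body.get("success"):
        raise ValueError("extraction did not succeed")
    data = body["data"]
    line_total = sum(
        (item["amount"] for item in data.get("line_items", [])), Decimal("0")
    )
    return {
        "type": body["detected_type"],
        "confidence": body["confidence"],
        "total": data["total"],
        "line_items_match_total": line_total == data["total"],
    }


summary = summarize_invoice(SAMPLE_RESPONSE)
print(summary)
```

A check like this is a cheap way to flag extractions that deserve a manual look before the data reaches an accounting system.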

Document Input Methods

URL Upload

Provide a publicly accessible URL to your PDF document

{
  "document_url": "https://..."
}
Base64 Upload

Send PDF content as base64-encoded string

{
  "document_base64": "JVBERi0..."
}
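
A minimal sketch of preparing the `document_base64` field in Python, using only the standard library. The helper name `build_base64_payload` is hypothetical, and the `%PDF-` header check is a local sanity test rather than something the API is documented to require.

```python
import base64


def build_base64_payload(pdf_bytes: bytes) -> dict:
    """Return a JSON-serializable request body carrying the PDF inline."""
    # A well-formed PDF starts with the marker "%PDF-"; catching an obviously
    # wrong file here avoids spending a request on it.
    if not pdf_bytes.startswith(b"%PDF-"):
        raise ValueError("not a PDF: missing %PDF- header")
    encoded = base64.b64encode(pdf_bytes).decode("ascii")
    return {"document_base64": encoded}


# Stand-in bytes for illustration; a real call would read the file from disk.
payload = build_base64_payload(b"%PDF-1.4\n% tiny stand-in, not a real document\n")
print(payload["document_base64"][:7])  # JVBERi0
```

In practice the bytes would come from something like `pathlib.Path("invoice.pdf").read_bytes()`. Base64 encoding grows the payload by roughly a third, so URL or multipart upload is usually the better fit for large files.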
Multipart Upload

Upload file directly using multipart/form-data

Content-Type: multipart/form-data
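
The form field name for the file is not stated here, so the sketch below treats it as an assumption (`file`) that should be checked against the API reference. It builds a single-part multipart/form-data body with the standard library; `encode_multipart_pdf` is a hypothetical helper name.

```python
import uuid


def encode_multipart_pdf(pdf_bytes: bytes, filename: str, field_name: str = "file"):
    """Build a multipart/form-data body containing one PDF part.

    Returns (content_type_header_value, body_bytes).
    The default field_name "file" is an assumption, not a documented value.
    """
    # A random boundary makes a collision with the file contents very unlikely.
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field_name}"; filename="{filename}"\r\n'
        "Content-Type: application/pdf\r\n"
        "\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("ascii")
    return f"multipart/form-data; boundary={boundary}", head + pdf_bytes + tail


content_type, body = encode_multipart_pdf(b"%PDF-1.4\n", "invoice.pdf")
print(content_type.split(";")[0])  # multipart/form-data
print(len(body), "bytes")
```

The returned content type goes in the `Content-Type` header next to `X-API-Key`, and the bytes become the POST body. Most HTTP client libraries can produce the same encoding for you, which is usually preferable to building it by hand.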

API Examples

Auto Detection
Let AI automatically detect the document type and extract relevant fields

Send document via publicly accessible URL:

bash
curl -X POST "https://paperpipe.app/api/v1/developer/extract" \
  -H "X-API-Key: YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "document_url": "https://example.com/invoice.pdf"
  }'
Specify Document Type
Use a specific schema for consistent results (invoice, contract, receipt, etc.)
bash
curl -X POST "https://paperpipe.app/api/v1/developer/extract" \
  -H "X-API-Key: YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "document_url": "https://example.com/contract.pdf",
    "schema": "contract"
  }'

Available schemas:

invoice
receipt
contract
purchase_order
bank_statement
tax_document
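
For readers working in Python rather than curl, here is a hedged equivalent of the two requests above, built with the standard library. `build_extract_request` is a hypothetical helper; the endpoint, headers, and schema names are taken from this page, and validating the schema name locally is a convenience, not documented API behavior.

```python
import json
import urllib.request

EXTRACT_URL = "https://paperpipe.app/api/v1/developer/extract"

# Schema names as listed on this page.
KNOWN_SCHEMAS = {
    "invoice", "receipt", "contract",
    "purchase_order", "bank_statement", "tax_document",
}


def build_extract_request(api_key, document_url, schema=None):
    """Build (but do not send) a POST request for the extract endpoint.

    Leaving schema as None relies on auto-detection.
    """
    body = {"document_url": document_url}
    if schema is not None:
        if schema not in KNOWN_SCHEMAS:
            raise ValueError(f"unknown schema: {schema!r}")
        body["schema"] = schema
    return urllib.request.Request(
        EXTRACT_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={"X-API-Key": api_key, "Content-Type": "application/json"},
        method="POST",
    )


request = build_extract_request(
    "YOUR_API_KEY", "https://example.com/contract.pdf", schema="contract"
)
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```

Sending it is a matter of `urllib.request.urlopen(request, timeout=30)` and decoding the JSON body of the reply. Building the request separately from sending it keeps the payload logic easy to test without network access.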
Async Processing with Webhooks
Process large documents in the background and get notified when complete

Request

bash
curl -X POST "https://paperpipe.app/api/v1/developer/extract" \
  -H "X-API-Key: YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "document_url": "https://example.com/large-doc.pdf",
    "webhook_url": "https://yourapp.com/webhook",
    "async_processing": true
  }'

Immediate Response

json
{
  "success": true,
  "job_id": "job_abc123def456",
  "status": "processing",
  "estimated_completion": "2024-01-15T10:32:00Z"
}

💡 Webhook Notification: When processing completes, we POST the full results, including the extracted data, to your webhook URL.
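
The receiving side of a webhook might look like the sketch below, which uses Python's built-in HTTP server. The exact shape of the webhook payload is not specified here, so the handler only assumes a JSON object and reads `job_id` and `success` defensively; those two keys are guesses based on the responses shown above. `parse_webhook_body`, `WebhookHandler`, and `serve` are hypothetical names.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def parse_webhook_body(raw: bytes) -> dict:
    """Decode a webhook POST body; raises ValueError if it is not a JSON object."""
    payload = json.loads(raw.decode("utf-8"))
    if not isinstance(payload, dict):
        raise ValueError("expected a JSON object")
    return payload


class WebhookHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        try:
            payload = parse_webhook_body(self.rfile.read(length))
        except ValueError:
            # Malformed or non-object JSON: reject without processing.
            self.send_response(400)
            self.end_headers()
            return
        # Assumption: the notification carries job_id and success like the
        # responses above. .get() avoids a crash if it does not.
        print("job:", payload.get("job_id"), "success:", payload.get("success"))
        # Acknowledge quickly; heavy work belongs in a background task.
        self.send_response(204)
        self.end_headers()


def serve(port: int = 8000) -> None:
    """Run the receiver until interrupted."""
    HTTPServer(("0.0.0.0", port), WebhookHandler).serve_forever()
```

Calling `serve()` starts the listener. A production receiver would also sit behind HTTPS and verify that each request genuinely comes from PaperPipe; how to do that is not covered on this page.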

What You Can Extract

Invoices

Invoice numbers, vendors, totals, line items, dates, tax amounts

invoice_number: "INV-001"
vendor: "Acme Corp"
total: 1250.00
line_items: [...]
Contracts

Parties, effective dates, terms, payment info, key clauses

parties: ["Company A", "Company B"]
effective_date: "2024-01-01"
terms: "12 months"
payment_terms: {...}
Receipts

Store names, items, totals, tax, payment methods, timestamps

store_name: "Target"
total: 45.67
payment_method: "Credit Card"
items: [...]

Error Handling

Error Response Format
json
{
  "success": false,
  "error": {
    "code": "INVALID_API_KEY",
    "message": "The provided API key is invalid or expired"
  }
}
Common Error Codes
INVALID_API_KEY: Invalid API key
FILE_TOO_LARGE: File exceeds 50MB limit
UNSUPPORTED_FILE: File is not a valid PDF
RATE_LIMIT_EXCEEDED: Too many requests
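
One possible way to act on these codes is sketched below: the error body is turned into an exception, and only `RATE_LIMIT_EXCEEDED` is retried, with exponential backoff. The names `PaperPipeError`, `unwrap_response`, and `call_with_retry` are hypothetical, the choice of which codes to retry is an assumption, and this page does not say which HTTP status accompanies each code, so the sketch works on the response body alone.

```python
import json
import time


class PaperPipeError(Exception):
    """Raised when a response body reports success: false."""

    def __init__(self, code, message):
        super().__init__(f"{code}: {message}")
        self.code = code
        self.message = message


# Assumption: rate limiting is transient; the other listed codes will fail
# again on retry, so they are raised immediately.
RETRYABLE_CODES = {"RATE_LIMIT_EXCEEDED"}


def unwrap_response(raw: str) -> dict:
    """Return the parsed body on success, raise PaperPipeError otherwise."""
    body = json.loads(raw)
    if body.get("success"):
        return body
    error = body.get("error", {})
    raise PaperPipeError(error.get("code", "UNKNOWN"), error.get("message", ""))


def call_with_retry(send, max_attempts=4, base_delay=1.0, sleep=time.sleep):
    """Call send() (returns a raw JSON string) and retry retryable errors.

    Waits base_delay, then 2x, 4x, ... between attempts.
    """
    for attempt in range(max_attempts):
        try:
            return unwrap_response(send())
        except PaperPipeError as exc:
            if exc.code not in RETRYABLE_CODES or attempt == max_attempts - 1:
                raise
            sleep(base_delay * 2 ** attempt)


# Demo with canned responses instead of real HTTP calls.
canned = iter([
    '{"success": false, "error": {"code": "RATE_LIMIT_EXCEEDED", "message": "Too many requests"}}',
    '{"success": true, "detected_type": "invoice"}',
])
waits = []
result = call_with_retry(lambda: next(canned), sleep=waits.append)
print(result["detected_type"], waits)  # invoice [1.0]
```

Passing `sleep` as a parameter keeps the retry logic testable without real delays. In real use, `send` would wrap the HTTP call and return the response text.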

SDKs & Libraries

Python SDK (Official)
Full-featured SDK with auto-detection and webhook support
bash
pip install paperpipe
python
import paperpipe

client = paperpipe.PaperPipeClient()
result = client.extract("https://example.com/invoice.pdf")

print(f"Type: {result.detected_type}")
print(f"Data: {result.data}")
More SDKs Coming
Node.js, Go, and other language SDKs in development
Node.js: Coming Soon
Go: Coming Soon
Ruby: Planned

For now, use the REST API directly with any HTTP client

Ready to get started?

Get your free API key and start extracting data from PDFs in minutes. 50 free pages per month, no credit card required.