Extract structured data from PDF documents with a simple REST API
POST paperpipe.app/api/v1/developer/extract
curl -X POST "https://paperpipe.app/api/v1/developer/extract" \
-H "X-API-Key: YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"document_url": "https://example.com/invoice.pdf"
}'
{
"success": true,
"detected_type": "invoice",
"confidence": 0.95,
"data": {
"invoice_number": "INV-2024-001",
"vendor": "Acme Corp",
"total": 1250.00,
"date": "2024-01-15",
"line_items": [
{
"description": "Professional Services",
"quantity": 10,
"unit_price": 125.00,
"amount": 1250.00
}
]
},
"pages_processed": 1,
"processing_time_ms": 847,
"credits_used": 1
}
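The same request and response can be sketched in Python using only the standard library. This is a minimal illustration, not an official client: the helper names (`build_extract_request`, `send`, `summarize`) are hypothetical, and only the endpoint, headers, and response fields shown above are taken from this page.

```python
import json
import urllib.request

API_URL = "https://paperpipe.app/api/v1/developer/extract"


def build_extract_request(api_key: str, document_url: str) -> urllib.request.Request:
    """Build (but do not send) the POST request from the curl example."""
    body = json.dumps({"document_url": document_url}).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        headers={"X-API-Key": api_key, "Content-Type": "application/json"},
        method="POST",
    )


def send(req: urllib.request.Request, timeout: float = 30.0) -> dict:
    """Send the request and decode the JSON response body."""
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.load(resp)


def summarize(response: dict) -> str:
    """Pull a few fields out of a successful invoice response."""
    data = response["data"]
    return f'{response["detected_type"]} {data["invoice_number"]}: {data["total"]:.2f}'


# Usage (performs a real network call, so it is left commented out):
# req = build_extract_request("YOUR_API_KEY", "https://example.com/invoice.pdf")
# print(summarize(send(req)))
```

Keeping request construction separate from sending makes the payload easy to inspect or unit-test without touching the network.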
URL: Provide a publicly accessible URL to your PDF document.
Base64: Send the PDF content as a base64-encoded string.
File upload: Upload the file directly using multipart/form-data.
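For the base64 option, the PDF bytes have to be encoded before they go into the JSON body. The sketch below shows only that encoding step. Note that this page does not state the request field name for base64 input, so `document_base64` is a hypothetical placeholder; check the API reference for the real name.

```python
import base64
import json

# ASSUMPTION: the real field name for base64 input is not given on this page;
# "document_base64" is a hypothetical placeholder.
BASE64_FIELD = "document_base64"


def base64_payload(pdf_bytes: bytes) -> str:
    """Return a JSON request body carrying the PDF as a base64 string."""
    encoded = base64.b64encode(pdf_bytes).decode("ascii")
    return json.dumps({BASE64_FIELD: encoded})


# Usage: read a local file and build the body.
# with open("invoice.pdf", "rb") as fh:
#     body = base64_payload(fh.read())
```

Base64 inflates the payload by roughly a third, which is worth keeping in mind for documents near the size limit.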
Send the document via a publicly accessible URL:
curl -X POST "https://paperpipe.app/api/v1/developer/extract" \
-H "X-API-Key: YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"document_url": "https://example.com/invoice.pdf"
}'
To extract with a specific schema, add the schema field to the request:
curl -X POST "https://paperpipe.app/api/v1/developer/extract" \
-H "X-API-Key: YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"document_url": "https://example.com/contract.pdf",
"schema": "contract"
}'
Available schemas:
curl -X POST "https://paperpipe.app/api/v1/developer/extract" \
-H "X-API-Key: YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"document_url": "https://example.com/large-doc.pdf",
"webhook_url": "https://yourapp.com/webhook",
"async_processing": true
}'
{
"success": true,
"job_id": "job_abc123def456",
"status": "processing",
"estimated_completion": "2024-01-15T10:32:00Z"
}
💡 Webhook Notification: When processing completes, we'll POST the full results, including the extracted data, to your webhook URL.
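A webhook receiver for these notifications might look like the following standard-library sketch. It assumes the webhook body is a JSON object with the same `success` and `data` fields as the synchronous response; the exact webhook payload is not documented on this page. The names `parse_webhook`, `WebhookHandler`, and `serve` are illustrative.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def parse_webhook(body: bytes) -> dict:
    """Decode a webhook body; raises ValueError if it is not a JSON object."""
    payload = json.loads(body.decode("utf-8"))
    if not isinstance(payload, dict):
        raise ValueError("expected a JSON object")
    return payload


class WebhookHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        try:
            payload = parse_webhook(self.rfile.read(length))
        except ValueError:  # also covers malformed JSON and bad UTF-8
            self.send_response(400)
            self.end_headers()
            return
        # ASSUMPTION: same shape as the synchronous response.
        if payload.get("success"):
            print("extracted fields:", sorted(payload.get("data", {})))
        else:
            print("extraction failed:", payload.get("error"))
        self.send_response(204)  # acknowledge receipt with no body
        self.end_headers()


def serve(port: int = 8000) -> None:
    """Run the receiver until interrupted."""
    HTTPServer(("", port), WebhookHandler).serve_forever()
```

Acknowledging quickly and doing any heavy work afterwards is a common choice for webhook endpoints, since senders often time out or retry on slow responses.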
Invoices: Invoice numbers, vendors, totals, line items, dates, tax amounts
Contracts: Parties, effective dates, terms, payment info, key clauses
Receipts: Store names, items, totals, tax, payment methods, timestamps
{
"success": false,
"error": {
"code": "INVALID_API_KEY",
"message": "The provided API key is invalid or expired"
}
}
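An error envelope like the one above can be turned into an exception on the client side. The sketch below is one possible approach, not part of any official SDK: `ExtractError` and `check_response` are hypothetical names, `UNKNOWN` is a local fallback rather than an API code, and treating `RATE_LIMIT_EXCEEDED` as retryable is an assumption about how callers would want to react.

```python
# ASSUMPTION: rate-limit errors are worth retrying; other codes are not.
RETRYABLE_CODES = {"RATE_LIMIT_EXCEEDED"}


class ExtractError(Exception):
    """Raised when the API returns success: false."""

    def __init__(self, code: str, message: str):
        super().__init__(f"{code}: {message}")
        self.code = code
        self.retryable = code in RETRYABLE_CODES


def check_response(payload: dict) -> dict:
    """Return the payload if it succeeded, otherwise raise ExtractError."""
    if payload.get("success"):
        return payload
    error = payload.get("error") or {}
    # "UNKNOWN" is a local fallback for responses without an error code.
    raise ExtractError(error.get("code", "UNKNOWN"), error.get("message", ""))
```

Callers can then branch on `exc.code` or `exc.retryable` instead of inspecting raw JSON at every call site.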
INVALID_API_KEY: Invalid API key
FILE_TOO_LARGE: File exceeds 50MB limit
UNSUPPORTED_FILE: File is not a valid PDF
RATE_LIMIT_EXCEEDED: Too many requests
pip install paperpipe
import paperpipe
client = paperpipe.PaperPipeClient()
result = client.extract("https://example.com/invoice.pdf")
print(f"Type: {result.detected_type}")
print(f"Data: {result.data}")
For now, use the REST API directly with any HTTP client