From pixels to structured data in four API calls
Authenticate with an API key, request an upload URL, PUT your file, and submit the document. OCRWell processes asynchronously and returns typed JSON — either raw page text or data mapped to your JSON schema.
Send your API key
Create a key in the dashboard and include it on every request. Keys are shown once at creation time and hashed at rest — rotate freely without downtime.
- API key in the X-API-Key header
- Per-organisation rate limits with Retry-After on 429
- HTTPS only — plaintext HTTP is rejected
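As an illustration of the rate-limit behaviour above, here is a minimal Python sketch using only the standard library. It assumes Retry-After carries a number of seconds (HTTP also allows a date, which this sketch treats as missing). The helper names `retry_delay` and `get_with_retry` are hypothetical and not part of any OCRWell SDK.

```python
import time
import urllib.error
import urllib.request


def retry_delay(retry_after, default=1.0, cap=60.0):
    """Turn a Retry-After header value into a sleep duration in seconds.

    Assumption: the header holds delta-seconds. Missing, non-numeric
    (e.g. an HTTP date), NaN or negative values fall back to `default`.
    """
    try:
        seconds = float(retry_after)
    except (TypeError, ValueError):
        return default
    if not seconds >= 0:
        return default
    return min(seconds, cap)


def get_with_retry(url, api_key, max_attempts=5):
    """GET `url` with the X-API-Key header, sleeping on HTTP 429."""
    request = urllib.request.Request(url, headers={"X-API-Key": api_key})
    for attempt in range(max_attempts):
        try:
            with urllib.request.urlopen(request) as response:
                return response.read()
        except urllib.error.HTTPError as err:
            # Only retry rate-limit responses; re-raise everything else,
            # and give up after the final attempt.
            if err.code != 429 or attempt == max_attempts - 1:
                raise
            time.sleep(retry_delay(err.headers.get("Retry-After")))
```

The cap keeps a single wait bounded. Whether a 60-second ceiling suits your workload is a judgment call.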
$ curl https://api.ocrwell.com/v1/webhooks \
    -H "X-API-Key: $OCRWELL_KEY"
Upload your document
Three-step flow. Request a pre-signed upload URL, PUT your file, then submit it for processing. The API accepts PDF, JPEG, PNG, TIFF, WebP, and BMP files of up to 20 MB and 1,000 pages.
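The format and size limits can be checked locally before an upload URL is requested. The sketch below is one way to do that in Python. It assumes "20 MB" means 20 × 1024 × 1024 bytes, and the extension set is a guess at how the named formats map to file names. It does not check the page limit.

```python
import os

# Extensions for the formats named above; the exact set the API accepts
# by extension is an assumption.
ALLOWED_EXTENSIONS = {".pdf", ".jpg", ".jpeg", ".png", ".tif", ".tiff", ".webp", ".bmp"}
MAX_BYTES = 20 * 1024 * 1024  # assumes "20 MB" means 20 MiB


def preflight_problems(filename, size_bytes):
    """Return a list of reasons the file would likely be rejected."""
    problems = []
    extension = os.path.splitext(filename)[1].lower()
    if extension not in ALLOWED_EXTENSIONS:
        problems.append(f"unsupported extension: {extension or '(none)'}")
    if size_bytes > MAX_BYTES:
        problems.append(f"file is {size_bytes} bytes, limit is {MAX_BYTES}")
    return problems
```

A typical call would be `preflight_problems(path, os.path.getsize(path))`. An empty list means nothing was flagged locally.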
- Pre-signed upload URLs valid for 15 minutes
- Idempotency-Key header prevents duplicate jobs
- Text or structured mode, chosen per request
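One way to pick an Idempotency-Key is to derive it from the submission itself, so a retried request carries the same key automatically. The Python sketch below assumes any opaque string is accepted as a key. `idempotency_key` and `submit_headers` are hypothetical helper names.

```python
import hashlib


def idempotency_key(file_bytes, mode):
    """Derive a stable key so retries of the same submission match.

    Assumption: any opaque string is accepted as an Idempotency-Key.
    """
    digest = hashlib.sha256()
    digest.update(mode.encode("utf-8"))
    digest.update(b"\0")  # separator so mode and content cannot blur together
    digest.update(file_bytes)
    return digest.hexdigest()


def submit_headers(api_key, file_bytes, mode):
    """Headers for the submit call, including the derived key."""
    return {
        "X-API-Key": api_key,
        "Content-Type": "application/json",
        "Idempotency-Key": idempotency_key(file_bytes, mode),
    }
```

A content-derived key also matches when the same file is deliberately submitted twice. If that is not wanted, a random value such as `uuid.uuid4()` generated once per logical submission is a common alternative.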
# 1. Request upload URL
$ UP=$(curl -s https://api.ocrwell.com/v1/uploads \
    -H "X-API-Key: $OCRWELL_KEY" \
    -H "Content-Type: application/json" \
    -d '{"filename":"invoice.pdf"}')
# 2. PUT file to the signed URL
$ curl -X PUT "$(echo "$UP" | jq -r .upload_url)" \
    -H "Content-Type: application/pdf" \
    --data-binary @invoice.pdf
# 3. Submit for processing
$ curl https://api.ocrwell.com/v1/documents \
    -H "X-API-Key: $OCRWELL_KEY" \
    -H "Content-Type: application/json" \
    -d '{"upload_id":"...","filename":"invoice.pdf","mode":"text"}'
Poll or receive a webhook
Jobs complete asynchronously. Poll GET /v1/jobs/{id} or configure an HMAC-signed webhook. Either way, results are available for five minutes after first retrieval.
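A polling loop along these lines could look as follows in Python. This is a sketch, not an official client. It assumes "completed" and "failed" are the terminal statuses (only "completed" appears in the sample response), and the backoff intervals and timeout are arbitrary choices.

```python
import json
import time
import urllib.request

BASE_URL = "https://api.ocrwell.com/v1"


def fetch_job(job_id, api_key):
    """GET /v1/jobs/{id} and decode the JSON body."""
    request = urllib.request.Request(
        f"{BASE_URL}/jobs/{job_id}", headers={"X-API-Key": api_key}
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def wait_for_job(fetch, interval=1.0, max_interval=10.0, timeout=120.0, sleep=time.sleep):
    """Call `fetch()` until the job reaches a terminal status.

    Assumption: "completed" and "failed" are terminal; other values
    mean the job is still running.
    """
    waited = 0.0
    while True:
        payload = fetch()
        if payload["job"]["status"] in ("completed", "failed"):
            return payload
        if waited >= timeout:
            raise TimeoutError("job did not finish in time")
        sleep(interval)
        waited += interval
        interval = min(interval * 2, max_interval)  # exponential backoff
```

Usage would be `wait_for_job(lambda: fetch_job(job_id, api_key))`. Because results expire five minutes after first retrieval, it is worth persisting the returned payload straight away rather than fetching it again later.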
- Polling via GET /v1/jobs/{id}
- Webhooks signed with HMAC-SHA256 over timestamp + body
- Text mode returns pages[], structured mode returns your schema
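On the receiving side, webhook verification can be sketched in Python with the standard `hmac` module. The text above only says the signature covers timestamp + body, so the details here are assumptions: the timestamp is Unix seconds concatenated directly before the raw body, the digest is hex encoded, and a 300-second freshness window is applied. How the timestamp and signature are delivered (for example, in which headers) is not specified, so they are plain parameters.

```python
import hashlib
import hmac
import time


def expected_signature(secret, timestamp, body):
    """HMAC-SHA256 over timestamp + body, hex encoded.

    Assumptions: the timestamp string is concatenated directly before
    the raw body bytes and the digest is hex encoded.
    """
    message = timestamp.encode("utf-8") + body
    return hmac.new(secret.encode("utf-8"), message, hashlib.sha256).hexdigest()


def verify_webhook(secret, timestamp, body, signature, tolerance=300, now=None):
    """Return True if the signature matches and the timestamp is fresh."""
    now = time.time() if now is None else now
    try:
        age = abs(now - int(timestamp))
    except ValueError:
        return False
    if age > tolerance:
        return False  # stale or future-dated: possible replay
    expected = expected_signature(secret, timestamp, body)
    # compare_digest avoids leaking the match position through timing.
    return hmac.compare_digest(expected.encode("utf-8"), signature.encode("utf-8"))
```

Verify against the raw request bytes, before any JSON parsing. Re-serialising the body can change whitespace or key order and invalidate the signature.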
{
"job": {
"id": "019539a6-6c3d-7e5f",
"status": "completed",
"mode": "text",
"page_count": 3
},
"result": {
"pages": [
{ "page": 1, "text": "INVOICE..." },
{ "page": 2, "text": "Line items..." }
]
}
}
Choose how you want the data back.
Every submission picks one of two modes. Same endpoints, same rate limits — only the shape of the result changes.
Text mode: raw page text, one entry per page. Preserves layout breaks and line structure. Ideal for search indexing and downstream LLM pipelines.
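For search indexing or an LLM prompt, the per-page entries usually need to be joined back into one string. A small Python sketch, assuming the response shape from the sample (result.pages as a list of objects with "page" and "text"); `pages_to_text` is a hypothetical helper:

```python
import json


def pages_to_text(payload, separator="\n\n"):
    """Join text-mode pages in page order into one string.

    Assumes result.pages is a list of {"page": int, "text": str}.
    """
    pages = sorted(payload["result"]["pages"], key=lambda p: p["page"])
    return separator.join(p["text"] for p in pages)


# Sample payload in the documented shape, with pages deliberately out of order.
sample = json.loads("""
{
  "job": {"id": "019539a6-6c3d-7e5f", "status": "completed", "mode": "text", "page_count": 3},
  "result": {"pages": [
    {"page": 2, "text": "Line items..."},
    {"page": 1, "text": "INVOICE..."}
  ]}
}
""")
document_text = pages_to_text(sample)
```

Sorting by the page field guards against entries arriving out of order, which may never happen in practice but costs nothing to handle.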
Structured mode: works on any document with no pre-training, fine-tuning, or template setup. Provide a JSON schema and receive data shaped exactly to it, with field-level type validation and schemas up to 64 KB.
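To make the schema-driven flow concrete, here is a Python sketch that builds a submission body and checks the schema size first. Several details are assumptions rather than documented facts: the mode value "structured", the request field name "schema", and reading "64 KB" as 65,536 bytes of compact UTF-8 JSON. The invoice fields are invented for illustration.

```python
import json

MAX_SCHEMA_BYTES = 64 * 1024  # assumes "64 KB" means 65,536 bytes

# Illustrative schema for an invoice; the field names are made up.
INVOICE_SCHEMA = {
    "type": "object",
    "properties": {
        "invoice_number": {"type": "string"},
        "issue_date": {"type": "string", "format": "date"},
        "total": {"type": "number"},
        "line_items": {
            "type": "array",
            "items": {
                "type": "object",
                "properties": {
                    "description": {"type": "string"},
                    "amount": {"type": "number"},
                },
                "required": ["description", "amount"],
            },
        },
    },
    "required": ["invoice_number", "total"],
}


def structured_request_body(upload_id, filename, schema):
    """Build the JSON body for a structured-mode submission.

    Assumptions: the mode value is "structured" and the schema travels
    in a "schema" field; neither is confirmed by the text above.
    """
    encoded_schema = json.dumps(schema, separators=(",", ":")).encode("utf-8")
    if len(encoded_schema) > MAX_SCHEMA_BYTES:
        raise ValueError(
            f"schema is {len(encoded_schema)} bytes, limit is {MAX_SCHEMA_BYTES}"
        )
    return json.dumps(
        {"upload_id": upload_id, "filename": filename, "mode": "structured", "schema": schema}
    )
```

Checking the size client-side gives a clearer error than a rejected request, though the service's own measurement of schema size may differ from this one.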
Ready to automate your document flow?
Ship document intelligence in an afternoon — free tier included.