Build apps that can read the documents your customers send.
Bank statements, photographed receipts, faxed forms, handwritten notes. OCRWell turns any of them into plain text or typed JSON your code can use, in one API call.
{
  "data": {
    "merchant": "GREENGATE GROCERS",
    "date": "11/04/2026",
    "items": [
      {
        "name": "Flat White",
        "price": 3.8
      },
      {
        "name": "Sourdough Loaf",
        "price": 4.5
      },
      {
        "name": "Oat Milk 1L",
        "price": 2.2
      },
      {
        "name": "Bananas 1kg",
        "price": 1.9
      }
    ],
    "subtotal": 12.4,
    "vat": 2.48,
    "total": 14.88
  }
}
- Works on PDFs, phone photos, scans, and handwriting. No templates, no fine-tuning, no training.
- Plain text, or data shaped to your own JSON schema, ready for the next step in your pipeline.
- Ships as a skill for Claude Code, Codex, Gemini CLI, Antigravity, and Cursor. Your coding agent writes the integration.
Your agent writes the integration. You review the diff.
Install the OCRWell skill once, set an API key, and ask your agent to build the feature. The skill ships the rules your agent would otherwise have to guess at, so the code that lands in your project uses the API correctly the first time.
The call itself is one function.
from ocrwell import client
result = client.extract(
    file=open("receipt.png", "rb"),
    schema={"total": "number", "items": "array"},
)
Install in Claude Code
Uses the built-in plugin marketplace, so you get versioning and /plugin update for free.
/plugin marketplace add ocrwell/skill
/plugin install ocrwell@ocrwell
Install in Codex CLI (or Claude Code)
One npx target per agent. Writes the SKILL.md to the location your agent reads from.
$ npx @ocrwell/skill install --codex
$ export OCRWELL_API_KEY=...
What the skill handles
Same SKILL.md format in Claude Code and Codex CLI, so one install covers both environments.
Request a pre-signed URL, PUT the file, then submit for processing. The skill handles the branching and retries.
Polls GET /v1/jobs/{id} until the job settles, backing off on 429s and never busy-looping.
HMAC-SHA256 over timestamp and body. The skill generates the verification code in whatever language your project uses.
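As a rough illustration of what such verification code might look like in Python, the sketch below uses only the standard library. The text above states only that the signature is HMAC-SHA256 over timestamp and body; the `timestamp + "." + body` signing layout, the hex encoding, the five-minute tolerance, and the function name `verify_signature` are assumptions for illustration, not OCRWell's documented contract.

```python
import hashlib
import hmac
import time
from typing import Optional


def verify_signature(secret: bytes, timestamp: str, body: bytes,
                     signature_hex: str, tolerance_s: int = 300,
                     now: Optional[float] = None) -> bool:
    """Check an HMAC-SHA256 signature over timestamp and body.

    Assumed layout: the signed message is the timestamp, a dot, then the raw body.
    """
    current = time.time() if now is None else now
    try:
        sent_at = int(timestamp)
    except ValueError:
        return False
    # Reject stale or far-future timestamps to limit replayed requests.
    if abs(current - sent_at) > tolerance_s:
        return False
    message = timestamp.encode("utf-8") + b"." + body
    expected = hmac.new(secret, message, hashlib.sha256).hexdigest()
    # compare_digest runs in constant time, so it does not leak where the mismatch is.
    return hmac.compare_digest(expected.encode("utf-8"),
                               signature_hex.encode("utf-8"))
```

Verify against the raw request bytes, before any JSON parsing, since re-serialising the body can change it and break the signature.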
Pass a JSON schema up to 64 KB and get typed data back, with field-level validation. No templates to build first.
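A pre-flight size check is one way to catch an oversized schema before the request goes out. This is a minimal sketch: it assumes "64 KB" means 65,536 bytes of compact UTF-8 JSON, which the text above does not specify, and the helper names are hypothetical.

```python
import json

# Assumption: the 64 KB limit is 65,536 bytes of serialized JSON.
MAX_SCHEMA_BYTES = 64 * 1024


def schema_size_bytes(schema: dict) -> int:
    """Size of the schema once serialized as compact UTF-8 JSON."""
    return len(json.dumps(schema, separators=(",", ":")).encode("utf-8"))


def check_schema_size(schema: dict) -> None:
    """Raise ValueError if the serialized schema exceeds the assumed limit."""
    size = schema_size_bytes(schema)
    if size > MAX_SCHEMA_BYTES:
        raise ValueError(f"schema is {size} bytes; limit is {MAX_SCHEMA_BYTES}")


check_schema_size({"total": "number", "items": "array"})  # small schema passes
```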
An HTTP 200 with job.status "failed" is a job failure, not a success. The skill teaches the agent to tell the difference.
The skill reads OCRWELL_API_KEY from the environment. Generated code never hard-codes secrets or commits them.
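Two of the rules above, polling with backoff and treating `job.status` `"failed"` as a failure even on HTTP 200, might look roughly like this in Python. The HTTP call is passed in as `fetch_job` so the sketch stays independent of any client library. Only `GET /v1/jobs/{id}`, the 429 backoff, and the `"failed"` status come from the text above; the response shape, the pending status names, the delay values, and every function name are assumptions for illustration.

```python
import time
from typing import Callable, Tuple

# Assumed names for non-terminal states; check the API reference for the real ones.
PENDING_STATUSES = {"queued", "processing"}


class JobFailed(Exception):
    """Raised when the API answers HTTP 200 but the job itself failed."""


def wait_for_job(
    fetch_job: Callable[[], Tuple[int, dict]],
    base_delay: float = 1.0,
    max_delay: float = 30.0,
    max_attempts: int = 60,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Poll until the job settles.

    fetch_job performs GET /v1/jobs/{id} and returns (http_status, json_body).
    """
    delay = base_delay
    for _ in range(max_attempts):
        http_status, body = fetch_job()
        if http_status == 429:
            # Rate limited: double the wait, up to max_delay.
            delay = min(delay * 2, max_delay)
        elif http_status != 200:
            raise RuntimeError(f"unexpected HTTP status {http_status}")
        else:
            job = body["job"]
            if job["status"] == "failed":
                # HTTP 200 only means the request worked, not the job.
                raise JobFailed(str(job.get("error", "job failed")))
            if job["status"] not in PENDING_STATUSES:
                return job
        # Always sleep between polls so the loop never busy-waits.
        sleep(delay)
    raise TimeoutError("job did not settle within max_attempts polls")
```

Injecting `sleep` and `fetch_job` keeps the loop testable without a network connection or real delays.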
Save with OCRWell
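Sending pages through our dedicated OCR first, then passing text to the model, gives better accuracy and cuts the per-page cost by about 82% ($0.0043 vs $0.0239) compared with using the model's built-in image processing.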
| Per A4 page | OCRWell Scale + Opus 4.7 text | Raw 200 DPI image → Opus 4.7 |
|---|---|---|
| OCR cost | $0.0008 ($600 ÷ 750,000 pages) | — |
| Input to Claude | ~700 text tokens | ~4,784 image tokens |
| Opus 4.7 input ($5 / MTok) | $0.0035 | $0.0239 |
| Total per page | $0.0043 | $0.0239 |
Opus 4.7 API pricing from Anthropic; image token count uses the
published width × height ÷ 750 formula, capped at Opus 4.7's
4,784-token native resolution limit. Output tokens are identical in both
paths and excluded. Scale Plan per-page OCR cost assumes 100%
utilisation of the 750,000 included pages and no overage.
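The arithmetic behind the table can be reproduced in a few lines of Python. All prices and token counts are taken from the table and footnote above; the A4 pixel dimensions (about 1654 × 2339 at 200 DPI) are an assumption added here to show the formula hitting the cap.

```python
OCR_PLAN_USD = 600.0        # Scale Plan price used in the table
OCR_PLAN_PAGES = 750_000    # pages included in that plan
INPUT_USD_PER_MTOK = 5.0    # Opus 4.7 input price from the table
TEXT_TOKENS = 700           # approximate text tokens per page, from the table
IMAGE_TOKEN_CAP = 4_784     # image token cap stated in the footnote

# Assumption: A4 (210 x 297 mm) at 200 DPI is about 1654 x 2339 pixels.
A4_WIDTH_PX, A4_HEIGHT_PX = 1654, 2339


def image_tokens(width_px: int, height_px: int) -> int:
    """width x height / 750, capped at the stated limit."""
    return min(round(width_px * height_px / 750), IMAGE_TOKEN_CAP)


def input_cost(tokens: int) -> float:
    """Input cost in USD for a given token count."""
    return tokens * INPUT_USD_PER_MTOK / 1_000_000


ocr_cost = OCR_PLAN_USD / OCR_PLAN_PAGES
text_path = ocr_cost + input_cost(TEXT_TOKENS)
image_path = input_cost(image_tokens(A4_WIDTH_PX, A4_HEIGHT_PX))

print(f"OCR cost per page:   ${ocr_cost:.4f}")    # $0.0008
print(f"Text path per page:  ${text_path:.4f}")   # $0.0043
print(f"Image path per page: ${image_path:.4f}")  # $0.0239
print(f"Saving: {1 - text_path / image_path:.0%}")  # 82%
```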
Ship the document-reading part of your app this afternoon.
Generate a key, install the skill, and your next prompt can turn real-world paperwork into data your code can work with.