Free forever · 200 OCR + 50 extraction pages/mo

The scanned PDF OCR API for image-based documents.

Turn scanned PDFs, faxed documents, and archive pages into clean text. Use the same upload and async job flow for image-only PDFs and digital PDFs.

Get started today for free. No credit card required.

Before and after

See a scanned document become page text.

Image-based PDFs and faxed forms can be converted to text even when normal PDF text extraction cannot read the page contents.

Source document
A scanned motor insurance claim notification form
Extracted output
claim-pages.json
json
{
  "pages": [
    {
      "page": 1,
      "text": "MOTOR CLAIM NOTIFICATION\nPolicy number: POL-418209..."
    },
    {
      "page": 2,
      "text": "Repair quote\nFront bumper and left headlamp..."
    }
  ]
}
Image-based PDF OCR

Submit a scanned PDF. Get text per page.

Text mode returns one entry per page. Use it to index archives, power search, feed LLM workflows, or hand off extracted text to downstream systems.

submit-scanned-pdf.sh
bash
# OCR an image-based PDF
$ curl https://api.ocrwell.com/v1/documents \
    -H "X-API-Key: $OCRWELL_KEY" \
    -H "Content-Type: application/json" \
    -d '{
      "upload_id":"upl_01H7",
      "filename":"archive-scan.pdf",
      "mode":"text"
    }'
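The same request can be sketched in Python using only the standard library. The URL, header name, and body fields come from the curl call above; the `build_submit_request` helper name and the explicit JSON content type are assumptions for illustration, not confirmed API details.

```python
import json
import os
import urllib.request

API_URL = "https://api.ocrwell.com/v1/documents"  # taken from the curl example


def build_submit_request(upload_id: str, filename: str, api_key: str,
                         mode: str = "text") -> urllib.request.Request:
    """Build (but do not send) the POST request that submits an OCR job."""
    body = json.dumps({
        "upload_id": upload_id,
        "filename": filename,
        "mode": mode,
    }).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        method="POST",
        headers={
            "X-API-Key": api_key,
            # Assumption: the endpoint expects a JSON content type.
            "Content-Type": "application/json",
        },
    )


request = build_submit_request(
    upload_id="upl_01H7",
    filename="archive-scan.pdf",
    api_key=os.environ.get("OCRWELL_KEY", ""),
)
# Sending is left to the caller, e.g. urllib.request.urlopen(request).
```

Building the request separately from sending it keeps the payload easy to inspect and test before any network call is made.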
ocr-result.json
json
{
  "job": {
    "id": "019539a6-6c3d-7e5f",
    "status": "completed",
    "mode": "text",
    "page_count": 3
  },
  "result": {
    "pages": [
      {
        "page": 1,
        "text": "SERVICE AGREEMENT\nParties..."
      },
      {
        "page": 2,
        "text": "Schedule A\nSupport terms..."
      },
      {
        "page": 3,
        "text": "Signed for and on behalf of..."
      }
    ]
  }
}
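A short Python sketch of consuming a text-mode result. It assumes the response has the shape shown above (a `result.pages` list whose entries carry `page` and `text`); the `pages_in_order` and `full_text` helper names are hypothetical.

```python
# Sample response in the shape of the example above.
response = {
    "job": {"id": "019539a6-6c3d-7e5f", "status": "completed",
            "mode": "text", "page_count": 3},
    "result": {"pages": [
        {"page": 1, "text": "SERVICE AGREEMENT\nParties..."},
        {"page": 2, "text": "Schedule A\nSupport terms..."},
        {"page": 3, "text": "Signed for and on behalf of..."},
    ]},
}


def pages_in_order(response: dict) -> list[tuple[int, str]]:
    """Return (page_number, text) pairs sorted by page number."""
    if response["job"]["status"] != "completed":
        raise ValueError("job has not completed yet")
    pages = response["result"]["pages"]
    return sorted((p["page"], p["text"]) for p in pages)


def full_text(response: dict, separator: str = "\n\n") -> str:
    """Join all page texts into one string, in page order."""
    return separator.join(text for _, text in pages_in_order(response))


pages = pages_in_order(response)
document_text = full_text(response)
```

Keeping the per-page pairs alongside the joined text lets downstream systems cite a page number for every match.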
Scanned document workflows

Where teams use scanned PDF OCR.

Index-ready text

Archive search

Convert scanned contracts, forms, letters, and legacy PDFs into page-by-page text ready for search indexes and audit tools.
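To illustrate what index-ready text can look like in practice, here is a minimal Python sketch that builds an in-memory inverted index from page text. The tokenizer, the `build_index` and `search` names, and the sample pages are illustrative assumptions; a production system would more likely hand the same page text to a dedicated search engine.

```python
import re
from collections import defaultdict

TOKEN = re.compile(r"[a-z0-9]+")


def build_index(pages: list[dict]) -> dict[str, set[int]]:
    """Map each lowercase token to the set of page numbers containing it."""
    index: dict[str, set[int]] = defaultdict(set)
    for page in pages:
        for token in TOKEN.findall(page["text"].lower()):
            index[token].add(page["page"])
    return index


def search(index: dict[str, set[int]], query: str) -> list[int]:
    """Return sorted page numbers that contain every token in the query."""
    tokens = TOKEN.findall(query.lower())
    if not tokens:
        return []
    hits = set.intersection(*(index.get(t, set()) for t in tokens))
    return sorted(hits)


# Sample pages in the shape returned by text mode.
pages = [
    {"page": 1, "text": "MOTOR CLAIM NOTIFICATION\nPolicy number: POL-418209..."},
    {"page": 2, "text": "Repair quote\nFront bumper and left headlamp..."},
]
index = build_index(pages)
matches = search(index, "repair quote")
```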

Async jobs

Fax and scan ingestion

Faxed pages, photocopies, and image-only PDFs enter the same async job flow as clean digital PDFs. Poll for completion or receive a webhook when the job finishes.
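A polling loop for the async flow might look like the following Python sketch. This page does not document the status endpoint or the full set of status values, so the job lookup is passed in as a caller-supplied function (`fetch_job`, hypothetical). Only `completed` comes from the example response; `processing` and `failed` are assumed status names.

```python
import time
from typing import Callable


def wait_for_job(fetch_job: Callable[[], dict],
                 interval_s: float = 2.0,
                 timeout_s: float = 120.0,
                 sleep: Callable[[float], None] = time.sleep) -> dict:
    """Poll fetch_job() until the job reaches a terminal status.

    fetch_job is a caller-supplied function that returns the job response
    as a dict with a job.status field, as in the example response.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        response = fetch_job()
        status = response["job"]["status"]
        if status == "completed":
            return response
        if status == "failed":  # assumed status name, not from the docs
            raise RuntimeError("OCR job failed")
        if time.monotonic() >= deadline:
            raise TimeoutError("OCR job did not finish in time")
        sleep(interval_s)


# Demo with a fake job that completes on the third poll.
# "processing" is an assumed in-progress status name.
_responses = iter([
    {"job": {"status": "processing"}},
    {"job": {"status": "processing"}},
    {"job": {"status": "completed"}, "result": {"pages": []}},
])
final = wait_for_job(lambda: next(_responses), sleep=lambda _s: None)
```

Injecting `fetch_job` and `sleep` keeps the loop independent of any HTTP client and makes it testable without waiting on a real job.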

Text before tokens

LLM document pipelines

Extract text before sending content to an LLM, reducing token cost and keeping your model prompt focused on the document content.
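One way to keep a prompt focused is to assemble it from page text under a size budget. The Python sketch below uses a character budget as a rough stand-in for tokens; the `build_prompt` name, the page markers, and the budget values are illustrative assumptions, and no particular LLM API is implied.

```python
def build_prompt(question: str, pages: list[dict], max_chars: int = 4000) -> str:
    """Concatenate page text, in page order, until the character budget is used up."""
    parts: list[str] = []
    used = 0
    for page in sorted(pages, key=lambda p: p["page"]):
        block = f"[page {page['page']}]\n{page['text']}"
        if used + len(block) > max_chars:
            break  # stop at the first page that would exceed the budget
        parts.append(block)
        used += len(block)
    context = "\n\n".join(parts)
    return f"Document:\n{context}\n\nQuestion: {question}"


# Sample pages in the shape returned by text mode, deliberately out of order.
pages = [
    {"page": 2, "text": "Schedule A\nSupport terms..."},
    {"page": 1, "text": "SERVICE AGREEMENT\nParties..."},
]
prompt = build_prompt("Who are the parties?", pages)
short_prompt = build_prompt("Who are the parties?", pages, max_chars=40)
```

The page markers let the model's answer point back to a specific page of the scanned document.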

FAQ

Scanned PDF OCR questions.

What is a scanned PDF?

A scanned PDF contains page images rather than selectable text. OCRWell reads those images and returns page-by-page text through text mode.

How is this different from the PDF to text API page?

The API flow is the same. This page focuses on image-based, faxed, and archive PDFs where normal text extraction cannot read the page contents.

Can scanned PDFs also return structured data?

Yes. Use structured mode with a schema when you need fields or rows instead of raw page text.

Pricing

Start free. Pay only when you scale.

Free forever
200 OCR + 50 extraction pages every month
  • No credit card required.
  • Hard cap, no overage charges.
  • Paid plans from $20/mo when you grow.

Read scanned PDFs today.

Generate an API key, upload your first image-based PDF, and get page-by-page text back in seconds. The free forever tier covers 200 OCR pages per month.

Get your API key today. No credit card required. 200 OCR + 50 extraction pages / month, free forever.