When does OCR work well, and when does it fail?

**Works well**: clean scans at 300 DPI or more, screenshots of digital text, printed pages, invoices and receipts in standard fonts, white background with dark text. **Fails or struggles**: **handwriting** (Tesseract is trained on print, not cursive), **busy backgrounds** (text on top of photos), **low resolution** (anything below ~150 px tall per line), **tilted or curved text**, **stylised fonts**, **reflective glare** on the original. If a human can read it in half a second, Tesseract will probably get it. If you have to squint, expect errors.

Which languages are supported?

Five language packs are wired in: **English (eng)**, **Polish (pol)**, **German (deu)**, **French (fra)**, and **Spanish (spa)**. Tesseract itself supports **100+ languages**, including non-Latin scripts (Arabic, Chinese, Devanagari for Hindi, Cyrillic), so if you need another one, open an issue and we will enable the pack. **Pick the language that matches the image** - the wrong pack costs accuracy, most visibly on language-specific characters (for example, the English pack tends to misread Polish ą, ę, ł).

How can I get a better result? Any preprocessing tips?

**Crop tight** around the text - irrelevant areas only confuse the engine. **Increase contrast** if the image is washed out (a quick "auto levels" in any photo app helps). **Deskew** if the page is tilted more than a few degrees - straight horizontal lines work best. **Avoid JPEG artifacts** on text: re-save the source as PNG if you can. **Aim for ~300 DPI** at the final size; a 100 px tall paragraph will misfire, a 400 px tall one will not.
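The "~300 DPI" target above comes down to simple arithmetic: pixels divided by physical inches. The sketch below shows that calculation; the function names are hypothetical and not part of this tool. Enlarging an image does not add real detail, so treat the factor as a rough guide and rescan at a higher resolution when you can.

```python
def effective_dpi(pixels: int, inches: float) -> float:
    """Dots per inch along one dimension of a scan or photo."""
    if inches <= 0:
        raise ValueError("physical size must be positive")
    return pixels / inches


def upscale_factor(pixels: int, inches: float, target_dpi: float = 300.0) -> float:
    """Enlargement needed to reach target_dpi (1.0 means no enlargement needed)."""
    return max(1.0, target_dpi / effective_dpi(pixels, inches))


# An A4 page is about 8.27 inches wide.
print(round(effective_dpi(2481, 8.27)))       # 300
print(round(upscale_factor(1240, 8.27), 2))   # 2.0
```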

How accurate is OCR, realistically?

On a **clean printed page in a supported language**, expect **98 to 99% character accuracy**. On a **decent phone photo of a receipt**, more like **90 to 95%** - enough to read, but you will want to scan the result for typos. On a **blurry, tilted, low-res photo**, accuracy can drop below 70%, at which point retyping is faster. The **confidence percentage** the tool shows per block is a useful guide: anything above 85 is usually clean, below 60 is suspect.
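The confidence thresholds above can be turned into a quick triage of the result. This is a minimal sketch, assuming each block arrives as a hypothetical `(text, confidence)` pair; the "review" label for the middle band is also an assumption, since the text only names the two ends.

```python
from typing import List, Tuple

CLEAN_ABOVE = 85    # "anything above 85 is usually clean"
SUSPECT_BELOW = 60  # "below 60 is suspect"


def triage(confidence: float) -> str:
    """Classify one block by its confidence score (0 to 100)."""
    if confidence > CLEAN_ABOVE:
        return "clean"
    if confidence < SUSPECT_BELOW:
        return "suspect"
    return "review"  # middle band: worth a quick read-through


def blocks_to_check(blocks: List[Tuple[str, float]]) -> List[str]:
    """Return the text of every block that is not classified as clean."""
    return [text for text, conf in blocks if triage(conf) != "clean"]


# Made-up sample data for illustration only.
sample = [("TOTAL 42.10", 96.0), ("VAT 23%", 78.5), ("Th4nk y0u", 41.0)]
print(blocks_to_check(sample))  # ['VAT 23%', 'Th4nk y0u']
```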

My image has English and German mixed - what should I do?

Tesseract can technically load **multiple language packs at once**, but in practice **mixed-language pages produce worse results** for both languages than picking the dominant one. **Pick the language that covers most of the text**. For a heavily mixed page, run OCR **twice** (once per language) and merge the parts you trust from each pass. We may add a multi-language mode in the future, but the single-language default is the right choice for almost every real document.
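The "run twice and merge" advice can be sketched as keeping, for each block, whichever pass reports the higher confidence. The sketch assumes both passes return the same blocks in the same order as hypothetical `(text, confidence)` pairs. Real passes may split the page differently, in which case you would merge by hand.

```python
from typing import List, Tuple

Block = Tuple[str, float]  # (recognised text, confidence 0 to 100)


def merge_passes(pass_a: List[Block], pass_b: List[Block]) -> List[Block]:
    """Keep the higher-confidence version of each block; ties go to pass_a."""
    if len(pass_a) != len(pass_b):
        raise ValueError("passes produced different block counts; merge manually")
    return [a if a[1] >= b[1] else b for a, b in zip(pass_a, pass_b)]


# Made-up results for one page, OCR'd once as English and once as German.
english_pass = [("Invoice total", 93.0), ("Rechnungsbetrag", 48.0)]
german_pass = [("lnvoice total", 71.0), ("Rechnungsbetrag", 95.0)]
print(merge_passes(english_pass, german_pass))
```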

Can I extract a table with rows and columns?

**Tesseract reads text, not table structure**. You will get the cell contents as a flat stream of words, in roughly **reading order** (left to right, top to bottom). The visual grid is **lost** - there are no commas, tabs, or column markers in the output. For real tabular data, the best workflow is: **OCR the page → manually paste rows into a spreadsheet**, or use a dedicated table-extraction tool. Anything that promises "perfect Excel from a screenshot" is using a different (and much heavier) ML model than Tesseract.

My file is a PDF - should I use this tool?

**Probably not: try the [PDF text extractor](/en/pdf-text-extractor) first**. If the PDF was made by exporting from Word, Google Docs, a browser, or any modern app, it **already contains real text** - extracting it is **instant and perfect**. Use OCR **only when the PDF is a scanned image** (a photocopier output, a "Save as image" PDF, an old fax) and the text extractor returns empty. For multi-page scanned PDFs, split out the pages first and OCR them one by one - this tool takes a single image at a time.

Is my image private? Where does it go?

The image is **sent to our server** to run Tesseract - there is no way around that, because the engine runs there and needs the pixels. We **never write it to disk, never log it, never store it**. The file lives in **process memory just long enough** to recognise the text (typically 2 to 10 seconds) and is **released for garbage collection** as soon as the response is sent back. We also **do not see the extracted text** beyond the response we return to you. If you need stricter privacy guarantees for confidential documents, run Tesseract locally - it is open source and the same engine we use.
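Running Tesseract locally can be as small as one command. The sketch below assumes the `tesseract` command-line program and the language data you need are already installed; the Python function names are hypothetical. Passing `stdout` as the output base makes the CLI print the recognised text instead of writing a file.

```python
import shutil
import subprocess
from typing import List


def build_command(image_path: str, lang: str = "eng") -> List[str]:
    """Build the CLI call: tesseract <image> stdout -l <lang>."""
    return ["tesseract", image_path, "stdout", "-l", lang]


def ocr_locally(image_path: str, lang: str = "eng") -> str:
    """Run Tesseract on this machine and return the recognised text."""
    if shutil.which("tesseract") is None:
        raise RuntimeError("tesseract is not installed or not on PATH")
    result = subprocess.run(
        build_command(image_path, lang),
        capture_output=True,
        text=True,
        check=True,  # raise if Tesseract exits with an error
    )
    return result.stdout


# Shows the command without running it; call ocr_locally("scan.png", "deu")
# on a machine where Tesseract is installed.
print(" ".join(build_command("scan.png", "deu")))  # tesseract scan.png stdout -l deu
```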

What is the maximum image size?

**10 MB per file**. That covers virtually every JPG, PNG, or WebP from a phone, scanner, or screenshot tool. The **rate limit** is **10 OCR runs per hour per IP** - OCR is CPU-heavy and we run it server-side, so this prevents one user from monopolising the worker. If you hit the limit, wait an hour or run Tesseract locally for a bulk job. Files **above 10 MB** are rejected with a clear error - usually you can shrink a phone photo to 1 to 2 MB without losing any OCR quality.
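To make the two limits concrete, here is a sketch of a size check and a sliding one-hour window per IP. It is an illustration, not the site's actual implementation: the class and function names are hypothetical, and treating "10 MB" as 10 × 1024 × 1024 bytes is an assumption.

```python
import time
from collections import defaultdict, deque
from typing import Deque, Dict, Optional

MAX_BYTES = 10 * 1024 * 1024  # assumption: "10 MB" read as 10 MiB
MAX_RUNS = 10                 # OCR runs allowed per window
WINDOW_SECONDS = 3600         # one hour


def size_ok(num_bytes: int) -> bool:
    """True if a file of this size is within the upload limit."""
    return num_bytes <= MAX_BYTES


class HourlyLimiter:
    """Sliding-window limiter: at most MAX_RUNS per IP in any WINDOW_SECONDS."""

    def __init__(self) -> None:
        self._runs: Dict[str, Deque[float]] = defaultdict(deque)

    def allow(self, ip: str, now: Optional[float] = None) -> bool:
        now = time.time() if now is None else now
        runs = self._runs[ip]
        # Forget runs that have aged out of the window.
        while runs and now - runs[0] >= WINDOW_SECONDS:
            runs.popleft()
        if len(runs) >= MAX_RUNS:
            return False
        runs.append(now)
        return True


limiter = HourlyLimiter()
results = [limiter.allow("203.0.113.7", now=0.0) for _ in range(11)]
print(results.count(True), results[-1])  # 10 False
```

Passing `now` explicitly keeps the example deterministic; a real service would rely on the clock.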

Image OCR - free | YourDevTools

Image OCR (Tesseract)

How do I extract text from an image (OCR)?

Image OCR reads the text inside a photo, screenshot, or scan and gives you back a plain string you can copy, search, or paste anywhere.

Drop a JPG, PNG, or WebP (up to 10 MB), pick a language, get the recognised text with a confidence score per block.

Recognition runs on our server using Tesseract - the open-source engine originally developed at Hewlett-Packard and later sponsored by Google - with English, Polish, German, French, and Spanish language packs.

Best for clean scans, screenshots of dialog boxes, invoices, receipts, and printed pages. It struggles with handwriting and busy backgrounds.

How to use it

Drag your image onto the dropzone or click "Choose file" - JPG, PNG, WebP are accepted, HEIC is not (convert it first with the HEIC converter).

Pick the language that matches your image. Mixing languages on one page works poorly - use the dominant one.

Click "Extract text". The first run downloads a ~10 MB language pack on the server, so the very first call may take 5 to 15 seconds; later calls are faster.

Read the extracted text in the box on the right. Use "Copy" to put it on the clipboard or "Download" to save a `.txt` file.

Toggle "Show word boxes" to overlay every recognised word on the image - useful for spotting missed regions or low-confidence patches.

When this is useful

Where OCR pays off - typical situations:

Quoting a screenshot in a doc or chat without retyping it word by word.
Pulling a phone number, email, or address off a photo of a business card or a printed flyer.
Reading a receipt to track an expense - the totals and line items become searchable text.
Lifting text from a UI when a developer or designer ships you a flat PNG with no editable layer.
Old invoices and contracts that were scanned to PDF and lost their text layer along the way.
Memes, signs, posters - quickly grab the slogan or caption.

If your file is a PDF that already contains a text layer (most PDFs from Word / Pages / Chrome "Save as PDF" do), use the PDF text extractor instead - it is instant, perfectly accurate, and free of OCR errors. OCR is only the right tool when there is no real text in the file, only pixels.

Questions and answers

What is OCR?

OCR stands for Optical Character Recognition - software that looks at the pixels of an image and decides "this shape is the letter A, this one is a B". Modern OCR (including Tesseract, which powers this tool) uses a neural network trained on millions of letter shapes, so it handles different fonts, sizes, and slight rotation without you tuning anything. The output is a plain text string plus a confidence number between 0 and 100 for every word and block.
