What language is this text?
Paste any block of text and this tool tells you what language it is, with a confidence score and the top five matching candidates. It uses franc-min, a small Node library that recognises over 80 languages through a purely statistical method: it splits text into overlapping three-character sequences called trigrams, counts how often each one appears, and compares those frequencies against reference profiles built from real language samples.
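To make the trigram idea concrete, here is a minimal sketch in plain JavaScript. It is not franc-min's actual code: the function names (`trigramCounts`, `rankTrigrams`, `profileDistance`), the penalty of 300 for a missing trigram and the two toy sample sentences standing in for reference profiles are all assumptions made for illustration. It compares rank positions ("out-of-place" distance), which is one common way to compare trigram profiles; real profiles are built from far larger samples.

```javascript
// Count overlapping three-character sequences in a piece of text.
function trigramCounts(text) {
  // Lowercase, turn every run of non-letters into one space, pad both ends.
  const clean = ' ' + text.toLowerCase().replace(/[^\p{L}]+/gu, ' ').trim() + ' ';
  const counts = new Map();
  for (let i = 0; i + 3 <= clean.length; i++) {
    const tri = clean.slice(i, i + 3);
    counts.set(tri, (counts.get(tri) || 0) + 1);
  }
  return counts;
}

// Trigrams ordered from most to least frequent (ties broken alphabetically).
function rankTrigrams(text, limit = 300) {
  return [...trigramCounts(text).entries()]
    .sort((a, b) => b[1] - a[1] || (a[0] < b[0] ? -1 : 1))
    .slice(0, limit)
    .map(([tri]) => tri);
}

// Sum of rank differences; a trigram missing from the profile costs maxPenalty.
// A lower total means the input looks more like that profile.
function profileDistance(inputRanks, profileRanks, maxPenalty = 300) {
  const position = new Map(profileRanks.map((tri, i) => [tri, i]));
  let distance = 0;
  inputRanks.forEach((tri, i) => {
    distance += position.has(tri) ? Math.abs(position.get(tri) - i) : maxPenalty;
  });
  return distance;
}

// Toy "reference profiles" (assumption: real ones come from large corpora).
const profiles = {
  eng: rankTrigrams('the quick brown fox jumps over the lazy dog and then the dog sleeps in the sun'),
  deu: rankTrigrams('der schnelle braune fuchs springt über den faulen hund und dann schläft der hund in der sonne'),
};

const input = rankTrigrams('the dog is in the garden');
const ranked = Object.entries(profiles)
  .map(([code, profile]) => [code, profileDistance(input, profile)])
  .sort((a, b) => a[1] - b[1]);

console.log(ranked); // closest profile first
```

With these toy samples the English profile ends up closer to the input than the German one, because more of the input's trigrams appear in it. With only a handful of trigrams in play, a single extra word can change the ranking, which is why short input is unreliable.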
Everything runs on our server in plain JavaScript. There is no machine-learning model and no external API: your text reaches our server in the request and goes no further. We do not store the text you submit.
Two important things to know up front. Short input fails: under twenty characters the trigram statistics are basically noise, so the answer can flip language with one extra word. And closely related languages confuse the detector: Czech and Slovak share so many trigrams that a short Czech sentence sometimes scores higher for Slovak. Always look at the top-5 list before treating the headline result as gospel.
How to use it
- Paste your text in the input box. Anything counts: an email, a paragraph, a chat message, a tweet.
- Try the sample chips under the box if you want to see how detection behaves on English, Polish, German, Japanese and Arabic.
- Click "Detect language". The result usually comes back in under a hundred milliseconds because detection runs on our server with no call to an external API.
- Read the primary verdict: the detected language name, its flag, the ISO 639-3 three-letter code and the ISO 639-1 two-letter code (when one exists).
- Glance at the confidence percentage: anything above 85% is solid, 50-85% means the input is short or shares trigrams with another language, and below 50% means the result is unreliable.
- Open the top-5 candidates below. If the second candidate is within a few percent of the first, your text might be a mix or one of the famous "look-alike" pairs (Czech / Slovak, Norwegian / Danish, Spanish / Portuguese).
- For mixed-language text (an English email with one Polish quote, for example) expect the detector to pick the dominant language, not to split the result.
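If you copy the results into your own script, the reading rules above can be sketched as a small helper. This is only an illustration: the `{ code, confidence }` candidate shape, the function name `interpretResult` and the 5-point margin standing in for "within a few percent" are assumptions, not the tool's actual output format.

```javascript
// candidates: array sorted from best to worst, e.g.
// [{ code: 'ces', confidence: 91 }, { code: 'slk', confidence: 88 }]
// (hypothetical shape; confidence is a percentage from 0 to 100)
function interpretResult(candidates, closeMargin = 5) {
  if (candidates.length === 0) {
    return { code: null, verdict: 'unreliable', closeRunnerUp: false };
  }
  const [first, second] = candidates;

  // Bands taken from the list above: >85 solid, 50-85 uncertain, <50 unreliable.
  let verdict;
  if (first.confidence > 85) verdict = 'solid';
  else if (first.confidence >= 50) verdict = 'uncertain';
  else verdict = 'unreliable';

  // A runner-up within closeMargin points hints at mixed text or a look-alike pair.
  const closeRunnerUp =
    second !== undefined && first.confidence - second.confidence <= closeMargin;

  return { code: first.code, verdict, closeRunnerUp };
}

console.log(interpretResult([
  { code: 'ces', confidence: 91 },
  { code: 'slk', confidence: 88 },
]));
```

A result that is "solid" but has a close runner-up, as in this Czech / Slovak example, is still worth a second look before you rely on it.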
When this is useful
Five honest, day-to-day uses for a quick language detector:
- Triage incoming support emails or contact-form messages before routing them. Drop the body in, see if it is English, Polish, German, etc., then forward to the right team. Faster than guessing from a name or a domain.
- Audit a content database before running translation jobs. Paste a sample row in, confirm the language matches what the column says it should be. Catches mis-tagged rows that would otherwise be sent to the wrong translator.
- Quickly identify a snippet from logs, an old document or a screenshot OCR result when you have no idea what language it is. Detection plus the flag is usually enough to know where to look next.
- Sanity-check generated content when an LLM is supposed to reply in a specific language and you suspect it answered in English by mistake. Paste, see the ISO 639-3 code, done.
- Teach how trigram detection works. The top-5 list with bars is a great visual aid because you can see *how close* Czech is to Slovak or Portuguese is to Spanish in trigram space.
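The triage and sanity-check uses above boil down to comparing an ISO 639-3 code against what you expect. A minimal sketch, assuming you have already noted the detected code and confidence percentage; the queue names, the `routeMessage` and `isExpectedLanguage` helpers and the `{ code, confidence }` shape are hypothetical:

```javascript
// Hypothetical mapping from ISO 639-3 codes to support queues.
const TEAM_BY_LANGUAGE = new Map([
  ['eng', 'support-en'],
  ['pol', 'support-pl'],
  ['deu', 'support-de'],
]);

// Route only when the verdict is solid (above 85%); otherwise ask a human.
function routeMessage(result) {
  if (!(result.confidence > 85)) return 'manual-review';
  return TEAM_BY_LANGUAGE.get(result.code) || 'manual-review';
}

// Check that generated content came back in the language you asked for.
function isExpectedLanguage(result, expectedCode) {
  return result.code === expectedCode;
}

console.log(routeMessage({ code: 'pol', confidence: 93 }));
console.log(isExpectedLanguage({ code: 'eng', confidence: 97 }, 'pol'));
```

Sending low-confidence or unmapped results to manual review keeps short or look-alike inputs from being routed on a guess.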
Related tools: AI text detector, text counter, case converter, LLM token counter.