Kordu Tools

AI Language Detector

AI Runs in browser

Identify the language of any text with a BERT neural network — accurate on short snippets and closely related languages. Runs on-device.

Last updated 01 Apr 2026

Identifies text language using a BERT-based neural network trained on 200 languages. Returns ranked candidates with confidence scores. Runs entirely in your browser via WebAssembly after a one-time 25 MB model download — your text is never sent to any server.

~23.6 MiB download (about 24.7 MB)



How to use

  1. Paste your text

     Type or paste any text into the input box. The AI model works well even with short snippets of 10 or more characters.

  2. Wait for automatic detection

     Detection runs automatically as you type (after a short 500 ms debounce). On first use, the 25 MB model downloads once and is cached for all future visits.

  3. View the top language

     The most likely language is shown with its name, ISO code, and a confidence percentage bar colour-coded from orange (low) to green (high).

  4. Review all candidates

     Scroll to see up to 5 ranked language candidates with their relative confidence bars, which is useful for mixed-language or ambiguous text.
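
The trigger behaviour described in the steps above (a 10-character minimum and a 500 ms debounce) can be sketched roughly as follows. This is an illustration only, not the tool's actual source: `createDebouncedDetector`, `TimerScheduler`, and the choice to trim whitespace before counting characters are all assumptions.

```typescript
// Hypothetical sketch; names and trimming behaviour are assumptions, not the tool's code.
const MIN_CHARS = 10;    // from the page: detection triggers at 10 or more characters
const DEBOUNCE_MS = 500; // from the page: 500 ms debounce

// Timer functions are injectable so the logic can be exercised without real delays.
type TimerScheduler = {
  set: (fn: () => void, ms: number) => unknown;
  clear: (handle: unknown) => void;
};

const defaultTimers: TimerScheduler = {
  set: (fn, ms) => setTimeout(fn, ms),
  clear: (handle) => clearTimeout(handle as ReturnType<typeof setTimeout>),
};

function createDebouncedDetector(
  detect: (text: string) => void,
  timers: TimerScheduler = defaultTimers,
): (text: string) => void {
  let pending: unknown = undefined;
  let hasPending = false;

  return (text: string) => {
    // Every keystroke cancels the previously scheduled detection.
    if (hasPending) {
      timers.clear(pending);
      hasPending = false;
    }
    // Count Unicode code points rather than UTF-16 units (an assumption).
    if (Array.from(text.trim()).length < MIN_CHARS) return;

    pending = timers.set(() => {
      hasPending = false;
      detect(text);
    }, DEBOUNCE_MS);
    hasPending = true;
  };
}
```

Wired to an input box, the returned function would be called on every input event; only the last value typed before a 500 ms pause reaches `detect`.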

Frequently asked questions

How is this different from the regular Language Detector?
The standard detector uses statistical n-gram analysis (franc) — instant, no download, supports 187 languages, but struggles on short or ambiguous text. This AI version uses a BERT neural network trained on 121 million sentences across 200 languages. It is significantly more accurate on short snippets and closely related languages.
How many languages does it support?
The model supports 200 languages including major world languages, regional languages, and many less-common ones. Coverage spans Latin, Cyrillic, Arabic, Chinese, Japanese, Korean, Devanagari, Thai, Georgian, and many other scripts.
Why does it need a 25 MB download?
The model is a quantized BERT-mini neural network (~24.7 MB). It downloads once to your browser cache and is reused on all future visits — you will not need to download it again unless you clear your browser cache.
Is my text sent to a server?
No. All inference runs entirely in your browser using WebAssembly. Your text never leaves your device — this is 100% on-device processing.
What is the minimum text length required?
Detection triggers automatically for inputs of 10 or more characters. For best accuracy, use 30 or more characters. Very short snippets may produce lower-confidence results.
How accurate is it?
The model achieves an F1 score of 97.3% on its benchmark test set. Real-world accuracy depends on language and text length: longer text and well-known languages tend to yield higher confidence.
Can it detect mixed-language text?
The model classifies the dominant language of the full input. It does not perform word-level or sentence-level language splitting. For mixed-language text, it returns the most represented language with lower confidence.
Does it work on mobile browsers?
Yes — the tool works in any modern mobile browser that supports WebAssembly, including Chrome, Safari, and Firefox on iOS and Android. The 25 MB model download applies on mobile as well.
What if detection confidence is low?
Low confidence usually means the text is very short, uses unusual vocabulary, or is genuinely mixed-language. Try adding more text, or check the ranked candidate list for the most likely alternatives.

Statistical language detectors count character n-grams and look up frequency tables. They are fast and lightweight, but they struggle with short snippets, mixed-script text, and closely related languages like Norwegian Bokmål vs Nynorsk or Serbian vs Croatian.
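
To make the n-gram idea concrete, here is a toy sketch of character-trigram counting with a naive overlap score. It is not franc's implementation (which uses its own profiles and scoring); `trigramCounts` and `overlapScore` are hypothetical helpers, and the padding and lowercasing choices are assumptions.

```typescript
// Toy illustration of statistical detection; not how franc actually scores text.

// Count overlapping character trigrams, padding with spaces so word edges count.
function trigramCounts(text: string): Map<string, number> {
  const normalized = ` ${text.toLowerCase().replace(/\s+/g, " ").trim()} `;
  const chars = Array.from(normalized);
  const counts = new Map<string, number>();
  for (let i = 0; i + 3 <= chars.length; i++) {
    const gram = chars.slice(i, i + 3).join("");
    counts.set(gram, (counts.get(gram) ?? 0) + 1);
  }
  return counts;
}

// Fraction of the sample's trigram occurrences that also appear in a language profile.
function overlapScore(
  sampleGrams: Map<string, number>,
  profile: Map<string, number>,
): number {
  let shared = 0;
  let total = 0;
  sampleGrams.forEach((n, gram) => {
    total += n;
    if (profile.has(gram)) shared += n;
  });
  return total === 0 ? 0 : shared / total;
}
```

A short input yields only a handful of trigrams, so a single shared or missing trigram swings the score sharply. That is one way to see why frequency-table methods become unreliable on short snippets and on languages whose trigram profiles largely coincide.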

This AI-powered detector uses a fine-tuned BERT-mini model trained on the OpenLID dataset, which contains 121 million sentences across 200 languages. Unlike n-gram approaches, the neural network reads text as a whole, weighing word context and script patterns together for better accuracy on short inputs. It achieves an F1 score of 97.3% across its supported language set.
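
F1 is the harmonic mean of precision and recall, which is not the same quantity as plain accuracy. The sketch below computes a macro-averaged F1 over a set of labels. Whether the reported 97.3% figure is macro- or micro-averaged is not stated on this page, so the averaging choice here is an assumption, and `macroF1` is a hypothetical helper.

```typescript
// Macro-averaged F1: compute F1 per language, then take the unweighted mean.
// The averaging scheme is an assumption; the page does not say which one was used.
function macroF1(goldLabels: string[], predLabels: string[]): number {
  if (goldLabels.length !== predLabels.length) {
    throw new Error("gold and predicted label lists must have the same length");
  }
  const labels = Array.from(new Set(goldLabels.concat(predLabels)));
  let sum = 0;
  for (const label of labels) {
    let tp = 0; // predicted this label and it was correct
    let fp = 0; // predicted this label but it was another one
    let fn = 0; // missed this label
    for (let i = 0; i < goldLabels.length; i++) {
      const isGold = goldLabels[i] === label;
      const isPred = predLabels[i] === label;
      if (isGold && isPred) tp++;
      else if (!isGold && isPred) fp++;
      else if (isGold && !isPred) fn++;
    }
    // F1 = 2PR / (P + R), which simplifies to 2TP / (2TP + FP + FN).
    const denom = 2 * tp + fp + fn;
    sum += denom === 0 ? 0 : (2 * tp) / denom;
  }
  return labels.length === 0 ? 0 : sum / labels.length;
}
```

For example, with gold labels `["nb", "nb", "nn", "nn"]` and predictions `["nb", "nn", "nn", "nn"]`, three of four predictions are correct (75% accuracy), while the per-language F1 values are 2/3 for `nb` and 4/5 for `nn`, giving a macro F1 of about 0.733.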

All inference runs locally in your browser via WebAssembly, so your text is never sent to any server. After a one-time 25 MB model download (cached for future visits), detection is near-instant. Results appear as a ranked list with confidence percentages so you can see not just the top match but how certain the model is and what the alternatives are.
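
One common way a classifier's raw outputs become a ranked list with confidence percentages is a softmax followed by a top-k cut. The sketch below assumes that approach; the page does not document how the tool derives its confidence values, and `rankCandidates`, `LanguageLogit`, and `LanguageCandidate` are hypothetical names.

```typescript
// Hypothetical sketch: softmax over per-language scores, then keep the top 5.
interface LanguageLogit {
  code: string;  // language code, e.g. an ISO code
  logit: number; // raw model score for that language
}

interface LanguageCandidate {
  code: string;
  confidence: number; // probability in [0, 1]
}

function rankCandidates(scores: LanguageLogit[], topK = 5): LanguageCandidate[] {
  if (scores.length === 0) return [];

  // Subtract the maximum logit before exponentiating for numerical stability.
  const maxLogit = Math.max(...scores.map((s) => s.logit));
  const exps = scores.map((s) => ({ code: s.code, exp: Math.exp(s.logit - maxLogit) }));
  const total = exps.reduce((sum, s) => sum + s.exp, 0);

  return exps
    .map((s) => ({ code: s.code, confidence: s.exp / total }))
    .sort((a, b) => b.confidence - a.confidence)
    .slice(0, topK);
}
```

Because the probabilities are normalised over every supported language before the cut, the five displayed values need not sum to 100%. A small gap between the first and second candidates is a useful signal that the input is ambiguous.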

Use it to pre-classify user-submitted content, verify the language of scraped text, build multilingual content pipelines, or quickly identify which language a short message is written in. For text longer than 50 characters where speed matters more than accuracy on ambiguous cases, the standard Language Detector is faster with no download required.
