Perplexity
How surprised a language model is by a given piece of text. Lower means the text is more predictable to the model, which is typical of model-generated text.
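Perplexity is a measure of how well a language model predicts a sequence of text. Concretely, it's the exponential of the model's average per-token cross-entropy (the negative log-likelihood) on the text. A perplexity of N means the model was, on average, about as uncertain as if it were choosing uniformly among N tokens at each step. The lower the perplexity, the more "expected" the text was from the model's point of view.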
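The definition above can be sketched in a few lines of Python. This is a minimal illustration, not any particular library's API: it assumes you already have the natural-log probability the model assigned to each token (many model APIs can return these), and the function name `perplexity` and the sample inputs are made up for the example.

```python
import math

def perplexity(token_logprobs):
    """Perplexity from per-token natural-log probabilities.

    Assumes token_logprobs[i] is log p(token_i | previous tokens)
    as reported by the scoring model.
    """
    if not token_logprobs:
        raise ValueError("need at least one token log-probability")
    # Average negative log-likelihood per token (the cross-entropy in nats).
    avg_nll = -sum(token_logprobs) / len(token_logprobs)
    # Exponentiate to get perplexity.
    return math.exp(avg_nll)

# Hypothetical inputs: every token predicted with probability 0.5 vs. 0.1.
predictable = [math.log(0.5)] * 4
surprising = [math.log(0.1)] * 4

print(perplexity(predictable))  # approximately 2
print(perplexity(surprising))   # approximately 10
```

If the log-probabilities come in base 2 or base 10 instead, the exponentiation has to use the same base, otherwise the result is not comparable across tools.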
That makes perplexity a useful AI-detection signal: text generated by a model tends to have low perplexity under that same model (or a closely related one), while human-written text — full of idiosyncratic word choices and unexpected turns — tends to have higher perplexity.
Perplexity is rarely used alone in production detection. It's combined with burstiness, classifier confidence, and other features. Note that perplexity in this sense is a statistical measure, not the search company Perplexity AI.
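As a rough sketch of what "combined with other features" can look like, the snippet below puts perplexity next to a very simple burstiness proxy in one feature dictionary. Everything here is an assumption for illustration: the function names are hypothetical, burstiness is reduced to the coefficient of variation of sentence lengths (real systems also look at structure and complexity), the sentence splitter is naive, and no thresholds or classifier are implied.

```python
import math
import re
import statistics

def perplexity(token_logprobs):
    """exp of the average negative natural-log probability per token."""
    if not token_logprobs:
        raise ValueError("need at least one token log-probability")
    return math.exp(-sum(token_logprobs) / len(token_logprobs))

def burstiness(text):
    """Coefficient of variation of sentence lengths, counted in words.

    A simplified proxy chosen for this example, not a standard definition.
    """
    # Naive split on sentence-ending punctuation; drop empty fragments.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    lengths = [len(s.split()) for s in sentences]
    if len(lengths) < 2:
        return 0.0
    return statistics.pstdev(lengths) / statistics.mean(lengths)

def detection_features(text, token_logprobs):
    """Bundle the signals a downstream classifier might consume."""
    return {
        "perplexity": perplexity(token_logprobs),
        "burstiness": burstiness(text),
    }

# Hypothetical usage with made-up log-probabilities.
sample = "Hi. This is a much longer sentence than the first one."
features = detection_features(sample, [math.log(0.25)] * 12)
print(features)
```

Keeping the features separate, rather than collapsing them into one score early, leaves the weighting to whatever classifier sits downstream.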
Related terms
- Burstiness: A measure of variation in sentence length, structure, and complexity across a piece of text.
- AI detection: The task of identifying text that was written by a large language model rather than a human.
- Token: The smallest unit of text a language model processes, usually a word or a piece of a word.