Token

Also known as · tokens · tokenization

The basic unit of text an LLM reads and generates — roughly a word-piece.

A token is the atomic unit a language model actually processes. Text is broken into tokens by a tokenizer before the model ever sees it. A token is often a whole short word, but longer or rarer words get split into pieces — "unbelievable" might become "un", "believ", "able". As a rough rule of thumb, one token is about four characters of English, or roughly ¾ of a word.
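The rule of thumb above can be turned into a quick estimator. This is a minimal sketch, not a real tokenizer: the function names are made up for illustration, and the 4-characters and ¾-word ratios are only the rough English averages quoted above, so actual counts from a given model's tokenizer will differ.

```python
import math


def estimate_tokens_from_chars(text: str) -> int:
    """Rough estimate: about 4 characters of English per token."""
    return math.ceil(len(text) / 4)


def estimate_tokens_from_words(text: str) -> int:
    """Rough estimate: one token is about 3/4 of a word, so tokens = words * 4/3."""
    word_count = len(text.split())
    return math.ceil(word_count * 4 / 3)


if __name__ == "__main__":
    sample = "Text is broken into tokens by a tokenizer before the model ever sees it."
    print("by characters:", estimate_tokens_from_chars(sample))
    print("by words:", estimate_tokens_from_words(sample))
```

The two estimates will rarely agree exactly. For anything that affects billing or hard limits, count with the tokenizer that belongs to the model you are calling.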

Tokens matter for two practical reasons: cost and limits. API pricing is quoted per million tokens (input and output priced separately), and every model has a maximum number of tokens it can handle at once — its context window. Counting tokens, not words, is how you estimate what a request will cost and whether it will fit.
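The cost and fit checks described above come down to simple arithmetic. The sketch below assumes hypothetical prices and a hypothetical context window size passed in as parameters; none of these numbers belong to a real model. It also assumes the context window has to hold both the input and the generated output, which is common but worth confirming for the model you use.

```python
def estimate_cost(
    input_tokens: int,
    output_tokens: int,
    input_price_per_million: float,
    output_price_per_million: float,
) -> float:
    """Cost of one request when input and output tokens are priced separately."""
    total = (
        input_tokens * input_price_per_million
        + output_tokens * output_price_per_million
    )
    return total / 1_000_000


def fits_context(input_tokens: int, max_output_tokens: int, context_window: int) -> bool:
    """Assumption: input plus output must fit inside the context window together."""
    return input_tokens + max_output_tokens <= context_window


if __name__ == "__main__":
    # Hypothetical figures, chosen only to show the arithmetic.
    cost = estimate_cost(2_000, 500, input_price_per_million=3.0, output_price_per_million=15.0)
    print(f"estimated cost: ${cost:.4f}")
    print("fits:", fits_context(2_000, 500, context_window=4_096))
```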

Because tokenization isn't the same as word-splitting, quirks appear: numbers, code, and non-English text often split into more tokens than the same amount of ordinary English prose, which makes them more expensive to process and quicker to fill the context window.
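A toy example can show why this happens. The sketch below is not how production tokenizers work: real ones learn their vocabulary from data, for example with byte-pair encoding. Here a tiny hand-picked vocabulary (`TOY_VOCAB`, hypothetical) is matched greedily, longest piece first, with a fall-back to single characters. Text the vocabulary covers well becomes a few tokens, and text it does not cover shatters into many.

```python
# Hypothetical vocabulary, chosen only so the example is easy to follow.
TOY_VOCAB = {"un", "believ", "able", "the", "cat"}


def toy_tokenize(text: str, vocab: set = TOY_VOCAB) -> list:
    """Greedy longest-match split; unknown text falls back to single characters."""
    tokens = []
    i = 0
    while i < len(text):
        match = None
        # Try the longest candidate starting at position i first.
        for j in range(len(text), i, -1):
            if text[i:j] in vocab:
                match = text[i:j]
                break
        if match is None:
            match = text[i]  # no known piece: emit one character as a token
        tokens.append(match)
        i += len(match)
    return tokens


if __name__ == "__main__":
    print(toy_tokenize("unbelievable"))  # three pieces from the vocabulary
    print(toy_tokenize("3141592"))       # one token per digit
```

With this vocabulary, the 12-character word costs three tokens while the 7-digit number costs seven, which is the same effect, in miniature, that makes numbers and unfamiliar scripts comparatively expensive.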

Learn more in Module 3 — Tokens & Context Windows →
