Temperature

Also known as · sampling temperature

A setting that controls how random vs. deterministic a model's output is.

Temperature controls how 'sharp' or 'flat' a model's choice of the next token is. At each step the model produces a probability distribution over possible next tokens; temperature rescales that distribution by dividing the model's raw scores (logits) by the temperature before they are converted to probabilities. Near 0, the highest-probability token almost always wins, so output is close to deterministic, focused, and often repetitive. Higher values flatten the odds so less-likely tokens get a real chance: output becomes more varied and creative, but also less predictable.
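The rescaling can be sketched in a few lines of Python. This is a minimal illustration, not any particular model's implementation: the function name `softmax_with_temperature` and the toy logits are assumptions made for the example.

```python
import math

def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities after dividing by the temperature."""
    if temperature <= 0:
        # Real systems usually treat 0 as "always pick the top token";
        # this sketch simply rejects it to avoid dividing by zero.
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical scores for three candidate next tokens.
logits = [2.0, 1.0, 0.0]

low = softmax_with_temperature(logits, 0.2)   # sharp: top token dominates
high = softmax_with_temperature(logits, 2.0)  # flat: odds are closer together

print([round(p, 3) for p in low])
print([round(p, 3) for p in high])
```

At the low setting nearly all of the probability lands on the first token; at the high setting the three tokens end up much closer together.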

The practical rule: use low temperature (around 0–0.3) for tasks that demand precision and consistency — code, data extraction, factual answers — and higher temperature (around 0.7–1.0) for brainstorming and creative writing. Many consumer chat products default to around 0.7.
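To see why low settings suit tasks that need consistency, it may help to draw many samples from the same toy distribution at two temperatures. The sketch below assumes made-up tokens and logits, and the helper `sample_tokens` is hypothetical; it only illustrates the trend and does not reproduce any product's sampler.

```python
import math
import random

def softmax_with_temperature(logits, temperature):
    """Divide scores by the temperature, then normalize to probabilities."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def sample_tokens(tokens, logits, temperature, n, seed=0):
    """Draw n tokens at the given temperature using a seeded generator."""
    rng = random.Random(seed)
    probs = softmax_with_temperature(logits, temperature)
    return rng.choices(tokens, weights=probs, k=n)

# Hypothetical candidates for the next word and their raw scores.
tokens = ["blue", "grey", "green"]
logits = [2.0, 1.0, 0.0]

low_samples = sample_tokens(tokens, logits, temperature=0.05, n=200)
high_samples = sample_tokens(tokens, logits, temperature=1.0, n=200)

print("distinct tokens at 0.05:", set(low_samples))
print("distinct tokens at 1.0: ", set(high_samples))
```

With these toy numbers the low-temperature run keeps returning the top-scoring token, while the run at 1.0 mixes in the alternatives, which is the variety that brainstorming and creative writing benefit from.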

Temperature is one of a few sampling controls, alongside top-p and top-k, that shape the randomness of generation.
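For comparison, top-k and top-p work by trimming the candidate list rather than reshaping it. The following is a rough sketch under simple assumptions: the function names `top_k_filter` and `top_p_filter` and the example probabilities are invented for illustration, and production samplers may differ in details such as tie handling.

```python
def renormalize(probs, kept):
    """Zero out tokens outside `kept` and rescale the rest to sum to 1."""
    total = sum(probs[i] for i in kept)
    return [probs[i] / total if i in kept else 0.0 for i in range(len(probs))]

def top_k_filter(probs, k):
    """Keep only the k most likely tokens."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    return renormalize(probs, set(order[:k]))

def top_p_filter(probs, p):
    """Keep the smallest set of top tokens whose cumulative probability reaches p."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cumulative = set(), 0.0
    for i in order:
        kept.add(i)
        cumulative += probs[i]
        if cumulative >= p:
            break
    return renormalize(probs, kept)

# Hypothetical next-token probabilities (already temperature-scaled).
probs = [0.5, 0.3, 0.15, 0.05]

top_k_probs = top_k_filter(probs, k=2)    # only the two best tokens survive
top_p_probs = top_p_filter(probs, p=0.9)  # the three best cover at least 90%

print([round(x, 3) for x in top_k_probs])
print([round(x, 3) for x in top_p_probs])
```

In this sketch the filters are applied after temperature scaling, which is a common ordering but an assumption here; either way, all three controls narrow or widen the pool of tokens the model can pick from.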

Learn more in Module 8: Prompt Engineering.
