Top-k Sampling
Also known as · top k
A sampling setting that keeps only the k most likely next tokens.
Top-k sampling restricts the model to the k most likely next tokens, discards the rest, renormalizes the probabilities of the survivors, and samples from that fixed-size pool. If k is 1, the model always takes the single most likely token (greedy decoding, which is deterministic for a given input); larger k allows more variety.
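The procedure can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not any particular library's implementation; the function name `top_k_sample` and the plain list of logits are assumptions made for the example.

```python
import math
import random

def top_k_sample(logits, k, rng=random):
    """Sample one token index from the k highest-scoring logits (illustrative helper)."""
    if not 1 <= k <= len(logits):
        raise ValueError("k must be between 1 and the vocabulary size")
    # Indices of the k largest logits; ties go to the lower index.
    top = sorted(range(len(logits)), key=lambda i: -logits[i])[:k]
    # Softmax over the survivors only, which renormalizes their probabilities.
    m = max(logits[i] for i in top)
    weights = [math.exp(logits[i] - m) for i in top]
    return rng.choices(top, weights=weights, k=1)[0]

logits = [2.0, 1.0, 0.5, -1.0]       # made-up scores for a 4-token vocabulary
greedy = top_k_sample(logits, k=1)   # k = 1 always picks index 0 here
varied = top_k_sample(logits, k=2)   # k = 2 picks index 0 or 1
```

With k set to 1 the pool holds a single token, so the draw has only one possible outcome; with k set to 2 the two least likely tokens can never be chosen.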
Unlike top-p, top-k is a fixed count: it always keeps exactly k candidates regardless of how confident the model is. That makes it simple but less adaptive. The same k can be too restrictive when the model is uncertain, cutting off plausible tokens, and too loose when it's confident, letting unlikely tokens through.
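A small comparison makes the difference concrete. The two distributions below are invented for illustration, and the helper names `top_k_mass` and `top_p_count` are hypothetical; the sketch assumes top-p keeps the smallest set of tokens whose cumulative probability reaches p.

```python
def top_k_mass(probs, k):
    """Total probability covered by the k most likely tokens."""
    return sum(sorted(probs, reverse=True)[:k])

def top_p_count(probs, p):
    """Size of the smallest set of tokens whose cumulative probability reaches p."""
    total = 0.0
    for n, prob in enumerate(sorted(probs, reverse=True), start=1):
        total += prob
        if total >= p:
            return n
    return len(probs)

confident = [0.92, 0.04, 0.02, 0.01, 0.01]  # one token dominates
uncertain = [0.2, 0.2, 0.2, 0.2, 0.2]       # all tokens equally plausible

# top-k with k = 3 keeps three tokens in both cases.
mass_confident = top_k_mass(confident, 3)   # about 0.98, includes two unlikely tokens
mass_uncertain = top_k_mass(uncertain, 3)   # about 0.60, drops two plausible tokens

# top-p with p = 0.9 adapts the pool size to the distribution.
pool_confident = top_p_count(confident, 0.9)  # 1 token
pool_uncertain = top_p_count(uncertain, 0.9)  # 5 tokens
```

In this toy case k = 3 is too loose for the confident distribution and too restrictive for the uncertain one, while the top-p pool shrinks to one token and grows to five.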
In practice, top-k, top-p, and temperature are often used together, and a common approach is to tune temperature first and leave the others at their defaults.
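One way the three settings can be combined is sketched below. The order used here (temperature, then top-k, then top-p) is an assumption for the example; real libraries may apply the filters in a different order or with different edge-case handling, and `filtered_distribution` is a hypothetical name.

```python
import math

def filtered_distribution(logits, temperature=1.0, top_k=None, top_p=1.0):
    """Return {token_index: probability} after temperature, top-k, then top-p."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    if top_k is not None and top_k < 1:
        raise ValueError("top_k must be at least 1")
    # Temperature rescales the logits: below 1 sharpens, above 1 flattens.
    scaled = [x / temperature for x in logits]
    order = sorted(range(len(scaled)), key=lambda i: -scaled[i])
    if top_k is not None:
        order = order[:top_k]
    # Softmax over the tokens that survived top-k.
    m = scaled[order[0]]
    exps = [math.exp(scaled[i] - m) for i in order]
    z = sum(exps)
    probs = [e / z for e in exps]
    # Top-p: keep the shortest prefix whose cumulative probability reaches top_p.
    kept, total = [], 0.0
    for i, p in zip(order, probs):
        kept.append((i, p))
        total += p
        if total >= top_p:
            break
    # Renormalize so the kept probabilities sum to 1.
    return {i: p / total for i, p in kept}

logits = [2.0, 1.0, 0.0, -1.0]  # made-up scores
dist = filtered_distribution(logits, temperature=0.8, top_k=3, top_p=0.95)
```

The returned dictionary can be passed to a weighted draw such as `random.choices(list(dist), weights=list(dist.values()))` to pick the next token.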