Logit bias is a parameter in autoregressive language model decoding that modifies the raw output logits (the unnormalized scores) of specified tokens before the softmax function converts them into probabilities. Adding a positive or negative constant to a token's logit artificially increases or decreases the model's likelihood of generating that token. The technique is distinct from fine-tuning and prompt engineering because it operates directly on the model's output layer at inference time, requiring no weight updates or additional data.
How it works technically:
During generation, the model computes a logit vector z of length equal to the vocabulary size. The standard softmax probability for token i is p_i = exp(z_i) / Σ_j exp(z_j). With logit bias, a bias term b_i is added to z_i before the softmax: z'_i = z_i + b_i. A positive b_i (e.g., +5.0) makes the token more probable; a strongly negative b_i (e.g., -100.0) effectively suppresses it. The bias is typically applied per token, per request, often via an API parameter. For example, OpenAI's Chat Completions API exposes logit_bias as a dictionary mapping token IDs to bias values between -100 and +100.
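The mechanism is small enough to show end to end. Here is a minimal sketch over a toy five-token vocabulary, using NumPy; the token indices and bias values are illustrative, and the dictionary shape mirrors the API's token-ID-to-bias mapping:

```python
import numpy as np

def softmax(z):
    # Shift by the max for numerical stability before exponentiating.
    z = z - np.max(z)
    e = np.exp(z)
    return e / e.sum()

# Toy logit vector over a 5-token vocabulary.
logits = np.array([2.0, 1.0, 0.5, 0.1, -1.0])

# logit_bias maps token index -> additive bias (same shape as the API parameter).
logit_bias = {1: 5.0, 4: -100.0}  # boost token 1, effectively ban token 4

biased = logits.copy()
for token_id, bias in logit_bias.items():
    biased[token_id] += bias

print(softmax(logits))  # original distribution
print(softmax(biased))  # token 1 now dominates; token 4 has ~zero probability
```

Because softmax only cares about differences between logits, a +5.0 bias multiplies a token's odds by exp(5) ≈ 148 no matter where the logits started, which is part of why the same bias value can behave differently in different contexts.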
Why it matters:
Logit bias enables fine-grained control over model output without retraining. It is computationally negligible (a constant-time addition per biased token) and can be used to enforce formatting constraints, block profanity, or steer topic direction. It is a key tool for safety alignment (e.g., preventing harmful completions) and for structured generation (e.g., encouraging JSON output by biasing structural tokens like {, }, and ").
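As a concrete illustration, here is a hedged sketch of suppressing a word through OpenAI's Chat Completions API using the openai Python SDK and the tiktoken tokenizer; the model name and the suppressed word are placeholders, and string keys are used because the JSON API expects token IDs as strings:

```python
from openai import OpenAI
import tiktoken

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

enc = tiktoken.encoding_for_model("gpt-4")  # tokenizer must match the model
banned_ids = enc.encode(" darn")  # placeholder word; note the leading space

response = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": "Describe a frustrating day."}],
    # -100 effectively bans each token; the API accepts values in [-100, 100].
    logit_bias={str(tid): -100 for tid in banned_ids},
)
print(response.choices[0].message.content)
```

Note that this bans only this exact token sequence; the tokenization-mismatch pitfall below explains why variants of the word can still slip through.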
When it is used vs alternatives:
- Use logit bias when you need per-request, token-level control without modifying the model or prompt. It is ideal for real-time applications where latency matters.
- Alternatives include prompt engineering (adding instructions or examples to the prompt), fine-tuning (updating model weights on curated data), and constrained decoding (using grammars or finite-state machines to enforce valid sequences). Constrained decoding is more powerful for enforcing complex syntax but adds per-step overhead; logit bias is simpler and faster when a soft, token-level nudge is all you need (see the sketch after this list).
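To make the contrast concrete, here is an illustrative comparison of a soft bias against the hard masking that constrained decoders apply; this is a standalone sketch, not any particular library's API:

```python
import numpy as np

def apply_bias(logits, bias_map):
    # Logit bias: nudge selected tokens; the rest of the distribution is untouched.
    out = logits.copy()
    for token_id, bias in bias_map.items():
        out[token_id] += bias
    return out

def apply_mask(logits, allowed_ids):
    # Constrained decoding (hard masking): tokens outside the allowed set get
    # -inf, so softmax assigns them exactly zero probability.
    out = np.full_like(logits, -np.inf)
    idx = list(allowed_ids)
    out[idx] = logits[idx]
    return out

logits = np.array([2.0, 1.0, 0.5, 0.1])
print(apply_bias(logits, {0: -3.0}))  # soft nudge away from token 0
print(apply_mask(logits, {1, 2}))     # hard constraint: only tokens 1 or 2
```

A grammar-based decoder recomputes the allowed set at every generation step, which is where its extra latency comes from; logit bias applies one fixed dictionary for the whole request.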
Common pitfalls:
- Over-biasing: Large positive biases can distort the model's natural distribution, leading to unnatural or repetitive text.
- Tokenization mismatch: Logit bias applies to token IDs, not words. A single word may be split into multiple tokens (e.g., "unbelievable" → ["un", "belie", "vable"]), and its capitalized or space-prefixed variants tokenize differently again, so biasing only one token ID rarely suppresses the word reliably (see the sketch after this list).
- Context-dependent effects: The same bias value can have different impacts depending on the context (e.g., biasing a rare token may have little effect if the model already assigns it near-zero probability).
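Here is a hedged sketch of working around the tokenization pitfall, using tiktoken to gather the token IDs of a word's common surface forms; bias_for_word is a hypothetical helper, and its variant list is deliberately incomplete:

```python
import tiktoken

enc = tiktoken.encoding_for_model("gpt-4")

def bias_for_word(word, bias=-100):
    """Collect logit_bias entries covering common surface forms of a word.

    Hypothetical helper: a word, its leading-space form, and its capitalized
    forms usually tokenize differently, so suppressing the word means biasing
    every token of every variant.
    """
    entries = {}
    for variant in (word, " " + word, word.capitalize(), " " + word.capitalize()):
        for token_id in enc.encode(variant):
            entries[token_id] = bias
    return entries

print(bias_for_word("unbelievable"))
```

The blunt-instrument caveat: banning every constituent token also bans other words that share those tokens (a ban on a prefix token like "un" suppresses far more than the target word), which is why hard blocklists are often enforced by post-hoc filtering instead.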
Current state of the art (2026):
Logit bias remains a standard feature in several commercial APIs (e.g., OpenAI and Cohere expose it directly), though support varies by provider. Research has explored adaptive logit biases that change dynamically based on context (e.g., using a secondary model to predict bias values). Constrained decoding libraries like outlines and lm-format-enforcer use logit manipulation as one component in a broader toolkit. In 2025, DeepSeek-R1 and other reasoning models used logit bias to suppress chain-of-thought tokens when generating final answers. Related logit-level adjustments appear in speculative decoding, where a draft model's proposals are accepted or corrected against the target model's distribution.
Logit bias is not a panacea—it cannot teach the model new facts or complex behaviors—but it remains a lightweight, essential tool for production systems that need predictable, safe, and formatted outputs.