When a language model fails in a real product, it usually isnβt in a random way; it fails in one of a few specific modes. Knowing these modes is like having a checklist: you know where to look and how to test for it. Letβs go through the most important ones.
Comprehension: the model doesnβt understand the input
The first mode is a failure of comprehension: the model misses whatβs stated in the input, or loses the thread in long, multi-step contexts. The sign is a reply that answers something that was never asked. The way to catch it is tests that probe understanding of difficult and long inputs.
Reasoning: brittle logic
The second mode is a failure of reasoning: the model settles for surface answers, or slips on multi-step inference and novel logical chains. This mode can be exposed by benchmarking the model on new problems β not examples it likely saw in training.
Structured generation: the format breaks
The third mode is breaking on structured output: the model produces invalid JSON, invents a field, or returns a value outside the allowed list. This one is easy to catch β a structure validator downstream catches any broken output and logs it.
Fidelity: hallucination
Perhaps the most important mode is a failure of fidelity β what we call hallucination. The model ignores the given context and makes something up, or cites a source that doesnβt exist, or backs down from a clear truth under pressure. This mode is dangerous because its output looks right. The way to catch it is measuring fidelity on data where the correct answer is known, and checking how well the model stays anchored to the context.
Calibration: misplaced confidence
The fifth mode is a failure of calibration, with two faces. First, overconfidence: the model is sure about everything and never says βI donβt know.β Second, sycophancy: under user pressure, the model abandons its own correct reasoning and agrees with the userβs wrong answer. Both can be measured with tests that deliberately invite the model to err and see whether it holds firm or yields.
Instruction-following: rules get ignored
The sixth mode is a failure of instruction-following: the model violates multi-part rules, drops a constraint, or ignores negations. An important sub-case of this mode is prompt injection: the model follows instructions hidden inside the userβs input instead of the systemβs instructions. This can be exposed with instruction-following tests and adversarial inputs.
Initiative: it answers only what was asked
The last mode is a lack of initiative: the model answers only what was explicitly asked and never flags a gap it can see. This is fine for closed tasks but brittle for open ones. The test is giving open-ended tasks and seeing whether the model goes beyond the minimum.
From a list of failures to a list of tests
The key point is that these modes arenβt random, so you can build a test for each one. Instead of waiting for a user to discover a failure, turn these very modes into a checklist and measure every candidate model against it. This is the difference between a system that knows its failures and one thatβs surprised by each new one.