
What is a confusion matrix?

A confusion matrix is a table that shows how a classifier’s predictions compare to the true labels. It helps you see not just how often a model is right, but how it is wrong—which classes it confuses and where errors concentrate.

Binary confusion matrix (two classes)

For a yes/no model, the matrix typically includes the following four cells (a small counting sketch follows the list):

  1. True Positive (TP): predicted positive, actually positive
  2. True Negative (TN): predicted negative, actually negative
  3. False Positive (FP): predicted positive, actually negative
  4. False Negative (FN): predicted negative, actually positive
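As a minimal sketch, the four cells can be counted directly from paired labels and predictions. The `y_true` and `y_pred` lists below are made-up examples, with 1 marking the positive class.

```python
# Minimal sketch: counting the four cells of a binary confusion matrix.
# y_true and y_pred are illustrative lists; 1 = positive, 0 = negative.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

print(f"TP={tp} FP={fp}")
print(f"FN={fn} TN={tn}")
```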

From these values, teams compute common metrics (see the sketch after the list):

  1. Precision: when the model predicts positive, how often is it correct?
  2. Recall: of all actual positives, how many did the model catch?
  3. F1 score: the harmonic mean of precision and recall
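These metrics fall out of the cell counts directly. The sketch below reuses the illustrative counts from the previous example, with guards for the degenerate case where a denominator is zero.

```python
# Minimal sketch: precision, recall, and F1 from the four cell counts.
# The counts reuse the illustrative values from the previous sketch.
tp, fp, fn = 3, 1, 1

# precision = TP / (TP + FP); recall = TP / (TP + FN)
precision = tp / (tp + fp) if (tp + fp) else 0.0
recall = tp / (tp + fn) if (tp + fn) else 0.0

# F1 is the harmonic mean of precision and recall.
f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0

print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```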

Multi-class confusion matrix

For many classes, the matrix becomes a grid where each row is a true class and each column is a predicted class. This grid is one of the fastest ways to spot (a sketch follows the list):

  1. Two classes that are consistently mixed up
  2. A class the model almost never predicts
  3. A class with high false positives because it “looks like everything”
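A minimal sketch of such a grid, using made-up class names and predictions. Rows are true classes and columns are predicted classes, matching the convention above.

```python
from collections import Counter

# Minimal sketch: a multi-class confusion matrix as a nested count.
# The labels and predictions here are purely illustrative.
labels = ["cat", "dog", "bird"]
y_true = ["cat", "cat", "dog", "bird", "dog", "cat", "bird", "dog"]
y_pred = ["cat", "dog", "dog", "bird", "cat", "cat", "dog", "dog"]

# Count (true, predicted) pairs; missing pairs default to 0.
counts = Counter(zip(y_true, y_pred))

# Print the grid: each row is a true class, each column a predicted class.
print("true\\pred".ljust(10) + "".join(c.ljust(6) for c in labels))
for t in labels:
    print(t.ljust(10) + "".join(str(counts[(t, p)]).ljust(6) for p in labels))
```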

Why it matters for data teams

Confusion matrices are practical because they often point directly to data actions (a small sketch follows the list):

  1. Add more examples for a weak class
  2. Split a confusing class into clearer definitions
  3. Improve labeling guidelines
  4. Collect “hard negatives” that look similar but are different
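As a rough illustration, the sketch below reuses the counts from the grid above and turns them into per-class signals: low recall flags a class that needs more examples, and the largest off-diagonal cell in a row points to the pair that may need clearer label definitions or hard negatives.

```python
from collections import Counter

# Minimal sketch: per-class signals from the illustrative grid above.
labels = ["cat", "dog", "bird"]
counts = Counter({("cat", "cat"): 2, ("cat", "dog"): 1,
                  ("dog", "dog"): 2, ("dog", "cat"): 1,
                  ("bird", "bird"): 1, ("bird", "dog"): 1})

for t in labels:
    row_total = sum(counts[(t, p)] for p in labels)
    recall = counts[(t, t)] / row_total if row_total else 0.0
    # The largest off-diagonal cell in the row shows which class this one is
    # most often mistaken for, a hint for hard negatives or clearer guidelines.
    worst_count, worst_label = max((counts[(t, p)], p) for p in labels if p != t)
    print(f"{t}: recall={recall:.2f}, most confused with '{worst_label}' ({worst_count} errors)")
```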