What is a confusion matrix?
A confusion matrix is a table that shows how a classifier’s predictions compare to the true labels. It helps you see not just how often a model is right, but how it is wrong—which classes it confuses and where errors concentrate.
Binary confusion matrix (two classes)
For a yes/no model, the matrix breaks every prediction into one of four counts (tallied in the short sketch after this list):
- True Positive (TP): predicted positive, actually positive
- True Negative (TN): predicted negative, actually negative
- False Positive (FP): predicted positive, actually negative
- False Negative (FN): predicted negative, actually positive
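As a rough illustration, the four counts can be tallied directly from paired lists of true and predicted labels. The data and variable names below are made up for this sketch:

```python
# Hypothetical label lists: 1 = positive, 0 = negative.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

# Tally each of the four outcomes by comparing predictions to true labels.
tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)  # predicted positive, actually positive
tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)  # predicted negative, actually negative
fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)  # predicted positive, actually negative
fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)  # predicted negative, actually positive

print(tp, tn, fp, fn)  # 3 3 1 1
```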
From these four counts, teams compute common metrics (calculated in the sketch after this list):
- Precision: when the model predicts positive, how often is it correct?
- Recall: of all actual positives, how many did the model catch?
- F1 score: the harmonic mean of precision and recall, a single number that balances the two
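Continuing the counting sketch above, the three metrics follow from the standard formulas (handling of empty denominators is omitted for brevity):

```python
# precision = TP / (TP + FP), recall = TP / (TP + FN), F1 = harmonic mean of the two.
precision = tp / (tp + fp)                          # 3 / 4 = 0.75
recall = tp / (tp + fn)                             # 3 / 4 = 0.75
f1 = 2 * precision * recall / (precision + recall)  # 0.75

print(f"precision={precision:.2f}, recall={recall:.2f}, f1={f1:.2f}")
```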
Multi-class confusion matrix
For many classes, the matrix becomes a grid where each row is a true class and each column is a predicted class (see the sketch after this list). This grid is one of the fastest ways to spot:
- Two classes that are consistently mixed up
- A class the model almost never predicts
- A class with high false positives because it “looks like everything”
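As a sketch, assuming scikit-learn is available, a multi-class matrix can be built with confusion_matrix; the class names and labels below are invented for the example:

```python
from sklearn.metrics import confusion_matrix

labels = ["cat", "dog", "fox"]
y_true = ["cat", "dog", "dog", "fox", "cat", "fox", "dog"]
y_pred = ["cat", "dog", "fox", "fox", "cat", "dog", "dog"]

# Rows are true classes, columns are predicted classes, ordered by `labels`.
cm = confusion_matrix(y_true, y_pred, labels=labels)
print(cm)
# [[2 0 0]
#  [0 2 1]
#  [0 1 1]]
```

Reading the grid, the off-diagonal cells show that "dog" and "fox" are occasionally mixed up, while "cat" is never confused with anything.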
Why it matters for data teams
Confusion matrices are practical because they often point directly to data actions (the sketch after this list shows how to surface these signals from the matrix itself):
- Add more examples for a weak class
- Split a confusing class into clearer definitions
- Improve labeling guidelines
- Collect “hard negatives” that look similar but are different
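One way to turn the matrix into those actions, sketched with NumPy on the hypothetical grid from the previous example: row-normalized recall highlights the class that needs more (or better-labeled) examples, and the largest off-diagonal cell points at the pair of classes whose definitions or hard negatives need attention.

```python
import numpy as np

# The hypothetical 3x3 matrix from the multi-class sketch (rows = true, columns = predicted).
cm = np.array([[2, 0, 0],
               [0, 2, 1],
               [0, 1, 1]])
labels = ["cat", "dog", "fox"]

# Per-class recall: diagonal over row sums, i.e. how much of each true class the model catches.
recall_per_class = cm.diagonal() / cm.sum(axis=1)
weakest = labels[int(np.argmin(recall_per_class))]
print(dict(zip(labels, recall_per_class.round(2))), "-> weakest class:", weakest)

# Most-confused pair: the largest off-diagonal count.
off_diag = cm.copy()
np.fill_diagonal(off_diag, 0)
i, j = np.unravel_index(off_diag.argmax(), off_diag.shape)
print(f"most confused: true '{labels[i]}' predicted as '{labels[j]}' ({off_diag[i, j]} time(s))")
```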