What is the F1 Score in Machine Learning?

📊 Accuracy can lie to you. The F1 Score won’t.

If your dataset has imbalanced classes (e.g., 95% negative, 5% positive), a model that always predicts “negative” reaches 95% accuracy. Yet its F1 Score is 0, because it never detects a single positive. That’s exactly what you need to know.
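To see the gap in numbers, here is a minimal sketch in plain Python. The 100-sample dataset is made up for illustration, and F1 is written in its count form, 2·TP / (2·TP + FP + FN), which is algebraically the same as the harmonic mean of precision and recall. Treating an empty denominator as 0 is an assumption of this sketch.

```python
# Hypothetical imbalanced dataset: 5 positives, 95 negatives.
y_true = [1] * 5 + [0] * 95
# A "model" that always predicts the negative class.
y_pred = [0] * 100

# Accuracy: fraction of predictions that match the true label.
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

# Confusion counts for the positive class.
tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))

# F1 in terms of counts; defined as 0.0 here when the denominator is 0
# (an assumed convention for this sketch).
denom = 2 * tp + fp + fn
f1 = 2 * tp / denom if denom else 0.0

print(f"Accuracy: {accuracy:.2f}")  # Accuracy: 0.95
print(f"F1 Score: {f1:.2f}")        # F1 Score: 0.00
```

The model is right 95 times out of 100, but all 5 positives are missed, so TP = 0 and F1 collapses to 0.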

🧮 The three key metrics:

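| Metric | Question it answers | Formula |
|---|---|---|
| Precision | Of everything I predicted positive, how much actually is positive? | TP / (TP + FP) |
| Recall | Of all real positives, how many did I detect? | TP / (TP + FN) |
| F1 Score | How balanced are the two? | Harmonic mean of Precision and Recall |

Here TP, FP, and FN are the counts of true positives, false positives, and false negatives.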
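The three definitions translate almost line by line into code. The sketch below is one possible hand-rolled version, not scikit-learn's implementation: the function name `precision_recall_f1`, the `positive` parameter, and the labels are all made up for illustration, and returning 0.0 for an empty denominator is an assumed convention.

```python
def precision_recall_f1(y_true, y_pred, positive=1):
    """Return (precision, recall, f1) for the class labelled `positive`."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)  # true positives
    fp = sum(t != positive and p == positive for t, p in pairs)  # false positives
    fn = sum(t == positive and p != positive for t, p in pairs)  # false negatives

    # Guard against empty denominators (assumed convention: return 0.0).
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Made-up labels: 4 real positives, 4 real negatives.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 1, 0, 0, 0]

precision, recall, f1 = precision_recall_f1(y_true, y_pred)
print(f"Precision: {precision:.2f}")  # Precision: 0.67
print(f"Recall:    {recall:.2f}")     # Recall:    0.50
print(f"F1 Score:  {f1:.2f}")         # F1 Score:  0.57
```

With TP = 2, FP = 1, and FN = 2, precision is 2/3 and recall is 2/4, and F1 lands between them but closer to the lower value.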

📐 The formula:

$$\text{F1} = 2 \times \frac{\text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}}$$

The harmonic mean penalizes extreme values. If precision = 0.90 and recall = 0.10, F1 = 0.18, far below the arithmetic mean of 0.50.
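A quick way to feel the penalty is to hold the arithmetic mean fixed and let precision and recall drift apart. The pairs below are illustrative values, not measurements, and the helper names `arithmetic_mean` and `harmonic_mean` are made up for this sketch.

```python
def arithmetic_mean(precision, recall):
    return (precision + recall) / 2


def harmonic_mean(precision, recall):
    # Same expression as the F1 formula; 0.0 when both inputs are 0
    # (an assumed convention for this sketch).
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0


# Illustrative (precision, recall) pairs, all with an arithmetic mean of 0.50.
pairs = [(0.50, 0.50), (0.70, 0.30), (0.90, 0.10), (1.00, 0.00)]

for p, r in pairs:
    print(
        f"P={p:.2f} R={r:.2f}  "
        f"arithmetic={arithmetic_mean(p, r):.2f}  F1={harmonic_mean(p, r):.2f}"
    )

# P=0.50 R=0.50  arithmetic=0.50  F1=0.50
# P=0.70 R=0.30  arithmetic=0.50  F1=0.42
# P=0.90 R=0.10  arithmetic=0.50  F1=0.18
# P=1.00 R=0.00  arithmetic=0.50  F1=0.00
```

The arithmetic mean never moves, while F1 drops toward the weaker of the two metrics and reaches 0 as soon as either one is 0.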

🐍 In Python with scikit-learn:

from sklearn.metrics import f1_score, classification_report

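# Ground-truth labels and model predictions (1 = positive, 0 = negative)
y_true = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 0, 1, 1, 0, 0, 0, 1, 0, 0]

# TP = 3, FP = 1, FN = 2, so precision = 0.75 and recall = 0.60
print(f"F1 Score: {f1_score(y_true, y_pred):.2f}")
# F1 Score: 0.67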

print(classification_report(y_true, y_pred))

📏 How to interpret the result (a rough guide; the right threshold depends on the problem):

  • Below 0.6: Poor model
  • 0.6–0.8: Acceptable (may be a good starting point)
  • 0.8–0.9: Strong model
  • Above 0.9: Excellent

⚠️ When NOT to use F1?

  • When one type of error is far more costly than the other (use Precision or Recall individually)
  • When classes are balanced and all errors are equally costly (use accuracy)

Also published on LinkedIn.
Juan Pedro Bretti Mandarano
Author