What is the F1 Score in Machine Learning?

📊 Accuracy can lie to you. The F1 Score won’t.

If your dataset has imbalanced classes (e.g., 95% negative, 5% positive), a model that always predicts “negative” gets 95% accuracy. But its F1 Score for the positive class would be 0, because it never detects a single positive (recall = 0). That’s exactly what you need to know.
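To make the 95% example concrete, here is a minimal sketch in plain Python (no libraries). The synthetic labels and the always-negative "model" are assumptions for illustration. F1 is computed in its count form, 2·TP / (2·TP + FP + FN), and treated as 0 when that denominator is zero.

```python
# Synthetic, illustrative data: 95 negatives (0) and 5 positives (1).
y_true = [0] * 95 + [1] * 5
# A "model" that always predicts the negative class.
y_pred = [0] * len(y_true)

pairs = list(zip(y_true, y_pred))
tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

accuracy = sum(1 for t, p in pairs if t == p) / len(pairs)

# Count form of F1; equivalent to the harmonic mean of precision and recall.
denom = 2 * tp + fp + fn
f1 = 2 * tp / denom if denom else 0.0

print(f"Accuracy: {accuracy:.2f}")  # Accuracy: 0.95
print(f"F1 Score: {f1:.2f}")        # F1 Score: 0.00
```

The model looks strong on accuracy only because negatives dominate; F1 exposes that it finds none of the positives.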

🧮 The three key metrics:

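| Metric | Question it answers |
| --- | --- |
| Precision | Of everything I predicted positive, how much actually is positive? `TP / (TP + FP)` |
| Recall | Of all real positives, how many did I detect? `TP / (TP + FN)` |
| F1 Score | How well are the two balanced? Harmonic mean of Precision and Recall |

Here TP = true positives, FP = false positives, FN = false negatives.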
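The precision and recall definitions can be sketched as two small helper functions. The function names and the example counts (a hypothetical spam filter) are assumptions for illustration; returning 0.0 when a denominator is zero is one common convention, not the only one.

```python
def precision(tp: int, fp: int) -> float:
    """Share of predicted positives that are truly positive: TP / (TP + FP)."""
    return tp / (tp + fp) if (tp + fp) else 0.0


def recall(tp: int, fn: int) -> float:
    """Share of real positives that were detected: TP / (TP + FN)."""
    return tp / (tp + fn) if (tp + fn) else 0.0


# Hypothetical counts: 40 spam emails caught, 10 legitimate emails
# wrongly flagged, 20 spam emails missed.
tp, fp, fn = 40, 10, 20
p = precision(tp, fp)
r = recall(tp, fn)

print(f"Precision: {p:.2f}")  # Precision: 0.80
print(f"Recall:    {r:.2f}")  # Recall:    0.67
```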

📐 The formula:

$$\text{F1} = 2 \times \frac{\text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}}$$

The harmonic mean is pulled toward the smaller of the two values, so it penalizes imbalance. With precision = 0.90 and recall = 0.10, F1 = 0.18, not the 0.50 an arithmetic mean would give.
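A quick sketch of that effect: the helper below (a hypothetical name, `f1_from`) compares the harmonic mean with the arithmetic mean for a few illustrative precision/recall pairs.

```python
def f1_from(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0


# Illustrative pairs that all share the same arithmetic mean (0.50).
for p, r in [(0.90, 0.10), (0.70, 0.30), (0.50, 0.50)]:
    arithmetic = (p + r) / 2
    print(f"P={p:.2f} R={r:.2f}  arithmetic={arithmetic:.2f}  F1={f1_from(p, r):.2f}")

# P=0.90 R=0.10  arithmetic=0.50  F1=0.18
# P=0.70 R=0.30  arithmetic=0.50  F1=0.42
# P=0.50 R=0.50  arithmetic=0.50  F1=0.50
```

The arithmetic mean stays at 0.50 in every case, while F1 drops as the gap between precision and recall widens.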

🐍 In Python with scikit-learn:

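```python
from sklearn.metrics import f1_score, classification_report

# Ground-truth labels and model predictions (1 = positive, 0 = negative)
y_true = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 0, 1, 1, 0, 0, 0, 1, 0, 0]

print(f"F1 Score: {f1_score(y_true, y_pred):.2f}")
# F1 Score: 0.67

# Per-class precision, recall, F1 and support
print(classification_report(y_true, y_pred))
```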
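To see where the 0.67 comes from, the same labels can be tallied by hand. This is a sanity-check sketch in plain Python, not a description of how scikit-learn computes the score internally.

```python
from collections import Counter

y_true = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 0, 1, 1, 0, 0, 0, 1, 0, 0]

# Count each (true label, predicted label) pair.
counts = Counter(zip(y_true, y_pred))
tp, fn = counts[(1, 1)], counts[(1, 0)]
fp, tn = counts[(0, 1)], counts[(0, 0)]
print(f"TP={tp} FP={fp} FN={fn} TN={tn}")  # TP=3 FP=1 FN=2 TN=4

precision = tp / (tp + fp)  # 3 / 4 = 0.75
recall = tp / (tp + fn)     # 3 / 5 = 0.60
f1 = 2 * precision * recall / (precision + recall)
print(f"F1 Score: {f1:.2f}")  # F1 Score: 0.67
```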

📏 How to interpret the result (a rough rule of thumb; what counts as good depends on the problem):

Below 0.5: Poor model
0.5–0.7: Acceptable (may be a good starting point)
0.7–0.9: Strong model
0.9–1.0: Excellent

⚠️ When NOT to use F1?

When one type of error is far more costly than the other (use Precision or Recall individually)
When classes are balanced and every error costs the same (accuracy is enough)

What is F1 Score in Machine Learning?

Learn what the F1 score is, how it is calculated, when to use it, and how it compares to accuracy, with clear formulas and Python examples.

www.analyticsvidhya.com ↗

Also published on LinkedIn.

Author

Juan Pedro Bretti Mandarano