Metrics in Machine Learning (Zalo: 0349942449)

By Việt Nguyễn AI

Key Concepts

  • Binary Classification: A machine learning task where the goal is to categorize data into one of two distinct classes (e.g., Spam vs. Not Spam).
  • Confusion Matrix: A table used to evaluate the performance of a classification model by comparing predicted labels against actual labels.
  • True Positive (TP): Correctly predicted positive instances.
  • False Positive (FP): Negative instances incorrectly predicted as positive (Type I error).
  • True Negative (TN): Correctly predicted negative instances.
  • False Negative (FN): Positive instances incorrectly predicted as negative (Type II error).
  • Precision: The ratio of correctly predicted positive observations to the total predicted positives.
  • Recall (Sensitivity): The ratio of correctly predicted positive observations to all actual positives.
  • F1-Score: The harmonic mean of Precision and Recall, providing a balance between the two metrics.

1. Binary Classification Overview

The video discusses the fundamental concepts of binary classification in machine learning. The primary objective is to distinguish between two classes, often labeled as "positive" and "negative." A common real-world application mentioned is email filtering, where the system must classify incoming messages as either "spam" or "not spam."

2. The Confusion Matrix

To measure how well a model performs, the speaker introduces the Confusion Matrix. This framework organizes predictions into four categories, illustrated in the counting sketch after the list:

  • True Positive (TP): The model correctly identifies a positive case.
  • True Negative (TN): The model correctly identifies a negative case.
  • False Positive (FP): The model incorrectly labels a negative case as positive.
  • False Negative (FN): The model incorrectly labels a positive case as negative.
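To make these four cells concrete, here is a minimal Python sketch; the label lists and the convention 1 = positive ("spam") / 0 = negative ("not spam") are illustrative assumptions, not taken from the video:

```python
# Minimal sketch: counting the four confusion-matrix cells for a binary task.
# Assumption (not from the video): labels are 1 = positive (e.g. "spam")
# and 0 = negative ("not spam"); the example data is invented for illustration.

def confusion_counts(y_true, y_pred):
    """Return (TP, FP, TN, FN) counted over paired actual/predicted labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, fp, tn, fn

y_true = [1, 0, 1, 1, 0, 0, 1, 0]   # actual labels
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]   # model predictions

tp, fp, tn, fn = confusion_counts(y_true, y_pred)
print(f"TP={tp} FP={fp} TN={tn} FN={fn}")   # TP=3 FP=1 TN=3 FN=1
```

Libraries such as scikit-learn offer an equivalent confusion_matrix function; the hand-rolled version above is only meant to show what each of the four cells counts.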

3. Performance Metrics: Precision and Recall

The speaker emphasizes that accuracy alone is often insufficient for evaluating models, especially when classes are imbalanced. Two critical metrics are defined, with a worked example after the list:

  • Precision: Defined as the accuracy of positive predictions. It answers the question: "Of all instances predicted as positive, how many were actually positive?"
    • Formula: $TP / (TP + FP)$
  • Recall (Sensitivity): Defined as the ability of the model to find all positive instances. It answers the question: "Of all actual positive instances, how many did the model correctly identify?"
    • Formula: $TP / (TP + FN)$
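As a worked illustration of why accuracy alone can mislead on imbalanced data, the sketch below applies both formulas to a toy dataset in which positives are rare; all numbers are invented for illustration, not taken from the video:

```python
# Illustrative sketch: on an imbalanced dataset, accuracy looks strong while
# precision and recall expose the weakness. All numbers are invented.
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]   # only 2 of 10 samples are positive
y_pred = [0, 0, 0, 0, 0, 0, 0, 0, 1, 0]   # the model finds just 1 of them

tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)  # 1
fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)  # 0
fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)  # 1
tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)  # 8

accuracy  = (tp + tn) / len(y_true)   # 0.9 -- looks impressive
precision = tp / (tp + fp)            # TP / (TP + FP) = 1.0
recall    = tp / (tp + fn)            # TP / (TP + FN) = 0.5

print(f"accuracy={accuracy:.2f} precision={precision:.2f} recall={recall:.2f}")
```

Here a 90% accuracy coexists with a recall of only 0.5 (half the positives are missed), which is exactly the failure mode the speaker warns about.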

4. Balancing Metrics with F1-Score

The speaker highlights the trade-off between Precision and Recall. Improving one often leads to a decrease in the other. To address this, the F1-Score is introduced as a single metric that combines both Precision and Recall using the harmonic mean. This is particularly useful when a balance between the two is required for a robust model evaluation.
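Concretely, the harmonic mean is $F1 = 2 \times Precision \times Recall / (Precision + Recall)$. A brief sketch, reusing the illustrative precision and recall values from the previous section:

```python
# Sketch: F1 as the harmonic mean of precision and recall.
# The input values continue the invented example above (not from the video).
precision, recall = 1.0, 0.5

f1 = 2 * precision * recall / (precision + recall)
print(f"F1 = {f1:.3f}")   # 0.667 -- pulled toward the weaker of the two metrics
```

Because the harmonic mean is dominated by the smaller value, a model cannot achieve a high F1-Score by excelling at only one of the two metrics.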

5. Synthesis and Conclusion

The main takeaway is that evaluating a binary classification model requires more than a simple accuracy percentage. By using a Confusion Matrix, practitioners gain granular insight into where the model fails (e.g., whether it produces too many false positives or false negatives). Precision and Recall are the primary tools for this diagnosis, while the F1-Score provides a consolidated view of performance that stays meaningful even when the two metrics pull in opposite directions.
