Machine Learning Tutorial Part 15: Cross-Validation & Evaluation Metrics
🔁 What is Cross-Validation?
Cross-validation evaluates model performance on unseen data. It divides the dataset into several parts (folds), training on some and validating on the rest, so you can estimate how well the model generalizes.
🔹 K-Fold Cross Validation
Splits the data into k equal parts (folds). The model trains on k-1 folds and validates on the remaining fold, repeating until every fold has served as the validation set once.
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

X, y = load_iris(return_X_y=True)  # example dataset so the snippet is runnable
model = RandomForestClassifier(random_state=42)
# Fit and score the model on 5 different train/validation splits
scores = cross_val_score(model, X, y, cv=5)
print("CV Accuracy:", scores.mean())
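To see which samples land in each fold, here is a minimal sketch that drives scikit-learn's KFold splitter directly, reusing the X from the snippet above (the shuffle and seed are illustrative choices, not requirements):
from sklearn.model_selection import KFold

kf = KFold(n_splits=5, shuffle=True, random_state=42)
for fold, (train_idx, val_idx) in enumerate(kf.split(X), start=1):
    # Each fold serves as the validation set exactly once
    print(f"Fold {fold}: train={len(train_idx)} samples, validate={len(val_idx)} samples")
cross_val_score handles this loop for you; reaching for KFold directly is mainly useful when you need custom per-fold logic.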
📚 Evaluation Metrics
Metrics help assess classification or regression performance. Common classification metrics include (a worked numeric example follows the list):
- Accuracy: (TP + TN) / Total
- Precision: TP / (TP + FP)
- Recall: TP / (TP + FN)
- F1-Score: 2 * (Precision * Recall) / (Precision + Recall)
- Confusion Matrix: Shows TP, TN, FP, FN in a table.
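To make the formulas concrete, here is a small hand computation using hypothetical counts (TP = 8, FP = 2, FN = 4, TN = 86; the numbers are made up for illustration):
# Hypothetical confusion-matrix counts, chosen only for illustration
TP, FP, FN, TN = 8, 2, 4, 86
accuracy = (TP + TN) / (TP + TN + FP + FN)          # 94 / 100 = 0.94
precision = TP / (TP + FP)                          # 8 / 10 = 0.80
recall = TP / (TP + FN)                             # 8 / 12 ≈ 0.667
f1 = 2 * precision * recall / (precision + recall)  # ≈ 0.727
print(accuracy, precision, recall, f1)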
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report, confusion_matrix
# cross_val_score does not fit the model in place, so train it on a split first
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
model.fit(X_train, y_train)
y_pred = model.predict(X_test)
print(confusion_matrix(y_test, y_pred))
print(classification_report(y_test, y_pred))
⚖️ When to Use What?
- Accuracy: Good for balanced datasets.
- Precision: Important when false positives are costly (e.g., spam detection).
- Recall: Important when false negatives are costly (e.g., cancer detection).
- F1: Use when there's class imbalance and you need a single score that balances precision and recall (see the sketch after this list).
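These preferences plug directly into cross-validation: cross_val_score accepts a scoring parameter, so you can evaluate folds with the metric that matters for your problem. A minimal sketch, reusing the model, X, and y from the earlier snippets (the choice of macro-averaged F1 is illustrative):
from sklearn.model_selection import cross_val_score

# Score each fold with macro-averaged F1 instead of the default accuracy
f1_scores = cross_val_score(model, X, y, cv=5, scoring="f1_macro")
print("CV F1 (macro):", f1_scores.mean())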
📝 Quiz 1: Cross-Validation
Q: What does a 5-fold cross-validation mean?
A: The dataset is split into 5 equal parts. Each fold gets a chance to be the validation set, while the remaining 4 are used for training.
📝 Quiz 2: Evaluation Metrics
Q: Which metric is best to use when detecting rare diseases?
A: Recall. Missing a positive case (false negative) in rare diseases can be dangerous, so we focus on detecting all true positives.
📝 Quiz 3: Precision vs Recall
Q: If a spam filter marks too many real emails as spam, which metric is likely too low?
A: Precision. A low precision means many false positives—legitimate emails wrongly marked as spam.
📌 Summary
- Use cross-validation to get a reliable estimate of model performance.
- Choose evaluation metrics based on your problem context.
- Balance the trade-off between precision and recall (or use F1 to summarize it) based on which errors are riskier for your application.