Machine Learning Tutorial Part 10: Hyperparameter Tuning & Grid Search
⚙️ What is Hyperparameter Tuning?
Hyperparameters are settings that control the learning process itself (e.g., number of trees, tree depth, learning rate). Unlike model parameters, they are not learned from the data but set before training. Tuning them can significantly improve model performance.
🔍 Grid Search with Cross-Validation
Grid Search exhaustively tries all combinations of parameters using cross-validation to find the best set.
from sklearn.model_selection import GridSearchCV
from sklearn.ensemble import RandomForestClassifier
param_grid = {
    'n_estimators': [50, 100, 200],
    'max_depth': [None, 10, 20],
    'min_samples_split': [2, 5]
}
clf = RandomForestClassifier()
grid_search = GridSearchCV(estimator=clf, param_grid=param_grid, cv=5)
grid_search.fit(X_train, y_train)
print("Best Parameters:", grid_search.best_params_)
print("Best Score:", grid_search.best_score_)
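The snippet above assumes X_train and y_train already exist. Here is a minimal, self-contained sketch of the same workflow on synthetic data, with a deliberately tiny grid so it runs quickly; it also shows that best_estimator_ is refit on the full training set and can be scored directly on held-out data.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic data stands in for your real training set.
X, y = make_classification(n_samples=300, n_features=10, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# A small grid keeps the demo fast; real grids are usually larger.
param_grid = {
    'n_estimators': [10, 20],
    'max_depth': [None, 5],
}

grid_search = GridSearchCV(RandomForestClassifier(random_state=42),
                           param_grid, cv=3)
grid_search.fit(X_train, y_train)

# best_estimator_ is already refit on all of X_train,
# so it can be evaluated directly on the held-out test data.
test_acc = grid_search.best_estimator_.score(X_test, y_test)
print("Best Parameters:", grid_search.best_params_)
print("Test Accuracy:", test_acc)
```

Scoring on a separate test set, rather than reporting best_score_, gives an estimate of generalization that was not used to choose the hyperparameters.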
🎲 Randomized Search
Randomized Search samples a fixed number of parameter combinations at random (set by n_iter) and is faster when the search space is large.
from sklearn.model_selection import RandomizedSearchCV
from scipy.stats import randint
param_dist = {
    'n_estimators': randint(50, 200),
    'max_depth': [None, 10, 20, 30],
    'min_samples_split': randint(2, 10)
}
random_search = RandomizedSearchCV(estimator=clf, param_distributions=param_dist, cv=5, n_iter=10, random_state=42)
random_search.fit(X_train, y_train)
print("Best Parameters:", random_search.best_params_)
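To see why this scales better, the runnable sketch below (on synthetic data, with toy sizes chosen for speed) confirms that only n_iter candidate settings are evaluated, no matter how many combinations the distributions could produce:

```python
from scipy.stats import randint
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import RandomizedSearchCV

X, y = make_classification(n_samples=200, n_features=8, random_state=0)

# The distributions span hundreds of possible combinations...
param_dist = {
    'n_estimators': randint(10, 50),
    'max_depth': [None, 5, 10, 20],
    'min_samples_split': randint(2, 10),
}

# ...but only n_iter=5 of them are sampled and cross-validated.
search = RandomizedSearchCV(RandomForestClassifier(random_state=0),
                            param_distributions=param_dist,
                            n_iter=5, cv=3, random_state=0)
search.fit(X, y)

print("Candidates evaluated:", len(search.cv_results_['params']))  # 5
print("Best Parameters:", search.best_params_)
```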
📌 Tips for Tuning
- Use GridSearchCV when the parameter space is small and precise tuning is needed.
- Use RandomizedSearchCV for a faster, approximate search over large spaces.
- Always use cross-validation to prevent overfitting during tuning.
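A quick back-of-the-envelope check makes the cost trade-off concrete: grid search fits one model per parameter combination per cross-validation fold, so the grid from earlier in this post already requires 90 fits.

```python
from math import prod

# The grid used in the GridSearchCV example above.
param_grid = {
    'n_estimators': [50, 100, 200],
    'max_depth': [None, 10, 20],
    'min_samples_split': [2, 5],
}
cv = 5

# Total fits = product of grid sizes * number of CV folds.
n_candidates = prod(len(v) for v in param_grid.values())  # 3 * 3 * 2 = 18
n_fits = n_candidates * cv                                # 18 * 5 = 90
print(n_candidates, n_fits)  # 18 90
```

Every extra parameter multiplies this count, while RandomizedSearchCV keeps it fixed at n_iter * cv.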