Machine Learning Tutorial Part 13: Hyperparameter Tuning (GridSearchCV & RandomizedSearchCV)
⚙️ What is Hyperparameter Tuning?
Hyperparameters are configuration settings chosen before training begins; unlike model parameters (such as a tree's split thresholds or a network's weights), they are not learned from the data. Choosing them well can make the difference between a mediocre model and a strong one. Let's explore two standard scikit-learn techniques: GridSearchCV and RandomizedSearchCV.
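To make the distinction concrete, here is a minimal sketch (assuming RandomForestClassifier and the built-in iris dataset purely for illustration): hyperparameters go into the constructor, while learned parameters only exist after fit().

from sklearn.ensemble import RandomForestClassifier
from sklearn.datasets import load_iris

X, y = load_iris(return_X_y=True)

# Hyperparameters: set by us, before training
model = RandomForestClassifier(n_estimators=100, max_depth=10)

# Learned parameters: produced by training (here, the 100 fitted trees)
model.fit(X, y)
print(len(model.estimators_))  # 100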
🔍 GridSearchCV
GridSearchCV exhaustively evaluates every combination in a specified parameter grid, scoring each one with cross-validation. Because the number of fits grows multiplicatively with each added parameter, it is practical only for small search spaces.
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.ensemble import RandomForestClassifier
from sklearn.datasets import load_iris

# Example data; substitute your own training and test sets
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# Every combination is tried: 2 x 3 = 6 candidates
param_grid = {
    'n_estimators': [100, 200],
    'max_depth': [10, 20, None]
}

# Each candidate is scored with 5-fold cross-validation (30 fits total)
grid_search = GridSearchCV(estimator=RandomForestClassifier(), param_grid=param_grid, cv=5)
grid_search.fit(X_train, y_train)
print("Best Parameters:", grid_search.best_params_)
🎯 RandomizedSearchCV
RandomizedSearchCV samples a fixed number of hyperparameter settings (n_iter) from specified distributions or lists, rather than trying every combination. For large search spaces this is usually far cheaper than GridSearchCV, and it often finds a comparably good configuration.
from sklearn.model_selection import RandomizedSearchCV
from scipy.stats import randint

# n_estimators is sampled from integers in [100, 200); max_depth from a list
param_dist = {
    'n_estimators': randint(100, 200),
    'max_depth': [10, 20, None]
}

# n_iter sets the budget: 100 sampled settings x 5 folds = 500 fits
random_search = RandomizedSearchCV(estimator=RandomForestClassifier(), param_distributions=param_dist, n_iter=100, cv=5, random_state=42)
random_search.fit(X_train, y_train)
print("Best Parameters:", random_search.best_params_)
📈 Evaluating Model Performance
After tuning, evaluate the best model on the held-out test set using metrics such as accuracy, precision, recall, and F1-score, as covered in previous tutorials.
from sklearn.metrics import accuracy_score

# best_estimator_ is refit on the full training set with the best parameters
best_model = grid_search.best_estimator_
y_pred = best_model.predict(X_test)
print("Accuracy:", accuracy_score(y_test, y_pred))
📌 Summary
- GridSearchCV: exhaustive search; best for small parameter spaces.
- RandomizedSearchCV: random sampling with a fixed budget (n_iter); best for large parameter spaces.
- Both use cross-validation to find hyperparameter settings that boost model performance.