Hyperparameter Tuning (GridSearchCV & RandomizedSearchCV)

Machine Learning Tutorial Part 13: Hyperparameter Tuning (GridSearchCV & RandomizedSearchCV)

⚙️ What is Hyperparameter Tuning?

Hyperparameters are settings chosen before training a model, such as the number of trees in a random forest, rather than values learned from the data. Tuning them is often essential for improving model performance. Let’s explore two key scikit-learn techniques: GridSearchCV and RandomizedSearchCV.
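
To make the distinction concrete, here is a minimal sketch that separates what you set from what the model learns. The synthetic dataset from make_classification and the specific values (10 trees, depth 3) are assumptions chosen only for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic data stands in for a real dataset (an assumption for this sketch).
X, y = make_classification(n_samples=200, n_features=8, random_state=0)

# Hyperparameters: chosen by you, fixed before fit() is called.
model = RandomForestClassifier(n_estimators=10, max_depth=3, random_state=0)
hyperparams = model.get_params()
print("n_estimators:", hyperparams["n_estimators"])
print("max_depth:", hyperparams["max_depth"])

# Learned state: only exists after training on the data.
model.fit(X, y)
print("Fitted trees:", len(model.estimators_))
print("Feature importances shape:", model.feature_importances_.shape)
```

Tuning means trying different values for the first group and comparing the resulting models.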

🔍 GridSearchCV

GridSearchCV exhaustively evaluates every combination in a specified parameter grid, scoring each one with cross-validation. Because the number of combinations is the product of the list lengths, it works best for small search spaces.

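from sklearn.model_selection import GridSearchCV
from sklearn.ensemble import RandomForestClassifier

# Candidate values to try: 2 x 3 = 6 combinations
param_grid = {
    'n_estimators': [100, 200],
    'max_depth': [10, 20, None]
}

# X_train and y_train are assumed to come from an earlier train/test split.
# With cv=5, each of the 6 combinations is fitted 5 times (30 fits),
# followed by one final refit of the best combination on all of X_train.
grid_search = GridSearchCV(estimator=RandomForestClassifier(), param_grid=param_grid, cv=5)
grid_search.fit(X_train, y_train)

print("Best Parameters:", grid_search.best_params_)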
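
The snippet above relies on X_train and y_train already being defined. As a self-contained sketch, the version below generates synthetic data with make_classification and uses smaller values (fewer trees, 3 folds) so it runs quickly. The data and the values are assumptions for illustration, not recommendations.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, ParameterGrid, train_test_split

# Synthetic data stands in for a real dataset (an assumption for this sketch).
X, y = make_classification(n_samples=300, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

param_grid = {
    "n_estimators": [20, 50],
    "max_depth": [5, 10, None],
}

# The grid size is the product of the list lengths: 2 * 3 = 6 candidates.
n_candidates = len(ParameterGrid(param_grid))
cv_folds = 3
print("Candidates:", n_candidates, "| cross-validation fits:", n_candidates * cv_folds)

grid_search = GridSearchCV(
    estimator=RandomForestClassifier(random_state=0),
    param_grid=param_grid,
    cv=cv_folds,
)
grid_search.fit(X_train, y_train)

print("Best Parameters:", grid_search.best_params_)
print("Best mean CV score:", grid_search.best_score_)
```

Counting the candidates with ParameterGrid before fitting is a cheap way to check whether an exhaustive search is affordable.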

🎯 RandomizedSearchCV

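RandomizedSearchCV samples a fixed number of hyperparameter combinations (n_iter) from the specified lists or distributions. Because its cost depends on n_iter rather than on the size of the search space, it’s usually cheaper than GridSearchCV for larger search spaces.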

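from sklearn.model_selection import RandomizedSearchCV
from sklearn.ensemble import RandomForestClassifier
from scipy.stats import randint

param_dist = {
    'n_estimators': randint(100, 200),  # random integers from 100 to 199 (upper bound is exclusive)
    'max_depth': [10, 20, None]         # lists are sampled uniformly
}

# n_iter sets how many combinations are sampled: 100 candidates x 5 folds = 500 fits.
# Lower n_iter to reduce cost; random_state makes the sampling reproducible.
random_search = RandomizedSearchCV(estimator=RandomForestClassifier(), param_distributions=param_dist, n_iter=100, cv=5, random_state=42)
random_search.fit(X_train, y_train)

print("Best Parameters:", random_search.best_params_)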
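
As with the grid search, here is a self-contained sketch of the same idea that runs quickly. The synthetic data, the narrower randint(20, 60) range, and n_iter=8 are assumptions picked to keep the example small.

```python
from scipy.stats import randint
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import RandomizedSearchCV

# Synthetic data stands in for a real dataset (an assumption for this sketch).
X_train, y_train = make_classification(n_samples=300, n_features=10, random_state=0)

param_dist = {
    "n_estimators": randint(20, 60),  # integers from 20 to 59 (upper bound is exclusive)
    "max_depth": [5, 10, None],       # lists are sampled uniformly
}

random_search = RandomizedSearchCV(
    estimator=RandomForestClassifier(random_state=0),
    param_distributions=param_dist,
    n_iter=8,          # only 8 combinations are tried, whatever the size of the space
    cv=3,
    random_state=0,    # makes the sampled combinations reproducible
)
random_search.fit(X_train, y_train)

# cv_results_["params"] lists every combination that was actually evaluated.
sampled = random_search.cv_results_["params"]
print("Candidates tried:", len(sampled))
print("Best Parameters:", random_search.best_params_)
```

The search space here has 40 x 3 = 120 possible combinations, but only 8 of them are fitted, which is where the savings come from.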

📈 Evaluating Model Performance

After tuning, evaluate the best model on the held-out test set using metrics such as accuracy, precision, recall, and F1-score, as explained in previous tutorials.

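from sklearn.metrics import accuracy_score

# best_estimator_ is the model refit on all of X_train with the best parameters (refit=True by default)
best_model = grid_search.best_estimator_
y_pred = best_model.predict(X_test)

print("Accuracy:", accuracy_score(y_test, y_pred))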
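
To report the other metrics mentioned above alongside accuracy, a minimal end-to-end sketch could look like the following. It assumes a binary classification problem on synthetic data and a deliberately small grid; swap in your own data and search object.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, f1_score, precision_score, recall_score
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic binary data stands in for a real dataset (an assumption for this sketch).
X, y = make_classification(n_samples=300, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Tune on the training split only; the test split is never seen during the search.
grid_search = GridSearchCV(
    RandomForestClassifier(random_state=0),
    param_grid={"n_estimators": [20, 50], "max_depth": [5, None]},
    cv=3,
)
grid_search.fit(X_train, y_train)

best_model = grid_search.best_estimator_
y_pred = best_model.predict(X_test)

metrics = {
    "accuracy": accuracy_score(y_test, y_pred),
    "precision": precision_score(y_test, y_pred),
    "recall": recall_score(y_test, y_pred),
    "f1": f1_score(y_test, y_pred),
}
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Keeping the test split out of the search matters: best_score_ comes from cross-validation on the training data, so the test-set metrics are the more honest estimate of performance on unseen data.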

📌 Summary

  • GridSearchCV: Best for small parameter search spaces.
  • RandomizedSearchCV: Best for large parameter search spaces.
  • Both methods use cross-validation to find the best-scoring hyperparameters among the combinations they try, which often boosts model performance.
