Hyperparameter Tuning & Grid Search (ML)

Machine Learning Tutorial Part 10: Hyperparameter Tuning & Grid Search


⚙️ What is Hyperparameter Tuning?

Hyperparameters are settings chosen before training that control the learning process (e.g., number of trees, tree depth, learning rate), as opposed to parameters the model learns from the data. Tuning them can significantly improve model performance.
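To make this concrete, here is a minimal sketch (using a synthetic dataset purely for illustration) showing that the same algorithm can perform quite differently under different hyperparameter settings:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Synthetic data for illustration only
X, y = make_classification(n_samples=500, n_features=20, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# Same algorithm, two different hyperparameter settings
shallow = RandomForestClassifier(n_estimators=10, max_depth=2, random_state=42)
deep = RandomForestClassifier(n_estimators=100, max_depth=None, random_state=42)

for name, model in [("shallow", shallow), ("deep", deep)]:
    model.fit(X_train, y_train)
    print(name, "test accuracy:", model.score(X_test, y_test))
```

The gap between the two scores is exactly what tuning tries to exploit systematically.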

🔍 Grid Search with Cross-Validation

Grid Search exhaustively tries every combination of the specified parameter values, scoring each one with cross-validation to find the best set.

from sklearn.model_selection import GridSearchCV
from sklearn.ensemble import RandomForestClassifier

param_grid = {
    'n_estimators': [50, 100, 200],
    'max_depth': [None, 10, 20],
    'min_samples_split': [2, 5]
}

# X_train, y_train come from earlier parts of this tutorial
clf = RandomForestClassifier()
grid_search = GridSearchCV(estimator=clf, param_grid=param_grid, cv=5)  # 5-fold CV
grid_search.fit(X_train, y_train)

print("Best Parameters:", grid_search.best_params_)
print("Best Score:", grid_search.best_score_)
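Beyond the best parameters, GridSearchCV keeps the score of every combination it tried in `cv_results_`, which is useful for seeing how sensitive the model is to each parameter. A self-contained sketch (synthetic data and a deliberately tiny grid so it runs quickly):

```python
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

X, y = make_classification(n_samples=300, n_features=10, random_state=0)

# A tiny 2x2 grid: 4 combinations in total
param_grid = {'n_estimators': [10, 50], 'max_depth': [2, None]}
grid = GridSearchCV(RandomForestClassifier(random_state=0), param_grid, cv=3)
grid.fit(X, y)

# cv_results_ is a dict of arrays; a DataFrame makes it readable
results = pd.DataFrame(grid.cv_results_)
print(results[['param_n_estimators', 'param_max_depth',
               'mean_test_score', 'rank_test_score']])
```

Sorting this table by `rank_test_score` quickly shows which parameters matter and which barely move the score.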

🎲 Randomized Search

Randomized Search samples a fixed number of parameter combinations (`n_iter`) from the specified distributions instead of trying them all, which makes it much faster when the search space is large.

from sklearn.model_selection import RandomizedSearchCV
from scipy.stats import randint

param_dist = {
    'n_estimators': randint(50, 200),
    'max_depth': [None, 10, 20, 30],
    'min_samples_split': randint(2, 10)
}

random_search = RandomizedSearchCV(estimator=clf, param_distributions=param_dist,
                                   cv=5, n_iter=10, random_state=42)
random_search.fit(X_train, y_train)

print("Best Parameters:", random_search.best_params_)
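Whichever search you use, the winning model (refit on the full training data) is available as `best_estimator_`, and its real quality should be checked on data the search never saw. A sketch assuming a synthetic dataset in place of the tutorial's training data:

```python
from scipy.stats import randint
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import RandomizedSearchCV, train_test_split

X, y = make_classification(n_samples=400, n_features=15, random_state=1)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=1)

param_dist = {'n_estimators': randint(10, 100), 'max_depth': [None, 5, 10]}
search = RandomizedSearchCV(RandomForestClassifier(random_state=1),
                            param_distributions=param_dist,
                            n_iter=5, cv=3, random_state=1)
search.fit(X_train, y_train)

# best_estimator_ was already refit on all of X_train with the best parameters
test_score = search.best_estimator_.score(X_test, y_test)
print("Held-out test accuracy:", test_score)
```

Reporting the held-out score, rather than `best_score_`, avoids the optimistic bias of picking the parameters that happened to score highest in cross-validation.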

📌 Tips for Tuning

  • Use GridSearchCV when parameter space is small and precise tuning is needed.
  • Use RandomizedSearchCV for faster, approximate search over large spaces.
  • Always use cross-validation so the chosen parameters aren't overfit to a single validation split.
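One related pattern worth knowing (a sketch, not part of this tutorial's own code): when preprocessing such as scaling is involved, tune a Pipeline rather than a bare model, so the scaler is refit inside each cross-validation fold and no information leaks from the validation folds into tuning:

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

X, y = make_classification(n_samples=300, n_features=10, random_state=0)

pipe = Pipeline([('scaler', StandardScaler()),
                 ('clf', LogisticRegression(max_iter=1000))])

# Pipeline parameters are addressed as '<step name>__<parameter>'
param_grid = {'clf__C': [0.01, 0.1, 1, 10]}
grid = GridSearchCV(pipe, param_grid, cv=5)
grid.fit(X, y)
print("Best C:", grid.best_params_['clf__C'])
```

The `step__parameter` naming convention lets a single grid tune preprocessing and model hyperparameters together.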
