Lesson 19: Cross-Validation and Hyperparameter Tuning
This lesson introduces how to estimate model performance more reliably and search for better settings within a structured machine learning path. It begins with intuition, moves into workflow thinking, and then shows a practical Python example with clear notes.
Concept and intuition
Cross-Validation and Hyperparameter Tuning is a core topic in machine learning because it shapes how we frame the problem, choose tools, and judge results. One train-test split can be lucky or unlucky. Cross-validation gives a more stable view, and hyperparameter tuning helps search for stronger configurations systematically.
When learning to estimate model performance reliably and to search for better settings, do not focus only on formulas. The more important habit is to ask what the model is trying to learn, what assumptions it makes, and what could go wrong when the data is noisy, incomplete, or biased.
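To make the "lucky or unlucky split" point concrete, here is a minimal sketch using scikit-learn's `cross_val_score` on the wine dataset (the same dataset used later in this lesson). The choice of logistic regression and the `max_iter` value are illustrative, not prescribed by the lesson.

```python
from sklearn.datasets import load_wine
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

X, y = load_wine(return_X_y=True)

# Five validation scores instead of one: the spread across folds shows
# how much a single random split could have misled us.
scores = cross_val_score(LogisticRegression(max_iter=5000), X, y, cv=5)
print("Fold scores:", scores.round(3))
print("Mean score:", round(scores.mean(), 3))
```

If the fold scores vary noticeably, that variance is exactly the instability a single train-test split would hide.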
How it fits into a workflow
In a real project, performance estimation and hyperparameter search sit inside a larger workflow: define the problem, prepare data, choose features, train a model, evaluate it carefully, and improve the system over time. Strong machine learning practice is iterative rather than one-shot.
This means you should connect evaluation and tuning to practical questions such as: What data is available? How will predictions be used? Which errors are most costly? How will the system be monitored after deployment? Those questions matter as much as model accuracy.
Common mistakes and practical advice
A common beginner mistake is to treat evaluation and tuning as purely technical tasks. In practice, success depends on data quality, evaluation design, and the clarity of the business goal. Even a sophisticated model can fail if the data pipeline is weak or the target is poorly defined.
As you read the code example in this lesson, pay attention to how the inputs are shaped, how training and prediction are separated, and how the output is interpreted. Good coding habits make machine learning work more reliable, explainable, and easier to improve.
Three practical examples
- Several candidate models are compared with the same cross-validation procedure.
- A random forest is tested with different depths and tree counts.
- A team chooses settings based on average validation performance rather than a single split.
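The first scenario above can be sketched as follows. This is a minimal comparison, assuming three illustrative classifiers; with an integer `cv` and no shuffling, every model is scored on identical fold assignments, which is what makes the comparison fair.

```python
from sklearn.datasets import load_wine
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = load_wine(return_X_y=True)

candidates = {
    "decision_tree": DecisionTreeClassifier(random_state=42),
    "knn": KNeighborsClassifier(),
    "random_forest": RandomForestClassifier(random_state=42),
}

# The same cv=5 procedure is applied to every candidate, so each model
# is judged on the same folds of the same data.
for name, model in candidates.items():
    scores = cross_val_score(model, X, y, cv=5)
    print(f"{name}: mean={scores.mean():.3f} std={scores.std():.3f}")
```

Reporting both the mean and the standard deviation reflects the third scenario: settings are chosen on average validation performance, with the spread as a stability check.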
Grid search with cross-validation
This code example focuses on clarity rather than production scale. Read the comments, then study the notes below to understand why each step matters.
from sklearn.datasets import load_wine
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV
data = load_wine()
# Candidate hyperparameter values; every combination will be tried.
param_grid = {
    "n_estimators": [100, 200],
    "max_depth": [None, 3, 5],
}

# Each combination is scored with five-fold cross-validation.
search = GridSearchCV(
    RandomForestClassifier(random_state=42),
    param_grid=param_grid,
    cv=5,
)
search.fit(data.data, data.target)
print("Best params:", search.best_params_)
print("Best CV score:", search.best_score_)
Code walkthrough
- `cv=5` means the data is split into five folds for repeated validation.
- `GridSearchCV` tries all listed parameter combinations and compares them fairly.
- The best cross-validation score is usually more trustworthy than a single split result.
- Hyperparameters are settings chosen by the practitioner, not learned directly by the model.
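A natural follow-up question is what to do after `fit`. With scikit-learn's default `refit=True`, `GridSearchCV` retrains the winning configuration on all the data passed to `fit` and exposes it as `best_estimator_`; `cv_results_` keeps the mean score of every combination, not just the winner. A smaller illustrative grid is used here to keep the sketch quick.

```python
from sklearn.datasets import load_wine
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

data = load_wine()
search = GridSearchCV(
    RandomForestClassifier(random_state=42),
    param_grid={"n_estimators": [100], "max_depth": [None, 3]},
    cv=5,
)
search.fit(data.data, data.target)

# The refitted best model is ready to make predictions.
best_model = search.best_estimator_
print("Sample predictions:", best_model.predict(data.data[:3]))

# Every tried combination and its mean cross-validation score.
for params, score in zip(search.cv_results_["params"],
                         search.cv_results_["mean_test_score"]):
    print(params, round(score, 3))
```

Inspecting `cv_results_` is a good habit: if the top few combinations score nearly the same, the exact winner matters less than the overall trend.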
Summary and key takeaways
- Cross-validation reduces dependence on one random split.
- Hyperparameter tuning should be systematic rather than guesswork.
- The best model is selected using validation evidence, not only intuition.
- Keep a final untouched test set for honest end evaluation.
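The last takeaway deserves a sketch of its own. Note that the example earlier in this lesson tunes on the full dataset for simplicity; the more careful workflow below splits off a test set first and touches it exactly once, at the end. The split sizes and grid values here are illustrative assumptions.

```python
from sklearn.datasets import load_wine
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, train_test_split

X, y = load_wine(return_X_y=True)

# Split off a test set before any tuning; it stays untouched until the end.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

search = GridSearchCV(
    RandomForestClassifier(random_state=42),
    param_grid={"n_estimators": [100, 200], "max_depth": [None, 3]},
    cv=5,
)
search.fit(X_train, y_train)  # cross-validation sees only the training portion

# One final, honest measurement on data the search never saw.
print("Best params:", search.best_params_)
print("Test accuracy:", round(search.score(X_test, y_test), 3))
```

Because the best cross-validation score was used to pick the winner, it is mildly optimistic; the held-out test score is the honest estimate to report.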
Exercises
- What is the difference between a model parameter and a hyperparameter?
- Why can one train-test split be misleading?
- Add another value to `max_depth` and rerun the search conceptually.
- Why should the final test set still be kept separate after tuning?