Lesson 6: Regression Basics

Level: Beginner | Course position: 6 of 30 | Track: Machine Learning Tutorials

This lesson introduces regression: how models predict continuous numeric values. It starts with intuition, moves to workflow thinking, and ends with a practical Python example and notes on each step.

Learning objectives

Understand the main idea behind regression: how models predict continuous numeric values.
See how the concept appears in real machine learning workflows.
Follow a practical Python example step by step.
Finish the lesson with key takeaways and exercises.

Prerequisites

Basic Python familiarity is helpful, but the explanation is written for guided self-study.

Concept and intuition

Regression is a core topic in machine learning because it shapes how we frame the problem, choose tools, and judge results. It is used whenever the target is a number, such as price, demand, temperature, or waiting time, which makes it one of the most useful machine learning patterns in business and science.

When learning regression, do not focus only on formulas. The more important habit is to ask what the model is trying to learn, what assumptions it makes, and what could go wrong when the data is noisy, incomplete, or biased.
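To make "what the model is trying to learn" concrete, here is a minimal sketch using linear regression, the model type used in this lesson's example. It assumes the lesson's six-house toy table and fits on all rows for simplicity; the `new_house` values are hypothetical. A linear model learns one weight per feature plus an intercept, and its main assumption is that the target changes roughly in a straight line with each feature.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression

# Toy data copied from this lesson's example table.
df = pd.DataFrame({
    "rooms": [2, 3, 4, 4, 5, 6],
    "size": [800, 950, 1200, 1300, 1500, 1800],
    "price": [180000, 220000, 275000, 290000, 340000, 410000],
})
X = df[["rooms", "size"]]
y = df["price"]

# Fit on all rows here (no train/test split) just to inspect what is learned.
model = LinearRegression().fit(X, y)

# One learned weight per feature, in column order, plus an intercept.
coef_rooms, coef_size = model.coef_
intercept = model.intercept_
print("weight for rooms:", coef_rooms)
print("weight for size:", coef_size)
print("intercept:", intercept)

# A prediction is just: intercept + weight * value, summed over features.
# The house below is a hypothetical input, not part of the lesson's data.
new_house = pd.DataFrame({"rooms": [4], "size": [1250]})
manual_pred = intercept + coef_rooms * 4 + coef_size * 1250
model_pred = model.predict(new_house)[0]
print("manual:", manual_pred, "model:", model_pred)
```

The manual calculation and `model.predict` should agree up to floating-point rounding. With only six rows, the learned weights are illustrative and should not be read as real price effects.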

How it fits into a workflow

In a real project, regression sits inside a larger workflow: define the problem, prepare data, choose features, train a model, evaluate it carefully, and improve the system over time. Strong machine learning practice is iterative rather than one-shot.
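One turn of that loop can be sketched as trying several feature sets and comparing their error on rows the model did not see. This is an illustrative sketch, not a recommended recipe: it assumes the lesson's toy table, and it uses leave-one-out cross-validation (a technique this lesson does not otherwise cover) only because six rows are too few for a stable single split. The helper name `loo_mae` is made up for this example.

```python
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import LeaveOneOut, cross_val_score

# Toy data copied from this lesson's example table.
df = pd.DataFrame({
    "rooms": [2, 3, 4, 4, 5, 6],
    "size": [800, 950, 1200, 1300, 1500, 1800],
    "price": [180000, 220000, 275000, 290000, 340000, 410000],
})
y = df["price"]

def loo_mae(feature_names):
    """Average absolute error when each row is predicted by a model
    trained on the other five rows (hypothetical helper)."""
    scores = cross_val_score(
        LinearRegression(),
        df[feature_names],
        y,
        cv=LeaveOneOut(),
        scoring="neg_mean_absolute_error",  # scikit-learn negates error scores
    )
    return -scores.mean()

# Iterate: same model, same evaluation, different feature choices.
results = {}
for feature_names in (["rooms"], ["size"], ["rooms", "size"]):
    results[" + ".join(feature_names)] = loo_mae(feature_names)

for name, error in results.items():
    print(f"{name}: MAE = {error:.0f}")
```

On a dataset this small the ranking is noisy, so the point is the habit (change one thing, re-evaluate the same way), not the specific numbers.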

In practice, connect regression to questions such as: What data is available? How will predictions be used? Which errors are most costly? How will the system be monitored after deployment? Those questions matter as much as model accuracy.
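The question about costly errors can be illustrated with a small worked example. The numbers below are hypothetical and chosen for easy arithmetic: two sets of predictions have the same mean absolute error (MAE), yet one hides a single large miss. Whether that matters depends on how the predictions are used.

```python
import numpy as np

# Hypothetical actual values and two hypothetical sets of predictions.
actual = np.array([100.0, 200.0, 300.0, 400.0])
preds_even = np.array([110.0, 190.0, 310.0, 390.0])   # off by 10 on every row
preds_spiky = np.array([100.0, 200.0, 300.0, 440.0])  # exact, except one miss of 40

def mae(y_true, y_pred):
    """Mean absolute error: the average size of the errors."""
    return float(np.mean(np.abs(y_true - y_pred)))

def largest_error(y_true, y_pred):
    """Size of the single worst prediction."""
    return float(np.max(np.abs(y_true - y_pred)))

mae_even = mae(actual, preds_even)
mae_spiky = mae(actual, preds_spiky)
worst_even = largest_error(actual, preds_even)
worst_spiky = largest_error(actual, preds_spiky)

print("even  -> MAE:", mae_even, "largest error:", worst_even)    # 10.0 and 10.0
print("spiky -> MAE:", mae_spiky, "largest error:", worst_spiky)  # 10.0 and 40.0
```

If one large miss is far more expensive than several small ones, an average alone may not be the right summary for the project.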

Common mistakes and practical advice

A common beginner mistake is to treat regression as a purely technical task. In practice, success depends on data quality, evaluation design, and the clarity of the business goal. Even a sophisticated model can fail if the data pipeline is weak or the target is poorly defined.

As you read the code example in this lesson, pay attention to how the inputs are shaped, how training and prediction are separated, and how the output is interpreted. Good coding habits make machine learning work more reliable, explainable, and easier to improve.
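As a quick preview of those three points, the sketch below assumes the lesson's toy table and prints the shapes involved. The features form a 2-D table (rows by columns), the target is a 1-D column, training sees both, and prediction sees features only.

```python
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Toy data copied from this lesson's example table.
df = pd.DataFrame({
    "rooms": [2, 3, 4, 4, 5, 6],
    "size": [800, 950, 1200, 1300, 1500, 1800],
    "price": [180000, 220000, 275000, 290000, 340000, 410000],
})

X = df[["rooms", "size"]]  # double brackets keep X two-dimensional
y = df["price"]            # single brackets give a one-dimensional column
print("X shape:", X.shape)  # (6, 2): six rows, two features
print("y shape:", y.shape)  # (6,): six target values

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# Training step: the model sees features AND known prices.
model = LinearRegression()
model.fit(X_train, y_train)

# Prediction step: the model sees features only and returns one number per row.
preds = model.predict(X_test)
print("preds shape:", preds.shape)

# Interpreting the output: each prediction is a price in the same units as y.
for actual_price, predicted_price in zip(y_test, preds):
    print(f"actual {actual_price}, predicted {predicted_price:.0f}")
```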

Three practical examples

House pricing

A model estimates property price from location, size, and number of rooms.

Demand planning

A manufacturer predicts next month's order volume.

Energy forecasting

A utility estimates future electricity usage.

Training a simple regression model

This code example focuses on clarity rather than production scale. Read the comments, then study the notes below to understand why each step matters.

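import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split

# Toy dataset: six houses with two features and a known sale price.
df = pd.DataFrame({
    "rooms": [2, 3, 4, 4, 5, 6],
    "size": [800, 950, 1200, 1300, 1500, 1800],
    "price": [180000, 220000, 275000, 290000, 340000, 410000]
})

# Features (inputs) form a 2-D table; the target (output) is a single column.
X = df[["rooms", "size"]]
y = df["price"]

# Hold out part of the data for testing. By default 25% of rows are held out
# (2 of the 6 rows here); random_state makes the split reproducible.
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# Fit the model on the training rows only.
model = LinearRegression()
model.fit(X_train, y_train)

# Predict on the held-out rows and report the mean absolute error.
preds = model.predict(X_test)
print("MAE:", mean_absolute_error(y_test, preds))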

Code walkthrough

The target variable is `price`, which is continuous, so this is a regression task.
`rooms` and `size` act as input features for the model.
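`mean_absolute_error` reports the average absolute prediction error in the target's own units, here the same units as `price`.
The goal is not just to fit the training data, but to predict unseen cases reasonably well, which is why the error is measured on the held-out test rows. With only two test rows, treat the number as an illustration rather than a reliable estimate.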
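To see what the MAE figure is made of, the sketch below repeats the lesson's split and model, then computes the same metric by hand: take each error, drop its sign, and average. The variable names `abs_errors`, `manual_mae`, and `library_mae` are chosen for this example.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split

# Same toy data, split, and model as the lesson's example.
df = pd.DataFrame({
    "rooms": [2, 3, 4, 4, 5, 6],
    "size": [800, 950, 1200, 1300, 1500, 1800],
    "price": [180000, 220000, 275000, 290000, 340000, 410000],
})
X = df[["rooms", "size"]]
y = df["price"]
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)
model = LinearRegression().fit(X_train, y_train)
preds = model.predict(X_test)

# Step 1: error per test row, with the sign removed.
abs_errors = np.abs(y_test.to_numpy() - preds)

# Step 2: average those absolute errors.
manual_mae = abs_errors.mean()

# The library function performs the same calculation.
library_mae = mean_absolute_error(y_test, preds)

print("absolute errors per test row:", abs_errors)
print("manual MAE: ", manual_mae)
print("library MAE:", library_mae)
```

Because the errors are in the same units as `price`, the MAE can be read as "on average, the prediction is off by about this much".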

Summary and key takeaways

Regression predicts numbers, not categories.
Choose regression when the business answer is a continuous value.
Error metrics such as MAE help interpret how wrong the model tends to be.
Feature choice strongly affects regression quality.

Exercises

Give three new examples of regression problems.
Why is house-price prediction not a classification problem?
Add a new feature column named `age_of_house` and describe how it might improve the predictions.
Explain what MAE means in plain language.

Continue your learning

Previous lesson: Lesson 5: The Standard Machine Learning Workflow
Next lesson: Lesson 7: Classification Basics