Linear Regression

What Is Linear Regression?

Linear regression is one of the simplest and most widely used machine learning algorithms. It looks at the available data and tries to find the overall trend that describes it, which works especially well when the data points are linearly related (when one variable increases or decreases, the other follows a similar pattern).

This trend is shown as a straight line and it helps in making future predictions.

Why the Name “Linear Regression”?

The term linear refers to a straight line. In this algorithm, the data features are linearly related — meaning, as one feature changes, the other changes in a consistent way (not always exact, but with a similar trend).

The word regression has its roots in statistics and history. The concept was first introduced by Sir Francis Galton in the late 1800s while studying the relationship between parents' heights and their children's heights.

He observed that tall parents tended to have tall children, but the children weren’t as tall — and short parents had shorter children, but not as short. He called this effect "regression to the mean", meaning extreme traits tend to move toward the average over generations.

Later, Karl Pearson built on Galton’s work and gave us the mathematical framework we now call linear regression — a method to find a best-fit straight line through data, helping us make predictions.


What Does It Do?

The linear regression model tries to find the best-fit line for the available sample data, which can then be used for future predictions.
 
Let's understand this with an example:
imagine we have many samples of height and weight, which show that as height increases, weight also increases.

Suppose we want to predict the weight of a person based on height. Plot those data points on a graph, taking height (the independent variable) on the x-axis and weight (which depends on height) on the y-axis. Now imagine drawing a straight line through these points that best represents their overall trend.

The idea is to minimize the distance between each point and the line.

This best-fit line can then be used to predict the weight of a new person just by knowing their height.
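As a minimal sketch of this idea, here is a fit using NumPy's polyfit on a small made-up height/weight dataset (the numbers are purely illustrative, not real measurements):

    import numpy as np

    # Made-up sample data: heights in cm (x) and weights in kg (y)
    heights = np.array([150, 160, 165, 170, 175, 180, 185])
    weights = np.array([50, 56, 61, 66, 70, 75, 80])

    # Fit a straight line (degree-1 polynomial): weight = m * height + c
    m, c = np.polyfit(heights, weights, 1)
    print(f"slope m = {m:.2f}, intercept c = {c:.2f}")

    # Predict the weight of a new person who is 172 cm tall
    new_height = 172
    print(f"Predicted weight: {m * new_height + c:.1f} kg")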


Mathematical Intuition

When we talk about a line, it's often represented as f(x) : y = mx + c

Where:

  • x is the input (also called the feature, or the independent variable),

  • y is the output (also called the target, or the dependent variable, which we want to predict),

  • m is the slope (the rate of change: how much y changes as x changes),

  • c is the bias or intercept (where the line crosses the y-axis).

Now, in linear regression, we are given several data points, meaning we already have many (x, y) pairs: we know the output (y) for each given input (x). We need to find f(x): y = mx + c, where m and c are the unknowns.

To use this equation to make predictions, we first need to estimate the coefficients m and c from the data.
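Once m and c are known, prediction is just plugging x into the equation. A tiny sketch, with purely hypothetical coefficient values chosen only to show the form:

    def predict(x, m, c):
        """Predict y from x using the line y = m * x + c."""
        return m * x + c

    # Hypothetical coefficient values, only to illustrate the idea
    m, c = 0.95, -92.0
    print(predict(172, m, c))  # predicted y for x = 172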

So, how do we find that best-fit line, the f(x)?

Interestingly, we don't start by choosing two points to draw a line. Instead, we start with some random guesses for m and c.

Yes — just some initial random values.

Then comes the learning part, where the model learns and optimizes the values of m and c.


How Does It Learn?

With each data point we feed in, we calculate how far our predicted y is from the actual y (this difference is called the error or loss).


We combine these errors into a single number and aim to reduce it. There are different ways to combine the errors, such as the ordinary least squares (OLS) method, where the combined error is called the RSS (residual sum of squares):

RSS = r1² + r2² + ... + rn²

Here r1, r2, ..., rn are the individual errors (residuals) of each data point, i.e. ri = yi - (m·xi + c).
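A small sketch of computing the RSS for a candidate line, assuming some made-up data and guessed coefficients:

    import numpy as np

    x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])   # inputs
    y = np.array([2.1, 4.0, 6.2, 7.9, 10.1])  # actual outputs

    m, c = 1.5, 0.5                # a guess for the slope and intercept
    predictions = m * x + c        # predicted y for each x
    residuals = y - predictions    # individual errors r1, r2, ...
    rss = np.sum(residuals ** 2)   # residual sum of squares
    print(rss)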


This combined error is the quantity we want to minimize in order to estimate m and c. We adjust our guesses of m and c, just a little, in the right direction, so that the error becomes smaller next time.

This process is repeated across all data points, often for multiple passes over the same dataset.

Eventually, m and c settle into values that minimize the overall error, and we get our best-fit line.
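This iterative adjustment is essentially what gradient descent does. A minimal sketch, assuming made-up data and a fixed learning rate:

    import numpy as np

    x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
    y = np.array([2.1, 4.0, 6.2, 7.9, 10.1])

    m, c = 0.0, 0.0        # start with arbitrary guesses
    learning_rate = 0.05

    for _ in range(1000):              # many passes over the same data
        predictions = m * x + c
        errors = predictions - y
        # Gradients of the mean squared error with respect to m and c
        grad_m = 2 * np.mean(errors * x)
        grad_c = 2 * np.mean(errors)
        # Nudge m and c a little in the direction that reduces the error
        m -= learning_rate * grad_m
        c -= learning_rate * grad_c

    print(f"m ≈ {m:.2f}, c ≈ {c:.2f}")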

Using a little calculus to minimize the RSS, we can also derive the values of m and c directly.
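Setting the derivatives of the RSS with respect to m and c to zero gives the standard closed-form solution. A sketch with the same made-up numbers:

    import numpy as np

    x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
    y = np.array([2.1, 4.0, 6.2, 7.9, 10.1])

    x_mean, y_mean = x.mean(), y.mean()

    # m = sum((x - x_mean)(y - y_mean)) / sum((x - x_mean)^2),  c = y_mean - m * x_mean
    m = np.sum((x - x_mean) * (y - y_mean)) / np.sum((x - x_mean) ** 2)
    c = y_mean - m * x_mean
    print(f"m = {m:.2f}, c = {c:.2f}")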


Real-Life Example

Predicting House Prices

Let's say, we want to estimate the price of a house based on its size (in square feet). We collect data on houses recently sold in the area — their sizes and prices — and use that data to find a relationship.

Using linear regression, we might get an equation like the one below:
Price = 20,000 + (120 × Square Feet)


So, a 1,000 sq ft home would be predicted to cost:
20,000 + (120 × 1000) = 140,000
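The same arithmetic as a quick sketch in code (the coefficients 20,000 and 120 are just the hypothetical values from the equation above):

    def predict_price(square_feet):
        # Hypothetical fitted model: Price = 20,000 + (120 × Square Feet)
        return 20_000 + 120 * square_feet

    print(predict_price(1000))  # 140000
    print(predict_price(1500))  # 200000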


Where Is It Used?

Linear regression is used in many fields where understanding trends or making predictions is important:

  • Business: Forecasting sales based on advertising spend

  • Healthcare: Predicting patient recovery time based on age and treatment duration

  • Economics: Estimating consumer spending based on income

  • Agriculture: Predicting crop yield based on rainfall and fertilizer usage

  • Education: Predicting student performance based on study hours


When to Use It?

Use linear regression when:

  • We want to predict a numeric value (e.g., salary, temperature, price).

  • We suspect a linear relationship (straight-line pattern) between variables.

  • We have continuous input (independent) and output (dependent) variables.

  • We need a simple, interpretable model to understand how inputs affect output.


Limitations

Linear regression is powerful but has some limitations:

Assumes Linearity: It only works well if the relationship is roughly a straight line. Curved relationships? Not ideal.
 
Effect:
If the actual relationship is curved or complex (non-linear), linear regression will miss patterns, be unable to capture what's really going on (underfitting), and thereby give poor predictions.

Example:
If predicting age based on wrinkles, the relationship might not be strictly linear: younger people have fewer wrinkles, but wrinkling may plateau at some point. A straight line might not fit well.


Sensitive to Outliers: A single unusual data point can pull the regression line away from the trend followed by the rest of the data.

Effect:
Even one outlier can drag the regression line toward itself. The model might try too hard to "fit" the outlier, thereby moving away from the general trend.

Example:
Predicting house prices: one extremely expensive luxury villa in a normal neighborhood can shift the trend line and mess up predictions for average homes.
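A small sketch of this effect, using a made-up dataset of ordinary homes plus one extreme point:

    import numpy as np

    # Made-up sizes (in 1000s of sq ft) and prices (in $1000s) for ordinary homes
    sizes = np.array([1.0, 1.2, 1.5, 1.8, 2.0, 2.2])
    prices = np.array([140, 165, 200, 235, 260, 285])
    m_normal, c_normal = np.polyfit(sizes, prices, 1)

    # Add a single luxury villa of similar size but an extreme price
    sizes_with_outlier = np.append(sizes, 2.1)
    prices_with_outlier = np.append(prices, 1500)
    m_outlier, c_outlier = np.polyfit(sizes_with_outlier, prices_with_outlier, 1)

    print(f"slope without outlier: {m_normal:.1f}")
    print(f"slope with outlier:    {m_outlier:.1f}")  # pulled noticeably by the single villa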

Correlation ≠ Causation: Just because two things move together doesn’t mean one causes the other.

Effect:
Using linear regression on two correlated variables without understanding the context can lead to wrong conclusions.

Example:
Ice cream sales and AC sales both go up in summer. Predicting AC sales from ice cream sales would be silly — the real cause is hot weather, a third factor influencing both.


Multicollinearity: If input features are highly correlated with each other, the model gets confused, since they bring in redundant or overlapping information.

Effect:
The model struggles to figure out which variable is really influencing the output. This can make coefficient estimates unstable or misleading, hurting the interpretability.

Example:
Using both “number of hours studied” and “number of pages read” as inputs to predict grades — if these two are highly correlated, the model may get confused on how much weight to assign each.


Underfitting: It's too simple for complex patterns and can miss important patterns in the data.

Effect:
Linear regression may not capture the real-world complexity, leading to high errors and poor generalization.

Example:
Trying to predict stock prices (which are volatile and have complex trends) using just a straight-line relationship would miss important signals.


Conclusion

Linear regression is simple, intuitive, and useful when the pattern is right. It works best when you sense a straight-line relationship and want something that's easy to understand and quick to apply. But like any tool, it has its limits. The real skill lies in knowing when it fits and when it doesn't.

