Linear regression is just a special case of polynomial regression where the order is 1. You can drag the slider above to see how different polynomial orders affect the fitted curve. A polynomial of order N has degree N, and its curve has at most N - 1 turning points (so order 1 gives a straight line).
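If you're curious how such a curve is computed, below is a minimal sketch of polynomial least-squares fitting via the normal equations. The interactive demo above presumably relies on a regression library; the helper names `polyfit` and `polyval` here are hypothetical, not the demo's actual code.

```js
// A minimal sketch of polynomial least-squares fitting via the normal
// equations (XᵀX)b = Xᵀy, where X is the Vandermonde matrix of the x values.
function polyfit(xs, ys, order) {
  const m = order + 1;
  const A = Array.from({ length: m }, () => new Array(m).fill(0));
  const c = new Array(m).fill(0);
  // Accumulate the normal-equation system A·coeffs = c.
  for (let k = 0; k < xs.length; k++) {
    for (let i = 0; i < m; i++) {
      for (let j = 0; j < m; j++) A[i][j] += xs[k] ** (i + j);
      c[i] += ys[k] * xs[k] ** i;
    }
  }
  // Solve with Gaussian elimination and partial pivoting.
  for (let i = 0; i < m; i++) {
    let p = i;
    for (let r = i + 1; r < m; r++) {
      if (Math.abs(A[r][i]) > Math.abs(A[p][i])) p = r;
    }
    [A[i], A[p]] = [A[p], A[i]];
    [c[i], c[p]] = [c[p], c[i]];
    for (let r = i + 1; r < m; r++) {
      const f = A[r][i] / A[i][i];
      for (let j = i; j < m; j++) A[r][j] -= f * A[i][j];
      c[r] -= f * c[i];
    }
  }
  // Back-substitute for the coefficients; coeffs[i] multiplies x^i.
  const coeffs = new Array(m).fill(0);
  for (let i = m - 1; i >= 0; i--) {
    let s = c[i];
    for (let j = i + 1; j < m; j++) s -= A[i][j] * coeffs[j];
    coeffs[i] = s / A[i][i];
  }
  return coeffs;
}

// Evaluate the fitted polynomial at x.
const polyval = (coeffs, x) => coeffs.reduce((sum, c, i) => sum + c * x ** i, 0);
```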
The Mean Squared Error (MSE) value, Σ(value - predictedValue)² / dataNum, measures how well the curve fits the data: the smaller the error, the better the curve describes the given data.
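In code, that is just the average squared residual (reusing the hypothetical `polyval` from the sketch above):

```js
// Mean squared error of a fitted polynomial over a dataset.
function mse(xs, ys, coeffs) {
  let sum = 0;
  for (let i = 0; i < xs.length; i++) {
    const predicted = polyval(coeffs, xs[i]);
    sum += (ys[i] - predicted) ** 2;
  }
  return sum / xs.length;
}
```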
The easy solution is usually to choose a reasonable order by hand and hardcode it in our visualization. That’s what a lot of people do, and in most cases it works fine. However, when we don’t know how the data behaves (is it constantly increasing or decreasing? is it periodic?), it’s hard to choose a good order.
As you might notice, a higher order generally produces a more “accurate” curve with a lower error, but also a noisier one: the curve bends too many times because it follows every random fluctuation in the data instead of the underlying trend. This is called overfitting.
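One way to make overfitting visible in numbers is to fit on half of a noisy dataset and score both halves. This sketch reuses the hypothetical `polyfit` and `mse` helpers from above; the trend function and noise scale are made up, and plain normal equations get numerically shaky at high orders, so treat the exact values loosely.

```js
// Split a noisy quadratic dataset into train/test halves, fit increasing
// orders on the training half, and compare the errors. Training MSE can
// only decrease as the order grows; held-out MSE typically bottoms out
// near the true order (2) and then rises again.
const train = { xs: [], ys: [] };
const test = { xs: [], ys: [] };
for (let i = 0; i < 40; i++) {
  const x = i / 10;
  const y = 0.5 * x * x - x + (Math.random() - 0.5); // quadratic + uniform noise
  const half = i % 2 === 0 ? train : test;
  half.xs.push(x);
  half.ys.push(y);
}
for (const order of [1, 2, 4, 6]) {
  const coeffs = polyfit(train.xs, train.ys, order);
  console.log(order, mse(train.xs, train.ys, coeffs), mse(test.xs, test.ys, coeffs));
}
```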
You can play with the example below on Observable: it generates fake data points with some normally distributed randomness and then calculates the regression curve for them. For the “Constant” dataset, a straight line (order = 1) fits best; for the “Quadratic” dataset, however, selecting order = 2 results in a more stable, better-fitting curve.
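Here is a sketch of how such fake data might be generated: normal noise via the Box–Muller transform added to an underlying trend. The dataset names mirror the demo, but the trend functions and noise scale are assumptions, not the notebook’s actual code.

```js
// Sample from a normal distribution using the Box–Muller transform.
function randNormal(mean = 0, stdev = 1) {
  const u = 1 - Math.random(); // in (0, 1], avoids log(0)
  const v = Math.random();
  const z = Math.sqrt(-2 * Math.log(u)) * Math.cos(2 * Math.PI * v);
  return mean + stdev * z;
}

// Build n noisy points around a "Constant" or "Quadratic" trend.
function makeData(kind, n = 30) {
  const trend = kind === "Quadratic" ? (x) => x * x : () => 5;
  return Array.from({ length: n }, (_, i) => {
    const x = i / 5;
    return { x, y: trend(x) + randNormal(0, 1) };
  });
}
```

Feeding `makeData("Quadratic")` through the `polyfit`/`mse` sketches above reproduces the comparison: order 2 tracks the trend, while much higher orders start chasing the noise.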