In this lesson, we'll extend our knowledge of linear regression by taking a look at lagged regression features. First, let's remind ourselves of how linear regression works, specifically when trying to predict consumer demand. Simple linear regression, typically known as ordinary least squares, finds a linear function that fits the relationship between a single dependent variable and a single explanatory variable. The equation sums it up nicely: y equals beta zero, which is our intercept or constant term, plus beta one times x, where beta one is the coefficient, plus epsilon, which is our error term. Since the points are not all going to fall exactly on the best fit line, the error term is necessary.

The picture here illustrates the coefficients as well as the error term just discussed. The black dots represent points from actual data. Fitting a best trend line to them (we'll go through how you do that in a second) finds the necessary intercept, which looks like it sits just above zero, plus a coefficient that's positive, indicating a positive correlation between the two variables. The epsilon term for any given point is how far that point lies from the best fit line.

When fitting the line, the goal is to minimize a cost function. In the case of ordinary least squares, the cost function is the sum of the squared differences between each point on the scatter plot and the corresponding predicted point on the line. We square the differences so that negative and positive errors contribute equally.

Some examples of when you might want to use linear regression include (a) predicting consumer demand as a function of price, or (b) global demand as a function of interest rate, competitive pricing, and new distribution channels. Note the distinction here: in (a), there's a single explanatory variable and a single dependent variable. That's simple linear regression.
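To make the intercept, coefficient, and error terms concrete, here is a minimal sketch of ordinary least squares in Python. The data is synthetic (a hypothetical demand-versus-price relationship invented for illustration); the fit itself uses numpy's least-squares solver on the standard [1, x] design matrix.

```python
import numpy as np

# Hypothetical data: demand falls as price rises, plus some noise.
rng = np.random.default_rng(0)
price = np.linspace(1.0, 10.0, 50)
demand = 100.0 - 5.0 * price + rng.normal(0.0, 2.0, size=price.size)

# Ordinary least squares: minimize sum((y - (b0 + b1 * x))**2).
# Column of ones gives us the intercept (beta zero).
X = np.column_stack([np.ones_like(price), price])
beta, *_ = np.linalg.lstsq(X, demand, rcond=None)
b0, b1 = beta

# The epsilon terms: how far each point sits from the fitted line.
residuals = demand - (b0 + b1 * price)
print(round(b0, 2), round(b1, 2))
```

With an intercept in the model, the residuals average out to zero, which is exactly the "positive and negative errors balance" intuition behind squaring them in the cost function.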
Whereas in (b), there are multiple explanatory variables for a single dependent variable, which is multiple linear regression. We explore linear regression first because it's the baseline for so many other machine learning techniques, such as the ARIMA models that we'll get into in later lessons.

In the supply chain, linear regression again finds a relationship between x and y. Let's talk about some of its use cases. As you can see in the table on the right here, you can imagine x being product sales and y being price.

Now, let's talk about lagged regression. Lagged regression is a little bit different because instead of comparing x to y, where x and y are totally different features, here x and y are the same series, just shifted. So x might be product sales and y might be those same product sales two days later. When we run regression in this format, we can find relationships between observations within a single time series. That's what ARIMA models do, which we'll talk about in a later lesson.

Let's take a look at a plot of passengers against passengers shifted by 20. The passenger series here is the number of international passengers between the years 1949 and 1960, the same data we've been working with. On the y axis, the data has been shifted by 20 months. You'll notice we can now run our regression from the actual passenger data to the shifted data, getting that best fit line.

From here, we can take y equals mx plus b beyond an abstract concept. Incorporating time, we're left with the equation y sub t equals alpha times x sub t plus beta times x sub t minus one plus b. As you can see, we're trying to predict y as a function of x and x's previous step, where x sub t minus one steps back one period in time. As you can imagine, we can expand this further to include x sub t minus two, x sub t minus three, and so on. Many modern machine learning models will do this automatically, but it's good to understand the fundamentals of regression before diving in.
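The shifting described above can be sketched in a few lines of pandas. This uses a synthetic trending series as a stand-in for the passenger data (the real AirPassengers dataset isn't bundled here), and regresses the series on its own one-step lag rather than the 20-month shift from the plot, just to keep the example small.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for a monthly series: upward trend plus noise.
rng = np.random.default_rng(1)
t = np.arange(120)
y = pd.Series(100.0 + 2.0 * t + rng.normal(0.0, 5.0, size=t.size))

# Lagged regression: x and y are the SAME series, just shifted.
# shift(1) moves each value forward one step, so y_lag1[t] = y[t-1].
df = pd.DataFrame({"y": y, "y_lag1": y.shift(1)}).dropna()

# Fit y_t = intercept + coef * y_{t-1} by ordinary least squares.
X = np.column_stack([np.ones(len(df)), df["y_lag1"].to_numpy()])
beta, *_ = np.linalg.lstsq(X, df["y"].to_numpy(), rcond=None)
intercept, coef = beta
print(round(coef, 3))
```

Because each month closely tracks the previous one in a trending series, the lag coefficient comes out near 1. Adding more columns for `y.shift(2)`, `y.shift(3)`, and so on is exactly the "x sub t minus two, x sub t minus three" expansion mentioned above, and is the autoregressive idea that ARIMA models build on.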