Recall from the last lecture that we had an outcome y, which was n by 1, and a regressor x, also n by 1. Recall that if we wanted to minimize the least squares criterion, the minimizer worked out to be beta hat = <y, x> / <x, x>, the inner product of y and x over the inner product of x with itself. Now, imagine we were to look at the y and x pairs; this fits a line through the origin. However, let's add some data over here, away from the origin. A line through the origin may or may not make sense. So, let me draw a picture that's a little closer to what I'm thinking. Imagine you had a setting like this: there's a clear linear relationship, but trying to fit a line through the origin to that data set isn't going to work so well. So what you might consider doing is resetting the origin to somewhere more relevant, or of course fitting a line that also has an intercept. But let's imagine all you had at your disposal was regression to the origin. The first thing you might want to do is reset the origin to be right in the middle of the data. Then when you fit a line to the origin, it would go right through the data in a pretty reasonable fashion. At least it would take care of the problem that the origin is nowhere near the data, which forces the line into a position that is not at all reasonable. So how would we get right into the middle of the data? Well, we would define y tilde as the centered version of y, one where we've subtracted the mean from every data point, so that it now has mean 0. And this is ostensibly just shifting the origin right into the middle of the data set. In matrix form, with J_n the n by 1 vector of ones, that would just be y tilde = (I - J_n (J_n' J_n)^{-1} J_n') y. And then, if we were to center our x, that's of course x tilde = the same matrix times x.
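The centering step above can be checked numerically. This is a small NumPy sketch (the data here are simulated for illustration, not from the lecture) showing that multiplying by I - J_n (J_n' J_n)^{-1} J_n' is exactly mean subtraction:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 20
x = rng.normal(5, 2, n)                  # regressor, deliberately far from the origin
y = 2 + 1.5 * x + rng.normal(0, 1, n)    # outcome with a clear linear relationship

# J_n is the n x 1 column of ones; J (J'J)^{-1} J' projects onto its span
# and works out to the n x n matrix with every entry equal to 1/n.
J = np.ones((n, 1))
H = J @ np.linalg.inv(J.T @ J) @ J.T

y_tilde = (np.eye(n) - H) @ y
x_tilde = (np.eye(n) - H) @ x

# (I - H) applied to a vector is exactly subtraction of its mean
assert np.allclose(y_tilde, y - y.mean())
assert np.allclose(x_tilde, x - x.mean())
```

In practice you would just subtract the mean directly; the projection form is what makes the variance and covariance identities in the next step drop out.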
So now if we were to do regression through the origin on these two centered vectors, trying to minimize ||y tilde - x tilde gamma||^2, the norm of that squared, what would be the estimate we get? Just to differentiate it from the beta before, let's call the coefficient gamma. Well, of course, gamma hat = <y tilde, x tilde> / <x tilde, x tilde>, the inner product of y tilde and x tilde divided by the inner product of x tilde with itself. That is equal to y' (I - H)' (I - H) x, all over x' (I - H)' (I - H) x, where to save writing I've replaced the centering matrix J_n (J_n' J_n)^{-1} J_n' with H. Now if you go back to our previous lecture, or a couple of lectures previous, when we were talking about variances, you can see that the quantity on top works out to be (n - 1) times the empirical covariance between y and x, and the quantity in the denominator is (n - 1) times the empirical variance of x. Well, we can manipulate that. To make it a little more statistical, let's write rho hat xy for the empirical correlation between y and x, sigma hat squared x for the empirical variance of x, and sigma hat squared y for the empirical variance of y; the cov and var notation over here seems more like we're writing computer code than mathematical notation. So the numerator, (n - 1) times the covariance, can be written as (n - 1) times the correlation between y and x times the standard deviation of x times the standard deviation of y, and the denominator is of course (n - 1) times the variance of x, with hats over all these quantities. Okay, so the (n - 1)'s cancel out.
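The identity just used, that the inner-product numerator and denominator are (n - 1) times the empirical covariance and variance, can be verified directly. A minimal sketch with simulated data (names and numbers are mine, not the lecture's):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 50
x = rng.normal(10, 3, n)
y = -2 + 0.7 * x + rng.normal(0, 2, n)

# center both vectors
x_tilde = x - x.mean()
y_tilde = y - y.mean()

# regression through the origin on the centered data
gamma_hat = (y_tilde @ x_tilde) / (x_tilde @ x_tilde)

# numerator is (n - 1) * empirical covariance of y and x,
# denominator is (n - 1) * empirical variance of x
num = (n - 1) * np.cov(y, x, ddof=1)[0, 1]
den = (n - 1) * np.var(x, ddof=1)
assert np.isclose(gamma_hat, num / den)
```

The (n - 1) factors cancel, which is why gamma hat is just the covariance over the variance.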
One of the two sigma hat x's cancels out, and you get rho hat xy times the standard deviation of y divided by the standard deviation of x, and this is sort of a famous formula. It's basically saying that if we center our outcome y and center our regressor x and fit the regression through the origin, the slope of the best-fitting regression line is the estimated correlation between y and x times the ratio of the standard deviations. Now, a couple of things to note. First of all, the units of this quantity are correct. The slope has to be in units of change in y over change in x, so units of y divided by units of x. Okay, but let's look at this quantity. The correlation is a unit-free quantity; it's multiplied by the standard deviation of y, which carries the units of y, and divided by the standard deviation of x, which carries the units of x. So this quantity, our estimated slope, has y units divided by x units, which is what it has to have. It's also interesting that if we reverse the relationship and fit x as the outcome and y as the predictor, all that happens is the two standard deviations swap, because for the correlation it doesn't matter which argument is first. Then it works out to be rho hat xy times sigma hat x divided by sigma hat y. You'll notice what this also implies: if we standardize our two regression variables, in addition to centering them, before we do regression to the origin, then both their variances are 1, and the regression slope estimate works out to just be the correlation, okay? So that's maybe everything you need to know about regression to the origin.
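All three observations here, the slope formula, the swap under reversal, and the standardized case, can be checked in a few lines. A NumPy sketch under the same simulated-data assumption as before:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 100
x = rng.normal(0, 4, n)
y = 3 * x + rng.normal(0, 5, n)

rho = np.corrcoef(y, x)[0, 1]              # empirical correlation (symmetric in y, x)
sy = np.std(y, ddof=1)
sx = np.std(x, ddof=1)

xc, yc = x - x.mean(), y - y.mean()
slope_yx = (yc @ xc) / (xc @ xc)           # y regressed on x
slope_xy = (xc @ yc) / (yc @ yc)           # roles reversed: x regressed on y

# the famous formula, and its reversed counterpart
assert np.isclose(slope_yx, rho * sy / sx)
assert np.isclose(slope_xy, rho * sx / sy)

# standardize (center AND scale) and the slope is just the correlation
zx, zy = xc / sx, yc / sy
assert np.isclose((zy @ zx) / (zx @ zx), rho)
```

Note that the product of the two slopes is rho squared, another way to see that reversing the roles of y and x does not simply invert the line.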
And the big take-home message is that if you center your variables first, regression to the origin leads to, and we'll see in a minute, the exact same regression slope as if we fit a line that has both an intercept and a slope. It works out to be the correlation between the x's and the y's times the ratio of the standard deviations. Okay, so let's try some computer code to illustrate this and compare it with what lm, R's function for regression, is doing.
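The lecture goes on to do this in R with lm; as a stand-in, here is a NumPy sketch of the same comparison, fitting both an intercept and a slope by least squares and checking that the slope matches the centered regression-to-the-origin estimate (data simulated for illustration):

```python
import numpy as np

rng = np.random.default_rng(3)
n = 30
x = rng.normal(7, 2, n)
y = 1 + 0.9 * x + rng.normal(0, 1, n)

# slope from regression to the origin on the centered data
xc, yc = x - x.mean(), y - y.mean()
gamma_hat = (yc @ xc) / (xc @ xc)

# slope from a fit with both an intercept and a slope
# (least squares on the design matrix [1, x], like lm(y ~ x) in R)
X = np.column_stack([np.ones(n), x])
intercept, slope = np.linalg.lstsq(X, y, rcond=None)[0]

assert np.isclose(gamma_hat, slope)
```

The two slope estimates agree to numerical precision, which is the equivalence the next part of the lecture demonstrates.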