In the previous screencast, I showed you how you could create just simple linear models, in other words, just the equation for a line using the regression tool. In this screencast, I'm going to continue talking about regression models of a single regressor variable. So we have our variable x, which is the independent variable, and the dependent variable is y, which depends upon x. Now, in general linear regression, we have this form of the model where the output y has an intercept Beta_naught plus we have coefficients Beta_1, Beta_2, and so on up to Beta_n, each multiplied by different functions of x. These functions of x can be any function of x, they don't have to be linear even though these models are known as linear regression models. Linear regression models mean that the output is linear in the coefficients. So I could actually have one of these, let's say f_1 of x equals 1 over x plus maybe something like the natural log of x. So whereas this is not linear, the model is linear and Beta_1 is just proportional to Beta_1. To prepare the spreadsheet, what you want to do is you want to have your x, y data, and a lot of times for regression I'll put y in front of x, then you need to create columns for your other functions. So what we do is we create columns for all of the other functions in our model up through f_n of x, we drag that down so these are actually going to be values. Everything in this table below the titles are going to be actual values that we're going to use in the regression tool. So let's go through an example. We're going to create a polynomial model, specifically in this example, we're going to create a second-order polynomial. Second-order polynomial is just the equation as shown here, y equals Beta_naught. So that's the intercept plus Beta_1 times x plus Beta_2x squared. So we still have a single regressor variable which is x, but now we have two functions of x. F_1 of x is just x and f_2 of x is x squared. Let's work through an example here I have the viscosity of water, this is in degrees Fahrenheit. This is the temperature, and we then have the viscosity. So which one is the dependent variable and which one is the independent variable? Temperature is the independent variable and viscosity depends upon temperature. So what I'm going to do, I'm just going to relabel this just so I know this is x, and I like to do this, and I know that that's y. If we wanted to, I could go ahead and select both of these columns, I could insert a chart, a scatter plot, and I'll just go ahead and format this real quick. We see that this is not linear, it's not really a line, but instead it's a different type of a model. I'm going to go ahead and right-click on here, and I'm going to add a trend line. Now, if you just want to do cause quick and dirty, add a trend line to see what the model might be, and we're going to be fitting this to a second order model. In other words, we're going to create this equation, this model, y equals intercept plus Beta_1 times x plus Beta_2 x squared, where x is the temperature and y is the viscosity. So back in Excel, I can click on polynomial model and maybe something like order 2. You could play around with this to see what model best fits your data, but let's just go ahead with the second-order here. If you wanted to display the equation and the R squared, You could do that. Now again, it's not typical for scientists and engineers to use R squared, instead, we want to use R squared adjusted, which can only be output by the regression tool. R squared adjusted basically penalizes you for having too many terms in your model. Let's go ahead then and prepare the spreadsheet. Remember, we want to have two functions of x, the first function of x is just x, and the second function of x is just x squared. So we need to honor spreadsheet have a column for just the x values and the x squared values. Now importantly, all of your x inputs, your functions of x have to be in one contiguous block of cells. So what I'm going to do is just put a label there for x and x squared, and here I'm just going to replicate x over here. So I'm just going to do that, Ctrl, Shift, Enter, just to clone that or replicate it, and here then we need to, for each row, create a formula for x squared. So I can select that entire column, I can do equals, and I'm just going to do that, carry 2, Ctrl, Shift, Enter. That then is an easy way to create x squared. Now I've got f_1 of x in a column, f_2 of x in a column, I'm all set to use the regression tool. So let's go ahead up to the data tab, and I'm going to go to Data Analysis, Regression tool, and I'm going to click Okay. I'm going to select for the y input range, that's going be our viscosity. So I'm going to click and drag and I selected the top label there and make sure that I've selected labels box, the x input range, I'm going to select both of these columns because those are our x inputs or functions of x, f_1 of x and f_2 of x. I'm going to go ahead and click Okay. After some formatting, I can get the following, and I've just copied and pasted the formula here. Down here is really the important stuff, the coefficients, the intercept, the coefficient to the x term, the coefficient to the x squared term. So our model, our final polynomial model is shown here. So maybe I can put temperature here, and I want to calculate the viscosity. I could just plug in this temperature into this equation. So I'm going to go ahead and just click on the intercept. So the intercept plus the coefficient here, times my temperature, plus my coefficient to the X squared term, times x squared. So that's how I can determine then the approximation. So this is the model prediction at 85 degrees. Let's go ahead and go back here, at 85 degrees, we might be somewhere in here, and that looks about right. I think we got something like 8.85. So that's how you can use experimental data to create a mathematical model, and then you can input x values that are not in your original dataset, to get predictions for the output. Let's go ahead and look at a different model, and I'll explain the limitations of using this trend line tool. If we want to create a custom model, you can't really use this trend line tool. You don't really want to use a trend line tool anyway. You want to use the regression tool to get more statistical information. I'm going to go ahead and delete this trend line. The model I'm going to create now is shown here, we're going to create y equals Beta naught, that's our intercept, plus Beta 1 times 1 over x, plus Beta 2 times the natural log of x. Now, in order to prepare the spreadsheet for this, what's our f1 of x and f2 of x going to be? In order to prepare the spreadsheet, we're going to have to create a column for f1 of x and a column for f2 of x. So I'm going to delete what I had here. We're going to then create one over x, and the natural log of x, because those are the two functions, f1 of x and f2 of x that we need to calculate. You notice that in my original model, I do not have a function that is just x, so I don't need a column of x values, I just need a column of one over x, and a column of natural log of x. So I'm going to just drag this down, equals one divided by x over here using an array formula, Control Shift Enter. I'm also then going to do this same here, equals the natural log of x, Control Shift, or I guess I need a parentheses, Control Shift Enter, and now I've got my two columns ready to go. So I'm going to go up here to the Data tab, Data Analysis, Regression. Let's go ahead and click Okay. I'm going to do something a little bit different here. So let's first select our data that's going to be the same, we have our y range with the labels selected, the input range is going to be the same. So you didn't have to do anything there. The one thing I'm going to do now is I'm going to select this residuals box, and I'll explain why we do that here in a minute. Then I'm going to select, okay, it spits out the regression, I'm going to go ahead and auto size that. Now, when you select that residuals box, it's nice because what it does is it gives you this residual output. We're not going to talk too much about residuals, but these are the difference between the observation, or the experimental values and the predicted, which are based on the model. What I'm going to do is I'm going to go ahead and copy. So I'm going to select this, right-click copy. I'm going to go back to my original data, and I want to plot on the same plot the experimental, which are the blue with our model here, and we're going to see how good of a model we have. I'm going to do Control V, to paste, and you see that this simple two-term model is much better than our quadratic equation. I'm going to go ahead, we don't want to plot the model as markers, you want to change that. So we can right-click, and I'm going to do Format Data Series. I want to remove the marker and I want to add a line. So the first thing I'm going to do is I'm going to add a line, and I'm just going to change the color to that, to something like a red, the marker for this series, I don't want a marker. So under marker options, I'm going to select none, and I'm going to just go ahead and close this. So that is our model. The red is our model equation, and the blue are the experimental data points. So this model is a little bit better than our second-order quadratic model that we created a little earlier. I can go back into the regression sheet, and we can look up here at our adjusted R squared. The Adjusted R squared of this model is about 0.998. The Adjusted R squared of the quadratic model is 0.9805. So in summary, I can substitute the intercept, the coefficient to one over x, which is Beta_one, and the coefficient two natural log of x, which are available here. I can substitute those back into my model equation, and this is the mathematical equation that we get. Again, you could just plug in 85 degrees as x to get a good model prediction for the viscosity at that temperature. Hope enjoyed the screencasts. Thanks for watching.