In this lesson, we'll motivate the two-way ANOVA model by looking at some examples of research questions that two-way ANOVA can help us answer. We'll also talk about some important concepts that are introduced through two-way ANOVA, such as crossed versus nested factors and factor interactions. To motivate two-way ANOVA, let's jump right into a study example. Electronic paper devices like Amazon's Kindle have really changed the way that people read. But you might ask: have they changed reading for the better? A 2013 study attempted to answer part of this question. In this study, researchers set out to ask whether reading speed, a continuous variable, differed across different electronic paper displays, such as the Kindle. In addition, they were interested in whether different lighting conditions impacted reading speed, and so in the study there were three different variables measured. The first was a factor: the device type. This factor had three levels, one for each device type. One was a Sony device, another was an Amazon Kindle, and the third was called an IREX device. The second variable was also a factor: the lighting condition. There were four different lighting conditions, ranging from brighter light to lower light. The last variable measured was the response variable, the reading time measured in seconds. There are at least three important questions that we could ask in a study like this one. The first might be: are the effects of device type significant? That could mean statistically significant, but also practically significant. Here that might mean, is there any evidence that suggests that individuals read faster or slower based on the device that they're using, either the IREX, the Sony, or the Kindle?
Another question you could ask is: is the effect of the lighting condition significant? That would mean, is there any evidence that individuals read faster or slower in different lighting conditions? A third question you could ask is: do device type and lighting condition interact in some way? For example, suppose that on average people can read for longer on device A than on device B in low light. Is that trend the same for medium light or bright light? If not, for example if B is better than A in bright light, then we could say that these factors interact with each other. That is, the difference in how well people read on different devices depends on the type of light that they're in. Now these questions seem relatively straightforward, and in fact, you might think that you could answer them using the tools we have so far, namely one-way ANOVA. But this approach wouldn't be very efficient, and it wouldn't help us answer all of the questions that we've just posed. First, note that if you set up two different experiments, one to check the effect of device type and the other to check the effect of lighting, you might need many more participants than if you ran a single combined study. So there would be a loss of efficiency with two different studies, and it may also be resource-intensive: you might need more money and more time to get those participants. There's also another reason: we wouldn't be able to understand the interaction effect that we just talked about. For the third question on that list, it would be impossible to understand whether there's an interaction between device type and lighting condition if we didn't have a single study that brought those things together and varied both factors in a way that lets us estimate that interaction.
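To make the device A versus device B example a little more concrete, here is one way to write it down. The notation is an assumption introduced only for illustration: let \mu_{d,\ell} stand for the mean reading time on device d under lighting condition \ell. If the two factors did not interact, the gap between the devices would be the same in every lighting condition:

```latex
\[
\mu_{A,\text{low}} - \mu_{B,\text{low}}
  \;=\;
\mu_{A,\text{bright}} - \mu_{B,\text{bright}}
\]
```

An interaction means this equality fails for at least one pair of lighting conditions. The case described above, where A is better in low light but B is better in bright light, is the situation in which the two sides even have opposite signs.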
So two-way ANOVA is important for understanding each of the individual effects, but also the interaction effects. For this reason, we need a modeling technique that allows us to take two different factors into account. The "two-way" in two-way ANOVA means exactly that: we have two different factors, each with some number of levels. This allows us to understand the main effects and also the interactions. One thing to note is that in this reading display study, researchers were able to observe readers on each of the three devices in each of the four lighting conditions. More generally, if we have two factors, say tau and beta, these factors are crossed if every level of beta occurs at every level of tau, and vice versa. Another way of saying that is that there is at least one observation in every factor level combination; if we have that, we have crossed factors. We might contrast that with nested factors: factor tau is nested within beta when each level of tau occurs within only one level of beta, so not all factor level combinations are represented. Let's take a look at another example. In the previous two modules, we studied the impact of espresso brewing methods on the foam index, which was a measure of the quality of an espresso. But of course we can go further and measure other factors. Within a single brewing method, other factors may impact the quality of an espresso. In a 2015 study, researchers compared the foam index at two different brewing pressures across three different temperatures. So we have pressure and temperature as our factors: the temperature was set at 75, 85, and 90 degrees Celsius, and the extraction pressure for brewing was set at two different levels, 15 and 20 bar. In this study, our continuous response was the foam index, and we'd like to ask a few different questions here.
What is the impact of the brewing temperature on the mean foam index? That's one question. What's the impact of brewing pressure on the mean foam index? That's the second. Then the third question is, do these two factors interact? Do temperature and pressure interact? Is the change in foam index due to temperature the same across different pressure levels? Again, in this study, if we conducted two separate experiments, we wouldn't get all of the information that we'd like, especially about that interaction. Of course, the general idea here with two-way ANOVA is that we have two different factors, one we might call tau, the other beta, and we're asking about the main effects of tau, the main effects of beta, and then the interactions between tau and beta. Tau and beta could each have many different levels, and so the interaction terms could be quite complicated. Part of understanding two-way ANOVA is learning a bit about those complications. Throughout this video, I've used the word effect to talk about the different factors, for example, the effect of brewing temperature on the foam index. But the word effect is actually not so great, even though it's very widely used by statisticians and data scientists. The reason I don't love that word is that it makes us think about causal implications, causal effects. But the mathematical formulation that we've laid out so far doesn't give us causal conclusions; at best it gives us associations. It might be the case that a higher foam index is associated with a higher brewing temperature, for example. But we can't conclude that there's a causal effect at this point, because we haven't put into place experimental design concepts and techniques, and we need those in order to draw causal conclusions. Finally, let's end the lesson with one last distinction that I think is important for two-way ANOVA, and that's the distinction between a balanced and an unbalanced two-way ANOVA.
It might be the case that each factor level combination contains the same number of replications, say the same number of brews of espresso, or the same number of individuals reading on each device in each lighting condition. If there is the same number of units in each factor level combination, then the study is said to be balanced, and if there isn't, then the study is said to be unbalanced. Balanced studies are in some ways easier to work with, but we will also say something about unbalanced studies. In the next lesson, we'll begin to study the mathematics of two-way ANOVA, and we'll then go on, in later lessons, to think about what assumptions we can make and how we can collect data in experimental conditions so that we can make causal claims. But for right now, we'd better study the mathematics so we can understand how two-way ANOVA works.
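The crossed and balanced definitions from this lesson can be checked directly from the factor labels, before any response values are looked at. Below is a minimal Python sketch of such a check. The function name `design_summary` and the replication count of 3 per cell are assumptions made up for illustration; only the temperature and pressure levels come from the espresso example, and nothing here reflects the actual sample sizes of that study.

```python
from collections import Counter
from itertools import product


def design_summary(observations):
    """Summarize a two-factor layout.

    observations: list of (level_of_factor_1, level_of_factor_2) pairs,
    one pair per experimental unit. (Hypothetical helper for illustration.)
    """
    counts = Counter(observations)
    levels_1 = sorted({a for a, _ in observations})
    levels_2 = sorted({b for _, b in observations})
    cells = list(product(levels_1, levels_2))

    # Crossed: at least one observation in every factor level combination.
    crossed = all(counts[cell] > 0 for cell in cells)
    # Balanced: every factor level combination has the same number of units.
    balanced = crossed and len({counts[cell] for cell in cells}) == 1

    return {
        "crossed": crossed,
        "balanced": balanced,
        "counts": {cell: counts[cell] for cell in cells},
    }


# Factor levels from the espresso example; 3 brews per cell is an assumption.
temperatures = [75, 85, 90]   # degrees Celsius
pressures = [15, 20]          # bar

balanced_design = [(t, p) for t in temperatures for p in pressures for _ in range(3)]

# Drop a single brew: still crossed, but no longer balanced.
unbalanced_design = balanced_design[:-1]

# Remove every brew at 90 C and 20 bar: one combination is empty, so not crossed.
# (An empty cell means the factors are not fully crossed; it does not by itself
# mean that one factor is nested within the other.)
missing_cell_design = [obs for obs in balanced_design if obs != (90, 20)]

for name, design in [
    ("balanced", balanced_design),
    ("unbalanced", unbalanced_design),
    ("missing cell", missing_cell_design),
]:
    summary = design_summary(design)
    print(name, "-> crossed:", summary["crossed"], "balanced:", summary["balanced"])
```

Working from the label pairs alone keeps the check separate from any modeling choices, so the same function would apply to the device and lighting study by passing in (device, lighting) pairs instead.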