So far in this course, we've studied the mathematical and statistical tools that help us analyze data that come in a particular format. We've been looking at data with a continuous response variable, the thing we want to predict or explain, and predictor variables that are factors. In one-way ANOVA we had just one predictor variable that was a factor with potentially many levels. In two-way ANOVA we have two predictor variables that are each factors. And in ANCOVA, we have factor predictors as well as a continuous covariate that we use for adjustment purposes. In all of these cases, so far, we haven't really talked about the way that the data were collected. And that's an important piece of the puzzle, because our goal may be to make causal claims about the relationships between these variables, say, the claim that falling into different levels of the factor causes the response to differ on average. Those sorts of claims require either additional assumptions or a designed experiment. So think of an example. Suppose we were looking at data from a company that is redesigning its website and trying to detect whether the redesign, say a different color scheme on the website, has a causal impact on the number of sales made in a certain period of time. Without a properly designed experiment, which we'll learn about in this next module, the justification for making causal claims would be pretty weak. Say there are two different color schemes, and on average, people who see color scheme one buy more than people who see color scheme two. Well, if we haven't assigned the color schemes to people at random, and we just observe which one each person sees, there could be systematic biases.
There could be lurking variables that cause a certain correlation structure between the color scheme and sales without there really being a causal relationship. So we would need to set up a designed experiment in order to make well-justified causal claims. To that end, in this module we'll really do two things. First, we'll study important experimental design concepts that allow us to make well-justified causal claims, at least in many cases. Then we'll apply those concepts to real study designs, such as block designs and factorial designs, and we'll analyze real data from real experiments using these techniques. One of the most important concepts that we will use in this module is the concept of a cause, or causality itself. Now, identifying causal relationships is really the goal of a lot of scientific research. But there are legitimate questions as to what a cause is and how we can learn about causal relationships. These are really difficult questions, and there are intense debates about the answers. We won't summarize those debates here, but I do want to think about a few different possible answers, because I think it will help us frame our discussion and understanding of experimental design. Philosophers, statisticians, data scientists, and many others study the nature of causal relationships, and there's no agreed-upon definition of a cause or of causation. So what I'd like to do here is just discuss, very briefly, a few common approaches to defining causality. We can do that by considering the following example: what does it mean to say that the rain caused the grass to be wet? Obviously a really simple sentence. Let's try to parse that sentence using a few different definitions of causality. One possible definition of causality involves probability, and it's sometimes called the probabilistic approach.
The main idea here is that a cause raises the probability of its effect. So, for example, we might say that C causes E if the probability of E given C is greater than the probability of E alone, that is, P(E | C) > P(E). The idea being that if you observe C, the probability of E is higher than if you did not observe C. To say that the rain caused the grass to become wet under this definition is to say that rain increased the probability of the grass being wet. Now, of course, there are problems with this approach, and they can be summarized pretty neatly in the common statistician's refrain: correlation does not imply causation. There can be probability-raising effects that are not causal effects. Another popular definition of causality is sometimes called the counterfactual definition. Roughly, theories that adhere to a counterfactual definition of causality posit that C causes E if, in the absence of C, E would not have occurred, or would have been less likely to occur. And that's a counterfactual, right? It's counter to the facts. We have to understand what would have happened if a certain event did not occur. Here, to say that the rain caused the grass to become wet really means that if it hadn't rained, the grass would not have become wet, or would have been less likely to become wet. This definition of causality, I think, really is in the background of a lot of statistical methods. When we study experimental design and try to make causal conclusions, I really think we have some kind of counterfactual definition in mind. Now there's a third way of approaching the definition of causality that's becoming popular, called the structural model approach. There's a lot to say about this approach, so we'll just summarize it very briefly.
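The probabilistic definition can be sketched with a tiny simulation. This is a toy illustration with made-up probabilities for rain and for the grass being wet (say, from sprinklers) without rain; nothing here comes from real data.

```python
import random

random.seed(0)

# Toy check of the probabilistic definition P(E | C) > P(E).
# C = rain, E = wet grass. All probabilities below are invented.
n = 100_000
rain = [random.random() < 0.3 for _ in range(n)]
# Grass is very likely wet if it rained, occasionally wet otherwise (sprinklers).
wet = [random.random() < (0.95 if r else 0.10) for r in rain]

p_wet = sum(wet) / n
p_wet_given_rain = sum(w for w, r in zip(wet, rain) if r) / sum(rain)
# Rain "raises the probability" of wet grass: P(wet | rain) > P(wet).
```

Under these assumed probabilities, conditioning on rain pushes the probability of wet grass from roughly 0.35 up to roughly 0.95, which is exactly what the definition asks for.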
The basic idea is that a claim like "C causes E" must be evaluated relative to a structural model. These models describe a system of interest driven by stable stochastic mechanisms. Really, what we're saying is that any causal relationship, say, the rain causing the grass to be wet, has to be evaluated in terms of lots of background factors that would be summarized by some structural equation model. Each of these approaches has benefits and drawbacks. But for the purposes of this course, and of this module in particular, establishing the following conditions will make it reasonable to draw causal conclusions from experimental data. First, we need an empirical association. We need something like a correlation in order to make causal claims: a correlation between the predictor variables (also called explanatory variables or covariates, all different names for the same part of an ANOVA model, or a regression model more broadly) and the response variable. We understand that correlation by using regression or ANOVA; those tools give us empirical association relationships. Second, we need the correct temporal relationship: the cause must come before the effect in time. The predictor in our ANOVA model has to come before the response, that is, before the change in the response. And the last thing we need is nonspuriousness: the relationship between the predictor and the response must not be driven by some third, unaccounted-for variable. There can't be some background variable that is really causing the response to change, while the thing that we've actually measured in our model is merely correlated with that variable. We want to isolate the actual relationship, to make sure the thing causing the response to change is the thing that we've actually measured.
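The failure mode that the nonspuriousness condition rules out, probability raising without causation, can also be simulated. In this sketch a lurking variable (hot weather) drives both ice cream sales and sunburns, so the two end up correlated even though neither causes the other. The specific numbers are invented for illustration.

```python
import random

random.seed(1)

# Lurking-variable demo: hot weather (Z) drives both ice cream purchases (X)
# and sunburns (Y). Neither X nor Y causes the other.
n = 50_000
hot = [random.random() < 0.5 for _ in range(n)]
ice_cream = [random.random() < (0.8 if z else 0.2) for z in hot]
sunburn = [random.random() < (0.6 if z else 0.05) for z in hot]

p_burn = sum(sunburn) / n
p_burn_given_ice = sum(b for b, i in zip(sunburn, ice_cream) if i) / sum(ice_cream)
# Conditioning on ice cream raises the probability of sunburn, a
# probability-raising relationship with no causal link: both merely
# track hot weather.
```

This is why the probabilistic criterion alone is not enough, and why a designed experiment randomizes the predictor: randomization breaks the link between the treatment and any lurking variable.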
Whatever our factor is in ANOVA, or whatever our continuous predictor is in a regression model. We'll also note that it's desirable to know something about the causal mechanism: whatever the scientific law is that describes the relationships between the variables, that is important to know if we can know it. But often that's not possible, or we don't have a scientific law that describes these relationships, and that's actually what we're trying to find out. If we don't have that, the strength of our causal conclusion might be a bit weaker. But experimental design techniques can help us learn something about causes. Experimental designs really help us establish the empirical association criterion, and they also help with temporal relationships and nonspuriousness. So they help with each of the three criteria we were just speaking about. Specifically, experimental design concepts such as treatment designs, for example having treatment and control groups, which we'll talk more about in a future lesson, and randomization, another concept we'll talk about here, do much of the work in helping us establish these three conditions. Treatment designs help researchers think clearly about experimental factors. We might ask: how many treatments or factors should there be in our experiment? An example of that might be, how many different doses of a drug might we give to patients in a study? Or how many different color schemes should we implement and test for their effect on sales? It really depends on the goals of the research: whether we want several color schemes or just one or two, and whether we want many different doses of a drug or just one. Within a treatment design, we might also think about whether or not we want to control for continuous covariates, meaning, should we use an ANCOVA model instead of just an ANOVA model?
This could be important because those covariates could be correlated with the response variable, and we want to control for them if possible. All of this goes into coming up with a good treatment design for your experiment. Another really important issue in experimental design is randomization. Randomization concerns the assignment of treatments to experimental units. In order to do that, we have to think clearly about what experimental units actually are, and we'll discuss that in just a few moments. We'll also ask how they differ from what we might call sampling units; there's a subtle but important distinction between the two. But the question about randomization really is: how should we assign the treatment, say the drug or the website color scheme, to our units? Should it be completely at random? Should it be systematic in some way? (That might be bad, for reasons we'll discuss.) Should it be random but within some predefined group or block? How many replications should we have, meaning how many people should actually get the drug versus the placebo, or how many people get assigned one color scheme versus another? Those are all important questions for randomization design. In our study of ANOVA, we learned some technical details about treatment designs. For example, suppose we have two different experimental treatments, maybe one is a fertilizer and the other is a pesticide, and we're studying some characteristic of plants that receive these two treatments. We should decide on the number of levels for each of those treatments: how much fertilizer, whether there is a control group that gets something like a placebo, and the same for the pesticide. Once we decide that, we can use a two-way ANOVA model. Or we might think about controlling for something else, like the intensity of the sunlight, and then we might use an ANCOVA model.
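The core idea of randomization, leaving the assignment of treatments to units entirely to chance, can be sketched in a few lines. This is a minimal illustration of balanced, completely random assignment; the plot and treatment names are hypothetical.

```python
import random
from collections import Counter

random.seed(3)

# Hypothetical experimental units and treatments for the plant example.
units = [f"plot_{i}" for i in range(12)]
treatments = ["control", "fertilizer", "pesticide"]

# Replicate each treatment equally (12 units / 3 treatments = 4 replications),
# then shuffle so which plot gets which treatment is left to chance alone.
assignment = treatments * (len(units) // len(treatments))
random.shuffle(assignment)
design = dict(zip(units, assignment))

counts = Counter(design.values())  # each treatment appears exactly 4 times
```

Because the shuffle carries no systematic pattern, any lurking characteristic of the plots (soil quality, drainage) is, on average, balanced across the treatment groups.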
So we have learned some mathematical and statistical techniques that will help us out with treatment designs, and what we're really learning about in this module is how they actually apply to controlled experiments. Later in this module we'll discuss two different randomized designs: the completely randomized design and the randomized complete block design. They have different uses depending on certain factors that show up in the data, and we'll learn which one is more appropriate in which context. But I think it's first important to be clear on the distinction between an experimental unit and a sampling unit, because sometimes that can be a point of confusion. So we'll end with a discussion of that difference. First, recall that an experiment deliberately imposes a treatment on a group of units in the interest of observing a response. This differs from an observational study, which involves collecting and analyzing data without changing the conditions. In experiments we have control over imposing treatments; in observational studies we don't. With that in mind, let's define an experimental unit. An experimental unit is the entity to which the treatment is applied. And if we independently assign treatments to several experimental units, we've performed a replication. Replication is good because the more information we have about the treatment through replication, the less variability we have in our estimates of causal effects. Importantly, an experimental unit may not be the same as a sampling unit. A sampling unit is the entity on which the response is measured, and that could be different from the experimental unit: we might apply a treatment at one level that's different from the level at which the response is measured. To gain a better understanding here, let's think about an example.
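The claim that replication reduces the variability of effect estimates can be checked with a quick simulation: repeat a hypothetical two-group experiment many times at two replication counts and compare the spread of the estimated treatment effects. The effect size and noise level here are invented for illustration.

```python
import random
import statistics

random.seed(2)

# Hypothetical experiment: true treatment effect of 5, noisy unit responses.
def estimate_effect(n_reps):
    """Run one experiment with n_reps units per group; return the effect estimate."""
    treated = [5 + random.gauss(0, 10) for _ in range(n_reps)]
    control = [0 + random.gauss(0, 10) for _ in range(n_reps)]
    return statistics.mean(treated) - statistics.mean(control)

# Spread of the estimate across 2,000 repeated experiments, at two
# replication counts.
spread_small = statistics.stdev(estimate_effect(5) for _ in range(2000))
spread_large = statistics.stdev(estimate_effect(50) for _ in range(2000))
# With 10x the replications, the spread shrinks by about sqrt(10).
```

In this setup the theoretical standard deviation of the estimate is sqrt(2 * 10^2 / n), so going from 5 to 50 replications per group cuts it from about 6.3 to about 2.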
So consider a company that wants to perform an experiment to compare the effects of four different advertising campaigns on click rates. They'd like more Internet users to click on their ads, and they want to try four different advertising campaigns to bring people to their website. They have four different social media platforms on which they choose to host their ad campaigns, and each platform will show the ad to 1,000 Internet users. The experimenters randomly assign each campaign to a different social media platform. Now, in this example, the treatment is the advertising campaign, and we're assigning the treatment to the different social media platforms. But what are the experimental units? In this case, they're not the Internet users, not the total of 4,000 Internet users; they're actually the social media platforms. That's because the social media platforms are the entities to which the treatments are applied: the campaigns are randomly assigned to those platforms. But the individual Internet users on those platforms are the sampling units, because they're the entities on which we measure the response, namely whether they clicked on the ad or not. This distinction might seem trivial, but in a lot of cases it's not. For example, if you thought the experimental units were the users, then you would falsely believe that we had many replications of each ad campaign. In fact, we don't have any replication, because there are four advertising campaigns and four social media platforms, so each campaign is applied to exactly one experimental unit. But if you mistakenly identified the experimental units as the Internet users, you would have thought there were many replications. In a future lesson, we'll learn how to block, which is a technical term, and that will help us with replication.
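The unit bookkeeping in this example can be made concrete. A short sketch, with hypothetical platform and campaign names, showing that the replication count per campaign is one regardless of how many users are measured:

```python
import random

random.seed(4)

# Treatments (campaigns) are assigned at the platform level, so the
# platforms are the experimental units; the 1,000 users per platform are
# only sampling units. Names are hypothetical.
campaigns = ["A", "B", "C", "D"]
platforms = ["platform_1", "platform_2", "platform_3", "platform_4"]

random.shuffle(campaigns)
assignment = dict(zip(platforms, campaigns))  # one campaign per platform

# Replications per treatment = experimental units per treatment,
# NOT the number of users measured.
replications = {c: list(assignment.values()).count(c) for c in campaigns}
users_measured = len(platforms) * 1000  # 4,000 sampling units
# Every campaign has exactly 1 replication: no replication in this design,
# even though 4,000 users are measured.
```

Counting the 4,000 users as replications would drastically understate the uncertainty: any platform-level quirk (its user base, its ad placement) is completely confounded with the campaign shown there.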
But in the next lesson, we'll start with a simpler experimental design, namely the completely randomized design.