So now we're going to turn to a discussion of sample size determination in designed experiments. This is one of the most commonly asked questions in the design of experiments world: how much testing do I need to do? Well, the answer actually depends on lots of things. Not only the type of experiment that you're thinking about running, but the resources that you have, how you're going to run the experiment, and what the desired sensitivity of your experiment is. What do we mean by sensitivity? Sensitivity is the difference in means that you want to be able to detect with fairly high probability. Generally, what we find is that increasing the number of replicates increases the sensitivity; that is, it makes it easier to detect small differences in the means.

We're going to talk about sample size determination focusing on the fixed effects experiment. We typically choose the sample sizes to detect a specific difference in means while achieving desired values of the type I and type II errors. The type I error is the probability of rejecting H0 when it's actually true. We control that; it's the significance level that we choose for the test statistic. The type II error is the probability of failing to reject H0 when it's actually false, and it's usually represented by the symbol beta. We also often talk about power, and power is defined as 1 - beta. Typically, we want to design an experiment that gives us adequate power to detect differences in means of particular sizes.

There are a variety of ways to do this. There are formal charts of operating characteristic curves in some textbooks, and these operating characteristic curves plot either beta, the type II error, or the power against some parameter that expresses how different the means are. A very common way to define this parameter is in terms of phi (Φ), and Φ² is as you see on the bottom of the slide.
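To make that parameter concrete, here is a small Python sketch of the Φ² computation described above; the numeric values in the example (the plasma etch means, sigma, and n = 4) are taken from the scenario discussed later in this lecture:

```python
# Sketch: computing phi^2 for a one-way fixed-effects ANOVA, given a set of
# hypothesized treatment means and an assumed sigma.
# phi^2 = n * sum(tau_i^2) / (a * sigma^2), where tau_i = mu_i - grand mean.

def phi_squared(means, sigma, n):
    """phi^2 for a fixed-effects experiment with n replicates per treatment."""
    a = len(means)                      # number of treatments
    grand_mean = sum(means) / a
    sum_tau_sq = sum((m - grand_mean) ** 2 for m in means)
    return n * sum_tau_sq / (a * sigma ** 2)

# Plasma etch example: means 575, 600, 650, 675, sigma = 25 angstroms/min,
# n = 4 wafers per treatment.
print(phi_squared([575, 600, 650, 675], sigma=25, n=4))  # 10.0
```

The taus here are -50, -25, +25, +50 about the grand mean of 625, so their squares sum to 6250.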
It's n times the sum of the squared treatment effects, divided by a times sigma squared: Φ² = n·Σᵢτᵢ²/(a·σ²). So if you know how big that ratio is for the difference you want to detect, the operating characteristic curve can help you determine the amount of replication to use.

I'm going to show you how JMP does this. JMP is a very widely used package, and it does a very nice job of this. Let's go back to the plasma etching experiment that we had earlier, and suppose that our experimenter is interested in rejecting the null hypothesis with a high probability, say 0.9 at least. She wants the power to be at least 0.9 if the treatment means are actually as you see displayed here: mu1 is 575, mu2 is 600, mu3 is 650, and mu4 is 675. Now, if you know what those means are, you can calculate the taus, and once you know the taus, you can calculate the parameter Φ² if you have a rough idea of the variance. Well, she feels that the standard deviation of the etch rate is not going to be any bigger than about 25 angstroms per minute, so she can input that information.

You don't have to calculate the taus; all you have to do is input the means and the standard deviation into JMP, and JMP will produce a display that looks like this. This is a plot of power versus sample size for this particular scenario. You'll notice that over on the left is where we have put in the individual treatment means and the standard deviation. We fixed the value of alpha, and now we get this power curve that shows us the total sample size that would be required to get a particular power. Well, we want a power of about 90% or better, so that would be somewhere up around here, and that would drop down to this lower axis. So a total sample size of about 15 or so would appear to be appropriate. Well, we have four treatments, so let's round that up to 16.
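JMP's internal calculation isn't shown on the slide, but a power-versus-sample-size curve like this one can be sketched from the noncentral F distribution. This is only a sketch, not JMP's actual code, and the choice of alpha = 0.05 is my assumption (the transcript fixes alpha without stating its value):

```python
# Sketch of the power calculation behind a power-versus-sample-size curve
# for the fixed-effects one-way ANOVA, using the noncentral F distribution.
# Assumptions: alpha = 0.05; hypothesized means 575, 600, 650, 675;
# sigma = 25 angstroms/min (from the lecture's plasma etch scenario).
from scipy.stats import f, ncf

def anova_power(means, sigma, n, alpha=0.05):
    """Power of the fixed-effects F test with n replicates per treatment."""
    a = len(means)
    grand_mean = sum(means) / a
    # Noncentrality parameter: lambda = n * sum(tau_i^2) / sigma^2 = a * phi^2
    lam = n * sum((m - grand_mean) ** 2 for m in means) / sigma ** 2
    df1, df2 = a - 1, a * (n - 1)
    f_crit = f.ppf(1 - alpha, df1, df2)        # rejection threshold under H0
    return 1 - ncf.cdf(f_crit, df1, df2, lam)  # P(reject H0 | H1 true)

means = [575, 600, 650, 675]
for n in range(2, 7):
    print(f"n per treatment = {n}, total N = {4 * n}, "
          f"power = {anova_power(means, 25, n):.4f}")
```

As expected, the power climbs quickly toward 1 as the replication increases, which is the shape of the curve shown in the JMP display.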
So, based on this plot, the sample size requirement would be about four wafers per treatment combination. That would be one way that we could do this.

One problem with this approach is that it's often difficult for the experimenter to decide what to use for these treatment means. How do we make those up? [LAUGH] Well, that can be a problem. So there is another way to do this: select a sample size so that if the difference between any two of your treatment means exceeds a specified target value, then the null hypothesis will be rejected. Minitab is a software package that uses this approach, and here is some Minitab output based on it. The upper part of this display shows you how to calculate power for a specified difference and sample size. Here we specified the maximum difference to be 75, we plugged in our sample size of 5, and Minitab reported the power to be about 0.80. Well, if you want the power to be larger, you could do this a different way. You could again input the maximum difference you want to detect, input your estimate of the standard deviation, and input a target power. Then Minitab will tell you the appropriate sample size, which turns out to be 6 in this problem, and that would give you an actual power of about 0.91. So software can be very useful in helping you generate an appropriate sample size for your experiment.