A store-brand battery claims that it lasts as long as the more expensive national brand. A consumer watchdog group wants to make sure the claim being made by the store brand is not false, at the 5% level of significance. The batteries were tested continuously by equipment mimicking normal use, and the elapsed time until the batteries were no longer functioning was recorded. So, we're going to test this using a hypothesis test comparing two means from two samples. One sample is coming from the national brand and one is coming from the store brand.

So in this case, what we have to do first is state our hypotheses. If I say A is the store brand and B is the national brand, then one easy way of coming up with your hypotheses is to think about it this way. The claim, which is my null hypothesis, is that the lifetimes of the store brand and the national brand are the same. And therefore, the alternative would be that they are different. Once you write it this way, it's easy to write it in a way that the software can handle. The software is asking: what do you think is the difference between the two? So now you can rewrite the null as: the difference between the two batteries' mean lifetimes is equal to zero. And obviously, the other choice is that they are different, so the difference would be something other than zero.

Once we have the hypotheses stated, we will know if we are doing a one-tail test or a two-tail test. This is an example of a two-tail test, because again, they are not saying the store brand is worse than the national brand or better than the national brand. They're just saying it's the same. So we will reject this hypothesis if we find out that the store brand is indeed better, or if the store brand is worse. On either tail, if we deviate too much from the difference of zero that we expect, we end up rejecting the null hypothesis; otherwise we do not. So, now let's go back to our data and see how we would do this.
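Written out in symbols, the two hypotheses just described look like this. The notation is introduced here for illustration and is not from the lecture: \mu_A stands for the mean lifetime of the store brand (A) and \mu_B for the mean lifetime of the national brand (B).

```latex
\begin{align*}
H_0 &: \mu_A - \mu_B = 0    && \text{(the two brands last equally long)} \\
H_1 &: \mu_A - \mu_B \neq 0 && \text{(two-tail: the store brand is better or worse)}
\end{align*}
```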
So this is what we have for the data. If I look down, I see that the data runs down to row 186, and I have the same number of batteries tested for Brand A and Brand B. By the way, that is not a requirement. You don't have to have the same sample size. These are two independent samples, randomly selected from two independent populations, so they don't have to be the same size. Now here, since I'm testing batteries, it probably is the case that I would just pick the same number of each. But think about it: if I send surveys out to two different populations, I can't control whether one population returns more of its surveys to me than the other. So the fact that they don't have to be equal is a great thing for us.

Let me show you how we do this in Excel. You go to Data, then Data Analysis. When the pop-up window comes up, scroll down until you see three different t-Tests. The first one is Paired Two Sample for Means. I will discuss this later on in one of my examples. The next two are Two-Sample Assuming Equal Variances and Two-Sample Assuming Unequal Variances.

So, let me talk about this a little bit. When you assume equal variances, it means that population A (Brand A) and population B (Brand B) each have some variability within themselves, but I'm going to assume that the variability within each population is about the same. The reason we have this option is that when we were doing this manually, the degrees of freedom that we have to come up with for the t-test are a lot simpler if I assume equal variances. But the assumption of equal variances may or may not be true. So when we are using a computer program and we are not doing the difficult work of calculating everything manually, there is really no reason for you to assume equal variances. So, always run this kind of test assuming unequal variances, because if they do have equal variances, the answers will be essentially the same.
But if we do it the other way around, which is I assume equal variances but in reality they're not equal, then my analysis is not right. So, be safe and always go for the most general form. Because for you, when you're using software, it really does not make a difference. The calculations come back in a split second, so you won't notice the difference at all.

So, I'm going to say that I'm going to do a t-Test. That t-Test is going to tell me whether a significant difference between these two exists or not. I'm using Two-Sample Assuming Unequal Variances. I'm going to click OK, and then it asks, what is your Range 1? I like to include my labels, because that way my analysis shows the labels rather than "Variable 1." So, I'm going to put my cursor on A1, hold Ctrl+Shift down, and pick the entire data set. Remember to scroll back up, put the cursor on Brand B, and pick the entire column. Remember to check the box that says you have labels. If you don't, you'll get an error message that says you have non-numeric values. That's usually the case when you forget to check this. So if you get that, just open the dialog again and check Labels.

Then it asks, what is your hypothesized mean difference? Here, I can put zero or I can just leave it blank, because the default is zero. And as you can see, Alpha is 0.05; that's again a default value. You can change this to 0.10, or to whatever you want.

I'd like to put my output on the same page, but as soon as I click this option my cursor jumps here, graying this out. So if I go and click anywhere on the spreadsheet, I would lose that range reference. So before you click anywhere on your spreadsheet, make sure you click in this box first, then click a place on the sheet and say OK.

Now you can see that everything is kind of smooshed up, so I'm going to do some formatting first so it is easy to read. The best thing to do is to put your cursor here, on the column border, and double-click, and it will widen the column to fit, so you don't have to worry about it. Same thing here.
This one was too big, so I'm going to make it smaller. What you see is the output. First of all, it gives you a lot of descriptive statistics right away. For example, it tells you that the mean for Brand A is 9.99 hours and for Brand B is 9.92 hours. Now remember, Brand A is my store brand and Brand B is my national brand. If you want to change the label to help you remember which is which, you can, by the way. If it's easier for you to think about it as "store brand," then you can just change it to "store brand." That's the nice thing about it. I'm going to change it back to what it was.

So looking at this, it looks like Brand A is actually a little bit better. And this is the cheaper battery, because it's a store brand, a generic brand. So I may say, wow, Brand A is actually better. But remember, we can't say that, because statistically they may be exactly the same. That's what we are testing here.

It also tells you the variance in Brand A and the variance in Brand B. Remember, variance is the square of your standard deviation. So if I wanted the standard deviation, I would have to take the square root of this, and then I would know the standard deviation of Brand A. It says that it has 185 observations in Brand A and 185 observations in Brand B. The hypothesized mean difference was zero, and the degrees of freedom have been calculated by taking these values and going through the equation that is used for unequal variances.

Once it does that, you see that it gives you a value for the t Stat, then a P value if you are doing a one-tail test and a P value if you are doing a two-tail test. It also tells you where the t Critical is. So, let me show you what the t Critical means. If you're doing a two-tail test and your level of significance is 5%, then 2.5% is here and 2.5% is here.
So, this t Critical would be 1.96 and -1.96. That is what the critical t value is. However, if you were doing a one-tail test, and it doesn't matter which tail, by the way, right or left, let's just assume the right tail. If you are doing a right-tail test, the entire 5% will be here, right? So then this is 1.65, which is what you see here. So, it gives you all the values, because it doesn't know if you're doing a one-tail test or a two-tail test. Excel doesn't know. Excel gives you the entire output. You have to pick out what is right.

Then let me also tell you what the t Stat is, which is this one. The t Stat says: you have taken a sample from Brand A and a sample from Brand B, and the difference you have observed between these two samples, measured against the hypothesized difference of zero, puts your sample somewhere here, at 1.54. From that, Excel gives you the P value, and you have to decide whether or not you reject the hypothesis.

So in our case, we're doing a two-tail test, and that's what we're going to pay attention to. We're going to pay attention only to this part of the table. Looking at this P value, you see that it's greater than 0.05; therefore, we do not reject the null hypothesis. If we don't reject the null hypothesis, what does that mean? Our hypothesis was that these two perform about the same, and based on our data we ended up not rejecting it. Therefore, as far as this test can tell, the two brands are about the same. So what they are claiming right now, based on the data that we have, cannot be disputed.
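If you want to see roughly what sits behind that output, here is a minimal sketch in Python using only the standard library. It is not Excel's implementation. The function names `welch_t` and `two_tail_decision` and the battery lifetimes are made up for illustration. The first function computes the t statistic and the unequal-variances degrees of freedom (the Welch-Satterthwaite formula). The second uses a normal approximation in place of the t distribution, which is an assumption that is only reasonable for large samples like the 185 batteries per brand here; Excel works with the t distribution itself, so its values will differ slightly.

```python
import math
import statistics
from statistics import NormalDist


def welch_t(sample_a, sample_b, hypothesized_diff=0.0):
    """Return (t statistic, degrees of freedom) without assuming equal variances."""
    n_a, n_b = len(sample_a), len(sample_b)
    var_a = statistics.variance(sample_a)  # sample variance = standard deviation squared
    var_b = statistics.variance(sample_b)
    se2_a, se2_b = var_a / n_a, var_b / n_b  # squared standard error of each mean
    diff = statistics.mean(sample_a) - statistics.mean(sample_b)
    t_stat = (diff - hypothesized_diff) / math.sqrt(se2_a + se2_b)
    # Welch-Satterthwaite approximation for the degrees of freedom
    df = (se2_a + se2_b) ** 2 / (se2_a ** 2 / (n_a - 1) + se2_b ** 2 / (n_b - 1))
    return t_stat, df


def two_tail_decision(t_stat, alpha=0.05):
    """Return (critical value, p-value, reject?) using a normal approximation.

    Assumption: the samples are large enough that the t distribution is
    close to the standard normal.
    """
    z = NormalDist()
    critical = z.inv_cdf(1 - alpha / 2)     # alpha is split between the two tails
    p_value = 2 * (1 - z.cdf(abs(t_stat)))  # area in both tails beyond |t|
    return critical, p_value, p_value < alpha


# Made-up lifetimes in hours, for illustration only (not the lecture's data)
store_brand = [9.8, 10.1, 10.0, 9.9, 10.2, 10.0]
national_brand = [9.9, 9.7, 10.0, 9.8, 10.1, 9.9]
t_stat, df = welch_t(store_brand, national_brand)
print(f"t = {t_stat:.3f}, df = {df:.1f}")

# Decision for the t Stat of 1.54 reported in the lecture's output
critical, p_value, reject = two_tail_decision(1.54)
print(f"critical = +/-{critical:.2f}, p = {p_value:.3f}, reject H0: {reject}")
```

The six-value samples are only there to show the calculation of the t statistic and degrees of freedom; the normal approximation should not be trusted for samples that small, which is why the decision step is applied only to the lecture's t Stat of 1.54.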