Today I want to talk to you about how science answers questions. We all want the right answers. Should I prepare for rain today? How can I deal with my boyfriend or girlfriend? Should I go to the ball game or the concert tonight? Every day we face myriad circumstances for which we need answers, the right answer. Well, sorry gang, but science can never tell you the right answer. All we can do is give you a probability that an answer is correct or not. What this means is that we live in a statistical universe. If the universe is statistical by nature, and we now have quantum mechanics to back up that claim, it may be that fundamentally a 100% deterministic answer does not even exist. In this very real sense, the universe with its constituents is not a machine, but an indeterminate process. However, even if it is indeterminate, it is not a free-for-all, but is constrained by certain boundaries. It is these boundaries that physics wishes to explore, quantify, and refine.

So we have a problem on our hands. If there are three kinds of lies, namely lies, damned lies, and statistics, how are we to proceed? Incidentally, that phrase is often attributed to several people, among whom are Mark Twain and Benjamin Disraeli. Well, the fact of the matter is that the only way we can find out whether we are dealing with lies or damned lies is through statistics. The way it works is as follows. If a theory predicts a phenomenon that is not observed, we can rule out the theory. But if it does accord with the data we have, all we can say is that the theory is consistent with the data on hand, not that the theory has been proven correct. In fact, we can never prove a theory correct. What this means in practice is that the most important part of any scientific experiment is the probable error associated with the measurement. It is more or less the wiggle room that we give a measurement: by how much our measurement might differ if we did the experiment over and over again.

As a concrete example, let's imagine we are trying to measure the length of a dining room table. We get out our trusty old stone-age tape measure and do an experiment. Measurement one: 260 centimeters. Now, what is the experimental error that we can estimate from this measurement? You might think nothing, since we have nothing to compare it to. But wait. If we look at our stone-age device, we see that it is very crudely made, with big unmarked segments and thick lines marking the intervals. So we can estimate something called a systematic error, at, say, ten centimeters. But we are not satisfied with that, so we make more measurements: 250 centimeters, 260 centimeters, 250, 270, 260, 100. 100 centimeters? Whoa, what happened? Do we really think that the table length is variable? It might be. But at first glance it appears we have made what we call a blunder. If we use that 100 centimeter measurement in our computation of, say, an average length, it will throw off everything. And if we suspect a blunder, we need to track down its source, if possible. So, using the six supposedly valid measurements, we obtain a sample average of about 258 centimeters. Now we can ask, what is the experimental error associated with this determination? In other words, how close to 258 centimeters would you expect each measurement to be? Clearly, we must compare our sample average with the individual measurements we already have. Also, we sense that a measurement that is smaller than the average should count equally with a measurement that is larger.
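As a quick check on these numbers, here is a rough Python sketch, assuming nothing beyond the seven tape-measure readings quoted above; the cut that rejects the blunder is just an arbitrary one chosen for illustration.

```python
import numpy as np

# All seven readings, in centimeters, including the 100 cm blunder.
all_readings = np.array([260, 250, 260, 250, 270, 260, 100], dtype=float)

# A crude, purely illustrative cut that throws out the obvious blunder.
valid = all_readings[all_readings > 200]

print(all_readings.mean())   # about 235.7 cm -- the blunder drags the average way down
print(valid.mean())          # about 258.3 cm -- the sample average quoted above

# Deviations of each valid reading from the sample average.
# Some come out positive and some negative, which is why the next step
# squares them before averaging.
print(valid - valid.mean())
```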
So, we'd better square things first and then take the square root, in order to avoid the minus signs that would come from measurements that are smaller than the average. Thus, we expect that we need to take the following quantity. In other words, for each of our N measurements, we compare the actual measurement, x sub i, with the mean of all the measurements, x bar, and square the difference. Then we add all N results together and take the square root of the whole shebang. But clearly something is missing here, because larger samples of measurements, in other words larger N, should not imply a larger error. More measurements have to be worth something. So we sense that our estimate of the standard deviation should include some factor of 1 over N. In fact, it turns out that this quantity, usually designated as sigma, is equal to our previous sum, but divided by the square root of N. In essence, we are taking the square root of the average squared deviation from the mean. Now, remembering that you need a minimum of two measurements to get any meaningful average deviation, this leads to the refinement of our equation as follows: it is the same as before, except it has N minus 1, instead of N, under the square root.

The value of this quantity, sigma, associated with a mean value, can be shown to have an astonishing property: 68%, or about two-thirds, of all measurements you can possibly make, even into the future, will fall within plus or minus 1 sigma of the mean, as long as the properties of the phenomenon haven't been altered. To summarize, plus or minus 1 sigma contains 68% of all data measurements, plus or minus 2 sigma contains 95% of all data measurements, and plus or minus 3 sigma contains 99.7% of all data measurements. That's it. It doesn't matter what you are measuring. You could be interested in the height of 25-year-old women in Borneo, a comparison of daily maximum temperatures of two cities anywhere in the world, measurements of returns on investment versus risk in financial markets, or an analysis of scoring statistics in sports. All of these basically use the same ideas presented here. However, if the phenomenon has changed, or a new phenomenon is somehow buried in the data, a smaller standard deviation will enable you to detect it more easily.

Let's get back to our table and fast-forward to the 21st century. New devices now allow us to obtain much better precision in our measurements. Now, using the same table, we may obtain the following results. You look at these numbers closely and say, hmm, are we seeing something significant in the fact that the numbers seem to cluster around 258.65 and 258.85? Maybe, maybe not. But we pay attention to this detail, and then find out, with additional measurements, that we obtained the higher numbers when the temperature of the room was significantly higher than when we got the lower values. We have discovered something. The table is changing its length in response to a temperature change in its environment. Our measurements have revealed the thermal expansion of the table, something we had not anticipated, perhaps. And X-ray astronomy, as we shall see, is filled with surprises of this sort. And I'm sure your life is filled with them as well. When have you thought that you were exploring or answering one question, when in reality you were finding out something quite different instead? Why not discuss this on the forum? We now shift gears and look at a hypothetical astronomical example.
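Before turning to that example, here is the finished table calculation as a minimal Python sketch, assuming the same six valid readings as before; NumPy's `std` with `ddof=1` uses the N minus 1 denominator described a moment ago.

```python
import numpy as np

valid = np.array([260, 250, 260, 250, 270, 260], dtype=float)  # the six valid readings, in cm
N = len(valid)
mean = valid.mean()

# Standard deviation built up exactly as in the text: square each deviation
# from the mean, sum them, divide by N - 1, and take the square root.
sigma = np.sqrt(np.sum((valid - mean) ** 2) / (N - 1))
assert np.isclose(sigma, valid.std(ddof=1))   # NumPy's shortcut gives the same answer

print(f"mean  = {mean:6.1f} cm")    # about 258.3 cm
print(f"sigma = {sigma:6.1f} cm")   # about 7.5 cm

# The 68% rule of thumb: roughly two-thirds of measurements should land within
# plus or minus one sigma of the mean. With only six readings the fraction is
# rough, but we can count them anyway.
inside = np.abs(valid - mean) <= sigma
print(f"{inside.sum()} of {N} readings lie within 1 sigma of the mean")
```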
In this case, our determination of the standard deviation, or uncertainty in our measurements, is even easier to obtain than our result for the table length. This is because in certain situations, which fortunately include most astronomical observations, a very simple result ensues concerning what we might expect from a measurement of, say, the brightness of a cosmic X-ray source as a function of time. The idea is as follows. Let's suppose you have a random process, such as the emission of light from an object. We know that when an electron within an atom changes its energy by jumping from one level to another, it is accompanied by the emission or absorption of a photon. And we know that it is random in the mathematical sense, because we can never know exactly when this will happen, only that it will probably happen within a certain given time period. And if it happens to lots and lots of electrons, lots and lots of times, we will get lots and lots of photons into our cameras or detectors.

Let's suppose, to make this concrete, that we observe a source for ten minutes and count 21,262 photons. We sense that if we were to do this measurement again and again, we would not get exactly 21,262 photons again and again, even if the source were unchanging. The randomness of the process ensures this. Well, in these circumstances, there is a simple way to estimate the probability of getting another result similar to, but not identical with, our first trial, if we were to repeat our measurement. We simply take the square root of the number of photons observed, and that represents the range, plus and minus from our observation, within which we would expect the result to fall two-thirds of the time if we were to do the observation over and over again. Thus, considering our original observation, we would expect to observe 21,262 photons plus or minus 146, about two-thirds of the time, if we were to repeat the experiment over and over. Why? Because 146 is approximately the square root of 21,262. The number 146, once again, is the standard deviation of our observation.

In astronomy, however, raw numbers of photons are not particularly interesting. We are more interested in rates: how much energy is emitted per second, or how many photons are detected per second during any given observation. So, let's see how this plays out in practice. Let's imagine that we detect 100 photons in ten seconds. Our expected range, or in statistical language our standard deviation, will then be 100 plus or minus 10 counts over those ten seconds, because 10 is the square root of 100. This translates into a rate of 100 counts over 10 seconds, plus or minus 10 counts over 10 seconds, or 10 plus or minus 1 count per second. Now let's imagine the same source, which is assumed unchanging, but we observe it for 1,000 seconds. In other words, our observation is 100 times as long. Since we get 100 counts in 10 seconds, we would expect to get 10,000 counts in 1,000 seconds, and therefore we would expect to have 10,000 plus or minus 100 counts in our observation. Our rate then would be 10,000 counts divided by 1,000 seconds, plus or minus 100 counts in 1,000 seconds, or 10 plus or minus 0.1 counts per second. Notice that we needed 100 times more data to get our standard deviation down by only a factor of 10. What a bummer. So, as you see, it can be slow going and sometimes very expensive to get better and better results. But it is the reason why scientists are always asking for more data, better detection instruments, and bigger telescopes.
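Here is a small Python sketch of that counting argument, assuming the same steady source of 10 counts per second; it uses NumPy's Poisson generator to simulate many repeat observations and compares their scatter with the square root of the expected number of counts.

```python
import numpy as np

rng = np.random.default_rng(42)   # fixed seed so the sketch is repeatable
true_rate = 10.0                  # an unchanging source: 10 counts per second

for exposure in (10.0, 1000.0):   # a 10 second look, then one 100 times longer
    expected = true_rate * exposure

    # Photon counting is a Poisson process: simulate 100,000 repeat observations.
    counts = rng.poisson(expected, size=100_000)

    # The scatter in the counts comes out close to sqrt(expected counts)...
    print(f"{exposure:6.0f} s: scatter = {counts.std():6.1f}, sqrt(N) = {np.sqrt(expected):6.1f}")

    # ...so the rate uncertainty only shrinks as 1 over the square root of the exposure.
    print(f"          rate = {expected / exposure:.1f} +/- {np.sqrt(expected) / exposure:.2f} counts/s")
```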
We will explore this important issue in greater depth in week three, when we talk about clocks in the sky. But for now, we will just state that the size of the error bar, or standard deviation, may play a decisive role in what we can legitimately say about an astronomical source. Consider the following hypothetical data points measuring the brightness of an object versus time. So what we're going to do is plot the brightness of a source versus time, and let's imagine that we have the following points on our graph, something that looks like this. Let's make them a little more definite so that we can really see what's going on. So, these are our measurements of this source of light as a function of time. Now we ask a simple question: is this source varying? Well, it depends on the size of the error bars. With a small sigma, we are more or less forced to connect the observations with some kind of variable curve. Let's look at that. Let's just look at what would happen if we attach very small error bars to each of these points. You can see, without even drawing anything, that it is impossible to fit an ordinary, non-varying line through the data that would meet our requirement that two-thirds of our data points be within one sigma of that particular line. We are almost forced into drawing something like that. But this is not the case with larger error bars. Let's imagine that we have exactly the same data points, but now associated with each measurement, maybe because we're using a smaller telescope, or our detectors aren't as good, or whatever, there are really, really big error bars. Now you can see that it is quite easy to meet our requirement, more or less, that two-thirds of all of the data be within plus or minus 1 sigma of our mean, by fitting a straight, unvarying line through the data. So you can see that the standard deviation sigma is critical to our observation, and it will determine whether our scientific estimate of variability is a lie, a damned lie, or a legitimate statement of probable fact.
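If you would like to play with this idea yourself, here is a toy version of the test in Python. The brightness numbers and the two sigma values are invented purely for illustration, and the "fit" is nothing fancier than an unvarying line drawn at the mean, judged by the two-thirds rule of thumb used above.

```python
import numpy as np

# Hypothetical brightness measurements versus time (made-up numbers, standing
# in for the points sketched on the graph).
brightness = np.array([10.2, 9.6, 10.9, 9.1, 10.5, 9.3, 10.8, 9.7])

def consistent_with_constant(values, sigma):
    """Crude test: draw an unvarying line at the mean and ask whether roughly
    two-thirds of the points sit within plus or minus one sigma of it."""
    mean = values.mean()
    fraction_within = np.mean(np.abs(values - mean) <= sigma)
    return fraction_within >= 2 / 3

# Small error bars: the scatter is much bigger than sigma, so a flat line fails
# and we are pushed toward calling the source variable.
print(consistent_with_constant(brightness, sigma=0.1))   # False

# Large error bars: the same points easily satisfy the two-thirds requirement,
# so an unvarying source is a perfectly acceptable description.
print(consistent_with_constant(brightness, sigma=2.0))   # True
```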