Now that we're refreshed on the basics of how C programs are laid out in memory, and in particular how they use the stack to support calling and returning from functions, we can start looking at buffer overflow attacks. Let's look at the components of the name. A buffer is simply a contiguous region of memory associated with a program variable or field. When they use the term buffer, people are often thinking of strings, where a string is simply an array of characters ending with a null, or zero, character. For now, we will focus on strings too. Later, we will consider format string attacks, and in the process see how the idea of a buffer is actually quite general.
An overflow occurs when the program tries to write more data to a buffer than it can actually hold. The term is evocative of data running off the end of the buffer. But once again, the idea is really more general: basically, an overflow occurs whenever the program uses a variable to access memory that doesn't belong to that variable. For example, by indexing an array out of its bounds, the program is performing a kind of overflow.
An important question is, what happens when the program reads or writes a buffer outside its bounds? According to the C programming language standard, the behavior of such a program is undefined. Effectively, it is allowed to do anything. In a move positive for security, the compiler could choose to insert code to detect out-of-bounds accesses and terminate the program when they occur. Instead, most compilers simply assume the program does not have any overflows, and so the program will access whatever memory happens to be at the accessed location. By knowing how memory is laid out, an attacker can use out-of-bounds accesses to their advantage.
Let's look at what could happen if a buffer overflow takes place. Here we have a function, func, and the function main, which calls func with the string "AuthMe!". Inside func, the program tries to copy that string into a buffer.
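The function just described might look roughly like the sketch below. This is a reconstruction from the spoken description, not the slide itself: the 4-byte size of buffer and the trailing "!" in the string are assumptions, and bytes_past_end is a hypothetical helper added only to make the size mismatch concrete. In the example, main would call func with the string literal; that call is left out here because running it is undefined behavior.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Reconstruction of func: the local buffer is assumed to hold 4 bytes.
   strcpy copies until it reaches the terminator in arg1 and never looks
   at the size of buffer, so any arg1 needing more than 4 bytes overflows. */
void func(char *arg1)
{
    char buffer[4];
    strcpy(buffer, arg1);   /* no bounds check */
}

/* Hypothetical helper: how many bytes strcpy would write past the end of a
   destination that holds `capacity` bytes. Counts the null terminator. */
size_t bytes_past_end(const char *src, size_t capacity)
{
    assert(src != NULL);
    size_t needed = strlen(src) + 1;            /* characters + terminator */
    return needed > capacity ? needed - capacity : 0;
}
```

Under these assumptions, bytes_past_end("AuthMe!", 4) is 4: the string needs eight bytes and only four are available.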
But probably you can see the problem here. The string has seven characters plus a null terminator, whereas the buffer local to func only allots four characters. And so we're going to overflow that buffer when we call strcpy.
Let's see this depicted on the stack. First, when calling func, we see arg1, then the instruction pointer that we saved from the caller, and then the saved frame pointer. Then we see the buffer, four bytes, that we allocated inside of func. Now strcpy runs, and it copies the first four characters into the buffer. Then it keeps copying, and overwrites the saved frame pointer with the rest of the string. When we get to the end of the function, we're going to try to follow the same process we always do to return to the calling function, main. But of course, the saved frame pointer is now corrupted, so the frame pointer register is going to be set to whatever this strange value is. And we're likely to get a segmentation fault when we subsequently use that frame pointer, for example when accessing a local variable in the caller.
Now, normally, we think, oh, that's a crash. There are bugs in the program, and this is one of them. Who cares? Eventually we'll discover it and we'll fix it. Well, buffer overflows are security relevant. If we modify the function func as follows, we can see that overflowing the buffer can have security implications for the program. We've allocated a new local variable, authenticated, and throughout the function func we assume that authenticated is set only if authentication has really taken place. Perhaps this will happen after the strcpy.
Now let's see what happens with our buffer overflow this time. When calling func, we push arg1, then the instruction pointer, then the frame pointer, and then we allocate the local variable authenticated and the local variable buffer. And now the strcpy takes place. This time, instead of overwriting the frame pointer, we overwrite the contents of the authenticated variable.
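To make the stack picture concrete without actually running an overflow, here is a minimal sketch that models func's frame as a flat array of bytes. Everything about the layout is an assumption for illustration: a 32-bit machine with 4-byte slots, authenticated sitting directly above buffer, and the names slot_after_copy and the offset constants are all hypothetical. A real compiler may order or pad the locals differently.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Assumed offsets inside a modeled frame for func, lowest address first.
   Real compilers are free to order and pad locals differently. */
enum {
    BUFFER_OFF = 0,   /* char buffer[4]            */
    AUTH_OFF   = 4,   /* int authenticated         */
    FP_OFF     = 8,   /* saved frame pointer       */
    IP_OFF     = 12,  /* saved instruction pointer */
    FRAME_SIZE = 16
};

/* Copies src and its terminator to the modeled buffer without checking the
   buffer's 4-byte capacity, as strcpy would, then returns the 4-byte slot
   at slot_off. The copy is clamped to FRAME_SIZE so that the model itself
   never writes out of bounds. */
uint32_t slot_after_copy(const char *src, size_t slot_off)
{
    assert(src != NULL);
    assert(slot_off + sizeof(uint32_t) <= FRAME_SIZE);

    unsigned char frame[FRAME_SIZE] = {0};   /* authenticated starts as 0 */
    size_t n = strlen(src) + 1;              /* characters + terminator */
    if (n > FRAME_SIZE - BUFFER_OFF)
        n = FRAME_SIZE - BUFFER_OFF;
    memcpy(frame + BUFFER_OFF, src, n);

    uint32_t slot;
    memcpy(&slot, frame + slot_off, sizeof slot);
    return slot;
}
```

In this model, slot_after_copy("AuthMe!", AUTH_OFF) is non-zero, because the last four bytes of the string land in the authenticated slot, while slot_after_copy("abc", AUTH_OFF) stays zero because that string fits in the buffer. With authenticated in between, the eight bytes of "AuthMe!" stop short of the saved frame pointer, so slot_after_copy("AuthMe!", FP_OFF) is still zero.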
Now this is a problem, because every time we go to check authenticated, the value is non-zero and the check is going to succeed. So this mistake had a security-relevant outcome: it allows the program to do things that we probably didn't intend.
Could it be worse than this? Well, in fact, if we think about it, strcpy gives us the ability to copy any amount of data into a buffer that's not the right size. So basically, we could overwrite lots of memory on the stack. And the question is, what could you do with that ability if you were an attacker? Well, as we'll see, one thing the attacker can do is overwrite the buffer with code, and arrange for the program to execute that code when it returns from the function.
Now, before we see how that works, as an aside, let me point out that these examples provide their own strings simply as constants. But in reality, the issue is that strings come from users, and some of those users are malicious. For example, strings could come as textual input. They could come as packets, or environment variables, or input from files. It's very important that we validate our assumptions about user input. That is, we want to make sure that the input, for example, is not too long, or that it conforms to a certain structure that the program assumes. We'll discuss validating input assumptions later and throughout the course, because failing to do so turns out to be a mistake that programs make all the time, not just with buffer overflows.
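As a small illustration of checking the "not too long" assumption, here is one way a length check might be placed in front of the copy. This is only a sketch: copy_checked is a hypothetical helper, not something from the course or the C library, and it assumes src is already a properly terminated string.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copies src into dst only if the whole string,
   including its null terminator, fits in dst_size bytes.
   Returns 0 on success and -1 if the input is too long, in which case
   dst is left untouched. Assumes src is a terminated string. */
int copy_checked(char *dst, size_t dst_size, const char *src)
{
    assert(dst != NULL && src != NULL);

    size_t len = strlen(src);
    if (len >= dst_size)
        return -1;               /* reject instead of overflowing */
    memcpy(dst, src, len + 1);   /* len + 1 also copies the terminator */
    return 0;
}
```

With a 4-byte buffer, copy_checked(buffer, sizeof buffer, "AuthMe!") returns -1 and writes nothing, while a three-character string such as "abc" is copied and returns 0. Rejecting the input is one design choice; truncating it is another, but silent truncation can hide the fact that the program's assumption was violated.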