So let us begin our code generation journey. The first item on our agenda is handling variables. Here's an example of a high-level expression written in some typical high-level language. In this particular expression we have three variables: sum, x, and rate. Now, the code generator is going to take this expression and somehow, by some magic which we haven't yet discussed, translate it into VM code. The result will be the stream of VM instructions that you see here. You can go through these instructions and convince yourself that they end up computing the value of the desired expression and putting this value at the top of the stack. So we get exactly what we wanted, and yet we haven't discussed how the code generator accomplishes this magical transformation. This will be done in the next unit of this module. For now, let us focus only on the variables. So once again, we have sum, x, and rate. Now, remember that the VM language does not have symbolic variables. It only has things like local, argument, this, that, and so on. So in order to resolve this pseudocode into final executable VM code, I have to map these symbolic variables onto what we call the virtual memory segments. So I need some information. In order to generate code, I have to know whether sum, x, and rate are field, static, local, or argument variables. I need this information for every one of these variables. I also have to know whether each one is the first local variable, or the second local variable, and so on. Once I have this information, I can indeed go ahead and complete the translation, and the result will look something like this. Now, obviously, here I made some arbitrary assumptions. I assumed that x was argument 2. In other words, it was the third argument: zero, one, two. I assumed that rate was the first static variable in this class.
And so on. These are just arbitrary assumptions, but once again, I just want to illustrate the fact that I cannot complete the code generation task unless I know the properties of every one of the underlying variables. So what are these properties? Well, let's look at a more realistic example, a segment of Jack code. I'm going to highlight, as I did before, all the variables that appear in this code. In the previous unit, I told you that we can treat the class-level code and the subroutine-level code as two separate points of focus, so to speak. And indeed, here we have class-level variables, which in the Jack language are the field and static variables, and we have subroutine-level variables, which are the argument and local variables. These two categories capture all the possible variables in the Jack language. Now, every one of these variables has some properties. It has a name, it has a type, it has a kind, or role in the program, and it has a scope, which is the region of the code in which the variable is recognized. In the Jack language, we have more specific definitions of what each one of these things means. The name of the variable must be an identifier, something that holds in every high-level programming language that I can think of. The type in Jack can be either one of the three primitive types, int, char, or boolean, or it can be any class name in your application. So in theory, we can have an unlimited number of different types: three fixed primitive types, and as many class types as necessary. Then we have the kind of the variable, which is either field, static, local, or argument. And finally, we have the scope of the variable, which in the Jack language is very simple: you have only two scopes. You have a class-level scope, which is where the class-level variables are recognized, throughout the class.
And we have a subroutine-level scope, which covers only the specific subroutine that you are currently compiling. We'll have more to say about scoping rules later on in this unit. So taken together, what we have here is a bundle of variable properties that must be maintained for every variable that occurs in the source code. How should we do it? How can we manage this information about the variables? Well, the typical way of handling it is to use something called a symbol table. We'll illustrate the notion of symbol tables using this particular piece of code. Let us begin with the class-level variables, of which we have three. We have x and y, the coordinates of the point, and pointCount, which is a variable that counts how many points we have constructed so far. In order to represent these variables in a symbol table, we'll construct a table that looks like this. It has several columns. The first column records the name of each variable. The next one is the type of the variable, which in this example happens to be int all along. Then we have the kind of each variable, and a running index within that kind. In this example, we have two fields, field 0 and field 1, and one static, static 0. So this is the symbol table associated with the Point class, assuming that these are the only field and static variables in this class. Moving along, what about the subroutine that we see here? Well, if we look at it, we see what seem to be three variables. The first variable, other, is an argument variable. And then we have two local variables, dx and dy. But in fact, if you stop to think about it, things are slightly more subtle. If this were a function, or what is called in Java a static method, then indeed it would have had only three variables. But this particular subroutine is a method, and a method is always designed to operate on the current object.
And this object is always represented using a variable which, by convention, is called this. Therefore, the symbol table of this method begins with a somewhat surprising entry, which holds the properties of the current object. The name of the current object is this. The type of the current object is always the name of the class to which this subroutine belongs; in this case, it's Point. And this particular variable, this, is always treated as argument 0. It is treated as an argument which is passed to the method by the caller. It is a kind of implicit argument that we'll discuss at length later on in this module. So for now, you just have to remember that whenever you construct a symbol table for a method, you have to start it with this particular entry. The only thing that will change from this code to your code is the name of the class, because the name of your class may be different. Obviously, the compiler has to handle many different classes. But everything else is going to be fixed: this, argument, 0. The rest of the symbol table is straightforward. We have the variable other, which also happens to be of type Point; it's an argument, argument number 1. And then we have two local variables, very simple. So these are the two symbol tables that we need in order to manage the variables of this application. Now, the class-level symbol table can be reset each time we begin to compile a new class. I told you in the previous unit that in Jack, as well as in Java and in C#, classes are standalone compilation units. So whenever you finish compiling a class, you can throw away the symbol table of that class, and when you start compiling a new class, you can start fresh with a new class-level symbol table. Something similar happens when you compile subroutines. Each time you finish compiling a subroutine, you can throw away the symbol table of that subroutine and start fresh with a new one.
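To make the two tables concrete, here is a minimal Python sketch of how they might be built for the Point example. The `Symbol` and `SymbolTable` names and the `define` and `get` methods are hypothetical helpers chosen for illustration, not a prescribed API, and the `int` type given to dx and dy is an assumption, since their declared type is not stated above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Symbol:
    name: str
    type: str   # "int", "char", "boolean", or a class name
    kind: str   # "field", "static", "argument", or "local"
    index: int  # running number within its kind


class SymbolTable:
    """The variables of one scope (hypothetical helper for illustration)."""

    def __init__(self):
        self._symbols = {}
        self._counts = {}  # next free index for each kind

    def define(self, name, type_, kind):
        # Each kind keeps its own running index: field 0, field 1, static 0, ...
        index = self._counts.get(kind, 0)
        self._counts[kind] = index + 1
        self._symbols[name] = Symbol(name, type_, kind, index)

    def get(self, name):
        # Returns None when the name is not defined in this scope.
        return self._symbols.get(name)


# Class-level table for Point: two fields and one static variable.
class_table = SymbolTable()
class_table.define("x", "int", "field")
class_table.define("y", "int", "field")
class_table.define("pointCount", "int", "static")

# Subroutine-level table for the method that takes the argument `other`.
# Because it is a method, the table starts with the implicit `this` entry.
method_table = SymbolTable()
method_table.define("this", "Point", "argument")
method_table.define("other", "Point", "argument")
method_table.define("dx", "int", "local")  # type assumed
method_table.define("dy", "int", "local")  # type assumed

print(class_table.get("y"))
print(method_table.get("other"))
```

Starting a new class or a new subroutine then amounts to creating a fresh `SymbolTable()` and letting the old one go.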
So these are simplifying observations, because it turns out that when you compile anything in Jack, you only ever have to maintain two symbol tables, the class-level symbol table and the symbol table of the current subroutine, and that's it. Now, I'd like to say a few words about the viewpoint of the code writer, which is the agent responsible for creating and managing these symbol tables. The code writer is going to encounter all sorts of variable declaration statements, of which in Jack we have three: local variable declarations, field declarations, and static declarations. Whenever the code writer encounters an instance of any one of these declaration categories, it will extract from the statement all the important properties of the declared variable, and then add this information to the respective symbol table. If we are declaring a field or a static variable, the code writer will add a new row to the end of the class-level symbol table. If we are declaring a local variable or an argument, the code writer will update the symbol table of the subroutine which is currently being compiled. And by the way, one thing which is missing here is the treatment of arguments, right? We talked only about locals, fields, and statics. Well, arguments are declared as part of the parameter list in the subroutine's signature. So when the code writer goes through this parameter list, it also adds the respective rows to the subroutine's symbol table. Now, importantly, that's the only thing that the code writer does when it handles declarations. It generates no code whatsoever; it only updates the symbol tables. So it's a relatively simple task to handle. What about using variables within the context of expressions or statements? For example, let dx equal x minus the result of applying a get method to the other object. How should we handle the variables in this example? Well, here's what we do.
For each variable that appears in the source code, we have to look up this variable in the subroutine-level symbol table. If we find it there, fine, we know which properties we have to use. If we don't find it there, we fall back to looking it up in the class-level symbol table. And if we don't find it there either, we can conclude that this variable is undefined, and we can throw an error message. In this example, we see both possibilities. dx, for example, is a local variable, so we'll find it right away in the symbol table of the distance method. But x will not be found in the symbol table of distance, so we look it up in the class-level table, and indeed, we find that x is right there. So that's how the lookup algorithm works. All right, let me give you a somewhat more elaborate example to wrap up what we said so far. If we have some high-level statement like this one, let y = y + dy, how do we handle it? Well, once again, the compiler is going to translate this statement into VM code. How it does it is something that we'll discuss in the next unit. But in the process of generating this code, it will look up the respective tables. It will find out that y stands for the second field of the current object, field 1, and therefore it will translate y into this 1. It will find out that dy stands for the second local variable, and therefore it will map dy onto local 1. And so on and so forth. So that's how variable usage is handled. Now, I'd like to end this unit with several general observations about handling variables in programming languages. High-level programming languages vary in terms of the variable types, the variable kinds, and the scoping rules that they feature. The techniques that we discussed in this unit, most importantly the generation and usage of symbol tables, can be easily extended to handle any number of possible variable types and kinds.
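Returning to the let y = y + dy example, the two-step lookup can be sketched in Python as follows. This is only an illustration under stated assumptions: the tables are hard-coded as plain dictionaries, `resolve` and `SEGMENT` are hypothetical names, and the order of the push, add, and pop instructions is written out by hand here, since generating code for expressions is the subject of the next unit.

```python
# Kind -> VM memory segment. Fields of the current object live in `this`.
SEGMENT = {"field": "this", "static": "static",
           "argument": "argument", "local": "local"}

# name -> (type, kind, index), hard-coded for the Point example
class_table = {
    "x": ("int", "field", 0),
    "y": ("int", "field", 1),
    "pointCount": ("int", "static", 0),
}
subroutine_table = {
    "this": ("Point", "argument", 0),
    "other": ("Point", "argument", 1),
    "dx": ("int", "local", 0),  # type assumed
    "dy": ("int", "local", 1),  # type assumed
}


def resolve(name):
    """Return 'segment index' for a variable, trying the subroutine scope first."""
    for table in (subroutine_table, class_table):
        if name in table:
            _type, kind, index = table[name]
            return f"{SEGMENT[kind]} {index}"
    raise NameError(f"undefined variable: {name}")


# Hand-written instruction sequence for: let y = y + dy
vm_code = [
    f"push {resolve('y')}",
    f"push {resolve('dy')}",
    "add",
    f"pop {resolve('y')}",
]
print("\n".join(vm_code))
# prints:
# push this 1
# push local 1
# add
# pop this 1
```

Because the subroutine table is searched first, a local or argument with the same name as a field would take precedence over the field.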
In Java, for example, we have eight primitive data types, and in Jack, we have three. It's a very small detail. We can handle any other primitive type just as well with our symbol table mechanism. Likewise, if you have more kinds of variables, we can easily record this information in the symbol tables that we described. Maybe we'll need more columns, or something like that. It's a small detail. Nested scoping rules are somewhat more involved, and I'd like to say a few words about them. What is nested scoping to begin with? Well, some languages, of which Java is a good example, feature unlimited nested scoping. This means that whenever you define a block of code with a pair of curly brackets, you can define variables within this block which are recognized only within that block. So x in this block does not have to mean the same as x in that block, if you so desire as a high-level programmer. You can create as many variable scopes as you desire, which in some situations is quite helpful. And the question, of course, is how do we represent this information in our symbol table mechanism? Well, the classical solution is to use a linked list of symbol tables. And here's how you do it. When you start compiling the class, you create this linked list, and the first symbol table that you add to the list is the class-level symbol table. Then, when you start compiling a method, you create the method's symbol table, just like we did previously, and you add it to the head of the list. And then, whenever you enter another scoping region, you add another symbol table to the head of this linked list, so that you end up with a linked list of several such symbol tables. And now the question is, how do you use this linked list when you encounter some variable x in the code? Well, here's what you do. You start with the table at the head of the list, which represents the current scope, and you look up x there.
Failing to find x in the symbol table, you move to the next symbol table in the list and you look it up there. And you simply go on downstream in this linked list until you get to the class level symbol table. And if it's not there either, you can conclude safely that this variable is undefined and you can throw an error message. So as you see, this linked list data structure here captures very nicely this notion that the current scope hides all the scopes behind it. And that's how we handle nested scoping. This is something that we don't need at all when we write the compiler for the Jack language, because in Jack, we have only two symbol tables, class level, and subroutine level. And yet, I thought that it would be very relevant to talk about this extension at the end of this unit. So we know how to handle variables, and we can happily move on to talk about handling expressions.
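The linked-list lookup described above can be sketched in a few lines of Python. This is a sketch for a hypothetical language with block scopes, not something the Jack compiler needs; the `Scope` class, its `parent` link, and the sample entries are assumptions made for illustration.

```python
class Scope:
    """One symbol table in the chain; `parent` links to the enclosing scope."""

    def __init__(self, parent=None):
        self.parent = parent
        self.symbols = {}  # name -> (kind, index)

    def define(self, name, kind, index):
        self.symbols[name] = (kind, index)

    def lookup(self, name):
        # Walk from the current scope outward until the name is found.
        scope = self
        while scope is not None:
            if name in scope.symbols:
                return scope.symbols[name]
            scope = scope.parent
        raise NameError(f"undefined variable: {name}")


# Class-level scope sits at the far end of the list.
class_scope = Scope()
class_scope.define("x", "field", 0)

# Method scope, linked to the class scope.
method_scope = Scope(parent=class_scope)
method_scope.define("dx", "local", 0)

# A nested block that declares its own x, hiding the field.
block_scope = Scope(parent=method_scope)
block_scope.define("x", "local", 1)

print(block_scope.lookup("x"))   # ('local', 1): the block's x hides the field
print(method_scope.lookup("x"))  # ('field', 0): outside the block, x is the field
print(block_scope.lookup("dx"))  # ('local', 0): found one level up
```

With only a class scope and a method scope in the chain, this reduces to the two-table lookup used for Jack.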