In the previous lesson, we looked at a sorting problem. We took a well-known sorting algorithm, merge sort, which works in the internal-memory setting, and we adapted it to the external-memory setting so that it performs an optimal number of I/Os. In this lesson, we're going to take a well-known internal-memory data structure, the binary search tree, and we're again going to adapt it to the external-memory setting. So, remember that a dictionary is a data structure that stores a set of elements, where each element has a key, a number. The operations that the dictionary is supposed to support are the following. First of all, if you want to find an element with a certain key, then the data structure should be able to retrieve that element quickly, if it's present; if it's not present, it should tell you that the element is not in the set. You should also be able to insert and delete elements. In internal memory, there are lots of different data structures that serve as an efficient dictionary. One of them is a hash table; here, all the operations I just mentioned can be done in expected constant time. Another well-known data structure is the balanced binary search tree, where you can do all the operations in log n time, and a nice thing compared to hash tables is that you can actually do more: for instance, if an element is not present, you can find the next larger key, or you can find the element with the largest key or the element with the smallest key. Now, the question we're going to look at in this lesson is: what can we say about the I/O-efficiency of a binary search tree, and if it's not so good, how can we adapt the binary search tree to make it suited for external memory? Okay. So here, you see a binary search tree, and let's have a look at the number of I/Os that we perform when we do an operation on it.
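To make the dictionary operations concrete, here is a minimal sketch of the interface in Python. It uses a plain sorted list with the standard `bisect` module, so it is only an illustration of the operations, not the balanced-tree implementation the lesson is about (insert and delete on a list cost linear time, not log n); the class and method names are my own choice.

```python
import bisect

class SortedDictionary:
    """Toy dictionary on a sorted list: illustrates the dictionary
    operations from the lesson. Not a balanced tree -- insert/delete
    here are O(n), whereas a balanced BST does them in O(log n)."""

    def __init__(self):
        self.keys = []

    def find(self, key):
        # Is an element with this key present?
        i = bisect.bisect_left(self.keys, key)
        return i < len(self.keys) and self.keys[i] == key

    def insert(self, key):
        if not self.find(key):
            bisect.insort(self.keys, key)

    def delete(self, key):
        i = bisect.bisect_left(self.keys, key)
        if i < len(self.keys) and self.keys[i] == key:
            self.keys.pop(i)

    def successor(self, key):
        # Smallest stored key strictly larger than `key` -- the extra
        # operation a search tree supports but a hash table does not.
        i = bisect.bisect_right(self.keys, key)
        return self.keys[i] if i < len(self.keys) else None

    def minimum(self):
        return self.keys[0] if self.keys else None

    def maximum(self):
        return self.keys[-1] if self.keys else None
```

For example, after inserting 5, 2, and 9, `find(2)` succeeds, `successor(5)` returns 9, and `minimum()` returns 2.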
So typically, whether you are searching or inserting, an operation walks down this tree. The question is: how many blocks do you read as you walk down the tree? Well, that's going to depend on how the nodes of the tree are grouped into blocks. Which nodes are together in a block? So, it depends on the blocking strategy, but if you're unlucky, the blocking strategy could be such that every node is in a different block, and that means that the number of I/Os is the depth of the tree, which, if the tree is balanced, is order log base two of n. You might hope for a more clever blocking strategy that avoids this log base two, which is not what you want, but because of the pointers that you typically have in a binary search tree, the blocking is hard to control. Typically, when you create a new node, it ends up somewhere in memory that you don't really have much control over. So, our goal is now to change the search tree a little bit to make sure that we get not log base two but some different log. Okay. So, what's the main idea? Well, it's similar to what we saw with merge sort: we want a higher branching degree, and to get that, we put many elements in one node. A higher branching degree means a smaller depth, and that means fewer I/Os. So here, you see an example of what we want to do. We put lots of elements in one node, the branching degree is higher, and then the tree is less deep. So if the branching degree is k, then the depth of the tree is going to be log base k of n, instead of log base two of n.
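The effect of the branching degree on the depth is easy to check numerically. A small sketch, with n and k picked just for illustration:

```python
import math

def tree_depth(n, k):
    # Depth (number of levels of nodes) of a balanced k-ary
    # search tree storing n elements: ceil(log base k of n).
    return math.ceil(math.log(n, k))

n = 10**9
print(tree_depth(n, 2))    # binary tree: 30 levels
print(tree_depth(n, 512))  # branching degree 512: 4 levels
```

So raising the branching degree from 2 to k shrinks the depth by a factor of log base two of k, which is exactly the saving in block reads if each node occupies one block.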
The way the elements are put into the nodes again follows the search-tree property: within a node, the elements are in sorted order, and, for instance, the leftmost subtree contains all the elements that are smaller than the leftmost element in the node. So then, we have this search tree, the depth is log base k of n, and the main question is again: how large can we pick k? Okay. Now, remember what the goal was: when we walk down the tree, we do one I/O for each node. So, we want to make the nodes as big as possible such that they still fit in one block, so that for each node we still need only one I/O. That means that the number of elements that we put into a node should be roughly B. Okay, roughly. Now, let's have a look at the more formal definition, which makes this a little bit more precise. So a B-tree, this adaptation of a binary search tree to external memory, has, by definition, all its leaves at the same level. We also have that the nodes store roughly B elements; more precisely, we allow them to store anywhere between dmin minus 1 and 2 dmin minus 1 keys. That means that the degree of a node is between dmin and 2 dmin, and this dmin is chosen such that a node with this many elements still completely fits into one block. So, the 2 dmin minus 1 keys, together with the outgoing pointers, should all fit into one block, and this means dmin is going to be roughly something like B divided by 4. If we do this, then the depth of the tree will be log base B of n, and that means that the operations, which essentially walk down the tree, cost log base B of n I/Os.
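The search procedure that walks down such a tree can be sketched as follows. This is a minimal illustration of the search only, with my own names; balancing (node splits and merges on insert and delete) is deliberately omitted, as in the lesson:

```python
import bisect

class BTreeNode:
    """One node of a B-tree. A node's sorted keys plus its child
    pointers are meant to fit in a single disk block, so visiting
    a node costs one I/O."""
    def __init__(self, keys, children=None):
        self.keys = keys          # sorted list of keys
        self.children = children  # None for a leaf, else len(keys)+1 subtrees

def search(node, key):
    """Walk down from the root; one node (= one block read) per level."""
    while node is not None:
        i = bisect.bisect_left(node.keys, key)
        if i < len(node.keys) and node.keys[i] == key:
            return True
        if node.children is None:   # reached a leaf without finding the key
            return False
        node = node.children[i]     # descend into the i-th subtree
    return False
```

Since every node read is one I/O and the tree has depth log base B of n, this walk performs log base B of n I/Os.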
I'm not going to explain exactly how this tree is kept balanced, how you make sure that all the leaves stay at the same level and that the other conditions on the structure of the tree are satisfied; you can read about that in many books on algorithms if you want. But it's important to realize that the basic idea is to put lots of elements into one node to make the tree less deep and thereby do fewer I/Os, and you can do this in such a way that the depth is log base B of n. So a tree like this, which is called a B-tree, can perform all the operations we talked about earlier, searching for an element, inserting an element, deleting an element, finding the successor of an element, finding the maximum or the minimum element, in log base B of n I/Os. Then, it's important to realize that the difference between log base B and log base two is actually quite significant. Even if you have so many items that your tree does not fit in internal memory, say something like 3.5 times 10 to the power 13 items, then with B equal to 512, which is a typical value, this log base B of n is only 5. So, you would do only five I/Os when you walk down, and what you would typically do in addition is keep the root, and maybe also the nodes right below the root, so the first two levels, in internal memory, because they easily fit, and that means that for those you don't have to do any I/Os. So typically, you only do a few I/Os when you use a B-tree. If you would use a normal binary tree, then log base two of this number is actually 45. So, you speed things up by roughly a factor of 10. That's pretty nice. In the next lesson, we're actually going to look at a slightly different tree, called the buffer tree, which has even better performance.
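The numbers in this comparison are easy to verify; note that 3.5 times 10 to the power 13 is almost exactly 2 to the power 45, which is presumably why it was chosen as the example:

```python
import math

n = 2**45   # about 3.5 * 10**13 items
B = 512     # branching degree of the B-tree

binary_depth = math.log2(n)   # 45 block reads if every BST node is its own block
btree_depth = math.log(n, B)  # log base 512 = 45 / 9 = 5 block reads
print(binary_depth, btree_depth)
```

Caching the top one or two levels of the B-tree in internal memory shaves off another one or two of those five block reads, which is why in practice a lookup costs only a few I/Os.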
So here, the number of I/Os for one operation is one over B times the log that you see, and what is interesting is that this quantity is typically smaller than one. That means that on average, if you spread the cost out over many operations, the number of I/Os per operation is smaller than one. How we achieve that, you can hear about in the next lesson.