Hello. Today we'll talk about genome rearrangements, focusing on the question: how do you transform men into mice? Ten years ago, when my daughter was just ten years old, she asked what I do at work, and I told her, "I'm transforming men into mice." She wasn't satisfied and asked for a clarification, so I had to tell her a story.
Men and mice may look different, but genetically they are very similar. For nearly every gene in man, there is a similar gene in mouse. Mice, of course, outperform us with respect to smell genes, and we outperform mice a little bit with respect to brain-related genes. But overall, you can say that men and mice have roughly the same set of genes. These genes, however, are arranged differently in the human and mouse genomes. That's why, when explaining to my daughter how men and mice differ genetically, I told her: take the 23 human chromosomes, cut them into 280 pieces, shuffle these pieces, and glue them together in a new order into 20 mouse chromosomes. You will get a mouse genome. She seemed satisfied, but she asked: if you can transform men into mice, can you transform mice into men as well? And I responded, of course, it's very easy. You just need to reverse this operation of cutting and gluing, and starting from mice you will get men.
So today we'll focus on a slightly simpler problem: transforming the mouse X chromosome into the human X chromosome. X chromosomes in mammals are special because genes do not jump from the sex chromosome to other chromosomes. Therefore you can think about X chromosomes in mammals as separate sub-genomes, which makes them a little easier to analyze. And it turns out that the human and mouse X chromosomes, despite being very long strings, 150 million nucleotides long, can be thought of as sequences of just 11 large segments. Each of these segments may contain hundreds of genes, but within each segment the genes are arranged in a very similar way in human and mouse.
However, these segments, called synteny blocks, are arranged differently in mouse and human, and a number of questions arise. First, how can we transform a long strand of 150 million nucleotides into just 11 synteny blocks? And second, what is the evolutionary scenario that nature used to transform the mouse arrangement of blocks into the human arrangement? You may notice that I show every block as a directed block, oriented either to the left or to the right. I will explain later what precisely these directions mean, but you may recall that the two complementary strands of DNA run in opposite directions. Depending on which strand a gene is located on, we can assign the gene an orientation: left or right, or plus or minus, as shown on this slide.
Now, nature doesn't use the dramatic cut-and-glue operations that I described when I was explaining the process to my daughter. It uses a simpler operation called a "reversal". A reversal simply takes a segment of the genome and flips it over, like this, reversing both the order and the directions of all blocks within the segment. Let's see, step by step, what this particular evolutionary scenario for transforming mouse into human amounts to. In the first step, we simply reverse the orientation of block 6. In the next step, we reverse the orientation of block 9. Then we take two blocks and reverse their orientation, and continue, continue, continue, until we transform the mouse gene arrangement into the human gene arrangement on the X chromosome.
I emphasize that this is just a hypothetical scenario. Nobody knows today what really happened during the 75 million years of evolution in which nature has been transforming the mouse gene arrangement into the human gene arrangement. But if the scenario that I've shown here were correct, then one of the intermediate arrangements of blocks would correspond to the arrangement of blocks on the X chromosome of the human-mouse ancestor.
That's shown here. We have to realize that on the way from mouse to the human-mouse ancestor we are actually moving back in time, and then from the human-mouse ancestor to human we are moving forward 75 million years in time.
Now, rearrangements are, of course, dramatic events within genomes. You can think about rearrangements as earthquakes, because many bad things may happen. For example, every reversal has two endpoints. When a reversal happens, these endpoints may disrupt a gene, or they may bring a gene into completely foreign territory and put it under the influence of the wrong transcription factor, thus disrupting gene regulation.
Now, if rearrangements can be compared to earthquakes, we know that earthquakes do not happen at random points. For example, where I live, near Los Angeles, earthquakes are common, but in most other places they are extremely rare. What about genomes? Are there rearrangement hotspots, or fragile regions, in mammalian genomes, and in the human genome in particular? Also, so far we have been talking about evolutionary rearrangements that happen on a million-year scale. But rearrangements also happen on a much smaller scale, during human development. So we may also ask: do rearrangement hotspots in mammalian evolution, if they exist, correlate with the rearrangement hotspots and fragile regions in humans that we will be talking about a bit later?
Now, let's look at whether there are rearrangement hotspots in the scenario that we described, and let's proceed step by step. This is our first reversal, and the two earthquakes corresponding to its endpoints are shown on this slide. Next reversal, two more earthquakes. Next one, two more earthquakes. And again, every time we perform a reversal, two earthquakes happen.
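To make the reversal operation and its two endpoints concrete, here is a minimal Python sketch. It assumes blocks are written as signed integers (plus or minus for orientation) and that an endpoint can be identified simply by the index of the boundary it cuts in the current arrangement. That second assumption is a simplification for illustration only. The names `apply_reversal`, `reversal_endpoints`, and `run_scenario`, and the toy arrangement, are hypothetical and are not the actual mouse or human block order.

```python
from collections import Counter


def apply_reversal(blocks, i, j):
    """Reverse positions i..j (0-based, inclusive) and flip each block's sign."""
    flipped = [-b for b in reversed(blocks[i:j + 1])]
    return blocks[:i] + flipped + blocks[j + 1:]


def reversal_endpoints(i, j):
    """Return the two boundaries cut by the reversal.

    Boundary k is the gap just before position k, so reversing
    positions i..j cuts boundaries i and j + 1.
    """
    return (i, j + 1)


def run_scenario(blocks, reversals):
    """Apply the reversals in order and count how often each boundary is cut."""
    hits = Counter()
    for i, j in reversals:
        blocks = apply_reversal(blocks, i, j)
        hits.update(reversal_endpoints(i, j))
    return blocks, hits


# Toy example (hypothetical arrangement, not real data).
start = [1, -3, -2, 5, 4, 6]
scenario = [(1, 2), (3, 4), (3, 3), (4, 4)]
final, hits = run_scenario(start, scenario)
reused = sorted(k for k, n in hits.items() if n > 1)
print(final)   # [1, 2, 3, 4, 5, 6]
print(reused)  # [3, 4, 5]
```

In this toy scenario the boundaries cut more than once are the analogue of repeated earthquakes in the same place. As with the scenario in the lecture, a different sequence of reversals could reach the same final arrangement with different repeated boundaries.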
Note that I marked the positions of these earthquakes, the endpoints of reversals, by vertical gold streaks. Now we have two gold streaks in the same region: this is a rearrangement hotspot. By the time we finish transforming the mouse X chromosome into the human X chromosome, we will count three regions with repeated earthquakes. Can we deduce from this that there are fragile regions in the human genome where rearrangements happen over and over again? Of course not. This is just a hypothetical scenario, and the real scenario may be completely different. Also, statistically, we cannot make a judgment based on such a small sample.
Forty years ago, the prominent biologist Susumu Ohno came up with the random breakage model of chromosome evolution. Ohno argued that since rearrangements are so rare, they must occur at random positions in the chromosome, implying that there are no fragile regions in the human genome. Honestly, Ohno hardly had any information to support this model. But 30 years ago, Nadeau and Taylor gave the first statistical argument in favor of the random breakage model. You may be wondering how one can make a statistical argument about something that happened millions of years ago without even knowing the evolutionary scenario that transformed one genome into another.
Nadeau and Taylor asked a question: does the random breakage model have predictive power, despite the fact that, according to this model, rearrangements happen at random places? They suggested the following sort of experiment. Let's apply N random reversals to a fake chromosome consisting of M genes. Can we predict how many blocks of a given length will be generated as a result of this random experiment? If we can, will this prediction fit what we observe in real genomes? And what if it does fit what we observe in real genomes?
Then we have evidence in support of the random breakage model. For example, what would you expect after applying N random reversals to a chromosome? Would we expect roughly the same number of blocks of every length, or would we expect something like this, where the number of blocks varies with length? It turns out that we expect something similar to this last slide, and that, despite the fact that reversals occur at random positions, we can predict roughly how many blocks of each length will be generated. For example, if we repeat this experiment 100 times, the average number of blocks of a given size follows this distribution. This distribution is very well approximated by the exponential distribution, the approximation curve shown on this slide.
Once Nadeau and Taylor had figured out what the random breakage model looks like on simulated examples, they compared it with the synteny block distribution for real human and mouse data, even though in 1984, when their work came out, there was very little information about the lengths of human-mouse synteny blocks. Even at that time, the lengths of the known synteny blocks fit the exponential distribution quite well. Ten years after the Nadeau and Taylor work, when the amount of human-mouse data on comparative architecture had increased tenfold, people built the same diagram and saw that it fit the exponential distribution even better. So the Nadeau and Taylor prediction had almost prophetic power. As a result, starting in the 1990s, the random breakage model was embraced by biologists and became the de facto theory of chromosome evolution.
Now that we have described the random breakage model, we will look into the algorithmic aspects of genome rearrangements and talk about sorting by reversals.
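The random breakage experiment described above (N random reversals applied to a fake chromosome of M genes) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Nadeau and Taylor's actual procedure: genes are numbered 1..M, each reversal picks two distinct cut points uniformly at random, and a block is a maximal run of genes that are still adjacent and in the same relative order as in the original chromosome. The function names and the parameter values are hypothetical.

```python
import random
from collections import Counter


def random_reversals(m, n, rng):
    """Apply n uniformly random reversals to the chromosome 1..m (assumes m >= 1)."""
    genes = list(range(1, m + 1))
    for _ in range(n):
        # Two distinct cut points among the m + 1 boundaries.
        i, j = sorted(rng.sample(range(m + 1), 2))
        genes[i:j] = [-g for g in reversed(genes[i:j])]
    return genes


def block_lengths(genes):
    """Lengths of maximal runs that survived intact (assumes a non-empty list).

    Neighbours a, b stay in one block when b - a == 1, which covers
    both (+k, +(k+1)) and the flipped form (-(k+1), -k).
    """
    lengths = []
    run = 1
    for a, b in zip(genes, genes[1:]):
        if b - a == 1:
            run += 1
        else:
            lengths.append(run)
            run = 1
    lengths.append(run)
    return lengths


def average_histogram(m, n, trials, seed=0):
    """Average number of blocks of each length over repeated experiments."""
    rng = random.Random(seed)
    totals = Counter()
    for _ in range(trials):
        totals.update(block_lengths(random_reversals(m, n, rng)))
    return {length: count / trials for length, count in sorted(totals.items())}


# Hypothetical parameters: 1000 genes, 100 reversals, 100 repetitions.
histogram = average_histogram(m=1000, n=100, trials=100)
```

Plotting `histogram` (block length against average count) is one way to compare the simulated distribution with an exponential curve, as the lecture does on the slide.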