So in this lecture I'd like to think about where these imprinted genes are located throughout the genome. In the human there are around 150 imprinted genes, and in the mouse, perhaps 130, although it seems that we're discovering more imprinted genes all the time. But these 150 or 130 genes are not spread evenly throughout the genome, each existing on its own. Rather, imprinted genes exist in clusters. Each cluster of imprinted genes has one imprint control region that controls the expression of the surrounding imprinted genes. In the simplest scenario, like I've drawn here, there's an imprint control region that I'm showing as a white box, and it can be unmethylated or methylated. In this case, the maternal allele is methylated, and therefore we would class this cluster as maternally imprinted. On the unmethylated allele we see paternal expression. So we see expression of these three clustered genes, all of which are controlled by that imprint control region. What's important to remember about imprinting is that, unlike X inactivation, it's not the same in every tissue. If you looked at X inactivation in the embryo or in the adult female, then every cell would indeed display X inactivation. But with parental imprinting, this is not the case. We find that although the imprint control region methylation is present in every tissue, it doesn't necessarily result in imprinted expression of the genes within that cluster in every tissue. So it's tissue-specific. This depends on the way that the imprint control region works, which we'll deal with in the subsequent lectures. But just remember that it's not happening in every tissue at every time. Not all genes are expressed in an imprinted way all the time. Rather, it can happen differently in each tissue. And the two tissues where we see the most imprinted expression are the placenta and the brain.
So if we now have a look at where these imprinted genes exist, this is a picture of the 19 mouse autosomes, and you can see the clusters that exist. While we have some genes that exist by themselves, for example Qpct or Fbxo40, we know that in the vast majority of cases these imprinted genes exist in a cluster. See, for example, this large cluster on chromosome 12 and a very large number of clusters on chromosome 7. Within each cluster, we tend to have both genes that are expressed from the maternal chromosome, shown in red, and genes that are expressed from the paternal chromosome, shown in blue. So this tells you that although the imprint control region is the same for the whole cluster, it doesn't necessarily have the same effect on all genes within the cluster. And this is, again, something that is common to all of these clusters. How this comes about depends on the individual cluster that you examine, and we'll think about three clusters over the next two lectures. But now I'd like to take a little time to think about how we actually measure DNA methylation technically. The papers that you'll read with this course mention different ways that DNA methylation is measured, and perhaps the most common method is bisulfite sequencing. While there are a large number of new and emerging techniques to look at DNA methylation, one of the older and perhaps more traditional methods uses bisulfite conversion. Bisulfite conversion is a chemical conversion of the DNA which allows you to discriminate between cytosine and 5-methylcytosine. In bisulfite sequencing, you then go on and amplify this DNA by polymerase chain reaction, and then look at the sequence that results.
In other methods, you can still use the bisulfite conversion to begin with, but you don't necessarily perform the same amplification and sequencing afterwards. The way that bisulfite conversion works is that you have either a cytosine or a 5-methylcytosine. When you add bisulfite to these bases, shown here, the cytosine is subject to sulphonation and a sulphonate group is added. However, the methyl group on the 5-methylcytosine precludes this from occurring, and so 5-methylcytosine is retained as such. The cytosine sulphonate then goes through a deamination phase to uracil sulphonate, and finally to uracil. So the difference between an originally unmethylated cytosine and 5-methylcytosine is that the cytosine will end up as uracil while 5-methylcytosine will stay as 5-methylcytosine, and this chemical difference between the two is used to discriminate which of them occurred in the underlying sequence. For bisulfite sequencing, you then take this DNA and amplify it before sequencing. So here, if we consider a very short sequence with one CpG dinucleotide inside it, it can be either unmethylated, as at the top, or methylated, as shown at the bottom. If it's methylated, then through this bisulfite conversion process it's retained as 5-methylcytosine, whereas the unmethylated version is converted to uracil. In fact, all cytosines that are unmethylated, whether they're in the context of a CpG or not, will be converted to uracil. Then during the PCR amplification phase, we only provide the nucleotides A, C, G and T, and so U will be amplified as T, because of course these are equivalent bases for base pairing. So now you can discriminate whether the original DNA was methylated or unmethylated based on whether at the CpG you find a CG, meaning that it was a methylated CpG, or a TG, because it was unmethylated. There are different ways that we can actually sequence the resulting DNA.
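As a rough illustration of the conversion logic just described, here is a minimal Python sketch. The function names, the example sequence and the methylated position are all hypothetical and chosen only for illustration. It models a single strand and ignores real-world issues such as incomplete conversion.

```python
def bisulfite_convert(seq, methylated_positions):
    """Return the sequence that would be read after bisulfite conversion and PCR.

    seq: DNA sequence using the letters A, C, G, T.
    methylated_positions: set of 0-based indices that carry 5-methylcytosine.
    """
    out = []
    for i, base in enumerate(seq):
        if base == "C" and i not in methylated_positions:
            # Unmethylated C is converted to U, and U is amplified as T by PCR.
            out.append("T")
        else:
            # 5-methylcytosine is protected and is still read as C.
            out.append(base)
    return "".join(out)


def call_cpg_methylation(reference, read):
    """For each CpG in the reference, report True if the read kept the C."""
    calls = {}
    for i in range(len(reference) - 1):
        if reference[i:i + 2] == "CG":
            calls[i] = read[i] == "C"
    return calls


# Hypothetical short sequence with one CpG (index 2) and one non-CpG C (index 5).
reference = "ATCGTCA"
unmethylated_read = bisulfite_convert(reference, set())
methylated_read = bisulfite_convert(reference, {2})

print(unmethylated_read)  # ATTGTTA
print(methylated_read)    # ATCGTTA
print(call_cpg_methylation(reference, unmethylated_read))  # {2: False}
print(call_cpg_methylation(reference, methylated_read))    # {2: True}
```

Note that the non-CpG cytosine at index 5 reads as T in both cases, which matches the point that every unmethylated cytosine is converted, whether or not it sits in a CpG.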
The way I'm going to mention is clonal bisulfite sequencing. Each of the different DNA molecules produced by the polymerase chain reaction can be sequenced individually. You isolate each fragment and treat each of these fragments separately by inserting it into a plasmid backbone, and this is called a clone. Each clone is then sequenced. The advantage of this approach is that you can treat each of the DNA molecules separately, in a linear fashion. This means that if there is a sequence polymorphism that you can identify in this clonal sequencing, you'll be able to see whether it was the paternal or the maternal chromosome that you're looking at. This of course is very useful for parental imprinting, but it's also useful for X inactivation and other studies. We can also compare the relationship between neighboring CpG dinucleotides. So, do you have one that's methylated, the next one down unmethylated, and the one after that methylated again? Or are they all methylated, or all unmethylated? This linear relationship within each clone can tell you about the haplotype, or the epigenetic state, of multiple neighboring CpGs. They tend to be shown in the kind of fashion that's shown here, with filled-in black circles for methylated CpG dinucleotides and open circles for unmethylated CpG dinucleotides. So consider this imprint control region, which I drew in a previous slide. We know that it's methylated at the three CpG dinucleotides I've shown on the maternal allele and unmethylated on the paternal allele. The way that this may have been determined would be to bisulfite-convert the DNA, amplify this region, that is, the imprint control region in this case, and then sequence individual clones.
If there's a single nucleotide polymorphism that differs between the paternal allele, shown here as a blue single nucleotide polymorphism, and the maternal allele, shown as a pink single nucleotide polymorphism, then you can determine which methylation pattern was found in clones from each parent. In this case you can see that all of the clones that were derived from the maternal allele had methylation of all three CpG dinucleotides, whereas all of the clones that were derived from the paternal chromosome, with the blue single nucleotide polymorphism, are completely unmethylated. So clonal bisulfite sequencing has the advantage that we can discriminate parental alleles and also that we can compare the methylation at all these sites. If we just took the PCR product and sequenced it without this cloning stage, then what you would see is 50% methylation at all three CpGs. But you wouldn't know whether that's because every molecule was on average 50% methylated, or because one allele was completely methylated and the other allele was completely unmethylated. So hopefully you can see the advantage of this approach. In some of the papers, you should see bisulfite sequencing being performed similarly to this, and so you'll understand, hopefully, what they're talking about. In the next lecture, we'll think about how imprint control region methylation brings about imprinted gene expression. So how is it that this difference in DNA methylation at this particular region for each cluster brings about that parent-of-origin-specific monoallelic gene expression?
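The comparison between clonal sequencing and sequencing the pooled PCR product can be sketched in a few lines of Python. Everything in this example is assumed for illustration: the clone data are made up, and the SNP bases "A" (maternal) and "G" (paternal) are hypothetical stand-ins for the pink and blue polymorphisms on the slide.

```python
from collections import defaultdict

# Hypothetical clones: (base observed at the SNP, methylation call at each of 3 CpGs).
clones = [
    ("A", [True, True, True]),
    ("A", [True, True, True]),
    ("A", [True, True, True]),
    ("G", [False, False, False]),
    ("G", [False, False, False]),
    ("G", [False, False, False]),
]

# Assumed mapping from SNP base to parent of origin.
SNP_TO_PARENT = {"A": "maternal", "G": "paternal"}


def fraction_methylated(calls_per_clone):
    """Per-CpG fraction of clones in which that CpG is methylated."""
    n = len(calls_per_clone)
    return [sum(column) / n for column in zip(*calls_per_clone)]


def lollipop(calls):
    """Draw one clone as filled (methylated) and open (unmethylated) circles."""
    return "".join("●" if methylated else "○" for methylated in calls)


# Group clones by parental allele using the SNP.
by_parent = defaultdict(list)
for snp, calls in clones:
    by_parent[SNP_TO_PARENT[snp]].append(calls)

# What direct sequencing of the pooled PCR product would suggest.
pooled = fraction_methylated([calls for _, calls in clones])
# What clonal sequencing reveals once clones are split by allele.
per_allele = {parent: fraction_methylated(c) for parent, c in by_parent.items()}

for snp, calls in clones:
    print(SNP_TO_PARENT[snp], lollipop(calls))
print(pooled)      # [0.5, 0.5, 0.5]
print(per_allele)  # {'maternal': [1.0, 1.0, 1.0], 'paternal': [0.0, 0.0, 0.0]}
```

The pooled figure of 50% at each CpG is the same number you would get if every molecule were half methylated on average, so only the per-allele breakdown shows that one allele is fully methylated and the other fully unmethylated.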