Now that we have DNA-based and other molecular tools to study microbes, we have found that organisms we would classify as the same, based on what they do on an agar plate or how they look under the microscope, can actually be wildly different. For example, two bacteria of the same type, such as the well-known E. coli, may share only about 40% of their genes. This is because bacterial genomes contain some genes that can be deleted, inserted, or duplicated quite easily. Pretty amazing when you consider that you share about 98% of your genome with chimpanzees, and about 90% with mice.
This can make classifying microbes challenging. On the one hand, we can look at their taxonomy. This is what we call them based on where they appear on the tree of life, or their evolutionary history. Molecular studies using the 16S gene, which encodes the 16S rRNA molecule, are able to do just that. Because this gene is so essential to the cell, it is highly conserved across the domains of Bacteria and Archaea. As we heard last week from Catherine, the 16S gene is an excellent marker for tracing the evolutionary history of bacteria and archaea, and thus their taxonomy. On the other hand, we can use the various omics approaches that I talked about last week. These approaches allow us to classify microbes based on what they do, or their functions, instead of their evolutionary history.
In today's lecture, we will focus on using the 16S gene for assigning taxonomy. Taxonomy is like an address that places each organism on the tree of life. When we process and sequence samples, such as a fecal swab, we obtain short DNA sequences that act like zip codes for the microbes in the sample, allowing us to put them in the right neighborhood on the tree of life and thus figure out what they are. Taxonomy has many levels, just like addresses do. We can group people who live on the same street, or in the same city, or the same country, together to describe them.
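The address analogy can be sketched in a few lines of Python. This is only an illustration, not part of any real pipeline: the function name `group_by_rank` is hypothetical, and the four organisms and their lineages are hand-picked examples written in one commonly used naming scheme. Each lineage is stored like an address, from broadest to most specific, and organisms are grouped by whichever level we choose.

```python
from collections import defaultdict
from typing import Dict, List

# Taxonomic ranks from broadest to most specific (like country down to street).
RANKS = ["domain", "phylum", "class", "order", "family", "genus", "species"]

# Illustrative lineages (assumed for this sketch), one entry per rank in RANKS.
LINEAGES: Dict[str, List[str]] = {
    "Escherichia coli": ["Bacteria", "Proteobacteria", "Gammaproteobacteria",
                         "Enterobacterales", "Enterobacteriaceae",
                         "Escherichia", "coli"],
    "Salmonella enterica": ["Bacteria", "Proteobacteria", "Gammaproteobacteria",
                            "Enterobacterales", "Enterobacteriaceae",
                            "Salmonella", "enterica"],
    "Bacillus subtilis": ["Bacteria", "Firmicutes", "Bacilli", "Bacillales",
                          "Bacillaceae", "Bacillus", "subtilis"],
    "Staphylococcus aureus": ["Bacteria", "Firmicutes", "Bacilli", "Bacillales",
                              "Staphylococcaceae", "Staphylococcus", "aureus"],
}


def group_by_rank(lineages: Dict[str, List[str]], rank: str) -> Dict[str, List[str]]:
    """Group organisms by the name they carry at the given taxonomic rank."""
    index = RANKS.index(rank)  # raises ValueError for an unknown rank
    groups: Dict[str, List[str]] = defaultdict(list)
    for organism, lineage in lineages.items():
        groups[lineage[index]].append(organism)
    # Sort members so the output is stable and easy to compare.
    return {name: sorted(members) for name, members in groups.items()}


if __name__ == "__main__":
    for rank in ("domain", "phylum", "family"):
        print(rank, group_by_rank(LINEAGES, rank))
```

Grouping at a broad level such as domain puts all four organisms together, while grouping at family separates Bacillus subtilis from Staphylococcus aureus even though they share the same order.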
Similar to house addresses, taxonomic levels range from very broad to very specific: domain, phylum, class, order, family, genus, and species. For example, here's the full taxonomic information for the genus Prevotella, which is in the family Prevotellaceae and the order Bacteroidales. The class and phylum have the same name in this case, Bacteroidetes, and finally Prevotella falls in the domain Bacteria. As we heard about in week two, Carl Woese's pioneering work with 16S [INAUDIBLE] revealed that there are three domains of life, not two, as previously believed. Bacteria collectively form one of these domains, and Archaea and Eucarya are the other two. There are organisms from all of these domains in your gut, but as we said before, the majority are bacteria.
When describing total bacterial communities, we often group them by phylum, which is a very high level of taxonomic designation, kind of like country if we're still talking about addresses. For example, butterflies and lobsters are both in the phylum Arthropoda, and both fish and dogs are part of the phylum Chordata. Since microbes are so diverse and samples can contain hundreds or thousands of species, grouping by phylum allows us an initial overview of the community composition. In their upcoming lecture, Rob and Will Van Treuren will talk more about these types of plots, which show the relative abundance of bacterial phyla in a sample.
We're used to seeing names of bacteria as two words, genus and species, like Escherichia coli, or E. coli. Genus- and species-level resolution is often hard to achieve in these studies, so we often group organisms into operational taxonomic units, or OTUs. We do this because it's difficult to know when a bacterium should be considered a different species or genus or even family. So we use DNA sequence similarity to help us standardize our groupings. OTUs are created by grouping 16S sequences based on how similar they are to each other.
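Grouping 16S sequences into OTUs by similarity can be sketched as a simple greedy clustering. This is a minimal sketch under stated assumptions, not the method used by any particular tool: the function names (`identity`, `cluster_otus`, `mutate`) are hypothetical, the reads are toy sequences of equal length, similarity is a plain position-by-position match instead of a real alignment, and the 97% threshold is assumed because it is a common choice in 16S studies.

```python
from typing import Dict, List, Tuple


def identity(a: str, b: str) -> float:
    """Fraction of matching positions between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("this toy identity needs equal-length sequences")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def cluster_otus(seqs: Dict[str, str], threshold: float = 0.97) -> List[List[str]]:
    """Greedy clustering: a sequence joins the first OTU whose seed it
    matches at >= threshold; otherwise it becomes the seed of a new OTU."""
    otus: List[Tuple[str, List[str]]] = []  # (seed sequence, member ids)
    for seq_id, seq in seqs.items():
        for seed, members in otus:
            if identity(seq, seed) >= threshold:
                members.append(seq_id)
                break
        else:  # no existing OTU was similar enough
            otus.append((seq, [seq_id]))
    return [members for _, members in otus]


def mutate(seq: str, positions: List[int]) -> str:
    """Return a copy of seq with a different base at each given position."""
    swap = {"A": "C", "C": "G", "G": "T", "T": "A"}
    bases = list(seq)
    for p in positions:
        bases[p] = swap[bases[p]]
    return "".join(bases)


# Toy data (made up for illustration): one 100-base read and two variants.
base = "ACGT" * 25
reads = {
    "read1": base,
    "read2": mutate(base, [3, 50]),              # 98% identical to read1
    "read3": mutate(base, [0, 10, 20, 30, 40]),  # 95% identical to read1
}

if __name__ == "__main__":
    print(cluster_otus(reads))
```

With these toy reads, read2 falls within the threshold of read1 and joins its OTU, while read3 does not and starts a new one. Greedy clustering depends on the order in which sequences are seen, which is one reason real tools handle ordering and alignment more carefully.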
Because the 16S gene is so highly conserved in general, two organisms that are closely related will have highly similar 16S gene sequences. As an example, let's look at some words in different languages. Suppose we decided to compare the word for turtle in 13 different languages. We can probably guess, from how similar the different words are, which languages are most closely related and came from a common ancestor. We can group them together. This is basically how we form OTUs, except the process is quantitative. That is, we require some percentage of match in the DNA sequence. Scientists commonly group 16S sequences at 97% similarity to form 97% OTUs.
If we group sequences against a reference database for which taxonomy has been assigned, we can then use this information to assign taxonomy to our sequences. Importantly, we also place all of our sequence data in the tree of life, which allows us to perform additional assessments of the community structure. We will hear more about these methods in the upcoming lecture with Rob and Will. We will also hear more about the process of assigning taxonomy in the upcoming interview with Professor Phil