October 2020

This week was spent using the CLC software to visualize and analyze genomic data from some given samples (not data from our sourdough starters).

Working with CLC wasn’t too bad. It took me around 30-40 minutes to work through all the graphs and absorb what I could from the data. Of course, that’s not including the time I had to wait for the data to be uploaded and whatnot! Personally, I really enjoy looking at graphs and statistical data like this, so I thought most of the stuff done using the CLC software was pretty cool.

This graph above represents the number of OTUs in the samples that correspond to specific groups or types of bacteria. In this case, OTUs were only defined if they were 97% or more similar to the reference data used. The inner ring shows that 99% of OTUs in the sample come from bacteria, and the outer ring shows what I believe to be the different phyla of bacteria that make up all the bacteria in the sample.
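To make the 97% cutoff a bit more concrete, here is a minimal Python sketch of the idea. It is not how CLC does it internally; the function names (`percent_identity`, `assign_otu`) and the toy sequences are made up for illustration, and it assumes the sequences have already been aligned to the same length.

```python
def percent_identity(a: str, b: str) -> float:
    """Percent of matching positions between two pre-aligned, equal-length sequences."""
    if not a or len(a) != len(b):
        raise ValueError("sequences must be non-empty and aligned to the same length")
    matches = sum(1 for x, y in zip(a, b) if x == y)
    return 100.0 * matches / len(a)


def assign_otu(read: str, references: dict, threshold: float = 97.0):
    """Return the name of the best-matching reference, or None if below the threshold."""
    best_name, best_score = None, 0.0
    for name, ref in references.items():
        score = percent_identity(read, ref)
        if score > best_score:
            best_name, best_score = name, score
    return best_name if best_score >= threshold else None


# Hypothetical 100-base reference and two made-up reads.
reference = "ACGT" * 25
references = {"otu_1": reference}
read_3_mismatches = "CGT" + reference[3:]   # 97% identical
read_4_mismatches = "CGTA" + reference[4:]  # 96% identical

print(assign_otu(read_3_mismatches, references))
print(assign_otu(read_4_mismatches, references))
```

The first read just meets the cutoff and gets assigned to the reference, while the second falls one base short and is left unassigned.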

When more rings are added to the graph (not pictured), it becomes clear just how many different species of bacteria are present in the sample. By the time the last ring is broken up, there are well over 100 different pieces, each representing an individual species. The diversity is staggering!
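Each ring is really just the same OTU counts grouped at a deeper taxonomic level. Here is a rough Python sketch of that roll-up. The lineage strings and counts are invented toy numbers, not the actual CLC sample data, and `rollup` is a hypothetical helper name.

```python
from collections import Counter


def rollup(otu_counts: dict, depth: int) -> dict:
    """Sum counts over the first `depth` levels of each semicolon-separated lineage."""
    if depth < 1:
        raise ValueError("depth must be at least 1")
    totals = Counter()
    for lineage, count in otu_counts.items():
        key = ";".join(lineage.split(";")[:depth])
        totals[key] += count
    return dict(totals)


# Invented example: kingdom;phylum;genus with made-up counts.
toy_counts = {
    "Bacteria;Firmicutes;Lactobacillus": 60,
    "Bacteria;Firmicutes;Pediococcus": 25,
    "Bacteria;Proteobacteria;Acetobacter": 14,
    "Eukaryota;Ascomycota;Saccharomyces": 1,
}

print(rollup(toy_counts, 1))  # inner ring: one slice per kingdom
print(rollup(toy_counts, 2))  # next ring: one slice per phylum
```

Going one level deeper never changes the total, it only splits each slice into smaller pieces, which is why the outermost ring ends up with so many of them.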

I have decided to keep the same three questions from last week because I think they’re all super interesting questions that could be answered using our data.

Does adding fruit, such as a banana, change the microbial composition of a starter?

What sorts of “outliers” will be discovered? What kind of unique or interesting microbes will be found in some classmates’ starter samples?

Will all of my classmates’ control cups be genomically very similar? Or will they be different based on the environmental conditions they grew in while in our homes?

Now that DNA sequencing of the sourdough starters has begun, it’s time to talk about what exactly is being sequenced. Two specific genes/regions within the genomes are being targeted.

16S rRNA gene: found in bacterial cells

ITS region: found in eukaryotic cells such as yeast

To accomplish this sequencing, the cells must be lysed and all the debris must be removed until only the DNA is left. Then, specific primers that will only attach to these genes are mixed in with the DNA solution, and PCR is performed to make lots of copies of the targeted gene. Next, a unique “barcode” is added to the gene copies from each sample so they can later be traced back to that sample. Finally, all the DNA samples are added to one tube and sequenced together, and a computer program is used to turn the reads into useful genomic data.
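The barcode step is what lets everything be pooled into one tube and still be sorted out afterwards. Here is a minimal Python sketch of that sorting step, assuming each read simply starts with its sample's barcode. The barcodes, sample names, and the `demultiplex` function are all made up for illustration; real pipelines also have to deal with sequencing errors in the barcode.

```python
def demultiplex(reads: list, barcodes: dict):
    """Group reads by the barcode they start with, trimming the barcode off."""
    by_sample = {sample: [] for sample in barcodes.values()}
    unassigned = []
    for read in reads:
        for barcode, sample in barcodes.items():
            if read.startswith(barcode):
                by_sample[sample].append(read[len(barcode):])
                break
        else:
            # No known barcode matched this read.
            unassigned.append(read)
    return by_sample, unassigned


# Hypothetical barcodes and pooled reads.
barcodes = {"AAGG": "starter_A", "CCTT": "starter_B"}
pooled_reads = ["AAGGACGTAC", "CCTTGGGTTT", "AAGGTTTTAA", "GGGGACGTAC"]

by_sample, unassigned = demultiplex(pooled_reads, barcodes)
print(by_sample)
print(unassigned)
```

Reads that don't start with any known barcode are kept in a separate list instead of being guessed at.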

When comparing shotgun and amplicon sequencing, I believe it would be easier to prepare DNA for shotgun sequencing because it is not necessary to worry about tagging specific genes and introducing the correct primers to amplify only them. However, when it comes to actually analyzing the results, shotgun would be a lot more tedious because it would require working with thousands and thousands of genes, and trying to find a particular gene would be like trying to find a needle in a haystack.

Some questions I have that could be answered by our sourdough genomic data:

Does adding fruit, such as a banana, change the microbial composition of a starter?

What sorts of “outliers” will be discovered? What kind of unique or interesting microbes will be found in some classmates’ starter samples?

Will all of my classmates’ control cups be genomically very similar? Or will they be different based on the environmental conditions they grew in while in our homes?

10/12-10/16/20: All These Graphs Sure Look Cool, But What Do They Mean?

10/05-10/09/20: So Many Questions To Explore!