Workable Data!

After literal months of processing and tweaking, I finally have usable data. This is extremely exciting for me, as it moves the project into its next stage. I'm currently working on data cleaning: much of the data had low read coverage, and those reads needed to be removed. I've written several simple processing scripts in both Python and R to trim the data and separate it into data frames that I'll analyze separately. I've begun very rudimentary analysis in R on one of my species of Mimulus, M. naiandinus, and have some very preliminary results (displayed in the attached boxplot). The aim of this research is to estimate the genetic diversity of populations. Here, diversity is the number of nucleotide differences divided by the total number of overlapping nucleotides in any given gene. In other words, the proportion of base pairs that differ serves as the metric of genetic diversity.
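To make the diversity metric concrete, here is a minimal Python sketch of the calculation for a single pair of sequences. It is not my actual analysis script: the function name `pairwise_diversity`, the assumption that the two sequences are already aligned to the same length, and the choice to treat `N` and `-` as missing (uncovered) positions are all illustrative.

```python
def pairwise_diversity(seq_a, seq_b, missing="N-"):
    """Return (differences, overlap, differences / overlap) for two aligned sequences.

    Assumption: positions where either sequence has a character listed in
    `missing` are treated as not covered and are left out of the overlap.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")

    differences = 0
    overlap = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in missing or b in missing:
            continue  # no usable data at this site in at least one sequence
        overlap += 1
        if a != b:
            differences += 1

    # With no overlapping sites the ratio is undefined, so report NaN.
    diversity = differences / overlap if overlap else float("nan")
    return differences, overlap, diversity


if __name__ == "__main__":
    # Toy sequences, not real Mimulus data.
    diffs, sites, diversity = pairwise_diversity("ACGTNACGT", "ACCTAAC-T")
    print(diffs, sites, diversity)
```

Dividing by the overlap rather than the full gene length matters for low-coverage data, since two samples may share only a fraction of the sites in a gene.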


When at first you don’t succeed…

One of the most frustrating things about science is that you spend a lot of time wondering if you're headed down the right path. Although having multiple ways of doing things can be a benefit, it can also lead to second-guessing. I have spent more time than I care to admit thinking about whether the method I'm currently using is the best way to achieve results. Unfortunately, the past few weeks have involved a lot of trial and error. I spent a week getting my data into the correct format for a package (PAML, Phylogenetic Analysis by Maximum Likelihood), only to realize that the package wasn't ideal for the low read coverage data I'm working with. This meant I was forced to jump ship and figure out a completely new way of doing things.


Creating the Processing Pipeline

The early stages of my research have consisted of building a pipeline of scripts that can process the large amounts of genomic data I have. Because the files I'm dealing with are incredibly large (10 GB text files), none of the data cleaning and processing can feasibly be done by hand. I tried several strategies, and after weeks' worth of failed attempts I was able to get the major file processed and broken down into much more reasonably sized files. These still need further processing before I can use them to create a phylogeny.
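As an illustration of how a file this size can be broken up without loading it into memory, here is a minimal Python sketch that streams the input one line at a time and writes fixed-size chunks. It is not the pipeline I actually used: the function name `split_file`, the output file naming, and splitting by line count (rather than by gene or sample) are all assumptions made for the example.

```python
from pathlib import Path


def split_file(src, out_dir, lines_per_chunk=1_000_000):
    """Stream `src` line by line and write it out as numbered chunk files.

    Only one line is held in memory at a time, so this works on files far
    larger than available RAM. Returns the list of chunk paths written.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    chunk_paths = []
    out = None
    with open(src, "r") as handle:
        for i, line in enumerate(handle):
            if i % lines_per_chunk == 0:
                # Start a new chunk file, closing the previous one first.
                if out is not None:
                    out.close()
                path = out_dir / f"chunk_{len(chunk_paths):04d}.txt"
                chunk_paths.append(path)
                out = open(path, "w")
            out.write(line)
    if out is not None:
        out.close()
    return chunk_paths
```

A call such as `split_file("big_file.txt", "chunks")` (both names hypothetical) would leave a directory of smaller text files that can each be opened and cleaned on their own.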


Petal Spot Evolution: Abstract

I seek to understand how the wide variety of petal pigmentation we see today evolved. Using Mimulus as a model system, I will weigh in on several questions surrounding the origin of petal spots. These include: Did petal spots evolve once and get passed down through generations, spreading by hybridization with nearby species? Or were there multiple origins of this trait? By analyzing the DNA sequences of different species of Mimulus, I hope to find similarities or differences in the genes responsible for this patterning that can give insight into the origins and evolution of this trait.
