Workable Data!

After literal months of processing and tweaking, I finally have usable data. This is extremely exciting for me, as the project is moving into its next stage. Right now I’m working on data cleaning. A lot of the data was low coverage, so those reads needed to be removed. I’ve written several simple processing scripts in both Python and R to trim the data and split it into separate data frames that I’ll analyze individually. I’ve begun very rudimentary analysis in R on one of my Mimulus species, Mimulus naiandinus, and have some very preliminary results (displayed in the attached boxplot). The aim of this research is to estimate the genetic diversity of populations. Here, diversity is the number of nucleotide differences divided by the total number of overlapping nucleotides in any given gene. In this way, the proportion of base pairs that differ serves as a metric of genetic diversity.

[Figure: boxplot of preliminary diversity results for Mimulus naiandinus]
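
To make the diversity metric described above concrete, here is a minimal Python sketch. It is not my actual processing script: the function names, the `coverage` field, the gap/missing characters, and the `min_coverage` threshold of 10 are all assumptions for illustration. It assumes two sequences of the same gene that are already aligned to the same length.

```python
def pairwise_diversity(seq_a, seq_b, missing_chars="-N"):
    """Nucleotide differences divided by overlapping sites.

    Assumes seq_a and seq_b are already aligned (same length).
    Sites where either sequence has a gap or missing call are
    not counted as overlap (an assumption for this sketch).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")

    overlap = 0
    differences = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in missing_chars or b in missing_chars:
            continue  # no usable base at this site in one of the samples
        overlap += 1
        if a != b:
            differences += 1

    if overlap == 0:
        return None  # nothing overlaps, so diversity is undefined
    return differences / overlap


def drop_low_coverage(records, min_coverage=10):
    """Keep only records whose 'coverage' value meets the threshold.

    The 'coverage' key and the default of 10 are hypothetical.
    """
    return [r for r in records if r["coverage"] >= min_coverage]
```

Dividing by the overlap rather than using the raw count of differences keeps long and short genes on the same scale.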

As I mentioned above, my goal is to examine how diversity varies among populations and between species. Because I have samples from several naturally occurring populations that contain multiple species, it will be possible to test whether population has an effect on genetic diversity.
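
As a rough idea of how that comparison could start, the sketch below averages per-gene diversity values within each population and species combination. It is only an illustration: the row layout and the `population`, `species`, and `diversity` keys are hypothetical, and a real test for a population effect would need a proper statistical model rather than a comparison of means.

```python
from collections import defaultdict
from statistics import mean


def mean_diversity_by_group(rows):
    """Average diversity for each (population, species) pair.

    Each row is a dict with hypothetical keys 'population',
    'species', and 'diversity' (one value per gene).
    """
    groups = defaultdict(list)
    for row in rows:
        key = (row["population"], row["species"])
        groups[key].append(row["diversity"])
    return {key: mean(values) for key, values in groups.items()}
```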