100 molecules to go

Hello,

I have just completed my first week of summer research and have set a few short-term goals for the following week. I am currently focusing on two things: data collection and literature research. I am continuing my work from the school year of collecting blinking traces of Rhodamine B (RB) dye on TiO2. To quickly recap, the RB dye is diluted with water and TiO2 to 1e-9 M. This solution is then spin-coated onto a blank slide and analyzed under a confocal microscope. A laser is scanned across the slide, causing the molecules to fluoresce, and a photon detector together with a scanning acquisition program produces an image in which the molecules show up as bright spots. Once the molecules are located, each one is hit with the laser and a blinking trace is collected as the molecule switches between on (fluorescent) states and dark states (Fig. 1).

Once in a permanent dark state, the molecule is photo-bleached. Data collected on a photo-bleached molecule provides no useful information, so I am experimenting with different factors to increase a molecule's stability and allow longer data collection times. Currently, molecules are kept in a nitrogen atmosphere for increased stability. The power of the laser also affects how long a molecule fluoresces before being photo-bleached; so far the best results have come with a wave plate setting of 20 nm. About half of the time blinking traces are collected successfully, but the other half the molecules are already photo-bleached. There are also cases where a molecule begins to fluoresce again after a period of darkness, making it difficult to know whether it is truly photo-bleached. As more data is collected this week, I hope to better understand the patterns and durations of the blinking traces. The goal is to collect usable data from 100 molecules! Currently I have 28, so I guess I have a ways to go.
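Since the analysis ultimately comes down to the durations of these on and off periods, here is a rough Python sketch of how a blinking trace might be split into dwell times with a simple intensity threshold. The function name, threshold, and bin time are placeholders of my own, not the actual acquisition code:

    import numpy as np

    def dwell_times(trace, threshold, bin_time):
        """Split a photon-count trace into on- and off-durations (same units as bin_time)."""
        is_on = np.asarray(trace) > threshold        # True while the molecule fluoresces
        on_durations, off_durations = [], []
        run_length = 1
        for prev, curr in zip(is_on[:-1], is_on[1:]):
            if curr == prev:
                run_length += 1                      # still in the same state
            else:
                (on_durations if prev else off_durations).append(run_length * bin_time)
                run_length = 1
        (on_durations if is_on[-1] else off_durations).append(run_length * bin_time)
        return on_durations, off_durations

    # Toy example: 10 ms bins, threshold of 50 counts per bin.
    counts = [120, 115, 4, 3, 2, 98, 101, 99, 5, 4]
    on_times, off_times = dwell_times(counts, threshold=50, bin_time=10)
    print(on_times, off_times)                       # [20, 30] [30, 20]  (ms)

In practice the threshold would have to be chosen relative to the background level of each molecule, but the idea is the same: each run of bins above or below the threshold becomes one on- or off-time.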

Once enough data is collected, it will be organized in a histogram and fitted with a probability distribution function. How do we find the best-fitting function? Previous research has often pointed to the power law as the best-fitting function, but recent research has argued against this. I am currently reading up on what might be a more accurate and unbiased method of determining the probability distribution function for a set of data points. I am studying the article 'Power-Law Distributions in Empirical Data' by Aaron Clauset and coauthors, and realize that I am a bit rusty on my math and probability skills. So far, the gist of the article is that it is biased to predetermine a function and then check whether the data fit it. It would be more accurate to let the data determine the function by using the relationship between probability distribution functions (PDFs) and cumulative distribution functions (CDFs). Long story short, the CDF is the integral of the PDF and can be estimated directly from the data. Once the CDF is determined, you can differentiate it to find the corresponding PDF, which would be the best-fitting function. There are also quantities such as the p-value and the D-value (the Kolmogorov–Smirnov statistic) that measure whether a data set is consistent with a hypothesized distribution, or whether two data sets could even be characterized by the same probability function. For the next week I will be reading and researching these methods for finding PDFs to get a tighter grip on the calculations and hopefully begin applying them to my own data.
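To make this a little more concrete, here is a rough sketch assuming the measured on/off durations are stored in an array called durations and that a power law is being tested above some cutoff x_min (both names are mine). It builds the empirical CDF directly from the data, fits the power-law exponent by maximum likelihood in the spirit of Clauset et al., and computes the Kolmogorov–Smirnov distance D between the empirical and fitted CDFs:

    import numpy as np

    def empirical_cdf(data):
        """Sorted data and the fraction of points at or below each value."""
        x = np.sort(np.asarray(data, dtype=float))
        cdf = np.arange(1, len(x) + 1) / len(x)
        return x, cdf

    def fit_power_law(data, x_min):
        """Maximum-likelihood power-law exponent for the tail x >= x_min (continuous case)."""
        tail = np.asarray(data, dtype=float)
        tail = tail[tail >= x_min]
        alpha = 1.0 + len(tail) / np.sum(np.log(tail / x_min))
        return alpha, tail

    def ks_distance(tail, x_min, alpha):
        """Largest gap D between the empirical CDF and the fitted power-law CDF."""
        x, s = empirical_cdf(tail)
        model = 1.0 - (x / x_min) ** (1.0 - alpha)   # power-law CDF above x_min
        return np.max(np.abs(s - model))

    # Toy example with synthetic data standing in for measured on/off durations.
    rng = np.random.default_rng(0)
    durations = 0.01 * (1.0 - rng.random(500)) ** (-1.0 / 1.5)   # power law, alpha ~ 2.5
    alpha, tail = fit_power_law(durations, x_min=0.01)
    D = ks_distance(tail, x_min=0.01, alpha=alpha)
    print(f"fitted alpha = {alpha:.2f}, KS distance D = {D:.3f}")

The p-value then comes from repeating this comparison on many synthetic data sets drawn from the fitted power law and asking how often their D is at least as large as the one measured; that part is not shown here.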