Xuan's blog

Finding the needle in the haystack

Signal and Noise

When I took the biostatistics course taught by Dr. Thomas Love a few years ago, he recommended that we read a seemingly unrelated statistics book, “The Signal and the Noise” by Nate Silver. The book is great, and it uncovers the promises and pitfalls of big data analysis. One of the biggest pitfalls is that, most of the time, what we get from the data is what we would like to hear, and that tendency is deeply rooted in our nature.

Big data is routine in bioscience, and we are generating tons of high-throughput data every day. The data come from every aspect and discipline: patients' electronic clinical records (including images), high-throughput drug screening, proteomics, deep genomic sequencing, and more. But these data sets do not stand apart from one another. For example, many different layers of data for the same target can be integrated into one database to better understand the mechanisms. Moreover, a candidate can be studied further at higher levels by integrating other factors or conditions. Big data holds clues to better human health, and we need better designs and collaborations to take advantage of it. We also need to be aware of our human tendency to believe what we want to believe.