Storage prices versus DNA sequencing costs


The blue squares show historical disk storage per US dollar, in megabytes. The long-term trend (blue line, which is straight because the plot is logarithmic) shows exponential growth in storage per dollar with a doubling time of roughly 1.5 years. The red triangles show DNA sequencing output per dollar, in base pairs. Until 2004 it follows an exponential curve (yellow line) with a doubling time slightly longer than that of disk storage; the arrival of next-generation sequencing (NGS) then causes an inflection in the curve to a doubling time of less than 6 months (red line). These curves are not corrected for inflation or for the 'fully loaded' cost of sequencing and disk storage, which would include personnel costs, depreciation and overhead.
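To get a feel for how quickly the two curves diverge, the doubling times quoted above can be plugged into the usual growth formula, factor = 2^(years / doubling time). The sketch below is only illustrative: the 6-year window is an arbitrary choice, and the 0.5-year figure for sequencing is an assumption taken from the upper bound of "less than 6 months", so the real gap would be at least this large.

```python
def growth_factor(years: float, doubling_time_years: float) -> float:
    """Multiplicative growth over `years` for a quantity that doubles
    every `doubling_time_years`."""
    return 2.0 ** (years / doubling_time_years)


# Doubling times taken from the caption; the sequencing value is an
# assumed upper bound ("less than 6 months"), not a measured figure.
DISK_DOUBLING_YEARS = 1.5
NGS_DOUBLING_YEARS = 0.5

YEARS = 6.0  # arbitrary illustration window

disk_growth = growth_factor(YEARS, DISK_DOUBLING_YEARS)
seq_growth = growth_factor(YEARS, NGS_DOUBLING_YEARS)

print(f"Disk storage per dollar grows {disk_growth:.0f}x in {YEARS:.0f} years")
print(f"Base pairs per dollar grow {seq_growth:.0f}x in {YEARS:.0f} years")
print(f"Sequencing outpaces storage by {seq_growth / disk_growth:.0f}x")
```

Under these assumptions, storage per dollar grows 16-fold over six years while sequencing output per dollar grows 4096-fold, so the data produced per dollar outruns the disk available per dollar by a factor of 256.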

Filed under  //  Big data  

Hadoop is beatable

The problem with simple batch processing tools like MapReduce and Hadoop is that they are just not powerful enough in any one of the dimensions of the big data space that really matter. If you need complex joins or ACID requirements, SQL beats Hadoop easily. If you have realtime requirements, Cloudscale beats Hadoop by three or four orders of magnitude. If you have supercomputing requirements, MPI or BSP beats Hadoop easily. If you have graph computing requirements, Google's Pregel beats Hadoop by orders of magnitude. If you need interactive analysis of web-scale data sets, then Google's Dremel architecture beats Hadoop by orders of magnitude. If you need to incrementally update the analytics on a massive data set continuously, as Google now has to do on its index of the web, then an architecture like Percolator beats Hadoop easily.

Filed under  //  Big data   Hadoop   MapReduce  

Hadoop Ecosystem

[Image: Hadoop ecosystem diagram]

Filed under  //  Big data   Hadoop   MapReduce  

Too much data leads to disbelief

Data crowds out faith. And without faith, it's hard to believe in the data enough to make a leap.

Filed under  //  Big data  