All of Humanity’s Data to be Stored in DNA

The New York Genome Center Laboratory | Nygenome.org

For years, scientists have been exploring bio-inspired solutions to meet the world’s growing data storage needs. DNA strands allow for high-density, large-scale data storage, provided that the technology’s costs come down in the coming years.

It’s estimated that humanity produces about 2.5 quintillion (a billion billion) bytes of data each day. 90% of all the data available in the world today has been created in the last two years alone. To keep up with the unbridled growth of information technologies and cope with this deluge of data, we have built millions of data centers. The energy required to power these digital storage facilities is immense.

In an effort to reduce the energy cost of data storage, we look to nature and our DNA.

We Can’t Just Keep Building Data Centers!

Huge amounts of energy are needed not only to power data centers, but also to cool them to avoid overheating and malfunctioning.

According to a U.S. Department of Energy study, U.S. data centers consumed 70 billion kWh of electricity in 2014. That’s about 2% of the country’s total electricity consumption, equivalent to the amount consumed by 6.4 million households.

The International Data Corporation (IDC) issued a report entitled “Worldwide Datacenter Installation Census and Construction Forecast, 2015-2019.” The IDC estimated that the total number of data centers worldwide would reach 8.6 million in 2017, with worldwide data center space growing to 1.94 billion square feet in 2018.

DNA’s Four-Letter Alphabet for Reliable and Efficient Data Storage

Almost indestructible and energy efficient, synthetic DNA could meet growing data storage needs with its unrivaled storage density: one gram of DNA can store 215 million gigabytes. DNA storage capabilities were first demonstrated in 2012 by a team of geneticists from Harvard.
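To put that density in perspective, here is a rough back-of-the-envelope calculation. It uses only the two figures quoted in this article (215 million gigabytes per gram and 2.5 quintillion bytes of data) and assumes decimal gigabytes of one billion bytes each. The function name is made up for illustration, and the result ignores any overhead for error correction or indexing.

```python
# Back-of-the-envelope estimate using only figures quoted in this article.
BYTES_PER_GIGABYTE = 10**9          # assumption: decimal gigabytes
GIGABYTES_PER_GRAM = 215 * 10**6    # "215 million gigabytes" per gram of DNA
DATA_BYTES = 2.5 * 10**18           # "2.5 quintillion bytes" of data


def grams_of_dna(num_bytes: float) -> float:
    """Return the grams of DNA needed to hold num_bytes at the quoted density."""
    return num_bytes / (GIGABYTES_PER_GRAM * BYTES_PER_GIGABYTE)


if __name__ == "__main__":
    print(f"{grams_of_dna(DATA_BYTES):.1f} g of DNA")  # prints "11.6 g of DNA"
```

At the quoted density, 2.5 quintillion bytes would fit in under 12 grams of DNA, at least on paper.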

Now, researchers have repeated the demonstration with improved efficiency. Yaniv Erlich, a computer scientist at Columbia University, and Dina Zielinski from the New York Genome Center encoded files into DNA strands in such a way that the files can later be reconstructed on a computer.

The procedure was carried out in stages:

  • First, the scientists took six different files and converted them into binary strings of 1s and 0s.
  • Then, using DNA Fountain, an algorithm they developed, they packaged the strings into “droplets” and added tags so that the strings could be put back in the right order later.
  • The information was then sent to Twist Bioscience, a San Francisco-based startup, which synthesized the DNA strands and sent them back as a speck of DNA in a vial.
  • Finally, Erlich and Zielinski used modern DNA sequencing technology to decode the files.
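The general idea behind these steps (binary data mapped onto DNA’s four letters, with a tag on each piece so the pieces can be reordered) can be sketched in a few lines of Python. This is a deliberately naive illustration and not the DNA Fountain algorithm itself: the two-bits-per-base mapping, the one-byte index tag, the chunk size, and all function names are assumptions made up for this example.

```python
# Illustrative only: a naive 2-bits-per-base mapping, NOT the DNA Fountain algorithm.
BASES = "ACGT"  # assumed mapping: 00 -> A, 01 -> C, 10 -> G, 11 -> T


def bytes_to_dna(data: bytes) -> str:
    """Turn each byte into four bases, two bits per base, high bits first."""
    return "".join(
        BASES[(byte >> shift) & 0b11]
        for byte in data
        for shift in (6, 4, 2, 0)
    )


def dna_to_bytes(strand: str) -> bytes:
    """Reverse bytes_to_dna: every four bases become one byte."""
    out = bytearray()
    for i in range(0, len(strand), 4):
        byte = 0
        for base in strand[i:i + 4]:
            byte = (byte << 2) | BASES.index(base)
        out.append(byte)
    return bytes(out)


def encode(data: bytes, chunk_size: int = 4) -> list[str]:
    """Split data into chunks and prefix each chunk with a one-byte index tag."""
    starts = range(0, len(data), chunk_size)
    if len(starts) > 256:
        raise ValueError("a one-byte tag can only index 256 chunks")
    return [
        bytes_to_dna(bytes([index]) + data[start:start + chunk_size])
        for index, start in enumerate(starts)
    ]


def decode(strands: list[str]) -> bytes:
    """Read the tags, sort the chunks back into order, and rejoin the payloads."""
    tagged = sorted((dna_to_bytes(s) for s in strands), key=lambda t: t[0])
    return b"".join(t[1:] for t in tagged)


if __name__ == "__main__":
    strands = encode(b"hello DNA")
    # The strands come back from the vial in no particular order.
    print(decode(list(reversed(strands))))  # prints b'hello DNA'
```

A real scheme also has to cope with synthesis and sequencing errors and with lost strands, which this sketch does not attempt to handle.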

While DNA is a reliable medium that offers high-density storage and solves the space problem, it is also relatively slow. At first, the technology would be best suited to archiving data that doesn’t require frequent access.

Also, the price of encoding and reading data using this method is very high. Synthesizing and sequencing DNA takes time and therefore money: it cost $9,000 to synthesize and then read 2 megabytes of data.
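A quick calculation shows what that price means per unit of data. The sketch below uses only the $9,000 and 2 megabyte figures quoted above. The per-gigabyte figure assumes that the cost scales linearly and that a gigabyte is 1,000 megabytes, neither of which the experiment itself establishes.

```python
# Rough unit-cost arithmetic based on the figures quoted in this article.
TOTAL_COST_USD = 9_000   # cost to synthesize and then read the data
DATA_MEGABYTES = 2       # amount of data stored in the experiment


def cost_per_megabyte(total_usd: float, megabytes: float) -> float:
    """Return the cost in U.S. dollars of storing and reading one megabyte."""
    return total_usd / megabytes


if __name__ == "__main__":
    per_mb = cost_per_megabyte(TOTAL_COST_USD, DATA_MEGABYTES)
    print(f"${per_mb:,.0f} per megabyte")          # prints "$4,500 per megabyte"
    # Assumption: linear scaling, 1 gigabyte = 1,000 megabytes.
    print(f"${per_mb * 1_000:,.0f} per gigabyte")  # prints "$4,500,000 per gigabyte"
```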
