To reduce the burden on traditional data centers, improving on DNA data storage could be the key
The pace at which data – such as photos, videos, and social media posts – are being generated is accelerating sharply and is outpacing the scaling limits of traditional silicon-based data storage technologies. DNA could be deployed to help meet this challenge. As an indication of the scale of data storage that may be required, one model predicts that by 2030, electricity use by data centers could approach eight percent of total global electricity demand. New paradigms for data storage, such as the use of DNA for preserving information, are needed.
DNA is genetic material that contains plans for the design of living things, but DNA can also be used to store data created by living things. DNA is an attractive material for data storage – it is stable, writable, readable, and information dense. In theory, the entire world’s data could be stored in a coffee mug-sized portion of DNA.
So how does storing, for example, a video in DNA work? (See Figure 1.) First, an algorithm encodes the video into the As, Ts, Cs, and Gs that make up DNA molecules. The DNA molecules are then synthesized and stored. To access the data, the DNA molecules are sequenced, and the resulting sequences are translated back using the same algorithm, reproducing the video.

Figure 1. Data storage and retrieval in DNA. First, data – like those stored on a computer hard drive – are processed by an algorithm that translates 1s and 0s into DNA sequences made up of As, Ts, Cs, and Gs. DNA strands with those sequences are then synthesized – or written – and stored either in living cells (in vivo) or in a test tube (in vitro). Data can be retrieved from storage in part by using PCR – the same technology deployed to test for the coronavirus that causes COVID-19 – to selectively target specific data packages. The PCR products can then be read with DNA sequencing instruments, which recover the original DNA sequences and so reproduce the data. Figure adapted from Ceze, Nivala, and Strauss 2019, Nature Reviews Genetics.
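To make the encoding step concrete, here is a minimal sketch in Python of how 1s and 0s could be translated into As, Cs, Gs, and Ts and back again. The two-bits-per-base mapping and the function names `encode` and `decode` are assumptions chosen for illustration; they are not the algorithm used in the work cited above. Practical schemes typically also add addressing sequences for PCR retrieval and redundancy for error correction, which this sketch leaves out.

```python
# Illustrative only: a hypothetical mapping of two bits to one DNA base.
# Real DNA storage codecs use more elaborate encodings.
BITS_TO_BASE = {"00": "A", "01": "C", "10": "G", "11": "T"}
BASE_TO_BITS = {base: bits for bits, base in BITS_TO_BASE.items()}


def encode(data: bytes) -> str:
    """Translate raw bytes into a string of A, C, G, and T."""
    # Write every byte as eight binary digits, then join them together.
    bits = "".join(f"{byte:08b}" for byte in data)
    # Each pair of bits becomes one base, so one byte becomes four bases.
    return "".join(BITS_TO_BASE[bits[i:i + 2]] for i in range(0, len(bits), 2))


def decode(strand: str) -> bytes:
    """Translate a string of A, C, G, and T back into the original bytes."""
    bits = "".join(BASE_TO_BITS[base] for base in strand)
    # Regroup the bits into bytes, eight at a time.
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


if __name__ == "__main__":
    message = b"Hi"
    strand = encode(message)   # the sequence that would be synthesized
    print(strand)              # CAGACGGC
    print(decode(strand))      # b'Hi'
```

In this toy scheme, synthesis would write the string returned by `encode`, and sequencing would supply the string passed to `decode`. Because both directions use the same mapping, the original data are reproduced exactly as long as no base is misread.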
DNA is a polymer – a substance consisting of a high number of similar building blocks that are linked together – and other polymers can be used to store information, too. For example, plastic polymers are being explored for information-storage applications; one group synthesized a plastic polymer that, when read out, reproduced a quote by Jane Austen. By expanding experimental development efforts to (i) increase the rates at which DNA can be synthesized and sequenced and (ii) detect and correct errors in DNA synthesis, and by pursuing fundamental research into data storage across a variety of polymers, the U.S. science and technology enterprise could devise a polymer-based method for rapid data storage and retrieval and meet the data storage challenge.
This CSPI Science and Technology Policy Snapshot expands upon a scientific exchange between Congressman Bill Foster (D, IL-11) and his new FAS-organized Science Council.