To reduce the burden on traditional data centers, improving on DNA data storage could be the key
The pace at which data – such as photos, videos, and social media posts – are being generated is ramping up drastically, exceeding the scaling limits of traditional silicon-based data storage technologies, and DNA could be deployed to help meet this challenge. As an indication of the massive amount of data storage that may be required, one model predicts that by the year 2030, electricity use by data centers could approach eight percent of total global electricity demand. New paradigms for data storage, such as the use of DNA for preserving information, are necessary.
DNA is genetic material that contains plans for the design of living things, but DNA can also be used to store data created by living things. DNA is an attractive material for data storage – it is stable, writable, readable, and information dense. In theory, the entire world’s data could be stored in a coffee mug-sized portion of DNA.
So how does storing, for example, a video, in DNA work? (See Figure 1.) First, an algorithm encodes the video into the As, Ts, Cs, and Gs that make up DNA molecules. The DNA molecules are then synthesized and stored. To access the data, the DNA molecules are sequenced, and the resulting sequences are decoded by running the same algorithm in reverse, reproducing the video.
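The encode and decode steps can be sketched in a few lines of Python. This is a minimal illustration only: the two-bits-per-base mapping and the function names `encode` and `decode` are assumptions made for the example, not the scheme used by any particular DNA storage system. Practical encodings also have to avoid sequences that are hard to synthesize or sequence and add redundancy for error correction, which this sketch leaves out.

```python
# Illustrative mapping of two bits to one base (an assumption for this sketch).
BITS_TO_BASE = {"00": "A", "01": "C", "10": "G", "11": "T"}
BASE_TO_BITS = {base: bits for bits, base in BITS_TO_BASE.items()}


def encode(data: bytes) -> str:
    """Translate bytes into a DNA sequence, two bits per base."""
    bits = "".join(f"{byte:08b}" for byte in data)
    return "".join(BITS_TO_BASE[bits[i:i + 2]] for i in range(0, len(bits), 2))


def decode(sequence: str) -> bytes:
    """Invert encode(): translate a DNA sequence back into bytes."""
    bits = "".join(BASE_TO_BITS[base] for base in sequence)
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


message = b"Hi"
strand = encode(message)          # "CAGACGGC"
assert decode(strand) == message  # the round trip reproduces the data
```

In this sketch, each byte becomes four bases, so a file of n bytes maps to a sequence of 4n bases before any redundancy is added.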

Figure 1. Data storage and retrieval in DNA. First, data – like those stored on a computer hard drive – are processed by an algorithm that translates 1s and 0s into DNA sequences made up of As, Ts, Cs, and Gs. DNA strands with those sequences are then synthesized – or written – and stored either in living cells (in vivo) or in the test tube (in vitro). Data can be retrieved from storage in part by using PCR – the same technology deployed to test for the coronavirus that causes COVID-19 – to selectively target specific data packages. The PCR products can be read with DNA sequencing instruments, providing the original DNA sequences, and reproducing the data. Figure adapted from Ceze, Nivala, and Strauss 2019, Nature Reviews Genetics.
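The selective retrieval step can be pictured as a lookup by address. The sketch below is a loose software analogy, not a model of PCR chemistry: it assumes each stored strand begins with a short address sequence standing in for a primer binding site, and the pool contents, the address sequences, and the function name `retrieve` are all hypothetical. Real PCR amplifies matching strands rather than filtering a list, and typically uses primer sites at both ends of a strand.

```python
# Hypothetical pool: each strand is an address prefix (standing in for a
# primer binding site) followed by a payload sequence.
POOL = [
    "ACGTAC" + "GGATCC",
    "ACGTAC" + "TTAGGC",
    "TGCATG" + "CCGATA",
]


def retrieve(pool: list[str], primer: str) -> list[str]:
    """Return the payloads of strands whose address prefix matches the primer."""
    return [strand[len(primer):] for strand in pool if strand.startswith(primer)]


# Only the strands tagged with the requested address are selected.
selected = retrieve(POOL, "ACGTAC")
```

Here `selected` holds the two payloads stored under the address `ACGTAC`, while the strand stored under `TGCATG` is left untouched.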
DNA is a polymer – a substance consisting of a large number of similar building blocks that are linked together – and other polymers can be used to store information, too. For example, plastic polymers are being explored for information-storage applications; one group synthesized a plastic polymer that, when read out, reproduced a quote by Jane Austen. Two lines of work could help the U.S. science and technology enterprise devise a polymer-based method for rapid data storage and retrieval and meet the data storage challenge. The first is expanded experimental development aimed at (i) increasing the rates at which DNA can be synthesized and sequenced and (ii) detecting and correcting errors in DNA synthesis. The second is fundamental research into data storage across a variety of polymers.
This CSPI Science and Technology Policy Snapshot expands upon a scientific exchange between Congressman Bill Foster (D, IL-11) and his new FAS-organized Science Council.