DNA-Encoded Sonnets Show New Possibilities in Data Storage

Ewan Birney and Nick Goldman, scientists at the European Bioinformatics Institute, have encoded all 154 of William Shakespeare’s sonnets on small segments of DNA, according to NPR.

The project grew out of years of work on a problem the two men faced: how could the Institute keep storing an exponentially growing amount of genetic information without spending ever more on electricity and hard drives?

The answer? DNA.

To encode the sonnets into DNA, Birney and Goldman first converted the words into binary code. Then, using software that Goldman wrote, they converted the binary code into DNA’s chemical language, an alphabet of four nucleotide bases: A, C, G and T, according to Time.

A is adenine, C is cytosine, G is guanine and T is thymine. Combinations of these letters normally translate into the production of certain proteins needed to carry out the functions of living things.

And now they can archive the sonnets of one of the world’s greatest playwrights.
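The text-to-binary-to-bases conversion described above can be sketched in a few lines of Python. This is only an illustration: the article does not describe the scheme Goldman's software actually uses, so the two-bits-per-base table and the function names `text_to_dna` and `dna_to_text` are assumptions made for this example.

```python
# Hypothetical mapping: each pair of bits becomes one base.
# This table is an assumption for illustration, not Goldman's actual code.
BITS_TO_BASE = {"00": "A", "01": "C", "10": "G", "11": "T"}
BASE_TO_BITS = {base: bits for bits, base in BITS_TO_BASE.items()}


def text_to_dna(text: str) -> str:
    """Convert text to UTF-8 bytes, then to bits, then to bases."""
    bits = "".join(f"{byte:08b}" for byte in text.encode("utf-8"))
    return "".join(BITS_TO_BASE[bits[i:i + 2]] for i in range(0, len(bits), 2))


def dna_to_text(dna: str) -> str:
    """Reverse the process: bases to bits, bits to bytes, bytes to text."""
    bits = "".join(BASE_TO_BITS[base] for base in dna)
    data = bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))
    return data.decode("utf-8")


line = "Shall I compare thee to a summer's day?"
encoded = text_to_dna(line)
print(encoded[:4])  # the letter "S" (binary 01010011) becomes CCAT
assert dna_to_text(encoded) == line
```

Under this toy mapping every byte of text becomes four bases, so the conversion is fully reversible as long as the sequence is read back without errors.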

After converting the data into the A-C-G-T code, Birney and Goldman drew up designs for thousands of DNA fragments, each containing a segment of a file, and sent the designs to Agilent Technologies, which manufactured the DNA for them. Once they got the DNA, it took two weeks to open the files encoded on it. Finally, Goldman’s software was used to reassemble the data from the fragments into readable files.
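As a rough sketch of the fragment-and-reassemble step, the Python below cuts a base sequence into short pieces and then puts shuffled pieces back in order. The article says only that each fragment holds a segment of a file and that software reorganizes them, so the position number attached to each fragment, the fragment length, and the function names are all assumptions for illustration.

```python
import random


def split_into_fragments(dna: str, payload_len: int = 20) -> list[tuple[int, str]]:
    """Cut a base sequence into (position, payload) fragments.

    The explicit position number is a hypothetical stand-in for whatever
    addressing information the real fragments carry.
    """
    return [
        (i // payload_len, dna[i:i + payload_len])
        for i in range(0, len(dna), payload_len)
    ]


def reassemble(fragments: list[tuple[int, str]]) -> str:
    """Sort fragments by position and join their payloads."""
    return "".join(payload for _, payload in sorted(fragments))


dna = "CCATCGGACGACCGTACGTA" * 3 + "ACGT"
fragments = split_into_fragments(dna)
random.shuffle(fragments)  # assume fragments come back in no particular order
assert reassemble(fragments) == dna
```

Because the original order can be recovered only once every fragment is in hand, a sketch like this has to collect all the pieces before any part of the file can be read.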

Size matters when it comes to data storage, and archiving our information in DNA would take up far less physical space.

But DNA as data storage has its drawbacks. Unlike a rewritable CD, it can't be rewritten, so editing information would be a pain: you would have to start the whole process over again to change anything. DNA also doesn’t let you pick and choose which piece of information you want to access, so you wouldn’t be able to read just one of Shakespeare’s sonnets without unraveling the entire file.

And for now, it is still expensive, at about $12,400 per megabyte. But Goldman and other scientists note that DNA synthesis costs are falling rapidly, so in about a decade it may be economical for large companies to use DNA for data storage.