Printed from https://ideas.repec.org/a/nat/natcom/v13y2022i1d10.1038_s41467-022-30140-x.html

Rewritable two-dimensional DNA-based data storage with machine learning reconstruction

Author

Listed:
  • Chao Pan

    (University of Illinois at Urbana-Champaign)

  • S. Kasra Tabatabaei

    (University of Illinois at Urbana-Champaign)

  • S. M. Hossein Tabatabaei Yazdi

    (Dorna Robotics)

  • Alvaro G. Hernandez

    (University of Illinois at Urbana-Champaign)

  • Charles M. Schroeder

    (University of Illinois at Urbana-Champaign)

  • Olgica Milenkovic

    (University of Illinois at Urbana-Champaign)

Abstract

DNA-based data storage platforms traditionally encode information only in the nucleotide sequence of the molecule. Here we report on a two-dimensional molecular data storage system that records information in both the sequence and the backbone structure of DNA and performs nontrivial joint data encoding, decoding and processing. Our 2DDNA method efficiently stores images in synthetic DNA and embeds pertinent metadata as nicks in the DNA backbone. To avoid costly worst-case redundancy for correcting sequencing/rewriting errors and to mitigate issues associated with mismatched decoding parameters, we develop machine learning techniques for automatic discoloration detection and image inpainting. The 2DDNA platform is experimentally tested by reconstructing a library of images with undetectable or small visual degradation after readout processing, and by erasing and rewriting copyright metadata encoded in nicks. Our results demonstrate that DNA can serve both as a write-once and rewritable memory for heterogeneous data and that data can be erased in a permanent, privacy-preserving manner. Moreover, the storage system can be made robust to degrading channel qualities while avoiding global error-correction redundancy.
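
The two storage dimensions described in the abstract can be illustrated with a toy model. The sketch below is not the authors' encoding: the 2-bits-per-nucleotide mapping, the `Strand` class, and the fixed list of candidate nick sites are all assumptions made for illustration. It only shows the general idea that the sequence holds write-once content while the presence or absence of backbone nicks holds metadata that can be erased and rewritten without touching the sequence.

```python
from __future__ import annotations

from dataclasses import dataclass, field

BASES = "ACGT"  # assumed toy mapping: 2 bits per nucleotide, not the paper's code


def bytes_to_dna(data: bytes) -> str:
    """Map each byte to four nucleotides, most significant bit pair first."""
    return "".join(BASES[(b >> shift) & 0b11] for b in data for shift in (6, 4, 2, 0))


def dna_to_bytes(seq: str) -> bytes:
    """Invert bytes_to_dna; len(seq) must be a multiple of 4."""
    out = bytearray()
    for i in range(0, len(seq), 4):
        value = 0
        for base in seq[i:i + 4]:
            value = (value << 2) | BASES.index(base)
        out.append(value)
    return bytes(out)


@dataclass
class Strand:
    """Hypothetical strand model: sequence content plus a set of backbone nicks."""

    sequence: str                                  # dimension 1: nucleotide content
    nick_sites: tuple[int, ...]                    # assumed candidate nick positions
    nicks: set[int] = field(default_factory=set)   # dimension 2: sites currently nicked

    def write_metadata(self, bits: str) -> None:
        """Store one bit per candidate site: '1' means the site is nicked."""
        if len(bits) > len(self.nick_sites):
            raise ValueError("more metadata bits than candidate nick sites")
        self.nicks = {site for site, bit in zip(self.nick_sites, bits) if bit == "1"}

    def read_metadata(self) -> str:
        """Read the nick pattern back as a bit string."""
        return "".join("1" if site in self.nicks else "0" for site in self.nick_sites)

    def erase_metadata(self) -> None:
        """Model sealing every nick; the sequence itself is left unchanged."""
        self.nicks.clear()


strand = Strand(bytes_to_dna(b"Hi"), nick_sites=(1, 3, 5, 7))
strand.write_metadata("1010")   # write metadata as a nick pattern
strand.erase_metadata()         # erase it
strand.write_metadata("0110")   # rewrite with new metadata
```

In this model, erasing and rewriting only changes the `nicks` set, so `dna_to_bytes(strand.sequence)` returns the same payload before and after a metadata rewrite.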

Suggested Citation

  • Chao Pan & S. Kasra Tabatabaei & S. M. Hossein Tabatabaei Yazdi & Alvaro G. Hernandez & Charles M. Schroeder & Olgica Milenkovic, 2022. "Rewritable two-dimensional DNA-based data storage with machine learning reconstruction," Nature Communications, Nature, vol. 13(1), pages 1-12, December.
  • Handle: RePEc:nat:natcom:v:13:y:2022:i:1:d:10.1038_s41467-022-30140-x
    DOI: 10.1038/s41467-022-30140-x

    Download full text from publisher

    File URL: https://www.nature.com/articles/s41467-022-30140-x
    File Function: Abstract
    Download Restriction: no


    References listed on IDEAS

    1. Nick Goldman & Paul Bertone & Siyuan Chen & Christophe Dessimoz & Emily M. LeProust & Botond Sipos & Ewan Birney, 2013. "Towards practical, high-capacity, low-maintenance information storage in synthesized DNA," Nature, Nature, vol. 494(7435), pages 77-80, February.
    2. Christopher E. Arcadia & Eamonn Kennedy & Joseph Geiser & Amanda Dombroski & Kady Oakley & Shui-Ling Chen & Leonard Sprague & Mustafa Ozmen & Jason Sello & Peter M. Weber & Sherief Reda & Christopher , 2020. "Multicomponent molecular memory," Nature Communications, Nature, vol. 11(1), pages 1-8, December.
    3. S. Kasra Tabatabaei & Boya Wang & Nagendra Bala Murali Athreya & Behnam Enghiad & Alvaro Gonzalo Hernandez & Christopher J. Fields & Jean-Pierre Leburton & David Soloveichik & Huimin Zhao & Olgica Mil, 2020. "DNA punch cards for storing data on native DNA sequences via enzymatic nicking," Nature Communications, Nature, vol. 11(1), pages 1-10, December.

    Citations


    Cited by:

    1. Marius Welzel & Peter Michael Schwarz & Hannah F. Löchel & Tolganay Kabdullayeva & Sandra Clemens & Anke Becker & Bernd Freisleben & Dominik Heider, 2023. "DNA-Aeon provides flexible arithmetic coding for constraint adherence and error correction in DNA storage," Nature Communications, Nature, vol. 14(1), pages 1-10, December.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Cheng Kai Lim & Jing Wui Yeoh & Aurelius Andrew Kunartama & Wen Shan Yew & Chueh Loo Poh, 2023. "A biological camera that captures and stores images directly into DNA," Nature Communications, Nature, vol. 14(1), pages 1-11, December.
    2. Afsaneh Sadremomtaz & Robert F. Glass & Jorge Eduardo Guerrero & Dennis R. LaJeunesse & Eric A. Josephs & Reza Zadegan, 2023. "Digital data storage on DNA tape using CRISPR base editors," Nature Communications, Nature, vol. 14(1), pages 1-10, December.
    3. Abdur Rasool & Qiang Qu & Yang Wang & Qingshan Jiang, 2022. "Bio-Constrained Codes with Neural Network for Density-Based DNA Data Storage," Mathematics, MDPI, vol. 10(5), pages 1-21, March.
    4. Ahmed A. Agiza & Kady Oakley & Jacob K. Rosenstein & Brenda M. Rubenstein & Eunsuk Kim & Marc Riedel & Sherief Reda, 2023. "Digital circuits and neural networks based on acid-base chemistry implemented by robotic fluid handling," Nature Communications, Nature, vol. 14(1), pages 1-9, December.
    5. Jingwei Hong & Abdur Rasool & Shuo Wang & Djemel Ziou & Qingshan Jiang, 2024. "VSD: A Novel Method for Video Segmentation and Storage in DNA Using RS Code," Mathematics, MDPI, vol. 12(8), pages 1-21, April.
    6. Jan Kretschmer & Tomáš David & Martin Dračínský & Ondřej Socha & Daniel Jirak & Martin Vít & Radek Jurok & Martin Kuchař & Ivana Císařová & Miloslav Polasek, 2022. "Paramagnetic encoding of molecules," Nature Communications, Nature, vol. 13(1), pages 1-12, December.
    7. Lifu Song & Feng Geng & Zi-Yi Gong & Xin Chen & Jijun Tang & Chunye Gong & Libang Zhou & Rui Xia & Ming-Zhe Han & Jing-Yi Xu & Bing-Zhi Li & Ying-Jin Yuan, 2022. "Robust data storage in DNA by de Bruijn graph-based de novo strand assembly," Nature Communications, Nature, vol. 13(1), pages 1-9, December.
    8. Shekaari, Ashkan & Jafari, Mahmoud, 2019. "Statistical mechanical modeling of a DNA nanobiostructure at the base-pair level," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 518(C), pages 80-88.
