Reducing cost in DNA-based data storage by sequence analysis-aided soft information decoding of variable-length reads
- Abstract
- Motivation
DNA-based data storage is one of the most attractive research areas for future archival storage. However, high writing and reading costs still stand in the way of practical use. Many efforts have been made to reduce these costs, but existing schemes are not fully suited to DNA-based data storage, and further cost reduction is needed.
Results
We propose complete encoding and decoding procedures for DNA storage. The encoding procedure uses a single, carefully designed low-density parity-check (LDPC) code as an inter-oligo code, which corrects errors and dropouts efficiently. To improve decoding performance, we apply new clustering and alignment methods that operate on variable-length reads. During the sequence analysis-aided decoding procedure, we use edit distance and quality scores to discard abnormal reads and to exploit high-quality soft information. We store a 548.83 KB image file in DNA oligos and achieve a writing cost reduction of 7.46% and significant reading cost reductions of 26.57% and 19.41% compared with the two previous works.
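The abstract does not spell out how edit distance and quality scores are used, so the following is only a minimal sketch of the two generic building blocks it names, not the authors' method. It assumes that a read is compared against some reference sequence (for example a cluster consensus) and discarded if its Levenshtein distance exceeds a threshold, and that per-base Phred quality scores are mapped to error probabilities by the standard relation P = 10^(-Q/10). The names `filter_reads`, `reference`, and `max_distance` are hypothetical.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance via the standard two-row dynamic program."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,               # delete a character of a
                curr[j - 1] + 1,           # insert a character of b
                prev[j - 1] + (ca != cb),  # substitute, or match at no cost
            ))
        prev = curr
    return prev[-1]


def phred_to_error_prob(q: int) -> float:
    """Standard Phred relation: error probability P = 10^(-Q/10)."""
    return 10 ** (-q / 10)


def filter_reads(reads, reference, max_distance):
    """Keep reads within max_distance edits of the reference.

    Hypothetical helper: the threshold and the choice of reference
    are assumptions made for illustration only.
    """
    return [r for r in reads if edit_distance(r, reference) <= max_distance]
```

Because the edit distance handles insertions and deletions, the filter works on reads whose lengths differ from the reference, which is the situation the variable-length setting describes. The per-base error probabilities could then serve as soft inputs to a decoder, though how the paper builds its soft information is not stated in this record.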
- Author(s)
- Seong-Joon Park; Sunghwan Kim; Jaeho Jeong; Albert No; Jong-Seon No; Hosung Park
- Issued Date
- 2023
- Type
- Article
- Keyword
- DNA storage
- DOI
- 10.1093/bioinformatics/btad548
- URI
- https://oak.ulsan.ac.kr/handle/2021.oak/17117
- Publisher
- BIOINFORMATICS
- Language
- English
- ISSN
- 1367-4803
- Citation Volume
- 39
- Citation Number
- 9
- Citation Start Page
- 1
- Citation End Page
- 8
Appears in Collections:
- Engineering > IT Convergence