
Cooperative sequence clustering and decoding for DNA storage system with fountain codes

Abstract
Motivation: DNA storage systems trade off writing cost against reading cost. Raising the code rate of the error-correcting code may lower the writing cost, but it requires more sequence reads for data retrieval. The sequencing and decoding processes can potentially be improved so that the reading cost induced by this tradeoff is reduced without increasing the writing cost. Previous studies treated clustering, alignment and decoding as separate stages, but we believe that using the information from all of these processes together may improve decoding performance. Actual DNA synthesis and sequencing experiments should be performed, because simulations cannot be relied on to cover all error possibilities that arise in practice.
Results: For DNA storage systems using a fountain code and a Reed-Solomon (RS) code, we introduce several techniques to improve decoding performance. We designed the decoding process around the cooperation of key components: Hamming-distance-based clustering, discarding of abnormal sequence reads, RS error correction as well as detection, and quality-score-based ordering of sequences. We synthesized 513.6 KB of data into DNA oligo pools and sequenced it successfully with an Illumina MiSeq instrument. Compared to Erlich's research, the proposed decoding method additionally incorporates sequence reads with minor errors that had previously been discarded, and was thus able to use 10.6–11.9% more sequence reads from the same sequencing environment. This resulted in a 6.5–8.9% reduction in the reading cost. Channel characteristics, including sequence coverage and read-length distributions, are provided as well.
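To make the cooperation of these components more concrete, the following is a minimal Python sketch of three of them: discarding reads of abnormal length, ordering reads by quality score, and Hamming-distance-based clustering. It is an illustration under stated assumptions, not the authors' implementation. The function names, the greedy assignment to the first matching cluster, the use of mean quality as the ordering key, and the threshold `max_dist` are all hypothetical choices made for this example; RS decoding and fountain decoding are omitted.

```python
from typing import List, Tuple

# A read is (bases, per-base quality scores). This layout is an assumption
# made for the sketch, not a format taken from the paper.
Read = Tuple[str, List[int]]


def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("Hamming distance needs equal-length strings")
    return sum(x != y for x, y in zip(a, b))


def filter_and_order(reads: List[Read], expected_len: int) -> List[Read]:
    """Drop reads whose length is abnormal, then sort by mean quality (best first)."""
    kept = [r for r in reads if len(r[0]) == expected_len]
    return sorted(kept, key=lambda r: sum(r[1]) / max(len(r[1]), 1), reverse=True)


def greedy_cluster(reads: List[Read], max_dist: int) -> List[List[Read]]:
    """Assign each read to the first cluster whose representative (its first
    member) is within max_dist substitutions; otherwise open a new cluster."""
    clusters: List[List[Read]] = []
    for read in reads:
        for cluster in clusters:
            if hamming(read[0], cluster[0][0]) <= max_dist:
                cluster.append(read)
                break
        else:
            clusters.append([read])
    return clusters


# Toy input: three reads of the expected length and one truncated read.
reads = [
    ("ACGTACGT", [30] * 8),
    ("ACGTACGA", [20] * 8),  # one substitution away from the first read
    ("TTTTCCCC", [35] * 8),
    ("ACGTAC", [40] * 6),    # abnormal length, discarded
]
ordered = filter_and_order(reads, expected_len=8)
clusters = greedy_cluster(ordered, max_dist=1)
cluster_sizes = [len(c) for c in clusters]
```

Because reads are processed in descending quality order, the highest-quality read seen so far becomes the representative of each new cluster. In this toy input the read with a single substitution joins the cluster of its neighbor instead of being thrown away, which mirrors the idea of making use of reads with minor errors.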
Author(s)
Jaeho Jeong, Seong-Joon Park, Jae-Won Kim, Jong-Seon No, Ha Hyeon Jeon, Jeong Wook Lee, Albert No, 김성환, Hosung Park
Issued Date
2021
Type
Article
DOI
10.1093/bioinformatics/btab246
URI
https://oak.ulsan.ac.kr/handle/2021.oak/9092
https://ulsan-primo.hosted.exlibrisgroup.com/primo-explore/fulldisplay?docid=TN_cdi_proquest_miscellaneous_2518983829&context=PC&vid=ULSAN&lang=ko_KR&search_scope=default_scope&adaptor=primo_central_multiple_fe&tab=default_tab&query=any,contains,Cooperative%20sequence%20clustering%20and%20decoding%20for%20DNA%20storage%20system%20with%20fountain%20codes&offset=0&pcAvailability=true
Publisher
BIOINFORMATICS
Location
United Kingdom
Language
English
ISSN
1367-4803
Citation Volume
37
Citation Number
19
Citation Start Page
3136
Citation End Page
3143
Appears in Collections:
Engineering > IT Convergence
Authorize & License
  • Authorize: Public