Research on Constrained and Error Correction Codes for DNA Storage

Abstract
With the growing demand for data storage, DNA storage systems have attracted considerable attention as a next-generation storage technology because of their high density and longevity. DNA storage encodes binary information in DNA strands, which are composed of DNA sequences and primers. However, insertion, deletion, and substitution errors that occur during DNA synthesis and sequencing are common obstacles to DNA storage. Reducing the error rate and correcting these errors is therefore essential for reliable DNA storage. Once the DNA strands are stored in a DNA pool, efficient random access to the desired information among the stored strands presents an additional obstacle.

To reduce error rates, a common approach is to impose constraints on the stored DNA strands, such as GC-balance and homopolymer run-length constraints. For error correction, error correction codes are employed to improve the reliability of the DNA synthesis and sequencing processes. In addition, the primers in DNA strands enable random access in DNA storage.
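The two constraints named above can be illustrated with a minimal Python sketch. It is not the construction used in the thesis; the function names, the default run-length limit of 3, and the zero GC tolerance are assumptions chosen only for illustration.

```python
def gc_content(strand: str) -> float:
    """Return the fraction of G and C nucleotides in the strand."""
    if not strand:
        raise ValueError("strand must be non-empty")
    return sum(base in "GC" for base in strand) / len(strand)


def max_run_length(strand: str) -> int:
    """Return the length of the longest homopolymer run (repeated nucleotide)."""
    if not strand:
        raise ValueError("strand must be non-empty")
    longest = current = 1
    for prev, cur in zip(strand, strand[1:]):
        current = current + 1 if cur == prev else 1
        longest = max(longest, current)
    return longest


def satisfies_constraints(strand: str, max_run: int = 3,
                          gc_tolerance: float = 0.0) -> bool:
    """Check GC balance (within a tolerance around 50%) and the run-length limit.

    max_run and gc_tolerance are illustrative parameters, not values
    taken from the thesis.
    """
    balanced = abs(gc_content(strand) - 0.5) <= gc_tolerance
    return balanced and max_run_length(strand) <= max_run
```

For example, `satisfies_constraints("ACGTACGT")` holds, while `"AAAACGCG"` is GC-balanced but fails the check because it contains a run of four A's.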

This thesis proposes a novel code construction method based on the weight distribution of the data and introduces a specific encoding process for both the balanced and imbalanced data parts, which enables the efficient construction of GC-balanced DNA codes. Additionally, to minimize errors in DNA storage processes, we propose a new nonbinary systematic code that corrects a single insertion or deletion and satisfies the maximum run-length constraint, together with its encoding algorithm. Finally, to address efficient primer design for random access in synthesized DNA strands, we propose a code design that combines weakly mutually uncorrelated codes with the maximum run-length constraint. Moreover, we explore weakly mutually uncorrelated codes that satisfy the maximum run-length constraint in combination with further constraints, such as being almost-balanced and having a large Hamming distance, which are also useful for random access in DNA storage systems.
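As a rough illustration of the primer constraint mentioned above, the sketch below checks the weakly mutually uncorrelated property under its commonly used definition: no proper prefix of length at least kappa of any codeword appears as a suffix of any codeword, including itself. That definition, the function name, and the parameter `kappa` are assumptions here, since the abstract does not state them, and the sketch only verifies a given code rather than constructing one.

```python
from typing import Sequence


def is_weakly_mutually_uncorrelated(codewords: Sequence[str], kappa: int) -> bool:
    """Return True if no proper prefix of length >= kappa of any codeword
    equals a suffix of any codeword (the same codeword included).

    Assumes all codewords have the same length n and 1 <= kappa.
    """
    if not codewords:
        return True
    n = len(codewords[0])
    if any(len(word) != n for word in codewords):
        raise ValueError("all codewords must have the same length")
    for a in codewords:
        for b in codewords:
            # Proper prefixes have lengths kappa, ..., n - 1.
            for length in range(kappa, n):
                if a[:length] == b[-length:]:
                    return False
    return True
```

For example, `["AACC", "CCGG"]` fails the check for `kappa = 1` because the prefix `CC` of the second codeword is a suffix of the first, but it passes for `kappa = 3`.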
Author(s)
육소주
Issued Date
2024
Awarded Date
2024-02
Type
Dissertation
URI
https://oak.ulsan.ac.kr/handle/2021.oak/13176
http://ulsan.dcollection.net/common/orgView/200000730040
Alternative Author(s)
LU XIAOZHOU
Affiliation
University of Ulsan
Department
Graduate School, Department of Electrical, Electronic and Computer Engineering
Advisor
Sunghwan Kim
Degree
Doctor
Publisher
University of Ulsan Graduate School, Department of Electrical, Electronic and Computer Engineering
Language
eng
Rights
Theses of the University of Ulsan are protected by copyright.
Appears in Collections:
Computer Engineering & Information Technology > 2. Theses (Ph.D)
Access and License
  • Access status: Public
File List

Items in Repository are protected by copyright, with all rights reserved, unless otherwise indicated.