1
Zhang X, Lu Y. "Galaxy" Encoding: Toward High Storage Density and Low Cost. IEEE Trans Nanobioscience 2025; 24:200-207. [PMID: 39466861 DOI: 10.1109/tnb.2024.3481504] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/30/2024]
Abstract
DNA is considered one of the most attractive storage media because of its excellent reliability and durability. Early encoding schemes lacked flexibility and scalability. To address these limitations, we propose a combination of static mapping and dynamic encoding, named "Galaxy" encoding. This scheme uses both the "dual-rule interleaving" algorithm and the "twelve-element Huffman rotational encoding" algorithm. We tested it with "Shakespeare Sonnets" and other files, achieving an encoding information density of approximately 2.563 bits/nt. Additionally, the inclusion of Reed-Solomon error-correcting codes can correct nearly 5% of the errors. Our simulations show that it supports various file types (.gz, .tar, .exe, etc.). We also analyzed the cost and fault tolerance of "Galaxy" encoding, demonstrating its high coding efficiency and ability to fully recover original information while effectively reducing the costs of DNA synthesis and sequencing.
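As a rough aid to reading the figures in this abstract, the sketch below computes information density in bits per nucleotide and the fraction of symbol errors a Reed-Solomon code RS(n, k) can correct. The function names and the RS(255, 231) parameters are illustrative assumptions, not values reported by the authors.

```python
def information_density(payload_bits: int, nucleotides: int) -> float:
    """Bits of source data stored per synthesized nucleotide."""
    if nucleotides <= 0:
        raise ValueError("nucleotides must be positive")
    return payload_bits / nucleotides


def rs_correctable_fraction(n: int, k: int) -> float:
    """Fraction of the n symbols in an RS(n, k) codeword that may be wrong.

    A Reed-Solomon code with n - k parity symbols corrects up to
    floor((n - k) / 2) symbol errors per codeword.
    """
    if not 0 < k < n:
        raise ValueError("require 0 < k < n")
    return ((n - k) // 2) / n


# One byte written as four bases is the fixed 2 bits/nt mapping.
baseline = information_density(8, 4)
# Hypothetical parameters: 24 parity symbols in a 255-symbol codeword.
fraction = rs_correctable_fraction(255, 231)
```

A fixed two-bits-per-base mapping yields exactly 2 bits/nt, so a figure above 2 presumably counts source-file bits ahead of a compression stage such as the Huffman step. With the hypothetical RS(255, 231), 12 of 255 symbols (about 4.7%) can be corrected per codeword.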
2
Zhang J. Levy Sooty Tern Optimization Algorithm Builds DNA Storage Coding Sets for Random Access. Entropy (Basel) 2024; 26:778. [PMID: 39330111 PMCID: PMC11431215 DOI: 10.3390/e26090778] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/15/2024] [Revised: 09/02/2024] [Accepted: 09/05/2024] [Indexed: 09/28/2024]
Abstract
DNA molecules, as a storage medium, possess unique advantages. Not only does DNA storage exhibit significantly higher storage density compared to electromagnetic storage media, but it also features low energy consumption and extremely long storage times. However, the integration of DNA storage into daily life remains distant due to challenges such as low storage density, high latency, and inevitable errors during the storage process. Therefore, this paper proposes constructing a DNA storage coding set based on the Levy Sooty Tern Optimization Algorithm (LSTOA) to achieve an efficient random-access DNA storage system. Firstly, addressing the slow iteration speed and susceptibility to local optima of the Sooty Tern Optimization Algorithm (STOA), this paper introduces Levy flight operations and proposes the LSTOA. Secondly, utilizing the LSTOA, this paper constructs a DNA storage encoding set to facilitate random access while meeting combinatorial constraints. To demonstrate the performance of the LSTOA, this paper presents analyses on 13 benchmark test functions, showcasing its superior performance. Furthermore, under the same combinatorial constraints, the LSTOA constructs larger DNA storage coding sets, effectively reducing the read-write latency and error rate of DNA storage.
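The Levy flight operation mentioned in this abstract is commonly sampled with Mantegna's algorithm. The following is a minimal sketch of that sampling step only; the `perturb` update rule, the `scale` value, and all function names are hypothetical and are not taken from the LSTOA paper.

```python
import math
import random
from typing import Optional


def mantegna_sigma(beta: float) -> float:
    """Scale of the numerator Gaussian in Mantegna's algorithm."""
    num = math.gamma(1 + beta) * math.sin(math.pi * beta / 2)
    den = math.gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2)
    return (num / den) ** (1 / beta)


def levy_step(beta: float = 1.5, rng: Optional[random.Random] = None) -> float:
    """Draw one heavy-tailed step (sign included)."""
    rng = rng or random.Random()
    u = rng.gauss(0.0, mantegna_sigma(beta))
    v = rng.gauss(0.0, 1.0)
    while v == 0.0:  # guard against division by zero
        v = rng.gauss(0.0, 1.0)
    return u / abs(v) ** (1 / beta)


def perturb(position: list, best: list, scale: float = 0.01,
            rng: Optional[random.Random] = None) -> list:
    """Hypothetical update: jump relative to the best candidate so far."""
    rng = rng or random.Random()
    return [x + scale * levy_step(rng=rng) * (x - b)
            for x, b in zip(position, best)]
```

Most draws are small, but occasional very large steps let a candidate leave a local optimum, which is the usual motivation for adding Levy flights to a swarm optimizer.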
Affiliation(s)
- Jianxia Zhang
- College of Mathematics and Information Science, Henan Normal University, Xinxiang 453003, China
- School of Intelligent Engineering, Henan Institute of Technology, Xinxiang 453003, China
3
Cao B, Zheng Y, Shao Q, Liu Z, Xie L, Zhao Y, Wang B, Zhang Q, Wei X. Efficient data reconstruction: The bottleneck of large-scale application of DNA storage. Cell Rep 2024; 43:113699. [PMID: 38517891 DOI: 10.1016/j.celrep.2024.113699] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/09/2023] [Revised: 11/15/2023] [Accepted: 01/05/2024] [Indexed: 03/24/2024] [Open Access]
Abstract
Over the past decade, the rapid development of DNA synthesis and sequencing technologies has enabled preliminary use of DNA molecules for digital data storage, overcoming the capacity and persistence bottlenecks of silicon-based storage media. DNA storage has now been fully accomplished in the laboratory through existing biotechnology, which again demonstrates the viability of carbon-based storage media. However, the high cost and latency of data reconstruction pose challenges that hinder the practical implementation of DNA storage beyond the laboratory. In this article, we review existing advanced DNA storage methods, analyze the characteristics and performance of biotechnological approaches at various stages of data writing and reading, and discuss potential factors influencing DNA storage from the perspective of data reconstruction.
Affiliation(s)
- Ben Cao
- School of Computer Science and Technology, Dalian University of Technology, Lingshui Street, Dalian, Liaoning 116024, China; Centre for Frontier AI Research, Agency for Science, Technology, and Research (A*STAR), 1 Fusionopolis Way, Singapore 138632, Singapore
- Yanfen Zheng
- School of Computer Science and Technology, Dalian University of Technology, Lingshui Street, Dalian, Liaoning 116024, China
- Qi Shao
- Key Laboratory of Advanced Design and Intelligent Computing, Ministry of Education, School of Software Engineering, Dalian University, Xuefu Street, Dalian, Liaoning 116622, China
- Zhenlu Liu
- Key Laboratory of Advanced Design and Intelligent Computing, Ministry of Education, School of Software Engineering, Dalian University, Xuefu Street, Dalian, Liaoning 116622, China
- Lei Xie
- Key Laboratory of Advanced Design and Intelligent Computing, Ministry of Education, School of Software Engineering, Dalian University, Xuefu Street, Dalian, Liaoning 116622, China
- Yunzhu Zhao
- Key Laboratory of Advanced Design and Intelligent Computing, Ministry of Education, School of Software Engineering, Dalian University, Xuefu Street, Dalian, Liaoning 116622, China
- Bin Wang
- Key Laboratory of Advanced Design and Intelligent Computing, Ministry of Education, School of Software Engineering, Dalian University, Xuefu Street, Dalian, Liaoning 116622, China
- Qiang Zhang
- School of Computer Science and Technology, Dalian University of Technology, Lingshui Street, Dalian, Liaoning 116024, China
- Xiaopeng Wei
- School of Computer Science and Technology, Dalian University of Technology, Lingshui Street, Dalian, Liaoning 116024, China
4
Zheng Y, Cao B, Wu J, Wang B, Zhang Q. High Net Information Density DNA Data Storage by the MOPE Encoding Algorithm. IEEE/ACM Trans Comput Biol Bioinform 2023; 20:2992-3000. [PMID: 37015121 DOI: 10.1109/tcbb.2023.3263521] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Indexed: 06/19/2023]
Abstract
DNA has recently been recognized as an attractive storage medium due to its high reliability, capacity, and durability. However, encoding algorithms that simply map binary data to DNA sequences have the disadvantages of low net information density and high synthesis cost. Therefore, this paper proposes an efficient, feasible, and highly robust encoding algorithm called MOPE (Modified Barnacles Mating Optimizer and Payload Encoding). The Modified Barnacles Mating Optimizer (MBMO) algorithm is used to construct the non-payload coding set, and the Payload Encoding (PE) algorithm is used to encode the payload. The results show that the lower bound of the non-payload coding set constructed by the MBMO algorithm is 3%-18% higher than the optimal result of previous work, and theoretical analysis shows that the designed PE algorithm has a net information density of 1.90 bits/nt, which is close to the ideal information capacity of 2 bits per nucleotide. The proposed MOPE encoding algorithm with high net information density and satisfying constraints can not only effectively reduce the cost of DNA synthesis and sequencing but also reduce the occurrence of errors during DNA storage.
5
Mortuza GM, Guerrero J, Llewellyn S, Tobiason MD, Dickinson GD, Hughes WL, Zadegan R, Andersen T. In-vitro validated methods for encoding digital data in deoxyribonucleic acid (DNA). BMC Bioinformatics 2023; 24:160. [PMID: 37085766 PMCID: PMC10120115 DOI: 10.1186/s12859-023-05264-6] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 11/16/2021] [Accepted: 03/30/2023] [Indexed: 04/23/2023] [Open Access]
Abstract
Deoxyribonucleic acid (DNA) is emerging as an alternative archival memory technology. Recent advancements in DNA synthesis and sequencing have both increased the capacity and decreased the cost of storing information in de novo synthesized DNA pools. In this survey, we review methods for translating digital data to and/or from DNA molecules. An emphasis is placed on methods which have been validated by storing and retrieving real-world data via in-vitro experiments.
Affiliation(s)
- Golam Md Mortuza
- Department of Computer Science, Boise State University, Boise, Idaho USA
- Jorge Guerrero
- Department of Nanoengineering, Joint School of Nanoscience and Nanoengineering, North Carolina A&T State University, Greensboro, NC USA
- William L. Hughes
- School of Engineering, Kelowna, University of British Columbia, Kelowna, British Columbia Canada
- Reza Zadegan
- Department of Nanoengineering, Joint School of Nanoscience and Nanoengineering, North Carolina A&T State University, Greensboro, NC USA
- Tim Andersen
- Department of Computer Science, Boise State University, Boise, Idaho USA
6
Du H, Zhou S, Yan W, Wang S. Study on DNA Storage Encoding Based IAOA under Innovation Constraints. Curr Issues Mol Biol 2023; 45:3573-3590. [PMID: 37185757 PMCID: PMC10136724 DOI: 10.3390/cimb45040233] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/11/2023] [Revised: 04/09/2023] [Accepted: 04/13/2023] [Indexed: 05/17/2023] [Open Access]
Abstract
With the informationization of social processes, the amount of related data has greatly increased, making traditional storage media unable to meet the current requirements for data storage. Due to its advantages of a high storage capacity and persistence, deoxyribonucleic acid (DNA) has been considered the most promising storage medium to solve the data storage problem. Synthesis is an important process for DNA storage, and low-quality DNA coding can increase errors during sequencing, which can affect the storage efficiency. To reduce errors caused by the poor stability of DNA sequences during storage, this paper proposes a method that uses the double-matching and error-pairing constraints to improve the quality of the DNA coding set. First, the double-matching and error-pairing constraints are defined to solve problems of sequences with self-complementary reactions in the solution that are prone to mismatch at the 3' end. In addition, two strategies are introduced in the arithmetic optimization algorithm, including a random perturbation of the elementary function and a double adaptive weighting strategy. An improved arithmetic optimization algorithm (IAOA) is proposed to construct DNA coding sets. The experimental results of the IAOA on 13 benchmark functions show a significant improvement in its exploration and exploitation capabilities over the existing algorithms. Moreover, the IAOA is used in the DNA encoding design under both traditional and new constraints. The DNA coding sets are tested to estimate their quality regarding the number of hairpins and melting temperature. The DNA storage coding sets constructed in this study are improved by 77.7% at the lower boundary compared to existing algorithms. The DNA sequences in the storage sets show a reduction of 9.7-84.1% in the melting temperature variance, and the hairpin structure ratio is reduced by 2.1-80%. The results indicate that the stability of the DNA coding sets is improved under the two proposed constraints compared to traditional constraints.
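To make the "melting temperature variance" metric concrete, here is a minimal sketch that scores a coding set with the Wallace rule, Tm = 2(A+T) + 4(G+C), a standard approximation for short oligonucleotides. The abstract does not say which melting-temperature model the authors use, so the choice of the Wallace rule and the function names are assumptions.

```python
from statistics import pvariance


def wallace_tm(seq: str) -> int:
    """Wallace-rule melting temperature in degrees Celsius."""
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    if at + gc != len(seq):
        raise ValueError("sequence must contain only A, C, G, T")
    return 2 * at + 4 * gc


def tm_variance(coding_set: list) -> float:
    """Population variance of melting temperature across a coding set."""
    return pvariance([wallace_tm(s) for s in coding_set])
```

A lower variance means the sequences in a set denature at similar temperatures, which is why it is used as a stability indicator alongside the hairpin count.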
Affiliation(s)
- Haigui Du
- Key Laboratory of Advanced Design and Intelligent Computing, Ministry of Education, School of Software Engineering, Dalian University, Dalian 116622, China
- Shihua Zhou
- Key Laboratory of Advanced Design and Intelligent Computing, Ministry of Education, School of Software Engineering, Dalian University, Dalian 116622, China
- WeiQi Yan
- School of Engineering, Computer and Mathematical Sciences, Auckland University of Technology, Auckland 1010, New Zealand
- Sijie Wang
- Key Laboratory of Advanced Design and Intelligent Computing, Ministry of Education, School of Software Engineering, Dalian University, Dalian 116622, China
7
Liu J, Liu S, Zou C, Xu S, Zhou C. Research Progress in Construction and Application of Enzyme-Based DNA Logic Gates. IEEE Trans Nanobioscience 2023; 22:245-258. [PMID: 35679378 DOI: 10.1109/tnb.2022.3181615] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Indexed: 11/05/2022]
Abstract
As a research hotspot in the field of information processing, DNA computing exhibits several important underlying characteristics, from parallel computing and low energy consumption to high-performance storage capabilities, thereby enabling its wide application in nanomachines, molecular encryption, biological detection, medical diagnosis, etc. Within DNA computing, the most rapidly developing field is DNA molecular logic-gate computing. In particular, enzymes have recently emerged as ideal materials for constructing DNA logic gates. In this review, we explore protein enzymes that can manipulate DNA, especially nicking enzymes and polymerases with high efficiency and specificity, which are widely used in constructing DNA logic gates, as well as ribozymes that can construct DNA logic gates following various mechanisms with distinct biomaterials. Accordingly, the review highlights the characteristics and applications of various types of DNAzyme-based logic gate models, considering their future developments in information, biomedicine, chemistry, and computers.
8
Cao B, Wang B, Zhang Q. GCNSA: DNA storage encoding with a graph convolutional network and self-attention. iScience 2023; 26:106231. [PMID: 36876131 PMCID: PMC9982308 DOI: 10.1016/j.isci.2023.106231] [Citation(s) in RCA: 13] [Impact Index Per Article: 6.5] [Received: 10/04/2022] [Revised: 01/31/2023] [Accepted: 02/14/2023] [Indexed: 02/22/2023] [Open Access]
Abstract
DNA encoding, as a key step in DNA storage, plays an important role in reading and writing accuracy and the storage error rate. However, current encoding efficiency and encoding speed are not high enough, which limits the performance of DNA storage systems. In this work, a DNA storage encoding system with a graph convolutional network and self-attention (GCNSA) is proposed. The experimental results show that the number of DNA storage codes constructed by GCNSA increases by 14.4% on average under the basic constraints, and by 5%-40% under other constraints. The increase in DNA storage codes effectively improves the storage density of the DNA storage system by 0.7-2.2%. The GCNSA predicted more DNA storage codes in less time while ensuring the quality of codes, which lays a foundation for higher read and write efficiency in DNA storage.
Affiliation(s)
- Ben Cao
- School of Computer Science and Technology, Dalian University of Technology, Dalian 116024, China
- Bin Wang
- Key Laboratory of Advanced Design and Intelligent Computing, Ministry of Education, School of Software Engineering, Dalian University, Dalian 116622, China
- Qiang Zhang
- School of Computer Science and Technology, Dalian University of Technology, Dalian 116024, China
9
Harris hawks optimizer based on the novice protection tournament for numerical and engineering optimization problems. Appl Intell 2023. [DOI: 10.1007/s10489-022-03743-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/01/2023]
10
Yin Q, Zheng Y, Wang B, Zhang Q. Design of Constraint Coding Sets for Archive DNA Storage. IEEE/ACM Trans Comput Biol Bioinform 2022; 19:3384-3394. [PMID: 34762590 DOI: 10.1109/tcbb.2021.3127271] [Citation(s) in RCA: 24] [Impact Index Per Article: 8.0] [Indexed: 06/13/2023]
Abstract
With the advent of the era of massive data, the increase of storage demand has far exceeded current storage capacity. DNA molecules provide a reliable solution for big data storage by virtue of their large capacity, high density, and long-term stability. To reduce errors in storing procedures, constructing a sufficient set of constraint encoding is critical for achieving DNA storage. A new version of the Marine Predator algorithm (called QRSS-MPA) is proposed in this paper to increase the lower bound of the coding set while satisfying the specific combination of constraints. In order to demonstrate the effectiveness of the improvement, the classical CEC-05 test function is used to test and compare the mean, variance, scalability, and significance. In terms of storage, the lower bound of construction is compared with previous works, and the result is found to be significantly improved. In order to prevent the emergence of a secondary structure that leads to sequencing failure, we give a more stringent lower bound for the constraint coding set, which is of great significance for reducing the error rate of DNA storage amidst its rapid development.
11
Abstract
The Harris hawk optimizer is a recent population-based metaheuristic algorithm that simulates the hunting behavior of hawks. This swarm-based optimizer performs the optimization procedure using a novel way of exploration and exploitation and multiple search phases. In this review, we focus on the applications and developments of the recent, well-established, robust Harris hawk optimizer (HHO), one of the most popular swarm-based techniques of 2020. Moreover, several experiments were carried out to demonstrate the power and effectiveness of HHO compared with nine other state-of-the-art algorithms using the Congress on Evolutionary Computation (CEC2005) and CEC2017 benchmarks. The review offers deep insight into possible future directions and ideas worth investigating regarding new variants of the HHO algorithm and its widespread applications.
12
Zhu C, Zhang Y, Pan X, Chen Q, Fu Q. Improved Harris Hawks Optimization algorithm based on quantum correction and Nelder-Mead simplex method. Math Biosci Eng 2022; 19:7606-7648. [PMID: 35801438 DOI: 10.3934/mbe.2022358] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/15/2023]
Abstract
The Harris Hawks Optimization (HHO) algorithm is an intelligent algorithm that simulates the predation behavior of hawks. It suffers from several shortcomings, such as low calculation accuracy, a tendency to fall into local optima, and difficulty balancing exploration and exploitation. In view of these problems, this paper proposes an improved HHO algorithm named QC-HHO. Firstly, the initial population is generated by the Hénon chaotic map to enhance randomness and ergodicity. Secondly, the quantum correction mechanism is introduced in the local search phase to improve optimization accuracy and population diversity. Thirdly, the Nelder-Mead simplex method is used to improve the search performance and breadth. Fourthly, group communication factors describing the relationship between individuals are taken into consideration. Finally, the energy consumption law is integrated into the renewal process of escape energy factor E and jump distance J to balance exploration and exploitation. The QC-HHO is tested on 10 classical benchmark functions and 30 CEC2014 benchmark functions. The results show that it is superior to the original HHO algorithm and other improved HHO algorithms. In addition, the improved algorithm is applied to gas leakage source localization by wireless sensor networks. The experimental results indicate that the accuracy of the estimated position and gas release rate is excellent, which verifies the feasibility of applying QC-HHO in practice.
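The first step above, seeding the population from a Hénon chaotic map, can be sketched as follows. The classical parameters a = 1.4 and b = 0.3, the starting point (0, 0), the min-max scaling, and the function names are all assumptions for illustration; the abstract does not give the paper's exact construction.

```python
def henon_sequence(length: int, a: float = 1.4, b: float = 0.3) -> list:
    """Return x-values of the Henon map x' = 1 - a*x^2 + y, y' = b*x."""
    x, y = 0.0, 0.0
    values = []
    for _ in range(length):
        # Both right-hand sides use the previous x and y.
        x, y = 1.0 - a * x * x + y, b * x
        values.append(x)
    return values


def chaotic_population(size: int, dim: int, lower: float, upper: float) -> list:
    """Map a Henon sequence onto [lower, upper] to seed a population.

    Requires size * dim >= 2 so that the sequence has a non-zero range.
    """
    raw = henon_sequence(size * dim)
    lo, hi = min(raw), max(raw)
    unit = [(v - lo) / (hi - lo) for v in raw]  # min-max normalise to [0, 1]
    scaled = [lower + u * (upper - lower) for u in unit]
    return [scaled[i * dim:(i + 1) * dim] for i in range(size)]
```

Compared with uniform random sampling, a chaotic sequence is deterministic yet non-repeating, which is the property the abstract refers to as ergodicity.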
Affiliation(s)
- Cheng Zhu
- School of Information Engineering, Tianjin University of Commerce, Tianjin 300134, China
- Yong Zhang
- School of Information Engineering, Tianjin University of Commerce, Tianjin 300134, China
- Xuhua Pan
- School of Information Engineering, Tianjin University of Commerce, Tianjin 300134, China
- Qi Chen
- School of Information Engineering, Tianjin University of Commerce, Tianjin 300134, China
- Qingyu Fu
- School of Information Engineering, Tianjin University of Commerce, Tianjin 300134, China
13
Cao B, Li X, Zhang X, Wang B, Zhang Q, Wei X. Designing Uncorrelated Address Constrain for DNA Storage by DMVO Algorithm. IEEE/ACM Trans Comput Biol Bioinform 2022; 19:866-877. [PMID: 32750895 DOI: 10.1109/tcbb.2020.3011582] [Citation(s) in RCA: 40] [Impact Index Per Article: 13.3] [Indexed: 06/11/2023]
Abstract
At present, huge amounts of data are being produced every second, a situation that will gradually overwhelm current storage technology. DNA is a storage medium that features high storage density and long-term stability and is now considered to be a feasible storage solution. However, errors are easily made during the sequencing and synthesis of DNA. To reduce the error rate, a novel uncorrelated address constraint is reported, and a Damping Multi-Verse Optimizer (DMVO) algorithm is proposed to construct a DNA coding set, which is used as the non-payload. The DMVO algorithm exchanges objects through black/white holes in order to achieve a stable state and adds damping factors as disturbances. Compared with previous work, the coding set obtained by the DMVO algorithm is larger in size and of higher quality. The results of this study reveal that the size of the DNA storage coding set obtained by the DMVO algorithm increased by 4-16 percent, and the variance of the melting temperature decreased by 3-18 percent.
14
Wu J, Zheng Y, Wang B, Zhang Q. Enhancing Physical and Thermodynamic Properties of DNA Storage Sets with End-constraint. IEEE Trans Nanobioscience 2021; 21:184-193. [PMID: 34662278 DOI: 10.1109/tnb.2021.3121278] [Citation(s) in RCA: 19] [Impact Index Per Article: 4.8] [Indexed: 11/09/2022]
Abstract
With the explosion of data, DNA is considered an ideal carrier for storage due to its high storage density. However, low-quality DNA sets hamper the widespread use of DNA storage. This work proposes a new method to design high-quality DNA storage sets. Firstly, random switch and double-weight offspring strategies are introduced in the Double-strategy Black Widow Optimization Algorithm (DBWO). Experimental results on 26 benchmark functions show that the exploration and exploitation abilities of DBWO are greatly improved over previous work. Secondly, DBWO is applied to designing DNA storage sets; compared with previous work, the lower bounds of storage sets are boosted by 9%-37%. Finally, to improve the poor stability of sequences, the End-constraint is proposed for designing DNA storage sets. Measurements of the number of hairpin structures, melting temperature, and minimum free energy show that, with our innovative constraint, DBWO not only constructs larger storage sets but also enhances the physical and thermodynamic properties of DNA storage sets.
15
Xiaoru L, Ling G. Combinatorial constraint coding based on the EORS algorithm in DNA storage. PLoS One 2021; 16:e0255376. [PMID: 34324571 PMCID: PMC8320985 DOI: 10.1371/journal.pone.0255376] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/10/2021] [Accepted: 07/15/2021] [Indexed: 11/19/2022] [Open Access]
Abstract
The development of information technology has produced massive amounts of data, which has brought severe challenges to information storage. Traditional electronic storage media cannot keep up with the ever-increasing demand for data storage, but in its place DNA has emerged as a feasible storage medium with high density, large storage capacity and strong durability. In DNA data storage, many different approaches can be used to encode data into codewords. DNA coding is a key step in DNA storage and can directly affect storage performance and data integrity. However, since errors are prone to occur in DNA synthesis and sequencing, and non-specific hybridization is prone to occur in the solution, how to effectively encode DNA has become an urgent problem to be solved. In this article, we propose a DNA storage coding method based on the equilibrium optimization random search (EORS) algorithm, which meets the Hamming distance, GC content and no-runlength constraints and can reduce the error rate in storage. Simulation experiments have shown that the size of the DNA storage code set constructed by the EORS algorithm that meets the combination constraints has increased by an average of 11% compared with previous work. The increase in the code set means that shorter DNA chains can be used to store more data.
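The three constraints named in this abstract (Hamming distance, GC content, no-runlength) can be checked directly. The sketch below pairs those checks with a plain greedy search as a baseline; it is not the EORS algorithm, and the GC-content target of floor(n/2) is an assumption based on the convention common in this literature.

```python
from itertools import product


def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def gc_balanced(seq: str) -> bool:
    """GC-content constraint: floor(n/2) of the bases are G or C (assumed target)."""
    return sum(base in "GC" for base in seq) == len(seq) // 2


def no_runlength(seq: str) -> bool:
    """No-runlength constraint: no two adjacent bases are identical."""
    return all(x != y for x, y in zip(seq, seq[1:]))


def greedy_code_set(n: int, d: int) -> list:
    """Greedy baseline: scan all length-n words in lexicographic order and
    keep each one that meets the constraints and is at distance >= d
    from every word already kept."""
    codes = []
    for word in map("".join, product("ACGT", repeat=n)):
        if not (gc_balanced(word) and no_runlength(word)):
            continue
        if all(hamming(word, c) >= d for c in codes):
            codes.append(word)
    return codes
```

The size of the set returned for a given (n, d) is a lower bound on the largest possible code set; metaheuristics such as those listed here aim to raise that bound.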
Affiliation(s)
- Li Xiaoru
- Hulunbeier Vocational and Technical College, Hulunbeier, Inner Mongolia, China
- Guo Ling
- Baidu Co., Ltd., Shanghai, China
16
Li X, Wei Z, Wang B, Song T. Stable DNA Sequence Over Close-Ending and Pairing Sequences Constraint. Front Genet 2021; 12:644484. [PMID: 34079580 PMCID: PMC8165483 DOI: 10.3389/fgene.2021.644484] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 12/21/2020] [Accepted: 04/12/2021] [Indexed: 11/15/2022] [Open Access]
Abstract
DNA computing is a new method based on molecular biotechnology to solve complex problems. The design of DNA sequences is a multi-objective optimization problem in DNA computing, whose objective is to obtain optimized sequences that satisfy multiple constraints to improve the quality of the sequences. However, previously optimized DNA sequences reacted with each other, which reduced the number of DNA sequences that could be used for molecular hybridization in the solution and thus reduced the accuracy of DNA computing. In addition, a DNA sequence and its complement follow the principle of complementary pairing, and a sequence with G or C bases at both ends is more stable. To address these problems, the Pairing Sequences Constraint (PSC) and the Close-ending constraint, along with the Improved Chaos Whale (ICW) optimization algorithm, were proposed to construct a DNA sequence set that satisfies the combination of constraints. The ICW optimization algorithm incorporates a new predator-prey strategy and sine and cosine functions under the action of chaos. Compared with other algorithms, among the 23 benchmark functions, the new algorithm obtained the minimum value for one-third of the functions and two-thirds of the current minimum value. The DNA sequences satisfying the constraint combination obtained the minimum of fitness values and had stable and usable structures.
Affiliation(s)
- Xue Li
- The Key Laboratory of Advanced Design and Intelligent Computing, Ministry of Education, School of Software Engineering, Dalian University, Dalian, China
- Ziqi Wei
- School of Software, Tsinghua University, Beijing, China
- Bin Wang
- The Key Laboratory of Advanced Design and Intelligent Computing, Ministry of Education, School of Software Engineering, Dalian University, Dalian, China
- Tao Song
- College of Computer and Communication Engineering, China University of Petroleum, Qingdao, China
17
Zheng Y, Wu J, Wang B. CLGBO: An Algorithm for Constructing Highly Robust Coding Sets for DNA Storage. Front Genet 2021; 12:644945. [PMID: 34017354 PMCID: PMC8129200 DOI: 10.3389/fgene.2021.644945] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 12/22/2020] [Accepted: 04/08/2021] [Indexed: 11/22/2022] [Open Access]
Abstract
In the era of big data, new storage media are urgently needed because the storage capacity for global data cannot meet the exponential growth of information. Deoxyribonucleic acid (DNA) storage, where primer and address sequences play a crucial role, is one of the most promising storage media because of its high density, large capacity and durability. In this study, we describe an enhanced gradient-based optimizer that includes the Cauchy and Levy mutation strategy (CLGBO) to construct DNA coding sets, which are used as primer and address libraries. Our experimental results show that the lower bounds of DNA storage coding sets obtained using the CLGBO algorithm are increased by 4.3–13.5% compared with previous work. The non-adjacent subsequence constraint was introduced to reduce the error rate in the storage process. This helps to resolve the problem that arises when consecutive repetitive subsequences in the sequence cause errors in DNA storage. We made use of the CLGBO algorithm and the non-adjacent subsequence constraint to construct larger and more highly robust coding sets.
Affiliation(s)
- Yanfen Zheng
- The Key Laboratory of Advanced Design and Intelligent Computing, Ministry of Education, School of Software Engineering, Dalian University, Dalian, China
- Jieqiong Wu
- The Key Laboratory of Advanced Design and Intelligent Computing, Ministry of Education, School of Software Engineering, Dalian University, Dalian, China
- Bin Wang
- The Key Laboratory of Advanced Design and Intelligent Computing, Ministry of Education, School of Software Engineering, Dalian University, Dalian, China
18
Alabool HM, Alarabiat D, Abualigah L, Heidari AA. Harris hawks optimization: a comprehensive review of recent variants and applications. Neural Comput Appl 2021. [DOI: 10.1007/s00521-021-05720-5] [Citation(s) in RCA: 38] [Impact Index Per Article: 9.5] [Indexed: 11/28/2022]
19
Xue X, Zhou D, Zhou C. New insights into the existing image encryption algorithms based on DNA coding. PLoS One 2020; 15:e0241184. [PMID: 33095816 PMCID: PMC7584250 DOI: 10.1371/journal.pone.0241184] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Received: 04/21/2020] [Accepted: 10/10/2020] [Indexed: 12/03/2022] [Open Access]
Abstract
Because a DNA nucleotide sequence has the characteristics of large storage capacity, high parallelism, and low energy consumption, DNA cryptography is favored by information security researchers. Image encryption algorithms based on DNA coding have become a research hotspot in the field of image encryption and security. In this study, based on a comprehensive review of the existing studies and their results, we present new insights into the existing image encryption algorithms based on DNA coding. First, the existing algorithms were summarized and classified into five types, depending on the type of DNA coding: DNA fixed coding, DNA dynamic coding, different types of base complement operation, different DNA sequence algebraic operations, and combinations of multiple DNA operations. Second, we analyzed and studied each classification algorithm using simulation and obtained their advantages and disadvantages. Third, the DNA coding mechanisms, DNA algebraic operations, and DNA algebraic combination operations were compared and discussed. Then, a new scheme was proposed by combining the optimal coding mechanism with the optimal DNA coding operation. Finally, we revealed the shortcomings of the existing studies and the future direction for improving image encryption methods based on DNA coding.
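The "DNA fixed coding" and "DNA sequence algebraic operations" categories in this abstract can be illustrated with a small sketch. It enumerates the two-bit-to-base rules in which complementary bit pairs map to Watson-Crick complementary bases (there are eight), and defines an XOR on sequences by way of their binary form. The function names are hypothetical, and this is not the scheme proposed in the paper.

```python
from itertools import permutations

COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def coding_rules() -> list:
    """Rules where 00/11 and 01/10 map to complementary base pairs."""
    rules = []
    for p in permutations("ACGT"):
        if COMPLEMENT[p[0]] == p[3] and COMPLEMENT[p[1]] == p[2]:
            rules.append(dict(zip(("00", "01", "10", "11"), p)))
    return rules


def encode(data: bytes, rule: dict) -> str:
    """Write each byte as four bases under the given rule."""
    bits = "".join(f"{byte:08b}" for byte in data)
    return "".join(rule[bits[i:i + 2]] for i in range(0, len(bits), 2))


def decode(seq: str, rule: dict) -> bytes:
    """Inverse of encode; len(seq) must be a multiple of 4."""
    inverse = {base: pair for pair, base in rule.items()}
    bits = "".join(inverse[base] for base in seq)
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


def dna_xor(a: str, b: str, rule: dict) -> str:
    """XOR two equal-length sequences through their binary forms."""
    mixed = bytes(x ^ y for x, y in zip(decode(a, rule), decode(b, rule)))
    return encode(mixed, rule)
```

Using one fixed rule for a whole image corresponds to fixed coding; choosing the rule per pixel or per block from a key stream is what the abstract calls dynamic coding.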
Affiliation(s)
- Xianglian Xue
- Section of Computer Teaching and Research, Shaanxi University of Chinese Medicine, Xianyang, China
- Laboratory of Network Computer and Security Technology in Shaanxi Province, Xi’an University of Technology, Xi’an, China
- Dongsheng Zhou
- Key Laboratory of Advanced Design and Intelligent Computing, Ministry of Education, Dalian University, Dalian, China
- Changjun Zhou
- College of Mathematics and Computer Science, Zhejiang Normal University, Jinhua, China