Designing Biological Sequences without Prior Knowledge Using Evolutionary Reinforcement Learning

Xi Zeng; Xiaotian Hao; Hongyao Tang; Zhentao Tang; Shaoqing Jiao; Dazhi Lu; Jiajie Peng

doi:10.1609/aaai.v38i1.27792

Authors

Xi Zeng School of Computer Science, Northwestern Polytechnical University
Xiaotian Hao College of Intelligence and Computing, Tianjin University
Hongyao Tang College of Intelligence and Computing, Tianjin University
Zhentao Tang Noah’s Ark Lab, Huawei
Shaoqing Jiao School of Computer Science, Northwestern Polytechnical University
Dazhi Lu School of Computer Science, Northwestern Polytechnical University
Jiajie Peng School of Computer Science, Northwestern Polytechnical University; Key Laboratory of Big Data Storage and Management, Northwestern Polytechnical University, Ministry of Industry and Information Technology; School of Computer Science, Research and Development Institute of Northwestern Polytechnical University in Shenzhen

DOI:

https://doi.org/10.1609/aaai.v38i1.27792

Keywords:

APP: Natural Sciences

Abstract

Designing novel biological sequences with desired properties is a significant challenge in biological science because the search space is extremely large. The traditional design process usually involves multiple rounds of costly wet lab evaluations. To reduce the need for expensive wet lab experiments, machine learning methods are used to aid in designing biological sequences. However, the limited availability of biological sequences with known properties hinders the training of machine learning models, significantly restricting their applicability and performance. To fill this gap, we present ERLBioSeq, an Evolutionary Reinforcement Learning algorithm for BIOlogical SEQuence design. ERLBioSeq leverages the capability of reinforcement learning to learn without prior knowledge and the potential of evolutionary algorithms to enhance the exploration of reinforcement learning in the large search space of biological sequences. Additionally, to make the design process more efficient, we developed a predictor for sequence screening that incorporates both local and global sequence information. We evaluated the proposed method on three main types of biological sequence design tasks: DNA, RNA, and protein design. The results demonstrate that the proposed method achieves significant improvement over existing state-of-the-art methods.
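
The general idea of searching from scratch while screening candidates with a cheap predictor can be illustrated with a toy example. The sketch below is not ERLBioSeq: it omits the reinforcement learning policy entirely and shows only a plain mutation-based evolutionary loop with surrogate screening. Every name in it (`oracle`, `surrogate`, `mutate`, `design`), the GC-fraction objective, and the nearest-neighbour predictor are assumptions made purely for illustration.

```python
import random

ALPHABET = "ACGT"  # DNA alphabet; RNA or protein design would use another one


def oracle(seq):
    """Hypothetical stand-in for a costly wet-lab measurement: GC fraction."""
    return sum(base in "GC" for base in seq) / len(seq)


def surrogate(seq, labelled):
    """Hypothetical cheap predictor: score of the nearest measured sequence."""
    nearest = min(labelled, key=lambda s: sum(a != b for a, b in zip(s, seq)))
    return labelled[nearest]


def mutate(seq, rng):
    """Replace one randomly chosen position with a random letter."""
    i = rng.randrange(len(seq))
    return seq[:i] + rng.choice(ALPHABET) + seq[i + 1:]


def design(length=20, rounds=10, pool=50, batch=5, seed=0):
    rng = random.Random(seed)
    # Start from random sequences only: no prior labelled data is assumed.
    labelled = {}
    for _ in range(batch):
        seq = "".join(rng.choice(ALPHABET) for _ in range(length))
        labelled[seq] = oracle(seq)
    history = [max(labelled.values())]
    for _ in range(rounds):
        # Evolutionary step: mutate the best sequences measured so far.
        parents = sorted(labelled, key=labelled.get, reverse=True)[:batch]
        candidates = {mutate(rng.choice(parents), rng) for _ in range(pool)}
        candidates.difference_update(labelled)  # skip already-measured ones
        # Screening step: rank with the cheap predictor, then spend the
        # expensive oracle calls only on the top few candidates.
        ranked = sorted(sorted(candidates),
                        key=lambda s: surrogate(s, labelled), reverse=True)
        for seq in ranked[:batch]:
            labelled[seq] = oracle(seq)
        history.append(max(labelled.values()))
    return labelled, history


if __name__ == "__main__":
    _, history = design()
    print("best score per round:", history)
```

In this toy setup the expensive evaluation is called at most `batch` times per round, while the surrogate ranks the whole candidate pool, which is the role the abstract assigns to the screening predictor.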
