External Memory Best-First Search for Multiple Sequence Alignment

Authors

  • Matthew Hatem University of New Hampshire
  • Wheeler Ruml University of New Hampshire

DOI:

https://doi.org/10.1609/aaai.v27i1.8626

Keywords:

heuristic search, multiple sequence alignment, distributed, parallel, external memory

Abstract

Multiple sequence alignment (MSA) is a central problem in computational biology. It is well known that MSA can be formulated as a shortest path problem and solved using heuristic search, but the memory requirement of A* makes it impractical for all but the smallest problems. Partial Expansion A* (PEA*) reduces the space complexity of A* by generating only the most promising successor nodes. However, even PEA* exhausts available memory on many problems. Another alternative is Iterative Deepening Dynamic Programming, which uses an uninformed search order but stores only the nodes along the search frontier. However, it too cannot scale to the largest problems. In this paper, we propose storing nodes on cheap and plentiful secondary storage. We present a new general-purpose algorithm, Parallel External PEA* (\xppea), that combines PEA* with Delayed Duplicate Detection to take advantage of external memory and multiple processors to solve large MSA problems. In our experiments, \xppea\ is the first algorithm capable of solving the entire Reference Set 1 of the standard BAliBASE benchmark using a biologically accurate cost function. This work suggests that external best-first search can effectively use heuristic information to surpass methods that rely on uninformed search orders.

Downloads

Published

2013-06-30

How to Cite

Hatem, M., & Ruml, W. (2013). External Memory Best-First Search for Multiple Sequence Alignment. Proceedings of the AAAI Conference on Artificial Intelligence, 27(1), 409-416. https://doi.org/10.1609/aaai.v27i1.8626