Helsinki Algorithms Seminar: "Hardness of Covering Alignment: Phase Transition in Post-Sequence Genomics" Veli Mäkinen, UH
Weekly meeting of researchers in the Helsinki area interested in the art of algorithms and algorithm design
University of Helsinki
Hardness of Covering Alignment: Phase Transition in Post-Sequence Genomics
Covering alignment problems arise from recent developments in genomics; so called pan-genome graphs are replacing reference genomes, and advances in haplotyping enable full content of diploid genomes to be used as basis of sequence analysis. In this paper, we show that the computational complexity will change for natural extensions of alignments to pan-genome representations and to diploid genomes. More broadly, our approach can also be seen as a minimal extension of sequence alignment to labelled directed acyclic graphs (labeled DAGs). Namely, we show that finding a *covering alignment* of two labeled DAGs is NP-hard even on binary alphabets. A covering alignment asks for two paths R1 (red) and G1 (green) in DAG D1 and two paths R2 (red) and G2 (green) in DAG D2 that cover the nodes of the graphs and maximize the sum of the global alignment scores: as(spell(R1),spell(R2))+as(spell(G1),spell(G2)), where spell(P) is the concatenation of labels on the path P, and sp(A,B) is the global alignment score of strings A and B. Pair-wise alignment of haplotype sequences forming a diploid chromosome can be converted to a two-path coverable labelled DAG, and then the covering alignment models the similarity of two diploids over arbitrary recombinations. We also give a reduction to the other direction, to show that such a recombination-oblivious diploid alignment is NP-hard on alphabets of size 3.
This is joint work with Romeo Rizzi, Massimo Cairo, Daniel Valenzuela, and Alexandru Tomescu.
Helsinki Algorithms Seminar is a weekly meeting of researchers in the Helsinki area interested in the art of algorithms and algorithm design, broadly interpreted to cover both theoretical ideas and algorithm engineering on concrete computing platforms. In most cases we have a presentation prepared for each meeting to communicate an idea, a recent result, work-in-progress, or demo, but this should not be at the expense of discussion and simply having fun with algorithms.
Our affiliations are with Aalto University and the University of Helsinki, and accordingly our activities alternate between the Otaniemi Campus of Aalto University and the Kumpula Campus of University of Helsinki, catalyzed by the Helsinki Institute for Information Technology HIIT, under the Algorithmic Data Analysis (ADA) programme.