Optimal DALI protein structure alignment
Abstract
We present a mathematical model and an exact algorithm for protein structure alignment under DALI scoring, an NP-hard problem. DALI scoring compares the inter-residue distance matrices of two proteins and is the scoring model of the widely used heuristic DALI program. Our model and algorithm extend an integer linear programming approach previously used for the related contact map overlap problem. To this end, we introduce a novel type of constraint that handles negative structure scores and relax it in a Lagrangian fashion. We also review options that allow fewer pairs of inter-residue distances to be considered explicitly, because their large number makes it difficult to optimize DALI scoring to provable optimality. We use our exact algorithm DALIX to compute many provably score-optimal DALI alignments for the first time, using four data sets of varying structural similarity. Further, our exact DALIX alignments make it possible, for the very first time, to qualitatively benchmark the heuristic DALI program in sound mathematical terms. The results indicate that DALI often computes optimal or close to optimal alignments, but also that when aligning small proteins it tends to fail to generate
Origin: Files produced by the author(s)
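The abstract describes DALI scoring as a comparison of inter-residue distance matrices but does not give the formula. As a rough illustration, the sketch below uses the elastic similarity score published for DALI by Holm and Sander (1993), with their threshold of 0.2 and a Gaussian envelope of 20 Å. These constants, and the helper names `distance_matrix` and `dali_score`, are assumptions for illustration and are not taken from this record. The sketch only evaluates a given alignment; it does not implement the exact DALIX optimization.

```python
import math

THETA = 0.2   # similarity threshold (assumed: value from Holm & Sander 1993)
ALPHA = 20.0  # envelope decay length in angstroms (assumed, same source)


def distance_matrix(coords):
    """Pairwise Euclidean distances between residue coordinates (e.g. C-alpha atoms)."""
    return [[math.dist(p, q) for q in coords] for p in coords]


def dali_score(dist_a, dist_b, alignment):
    """Score an alignment given as (i, k) pairs: residue i of A matched to residue k of B.

    Every ordered pair of aligned residue pairs contributes one term, so the
    number of terms grows quadratically with the alignment length.
    """
    score = 0.0
    for i, k in alignment:
        for j, l in alignment:
            if i == j:
                # A residue pair compared with itself contributes the threshold.
                score += THETA
                continue
            d_a, d_b = dist_a[i][j], dist_b[k][l]
            mean = 0.5 * (d_a + d_b)
            # Relative deviation above THETA makes the term negative;
            # the envelope down-weights long-range distance pairs.
            score += (THETA - abs(d_a - d_b) / mean) * math.exp(-((mean / ALPHA) ** 2))
    return score


if __name__ == "__main__":
    a = distance_matrix([(0, 0, 0), (3.8, 0, 0), (7.6, 0, 0)])
    b = distance_matrix([(0, 0, 0), (3.8, 0, 0), (3.8, 10, 0)])
    identity = [(0, 0), (1, 1), (2, 2)]
    print(dali_score(a, a, identity))  # identical structures: all terms positive
    print(dali_score(a, b, identity))  # distorted structure: some terms negative
```

Individual terms can be negative whenever two matched distances differ by more than 20% of their mean, which is the kind of negative score that the constraint mentioned in the abstract has to handle. The double loop over aligned pairs also shows why the number of distance pairs becomes large.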