Program Systems: Theory and Applications, 2015, Volume 6, Issue 1, Pages 189–197 (Mi ps164)  

This article is cited in 3 scientific papers (total in 3 papers)

Mathematical Foundations of Programming

A model and algorithm for sequence alignment

S. V. Znamenskij

Program Systems Institute of RAS

Abstract: The change detection problem consists in identifying common and differing strings, and its solution is usually not unique. The best alignment is canonically identified by finding a longest common subsequence (LCS), an approach widely used for various purposes. However, many recent version control systems prefer alternative heuristic algorithms, which are not only faster but also usually produce a better alignment than an LCS does.
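For orientation, the canonical LCS approach mentioned above can be sketched with the textbook quadratic dynamic program. This is a minimal illustration, not the algorithm proposed in the paper; the function name `lcs` and the sample strings are assumptions made for this example.

```python
def lcs(a: str, b: str) -> str:
    """Return one longest common subsequence of a and b (textbook DP)."""
    n, m = len(a), len(b)
    # table[i][j] holds the LCS length of the suffixes a[i:] and b[j:].
    table = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n - 1, -1, -1):
        for j in range(m - 1, -1, -1):
            if a[i] == b[j]:
                table[i][j] = table[i + 1][j + 1] + 1
            else:
                table[i][j] = max(table[i + 1][j], table[i][j + 1])
    # Walk the table from the top-left corner to recover one optimal answer.
    out = []
    i = j = 0
    while i < n and j < m:
        if a[i] == b[j]:
            out.append(a[i])
            i += 1
            j += 1
        elif table[i + 1][j] >= table[i][j + 1]:
            i += 1
        else:
            j += 1
    return "".join(out)


print(lcs("AGGTAB", "GXTXAYB"))  # prints GTAB
```

The tie-breaking rule in the walk (`>=`) picks one of possibly several optimal subsequences, which reflects the non-uniqueness noted in the abstract.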
Two basic shortcomings of known alignment algorithms are outlined in the paper:
  • even when the length of the longest common substring is close to that of the LCS, the latter may consist of a great number of short uninformative substrings;
  • known alternative algorithms start with identifying the most informative common string, which sometimes excludes from consideration a common subsequence containing arbitrarily many aligned substrings of similar quality.
The sequence alignment problem is considered to be an abstract model for change detection in collaborative text editing, designed to minimize the probability of a merge conflict. A new cost function is defined as the probability of overlap between the detected changes and a random string. Optimizing this function avoids both shortcomings mentioned above. A simple cubic-time algorithm is proposed.
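The second shortcoming listed above can be illustrated with a small sketch. It uses Python's `difflib.SequenceMatcher`, which anchors on the longest contiguous matching block and then recurses to the left and right of it, as a stand-in for heuristics that start from the most informative common string. It is not one of the version control algorithms discussed in the paper, and the input strings and helper names are hypothetical.

```python
from difflib import SequenceMatcher


def greedy_blocks(a: str, b: str) -> list[tuple[int, int, str]]:
    """Matched blocks found by anchoring on the longest common block first."""
    matcher = SequenceMatcher(None, a, b, autojunk=False)
    return [
        (m.a, m.b, a[m.a:m.a + m.size])
        for m in matcher.get_matching_blocks()
        if m.size  # skip the zero-length terminating sentinel
    ]


def lcs_length(a: str, b: str) -> int:
    """Length of a longest common subsequence (row-by-row DP)."""
    prev = [0] * (len(b) + 1)
    for ch in a:
        cur = [0]
        for j, bj in enumerate(b, 1):
            cur.append(prev[j - 1] + 1 if ch == bj else max(prev[j], cur[j - 1]))
        prev = cur
    return prev[-1]


# The block "xxxx" is the longest common substring, but it crosses
# the two shorter blocks "abc" and "def", which together match more.
a = "xxxxabc-def"
b = "abc+defxxxx"

print(greedy_blocks(a, b))  # [(0, 7, 'xxxx')]
print(lcs_length(a, b))     # 6
```

Here the greedy anchor aligns 4 characters and leaves nothing else to match, while the common subsequence "abcdef" aligns 6 characters in two blocks of similar quality.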

Key words and phrases: similarity of strings, sequence alignment, software development, diff, LCS, edit distance, Levenshtein metric.


Document Type: Article
UDC: 004.416
Received: 14.12.2014
Accepted: 28.01.2015
Language: English

Citation: S. V. Znamenskij, “A model and algorithm for sequence alignment”, Program Systems: Theory and Applications, 6:1 (2015), 189–197

Citation in format AMSBIB
\Bibitem{Zna15}
\by S.~V.~Znamenskij
\paper A model and algorithm for sequence alignment
\jour Program Systems: Theory and Applications
\yr 2015
\vol 6
\issue 1
\pages 189--197
\mathnet{http://mi.mathnet.ru/ps164}


Linking options:
  • http://mi.mathnet.ru/eng/ps164
  • http://mi.mathnet.ru/eng/ps/v6/i1/p189


    This publication is cited in the following articles:
    1. Sergej V. Znamenskij, “Simple essential improvements to the ROUGE-W algorithm”, Zhurn. SFU. Ser. Matem. i fiz., 8:4 (2015), 497–501
    2. S. V. Znamenskij, “Stable assessment of the quality of similarity algorithms of character strings and their normalizations”, Programmnye sistemy: teoriya i prilozheniya, 9:4 (2018), 561–578
    3. S. V. Znamenskii, “Ustoichivaya otsenka kachestva algoritmov skhodstva simvolnykh strok i ikh normalizatsii”, Programmnye sistemy: teoriya i prilozheniya, 9:4 (2018), 579–596