In Partial Fulfillment of the Requirements for the Degree of
Master of Science
Will defend his thesis
Biological sequences such as DNA, RNA, and protein are frequently analyzed in bioinformatics. Multiple sequence alignments (MSAs) help identify similarities among sequences and reveal their functional, structural, or evolutionary relationships. MSA analysis often relies on visualization techniques, which help researchers better understand, evaluate, and learn from an alignment.
In this thesis, we introduce Mavis, a new approach to coloring MSAs that highlights their quality and structure. Instead of using a predefined color scheme based on residue types, we design a new algorithm that dynamically generates a color for each residue based on a user-determined similarity score between that residue and others. The tool colors an MSA so that a well-aligned region appears as a solid-color block, while a poorly aligned one appears as a mosaic of various colors. The alignment quality and internal structure are thus clearly displayed. Mavis is ready to use by biologists without advanced computer science skills and delivers the visualization of a typical MSA in minutes.
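
The idea of similarity-driven coloring can be illustrated with a minimal sketch. This is not the Mavis algorithm, which the abstract does not specify; it is a simplified stand-in under stated assumptions. The function names (`identity_score`, `color_column`, `color_alignment`), the use of a per-column consensus residue as the comparison target, and the hue mapping are all hypothetical choices made for illustration. A residue that matches the consensus keeps a shared base hue, and a dissimilar residue is shifted to a residue-specific hue, so conserved columns come out uniform and divergent columns come out multicolored.

```python
import colorsys
from collections import Counter

ALPHABET = "ACDEFGHIKLMNPQRSTVWY"  # 20 standard amino acids (assumed input alphabet)
GAP_COLOR = (1.0, 1.0, 1.0)        # gaps are drawn white in this sketch


def identity_score(a, b):
    """Toy similarity score in [0, 1]; a substitution-matrix score could be swapped in."""
    return 1.0 if a == b else 0.0


def color_column(column, score=identity_score, base_hue=0.6):
    """Return one RGB tuple per residue of a single alignment column."""
    residues = [r for r in column if r != "-"]
    if not residues:
        return [GAP_COLOR] * len(column)
    # Hypothetical choice: compare each residue with the column's most common residue.
    consensus = Counter(residues).most_common(1)[0][0]
    colors = []
    for r in column:
        if r == "-":
            colors.append(GAP_COLOR)
            continue
        dissimilarity = 1.0 - score(r, consensus)
        # Residue-specific shift strictly between 0 and 1, so distinct
        # dissimilar residues land on distinct hues.
        shift = (ALPHABET.index(r) + 1) / (len(ALPHABET) + 1) if r in ALPHABET else 0.5
        hue = (base_hue + dissimilarity * shift) % 1.0
        colors.append(colorsys.hls_to_rgb(hue, 0.5, 0.8))
    return colors


def color_alignment(rows, score=identity_score):
    """Color an alignment given as equal-length strings; result is indexed [row][column]."""
    per_column = [color_column(col, score) for col in zip(*rows)]
    return [list(row) for row in zip(*per_column)]


rows = ["AAC", "AAD", "AAE"]
colors = color_alignment(rows)
```

In this toy alignment the first two columns are fully conserved and receive a single color each, while the third column contains three different residues and receives three different colors. Replacing `identity_score` with a graded score would produce intermediate hues for partially similar residues.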