[Chimera-users] Writing alignment and RMSD out

Mon Dec 10 15:31:27 PST 2007

On Dec 10, 2007, at 2:54 PM, Eric Pettersen wrote:

> In regards to Match->Align producing a different number of aligned  
> columns than the MatchMaker "core" even with the same cutoff value,  
> that is totally believable.  You will get situations where loop  
> residues happen to cross each other within the cutoff distance but  
> those residues were not in the same column of the MatchMaker  
> alignment [nor should they be really] and therefore could not be in  
> its final "core".

Just wanted to add:

I've found that with more difficult-to-align sequences (worst case is
very low sequence identity and all-beta or all-alpha secondary
structure), there may be incorrect segments in the initial MatchMaker
alignment that are corrected in the alignment from a subsequent
Match->Align step.  The purpose of MatchMaker is to generate a correct
superposition, and this still succeeds in nearly all cases because
the incorrect areas are pruned during fit iteration (only the
correct columns are used to generate the final superposition).
Match->Align will then generate columns for all the superimposed
parts, some of which were not used in the prior fitting step.

Which alignment is better depends on the situation.  For example,  
there could be different structures of the same protein where one  
loop moves a lot.  The MatchMaker alignment will simply align the  
entire identical or nearly identical sequences, whereas Match->Align  
will not align the loops that are poorly superimposed in space.
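
To make the difference concrete, here is a toy sketch in plain Python.
It is not Chimera code and not the actual Match->Align algorithm (which
builds a full sequence alignment); the coordinates, the column list,
and the helper names are all made up for illustration.  It assumes the
two structures are already superimposed and just compares "alignment
columns that fall within the cutoff" against "any residue pair within
the cutoff":

```python
import math

def dist(a, b):
    """Euclidean distance between two 3D points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def column_pairs_within(ref, mob, columns, cutoff):
    """Alignment columns (i, j) whose atoms lie within the cutoff."""
    return [(i, j) for i, j in columns if dist(ref[i], mob[j]) <= cutoff]

def spatial_pairs_within(ref, mob, cutoff):
    """All (i, j) pairs within the cutoff, ignoring the alignment."""
    return [(i, j)
            for i, p in enumerate(ref)
            for j, q in enumerate(mob)
            if dist(p, q) <= cutoff]

# Hypothetical CA coordinates (angstroms) after superposition.
ref = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0)]
mob = [(0.5, 0.0, 0.0), (4.0, 0.5, 0.0), (11.0, 1.0, 0.0), (20.0, 0.0, 0.0)]

# Hypothetical sequence-alignment columns: residue i paired with residue i.
columns = [(0, 0), (1, 1), (2, 2), (3, 3)]
cutoff = 2.0

core = column_pairs_within(ref, mob, columns, cutoff)
spatial = spatial_pairs_within(ref, mob, cutoff)
print(core)     # [(0, 0), (1, 1)]
print(spatial)  # [(0, 0), (1, 1), (3, 2)]
```

The pair (3, 2) is close in space but was never a column of the
sequence-based alignment, so it cannot appear in the "core" even
though a structure-based pairing would pick it up.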

If you are comparing different structural matches, especially
hard-to-align distantly related cases, it may be more appropriate to
use the number of pairs in the Match->Align alignment rather than the
values from MatchMaker. In our paper, for example, we reported the
number of pairs and RMSDs from the Match->Align alignment, not the
MatchMaker one:

Tools for integrated sequence-structure analysis with UCSF Chimera:
E.C. Meng, E.F. Pettersen, G.S. Couch, C.C. Huang, and T.E. Ferrin,
BMC Bioinformatics 7, 339 (2006).
http://www.biomedcentral.com/1471-2105/7/339

The drawback is that getting that RMSD value requires another round  
of fitting, this time on all positions in the Match->Align alignment  
(without iteration).

This may have been more detail than you wanted!
Elaine
  -----
Elaine C. Meng, Ph.D.                          meng at cgl.ucsf.edu
UCSF Computer Graphics Lab and Babbitt Lab
Department of Pharmaceutical Chemistry
University of California, San Francisco
                      http://www.cgl.ucsf.edu/home/meng/index.html