[Chimera-users] Parameters choice for match->align and quality assessment

Elaine Meng meng at cgl.ucsf.edu
Tue Jul 6 09:56:12 PDT 2010


Dear D,
First let me clarify which tools in Chimera you might be using:

(A) MatchMaker will superimpose the structures.  Also, if you choose its option "After superposition, compute structure-based multiple sequence alignment", it will call the other tool, Match -> Align.
<http://www.cgl.ucsf.edu/chimera/docs/ContributedSoftware/matchmaker/matchmaker.html>
There is also a command version of matchmaker:
<http://www.cgl.ucsf.edu/chimera/docs/UsersGuide/midas/mmaker.html>

(B) Match -> Align will not superimpose the structures when they are far apart, but it will create a sequence alignment consistent with the 3D superposition that you give it.  It can also improve the input 3D superposition by an iterative process, but you need to start with the proteins at least roughly superimposed already.  You could make the starting superposition any way you want, but the convenient way in Chimera is with MatchMaker.
<http://www.cgl.ucsf.edu/chimera/docs/ContributedSoftware/matchalign/matchalign.html>

It sounds like you might be using both of those tools.  If your proteins are closely related, they are probably "easy" to superimpose, and you could just use the MatchMaker default parameter values to start.  We have chosen these defaults to work in a broad range of situations.  Only the "hard" cases might require playing around with the parameters.

You would only use Match -> Align if (A) you wanted a multiple sequence alignment that goes with the 3D superposition, (B) you wanted to improve the 3D superposition using the multiple sequence alignment, or (C) you wanted to calculate a score for your structural superposition.  For purpose A, the proper cutoff (your question #1) depends on what you want this sequence alignment to mean, since Match -> Align will obey your specifications.  If you want only the residues with CA-CA distance <4 angstroms to be in the same column, then you would use a cutoff of 4 angstroms.  If there are floppy loops that are farther apart but you think they are evolutionarily equivalent and should be allowed to align in the sequence alignment, you would want to use a larger cutoff.  Otherwise (purposes B and C), you could just try using the default settings everywhere (your questions #1 and #2).
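
To make the cutoff idea concrete, here is a small Python sketch of the distance criterion only.  It is not the Match -> Align implementation, which does much more than this; the function names (ca_distance, pairs_within_cutoff) and the coordinates are made up for illustration, and treating a distance exactly equal to the cutoff as "within" is an assumption.

```python
import math

def ca_distance(a, b):
    """Euclidean distance between two CA positions, each an (x, y, z) tuple in angstroms."""
    return math.dist(a, b)

def pairs_within_cutoff(ca_pairs, cutoff=4.0):
    """Return the indices of residue pairs whose CA-CA distance is within the cutoff.

    Assumption: a distance exactly equal to the cutoff counts as within it.
    """
    return [i for i, (a, b) in enumerate(ca_pairs) if ca_distance(a, b) <= cutoff]

# Hypothetical CA coordinates for three residue pairs after superposition.
ca_pairs = [
    ((0.0, 0.0, 0.0), (1.0, 1.0, 1.0)),    # about 1.73 angstroms apart (well-matched core)
    ((10.0, 0.0, 0.0), (10.0, 3.0, 4.0)),  # 5.0 angstroms apart (floppy loop)
    ((5.0, 5.0, 5.0), (5.0, 5.0, 8.5)),    # 3.5 angstroms apart
]

print(pairs_within_cutoff(ca_pairs, cutoff=4.0))  # [0, 2]
print(pairs_within_cutoff(ca_pairs, cutoff=6.0))  # [0, 1, 2]
```

With the 4-angstrom cutoff the loop pair is left out, and with the larger cutoff it becomes eligible to share a column, which is the choice described above.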

However, this connects to your question #3: how do you evaluate the results so you know which parameter values are best?  Often, people just evaluate the superposition visually.  There is no single measure that everybody agrees is the right way to evaluate the superposition.  In fact, dozens and maybe even hundreds of different measures have been published!  They usually combine RMSD and number of aligned positions in some way.  However, we have recently added the calculation of two such measures, SDM and Q-score, to Match -> Align.  These are described in more detail along with literature references in the Match -> Align documentation:
<http://www.cgl.ucsf.edu/chimera/docs/ContributedSoftware/matchalign/matchalign.html#measures>
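
As a rough illustration of why such measures combine the two quantities, the Python sketch below computes the number of aligned positions and the RMSD over them for a given cutoff.  It does not compute SDM or Q-score (see the documentation above for those), it is not Chimera code, and the function names and coordinates are hypothetical.

```python
import math

def rmsd(ca_pairs):
    """Root-mean-square deviation over paired CA positions (angstroms)."""
    if not ca_pairs:
        raise ValueError("need at least one pair")
    total = sum(math.dist(a, b) ** 2 for a, b in ca_pairs)
    return math.sqrt(total / len(ca_pairs))

def summarize(ca_pairs, cutoff):
    """Return (number of pairs within the cutoff, RMSD over those pairs).

    The RMSD is None when no pair is within the cutoff.
    """
    aligned = [(a, b) for a, b in ca_pairs if math.dist(a, b) <= cutoff]
    return len(aligned), (rmsd(aligned) if aligned else None)

# Hypothetical superimposed CA pairs that are 1, 2, and 5 angstroms apart.
ca_pairs = [
    ((0.0, 0.0, 0.0), (1.0, 0.0, 0.0)),
    ((0.0, 0.0, 0.0), (0.0, 2.0, 0.0)),
    ((0.0, 0.0, 0.0), (0.0, 3.0, 4.0)),
]

for cutoff in (4.0, 6.0):
    n_aligned, value = summarize(ca_pairs, cutoff)
    print(cutoff, n_aligned, round(value, 2))
# 4.0 2 1.58
# 6.0 3 3.16
```

The larger cutoff aligns more positions but at a higher RMSD, so neither number alone tells you which result is better.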

Those scores will be reported in the Reply Log (under Favorites in the menu), but you need to have a fairly new version of Chimera (version 1.5, so a daily build).  If you are concerned that your 3D superposition results might not be optimal, you can try adjusting parameters and seeing whether these scores improve.  However, it sounds like your proteins are relatively easy to superimpose, and my guess is that a wide range of settings will give you about the same results.  I just wanted you to be fully informed so that in your hands, the tools will be powerful and useful.

Finally, this Chimera tutorial includes use of MatchMaker and Match -> Align:
<http://www.cgl.ucsf.edu/chimera/docs/UsersGuide/tutorials/alignments.html>

I hope this helps,
Elaine
----------
Elaine C. Meng, Ph.D. 
UCSF Computer Graphics Lab (Chimera team) and Babbitt Lab
Department of Pharmaceutical Chemistry
University of California, San Francisco

On Jul 5, 2010, at 9:41 AM, compchem compchems wrote:

> Dear Chimera developers, many thanks for the possibility of working with your very useful program.
> 
> I am new to structural alignment studies and have some questions about the parameters in match->align.
> 
> My set consists of a few closely related proteins with the same function and conserved spatial structure, but with some divergence in amino acid sequence apart from the conserved functional residues.
> 
> First, what does "Residue aligned in column if within cutoff of" influence, and what would be the preferable choice in my case?
> 
> Second, "Superimpose full column": how can I decide what to select for my set, or what experimentation could help me?
> 
> Third, how can I evaluate the quality or correctness of the obtained alignment (perhaps there is a common procedure)?
> 
> I am sorry for the long letter, and thank you for your attention to my problem.
> 
> D. Yegorov.



