[Chimera-users] Text statistics from Matchmaker
Elaine Meng
meng at cgl.ucsf.edu
Mon Aug 4 15:38:59 PDT 2008
Hi Jim,
Thanks for your kind words - we're glad MatchMaker has been useful!!
Now to your questions... I am not sure exactly what you want...
One issue:
When you say RMSD, do you mean the RMSD of the overall pairwise fit
consisting of many CA-CA pairs, or do you want a measure of the
structural variability within each individual column of the
alignment? The former information is sent to the Reply Log, while the
latter is available as the RMSD header, which can be shown as a
histogram above the sequences in the alignment. In the most recent
versions of Chimera (newer than the July 9 production release), the
RMSD header is automatically shown in alignments from MatchMaker; in
the production release you can just turn it on using the Headers menu
in the sequence alignment window. In the pairwise case, this header
simply shows the CA-CA distance between the two residues associated
with a column in the alignment.
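
To make the distinction concrete, here is a minimal standalone Python sketch (not Chimera's own code) of the two quantities. It assumes you already have paired CA coordinates from two superimposed structures; the coordinates and function names are invented for illustration.

```python
import math

def pair_distances(coords_a, coords_b):
    """CA-CA distance for each aligned pair (one value per alignment column)."""
    return [
        math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))
        for a, b in zip(coords_a, coords_b)
    ]

def overall_rmsd(distances):
    """Single RMSD over all pairs: root of the mean squared distance."""
    return math.sqrt(sum(d * d for d in distances) / len(distances))

# Hypothetical, already-superimposed CA coordinates (angstroms).
ref_ca = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
alt_ca = [(0.0, 0.0, 1.0), (3.8, 0.0, 0.0), (7.6, 2.0, 0.0)]

per_column = pair_distances(ref_ca, alt_ca)  # what the pairwise RMSD header shows
rmsd = overall_rmsd(per_column)              # the kind of value sent to the Reply Log
```

The per-column list tells you where the structures differ, while the single RMSD only tells you how much they differ overall.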
You can save header values to a file. However, simply using the
values within Chimera can be more powerful. As header values are
assigned as attributes of the associated residues, you can just select
the residues with mavRMSD attribute values (see Select... By Attribute
Value in main Chimera menu) above or below some number and apply
various actions to them, including writing them out to a list. You
can use "Structure... Match" in the sequence alignment menu to
superimpose the structures with or without fit iteration (without
iteration uses all columns regardless of any cutoff). You can save
the alignment itself (File... Save As in alignment window menu) as
well as a text file of which structure residue is associated with each
position in the alignment (File... Save Association Info).
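
If you prefer to post-process values outside Chimera, the "above or below some number" selection can be sketched in plain Python as below. This is only an illustration: the residue labels, the values, and the in-memory layout are hypothetical, and the format of a saved header file is not assumed here.

```python
def residues_above(values, cutoff):
    """Keep (label, value) pairs whose value exceeds the cutoff.

    Entries with no value (None), e.g. gap columns, are skipped.
    """
    return [(label, v) for label, v in values if v is not None and v > cutoff]

# Hypothetical per-residue values standing in for mavRMSD attribute values.
column_values = [
    ("ALA 10.A", 0.4),
    ("GLY 11.A", 2.7),
    ("SER 12.A", None),
    ("LYS 13.A", 1.9),
]

flagged = residues_above(column_values, 1.5)

# One tab-separated line per flagged residue, ready to write to a text file.
lines = ["%s\t%.2f" % (label, v) for label, v in flagged]
```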
A second issue:
Depending on how easy the sequences are to align (their percent
identity), you may want to go through a cycle of using Match->Align to
generate a second sequence alignment from the MatchMaker superposition
instead of using the initial sequence alignment from MatchMaker. To
get that second alignment after using MatchMaker, you would start
Match->Align (under Tools... Structure Comparison), set a distance
cutoff, and press OK.
Here is a little more detail on that issue:
"The primary purpose of MatchMaker is to superimpose related
structures; the sequence alignments can be considered a by-product.
Successful superposition only requires a partially correct sequence
alignment, as incorrect portions tend to be omitted during fit
iteration. If the sequences are easily alignable (significantly
similar), the MatchMaker sequence alignment is likely to be correct
from beginning to end, but when they are more distantly related, parts
of the sequence alignment may be incorrect even when the resulting
iterated match looks very good. If the goal is to obtain not just a
structural superposition but also an alignment of dissimilar
sequences, Match -> Align is recommended for generating a structurally
verified sequence alignment after the structures have been
superimposed. Furthermore, matching the structures using this
structurally verified sequence alignment will provide better RMSD and
number-of-pairs values for describing structural similarity when the
sequences are divergent (because more columns are aligned correctly)
while having little effect on the superposition."
It sounds like your sequences are so similar that this issue may not
come into play. However, a further advantage of adding a Match->Align
step is that it can make a multiple sequence alignment (beyond pairwise)
given multiple superimposed structures. Unfortunately, 100 structures
at a time would cause a combinatorial explosion. I have used it to
make a sequence alignment of <10 superimposed structures.
If these explanations and tips do not give what you are looking for,
or interactive use is impractical given the number of runs you wish to
perform, it should be python-scriptable, but we would need to know the
series of operations you had in mind and precisely which observables
you wanted to write.
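
As a rough idea of what the scripted version might start from, here is a sketch that only builds the command strings for matching each structure onto a single reference. It assumes that the structures are open as models #0 through #N-1 with the reference as #0, and that the matchmaker command ("mmaker") is available in your version; the function name is made up for illustration.

```python
def matchmaker_commands(n_models, ref_id=0):
    """Build one 'mmaker' command per non-reference model ID."""
    return [
        "mmaker #%d #%d" % (ref_id, model_id)
        for model_id in range(n_models)
        if model_id != ref_id
    ]

# Four open models: reference #0 plus three structures to match onto it.
commands = matchmaker_commands(4)

# Inside Chimera, each string could then be handed to chimera.runCommand();
# that call is left out here so the sketch also runs outside Chimera.
```

Writing out the residue pairings and per-pair distances for each run would be a further step that depends on exactly which values you want.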
I hope this helps,
Elaine
-----
Elaine C. Meng, Ph.D. meng at cgl.ucsf.edu
UCSF Computer Graphics Lab (Chimera team) and Babbitt Lab
Department of Pharmaceutical Chemistry
University of California, San Francisco
http://www.cgl.ucsf.edu/home/meng/index.html
On Aug 4, 2008, at 12:32 PM, nettles wrote:
> I have found MatchMaker in Chimera an incredibly useful alignment
> tool. Thank you to the UCSF team!
>
> Now, I want to do a multiple alignment of many structures (~100)
> off a single reference structure, and output a text record of the
> residue pairings involved in the individual alignments along with
> their RMSD.
>
> In this particular case, I'll be looking at only small sequence
> variations of the same protein (with different drugs and cofactors).
> Accordingly, I would also like to output the RMSD of residue pairs
> beyond the alignment cut for relatively easy identification of
> regions having maximal variations.
>
> Advice is appreciated,
> With regard,
> Jim
> ________________________________________________________
>
> James Nettles, Ph. D. Assistant Professor
> Department of Pediatrics, Emory University School of Medicine
> ________________________________________________________
>