[Chimera-users] Matchmaker 'restrict to selection' not working as expected.

Mon Nov 23 11:38:09 PST 2015

Hi Oliver,
	The domain restriction capability was added well after most of MatchMaker had been implemented, and I might have done things differently if it had been part of the initial design (or I might not have).  The way it works is that it doesn’t completely ignore the non-selected part of the chain; it just pretends that there is no structure associated with those parts and that their sequence composition (but not their length) is unknown.  Therefore the sequence alignment isn’t sped up by the factor that you would anticipate given the size of the domain vs. the whole sequence.  There is some upside though.  Let’s say you have a chain of two identical domains connected by a hinge, and you have two conformations of that chain.  If you select the second domain in one of the structures and turn on selection restriction, MatchMaker will always match it to the correct domain in the other structure, whereas if it completely ignored everything outside the restriction it would be a 50/50 shot as to whether it matched against the correct domain.
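	As a rough illustration of why the restriction does not shrink the alignment problem, here is a minimal Python sketch.  It is not MatchMaker code: the function names, the "X" placeholder for residues of unknown composition, and the made-up 5000-residue sequence are all assumptions chosen for the example.  It only shows that masking residues while keeping the length leaves the dynamic-programming table the same size as for the full chain.

```python
def mask_outside_selection(seq, start, end, unknown="X"):
    """Keep residues in [start, end); replace the rest with a placeholder.

    Hypothetical helper: the length of the sequence is preserved, only the
    composition outside the selection is hidden.
    """
    return unknown * start + seq[start:end] + unknown * (len(seq) - end)


def dp_cells(len_a, len_b):
    """Number of cells a Needleman-Wunsch table needs for two sequences."""
    return (len_a + 1) * (len_b + 1)


# Made-up 5000-residue chain, standing in for chain B in the report.
chain_b = "ACDEFGHIKL" * 500

# Restriction as described above: same length, unknown composition outside.
masked = mask_outside_selection(chain_b, 0, 100)

# What one might have expected instead: only the selected residues.
truncated = chain_b[0:100]

print("masked:   ", len(masked), "residues,", dp_cells(len(masked), len(masked)), "cells")
print("truncated:", len(truncated), "residues,", dp_cells(len(truncated), len(truncated)), "cells")
```

	The masked sequence still needs a table of roughly 25 million cells, against about 10 thousand for the truncated one, which fits the observation that restricting to 100 residues of a ~5000-residue chain takes nearly as long as aligning the whole chain.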
	The MatchMaker code is pretty intricate, given all its options, and I would be hesitant to change how it works in order to address this.  What I would probably do is translate the underlying Needleman-Wunsch sequence alignment code from Python to C++, which would likely make this problem mostly moot.  I will undoubtedly do that for ChimeraX.  I’m somewhat less likely to do it for Chimera “classic”, though I might backport the code once it’s written for ChimeraX.
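	For readers unfamiliar with the algorithm, the following is a minimal pure-Python Needleman-Wunsch sketch.  It is a textbook version and not the MatchMaker implementation: the linear gap penalty, the score values, the neutral score for the "X" placeholder, and the toy sequences are all assumptions.  It shows the nested loop over every table cell, which is the part that is slow in Python, and it reproduces the two-domain situation described above.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-2, unknown="X"):
    """Global alignment; returns (score, list of aligned (i, j) index pairs)."""

    def sub(x, y):
        # Assumption: a residue of unknown composition scores neutrally.
        if x == unknown or y == unknown:
            return 0
        return match if x == y else mismatch

    n, m = len(a), len(b)
    table = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        table[i][0] = i * gap
    for j in range(1, m + 1):
        table[0][j] = j * gap

    # The nested loop visits every cell once: (n + 1) * (m + 1) steps.
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            table[i][j] = max(
                table[i - 1][j - 1] + sub(a[i - 1], b[j - 1]),
                table[i - 1][j] + gap,
                table[i][j - 1] + gap,
            )

    # Trace back from the bottom-right corner, preferring diagonal moves.
    pairs = []
    i, j = n, m
    while i > 0 and j > 0:
        if table[i][j] == table[i - 1][j - 1] + sub(a[i - 1], b[j - 1]):
            pairs.append((i - 1, j - 1))
            i -= 1
            j -= 1
        elif table[i][j] == table[i - 1][j] + gap:
            i -= 1
        else:
            j -= 1
    pairs.reverse()
    return table[n][m], pairs


# Toy chain: two identical domains joined by a short hinge.
domain = "ACDEFGHIKL"
hinge = "GGS"
chain = domain + hinge + domain

# Selection restricted to the second domain: length kept, rest unknown.
offset = len(domain) + len(hinge)
masked = "X" * offset + domain

score, pairs = needleman_wunsch(masked, chain)
selected_partners = [j for i, j in pairs if i >= offset]
print(score, selected_partners)
```

	Because the masked sequence keeps its length, the selected residues end up paired with the second domain of the other chain.  In this toy scoring scheme, aligning the bare domain against the whole chain instead would give the same score for either copy, so the choice would come down to tie-breaking.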

—Eric

	Eric Pettersen
	UCSF Computer Graphics Lab


> On Nov 21, 2015, at 5:15 PM, Oliver Clarke <olibclarke at gmail.com> wrote:
> 
> Hi all,
> 
> When using Matchmaker, the ‘Further restrict to matching selection’ option does not seem to improve the speed of alignment as much as expected under certain circumstances (when the selection is a subset of a larger chain, as opposed to being an individual chain in a multi-chain structure).
> 
> It seems to still restrict the selection correctly for the final alignment (it is still only aligned on the selected domain), but the “test match” step seems to be conducted on the entire chain, rather than the selection - this makes a big difference to speed for large proteins.
> 
> For example - I tried aligning two conformations of a protein complex consisting of one chain A ~100 residues, and chain B ~5000 residues.
> 
> If I select chain A of both structures, and then select ‘Further restrict to matching selection’ for both the reference and the structure to match, the alignment is almost instantaneous - less than 2s.
> 
> If I perform the same operation, but this time selecting the first 100 residues of chain B in each structure, the alignment takes 3min20s, regardless of whether I leave automatic chain pairing on or select the specific chains to use for alignment manually. It takes only marginally longer - 5 min - if I select the entirety of chain B, a selection that is 50X the size.
> 
> Incidentally I still get the “test match” message in the status bar even when the selection only contains one chain from each structure, or when I manually select (using the chain pairing options) which chains to align - maybe Chimera is still running test match on the whole chain when it should only do so on the selection (or not at all if the selection only includes one possible match).
> 
> This seems like a bug? Or am I missing something?
> 
> Cheers,
> Oliver.
> 
