Opened 5 years ago

Closed 5 years ago

Last modified 5 years ago

#3138 closed enhancement (fixed)

Build out full side-chains for MD simulations

Reported by: tic20@… Owned by: pett
Priority: moderate Milestone:
Component: Structure Editing Version:
Keywords: Cc: Tom Goddard
Blocked By: Blocking:
Notify when closed: Platform: all
Project: ChimeraX

Description

Tristan wants to be able to add atoms to make complete side-chains (all heavy atoms plus hydrogens) matching OpenMM templates.

Begin forwarded message:

From: Tristan Croll
Subject: Re: ISOLDE and ChimeraX plan for the future
Date: May 4, 2020 at 3:24:17 PM PDT
To: Tom Goddard

Hi Tom,

Coming back to your question on things from the ChimeraX end that I think could help ISOLDE. I can think of a few - some quite easy, some a little more complex.

On the easy side (but moderately urgent): correcting residues where the hydrogens don't match the MD template. I actually have a lot of the necessary infrastructure in place - in particular a method to "complete" a residue with missing atoms (and rename wrongly-named ones), based on graph matching to its CCD template using NetworkX. Problem with that is that (a) it currently adds *all* atoms in the template, including superfluous acid/phosphate hydrogens and atoms that should be removed due to covalent bonding; and (b) it doesn't add hydrogens that are missing from the template but should be there for MD. It should be quite straightforward to use the same graph matching approach to the MD residue template to decide which atoms to remove, but adding the missing hydrogens is more difficult because the MD template has no explicit geometry information. It seems certain that everything necessary for that could be found in Eric's AddH code, but I haven't come to terms with that enough to figure out how.

That one alone could make a pretty huge difference - while ISOLDE has a library of about 15,000 residues, a single-hydrogen mismatch currently just triggers an "isolde doesn't recognise this residue" message, which isn't particularly helpful in fixing it. In fact, right now it doesn't even tell the user if there *is* a parameterisation for that residue but the atoms don't match... needs work. Having a "this residue looks like XXX but it has an extra ... / is missing ... Would you like to fix it?" message instead would immediately boost usability.

...
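
As an illustration of the graph-matching idea in the forwarded message, here is a minimal pure-Python sketch. It is not ISOLDE's NetworkX-based code; the function names, the atom names and the toy methanol-like template are all hypothetical. It maps each residue atom onto a template atom of the same element so that every residue bond lands on a template bond, then reports which template atoms are unmatched (missing) and which residue atoms would need renaming.

```python
def adjacency(bonds):
    """Build {atom: set(neighbours)} from a list of (atom, atom) pairs."""
    adj = {}
    for a, b in bonds:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj


def match_to_template(res_elements, res_bonds, tmpl_elements, tmpl_bonds):
    """Return {residue_atom: template_atom}, or None if no mapping exists.

    Every residue atom must map to a distinct template atom of the same
    element, and every residue bond must map onto a template bond.
    The first mapping found is returned; symmetric groups may have several.
    """
    res_adj = adjacency(res_bonds)
    tmpl_adj = adjacency(tmpl_bonds)
    res_atoms = list(res_elements)

    def extend(mapping, used, idx):
        if idx == len(res_atoms):
            return dict(mapping)
        atom = res_atoms[idx]
        for cand in tmpl_elements:
            if cand in used or tmpl_elements[cand] != res_elements[atom]:
                continue
            # Bonds to already-mapped neighbours must exist in the template.
            mapped_neighbours = [n for n in res_adj.get(atom, ()) if n in mapping]
            if all(mapping[n] in tmpl_adj.get(cand, ()) for n in mapped_neighbours):
                mapping[atom] = cand
                used.add(cand)
                result = extend(mapping, used, idx + 1)
                if result is not None:
                    return result
                del mapping[atom]
                used.discard(cand)
        return None

    return extend({}, set(), 0)


# Hypothetical residue: wrongly named atoms and a missing hydroxyl hydrogen.
res_elements = {"CX": "C", "OX": "O"}
res_bonds = [("CX", "OX")]
# Hypothetical template for the same fragment.
tmpl_elements = {"C1": "C", "O1": "O", "HO1": "H"}
tmpl_bonds = [("C1", "O1"), ("O1", "HO1")]

mapping = match_to_template(res_elements, res_bonds, tmpl_elements, tmpl_bonds)
missing = sorted(set(tmpl_elements) - set(mapping.values()))
renames = {old: new for old, new in mapping.items() if old != new}
print(renames, missing)
```

The backtracking search is only practical for small residues; a library matcher such as the NetworkX one mentioned in the message would be the sensible choice for anything larger.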

Change History (14)

comment:1 by Tom Goddard, 5 years ago

I thought one approach for this is using the swapaa command (swap amino acid). You just swap the amino acid using the same type it already has. This has the advantage of trying to choose a rotamer that does not clash. Adding hydrogens could then be done with AddH.

If OpenMM wants different atom names then that is another problem but could perhaps be solved after ChimeraX adds PDB standard atom names.

comment:2 by Tom Goddard, 5 years ago

Having a decent error message that reveals all the information the code knows (i.e. what template it was trying, what atom name was missing or was extra, ...) is very little work with a very high gain in usability, basically going from undebuggable by the user to easily debuggable and probably readily fixable by the user.
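
A rough sketch of what such a message could look like, assuming the code already knows the atom names of the residue and of the template it tried. The function name, the residue label and the atom names are hypothetical, and the comparison is by atom name only, which is a simplification of what the ticket asks for.

```python
def describe_mismatch(residue_label, template_name, residue_atoms, template_atoms):
    """Return a human-readable mismatch message, or None if the names agree."""
    residue_atoms = set(residue_atoms)
    template_atoms = set(template_atoms)
    extra = sorted(residue_atoms - template_atoms)
    missing = sorted(template_atoms - residue_atoms)
    if not extra and not missing:
        return None
    parts = []
    if extra:
        parts.append("has extra atom(s): " + ", ".join(extra))
    if missing:
        parts.append("is missing atom(s): " + ", ".join(missing))
    return (
        f"Residue {residue_label} looks like template {template_name} but "
        + " and ".join(parts)
        + ". Would you like to fix it?"
    )


# Hypothetical ligand with one superfluous and one missing hydrogen.
message = describe_mismatch(
    "LIG /A:401", "LIG", ["C1", "O1", "H9"], ["C1", "O1", "H1"]
)
print(message)
```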

in reply to:  3 ; comment:3 by Tristan Croll, 5 years ago

This isn’t so much for amino acids (which as you say are easy now) but for all the non-standard residues (modified amino acids, ligands etc.). OpenMM doesn’t care about atom names, but the wwPDB does and so does Phenix (it’ll refuse to go forward with most things if a single atom is incorrectly named).
 

 


comment:4 by pett, 5 years ago

Status: assigned → accepted

comment:5 by pett, 5 years ago

Status: accepted → feedback

#3105 is a specific case of this. Are there any specifics to this ticket?

comment:6 by Tom Goddard, 5 years ago

Ticket #3105 is about addh using the template names for hydrogens. This ticket, I think, is about adding missing atoms, both hydrogen and non-hydrogen. Tristan should provide a specific example, e.g. listing a specific PDB id and residue number/name.

in reply to:  7 ; comment:7 by Tristan Croll, 5 years ago

#3105 is primarily focused on atom naming. The more specific issue here 
that needs to be tackled is where you run into the situation of "this 
residue looks like it *should* correspond to this MD template, but it 
has (a) superfluous/missing hydrogen(s)." Superfluous atoms are of 
course relatively easy - just delete them (after double-checking with 
the user in most cases). But to *add* atoms you need code that 
understands molecular geometry. Would be great to be able to call a 
method (from the AddH bundle?) to say, "please add a hydrogen with 
reasonable geometry to this atom".


comment:8 by pett, 5 years ago

Resolution: fixed
Status: feedback → closed

Does the template give the hybridization of the heavy atom? If so, just call

chimerax.atomic.build_structure.modify_atom(heavy_atom, element, desired_num_bonds, geometry=N)

That will add/remove hydrogens as appropriate. Geometry can be an integer or a constant from chimerax.atomic.bond_geom, namely:

ion (0)
single (1)
linear (2)
planar (3)
tetrahedral (4)

Obviously, geometry must be >= the number of bonds
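
A minimal sketch of how that call might be wrapped, assuming the signature and module path quoted above (the path may differ between ChimeraX versions). The local constants simply mirror the values listed in the comment, and `check_geometry` and `set_bond_count` are hypothetical helper names. Only the validation step runs outside ChimeraX; the wrapper imports `modify_atom` lazily and needs a real atom to act on.

```python
# Geometry values as listed in the comment above (ion=0 ... tetrahedral=4).
ION, SINGLE, LINEAR, PLANAR, TETRAHEDRAL = range(5)


def check_geometry(desired_num_bonds, geometry):
    """Validate the constraint that geometry must be >= the number of bonds."""
    if not ION <= geometry <= TETRAHEDRAL:
        raise ValueError(f"geometry must be between 0 and 4, got {geometry}")
    if geometry < desired_num_bonds:
        raise ValueError(
            f"geometry {geometry} cannot hold {desired_num_bonds} bonds"
        )
    return geometry


def set_bond_count(heavy_atom, desired_num_bonds, geometry):
    """Hypothetical wrapper: add/remove hydrogens on an existing heavy atom.

    Only usable inside ChimeraX. The import path is the one quoted in the
    ticket and is an assumption here.
    """
    check_geometry(desired_num_bonds, geometry)
    from chimerax.atomic.build_structure import modify_atom
    # Keep the atom's current element; only the hydrogen count changes.
    return modify_atom(
        heavy_atom, heavy_atom.element, desired_num_bonds, geometry=geometry
    )


# A tetrahedral atom with three bonds is fine; a planar atom with four is not.
print(check_geometry(3, TETRAHEDRAL))
```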

in reply to:  9 ; comment:9 by Tristan Croll, 5 years ago

A few concrete examples:

- (relatively trivial) dangling -O-PO2 termini on nucleic acids get a 
hydrogen added to the phosphorus (this is maybe better described as a 
bug...)

- FAD consistently gets an extra hydrogen added to the flavin (at N9, if 
I recall correctly) that doesn't agree with the MD template. Admittedly 
FAD states get somewhat complicated, so more work may be needed on the 
MD side to support the different possible permutations. But then, this 
sort of functionality would help a lot in providing some future tool 
allowing the user to switch between different tautomers of a given 
residue/ligand...

- AddH without the extra "metalDist 1" argument often yields missing 
hydrogens at metal sites when the geometry is a bit off (which it all 
too often is). Sure, it's possible to just call AddH again, but 
(especially in big models) it would be much more efficient to just fix 
that site.

... and the general cases where the starting geometry is bad, leading to 
incorrect guessed chemistry. "addh template true" can help prevent this 
in the first place - but particularly if the model is really large with 
lots of complicated ligands, that could become problematic if it 
re-introduces template mismatch problems that were previously fixed.


in reply to:  10 ; comment:10 by Tristan Croll, 5 years ago

Sweet! Will check that out ASAP.


in reply to:  11 ; comment:11 by pett, 5 years ago

Just to provide some feedback on what otherwise might seem to be bugs...


The pKa of a terminal phosphate (much like histidine) is near biological pH, so ChimeraX will sometimes protonate it.  It examines the P-O bond lengths and if they differ significantly and one is near single-bond length, then it will protonate it.
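
To make the bond-length test concrete, here is a rough sketch of that kind of decision. It is not the ChimeraX implementation: the function name, the reference single-bond length and both thresholds are illustrative assumptions chosen only so the example runs.

```python
def oxygen_to_protonate(p_o_lengths, single_bond_length=1.60,
                        tolerance=0.05, min_spread=0.08):
    """Pick the terminal oxygen to protonate, or None.

    p_o_lengths maps terminal oxygen names to P-O distances in angstroms.
    All numeric defaults are illustrative assumptions, not ChimeraX values.
    """
    if len(p_o_lengths) < 2:
        return None
    longest = max(p_o_lengths, key=p_o_lengths.get)
    # "Differ significantly": longest bond stands out from the shortest.
    spread = p_o_lengths[longest] - min(p_o_lengths.values())
    # "Near single-bond length": longest bond is close to the reference.
    near_single = abs(p_o_lengths[longest] - single_bond_length) <= tolerance
    return longest if spread >= min_spread and near_single else None


# One clearly longer P-O bond: that oxygen would be protonated.
print(oxygen_to_protonate({"OP1": 1.49, "OP2": 1.50, "OP3": 1.59}))
# Nearly equal bond lengths: nothing is protonated.
print(oxygen_to_protonate({"OP1": 1.51, "OP2": 1.52, "OP3": 1.52}))
```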


It’s the N1.  ChimeraX prefers to make the central FAD ring charged and aromatic rather than neutral and non-aromatic.  I have looked at various FAD structures in the past and you will frequently see hydrogen-bond donor groups pointing towards that central ring (e.g.  /A:169@N in 1TZL).

in reply to:  12 ; comment:12 by Tristan Croll, 5 years ago

... but in the terminal phosphate case, the hydrogen gets added to the 
*phosphorus* (one of the oxygens is missing, analogous to what you get 
in a peptide chain break).


comment:13 by pett, 5 years ago

Ah, right, that's what you said. If it remains a problem for you, file a ticket for it and I'll try to do something about it.

in reply to:  14 ; comment:14 by Tristan Croll, 5 years ago

I've been meaning for ages to add a little code to catch and auto-fix 
really trivial problems like that one... I should really get around to 
it.
