#3138 closed enhancement (fixed)
Build out full side-chains for MD simulations
| Reported by: | | Owned by: | pett |
|---|---|---|---|
| Priority: | moderate | Milestone: | |
| Component: | Structure Editing | Version: | |
| Keywords: | | Cc: | Tom Goddard |
| Blocked By: | | Blocking: | |
| Notify when closed: | | Platform: | all |
| Project: | ChimeraX | | |
Description
Tristan wants to be able to add atoms to make complete side-chains (all heavy atoms plus hydrogens) matching OpenMM templates.
Begin forwarded message:
From: Tristan Croll
Subject: Re: ISOLDE and ChimeraX plan for the future
Date: May 4, 2020 at 3:24:17 PM PDT
To: Tom Goddard
Hi Tom,
Coming back to your question on things from the ChimeraX end that I think could help ISOLDE. I can think of a few - some quite easy, some a little more complex.
On the easy side (but moderately urgent): correcting residues where the hydrogens don't match the MD template. I actually have a lot of the necessary infrastructure in place - in particular a method to "complete" a residue with missing atoms (and rename wrongly-named ones), based on graph matching to its CCD template using NetworkX. Problem with that is that (a) it currently adds *all* atoms in the template, including superfluous acid/phosphate hydrogens and atoms that should be removed due to covalent bonding; and (b) it doesn't add hydrogens that are missing from the template but should be there for MD. It should be quite straightforward to use the same graph matching approach to the MD residue template to decide which atoms to remove, but adding the missing hydrogens is more difficult because the MD template has no explicit geometry information. It seems certain that everything necessary for that could be found in Eric's AddH code, but I haven't come to terms with that enough to figure out how.
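To make the graph-matching idea concrete, here is a minimal sketch. It is not ISOLDE's code: the email describes matching with NetworkX, whereas this toy uses a plain backtracking search so that it runs with no dependencies. The function names and the cysteine-like heavy-atom skeleton are illustrative assumptions. It maps each residue atom onto a template atom with the same element and compatible bonds, then reports which template atoms are absent and which residue atoms carry a non-template name.

```python
def adjacency(atoms, bonds):
    """Build {atom_name: set(neighbour_names)} from a list of bonded pairs."""
    adj = {name: set() for name in atoms}
    for a, b in bonds:
        adj[a].add(b)
        adj[b].add(a)
    return adj


def match_to_template(res_elems, res_bonds, tmpl_elems, tmpl_bonds):
    """Map residue atoms onto template atoms by element and connectivity.

    Returns {residue_atom: template_atom} for the first match found, or None.
    Every residue bond must also exist in the template; the template may
    contain extra atoms (the ones the residue is missing).
    """
    res_adj = adjacency(res_elems, res_bonds)
    tmpl_adj = adjacency(tmpl_elems, tmpl_bonds)
    order = list(res_elems)
    mapping, used = {}, set()

    def extend(i):
        if i == len(order):
            return True
        r = order[i]
        for t, t_elem in tmpl_elems.items():
            if t in used or t_elem != res_elems[r]:
                continue
            # Already-mapped neighbours of r must be bonded to t in the template.
            if all(mapping[n] in tmpl_adj[t] for n in res_adj[r] if n in mapping):
                mapping[r] = t
                used.add(t)
                if extend(i + 1):
                    return True
                del mapping[r]
                used.discard(t)
        return False

    return dict(mapping) if extend(0) else None


# Toy template (hypothetical, heavy atoms only): a cysteine-like skeleton.
template_elems = {"N": "N", "CA": "C", "C": "C", "O": "O", "CB": "C", "SG": "S"}
template_bonds = [("N", "CA"), ("CA", "C"), ("C", "O"), ("CA", "CB"), ("CB", "SG")]

# Residue with the sulfur missing and the beta carbon misnamed "C5".
residue_elems = {"N": "N", "CA": "C", "C": "C", "O": "O", "C5": "C"}
residue_bonds = [("N", "CA"), ("CA", "C"), ("C", "O"), ("CA", "C5")]

mapping = match_to_template(residue_elems, residue_bonds,
                            template_elems, template_bonds)
missing = sorted(set(template_elems) - set(mapping.values()))
renames = {r: t for r, t in mapping.items() if r != t}
```

Because the search returns the first valid mapping, a template with symmetric branches can be matched in more than one way; a real implementation would need a tie-breaking rule (for example, geometry) that this sketch does not attempt.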
That one alone could make a pretty huge difference - while ISOLDE has a library of about 15,000 residues, a single-hydrogen mismatch currently just triggers an "isolde doesn't recognise this residue" message, which isn't particularly helpful in fixing it. In fact, right now it doesn't even tell the user if there *is* a parameterisation for that residue but the atoms don't match... needs work. Having a "this residue looks like XXX but it has an extra ... / is missing ... Would you like to fix it?" message instead would immediately boost usability.
...
Change History (14)
comment:1 by , 5 years ago
comment:2 by , 5 years ago
Having a decent error message that reveals all the information the code knows (i.e. what template it was trying, what atom name was missing or was extra, ...) is very little work with a very high gain in usability, basically going from undebuggable by the user to easily debuggable and probably readily fixable by the user.
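As a rough illustration of such a message, the sketch below compares atom names between a residue and the template that was tried and reports the differences. The function name, the residue label and the atom names are all hypothetical, and it assumes atom names have already been normalised to the template's naming, which is not guaranteed in practice.

```python
def describe_mismatch(residue_label, template_name, residue_atoms, template_atoms):
    """Return a human-readable mismatch report, or None if the names agree."""
    residue_atoms = set(residue_atoms)
    template_atoms = set(template_atoms)
    missing = sorted(template_atoms - residue_atoms)  # in template, not in residue
    extra = sorted(residue_atoms - template_atoms)    # in residue, not in template
    if not missing and not extra:
        return None
    problems = []
    if extra:
        problems.append("has extra atom(s) " + ", ".join(extra))
    if missing:
        problems.append("is missing atom(s) " + ", ".join(missing))
    return (f"Residue {residue_label} looks like template {template_name} but "
            + " and ".join(problems) + ". Would you like to fix it?")


# Hypothetical ligand with one surplus and one absent hydrogen.
message = describe_mismatch("/A:201 LIG", "LIG_TEMPLATE",
                            ["C1", "O1", "H1", "H9"],
                            ["C1", "O1", "H1", "H2"])
```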
follow-up: 3 comment:3 by , 5 years ago
This isn’t so much for amino acids (which as you say are easy now) but for all the non-standard residues (modified amino acids, ligands etc.). OpenMM doesn’t care about atom names, but the wwPDB does and so does Phenix (it’ll refuse to go forward with most things if a single atom is incorrectly named).
comment:4 by , 5 years ago
Status: | assigned → accepted |
---|
comment:5 by , 5 years ago
Status: | accepted → feedback |
---|
#3105 is a specific case of this. Are there any specifics to this ticket?
comment:6 by , 5 years ago
Ticket #3105 is about addh using the template names for hydrogens. This ticket is about adding both non-hydrogen and hydrogen missing atoms I think. Tristan should provide a specific example, e.g. listing a specific PDB id and residue number/name.
follow-up: 7 comment:7 by , 5 years ago
#3105 is primarily focused on atom naming. The more specific issue here that needs to be tackled is where you run into the situation of "this residue looks like it *should* correspond to this MD template, but it has (a) superfluous/missing hydrogen(s)." Superfluous atoms are of course relatively easy - just delete them (after double-checking with the user in most cases). But to *add* atoms you need code that understands molecular geometry. Would be great to be able to call a method (from the AddH bundle?) to say, "please add a hydrogen with reasonable geometry to this atom". On 2020-05-05 17:51, ChimeraX wrote:
comment:8 by , 5 years ago
Resolution: | → fixed |
---|---|
Status: | feedback → closed |
Does the template give the hybridization of the heavy atom? If so, just call
chimerax.atomic.build_structure.modify_atom(heavy_atom, element, desired_num_bonds, geometry=N)
That will add/remove hydrogens as appropriate. Geometry can be an integer or a constant from chimerax.atomic.bond_geom, namely:
ion (0)
single (1)
linear (2)
planar (3)
tetrahedral (4)
Obviously, geometry must be >= the number of bonds
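
A minimal sketch of how a caller might guard that constraint before making the call. The `Geometry` enum is a local stand-in that mirrors the values listed above, not an import from `chimerax.atomic.bond_geom`, and `set_hydrogens` is a hypothetical wrapper. The `modify_atom` function is passed in rather than imported, so the sketch runs outside ChimeraX; the module path quoted above is taken from this ticket and may differ between ChimeraX versions.

```python
from enum import IntEnum


class Geometry(IntEnum):
    """Local mirror of the geometry values listed in the ticket (assumption)."""
    ION = 0
    SINGLE = 1
    LINEAR = 2
    PLANAR = 3
    TETRAHEDRAL = 4


def set_hydrogens(modify_atom, heavy_atom, element, num_bonds, geometry):
    """Validate the arguments, then delegate to the supplied modify_atom.

    modify_atom is expected to accept
    (atom, element, num_bonds, geometry=...) as shown in the comment above.
    """
    geometry = Geometry(geometry)  # raises ValueError for values outside 0..4
    if num_bonds < 0:
        raise ValueError("number of bonds cannot be negative")
    if geometry < num_bonds:
        raise ValueError(
            f"geometry {geometry.name.lower()} ({int(geometry)}) allows at most "
            f"{int(geometry)} bond(s), but {num_bonds} were requested")
    return modify_atom(heavy_atom, element, num_bonds, geometry=int(geometry))


# Stand-in for the real function so the sketch can be exercised anywhere.
calls = []


def fake_modify_atom(atom, element, num_bonds, geometry=None):
    calls.append((atom, element, num_bonds, geometry))
    return [atom]


# An sp3 nitrogen with three bonds: tetrahedral geometry, one position left empty.
set_hydrogens(fake_modify_atom, "N1", "N", 3, Geometry.TETRAHEDRAL)
```

Inside ChimeraX the stand-in would be replaced by the real `modify_atom`, and `heavy_atom` would be an atom object rather than a string.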
follow-up: 9 comment:9 by , 5 years ago
A few concrete examples:
- (relatively trivial) dangling -O-PO2 termini on nucleic acids get a hydrogen added to the phosphorus (this is maybe better described as a bug...)
- FAD consistently gets an extra hydrogen added to the flavin (at N9, if I recall correctly) that doesn't agree with the MD template. Admittedly FAD states get somewhat complicated, so more work may be needed on the MD side to support the different possible permutations. But then, this sort of functionality would help a lot in providing some future tool allowing the user to switch between different tautomers of a given residue/ligand...
- AddH without the extra "metalDist 1" argument often yields missing hydrogens at metal sites when the geometry is a bit off (which it all too often is). Sure, it's possible to just call AddH again, but (especially in big models) it would be much more efficient to just fix that site.
... and the general cases where the starting geometry is bad, leading to incorrect guessed chemistry. "addh template true" can help prevent this in the first place - but particularly if the model is really large with lots of complicated ligands, that could become problematic if it re-introduces template mismatch problems that were previously fixed.
On 2020-05-05 18:10, ChimeraX wrote:
follow-up: 10 comment:10 by , 5 years ago
Sweet! Will check that out ASAP. On 2020-05-05 18:27, ChimeraX wrote:
follow-up: 11 comment:11 by , 5 years ago
Just to provide some feedback on what otherwise might seem to be bugs...
On the terminal phosphate: its pKa (much like histidine's) is near biological pH, so ChimeraX will sometimes protonate it. It examines the P-O bond lengths, and if they differ significantly and one is near single-bond length, it will protonate that oxygen.
On FAD: it’s the N1. ChimeraX prefers to make the central FAD ring charged and aromatic rather than neutral and non-aromatic. I have looked at various FAD structures in the past and you will frequently see hydrogen-bond donor groups pointing towards that central ring (e.g. /A:169@N in 1TZL).
follow-up: 12 comment:12 by , 5 years ago
... but in the terminal phosphate case, the hydrogen gets added to the *phosphorus* (one of the oxygens is missing, analogous to what you get in a peptide chain break). On 2020-05-05 18:55, ChimeraX wrote:
comment:13 by , 5 years ago
Ah, right, that's what you said. If it remains a problem for you, file a ticket for it and I'll try to do something about it.
follow-up: 14 comment:14 by , 5 years ago
I've been meaning for ages to add a little code to catch and auto-fix really trivial problems like that one... I should really get around to it. On 2020-05-05 19:09, ChimeraX wrote:
I thought one approach for this is using the swapaa command (swap amino acid). You just swap the amino acid using the same type it already has. This has the advantage of trying to choose a rotamer that does not clash. Adding hydrogens could then be done with AddH.
If OpenMM wants different atom names then that is another problem but could perhaps be solved after ChimeraX adds PDB standard atom names.