Opened 8 years ago
Closed 8 years ago
#827 closed defect (fixed)
Structure file I/O issues
Reported by: | Tristan Croll | Owned by: | Eric Pettersen |
---|---|---|---|
Priority: | major | Milestone: | |
Component: | Input/Output | Version: | |
Keywords: | | Cc: | |
Blocked By: | | Blocking: | |
Notify when closed: | | Platform: | all |
Project: | ChimeraX | | |
Description
Simple-but-important one first: mmcif_write.py currently crashes due to a reference to atoms.occupancy rather than atoms.occupancies at line 91.
Another issue on loading: if a residue has all its atoms but the geometry is "aggressively" bad, ChimeraX will sometimes miss a bond. This is cropping up for me when I use Coot's "Add hydrogens using Refmac" command. It's the only currently-available hydrogen-addition tool I know of that (a) the majority of existing structural biologists will have, and (b) provides all the hydrogens necessary for OpenMM to work (including N-terminal hydrogens). Unfortunately, apart from those advantages it's *really* bad. It seems to work OK at adding hydrogens to a "perfect" high-resolution structure, but at lower resolution it doesn't seem to even attempt to provide decent geometry for some hydrogens, and a handful will end up 2-3 Angstroms away from the atom they should be bonded to! I guess they rely on subsequent refinement to fix them - and OpenMM's minimiser can pull them into line in a snap. But, in these cases I inevitably run into a handful of residues where ChimeraX has failed to form a heavy atom-hydrogen bond on loading, leading to the simulation failing to launch.
I presume ChimeraX has some form of geometric test to decide whether or not to form a given bond when loading a structure. I don't think this is advisable, to be honest - or at least, there should be a "permissive" option that forms all bonds according to the residue dictionary definitions for each residue, no matter what the geometry. Bad geometry happens for all sorts of reasons, after all.
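To make the "permissive" idea concrete, here is a minimal sketch of template-driven bonding, in which coordinates are never consulted. The template contents and the function name are illustrative assumptions, not ChimeraX's actual data structures or API.

```python
# Hypothetical, abbreviated glycine template: pairs of bonded atom names.
# A real residue dictionary would list every bond in the residue.
GLY_TEMPLATE = [
    ("N", "CA"), ("CA", "C"), ("C", "O"),
    ("N", "H"), ("CA", "HA2"), ("CA", "HA3"),
]


def template_bonds(atom_names, template):
    """Return each template bond whose two atoms are both present.

    No distance check is made, so badly placed atoms are still bonded.
    """
    present = set(atom_names)
    return [(a, b) for a, b in template if a in present and b in present]


# HA3 is absent from this residue, so only its bond is skipped.
bonds = template_bonds(["N", "CA", "C", "O", "H", "HA2"], GLY_TEMPLATE)
```

The trade-off is that this only works for residues that have a template and atoms whose names match it.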
Attachments (2)
Change History (11)
comment:1 by , 8 years ago
Owner: | changed from | to
---|
comment:2 by , 8 years ago
Status: | assigned → accepted |
---|
Fixed the mmcif_write problem.
Are these problematic-hydrogen-connectivity files mmCIF files or PDB files? Do the hydrogens have their standard PDB names?
--Eric
follow-up: 3 comment:3 by , 8 years ago
Yes, they're PDB files with standard nomenclature and all the atoms grouped by residue (hydrogens at the end of the residue). If I adjust the coordinates of the offending hydrogens and re-open, then the bonds appear. I'll find/make an example case tomorrow.
Tristan Croll, Research Fellow, Cambridge Institute for Medical Research, University of Cambridge, CB2 0XY
comment:4 by , 8 years ago
OK, I've uploaded an example. Plenty of examples of terrible geometry throughout, but in this particular case ChimeraX still originally assigned the bonds correctly. I was able to reproduce the issue by moving the H of the first residue (A 7) by 3 Angstroms. There does seem to be at least some nomenclature aspect here: this is an N-terminal residue with three hydrogens (H, H2, H3) attached to the N. If I rename the H to H1, then it correctly bonds even with the big move.
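The renaming workaround described above could be scripted roughly as follows. This is a sketch under assumptions: it works on the fixed-column text of PDB ATOM/HETATM records (atom name in columns 13-16, residue identity in columns 18-27), the function names are made up for illustration, and it does not check whether the residue really is a chain N-terminus.

```python
def rename_nterm_h(pdb_lines):
    """Rename atom 'H' to 'H1' in residues that also contain 'H2' and 'H3'."""

    def is_atom(line):
        return line.startswith(("ATOM  ", "HETATM"))

    def res_key(line):
        # Residue name, chain ID, sequence number and insertion code.
        return line[17:27]

    # First pass: collect the atom names present in each residue.
    names_by_res = {}
    for line in pdb_lines:
        if is_atom(line):
            names_by_res.setdefault(res_key(line), set()).add(line[12:16].strip())

    # Second pass: rewrite only the atom-name field where the pattern matches.
    out = []
    for line in pdb_lines:
        if (is_atom(line) and line[12:16].strip() == "H"
                and {"H2", "H3"} <= names_by_res[res_key(line)]):
            line = line[:12] + " H1 " + line[16:]
        out.append(line)
    return out


def make_atom(serial, name, resname, chain, resseq):
    # Hypothetical helper: builds only the first 26 columns of an ATOM record.
    return f"ATOM  {serial:>5} {name:<4} {resname:>3} {chain}{resseq:>4}"


example = [
    make_atom(1, " N", "ALA", "A", 7),
    make_atom(2, " H", "ALA", "A", 7),
    make_atom(3, " H2", "ALA", "A", 7),
    make_atom(4, " H3", "ALA", "A", 7),
    make_atom(5, " N", "GLY", "A", 8),
    make_atom(6, " H", "GLY", "A", 8),
]
fixed = rename_nterm_h(example)
```

Only the H on residue 7 is renamed; the ordinary backbone H on residue 8 is left alone because that residue has no H2/H3.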
I've previously seen failure-to-bond issues with sidechain atoms on other residues (methionines for sure, but not sure about others) with structures coming from Coot's hydrogen addition. Can't seem to replicate it when I want to, though. Will pass along any case I find.
comment:5 by , 8 years ago
Not sure what to say here. The PDB-standard names for N-terminal hydrogens are H1, H2, H3, not H, H2, H3. Look at any deposited NMR ensemble (e.g. 1mtx, 1jwe). So you've got a hydrogen with a non-standard name several angstroms away from the nearest heavy atom. The software isn't psychic. 'H' isn't the only possible variant here either -- I've seen HN, HT1, and others.
Another unmentioned problem is that that residue isn't even the true N-terminus -- there is missing structure, so it is unclear if it is even right to have H2/H3 atoms on that nitrogen.
I *am* pretty close to having "simple" hydrogen addition (non-H-bond-guided) in ChimeraX. Another week or two most likely. I don't know if that would resolve this issue for you or if you would still need these problematic external PDB files to work. If the latter, then you might simply nuke their hydrogens and let ChimeraX add them back once it can.
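For the "nuke their hydrogens" route, one way to strip hydrogens from a PDB file before loading is sketched below. It assumes the element field (columns 77-78) is filled in, which is not guaranteed for every file, and the function name is hypothetical. Atom serial numbers are not renumbered, and any CONECT records that mention removed atoms are left untouched.

```python
def strip_hydrogens(pdb_lines):
    """Drop ATOM/HETATM records whose element field (columns 77-78) is H."""
    kept = []
    for line in pdb_lines:
        is_atom = line.startswith(("ATOM  ", "HETATM"))
        if is_atom and line[76:78].strip().upper() == "H":
            continue  # skip hydrogen atoms
        kept.append(line)  # keep heavy atoms and all non-atom records
    return kept
```

Typical use would be to read the file with `readlines()`, pass the list through `strip_hydrogens`, and write the result to a new file.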
comment:6 by , 8 years ago
You're preaching to the converted here. As I said, given that Coot and Refmac are two of the most heavily used tools in all of experimental structural biology, I was shocked to see how badly they handle this task. I guess the reason is that historically hydrogens have been considered "uninteresting" and even now are usually left out... Looking forward to having your tool available!
follow-up: 6 comment:7 by , 8 years ago
There does appear to be a real bug here after all. The attached file has two instances of the residue 2DT (residue 822 on chains X and P). Both at the chain terminus, both with heavy atom and hydrogen names consistent with the PDB, and all atoms present in each residue. Yet the one on chain P doesn't bond H3'1 and H5'', whereas the one on chain X has all bonds present and correct.
follow-up: 7 comment:8 by , 8 years ago
Well, 2DT isn't a standard nucleic acid, so Chimera has no builtin connectivity template for it. If this were an mmCIF file, there is some chance that those protons would get connected because the mmCIF code will fetch the connectivity template for 2DT from the RCSB, whereas the PDB code will not. The PDB code looks for atoms within "bonding distance" of each other. For a C-H bond that would be 0.91 angstroms plus a 0.4 angstrom "slop" for a total of 1.31 angstroms. Those protons, for whatever reason, are 1.346 and 1.378 angstroms away from the corresponding carbon.
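The arithmetic above can be checked with a short sketch of a distance-based bonding test. The 0.91 and 0.4 angstrom figures come from the comment; the function name and the inclusive cutoff are assumptions, not the actual PDB reader code.

```python
import math

C_H_BOND_LENGTH = 0.91  # angstroms, as quoted above
SLOP = 0.4              # angstroms of tolerance, as quoted above


def within_bonding_distance(xyz1, xyz2, ideal=C_H_BOND_LENGTH, slop=SLOP):
    """True if the two positions are no further apart than ideal + slop."""
    return math.dist(xyz1, xyz2) <= ideal + slop


carbon = (0.0, 0.0, 0.0)
# Place a hydrogen along x at a typical distance and at the two reported ones.
results = {
    d: within_bonding_distance(carbon, (d, 0.0, 0.0))
    for d in (1.09, 1.346, 1.378)
}
```

With a 1.31 angstrom cutoff, a hydrogen at 1.09 angstroms passes, while the two 2DT protons at 1.346 and 1.378 angstroms fall just outside and are left unbonded.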
I'm hoping the new ChimeraX addh capability will reduce the need for me to fight this fight.
--Eric
comment:9 by , 8 years ago
Resolution: | → fixed |
---|---|
Status: | accepted → closed |
The new addh code protonates the 2DT residues of 2ajq correctly.
Issues for Eric.