Opened 5 years ago
Closed 5 years ago
#3338 closed task (fixed)
Phenix mmCIF problems
Reported by: | | Owned by: | Greg Couch |
---|---|---|---|
Priority: | moderate | Milestone: | |
Component: | Input/Output | Version: | |
Keywords: | | Cc: | Eric Pettersen, Tom Goddard |
Blocked By: | | Blocking: | |
Notify when closed: | | Platform: | all |
Project: | ChimeraX | | |
Description
We should work with the Phenix group to get them to output conforming mmCIF files that work well when read into ChimeraX.
From: Tristan Croll
Subject: Phenix mmCIF writing
Date: May 28, 2020 at 1:50:54 PM PDT
To: Greg Couch
Cc: Tom Goddard, Eric Pettersen
Hi Greg,
I dug through the CCTBX code to find where Phenix writes its data to mmCIF - looks like the key code is here:
It was written by Oleg Sobolev (osobolev@…).
Best regards,
Tristan
Attachments (1)
Change History (7)
comment:1 by , 5 years ago
comment:2 by , 5 years ago
From: Tristan Croll
Subject: Re: Phenix mmCIF writing
Date: May 29, 2020 at 2:42:08 AM PDT
To: Greg Couch
Cc: Tom Goddard, Eric Pettersen
No joy, I'm afraid. That happens in https://github.com/cctbx/cctbx_project/blob/8c1cb611c447c2b9de6cbb170bdea575b080ede6/mmtbx/programs/prepare_pdb_deposition.py - not only does it not fill in the entity_id field, but the result fails to open in ChimeraX at all:
phenix.fetch_pdb --all 1igr
phenix.pdb_as_cif 1igr.pdb
mmtbx.prepare_pdb_deposition 1igr.cif 1igr.fa
chimerax 1igr.deposit_000.cif
Traceback (most recent call last):
File "/opt/UCSF/ChimeraX/lib/python3.7/site-packages/chimerax/cmd_line/tool.py", line 258, in execute
cmd.run(cmd_text)
File "/opt/UCSF/ChimeraX/lib/python3.7/site-packages/chimerax/core/commands/cli.py", line 2805, in run
result = ci.function(session, **kw_args)
File "/opt/UCSF/ChimeraX/lib/python3.7/site-packages/chimerax/open_command/cmd.py", line 101, in cmd_open
Command(session, registry=registry).run(provider_cmd_text, log=log)
File "/opt/UCSF/ChimeraX/lib/python3.7/site-packages/chimerax/core/commands/cli.py", line 2805, in run
result = ci.function(session, **kw_args)
File "/opt/UCSF/ChimeraX/lib/python3.7/site-packages/chimerax/open_command/cmd.py", line 152, in provider_open
name or model_name_from_path(fi.file_name)), provider_kw)
File "/opt/UCSF/ChimeraX/lib/python3.7/site-packages/chimerax/open_command/cmd.py", line 382, in collated_open
return func(*func_args, **func_kw)
File "/opt/UCSF/ChimeraX/lib/python3.7/site-packages/chimerax/atomic/mmcif/__init__.py", line 38, in open
return mmcif.open_mmcif(session, data, file_name, **kw)
File "/opt/UCSF/ChimeraX/lib/python3.7/site-packages/chimerax/atomic/mmcif/mmcif.py", line 88, in open_mmcif
for p in pointers]
File "/opt/UCSF/ChimeraX/lib/python3.7/site-packages/chimerax/atomic/mmcif/mmcif.py", line 88, in <listcomp>
for p in pointers]
File "/opt/UCSF/ChimeraX/lib/python3.7/site-packages/chimerax/atomic/structure.py", line 1146, in __init__
self._set_chain_descriptions(self.session)
File "/opt/UCSF/ChimeraX/lib/python3.7/site-packages/chimerax/atomic/structure.py", line 1349, in _set_chain_descriptions
entity_to_description[mmcif_chain_to_entity[mmcif_cid]], False)
KeyError: 'A'
KeyError: 'A'
The resulting mmCIF file is attached. I guess the wwPDB is bending over backwards to be accommodating?
-- Tristan
by , 5 years ago
Attachment: | 1igr.deposit_000.cif added |
---|
comment:3 by , 5 years ago
Status: | assigned → accepted |
---|
comment:4 by , 5 years ago
Ouch. And in this case the entity, entity_poly, and entity_poly_seq tables are in the file, so they could fill in the atom_site.label_entity_id. Instead, it explicitly says that the atom_site.label_entity_id is unknown (?).
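As a rough illustration of the point above, a writer could derive label_entity_id from the polymer tables instead of emitting '?'. The sketch below assumes the file's _entity_poly category carries pdbx_strand_id (a comma-separated list of author chain IDs per entity). The function names and the plain-dict row representation are hypothetical and are not taken from cctbx or ChimeraX.

```python
def chain_to_entity(entity_poly_rows):
    """Map author chain ID -> entity ID from _entity_poly rows.

    Each row is a dict with 'entity_id' and 'pdbx_strand_id' keys
    (a hypothetical in-memory form of the mmCIF category).
    """
    mapping = {}
    for row in entity_poly_rows:
        for chain_id in row["pdbx_strand_id"].split(","):
            chain_id = chain_id.strip()
            if chain_id:
                mapping[chain_id] = row["entity_id"]
    return mapping


def fill_entity_ids(atom_site_rows, mapping):
    """Replace unknown ('?' or '.') label_entity_id values where the chain is known."""
    filled = []
    for row in atom_site_rows:
        row = dict(row)  # copy so the input rows are left untouched
        if row.get("label_entity_id", "?") in ("?", "."):
            row["label_entity_id"] = mapping.get(row["auth_asym_id"], "?")
        filled.append(row)
    return filled
```

This only covers polymer chains. Ligands, waters, and other non-polymer atoms are not listed in _entity_poly, so they would keep '?' unless handled separately.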
comment:5 by , 5 years ago
I just had a look back at #2483 (Phenix mmCIF files having faulty _struct_conn tables) and confirmed that it's still in effect. Problem here is that the code that writes the _struct_conn table just seems to repeat the auth_... fields for the label_... fields, while the code that writes the actual atoms does things differently (and wrongly, in at least one way). Just noticed while doing a quick test run on 1a0m that the _atom_site.label_seq_id increments each time the altloc changes, which explains why its disulfides end up out of step between the two tables... but I've also had structures with no altlocs where disulfides and glycan links failed to form, so that isn't the only problem. I'm guessing that gaps in the sequence are being mis-handled.
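One way to check a file for the altloc symptom described above is to look for residues whose atoms carry more than one label_seq_id. The sketch below is a minimal check under stated assumptions: the _atom_site rows have already been parsed into dicts keyed by the standard item names, and the function name is hypothetical. It does not detect the other suspected causes, such as mis-handled sequence gaps.

```python
from collections import defaultdict


def inconsistent_label_seq_ids(atom_site_rows):
    """Return residues whose atoms disagree on label_seq_id.

    A residue is identified by (auth_asym_id, auth_seq_id, insertion code).
    The result maps that key to the sorted list of label_seq_id values seen.
    """
    seen = defaultdict(set)
    for row in atom_site_rows:
        key = (
            row["auth_asym_id"],
            row["auth_seq_id"],
            row.get("pdbx_PDB_ins_code", "?"),
        )
        seen[key].add(row["label_seq_id"])
    return {key: sorted(ids) for key, ids in seen.items() if len(ids) > 1}
```

An empty result means every residue maps to a single label_seq_id. Any entry in the result points at a residue that the _struct_conn table is unlikely to reference consistently.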
comment:6 by , 5 years ago
Resolution: | → fixed |
---|---|
Status: | accepted → closed |
Changed ChimeraX to not try to give chains descriptions when mmcif_cid (label_asym_id) is '?'. So the ChimeraX side is "fixed".
No response from Phenix folks about fixing the mmtbx.prepare_pdb_deposition program.
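The kind of defensive lookup described for the ChimeraX side can be sketched as follows. This is not the actual ChimeraX change. The dictionary names are borrowed from the traceback above, while the function name, its signature, and the extra check on the entity ID are assumptions made for illustration.

```python
def chain_descriptions(chain_ids, mmcif_chain_to_entity, entity_to_description):
    """Return {chain ID: description}, skipping chains that cannot be resolved.

    Chains whose ID is unknown ('?'), or that have no usable entity entry,
    are silently skipped rather than raising KeyError.
    """
    descriptions = {}
    for mmcif_cid in chain_ids:
        if mmcif_cid == "?":
            continue
        entity_id = mmcif_chain_to_entity.get(mmcif_cid)
        if entity_id is None or entity_id == "?":
            continue
        description = entity_to_description.get(entity_id)
        if description is not None:
            descriptions[mmcif_cid] = description
    return descriptions
```

Using dict.get in place of direct indexing is what avoids the KeyError: 'A' seen in the traceback. The cost is that chains with missing entity data simply end up without a description.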
From: Greg Couch
Subject: Re: Phenix mmCIF writing
Date: May 28, 2020 at 5:07:58 PM PDT
To: Tristan Croll
Cc: Tom Goddard, Eric Pettersen
That looks like the mmCIF writer for intermediate results. And that is an issue. See line 1121:
entity_id = '?' # XXX how do we determine this?
The missing entity information makes ChimeraX do a lot of guessing.
Your bug from 8 months ago (thought it was older :-)), that I referenced in
is with the secondary structure information. That information is probably added later by the mmtbx.prepare_pdb_deposition program referenced on