Opened 5 years ago

Closed 5 years ago

#3397 closed defect (fixed)

ChimeraX: mmcif connectivity lost in some chains; OK in Coot  + PDB ATOM -> HET ATM

Reported by: jacob_r_anderson@… Owned by: Greg Couch
Priority: normal Milestone:
Component: Input/Output Version:
Keywords: Cc: Eric Pettersen
Blocked By: Blocking:
Notify when closed: Platform: all
Project: ChimeraX

Description

Dear ChimeraX Bugs,

Thank you very much for your work on this great software. We use it daily, and without it, it would be extremely difficult to articulate our science.

I am writing about a problem I am having opening an mmCIF file in ChimeraX. While trying to work around this issue, I also ran into another problem when saving a PDB file with two-letter chain IDs.

For context, I am running ChimeraX on Ubuntu 20.04 using a fresh deb install this morning:

 chimerax --version
UCSF ChimeraX version: 1.0 (2020-06-04)
© 2016-2020 Regents of the University of California.  All rights reserved.

To be explicit, both problems are entirely reproducible. Specifically, I have an mmCIF file (attached) that, when opened in CCP-EM's or CCP4's Coot v0.9-pre, has all the connectivity present. However, when I switch to ChimeraX, the connectivity of some chains is lost. A good example of this contrast in chain connectivity is chains LA and LB (photo attached). Oddly, in this microtubule, all the alpha tubulins have connectivity, and the beta tubulins do not. I tried reading through the ChimeraX documentation on the mmCIF format, <https://www.rbvi.ucsf.edu/chimerax/docs/devel/bundles/mmcif/src/mmcif_guidelines.html>, but it remained unclear to me what is going awry in some chains but not in others.

Secondly, to avoid this connectivity issue, I saved the model as a PDB instead of an mmCIF, hoping the PDB would maintain connectivity in ChimeraX. I did the file conversion in Coot. Reviewing the file, it looks OK. It also opens and docks OK in ChimeraX with proper connectivity. However, upon saving the PDB with its updated coordinates after using the "Fit To Map" module, all the atoms have been changed to HETATM (in the zip as fit_to_map_PDB_savedinchimerax.pdb). Might this be a bug related to the two-letter chain IDs of the input PDB, or possibly a product of converting a CIF file to PDB in Coot? It was reproduced on a second try and by another member of our laboratory.

Since this was a lot, to summarize, the two problems are:

1) An mmCIF file that opens with proper connectivity in other software (Coot) does not open with that same connectivity in ChimeraX.

2) A PDB file with two-letter chain IDs, when saved, has all its ATOM lines converted to HETATM lines.
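
For anyone trying to confirm the second problem on their own files, a minimal sketch along these lines counts ATOM versus HETATM records before and after saving. The function name and the sample lines are made up for illustration; the only format fact relied on is that the record name occupies the first six columns of a PDB line.

```python
from collections import Counter


def count_coordinate_records(pdb_lines):
    """Count ATOM and HETATM records in an iterable of PDB-format lines."""
    counts = Counter()
    for line in pdb_lines:
        # The record name occupies columns 1-6 of a PDB line.
        record = line[:6].strip()
        if record in ("ATOM", "HETATM"):
            counts[record] += 1
    return counts


# Tiny made-up example; real use would pass an open file object instead.
example = [
    "ATOM      1  N   MET A   1",
    "HETATM    2  CA  MET A   1",
    "TER",
]
print(dict(count_coordinate_records(example)))
```

Running it on the file opened in ChimeraX and on the file saved from ChimeraX would show whether the ATOM count drops to zero after the save.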

We would very much appreciate any help regarding these two problems.







_____________________________

Jacob R. Anderson

GSAS Chemical Biology | G1

Cell: 440-308-9297

Mail: jacob_r_anderson@hms.harvard.edu

Added by email2trac

OK_in_coot.mmcif

pdb_hetatm_files.zip

Attachments (3)

image.png (240.0 KB ) - added by jacob_r_anderson@… 5 years ago.
Added by email2trac
OK_in_coot.mmcif (5.2 MB ) - added by jacob_r_anderson@… 5 years ago.
Added by email2trac
pdb_hetatm_files.zip (2.0 MB ) - added by jacob_r_anderson@… 5 years ago.
Added by email2trac

Change History (9)

by jacob_r_anderson@…, 5 years ago

Attachment: image.png added

Added by email2trac

by jacob_r_anderson@…, 5 years ago

Attachment: OK_in_coot.mmcif added

Added by email2trac

by jacob_r_anderson@…, 5 years ago

Attachment: pdb_hetatm_files.zip added

Added by email2trac

comment:1 by Eric Pettersen, 5 years ago

Cc: Eric Pettersen added
Component: Unassigned → Input/Output
Owner: set to Greg Couch
Platform: all
Project: ChimeraX
Status: new → assigned

The "PDB half" was already answered on the chimerax-users list.

comment:2 by Greg Couch, 5 years ago

Status: assigned → accepted

comment:3 by Greg Couch, 5 years ago

There are many problems with the given mmCIF file. The major problem for ChimeraX is that in the atom_site table, all of the chains are designated as the same entity, but they have different sequences. It would be best if you fixed the program that generated the mmCIF file to give each chain its own entity. It would be even better if the entity, entity_poly, and entity_poly_seq tables were given.
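
The entity problem described above can be checked for with a short script. The following is only a sketch under stated assumptions: it takes atom_site rows already reduced to (entity id, chain id, sequence number, residue name) tuples, and the function name is hypothetical. It is not the check ChimeraX itself performs.

```python
from collections import defaultdict


def chains_with_conflicting_entities(atom_rows):
    """Find entities whose chains do not all share one residue sequence.

    atom_rows: iterable of (entity_id, chain_id, seq_id, comp_id) tuples,
    one per atom (repeats of the same residue are harmless).
    Returns {entity_id: [[chains with sequence A], [chains with sequence B], ...]}
    for every entity that has more than one distinct sequence.
    """
    residues = defaultdict(dict)   # chain_id -> {seq_id: comp_id}
    chain_entity = {}              # chain_id -> entity_id
    for entity_id, chain_id, seq_id, comp_id in atom_rows:
        residues[chain_id][seq_id] = comp_id
        chain_entity[chain_id] = entity_id

    # entity_id -> {sequence as tuple of residue names: [chain ids]}
    by_entity = defaultdict(lambda: defaultdict(list))
    for chain_id, res in residues.items():
        sequence = tuple(res[k] for k in sorted(res))
        by_entity[chain_entity[chain_id]][sequence].append(chain_id)

    return {
        entity_id: sorted(sorted(chains) for chains in seqs.values())
        for entity_id, seqs in by_entity.items()
        if len(seqs) > 1
    }


# Made-up rows: chains LA and LB claim entity "1" but differ at residue 2.
rows = [
    ("1", "LA", 1, "MET"), ("1", "LA", 2, "ARG"),
    ("1", "LB", 1, "MET"), ("1", "LB", 2, "GLU"),
    ("2", "X", 1, "GLY"),
]
print(chains_with_conflicting_entities(rows))
```

An empty result would mean every entity maps to a single sequence; any entry in the result points at chains that should have been given their own entity.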

A secondary problem is that the struct_conn table is malformed. The id is not unique. LINK is not a legal conn_type_id. The mandatory ptnr[12]_label_* columns are missing. The ptnr[12]_auth_* columns are given, but the ptnr[12]_auth_atom_id values have explicit spaces in them that don't match the atom_site.auth_atom_id values. And the ptnr[1]_symmetry fields are bogus. Again, it would be best if you fixed the program that generated the mmCIF file.

That said, while I am reluctant to put in fixes for bad data because it slows down the mmCIF reader for everyone, I will see what I can do for the atom_site table. The struct_conn table is too messed up to work around.
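
Several of the struct_conn problems listed above are mechanical enough to test for. The sketch below assumes the table has already been read into a list of dicts keyed by column name (without the `_struct_conn.` prefix). The function name is hypothetical, the set of legal conn_type_id values is recalled from the PDBx/mmCIF dictionary and should be verified against it, and the exact list of label columns checked is an assumption, since the comment only says ptnr[12]_label_*.

```python
from collections import Counter

# Assumption: conn_type_id values as recalled from the PDBx/mmCIF dictionary;
# verify against the current dictionary before relying on this set.
LEGAL_CONN_TYPES = {
    "covale", "covale_base", "covale_phosphate", "covale_sugar",
    "disulf", "hydrog", "metalc", "mismat", "modres", "saltbr",
}

# Assumption: which ptnr[12]_label_* columns to require.
LABEL_COLUMNS = [
    f"ptnr{n}_label_{field}"
    for n in (1, 2)
    for field in ("asym_id", "comp_id", "seq_id", "atom_id")
]


def check_struct_conn(rows):
    """Return a list of problem descriptions for struct_conn rows (dicts)."""
    problems = []

    # Every row id should be unique.
    id_counts = Counter(row.get("id", "") for row in rows)
    for conn_id, n in sorted(id_counts.items()):
        if n > 1:
            problems.append(f"id {conn_id!r} appears {n} times")

    for row in rows:
        conn_type = row.get("conn_type_id", "")
        if conn_type.lower() not in LEGAL_CONN_TYPES:
            problems.append(f"illegal conn_type_id {conn_type!r}")

        missing = [c for c in LABEL_COLUMNS if c not in row]
        if missing:
            problems.append(f"missing columns: {', '.join(missing)}")

        # Padded atom names will not match atom_site.auth_atom_id.
        for n in (1, 2):
            atom = row.get(f"ptnr{n}_auth_atom_id", "")
            if atom != atom.strip():
                problems.append(f"ptnr{n}_auth_atom_id {atom!r} has padding spaces")
    return problems


# Made-up rows that reproduce the kinds of problems described above.
bad_rows = [
    {"id": "1", "conn_type_id": "LINK",
     "ptnr1_auth_atom_id": " SG ", "ptnr2_auth_atom_id": "SG"},
    {"id": "1", "conn_type_id": "disulf", **{c: "." for c in LABEL_COLUMNS}},
]
for problem in check_struct_conn(bad_rows):
    print(problem)
```

The symmetry fields are left out of the sketch because the comment does not say what makes them bogus.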

in reply to:  7 comment:4 by jacob_r_anderson@…, 5 years ago

I really appreciate you taking a look at the file. Naturally I was confused, as it opens OK in other software but not in ChimeraX. It sounds like it is in pretty bad shape. I am not sure how the file was corrupted, but I did use "Change Chain Order" and "Change Chain IDs" in Coot before saving as a CIF. I will contact the Coot developer if it happens again or if I can pin down the cause.

Thank you again, greatly, for your time and help with this matter.


_____________________________

Jacob R. Anderson

GSAS Chemical Biology | G1

MD/PhD Candidate | Harvard Medical School

Cell: 440-308-9297

Mail: jacob_r_anderson@hms.harvard.edu

________________________________
From: ChimeraX <ChimeraX-bugs-admin@cgl.ucsf.edu>
Sent: Friday, June 19, 2020 12:24 AM
Cc: gregc@cgl.ucsf.edu <gregc@cgl.ucsf.edu>; Anderson, Jacob <jacob_r_anderson@hms.harvard.edu>; pett@cgl.ucsf.edu <pett@cgl.ucsf.edu>
Subject: Re: [ChimeraX] #3397: ChimeraX: mmcif connectivity lost in some chains; OK in Coot  + PDB ATOM -> HET ATM

--
Ticket URL: <https://plato.cgl.ucsf.edu/trac/ChimeraX/ticket/3397#comment:3>
ChimeraX <http://www.rbvi.ucsf.edu/chimerax/>
ChimeraX Issue Tracker

comment:5 by Greg Couch, 5 years ago

Revised code that deals with missing entity_poly_seq to work in this case. It is still confused about the GDP residues. Will investigate a little more.

comment:6 by Greg Couch, 5 years ago

Resolution: fixed
Status: accepted → closed

GDP isn't broken. GDP's type is "RNA linking", i.e., a polymeric residue, so if it is in an mmCIF entity, it should be connected. So the missing segment bond is correct. GTP's type is "non-polymer", so if it is in an entity, there should be explicit inter-residue bonds to it in the struct_conn table.
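
The distinction drawn here can be written down as a small rule. This is only an illustrative sketch with hypothetical function names, not the logic in ChimeraX's reader, and it assumes that polymer-linking component types are exactly those whose type string ends in "linking".

```python
def is_polymer_linking(chem_comp_type):
    """True if a chemical component type names a polymer-linking residue."""
    # Assumption: linking types end in "linking" (e.g. "RNA linking",
    # "L-peptide linking"); anything else, such as "non-polymer", does not.
    return chem_comp_type.strip().lower().endswith("linking")


def needs_explicit_bond(chem_comp_type):
    """True if inter-residue bonds should be listed in struct_conn."""
    return not is_polymer_linking(chem_comp_type)


# Types quoted in the comment above.
for name, comp_type in (("GDP", "RNA linking"), ("GTP", "non-polymer")):
    print(name, "needs explicit struct_conn bond:", needs_explicit_bond(comp_type))
```

Under this rule a residue such as GDP is treated as part of the polymer, so a gap next to it is shown as a missing segment, while a residue such as GTP is only bonded if struct_conn says so.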
