Opened 4 years ago

Closed 6 months ago

#4682 closed defect (fixed)

Bug writing mmCIF for glycosylated protein

Reported by: Tristan Croll
Owned by: Greg Couch
Priority: normal
Milestone:
Component: Input/Output
Version:
Keywords:
Cc:
Blocked By:
Blocking:
Notify when closed:
Platform: all
Project: ChimeraX

Description (last modified by Greg Couch)

(tried to submit via ChimeraX bug report tool, but server is down)

Something is going very wrong when saving a glycosylated structure to mmCIF (but not to PDB)... attached session (needs ISOLDE in ChimeraX 1.2) contains the model I was working on as model #2, and the result of saving it to mmCIF and reloading it as model #4. There are multiple problems, seemingly all to do with the glycans on chain A:


- all glycosylated asparagines on chain A end up bonded to the same NAG (N 1)

- the fucose on chain L also ends up bonded to NAG N 1 rather than NAG L 1

- NAG L 2 gets renamed to NAG K 2
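
The first symptom can be checked mechanically by grouping the glycan links in the reloaded model by sugar residue and flagging any sugar attached to more than one asparagine. The following is only a minimal sketch: the `shared_sugars` helper and the residue labels (including the residue numbers) are hypothetical stand-ins, not values from the attached session and not part of the ChimeraX API.

```python
from collections import defaultdict

def shared_sugars(links):
    """Return sugars that are linked to more than one asparagine.

    `links` is an iterable of (asparagine_label, sugar_label) pairs.
    """
    by_sugar = defaultdict(set)
    for asn, sugar in links:
        by_sugar[sugar].add(asn)
    return {sugar: sorted(asns) for sugar, asns in by_sugar.items() if len(asns) > 1}

# Hypothetical links, invented purely for illustration.
links = [
    ("ASN A 61", "NAG N 1"),
    ("ASN A 122", "NAG N 1"),
    ("ASN A 234", "NAG N 1"),
    ("ASN B 61", "NAG O 1"),
]
suspect = shared_sugars(links)
print(suspect)
```

In a correctly written file each N-linked NAG should be attached to exactly one asparagine, so an empty result is the expected outcome.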


A bit of further detail:

- the map is C3 symmetric. One of the things that led to this state is that, after rebuilding chain A, I re-imposed symmetry by deleting chains B and C (and their glycans) and running the following (where `transforms` is a `Places` array and `m` is the working model):


from chimerax.isolde.atomic.building.ncs import create_all_ncs_copies

create_all_ncs_copies(m, transforms)


While this code isn't perfect (mainly because it doesn't preserve bonds between chains), it's worked for me without any major trouble many times until now. This time, saving to mmCIF led to all protein residues past the first chain break in each new chain being treated as ligands (which was seemingly resolved by using `chain.bulk_set()` to reassign them to match chain A).


The other thing I did (after reimposing the missing bonds) was to run `chimerax.isolde.atomic.ligand_utils.recluster_ligands()` to reassign the glycan chain IDs to correspond to their parent chains. That too is something I've used quite a few times now without apparent trouble... but after that is when I noticed the new bonding issues. Note that code assigns the glycans 4-character chain IDs (Agl0, Agl1, etc.) to match their parent protein chain - I reassigned them back to 1-character IDs in order to test whether saving to PDB would still be OK (it was).



Attachments (2)

cif_bug.cxs (6.5 MB) - added by Tristan Croll 4 years ago.
Added by email2trac
test.cif (4.4 MB) - added by Tristan Croll 3 years ago.
Repeated process in new daily build

Change History (6)

by Tristan Croll, 4 years ago

Attachment: cif_bug.cxs added

Added by email2trac

comment:1 by pett, 4 years ago

Component: Unassigned → Input/Output
Owner: set to Greg Couch
Platform: all
Project: ChimeraX
Status: new → assigned

comment:2 by Greg Couch, 3 years ago

Status: assigned → feedback

Is this still a problem?

I wonder if this has the same solution as #4342 (try tomorrow's daily build). NAG L 2 being renamed to NAG K 2 might not be a bug -- unless L and K are the auth_asym_id's.
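
For context, mmCIF stores two chain identifiers per atom, `_atom_site.label_asym_id` and `_atom_site.auth_asym_id`, and a writer may assign new label IDs while keeping the author IDs. A rough way to see which of the two changed is to list the pairs that differ. The sketch below is only an illustration: `asym_id_pairs` is a hypothetical helper, the sample rows are invented, and the parser only handles simple whitespace-separated `_atom_site` loops (no quoted or multi-line values).

```python
def asym_id_pairs(cif_text):
    """Collect (label_asym_id, auth_asym_id) pairs from a simple _atom_site loop."""
    columns = []
    pairs = set()
    for raw in cif_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or line == "loop_":
            columns = []  # a blank line, comment, or new loop ends the current loop
        elif line.startswith("_atom_site."):
            columns.append(line.split(".", 1)[1])
        elif line.startswith("_"):
            columns = []  # a data item from some other category
        elif columns:
            fields = line.split()
            if len(fields) == len(columns):
                row = dict(zip(columns, fields))
                if "label_asym_id" in row and "auth_asym_id" in row:
                    pairs.add((row["label_asym_id"], row["auth_asym_id"]))
    return pairs

# Invented sample data, not taken from the attached files.
sample = """\
loop_
_atom_site.group_PDB
_atom_site.id
_atom_site.label_comp_id
_atom_site.label_asym_id
_atom_site.auth_asym_id
_atom_site.auth_seq_id
HETATM 1 NAG K L 2
ATOM 2 ASN A A 100
"""
renamed = sorted(p for p in asym_id_pairs(sample) if p[0] != p[1])
print(renamed)
```

If the saved file showed label K paired with author L for that residue, the rename would only affect the label ID. If the author ID itself changed, that would point to a real problem.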

by Tristan Croll, 3 years ago

Attachment: test.cif added

Repeated process in new daily build

comment:3 by Tristan Croll, 3 years ago

Hmm... I deleted everything but chain A and the glycans attached to it and repeated the original process. After saving to .cif and reopening, the linking now appears correct, but chains B and C aren't identified as protein (POLYMER_TYPE==0).

comment:4 by Greg Couch, 6 months ago

Description: modified (diff)
Resolution: fixed
Status: feedback → closed

After opening test.cif, all chains have polymer_type == 1
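
A check along these lines can be scripted. The sketch below is an assumption-laden illustration rather than the check actually used: the `PT_*` values are assumed to mirror the `chimerax.atomic.Residue.PT_*` constants (0 for none, 1 for amino acid, 2 for nucleic acid), `non_protein_chains` is a hypothetical helper, and the chains are plain stand-in objects so the snippet runs outside ChimeraX.

```python
from types import SimpleNamespace

# Assumed to mirror chimerax.atomic.Residue.PT_NONE / PT_AMINO / PT_NUCLEIC.
PT_NONE, PT_AMINO, PT_NUCLEIC = 0, 1, 2

def non_protein_chains(chains):
    """Return the IDs of chains whose polymer_type is not amino acid."""
    return [c.chain_id for c in chains if c.polymer_type != PT_AMINO]

# Stand-ins reproducing the state reported in comment 3 (B and C not protein).
stand_ins = [
    SimpleNamespace(chain_id="A", polymer_type=PT_AMINO),
    SimpleNamespace(chain_id="B", polymer_type=PT_NONE),
    SimpleNamespace(chain_id="C", polymer_type=PT_NONE),
]
flagged = non_protein_chains(stand_ins)
print(flagged)
```

Inside ChimeraX, passing the `chains` collection of the reopened model to the same function should work, assuming each chain exposes `chain_id` and `polymer_type`. An empty list corresponds to the fixed behavior.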
