Opened 5 years ago

Closed 3 years ago

#3539 closed defect (limitation)

Problems with large de novo mmCIF structure

Reported by: Tristan Croll
Owned by: Greg Couch
Priority: normal
Milestone:
Component: Input/Output
Version:
Keywords:
Cc: Eric Pettersen
Blocked By:
Blocking:
Notify when closed:
Platform: all
Project: ChimeraX

Description

The following bug report has been submitted:
Platform:        Linux-3.10.0-1127.13.1.el7.x86_64-x86_64-with-centos-7.8.2003-Core
ChimeraX Version: 1.0 (2020-06-04 23:15:07 UTC)
Description
Something seems slightly weird about this mmCIF file (which, for the record, will be, to my knowledge, the first big new experimental complex built essentially entirely in ChimeraX/ISOLDE, starting from homology models for individual chains). Up front: it's unpublished and confidential, so could you remove it from the bug tracker once you have a local copy?

Anyway, I noticed after opening it that one ligand (CDL, /a:1025) was shown in sphere representation like the protein, rather than sticks like all the many other ligands. Also, looking at the mmCIF file in a text editor I see a number of different ligands (as in, different residue names) have the same _struct_asym.entity_id. This *may* be because I "mutated" quite a few of the lipids (i.e. renamed the residue and rebuilt from the template for the new one, keeping atoms common between the old and new).

Log:
UCSF ChimeraX version: 1.0 (2020-06-04)  
© 2016-2020 Regents of the University of California. All rights reserved.  
How to cite UCSF ChimeraX  

> open Tristan_working_edited_Sc_20July20.cif

Summary of feedback from opening Tristan_working_edited_Sc_20July20.cif  
---  
warnings | Unknown polymer entity '1' near line 376  
Unknown polymer entity '2' near line 2754  
Unknown polymer entity '3' near line 5432  
Unknown polymer entity '4' near line 8583  
Unknown polymer entity '5' near line 15222  
25 messages similar to the above omitted  
Atom H1 is not in the residue template for MET /A:1  
Atom H1 is not in the residue template for LEU /B:14  
Atom H3 is not in the residue template for ALA /C:4  
Atom H3 is not in the residue template for ASP /D:2  
Atom HH11 is not in the residue template for 2MR /D:65  
Atom H1 is not in the residue template for MET /E:1  
Atom H3 is not in the residue template for GLN /E:9  
Atom H3 is not in the residue template for ALA /G:2  
Atom H1 is not in the residue template for ALA /H:2  
Atom H4 is not in the residue template for P5S /H:1003  
Atom H1 is not in the residue template for ALA /I:2  
Atom H1 is not in the residue template for MET /J:1  
Atom H1 is not in the residue template for MET /K:1  
Atom H4 is not in the residue template for P5S /L:2003  
Atom H1 is not in the residue template for MET /M:1  
Atom H3 is not in the residue template for ALA /P:3  
Atom H1 is not in the residue template for MET /Q:1  
Atom H1 is not in the residue template for THR /Z:2  
Atom H4 is not in the residue template for P5S /a:1017  
Atom H1 is not in the residue template for ALA /b:212  
Atom H1 is not in the residue template for ALA /e:212  
Atom H1 is not in the residue template for GLN /h:30  
Atom H1 is not in the residue template for ALA /i:2  
Atom H3 is not in the residue template for THR /j:8  
Atom H1 is not in the residue template for GLN /l:30  
Atom H1 is not in the residue template for ALA /m:2  
Atom H1 is not in the residue template for THR /n:8  
Atom H3 is not in the residue template for PHE /o:2  
Atom H1 is not in the residue template for PHE /p:2  
Atom H1 is not in the residue template for MET /q:1  
Missing or incomplete entity_poly_seq table. Inferred polymer connectivity.  
  
Chain information for Tristan_working_edited_Sc_20July20.cif #1  
---  
Chain | Description  
A | ?  
B | ?  
C | ?  
D | ?  
E | ?  
F | ?  
G | ?  
H | ?  
I | ?  
J | ?  
K | ?  
L | ?  
M | ?  
N | ?  
P | ?  
Q | ?  
R | ?  
Z | ?  
a | ?  
b e | ?  
c | ?  
d | ?  
f | ?  
g k | ?  
h l | ?  
i m | ?  
j n | ?  
o | ?  
p | ?  
q | ?  
  

> open 375Box_Default_BFactor_postprocess_job1571_1_048_pix.mrc

Opened 375Box_Default_BFactor_postprocess_job1571_1_048_pix.mrc, grid size
375,375,375, pixel 1.05, shown at level 0.0235, step 2, values float32  

> volume #2 level 0.03367

> cartoon

> hide protein

> volume #2 step 1

> select /a:1025@O1

1 atom, 1 model selected  

> select up

256 atoms, 255 bonds, 1 model selected  

> ui tool show Shell

/opt/UCSF/ChimeraX/lib/python3.7/site-packages/IPython/core/history.py:226:
UserWarning: IPython History requires SQLite, your history will not be saved  
warn("IPython History requires SQLite, your history will not be saved")  

> select /BT:51

Nothing selected  

> select /C:51

19 atoms, 18 bonds, 1 model selected  

> select /a:1025@C75

1 atom, 1 model selected  

> select /BK

Nothing selected  

> select /AA

Nothing selected  




OpenGL version: 3.3.0 NVIDIA 450.51.05
OpenGL renderer: TITAN Xp/PCIe/SSE2
OpenGL vendor: NVIDIA Corporation
Manufacturer: Dell Inc.
Model: Precision T5600
OS: CentOS Linux 7 Core
Architecture: 64bit ELF
CPU: 32 Intel(R) Xeon(R) CPU E5-2687W 0 @ 3.10GHz
Cache Size: 20480 KB
Memory:
	              total        used        free      shared  buff/cache   available
	Mem:            62G         12G         37G        249M         13G         49G
	Swap:          4.9G          0B        4.9G

Graphics:
	03:00.0 VGA compatible controller [0300]: NVIDIA Corporation GP102 [TITAN Xp] [10de:1b02] (rev a1)	
	Subsystem: NVIDIA Corporation Device [10de:11df]	
	Kernel driver in use: nvidia
PyQt version: 5.12.3
Compiled Qt version: 5.12.4
Runtime Qt version: 5.12.8
File attachment: Tristan_working_edited_Sc_20July20.cif


Change History (10)

comment:1 by Eric Pettersen, 5 years ago

Cc: Eric Pettersen added
Component: Unassigned → Input/Output
Owner: set to Greg Couch
Platform: all
Project: ChimeraX
Status: new → assigned
Summary: ChimeraX bug report submission → Problems with large de novo mmCIF structure

I think the "CDL not classified as ligand" is me, and the "non-distinct asym.entity_ids" is Greg. Greg, when you've downloaded the attached structure can you delete the attachment? I've already downloaded a copy.

--Eric

comment:2 by Greg Couch, 5 years ago

I have the Tristan_working_edited_Sc_20July20.cif file. But I don't see how to delete the attachment.

comment:3 by Greg Couch, 5 years ago

Figured it out. The attachment has been deleted.

comment:4 by Eric Pettersen, 5 years ago

Okay, the CDL thing is fixed. The rules used to classify the parts of the model into ligand, etc. are a set of heuristics. One of the heuristics for ligand is that it is a maximum of 250 atoms. CDL has 256 atoms, so I tweaked the limit up to 256.
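
A minimal sketch of the kind of size cutoff described above, for illustration only. This is not ChimeraX's actual code: the function name `classify_by_size`, the constant name, and the two-way "ligand"/"other" split are all assumptions, and the real classifier combines several heuristics rather than just this one.

```python
# Hypothetical illustration: models only the atom-count cutoff discussed in
# this ticket, not the full set of heuristics ChimeraX applies.
MAX_LIGAND_ATOMS = 256  # raised from 250 as part of this fix


def classify_by_size(num_atoms, max_ligand_atoms=MAX_LIGAND_ATOMS):
    """Return "ligand" if the group is within the size cutoff, else "other"."""
    if num_atoms <= 0:
        raise ValueError("num_atoms must be positive")
    return "ligand" if num_atoms <= max_ligand_atoms else "other"


# CDL /a:1025 has 256 atoms (see the "select up" output in the log).
print(classify_by_size(256, max_ligand_atoms=250))  # other
print(classify_by_size(256))                        # ligand
```

With the old cutoff of 250 the 256-atom CDL falls outside the ligand class, which matches it being drawn as spheres like the protein.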

comment:5 by Tristan Croll, 5 years ago

Biggest ligand currently in the CCD is JSG, an ungodly multi-lipid-tail 
construct with 440 atoms.

On 2020-07-22 16:39, ChimeraX wrote:

comment:6 by Eric Pettersen, 5 years ago

Thanks for the info. I'm probably only going to raise that threshold on an "as needed" basis. Since it's a heuristic, the larger I make it, the more likely it gets other scenarios wrong -- like classifying shorter peptide chains as ligand in a larger system.

comment:7 by Tristan Croll, 5 years ago

You're probably pretty safe there. Only 6 ligands with more than 256 atoms
currently exist in the CCD: B4X, GXB, JSG, L0W, LHI and X12, and between
them they're found in 7 structures.

On 2020-07-22 17:06, ChimeraX wrote:

comment:8 by Greg Couch, 5 years ago

Yes, the mmCIF file has problems. The CDL residue has the same label_asym_id and label_entity_id as the two DU0 residues around it, so they were probably in the same ChimeraX internal Chain, i.e. the same mmCIF entity. I would have expected the CDL to be a different entity (and to have a different label_asym_id) than the DU0s.
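
One way to spot this kind of problem in a file is to group the non-polymer residues by label_entity_id and flag any entity that covers more than one residue name. The sketch below assumes the three atom_site columns have already been extracted into tuples by whatever mmCIF reader is at hand; the helper name `mixed_nonpolymer_entities` and the sample ID values are made up for illustration and are not taken from the actual file.

```python
from collections import defaultdict


def mixed_nonpolymer_entities(rows):
    """Return {label_entity_id: sorted residue names} for every entity
    that contains more than one distinct residue name.

    rows: iterable of (label_asym_id, label_entity_id, label_comp_id)
    tuples, restricted to non-polymer residues.
    """
    names_by_entity = defaultdict(set)
    for _asym_id, entity_id, comp_id in rows:
        names_by_entity[entity_id].add(comp_id)
    return {
        entity_id: sorted(names)
        for entity_id, names in names_by_entity.items()
        if len(names) > 1
    }


# Made-up rows mimicking the situation described above: a CDL sharing its
# asym and entity IDs with two DU0 residues. The ID values are hypothetical.
rows = [
    ("AA", "31", "DU0"),
    ("AA", "31", "CDL"),
    ("AA", "31", "DU0"),
    ("AB", "32", "P5S"),
]
print(mixed_nonpolymer_entities(rows))  # {'31': ['CDL', 'DU0']}
```

The check only makes sense for non-polymer residues, since a polymer entity legitimately contains many residue names.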

I've added a bestGuess option to the mmCIF writer that outputs the entity/entity_poly/entity_poly_seq tables that correspond to ChimeraX's internal data structures. That should make it easier to stop the bad entities after the fact.

comment:9 by Greg Couch, 5 years ago

ack. *see* the bad entities after the fact. Which should help fixing the code that confused ChimeraX in the first place.

comment:10 by Greg Couch, 3 years ago

Resolution: limitation
Status: assigned → closed