Opened 5 years ago
Closed 3 years ago
#3539 closed defect (limitation)
Problems with large de novo mmCIF structure
Reported by: | Tristan Croll | Owned by: | Greg Couch |
---|---|---|---|
Priority: | normal | Milestone: | |
Component: | Input/Output | Version: | |
Keywords: | Cc: | Eric Pettersen | |
Blocked By: | Blocking: | ||
Notify when closed: | Platform: | all | |
Project: | ChimeraX |
Description
The following bug report has been submitted:
Platform: Linux-3.10.0-1127.13.1.el7.x86_64-x86_64-with-centos-7.8.2003-Core
ChimeraX Version: 1.0 (2020-06-04 23:15:07 UTC)
Description
Something seems slightly weird about this mmCIF file (which, for the record, will be to my knowledge the first big new experimental complex built essentially entirely in ChimeraX/ISOLDE, starting from homology models for individual chains). Up-front: it's unpublished and confidential, so could you remove it off the bug tracker once you have a local copy?

Anyway, I noticed after opening it that one ligand (CDL, /a:1025) was shown in sphere representation like the protein, rather than sticks like all the many other ligands. Also, looking at the mmCIF file in a text editor I see a number of different ligands (as in, different residue names) have the same _struct_asym.entity_id. This *may* be because I "mutated" quite a few of the lipids (i.e. renamed the residue and rebuilt from the template for the new one, keeping atoms common between the old and new).

Log:
UCSF ChimeraX version: 1.0 (2020-06-04)
© 2016-2020 Regents of the University of California. All rights reserved.
How to cite UCSF ChimeraX

> open Tristan_working_edited_Sc_20July20.cif

Summary of feedback from opening Tristan_working_edited_Sc_20July20.cif
---
warnings | Unknown polymer entity '1' near line 376
Unknown polymer entity '2' near line 2754
Unknown polymer entity '3' near line 5432
Unknown polymer entity '4' near line 8583
Unknown polymer entity '5' near line 15222
25 messages similar to the above omitted
Atom H1 is not in the residue template for MET /A:1
Atom H1 is not in the residue template for LEU /B:14
Atom H3 is not in the residue template for ALA /C:4
Atom H3 is not in the residue template for ASP /D:2
Atom HH11 is not in the residue template for 2MR /D:65
Atom H1 is not in the residue template for MET /E:1
Atom H3 is not in the residue template for GLN /E:9
Atom H3 is not in the residue template for ALA /G:2
Atom H1 is not in the residue template for ALA /H:2
Atom H4 is not in the residue template for P5S /H:1003
Atom H1 is not in the residue template for ALA /I:2
Atom H1 is not in the residue template for MET /J:1
Atom H1 is not in the residue template for MET /K:1
Atom H4 is not in the residue template for P5S /L:2003
Atom H1 is not in the residue template for MET /M:1
Atom H3 is not in the residue template for ALA /P:3
Atom H1 is not in the residue template for MET /Q:1
Atom H1 is not in the residue template for THR /Z:2
Atom H4 is not in the residue template for P5S /a:1017
Atom H1 is not in the residue template for ALA /b:212
Atom H1 is not in the residue template for ALA /e:212
Atom H1 is not in the residue template for GLN /h:30
Atom H1 is not in the residue template for ALA /i:2
Atom H3 is not in the residue template for THR /j:8
Atom H1 is not in the residue template for GLN /l:30
Atom H1 is not in the residue template for ALA /m:2
Atom H1 is not in the residue template for THR /n:8
Atom H3 is not in the residue template for PHE /o:2
Atom H1 is not in the residue template for PHE /p:2
Atom H1 is not in the residue template for MET /q:1
Missing or incomplete entity_poly_seq table. Inferred polymer connectivity.

Chain information for Tristan_working_edited_Sc_20July20.cif #1
---
Chain | Description
A | ?
B | ?
C | ?
D | ?
E | ?
F | ?
G | ?
H | ?
I | ?
J | ?
K | ?
L | ?
M | ?
N | ?
P | ?
Q | ?
R | ?
Z | ?
a | ?
b e | ?
c | ?
d | ?
f | ?
g k | ?
h l | ?
i m | ?
j n | ?
o | ?
p | ?
q | ?

> open 375Box_Default_BFactor_postprocess_job1571_1_048_pix.mrc
Opened 375Box_Default_BFactor_postprocess_job1571_1_048_pix.mrc, grid size 375,375,375, pixel 1.05, shown at level 0.0235, step 2, values float32
> volume #2 level 0.03367
> cartoon
> hide protein
> volume #2 step 1
> select /a:1025@O1
1 atom, 1 model selected
> select up
256 atoms, 255 bonds, 1 model selected
> ui tool show Shell
/opt/UCSF/ChimeraX/lib/python3.7/site-packages/IPython/core/history.py:226: UserWarning: IPython History requires SQLite, your history will not be saved
warn("IPython History requires SQLite, your history will not be saved")
> select /BT:51
Nothing selected
> select /C:51
19 atoms, 18 bonds, 1 model selected
> select /a:1025@C75
1 atom, 1 model selected
> select /BK
Nothing selected
> select /AA
Nothing selected

OpenGL version: 3.3.0 NVIDIA 450.51.05
OpenGL renderer: TITAN Xp/PCIe/SSE2
OpenGL vendor: NVIDIA Corporation
Manufacturer: Dell Inc.
Model: Precision T5600
OS: CentOS Linux 7 Core
Architecture: 64bit ELF
CPU: 32 Intel(R) Xeon(R) CPU E5-2687W 0 @ 3.10GHz
Cache Size: 20480 KB
Memory:
total used free shared buff/cache available
Mem: 62G 12G 37G 249M 13G 49G
Swap: 4.9G 0B 4.9G
Graphics:
03:00.0 VGA compatible controller [0300]: NVIDIA Corporation GP102 [TITAN Xp] [10de:1b02] (rev a1)
Subsystem: NVIDIA Corporation Device [10de:11df]
Kernel driver in use: nvidia
PyQt version: 5.12.3
Compiled Qt version: 5.12.4
Runtime Qt version: 5.12.8

File attachment: Tristan_working_edited_Sc_20July20.cif
Change History (10)
comment:1 by , 5 years ago
Cc: | added |
---|---|
Component: | Unassigned → Input/Output |
Owner: | set to |
Platform: | → all |
Project: | → ChimeraX |
Status: | new → assigned |
Summary: | ChimeraX bug report submission → Problems with large de novo mmCIF structure |
comment:2 by , 5 years ago
I have the Tristan_working_edited_Sc_20July20.cif file. But I don't see how to delete the attachment.
comment:4 by , 5 years ago
Okay, the CDL thing is fixed. The rules used to classify the parts of the model into ligand, etc. are a set of heuristics. One of the heuristics for ligand is that it is a maximum of 250 atoms. CDL has 256 atoms, so I tweaked the limit up to 256.
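For anyone reading along, the size rule can be pictured with a rough sketch like the one below. This is not the actual ChimeraX code: the function name, the "other" label, and the idea that the cutoff is a single standalone check are all assumptions made for illustration. Only the 250 and 256 atom limits come from this ticket.

```python
# Hypothetical sketch of a size-based ligand heuristic; NOT ChimeraX's implementation.
MAX_LIGAND_ATOMS = 256  # limit after the tweak described above (it was 250 before)


def classify_by_size(num_atoms, max_ligand_atoms=MAX_LIGAND_ATOMS):
    """Return "ligand" if the group is at or under the atom limit, else "other".

    "other" is a placeholder label invented for this sketch.
    """
    if num_atoms <= 0:
        raise ValueError("num_atoms must be positive")
    return "ligand" if num_atoms <= max_ligand_atoms else "other"


if __name__ == "__main__":
    # CDL in this model has 256 atoms (see "select up" in the log).
    print(classify_by_size(256, max_ligand_atoms=250))  # old limit
    print(classify_by_size(256))                        # new limit
```

With the old limit the 256-atom CDL falls just outside the ligand category, which would be consistent with it being drawn like the protein; with the new limit it is classified as a ligand.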
follow-up: 5 comment:5 by , 5 years ago
Biggest ligand currently in the CCD is JSG, an ungodly multi-lipid-tail construct with 440 atoms.
comment:6 by , 5 years ago
Thanks for the info. I'm probably only going to raise that threshold on an "as needed" basis. Since it's a heuristic, the larger I make it, the more likely it gets other scenarios wrong -- like classifying shorter peptide chains as ligand in a larger system.
follow-up: 7 comment:7 by , 5 years ago
You're probably pretty safe there. Only 6 ligands over 256 atoms currently exist: B4X, GXB, JSG, L0W, LHI, X12, and between them they're found in 7 structures.
comment:8 by , 5 years ago
Yes, the mmCIF file has problems. The CDL residue has the same label_asym_id and label_entity_id as the two DU0 residues around it, so they were probably in the same ChimeraX internal Chain (aka an mmCIF entity). I would have expected the CDL to be a different entity (with a different label_asym_id) than the DU0s.
I've added a bestGuess option to the mmCIF writer that outputs the entity/entity_poly/entity_poly_seq tables that correspond to ChimeraX's internal data structures. That should make it easier to stop the bad entities after the fact.
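A quick way to spot the shared-entity problem in a file is sketched below, on the assumption that a non-polymer entity should correspond to a single residue name. The helper name and the example IDs are made up, and the sketch assumes the (label_asym_id, label_entity_id, label_comp_id) triples for the non-polymer residues have already been pulled out of the atom_site table by whatever mmCIF reader is at hand.

```python
from collections import defaultdict


def mixed_entities(records):
    """Find entity IDs that are shared by more than one residue name.

    records: iterable of (label_asym_id, label_entity_id, label_comp_id)
    tuples for non-polymer residues (assumed to be parsed elsewhere).
    Returns {entity_id: sorted list of residue names} for offending entities.
    """
    names_by_entity = defaultdict(set)
    for _asym_id, entity_id, comp_id in records:
        names_by_entity[entity_id].add(comp_id)
    return {
        entity_id: sorted(names)
        for entity_id, names in names_by_entity.items()
        if len(names) > 1
    }


if __name__ == "__main__":
    # Made-up IDs that mirror the CDL/DU0 situation described above.
    example = [
        ("X", "40", "DU0"),
        ("X", "40", "CDL"),
        ("X", "40", "DU0"),
        ("Y", "41", "P5S"),
    ]
    print(mixed_entities(example))
```

Any entity reported here lumps different ligands together and is a candidate for having been produced by a residue that was renamed after the chain was built.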
comment:9 by , 5 years ago
ack. *see* the bad entities after the fact. Which should help fixing the code that confused ChimeraX in the first place.
comment:10 by , 3 years ago
Resolution: | → limitation |
---|---|
Status: | assigned → closed |
I think the "CDL not classified as ligand" is me, and the "non-distinct asym.entity_ids" is Greg. Greg, when you've downloaded the attached structure can you delete the attachment? I've already downloaded a copy.
--Eric