Opened 5 years ago

Last modified 5 years ago

#3435 assigned defect

2D molecule drawing

Reported by: Tristan Croll | Owned by: Tristan Croll
Priority: normal | Milestone:
Component: Depiction | Version:
Keywords: | Cc: pett, Tom Goddard
Blocked By: | Blocking:
Notify when closed: | Platform: all
Project: ChimeraX

Description

The following bug report has been submitted:
Platform:        Linux-3.10.0-1127.10.1.el7.x86_64-x86_64-with-centos-7.8.2003-Core
ChimeraX Version: 1.0 (2020-06-04 23:15:07 UTC)
Description
Would be nice to have built-in access to a library that can draw a 2D sketch of a molecule. My graph-matching "find possible MD templates" approach appears to be working reasonably nicely now - but the new problem is that the "raw" results are going to be quite confusing to the user. If I delete the sidechain amine from a lysine, for example, it finds the following potential templates:
LYN, LYS, CLYS, NLYS, PTM_LYZ, PTM_MLY, PTM_MLZ, ZK
... which are all correct in that they're indeed variants of lysine - but without a more informative description it's going to be hard for the user to choose. I can see a few paths forward:
- amend the OpenMM `ForceField` to allow longer descriptions of a residue to be stored directly with its template (I have a ticket in with them suggesting this)
- get the description from the CCD template (will be possible when the MD template name contains the CCD template name, but quite difficult otherwise)
- draw a 2D sketch highlighting mismatched atoms
- do it in 3D, similarly to how I handle rotamer previews now (create an extra minimal molecule from the base residue, edit it to match the template, and find a display method to clearly highlight the mismatches)
- some combination of the above.
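
To make the "highlight the mismatched atoms" idea concrete, here is a minimal sketch of topology matching on heavy atoms only. It is not the ISOLDE implementation: the template graphs below are hand-written, simplified stand-ins (hypothetical data, not read from any force field), and `find_embedding` is an illustrative helper that does a plain backtracking subgraph search. For each template that contains the residue's bonding pattern, it reports which template atoms the residue lacks.

```python
from typing import Dict, List, Optional, Set, Tuple

# A graph is (atom name -> element, list of bonded atom-name pairs).
Graph = Tuple[Dict[str, str], List[Tuple[str, str]]]


def adjacency(graph: Graph) -> Dict[str, Set[str]]:
    """Build a neighbour lookup from the bond list."""
    elements, bonds = graph
    adj: Dict[str, Set[str]] = {name: set() for name in elements}
    for a, b in bonds:
        adj[a].add(b)
        adj[b].add(a)
    return adj


def find_embedding(residue: Graph, template: Graph) -> Optional[Dict[str, str]]:
    """Map each residue atom onto a distinct template atom of the same
    element so that every residue bond is also a template bond.
    Returns None if no such mapping exists."""
    r_elem, t_elem = residue[0], template[0]
    r_adj, t_adj = adjacency(residue), adjacency(template)
    # Place the most-connected atoms first to prune the search early.
    order = sorted(r_elem, key=lambda name: -len(r_adj[name]))
    mapping: Dict[str, str] = {}
    used: Set[str] = set()

    def extend(i: int) -> bool:
        if i == len(order):
            return True
        atom = order[i]
        for cand in t_elem:
            if cand in used or t_elem[cand] != r_elem[atom]:
                continue
            # Bonds to already-placed neighbours must exist in the template.
            if all(mapping[nb] in t_adj[cand] for nb in r_adj[atom] if nb in mapping):
                mapping[atom] = cand
                used.add(cand)
                if extend(i + 1):
                    return True
                del mapping[atom]
                used.discard(cand)
        return False

    return dict(mapping) if extend(0) else None


# Hypothetical, simplified heavy-atom graphs for illustration only.
BACKBONE = {"N": "N", "CA": "C", "C": "C", "O": "O", "CB": "C"}
BACKBONE_BONDS = [("N", "CA"), ("CA", "C"), ("C", "O"), ("CA", "CB")]
CHAIN = {"CG": "C", "CD": "C", "CE": "C"}
CHAIN_BONDS = [("CB", "CG"), ("CG", "CD"), ("CD", "CE")]

# Lysine with the sidechain amine (NZ) deleted: 8 heavy atoms.
truncated_lys: Graph = ({**BACKBONE, **CHAIN}, BACKBONE_BONDS + CHAIN_BONDS)

templates: Dict[str, Graph] = {
    "ALA": (BACKBONE, BACKBONE_BONDS),
    "LYS": ({**BACKBONE, **CHAIN, "NZ": "N"},
            BACKBONE_BONDS + CHAIN_BONDS + [("CE", "NZ")]),
    "MLY": ({**BACKBONE, **CHAIN, "NZ": "N", "CH1": "C", "CH2": "C"},
            BACKBONE_BONDS + CHAIN_BONDS
            + [("CE", "NZ"), ("NZ", "CH1"), ("NZ", "CH2")]),
}

report: Dict[str, List[str]] = {}
for name, template in templates.items():
    mapping = find_embedding(truncated_lys, template)
    if mapping is not None:
        # Template atoms with no counterpart in the residue are the mismatches.
        report[name] = sorted(set(template[0]) - set(mapping.values()))

for name, missing in report.items():
    print(f"{name}: residue is missing {', '.join(missing)}")
```

A list of "atoms this template would add" per candidate is the raw material any of the 2D or 3D highlighting options would need, whichever display route is chosen.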

Log:
UCSF ChimeraX version: 1.0 (2020-06-04)  
© 2016-2020 Regents of the University of California. All rights reserved.  
How to cite UCSF ChimeraX  
Successfully installed 'ChimeraX_ISOLDE-1.0rc1-cp37-cp37m-linux_x86_64.whl'  
Looking in indexes: https://pypi.org/simple,
https://cxtoolshed.rbvi.ucsf.edu/pypi/  
Processing
./.cache/ChimeraX/1.0/installers/ChimeraX_ISOLDE-1.0rc1-cp37-cp37m-linux_x86_64.whl  
Requirement already satisfied, skipping upgrade: ChimeraX-Atomic>=1.0 in
/opt/UCSF/ChimeraX/lib/python3.7/site-packages (from ChimeraX-ISOLDE==1.0rc1)
(1.0)  
Requirement already satisfied, skipping upgrade: ChimeraX-Arrays~=1.0 in
/opt/UCSF/ChimeraX/lib/python3.7/site-packages (from ChimeraX-ISOLDE==1.0rc1)
(1.0)  
Requirement already satisfied, skipping upgrade: ChimeraX-
Core~=1.0rc202005052344 in /opt/UCSF/ChimeraX/lib/python3.7/site-packages
(from ChimeraX-ISOLDE==1.0rc1) (1.0)  
Requirement already satisfied, skipping upgrade: ChimeraX-Clipper~=0.13.0 in
./.local/share/ChimeraX/1.0/site-packages (from ChimeraX-ISOLDE==1.0rc1)
(0.13.0)  
Requirement already satisfied, skipping upgrade: ChimeraX-Graphics~=1.0 in
/opt/UCSF/ChimeraX/lib/python3.7/site-packages (from ChimeraX-
Atomic>=1.0->ChimeraX-ISOLDE==1.0rc1) (1.0)  
Requirement already satisfied, skipping upgrade: ChimeraX-Geometry~=1.0 in
/opt/UCSF/ChimeraX/lib/python3.7/site-packages (from ChimeraX-
Atomic>=1.0->ChimeraX-ISOLDE==1.0rc1) (1.0)  
Installing collected packages: ChimeraX-ISOLDE  
Attempting uninstall: ChimeraX-ISOLDE  
Found existing installation: ChimeraX-ISOLDE 1.0rc1  
Uninstalling ChimeraX-ISOLDE-1.0rc1:  
Successfully uninstalled ChimeraX-ISOLDE-1.0rc1  
Successfully installed ChimeraX-ISOLDE-1.0rc1  
Lock 140545846217744 acquired on
/home/tic20/.cache/ChimeraX/1.0/toolshed/bundle_info.cache.lock  
Lock 140545846217744 released on
/home/tic20/.cache/ChimeraX/1.0/toolshed/bundle_info.cache.lock  
  

WARNING: You are using pip version 20.1; however, version 20.1.1 is available.  
You should consider upgrading via the '/usr/bin/chimerax -m pip install
--upgrade pip' command.  
  

> open /home/tic20/chimerax_presets/test_find_templates.py format python

3io0 title:  
Crystal structure of EtuB from Clostridium kluyveri [more info...]  
  
Chain information for 3io0 #1  
---  
Chain | Description  
A | EtuB protein  
  
3io0 mmCIF Assemblies  
---  
1| author_and_software_defined_assembly  
  

> addh

Summary of feedback from adding hydrogens to 3io0 #1  
---  
notes | Termini for 3io0 (#1) chain A determined from SEQRES records  
Chain-initial residues that are actual N termini:  
Chain-initial residues that are not actual N termini: /A PRO 76  
Chain-final residues that are actual C termini: /A PHE 304  
Chain-final residues that are not actual C termini:  
Missing OXT added to C-terminal residue /A PHE 304  
198 hydrogen bonds  
1692 hydrogens added  
  

> isolde start

> set selectionWidth 4

3io0 title:  
Crystal structure of EtuB from Clostridium kluyveri [more info...]  
  
Chain information for 3io0  
---  
Chain | Description  
1.2/A | EtuB protein  
  
3io0 mmCIF Assemblies  
---  
1| author_and_software_defined_assembly  
  
Num heavy atoms: 8  
Name matches:  
Topology matches: LYN, LYS, CLYS, NLYS, PTM_LYZ, PTM_MLY, PTM_MLZ, ZK  
executed test_find_templates.py  
Done loading forcefield  

> ui tool show Shell

/opt/UCSF/ChimeraX/lib/python3.7/site-packages/IPython/core/history.py:226:
UserWarning: IPython History requires SQLite, your history will not be saved  
warn("IPython History requires SQLite, your history will not be saved")  
Fetching CCD LYZ from http://ligand-expo.rcsb.org/reports/L/LYZ/LYZ.cif  

> help help:user




OpenGL version: 3.3.0 NVIDIA 450.36.06
OpenGL renderer: TITAN Xp/PCIe/SSE2
OpenGL vendor: NVIDIA Corporation
Manufacturer: Dell Inc.
Model: Precision T5600
OS: CentOS Linux 7 Core
Architecture: 64bit ELF
CPU: 32 Intel(R) Xeon(R) CPU E5-2687W 0 @ 3.10GHz
Cache Size: 20480 KB
Memory:
	              total        used        free      shared  buff/cache   available
	Mem:            62G        5.1G         48G        157M        9.2G         56G
	Swap:          4.9G          0B        4.9G

Graphics:
	03:00.0 VGA compatible controller [0300]: NVIDIA Corporation GP102 [TITAN Xp] [10de:1b02] (rev a1)	
	Subsystem: NVIDIA Corporation Device [10de:11df]	
	Kernel driver in use: nvidia
PyQt version: 5.12.3
Compiled Qt version: 5.12.4
Runtime Qt version: 5.12.8

Change History (5)

comment:1 by pett, 5 years ago

Cc: pett, Tom Goddard added
Component: Unassigned → Depiction
Owner: set to Tristan Croll
Platform: → all
Project: → ChimeraX
Status: new → assigned
Summary: ChimeraX bug report submission → 2D molecule drawing

comment:2 by pett, 5 years ago

Perhaps you could put Mogli (https://pypi.org/project/mogli/) into your dependencies and include PNGs alongside your various template choices.

in reply to:  3 ; comment:3 by Tristan Croll, 5 years ago

I don't think that will do it. The problem is that MD templates only 
provide information about which atoms are bonded to which - not 
coordinates. So it needs a library smart enough to infer chemistry and 
generate the drawing accordingly. OpenBabel might do it, but the PyPI 
package only includes the Python wrapper - the core library has to be 
installed separately.
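
As a rough illustration of what "inferring chemistry" from connectivity alone involves, the sketch below compares each atom's bonded-neighbour count with a typical valence. It assumes the template lists every hydrogen explicitly, and the `TYPICAL_VALENCE` table and `valence_hints` helper are hypothetical names for this example, not part of OpenMM, OpenBabel, or ISOLDE. A positive result suggests a cationic centre (for example a protonated amine), a negative one suggests a multiple bond or an anionic site. A real depiction library does considerably more than this.

```python
from typing import Dict, List, Tuple

# Assumed "typical" neutral valences for common biomolecular elements.
TYPICAL_VALENCE = {"H": 1, "C": 4, "N": 3, "O": 2, "S": 2}


def valence_hints(elements: Dict[str, str],
                  bonds: List[Tuple[str, str]]) -> Dict[str, int]:
    """Return {atom name: neighbour count - typical valence} for every
    atom where the two differ. Requires explicit hydrogens."""
    degree = {name: 0 for name in elements}
    for a, b in bonds:
        degree[a] += 1
        degree[b] += 1
    hints = {}
    for name, element in elements.items():
        diff = degree[name] - TYPICAL_VALENCE[element]
        if diff != 0:
            hints[name] = diff
    return hints


# Small stand-ins for a protonated vs. neutral amine end group.
methylammonium = (
    {"C": "C", "N": "N", "H1": "H", "H2": "H", "H3": "H",
     "HN1": "H", "HN2": "H", "HN3": "H"},
    [("C", "N"), ("C", "H1"), ("C", "H2"), ("C", "H3"),
     ("N", "HN1"), ("N", "HN2"), ("N", "HN3")],
)
methylamine = (
    {"C": "C", "N": "N", "H1": "H", "H2": "H", "H3": "H",
     "HN1": "H", "HN2": "H"},
    [("C", "N"), ("C", "H1"), ("C", "H2"), ("C", "H3"),
     ("N", "HN1"), ("N", "HN2")],
)
# Formaldehyde: both C and O come up one bond short, implying C=O.
formaldehyde = (
    {"C": "C", "O": "O", "H1": "H", "H2": "H"},
    [("C", "O"), ("C", "H1"), ("C", "H2")],
)

for label, (elements, bonds) in [("methylammonium", methylammonium),
                                 ("methylamine", methylamine),
                                 ("formaldehyde", formaldehyde)]:
    print(label, valence_hints(elements, bonds))
```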

On 2020-06-24 15:53, ChimeraX wrote:

comment:4 by Tom Goddard, 5 years ago

I am not sure I understand your use case. The user tries to run MD, but some residues either do not have OpenMM templates or are missing some atoms. If we just don't have an OpenMM template, we are stuck. If the residue is just missing some atoms, then we presume the user knows what the residue is supposed to be; otherwise they aren't going to be able to make any choice. If they know what the residue is supposed to be, it seems they might recognize the abbreviated code for it. That last point is the key. You seem to think they won't know which abbreviation is the right template.

To handle these residues, is your plan to add all the missing atoms? If so, I think your solution of letting the user simply flip through them quickly seems reasonable and adds little additional complexity.

in reply to:  5 ; comment:5 by Tristan Croll, 5 years ago

Maybe that is the way to go - and perhaps for the more obscure cases I 
should consider renaming the templates themselves. It all gets 
particularly nasty for things like glycans, where the GLYCAM 
nomenclature is just awful - in order to fit within the 3-character 
limit of AMBER and handle all the different permutations of bonding 
arrangements, they came up with a scheme that's easy enough to address 
computationally, but is utterly unintelligible to the reader (e.g. BMA 
can become any of 2MB, 3MB, 4MB, 6MB, ZMB, YMB, XMB, ... depending on 
exactly which oxygens are linked to other sugars). But I suppose I could 
consider taking advantage of the fact that OpenMM puts no limit on 
template name lengths to do away with that...
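
A minimal sketch of what such a renaming could look like, covering only the BMA variants listed above. The linkage table reflects my understanding of the GLYCAM convention (first character encodes which ring oxygens are linked) and should be checked against the GLYCAM documentation before use; the output naming scheme and the `descriptive_name` helper are purely hypothetical.

```python
from typing import Dict, Tuple

# Assumed GLYCAM first-character codes -> linked oxygen positions.
# Only the codes mentioned in this ticket are included.
LINKAGE_CODES: Dict[str, Tuple[int, ...]] = {
    "2": (2,), "3": (3,), "4": (4,), "6": (6,),
    "Z": (2, 3), "Y": (2, 4), "X": (2, 6),
}

# Last two GLYCAM characters -> CCD residue name (only BMA is covered here).
SUGAR_CODES: Dict[str, str] = {"MB": "BMA"}


def descriptive_name(glycam_code: str) -> str:
    """Expand a 3-character GLYCAM name into a longer, readable template
    name. The format is an invented example, not an existing convention."""
    positions = LINKAGE_CODES[glycam_code[0]]
    ccd_name = SUGAR_CODES[glycam_code[1:]]
    linked = "_".join(f"O{p}" for p in positions)
    return f"{ccd_name}_linked_at_{linked}"


for code in ("2MB", "3MB", "4MB", "6MB", "ZMB", "YMB", "XMB"):
    print(code, "->", descriptive_name(code))
```

Keeping the CCD name as the prefix would also make the "get the description from the CCD template" route in the ticket description straightforward for these residues.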

On 2020-06-24 18:44, ChimeraX wrote: