Opened 2 years ago

Last modified 7 months ago

#9174 accepted defect

Charge estimation test set

Reported by: Tristan Croll Owned by: Eric Pettersen
Priority: normal Milestone:
Component: Structure Analysis Version:
Keywords: Cc:
Blocked By: Blocking:
Notify when closed: Platform: all
Project: ChimeraX

Description

The following bug report has been submitted:
Platform:        Linux-5.19.0-43-generic-x86_64-with-glibc2.35
ChimeraX Version: 1.6.1 (2023-05-09 17:57:07 UTC)
Description
I've been messing around a bit with the SPICE dataset of model compounds (https://zenodo.org/record/7338495) over the last couple of days. The main point is to start exploring with John Chodera whether it would be feasible to train a variant of ESPALOMA to predict MD parameters from ChimeraX `idatm_type` nodes connected by generic bonds, as opposed to their current approach of elements connected by bonds with specified bond order. As a side benefit of that I thought it might be sensible to run `add_charge.estimate_net_charge()` on each molecule and compare the result to the sum of charges stored in the dataset entry, to get a set of examples where the current algorithm fails. There's disagreement for about 1 in 12 entries. I looked at the first few, and for those it does indeed seem that the ChimeraX estimate is incorrect. Admittedly some of these are fairly rare edge cases (including at least a few that would never be stable under real-world biological conditions), but I thought it might be a useful reference if you're interested in data-mining for commonalities. The attached file is comma-delimited text containing 1054 examples, providing the SPICE identifier, SMILES string, official charge and ChimeraX charge estimate. 

OpenGL version: 3.3.0 NVIDIA 515.105.01
OpenGL renderer: NVIDIA GeForce RTX 3070/PCIe/SSE2
OpenGL vendor: NVIDIA Corporation

Python: 3.9.11
Locale: en_GB.UTF-8
Qt version: PyQt6 6.4.2, Qt 6.4.2
Qt runtime version: 6.4.3
Qt platform: xcb

XDG_SESSION_TYPE=x11
DESKTOP_SESSION=ubuntu
XDG_SESSION_DESKTOP=ubuntu
XDG_CURRENT_DESKTOP=ubuntu:GNOME
DISPLAY=:1
Manufacturer: Dell Inc.
Model: XPS 8950
OS: Ubuntu 22.04 Jammy Jellyfish
Architecture: 64bit ELF
Virtual Machine: none
CPU: 20 12th Gen Intel(R) Core(TM) i7-12700
Cache Size: 25600 KB
Memory:
	               total        used        free      shared  buff/cache   available
	Mem:            31Gi       8.8Gi       2.4Gi       419Mi        19Gi        21Gi
	Swap:          2.0Gi          0B       2.0Gi

Graphics:
	0000:01:00.0 VGA compatible controller [0300]: NVIDIA Corporation GA104 [GeForce RTX 3070 Lite Hash Rate] [10de:2488] (rev a1)	
	Subsystem: Dell GA104 [GeForce RTX 3070 Lite Hash Rate] [1028:c903]	
	Kernel driver in use: nvidia

Installed Packages:
    alabaster: 0.7.13
    appdirs: 1.4.4
    asttokens: 2.2.1
    Babel: 2.12.1
    backcall: 0.2.0
    beautifulsoup4: 4.11.2
    blockdiag: 3.0.0
    build: 0.10.0
    certifi: 2023.5.7
    cftime: 1.6.2
    charset-normalizer: 3.1.0
    ChimeraX-AddCharge: 1.5.9.1
    ChimeraX-AddH: 2.2.5
    ChimeraX-AlignmentAlgorithms: 2.0.1
    ChimeraX-AlignmentHdrs: 3.3.1
    ChimeraX-AlignmentMatrices: 2.1
    ChimeraX-Alignments: 2.9.3
    ChimeraX-AlphaFold: 1.0
    ChimeraX-AltlocExplorer: 1.0.3
    ChimeraX-AmberInfo: 1.0
    ChimeraX-Arrays: 1.1
    ChimeraX-Atomic: 1.43.10
    ChimeraX-AtomicLibrary: 10.0.6
    ChimeraX-AtomSearch: 2.0.1
    ChimeraX-AxesPlanes: 2.3.2
    ChimeraX-BasicActions: 1.1.2
    ChimeraX-BILD: 1.0
    ChimeraX-BlastProtein: 2.1.2
    ChimeraX-BondRot: 2.0.1
    ChimeraX-BugReporter: 1.0.1
    ChimeraX-BuildStructure: 2.8
    ChimeraX-Bumps: 1.0
    ChimeraX-BundleBuilder: 1.2.2
    ChimeraX-ButtonPanel: 1.0.1
    ChimeraX-CageBuilder: 1.0.1
    ChimeraX-CellPack: 1.0
    ChimeraX-Centroids: 1.3.2
    ChimeraX-ChangeChains: 1.0.2
    ChimeraX-CheckWaters: 1.3.1
    ChimeraX-ChemGroup: 2.0.1
    ChimeraX-Clashes: 2.2.4
    ChimeraX-Clipper: 0.21.0
    ChimeraX-ColorActions: 1.0.3
    ChimeraX-ColorGlobe: 1.0
    ChimeraX-ColorKey: 1.5.3
    ChimeraX-CommandLine: 1.2.5
    ChimeraX-ConnectStructure: 2.0.1
    ChimeraX-Contacts: 1.0.1
    ChimeraX-Core: 1.6.1
    ChimeraX-CoreFormats: 1.1
    ChimeraX-coulombic: 1.4.2
    ChimeraX-Crosslinks: 1.0
    ChimeraX-Crystal: 1.0
    ChimeraX-CrystalContacts: 1.0.1
    ChimeraX-DataFormats: 1.2.3
    ChimeraX-Dicom: 1.2
    ChimeraX-DistMonitor: 1.4
    ChimeraX-DockPrep: 1.1.1
    ChimeraX-Dssp: 2.0
    ChimeraX-EMDB-SFF: 1.0
    ChimeraX-ESMFold: 1.0
    ChimeraX-FileHistory: 1.0.1
    ChimeraX-FunctionKey: 1.0.1
    ChimeraX-Geometry: 1.3
    ChimeraX-gltf: 1.0
    ChimeraX-Graphics: 1.1.1
    ChimeraX-Hbonds: 2.4
    ChimeraX-Help: 1.2.1
    ChimeraX-HKCage: 1.3
    ChimeraX-IHM: 1.1
    ChimeraX-ImageFormats: 1.2
    ChimeraX-IMOD: 1.0
    ChimeraX-IO: 1.0.1
    ChimeraX-ISOLDE: 1.6.0
    ChimeraX-ItemsInspection: 1.0.1
    ChimeraX-Label: 1.1.7
    ChimeraX-LinuxSupport: 1.0.1
    ChimeraX-ListInfo: 1.1.1
    ChimeraX-Log: 1.1.5
    ChimeraX-LookingGlass: 1.1
    ChimeraX-Maestro: 1.8.2
    ChimeraX-Map: 1.1.4
    ChimeraX-MapData: 2.0
    ChimeraX-MapEraser: 1.0.1
    ChimeraX-MapFilter: 2.0.1
    ChimeraX-MapFit: 2.0
    ChimeraX-MapSeries: 2.1.1
    ChimeraX-Markers: 1.0.1
    ChimeraX-Mask: 1.0.2
    ChimeraX-MatchMaker: 2.0.12
    ChimeraX-MDcrds: 2.6
    ChimeraX-MedicalToolbar: 1.0.2
    ChimeraX-Meeting: 1.0.1
    ChimeraX-MLP: 1.1.1
    ChimeraX-mmCIF: 2.12
    ChimeraX-MMTF: 2.2
    ChimeraX-Modeller: 1.5.9
    ChimeraX-ModelPanel: 1.3.7
    ChimeraX-ModelSeries: 1.0.1
    ChimeraX-Mol2: 2.0
    ChimeraX-Mole: 1.0
    ChimeraX-Morph: 1.0.2
    ChimeraX-MouseModes: 1.2
    ChimeraX-Movie: 1.0
    ChimeraX-Neuron: 1.0
    ChimeraX-Nifti: 1.0
    ChimeraX-NRRD: 1.0
    ChimeraX-Nucleotides: 2.0.3
    ChimeraX-OpenCommand: 1.10.1
    ChimeraX-PDB: 2.7.2
    ChimeraX-PDBBio: 1.0
    ChimeraX-PDBLibrary: 1.0.2
    ChimeraX-PDBMatrices: 1.0
    ChimeraX-PickBlobs: 1.0.1
    ChimeraX-Positions: 1.0
    ChimeraX-PresetMgr: 1.1
    ChimeraX-PubChem: 2.1
    ChimeraX-QScore: 1.0
    ChimeraX-ReadPbonds: 1.0.1
    ChimeraX-Registration: 1.1.1
    ChimeraX-RemoteControl: 1.0
    ChimeraX-RenderByAttr: 1.1
    ChimeraX-RenumberResidues: 1.1
    ChimeraX-ResidueFit: 1.0.1
    ChimeraX-RestServer: 1.1
    ChimeraX-RNALayout: 1.0
    ChimeraX-RotamerLibMgr: 3.0
    ChimeraX-RotamerLibsDunbrack: 2.0
    ChimeraX-RotamerLibsDynameomics: 2.0
    ChimeraX-RotamerLibsRichardson: 2.0
    ChimeraX-SaveCommand: 1.5.1
    ChimeraX-SchemeMgr: 1.0
    ChimeraX-SDF: 2.0.1
    ChimeraX-Segger: 1.0
    ChimeraX-Segment: 1.0.1
    ChimeraX-SelInspector: 1.0
    ChimeraX-SeqView: 2.8.3
    ChimeraX-Shape: 1.0.1
    ChimeraX-Shell: 1.0.1
    ChimeraX-Shortcuts: 1.1.1
    ChimeraX-ShowSequences: 1.0.1
    ChimeraX-SideView: 1.0.1
    ChimeraX-Smiles: 2.1
    ChimeraX-SmoothLines: 1.0
    ChimeraX-SpaceNavigator: 1.0
    ChimeraX-StdCommands: 1.10.3
    ChimeraX-STL: 1.0.1
    ChimeraX-Storm: 1.0
    ChimeraX-StructMeasure: 1.1.2
    ChimeraX-Struts: 1.0.1
    ChimeraX-Surface: 1.0.1
    ChimeraX-SwapAA: 2.0.1
    ChimeraX-SwapRes: 2.2.1
    ChimeraX-TapeMeasure: 1.0
    ChimeraX-Test: 1.0
    ChimeraX-Toolbar: 1.1.2
    ChimeraX-ToolshedUtils: 1.2.1
    ChimeraX-Topography: 1.0
    ChimeraX-Tug: 1.0.1
    ChimeraX-UI: 1.28.4
    ChimeraX-uniprot: 2.2.2
    ChimeraX-UnitCell: 1.0.1
    ChimeraX-ViewDockX: 1.2
    ChimeraX-VIPERdb: 1.0
    ChimeraX-Vive: 1.1
    ChimeraX-VolumeMenu: 1.0.1
    ChimeraX-VTK: 1.0
    ChimeraX-WavefrontOBJ: 1.0
    ChimeraX-WebCam: 1.0.2
    ChimeraX-WebServices: 1.1.1
    ChimeraX-Zone: 1.0.1
    colorama: 0.4.6
    comm: 0.1.3
    contourpy: 1.0.7
    cxservices: 1.2.2
    cycler: 0.11.0
    Cython: 0.29.33
    debugpy: 1.6.7
    decorator: 5.1.1
    distro: 1.7.0
    docutils: 0.19
    executing: 1.2.0
    filelock: 3.9.0
    fonttools: 4.39.3
    funcparserlib: 1.0.1
    grako: 3.16.5
    h5py: 3.8.0
    html2text: 2020.1.16
    idna: 3.4
    ihm: 0.35
    imagecodecs: 2022.9.26
    imagesize: 1.4.1
    importlib-metadata: 6.6.0
    ipykernel: 6.21.1
    ipython: 8.10.0
    ipython-genutils: 0.2.0
    ipywidgets: 8.0.6
    jedi: 0.18.2
    Jinja2: 3.1.2
    jupyter-client: 8.0.2
    jupyter-core: 5.3.0
    jupyterlab-widgets: 3.0.7
    kiwisolver: 1.4.4
    line-profiler: 4.0.2
    lxml: 4.9.2
    lz4: 4.3.2
    MarkupSafe: 2.1.2
    matplotlib: 3.6.3
    matplotlib-inline: 0.1.6
    msgpack: 1.0.4
    nest-asyncio: 1.5.6
    netCDF4: 1.6.2
    networkx: 2.8.8
    nibabel: 5.0.1
    nptyping: 2.5.0
    numexpr: 2.8.4
    numpy: 1.23.5
    openvr: 1.23.701
    packaging: 23.1
    ParmEd: 3.4.3
    parso: 0.8.3
    pep517: 0.13.0
    pexpect: 4.8.0
    pickleshare: 0.7.5
    Pillow: 9.3.0
    pip: 23.0
    pkginfo: 1.9.6
    platformdirs: 3.5.0
    prompt-toolkit: 3.0.38
    psutil: 5.9.4
    ptyprocess: 0.7.0
    pure-eval: 0.2.2
    pycollada: 0.7.2
    pydicom: 2.3.0
    Pygments: 2.14.0
    pynrrd: 1.0.0
    PyOpenGL: 3.1.5
    PyOpenGL-accelerate: 3.1.5
    pyparsing: 3.0.9
    pyproject-hooks: 1.0.0
    PyQt6-commercial: 6.4.2
    PyQt6-Qt6: 6.4.3
    PyQt6-sip: 13.4.1
    PyQt6-WebEngine-commercial: 6.4.0
    PyQt6-WebEngine-Qt6: 6.4.3
    python-dateutil: 2.8.2
    pytz: 2023.3
    pyzmq: 25.0.2
    qtconsole: 5.4.0
    QtPy: 2.3.1
    RandomWords: 0.4.0
    rdkit-pypi: 2022.9.2
    requests: 2.28.2
    scipy: 1.9.3
    setuptools: 67.4.0
    sfftk-rw: 0.7.3
    six: 1.16.0
    snowballstemmer: 2.2.0
    sortedcontainers: 2.4.0
    soupsieve: 2.4.1
    sphinx: 6.1.3
    sphinx-autodoc-typehints: 1.22
    sphinxcontrib-applehelp: 1.0.4
    sphinxcontrib-blockdiag: 3.0.0
    sphinxcontrib-devhelp: 1.0.2
    sphinxcontrib-htmlhelp: 2.0.1
    sphinxcontrib-jsmath: 1.0.1
    sphinxcontrib-qthelp: 1.0.3
    sphinxcontrib-serializinghtml: 1.1.5
    stack-data: 0.6.2
    tables: 3.7.0
    tcia-utils: 1.2.0
    tifffile: 2022.10.10
    tinyarray: 1.2.4
    tomli: 2.0.1
    tornado: 6.3.1
    traitlets: 5.9.0
    typing-extensions: 4.5.0
    tzdata: 2023.3
    urllib3: 1.26.15
    wcwidth: 0.2.6
    webcolors: 1.12
    wheel: 0.38.4
    wheel-filename: 1.4.1
    widgetsnbextension: 4.0.7
    zipp: 3.15.0
File attachment: charge_failures.txt

charge_failures.txt

Attachments (1)

charge_failures.txt (255.0 KB ) - added by Tristan Croll 2 years ago.
Added by email2trac

Download all attachments as: .zip

Change History (20)

by Tristan Croll, 2 years ago

Attachment: charge_failures.txt added

Added by email2trac

comment:1 by Eric Pettersen, 2 years ago

Component: UnassignedStructure Analysis
Owner: set to Eric Pettersen
Platform: all
Project: ChimeraX
Status: newaccepted
Summary: ChimeraX bug report submissionCharge estimation test set

Though I don't necessarily expect ChimeraX to correctly analyze non-organic moieties, the couple of examples in your set that I looked at (and that happened to also be organic) seemed to be things that could be improved.

Will see what I can do when I have some time.

comment:3 by Eric Pettersen, 11 months ago

Second one (SPICE 103918110) also fixed by same change as the previous one.

comment:4 by Eric Pettersen, 11 months ago

Third (SPICE 103918296) also fixed by same change.

comment:5 by Eric Pettersen, 11 months ago

Same for the fourth (SPICE 103919971)

comment:6 by Eric Pettersen, 11 months ago

Not fixing the fifth one (SPICE 103921318). If protonated as per biological pH (i.e. no protons on the sulfur oxygens) then the charge is correctly estimated as zero.

comment:7 by Eric Pettersen, 11 months ago

Sixth one (SPICE 103922579) already fixed.

comment:8 by Eric Pettersen, 11 months ago

Fixed the seventh one (SPICE 103923499) by expanding the previous fix to include double bonds to nitrogen

Fix: https://github.com/RBVI/ChimeraX/commit/7c78afad4b0c344939e4995b187c210da51dda93

comment:9 by Eric Pettersen, 11 months ago

Not fixing the eighth one (SPICE 103923944). I have zero idea how the carboxylate carbons can _both_ be protonated. ChimeraX doesn't operate at whatever pH that would happen at.

comment:10 by Eric Pettersen, 11 months ago

The ninth one (SPICE 103929961) is already fixed

comment:11 by Eric Pettersen, 7 months ago

The 10th one (SPICE 103930835) is already fixed

comment:12 by Eric Pettersen, 7 months ago

The 11th one (SPICE 103935579) also already fixed

comment:13 by Eric Pettersen, 7 months ago

Won't be fixing the 12th one (SPICE 103940406) -- there is no IDATM type for a planar phosphorus.

comment:14 by Eric Pettersen, 7 months ago

Last edited 7 months ago by Eric Pettersen (previous) (diff)

comment:15 by Eric Pettersen, 7 months ago

The 14th case (SPICE 103943735) is already fixed.

Last edited 7 months ago by Eric Pettersen (previous) (diff)

comment:16 by Eric Pettersen, 7 months ago

The 15th case (SPICE 103943807) is already fixed.

comment:17 by Eric Pettersen, 7 months ago

Case 16 (SPICE 103946180) also already fixed.

comment:18 by Eric Pettersen, 7 months ago

Case 17 (SPICE 103947422) and 18 (SPICE 103948733) already fixed.

Note: See TracTickets for help on using tickets.