Opened 2 years ago

Closed 2 years ago

#9193 closed enhancement (fixed)

Computing molecular solvent excluded surface is slow for large numbers of atoms

Reported by: Eric Pettersen
Owned by: Tom Goddard
Priority: normal
Milestone:
Component: Surface
Version:
Keywords:
Cc: phil.cruz@…
Blocked By:
Blocking:
Notify when closed:
Platform: all
Project: ChimeraX

Description

The following bug report has been submitted:
Platform:        macOS-10.16-x86_64-i386-64bit
ChimeraX Version: 1.6.1 (2023-05-09 17:57:07 UTC)
Description
Computing molecule SES surfaces with large numbers of atoms (>500,000) is much slower than expected.  For a 6 million atom virus capsid (6b1t) it took an hour using large grid spacing 2.5.  These surfaces are computed by the NIAID NIH 3D pipeline.

Log:
Could not find tool "Tabbed Toolbar"  
UCSF ChimeraX version: 1.6.1 (2023-05-09)  
© 2016-2023 Regents of the University of California. All rights reserved.  
How to cite UCSF ChimeraX  

> open 2bbv format mmcif fromDatabase pdbe_bio maxAssemblies 1

Summary of feedback from opening 2bbv fetched from pdbe_bio  
---  
warning | Missing or incomplete entity_poly_seq table. Inferred polymer
connectivity.  
  
2bbv bioassembly 1 title:  
The refined three-dimensional structure of an insect virus At 2.8 angstroms
resolution [more info...]  
  
Chain information for 2bbv bioassembly 1 #1  
---  
Chain | Description  
A AA AAA AAB AAC AAD AAE AAF AAG AAH AAI AAJ AAK AAL AAM AAN AAO AAP AAQ AAR
AAS AAT AAU AAV AAW AAX AAY AAZ AB ABA ABB ABC ABD ABE ABF ABG AC AD AE AF AG
AH AI AJ AK AL AM AN AO AP AQ AR AS AT AU AV AW AX AY AZ | RNA
(5'-R(*UP*CP*UP*UP*AP*UP*AP*UP*CP*U)-3')  
B BA BAA BAB BAC BAD BAE BAF BAG BAH BAI BAJ BAK BAL BAM BAN BAO BAP BAQ BAR
BAS BAT BAU BAV BAW BAX BAY BAZ BB BBA BBB BBC BBD BBE BBF BBG BC BD BE BF BG
BH BI BJ BK BL BM BN BO BP BQ BR BS BT BU BV BW BX BY BZ D DA DAA DAB DAC DAD
DAE DAF DAG DAH DAI DAJ DAK DAL DAM DAN DAO DAP DAQ DAR DAS DAT DAU DAV DAW
DAX DAY DAZ DB DBA DBB DBC DBD DBE DBF DBG DC DD DE DF DG DH DI DJ DK DL DM DN
DO DP DQ DR DS DT DU DV DW DX DY DZ | PROTEIN (BLACK BEETLE VIRUS CAPSID
PROTEIN)  
C CA CAA CAB CAC CAD CAE CAF CAG CAH CAI CAJ CAK CAL CAM CAN CAO CAP CAQ CAR
CAS CAT CAU CAV CAW CAX CAY CAZ CB CBA CBB CBC CBD CBE CBF CBG CC CD CE CF CG
CH CI CJ CK CL CM CN CO CP CQ CR CS CT CU CV CW CX CY CZ E EA EAA EAB EAC EAD
EAE EAF EAG EAH EAI EAJ EAK EAL EAM EAN EAO EAP EAQ EAR EAS EAT EAU EAV EAW
EAX EAY EAZ EB EBA EBB EBC EBD EBE EBF EBG EC ED EE EF EG EH EI EJ EK EL EM EN
EO EP EQ ER ES ET EU EV EW EX EY EZ G GA GAA GAB GAC GAD GAE GAF GAG GAH GAI
GAJ GAK GAL GAM GAN GAO GAP GAQ GAR GAS GAT GAU GAV GAW GAX GAY GAZ GB GBA GBB
GBC GBD GBE GBF GBG GC GD GE GF GG GH GI GJ GK GL GM GN GO GP GQ GR GS GT GU
GV GW GX GY GZ | PROTEIN (BLACK BEETLE VIRUS CAPSID PROTEIN)  
F FA FAA FAB FAC FAD FAE FAF FAG FAH FAI FAJ FAK FAL FAM FAN FAO FAP FAQ FAR
FAS FAT FAU FAV FAW FAX FAY FAZ FB FBA FBB FBC FBD FBE FBF FBG FC FD FE FF FG
FH FI FJ FK FL FM FN FO FP FQ FR FS FT FU FV FW FX FY FZ | PROTEIN (BLACK
BEETLE VIRUS CAPSID PROTEIN)  
  
Non-standard residues in 2bbv bioassembly 1 #1  
---  
CA — (CA)  
  
2bbv bioassembly 1 mmCIF Assemblies  
---  
1| complete icosahedral assembly  
2| icosahedral asymmetric unit  
3| icosahedral pentamer  
4| icosahedral 23 hexamer  
5| icosahedral asymmetric unit, std point frame  
6| crystal asymmetric unit, crystal frame  
  
Opened 1 biological assemblies for 2bbv  

> time surface enclose #1 sharp false grid 2.0

> surface enclose #1 sharpBoundaries false gridSpacing 2.0

command time 26.56 seconds  
draw time 0.07606 seconds  




OpenGL version: 4.1 ATI-4.12.7
OpenGL renderer: AMD Radeon Pro 580 OpenGL Engine
OpenGL vendor: ATI Technologies Inc.

Python: 3.9.11
Locale: UTF-8
Qt version: PyQt6 6.4.2, Qt 6.4.2
Qt runtime version: 6.4.3
Qt platform: cocoa
Hardware:

    Hardware Overview:

      Model Name: iMac
      Model Identifier: iMac18,3
      Processor Name: Quad-Core Intel Core i7
      Processor Speed: 4.2 GHz
      Number of Processors: 1
      Total Number of Cores: 4
      L2 Cache (per Core): 256 KB
      L3 Cache: 8 MB
      Hyper-Threading Technology: Enabled
      Memory: 32 GB
      System Firmware Version: 512.0.0.0.0
      OS Loader Version: 577~170
      SMC Version (system): 2.41f2

Software:

    System Software Overview:

      System Version: macOS 13.4 (22F66)
      Kernel Version: Darwin 22.5.0
      Time since boot: 14 days, 23 hours, 10 minutes

Graphics/Displays:

    Radeon Pro 580:

      Chipset Model: Radeon Pro 580
      Type: GPU
      Bus: PCIe
      PCIe Lane Width: x16
      VRAM (Total): 8 GB
      Vendor: AMD (0x1002)
      Device ID: 0x67df
      Revision ID: 0x00c0
      ROM Revision: 113-D000AA-931
      VBIOS Version: 113-D0001A1X-025
      EFI Driver Version: 01.00.931
      Metal Support: Metal 2
      Displays:
        iMac:
          Display Type: Built-In Retina LCD
          Resolution: Retina 5K (5120 x 2880)
          Framebuffer Depth: 30-Bit Color (ARGB2101010)
          Main Display: Yes
          Mirror: Off
          Online: Yes
          Automatically Adjust Brightness: Yes
          Connection Type: Internal


Installed Packages:
    alabaster: 0.7.13
    appdirs: 1.4.4
    appnope: 0.1.3
    asttokens: 2.2.1
    Babel: 2.12.1
    backcall: 0.2.0
    beautifulsoup4: 4.11.2
    blockdiag: 3.0.0
    build: 0.10.0
    certifi: 2021.10.8
    cftime: 1.6.2
    charset-normalizer: 3.1.0
    ChimeraX-AddCharge: 1.5.9.1
    ChimeraX-AddH: 2.2.5
    ChimeraX-AlignmentAlgorithms: 2.0.1
    ChimeraX-AlignmentHdrs: 3.3.1
    ChimeraX-AlignmentMatrices: 2.1
    ChimeraX-Alignments: 2.9.3
    ChimeraX-AlphaFold: 1.0
    ChimeraX-AltlocExplorer: 1.0.3
    ChimeraX-AmberInfo: 1.0
    ChimeraX-Arrays: 1.1
    ChimeraX-Atomic: 1.43.10
    ChimeraX-AtomicLibrary: 10.0.6
    ChimeraX-AtomSearch: 2.0.1
    ChimeraX-AxesPlanes: 2.3.2
    ChimeraX-BasicActions: 1.1.2
    ChimeraX-BILD: 1.0
    ChimeraX-BlastProtein: 2.1.2
    ChimeraX-BondRot: 2.0.1
    ChimeraX-BugReporter: 1.0.1
    ChimeraX-BuildStructure: 2.8
    ChimeraX-Bumps: 1.0
    ChimeraX-BundleBuilder: 1.2.2
    ChimeraX-ButtonPanel: 1.0.1
    ChimeraX-CageBuilder: 1.0.1
    ChimeraX-CellPack: 1.0
    ChimeraX-Centroids: 1.3.2
    ChimeraX-ChangeChains: 1.0.2
    ChimeraX-CheckWaters: 1.3.1
    ChimeraX-ChemGroup: 2.0.1
    ChimeraX-Clashes: 2.2.4
    ChimeraX-Clipper: 0.20.0
    ChimeraX-ColorActions: 1.0.3
    ChimeraX-ColorGlobe: 1.0
    ChimeraX-ColorKey: 1.5.3
    ChimeraX-CommandLine: 1.2.5
    ChimeraX-ConnectStructure: 2.0.1
    ChimeraX-Contacts: 1.0.1
    ChimeraX-Core: 1.6.1
    ChimeraX-CoreFormats: 1.1
    ChimeraX-coulombic: 1.4.2
    ChimeraX-Crosslinks: 1.0
    ChimeraX-Crystal: 1.0
    ChimeraX-CrystalContacts: 1.0.1
    ChimeraX-DataFormats: 1.2.3
    ChimeraX-Dicom: 1.2
    ChimeraX-DistMonitor: 1.4
    ChimeraX-DockPrep: 1.1.1
    ChimeraX-Dssp: 2.0
    ChimeraX-EMDB-SFF: 1.0
    ChimeraX-ESMFold: 1.0
    ChimeraX-FileHistory: 1.0.1
    ChimeraX-FunctionKey: 1.0.1
    ChimeraX-Geometry: 1.3
    ChimeraX-gltf: 1.0
    ChimeraX-Graphics: 1.1.1
    ChimeraX-Hbonds: 2.4
    ChimeraX-Help: 1.2.1
    ChimeraX-HKCage: 1.3
    ChimeraX-IHM: 1.1
    ChimeraX-ImageFormats: 1.2
    ChimeraX-IMOD: 1.0
    ChimeraX-IO: 1.0.1
    ChimeraX-ISOLDE: 1.6.dev0
    ChimeraX-ItemsInspection: 1.0.1
    ChimeraX-Label: 1.1.7
    ChimeraX-ListInfo: 1.1.1
    ChimeraX-Log: 1.1.5
    ChimeraX-LookingGlass: 1.1
    ChimeraX-Maestro: 1.8.2
    ChimeraX-Map: 1.1.4
    ChimeraX-MapData: 2.0
    ChimeraX-MapEraser: 1.0.1
    ChimeraX-MapFilter: 2.0.1
    ChimeraX-MapFit: 2.0
    ChimeraX-MapSeries: 2.1.1
    ChimeraX-Markers: 1.0.1
    ChimeraX-Mask: 1.0.2
    ChimeraX-MatchMaker: 2.0.12
    ChimeraX-MDcrds: 2.6
    ChimeraX-MedicalToolbar: 1.0.2
    ChimeraX-Meeting: 1.0.1
    ChimeraX-MLP: 1.1.1
    ChimeraX-mmCIF: 2.12
    ChimeraX-MMTF: 2.2
    ChimeraX-Modeller: 1.5.9
    ChimeraX-ModelPanel: 1.3.7
    ChimeraX-ModelSeries: 1.0.1
    ChimeraX-Mol2: 2.0
    ChimeraX-Mole: 1.0
    ChimeraX-Morph: 1.0.2
    ChimeraX-MouseModes: 1.2
    ChimeraX-Movie: 1.0
    ChimeraX-Neuron: 1.0
    ChimeraX-Nifti: 1.0
    ChimeraX-NRRD: 1.0
    ChimeraX-Nucleotides: 2.0.3
    ChimeraX-OpenCommand: 1.10.1
    ChimeraX-PDB: 2.7.2
    ChimeraX-PDBBio: 1.0
    ChimeraX-PDBLibrary: 1.0.2
    ChimeraX-PDBMatrices: 1.0
    ChimeraX-PhenixUI: 1.1.7
    ChimeraX-PickBlobs: 1.0.1
    ChimeraX-PICKLUSTER: 0.1
    ChimeraX-Positions: 1.0
    ChimeraX-PresetMgr: 1.1
    ChimeraX-PubChem: 2.1
    ChimeraX-QScore: 1.0
    ChimeraX-ReadPbonds: 1.0.1
    ChimeraX-Registration: 1.1.1
    ChimeraX-RemoteControl: 1.0
    ChimeraX-RenderByAttr: 1.1
    ChimeraX-RenumberResidues: 1.1
    ChimeraX-ResidueFit: 1.0.1
    ChimeraX-RestServer: 1.1
    ChimeraX-RMF: 0.12
    ChimeraX-RNALayout: 1.0
    ChimeraX-RotamerLibMgr: 3.0
    ChimeraX-RotamerLibsDunbrack: 2.0
    ChimeraX-RotamerLibsDynameomics: 2.0
    ChimeraX-RotamerLibsRichardson: 2.0
    ChimeraX-SaveCommand: 1.5.1
    ChimeraX-SchemeMgr: 1.0
    ChimeraX-SDF: 2.0.1
    ChimeraX-Segger: 1.0
    ChimeraX-Segment: 1.0.1
    ChimeraX-SelInspector: 1.0
    ChimeraX-SeqView: 2.8.3
    ChimeraX-Shape: 1.0.1
    ChimeraX-Shell: 1.0.1
    ChimeraX-Shortcuts: 1.1.1
    ChimeraX-ShowSequences: 1.0.1
    ChimeraX-SideView: 1.0.1
    ChimeraX-Smiles: 2.1
    ChimeraX-SmoothLines: 1.0
    ChimeraX-SpaceNavigator: 1.0
    ChimeraX-StdCommands: 1.10.3
    ChimeraX-STL: 1.0.1
    ChimeraX-Storm: 1.0
    ChimeraX-StructMeasure: 1.1.2
    ChimeraX-Struts: 1.0.1
    ChimeraX-Surface: 1.0.1
    ChimeraX-SwapAA: 2.0.1
    ChimeraX-SwapRes: 2.2.1
    ChimeraX-TapeMeasure: 1.0
    ChimeraX-Test: 1.0
    ChimeraX-Toolbar: 1.1.2
    ChimeraX-ToolshedUtils: 1.2.1
    ChimeraX-Topography: 1.0
    ChimeraX-Tug: 1.0.1
    ChimeraX-TugLigands: 1.1
    ChimeraX-UI: 1.28.4
    ChimeraX-uniprot: 2.2.2
    ChimeraX-UnitCell: 1.0.1
    ChimeraX-ViewDockX: 1.2
    ChimeraX-VIPERdb: 1.0
    ChimeraX-Vive: 1.1
    ChimeraX-VolumeMenu: 1.0.1
    ChimeraX-VTK: 1.0
    ChimeraX-WavefrontOBJ: 1.0
    ChimeraX-WebCam: 1.0.2
    ChimeraX-WebServices: 1.1.1
    ChimeraX-Zone: 1.0.1
    colorama: 0.4.6
    comm: 0.1.3
    contourpy: 1.0.7
    cxservices: 1.2.2
    cycler: 0.11.0
    Cython: 0.29.33
    debugpy: 1.6.7
    decorator: 5.1.1
    docutils: 0.19
    executing: 1.2.0
    filelock: 3.9.0
    fonttools: 4.39.3
    funcparserlib: 1.0.1
    grako: 3.16.5
    h5py: 3.8.0
    html2text: 2020.1.16
    idna: 3.4
    ihm: 0.35
    imagecodecs: 2022.2.22
    imagesize: 1.4.1
    importlib-metadata: 6.6.0
    ipykernel: 6.21.1
    ipython: 8.10.0
    ipython-genutils: 0.2.0
    ipywidgets: 8.0.6
    jedi: 0.18.2
    Jinja2: 3.1.2
    jupyter-client: 8.0.2
    jupyter-core: 5.3.0
    jupyterlab-widgets: 3.0.7
    kiwisolver: 1.4.4
    line-profiler: 4.0.2
    lxml: 4.9.2
    lz4: 4.3.2
    MarkupSafe: 2.1.2
    matplotlib: 3.6.3
    matplotlib-inline: 0.1.6
    msgpack: 1.0.4
    nest-asyncio: 1.5.6
    netCDF4: 1.6.2
    networkx: 2.8.8
    nibabel: 5.0.1
    nptyping: 2.5.0
    numexpr: 2.8.4
    numpy: 1.23.5
    openvr: 1.23.701
    opt-einsum: 3.3.0
    packaging: 21.3
    ParmEd: 3.4.3
    parso: 0.8.3
    pep517: 0.13.0
    pexpect: 4.8.0
    pickleshare: 0.7.5
    Pillow: 9.3.0
    pip: 23.0
    pkginfo: 1.9.6
    platformdirs: 3.5.0
    prompt-toolkit: 3.0.38
    psutil: 5.9.4
    ptyprocess: 0.7.0
    pure-eval: 0.2.2
    pycollada: 0.7.2
    pydicom: 2.3.0
    Pygments: 2.14.0
    pynrrd: 1.0.0
    PyOpenGL: 3.1.5
    PyOpenGL-accelerate: 3.1.5
    pyparsing: 3.0.9
    pyproject-hooks: 1.0.0
    PyQt6-commercial: 6.4.2
    PyQt6-Qt6: 6.4.3
    PyQt6-sip: 13.4.1
    PyQt6-WebEngine-commercial: 6.4.0
    PyQt6-WebEngine-Qt6: 6.4.3
    python-dateutil: 2.8.2
    pytz: 2023.3
    pyzmq: 25.0.2
    qtconsole: 5.4.0
    QtPy: 2.3.1
    RandomWords: 0.4.0
    requests: 2.28.2
    scipy: 1.9.3
    setuptools: 67.4.0
    setuptools-scm: 7.0.5
    sfftk-rw: 0.7.3
    six: 1.16.0
    snowballstemmer: 2.2.0
    sortedcontainers: 2.4.0
    soupsieve: 2.4.1
    sphinx: 6.1.3
    sphinx-autodoc-typehints: 1.22
    sphinxcontrib-applehelp: 1.0.4
    sphinxcontrib-blockdiag: 3.0.0
    sphinxcontrib-devhelp: 1.0.2
    sphinxcontrib-htmlhelp: 2.0.1
    sphinxcontrib-jsmath: 1.0.1
    sphinxcontrib-qthelp: 1.0.3
    sphinxcontrib-serializinghtml: 1.1.5
    stack-data: 0.6.2
    tables: 3.7.0
    tcia-utils: 1.2.0
    tifffile: 2022.10.10
    tinyarray: 1.2.4
    tomli: 2.0.1
    tornado: 6.3.1
    traitlets: 5.9.0
    typing-extensions: 4.5.0
    tzdata: 2023.3
    urllib3: 1.26.15
    wcwidth: 0.2.6
    webcolors: 1.12
    wheel: 0.38.4
    wheel-filename: 1.4.1
    widgetsnbextension: 4.0.7
    zipp: 3.15.0

Change History (2)

comment:1 by Tom Goddard, 2 years ago

Cc: phil.cruz@… added
Component: Unassigned → Surface
Owner: set to Tom Goddard
Platform: all
Project: ChimeraX
Reporter: changed from goddard@… to Eric Pettersen
Status: new → assigned
Summary: ChimeraX bug report submission → Computing molecular solvent excluded surface is slow for large numbers of atoms
Type: defect → enhancement

Eric identified this problem in this email.

From: Eric Pettersen
Subject: Re: Mesh file differences between Chimera/3DPX and ChimeraX/NIH 3D
Date: June 14, 2023 at 12:45:55 AM PDT
To: "Cruz, Phil (NIH/NIAID) [C]"
Cc: Chimera Staff, "McCarthy, Meghan (NIH/NIAID) [C]", "Hurt, Darrell (NIH/NIAID) [E]", "Browne, Kristen (NIH/NIAID) [C]", "Piya, Bhinnata (NIH/NIAID) [C]", Scooter Morris <scooter@cgl.ucsf.edu>, Greg Couch, "Stolarczyk, Michal (NIH/NIAID) [C]"

A Thursday meeting is fine by me.

The reason that the grid size was changed to 6 for systems with >250K atoms was that computing the surfaces took so long that Kristen mistakenly thought the pipeline was hung (e.g. #7897 (Hanging 6B1T) – ChimeraX), particularly since the pipeline computes the surface multiple times.  Here are some timings I just did on my 2015 iMac:

2bbv biological assembly (469K atoms):
	grid 2.5, sharp true:  33 seconds
	grid 2.5, sharp false:  29 seconds
	grid 6, sharp false:  12 seconds

6b1t biological assembly (6008K atoms):
	grid 2.5, sharp true:  6898 seconds
	grid 2.5, sharp false:  6836 seconds
	grid 6, sharp false: 2804 seconds

Notes:
	(1) As Tom surmised, going to grid 6 is not actually a good idea despite the time savings because the resulting surface is (very much) not continuous -- it's dozens of fragments.  
	(2) Increasing the number of atoms by 12.8x increases the computation time by 209x, which I find surprising.  Is that in line with your expectations Tom?
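
As a rough check on the ratios in note (2), the two "grid 2.5, sharp true" timings can be turned into an implied scaling exponent. This is only a sketch: it assumes a power-law model, time ~ atoms**k, fitted through two data points, which the email does not claim. The function name `scaling_exponent` is made up for illustration.

```python
import math

# Timings quoted above (grid 2.5, sharp true).
atoms_small, secs_small = 469_000, 33.0      # 2bbv biological assembly
atoms_large, secs_large = 6_008_000, 6898.0  # 6b1t biological assembly

def scaling_exponent(n1, t1, n2, t2):
    """Exponent k in the assumed model t ~ n**k, fitted through two points."""
    return math.log(t2 / t1) / math.log(n2 / n1)

atom_ratio = atoms_large / atoms_small
time_ratio = secs_large / secs_small
k = scaling_exponent(atoms_small, secs_small, atoms_large, secs_large)
print(f"atoms x{atom_ratio:.1f}, time x{time_ratio:.0f}, exponent ~{k:.2f}")
```

An exponent near 2 would point to growth closer to quadratic than linear in the number of atoms, though two measurements are too few to say more than that.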

Unless we can speed up the computation, perhaps the pipeline should not try to produce these representations for large systems (just provide some of the fast gaussian-surface "blob" presets).

--Eric

comment:2 by Tom Goddard, 2 years ago

Resolution: fixed
Status: assigned → closed

Fixed.

The 6b1t virus capsid (6 million atoms), where the surface at grid spacing 2.5 took 3600 seconds on my iMac, now takes 20 seconds (180 times faster). And the 2bbv virus capsid (470K atoms) that took 20 seconds now takes 1.3 seconds (15 times faster).

These large single surfaces are for the NIH 3D pipeline, and we decided to lower the grid spacing to 2.0. Those times are similarly fast: 6b1t 23 seconds (13 million triangles), 2bbv 1.5 seconds (1 million triangles).

There was a surprising bottleneck in the SES surface calculation code when making a single surface over a large number of atoms (> 500,000). The bottleneck was the step that eliminates artifact surfaces created by the grid SES code, which appear one probe sphere diameter away from the true surface. It identifies those surfaces because they are too far from all atoms. But it turns out the algorithm can produce 10,000 or more such artifact surface patches, and the code to eliminate each one compared the distance of one patch vertex to all the atom positions, which was very slow. I changed it to use the accelerated find_close_points() C++ code.
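
To illustrate why the all-atoms distance check is slow, here is a minimal pure-Python sketch of the general idea: replace the scan over every atom with a lookup in a uniform spatial grid. It is not the ChimeraX implementation and does not reproduce find_close_points(); the function names, the 3.0 cutoff, and the random test data are all hypothetical.

```python
import math
import random
from collections import defaultdict

def far_from_all_brute(vertex, atoms, max_dist):
    """Slow check: compare one vertex against every atom position."""
    return all(math.dist(vertex, a) > max_dist for a in atoms)

def build_grid(atoms, cell):
    """Bin atom positions into cubic cells of edge length `cell`."""
    grid = defaultdict(list)
    for a in atoms:
        grid[tuple(math.floor(c / cell) for c in a)].append(a)
    return grid

def far_from_all_grid(vertex, grid, cell, max_dist):
    """Fast check: only atoms in the 27 surrounding cells can be within
    max_dist of the vertex (valid as long as cell >= max_dist)."""
    ix, iy, iz = (math.floor(c / cell) for c in vertex)
    for dx in (-1, 0, 1):
        for dy in (-1, 0, 1):
            for dz in (-1, 0, 1):
                for a in grid.get((ix + dx, iy + dy, iz + dz), ()):
                    if math.dist(vertex, a) <= max_dist:
                        return False
    return True

# Hypothetical data: random "atoms" in a box, test vertices in a larger box.
random.seed(0)
atoms = [tuple(random.uniform(0, 50) for _ in range(3)) for _ in range(2000)]
vertices = [tuple(random.uniform(-10, 60) for _ in range(3)) for _ in range(200)]
max_dist = 3.0  # hypothetical cutoff, not a ChimeraX parameter

grid = build_grid(atoms, cell=max_dist)
brute = [far_from_all_brute(v, atoms, max_dist) for v in vertices]
fast = [far_from_all_grid(v, grid, max_dist, max_dist) for v in vertices]
print(brute == fast)  # both checks classify every vertex the same way
```

The brute-force check costs one distance evaluation per atom for each patch vertex, while the grid version only touches atoms in nearby cells, so its cost per vertex stays roughly constant as the atom count grows.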

It is embarrassing and fantastic to be able to improve the surface speed. In typical use the speed-up is small, e.g. 5% for computing surfaces for all chains of a ribosome. The speed-up is only large when a single surface covers a large number of atoms. Also, at the default grid spacing of 0.5 Angstroms, which is almost always used, the speed-up is much smaller: a factor of 3 for making a single surface on the 2bbv virus capsid of 470,000 atoms (27 seconds instead of 90 seconds). Single surfaces on large numbers of atoms are rarely calculated because the default ChimeraX surface behavior is to make a separate surface for each chain. Still, the optimization is huge for the rare cases.
