#15688 closed enhancement (fixed)

Improve speed of foldseek ligands command

Reported by: goddard@…
Owned by: Tom Goddard
Priority: normal
Milestone:
Component: Structure Analysis
Version:
Keywords:
Cc:
Blocked By:
Blocking:
Notify when closed:
Platform: all
Project: ChimeraX

Description

The following bug report has been submitted:
Platform:        macOS-14.5-arm64-arm-64bit
ChimeraX Version: 1.9.dev202407260123 (2024-07-26 01:23:22 UTC)
Description
Looking at how to make the foldseek ligands command faster. I edited the code to only open the 845 structures and close them, which took 49 seconds.

Next I will try opening, trimming and closing to see if trimming is taking a lot of the time.


Log:
UCSF ChimeraX version: 1.9.dev202407260123 (2024-07-26)  
© 2016-2024 Regents of the University of California. All rights reserved.  
How to cite UCSF ChimeraX  

> open /Users/goddard/Downloads/ChimeraX/Foldseek/8jnb_B/pdb100.m8 format foldseek

Summary of feedback from opening
/Users/goddard/Downloads/ChimeraX/Foldseek/8jnb_B/pdb100.m8  
---  
notes | 8jnb.pdb title:  
Craf ras-binding domain chimera, ligand complex  
  
Chain information for 8jnb.pdb #1  
---  
Chain | Description | UniProt  
B | raf proto-oncogene serine/threonine-protein kinase, craf | RAF1_HUMAN 50-101 114-141, BRAF_HUMAN 102-113  
  
Non-standard residues in 8jnb.pdb #1  
---  
USX — 2-[4-[[(2S)-1-ethanoyl-3-oxidanylidene-2H-indol-2-yl]methyl]-2-methoxy-
phenoxy]ethanamide  
  
  
  
  
3 atoms have alternate locations. Control/examine alternate locations with
Altloc Explorer or the altlocs command.  
Foldseek search for similar structures to /B in pdb100 found 845 hits  
  

> time fold lig

> foldseek ligands

Found 0 ligands in 0 hits:  
command time 48.86 seconds  
draw time 0.01458 seconds  




OpenGL version: 4.1 Metal - 88.1
OpenGL renderer: Apple M2 Ultra
OpenGL vendor: Apple

Python: 3.11.4
Locale: en_US.UTF-8
Qt version: PyQt6 6.7.0, Qt 6.7.1
Qt runtime version: 6.7.2
Qt platform: cocoa
Hardware:

    Hardware Overview:

      Model Name: Mac Studio
      Model Identifier: Mac14,14
      Model Number: Z1800003VLL/A
      Chip: Apple M2 Ultra
      Total Number of Cores: 24 (16 performance and 8 efficiency)
      Memory: 64 GB
      System Firmware Version: 10151.121.1
      OS Loader Version: 10151.121.1

Software:

    System Software Overview:

      System Version: macOS 14.5 (23F79)
      Kernel Version: Darwin 23.5.0
      Time since boot: 38 days, 15 hours, 38 minutes

Graphics/Displays:

    Apple M2 Ultra:

      Chipset Model: Apple M2 Ultra
      Type: GPU
      Bus: Built-In
      Total Number of Cores: 60
      Vendor: Apple (0x106b)
      Metal Support: Metal 3
      Displays:
        PHL 278B1:
          Resolution: 3840 x 2160 (2160p/4K UHD 1 - Ultra High Definition)
          UI Looks like: 1920 x 1080 @ 60.00Hz
          Main Display: Yes
          Mirror: Off
          Online: Yes
          Rotation: Supported


Installed Packages:
    alabaster: 0.7.16
    appdirs: 1.4.4
    appnope: 0.1.4
    asttokens: 2.4.1
    Babel: 2.15.0
    beautifulsoup4: 4.12.3
    biopython: 1.83
    blockdiag: 3.0.0
    blosc2: 2.0.0
    build: 1.2.1
    certifi: 2023.11.17
    cftime: 1.6.4
    charset-normalizer: 3.3.2
    ChimeraX-AddCharge: 1.5.17
    ChimeraX-AddH: 2.2.6
    ChimeraX-AlignmentAlgorithms: 2.0.2
    ChimeraX-AlignmentHdrs: 3.5
    ChimeraX-AlignmentMatrices: 2.1
    ChimeraX-Alignments: 2.14
    ChimeraX-AlphaFold: 1.0.1
    ChimeraX-AltlocExplorer: 1.1.1
    ChimeraX-AmberInfo: 1.0
    ChimeraX-Arrays: 1.1
    ChimeraX-Atomic: 1.58.3
    ChimeraX-AtomicLibrary: 14.1.1
    ChimeraX-AtomSearch: 2.0.1
    ChimeraX-AxesPlanes: 2.4
    ChimeraX-BasicActions: 1.1.2
    ChimeraX-BILD: 1.0
    ChimeraX-BlastProtein: 2.4.6
    ChimeraX-BondRot: 2.0.4
    ChimeraX-BugReporter: 1.0.1
    ChimeraX-BuildStructure: 2.13
    ChimeraX-Bumps: 1.0
    ChimeraX-BundleBuilder: 1.2.7
    ChimeraX-ButtonPanel: 1.0.1
    ChimeraX-CageBuilder: 1.0.1
    ChimeraX-CellPack: 1.0
    ChimeraX-Centroids: 1.4
    ChimeraX-ChangeChains: 1.1
    ChimeraX-CheckWaters: 1.4
    ChimeraX-ChemGroup: 2.0.1
    ChimeraX-Clashes: 2.2.4
    ChimeraX-Clipper: 0.23.0
    ChimeraX-clix: 0.1.4
    ChimeraX-ColorActions: 1.0.5
    ChimeraX-ColorGlobe: 1.0
    ChimeraX-ColorKey: 1.5.6
    ChimeraX-CommandLine: 1.2.5
    ChimeraX-ConnectStructure: 2.0.1
    ChimeraX-Contacts: 1.0.1
    ChimeraX-Core: 1.9.dev202407260123
    ChimeraX-CoreFormats: 1.2
    ChimeraX-coulombic: 1.4.4
    ChimeraX-Crosslinks: 1.0
    ChimeraX-Crystal: 1.0
    ChimeraX-CrystalContacts: 1.0.1
    ChimeraX-DataFormats: 1.2.3
    ChimeraX-DeepMutationalScan: 1.0
    ChimeraX-Dicom: 1.2.4
    ChimeraX-DiffPlot: 1.0
    ChimeraX-DistMonitor: 1.4.2
    ChimeraX-DockPrep: 1.1.3
    ChimeraX-Dssp: 2.0
    ChimeraX-EMDB-SFF: 1.0
    ChimeraX-ESMFold: 1.0
    ChimeraX-FileHistory: 1.0.1
    ChimeraX-Foldseek: 1.0.1
    ChimeraX-FunctionKey: 1.0.1
    ChimeraX-Geometry: 1.3
    ChimeraX-gltf: 1.0
    ChimeraX-Graphics: 1.3
    ChimeraX-Hbonds: 2.4
    ChimeraX-Help: 1.3
    ChimeraX-HKCage: 1.3
    ChimeraX-IHM: 1.1
    ChimeraX-ImageFormats: 1.2
    ChimeraX-IMOD: 1.0
    ChimeraX-IO: 1.0.1
    ChimeraX-ItemsInspection: 1.0.1
    ChimeraX-IUPAC: 1.0
    ChimeraX-Label: 1.1.10
    ChimeraX-ListInfo: 1.2.2
    ChimeraX-Log: 1.1.7
    ChimeraX-LookingGlass: 1.1
    ChimeraX-Maestro: 1.9.1
    ChimeraX-Map: 1.2
    ChimeraX-MapData: 2.0
    ChimeraX-MapEraser: 1.0.1
    ChimeraX-MapFilter: 2.0.1
    ChimeraX-MapFit: 2.0
    ChimeraX-MapSeries: 2.1.1
    ChimeraX-Markers: 1.0.1
    ChimeraX-Mask: 1.0.2
    ChimeraX-maskChains: 1.1
    ChimeraX-MatchMaker: 2.1.5
    ChimeraX-MCopy: 1.0
    ChimeraX-MDcrds: 2.7.1
    ChimeraX-MedicalToolbar: 1.0.3
    ChimeraX-Meeting: 1.0.1
    ChimeraX-MLP: 1.1.1
    ChimeraX-mmCIF: 2.14.1
    ChimeraX-MMTF: 2.2
    ChimeraX-Modeller: 1.5.17
    ChimeraX-ModelPanel: 1.5
    ChimeraX-ModelSeries: 1.0.1
    ChimeraX-Mol2: 2.0.3
    ChimeraX-Mole: 1.0
    ChimeraX-Morph: 1.0.2
    ChimeraX-MouseModes: 1.2
    ChimeraX-Movie: 1.0
    ChimeraX-Neuron: 1.0
    ChimeraX-Nifti: 1.2
    ChimeraX-NIHPresets: 1.1.19
    ChimeraX-NMRSTAR: 1.0.2
    ChimeraX-NRRD: 1.2
    ChimeraX-Nucleotides: 2.0.3
    ChimeraX-OpenCommand: 1.13.5
    ChimeraX-OrthoPick: 1.0.1
    ChimeraX-PDB: 2.7.6
    ChimeraX-PDBBio: 1.0.1
    ChimeraX-PDBLibrary: 1.0.4
    ChimeraX-PDBMatrices: 1.0
    ChimeraX-PickBlobs: 1.0.1
    ChimeraX-Positions: 1.0
    ChimeraX-PresetMgr: 1.1.2
    ChimeraX-PubChem: 2.2
    ChimeraX-ReadPbonds: 1.0.1
    ChimeraX-Registration: 1.1.2
    ChimeraX-RemoteControl: 1.0
    ChimeraX-RenderByAttr: 1.4.2
    ChimeraX-RenumberResidues: 1.1
    ChimeraX-ResidueFit: 1.0.1
    ChimeraX-RestServer: 1.3
    ChimeraX-RNALayout: 1.0
    ChimeraX-RotamerLibMgr: 4.0
    ChimeraX-RotamerLibsDunbrack: 2.0
    ChimeraX-RotamerLibsDynameomics: 2.0
    ChimeraX-RotamerLibsRichardson: 2.0
    ChimeraX-SaveCommand: 1.5.1
    ChimeraX-SchemeMgr: 1.0
    ChimeraX-SDF: 2.0.2
    ChimeraX-Segger: 1.0
    ChimeraX-Segment: 1.0.1
    ChimeraX-Segmentations: 3.1.5
    ChimeraX-SelInspector: 1.0
    ChimeraX-SeqView: 2.13
    ChimeraX-Shape: 1.0.1
    ChimeraX-Shell: 1.0.1
    ChimeraX-Shortcuts: 1.1.3
    ChimeraX-ShowSequences: 1.0.3
    ChimeraX-SideView: 1.0.1
    ChimeraX-Smiles: 2.1.2
    ChimeraX-SmoothLines: 1.0
    ChimeraX-SpaceNavigator: 1.0
    ChimeraX-StdCommands: 1.18
    ChimeraX-STL: 1.0.1
    ChimeraX-Storm: 1.0
    ChimeraX-StructMeasure: 1.2.1
    ChimeraX-Struts: 1.0.1
    ChimeraX-Surface: 1.0.1
    ChimeraX-SwapAA: 2.0.1
    ChimeraX-SwapRes: 2.5
    ChimeraX-TapeMeasure: 1.0
    ChimeraX-TaskManager: 1.0
    ChimeraX-Test: 1.0
    ChimeraX-Toolbar: 1.2.3
    ChimeraX-ToolshedUtils: 1.2.4
    ChimeraX-Topography: 1.0
    ChimeraX-ToQuest: 1.0
    ChimeraX-Tug: 1.0.1
    ChimeraX-UI: 1.39.8
    ChimeraX-uniprot: 2.3.1
    ChimeraX-UnitCell: 1.0.1
    ChimeraX-ViewDockX: 1.4.3
    ChimeraX-VIPERdb: 1.0
    ChimeraX-Vive: 1.1
    ChimeraX-VolumeMenu: 1.0.1
    ChimeraX-vrml: 1.0
    ChimeraX-VTK: 1.0
    ChimeraX-WavefrontOBJ: 1.0
    ChimeraX-WebCam: 1.0.2
    ChimeraX-WebServices: 1.1.4
    ChimeraX-Zone: 1.0.1
    colorama: 0.4.6
    comm: 0.2.2
    contourpy: 1.2.1
    cxservices: 1.2.2
    cycler: 0.12.1
    Cython: 3.0.10
    debugpy: 1.8.2
    decorator: 5.1.1
    docutils: 0.20.1
    executing: 2.0.1
    filelock: 3.13.4
    fonttools: 4.53.1
    fsspec: 2024.3.1
    funcparserlib: 2.0.0a0
    glfw: 2.7.0
    grako: 3.16.5
    h5py: 3.11.0
    html2text: 2024.2.26
    idna: 3.7
    ihm: 1.0
    imagecodecs: 2024.1.1
    imagesize: 1.4.1
    ipykernel: 6.29.5
    ipython: 8.26.0
    ipywidgets: 8.1.3
    jedi: 0.19.1
    Jinja2: 3.1.4
    joblib: 1.4.2
    jupyter_client: 8.6.2
    jupyter_core: 5.7.2
    jupyterlab_widgets: 3.0.11
    kiwisolver: 1.4.5
    line-profiler: 4.1.2
    llvmlite: 0.42.0
    lxml: 5.2.1
    lz4: 4.3.3
    MarkupSafe: 2.1.5
    matplotlib: 3.8.4
    matplotlib-inline: 0.1.7
    mpmath: 1.3.0
    mrcfile: 1.5.0
    msgpack: 1.0.8
    nest-asyncio: 1.6.0
    netCDF4: 1.6.5
    networkx: 3.3
    nibabel: 5.2.0
    nptyping: 2.5.0
    numba: 0.59.1
    numexpr: 2.10.1
    numpy: 1.26.4
    openvr: 1.26.701
    packaging: 23.2
    ParmEd: 4.2.2
    parso: 0.8.4
    pep517: 0.13.1
    pexpect: 4.9.0
    pillow: 10.3.0
    pip: 24.1.2
    pkginfo: 1.10.0
    platformdirs: 4.2.2
    prompt_toolkit: 3.0.47
    psutil: 5.9.8
    ptyprocess: 0.7.0
    pure_eval: 0.2.3
    py-cpuinfo: 9.0.0
    pycollada: 0.8
    pydicom: 2.4.4
    Pygments: 2.17.2
    pynmrstar: 3.3.4
    pynndescent: 0.5.12
    pynrrd: 1.0.0
    PyOpenGL: 3.1.7
    PyOpenGL-accelerate: 3.1.7
    pyopenxr: 1.0.3401
    pyparsing: 3.1.2
    pyproject_hooks: 1.1.0
    PyQt6: 6.7.0
    PyQt6-Qt6: 6.7.2
    PyQt6-sip: 13.6.0
    PyQt6-WebEngine: 6.7.0
    PyQt6-WebEngine-Qt6: 6.7.2
    PyQt6-WebEngineSubwheel-Qt6: 6.7.2
    python-dateutil: 2.9.0.post0
    pytz: 2024.1
    pyzmq: 26.0.3
    qtconsole: 5.5.2
    QtPy: 2.4.1
    RandomWords: 0.4.0
    requests: 2.32.3
    scikit-learn: 1.4.2
    scipy: 1.13.0
    setuptools: 70.3.0
    setuptools-scm: 8.0.4
    sfftk-rw: 0.8.1
    six: 1.16.0
    snowballstemmer: 2.2.0
    sortedcontainers: 2.4.0
    soupsieve: 2.5
    Sphinx: 7.2.6
    sphinx-autodoc-typehints: 2.0.1
    sphinxcontrib-applehelp: 1.0.8
    sphinxcontrib-blockdiag: 3.0.0
    sphinxcontrib-devhelp: 1.0.6
    sphinxcontrib-htmlhelp: 2.0.6
    sphinxcontrib-jsmath: 1.0.1
    sphinxcontrib-qthelp: 1.0.8
    sphinxcontrib-serializinghtml: 1.1.10
    stack-data: 0.6.3
    superqt: 0.6.3
    sympy: 1.12
    tables: 3.8.0
    tcia_utils: 1.5.1
    threadpoolctl: 3.5.0
    tifffile: 2024.1.30
    tinyarray: 1.2.4
    torch: 2.3.0
    tornado: 6.4.1
    tqdm: 4.66.4
    traitlets: 5.14.2
    typing_extensions: 4.12.2
    tzdata: 2024.1
    umap-learn: 0.5.6
    urllib3: 2.2.2
    wcwidth: 0.2.13
    webcolors: 1.13
    wheel: 0.43.0
    wheel-filename: 1.4.1
    widgetsnbextension: 4.0.11

Change History (4)

comment:1 by Tom Goddard, 15 months ago

Component: Unassigned → Structure Analysis
Owner: set to Tom Goddard
Platform: all
Project: ChimeraX
Status: new → assigned
Summary: ChimeraX bug report submission → Improve speed of foldseek ligands command
Type: defect → enhancement

comment:2 by Tom Goddard, 15 months ago

Including trimming, remembering the sequence alignment and showing ribbons, but not aligning, raises the time to 298 seconds (6x longer).

Commenting out the trimming reduces the time back to 49 seconds.

Calling the trimming code with trim false, which still computes the aligned residue subset, takes 50 seconds.

Extra chain and sequence trimming without ligand trimming takes 202 seconds.

Extra chain trimming with no sequence or ligand trimming takes 162 seconds.

Trimming chains using Python instead of the delete command took 328 seconds. PDB 8j07, with hundreds of chains, took the most time. The Python code looped through the chains and called chain.residues.delete() on each one. Maybe that caused a lot of costly chain recomputations. Next I tried deleting all the extra chain residues as a single collection. That took 57 seconds.
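The suspected cause above can be illustrated with a toy model. This is only a sketch, not ChimeraX code: ToyStructure, delete_residues and recompute_count are hypothetical names, and the cost model (one full chain recomputation per delete call) is an assumption taken from the guess in this comment, not a measured fact.

```python
class ToyStructure:
    """Stand-in for a molecular model. NOT the ChimeraX API."""

    def __init__(self, chains):
        # chains: dict mapping chain ID -> iterable of residue numbers
        self.chains = {cid: list(res) for cid, res in chains.items()}
        self.recompute_count = 0

    def delete_residues(self, doomed):
        """Delete a set of (chain_id, residue_number) pairs in one call."""
        for cid in list(self.chains):
            kept = [r for r in self.chains[cid] if (cid, r) not in doomed]
            if kept:
                self.chains[cid] = kept
            else:
                del self.chains[cid]
        # Assumed cost model: each delete call triggers one full
        # recomputation of chain bookkeeping, however much was deleted.
        self.recompute_count += 1


def trim_per_chain(structure, keep_chain):
    """One delete call per unwanted chain (the slow pattern)."""
    for cid in [c for c in structure.chains if c != keep_chain]:
        structure.delete_residues({(cid, r) for r in structure.chains[cid]})


def trim_batched(structure, keep_chain):
    """Collect every unwanted residue first, then delete once."""
    doomed = {(cid, r)
              for cid, residues in structure.chains.items() if cid != keep_chain
              for r in residues}
    if doomed:
        structure.delete_residues(doomed)


def make_structure(num_chains=300, residues_per_chain=50):
    return ToyStructure({f"C{i}": range(residues_per_chain)
                         for i in range(num_chains)})


slow = make_structure()
trim_per_chain(slow, keep_chain="C0")
fast = make_structure()
trim_batched(fast, keep_chain="C0")
print("recomputations per-chain:", slow.recompute_count)
print("recomputations batched:  ", fast.recompute_count)
```

Both functions leave the same single chain behind. Under the assumed cost model the per-chain loop pays once for every unwanted chain, while the batched version pays once in total, which would hit a structure with hundreds of chains hardest.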

Trimming chains and sequences in Python took 56 seconds. Strange that it is faster than just trimming chains; probably just timing noise. Trimming the sequence ends apparently takes little time.

Trimming chains, sequences and ligands all in Python took 58 seconds.

Since opening the models takes 50 seconds, the trimming now takes 8 seconds, compared to 248 seconds when using the delete command.
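The subtraction behind those figures can be written out for every variant timed in this comment. The totals are the ones reported above; the dictionary labels and the use of 50 seconds as the open-and-close baseline (reported as 49 to 50 seconds) are my own shorthand for this sketch.

```python
OPEN_ONLY = 50  # seconds to open and close the 845 hits without trimming

# Total seconds reported in this comment for each trimming variant.
totals = {
    "delete command: full trimming": 298,
    "delete command: chains and sequences": 202,
    "delete command: chains only": 162,
    "Python: per-chain residues.delete()": 328,
    "Python: chains as one collection": 57,
    "Python: chains and sequences": 56,
    "Python: chains, sequences, ligands": 58,
}

# Time attributable to trimming is the total minus the open-only baseline.
overhead = {name: total - OPEN_ONLY for name, total in totals.items()}

for name, seconds in overhead.items():
    print(f"{name:40s} {seconds:4d} s of trimming")
```

Since the runs differ by a second or two from timing noise, the small overheads (6 to 8 seconds) should be read as roughly equal.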

Last edited 15 months ago by Tom Goddard

comment:3 by Tom Goddard, 15 months ago

Using Python to trim the hits dramatically increases the speed, so that ligands for all 845 structures are processed in 75 seconds.

My Python trimming also changed the behavior of trim ligands slightly: it now considers ligands from different chain ids, and my trim chains only trims the other polymer chains, not the ligands with other chain ids. This leads to getting 300 more ligands. The drawback is that "trim chains" without trimming ligands leaves many ligands from other chains floating in space. If I trim ligands I think I want to keep ligands in other chains, but if I don't trim ligands then trim chains should probably trim everything, including ligands with other chain ids. Made that change.
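The rule described in this comment can be sketched as a small decision function. This is an illustration only, not the Foldseek bundle code: Res and trim_chains are hypothetical names, and treating every non-polymer residue as a ligand is a simplifying assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Res:
    chain_id: str
    name: str
    is_polymer: bool  # False is treated as "ligand" in this sketch


def trim_chains(residues, keep_chain, trim_ligands):
    """Return the residues that survive chain trimming."""
    kept = []
    for r in residues:
        if r.chain_id == keep_chain:
            kept.append(r)
        elif trim_ligands and not r.is_polymer:
            # Ligand with another chain id: keep it here so the later
            # ligand-trimming step can decide whether it stays.
            kept.append(r)
        # Otherwise: other-chain polymer, or other-chain ligand when
        # ligands are not being trimmed, so it is removed.
    return kept


residues = [
    Res("A", "ALA", True),
    Res("A", "USX", False),
    Res("B", "GLY", True),
    Res("B", "ATP", False),
]
print([r.name for r in trim_chains(residues, "A", trim_ligands=True)])
print([r.name for r in trim_chains(residues, "A", trim_ligands=False)])
```

With trim_ligands true the chain B ligand survives chain trimming; with it false, everything carrying another chain id is removed, so no stray ligands are left floating in space.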

comment:4 by Tom Goddard, 15 months ago

Resolution: fixed
Status: assigned → closed

Fixed.

Improved the speed of finding ligands for 845 hits from 500 seconds down to 75 seconds by replacing the structure-trimming delete commands with Python code.
