#15739 closed defect (fixed)

Loading an ESMFold BLAST result gives error "Expected a keyword"

Reported by: goddard@…
Owned by: Tom Goddard
Priority: normal
Milestone:
Component: Sequence
Version:
Keywords:
Cc:
Blocked By:
Blocking:
Notify when closed:
Platform: all
Project: ChimeraX

Description

The following bug report has been submitted:
Platform:        macOS-14.5-arm64-arm-64bit
ChimeraX Version: 1.9.dev202407250342 (2024-07-25 03:42:42 UTC)
Description
Fetching structures from the BLAST output for esmfold is broken.  It does not put the ID into the esmfold fetch command.

Log:
UCSF ChimeraX version: 1.9.dev202407250342 (2024-07-25)  
© 2016-2024 Regents of the University of California. All rights reserved.  
How to cite UCSF ChimeraX  

> open 7sx3

7sx3 title:  
Human NALCN-FAM155A-UNC79-UNC80 channelosome with CaM bound, conformation 1/2
[more info...]  
  
Chain information for 7sx3 #1  
---  
Chain | Description | UniProt  
A | Sodium leak channel non-selective protein,Enhanced green fluorescent protein | NALCN_HUMAN 1-1738, A0A7G8ZY66_MUHV1 1760-2000  
B | Transmembrane protein FAM155A | F155A_HUMAN 1-458  
C | Calmodulin-1 | CALM1_HUMAN 1-149  
D | UNC79,Protein unc-79 homolog,Protein unc-79 homolog | UNC79_HUMAN 174-2635  
E | Protein unc-80 homolog | UNC80_HUMAN 1-3258  
  
Non-standard residues in 7sx3 #1  
---  
NAG — 2-acetamido-2-deoxy-beta-D-glucopyranose (N-acetyl-beta-D-glucosamine;
2-acetamido-2-deoxy-beta-D-glucose; 2-acetamido-2-deoxy-D-glucose;
2-acetamido-2-deoxy-glucose; N-ACETYL-D-GLUCOSAMINE)  
PEV —
(1S)-2-{[(2-aminoethoxy)(hydroxy)phosphoryl]oxy}-1-[(palmitoyloxy)methyl]ethyl
stearate (phosphatidylethanolamine; 1-palmitoyl-2-oleoyl-Sn-
glycero-3-phosphoethanolamine)  
PGV —
(1R)-2-{[{[(2S)-2,3-dihydroxypropyl]oxy}(hydroxy)phosphoryl]oxy}-1-[(palmitoyloxy)methyl]ethyl
(11E)-octadec-11-enoate (phosphatidylglycerol; 2-vaccenoyl-1-palmitoyl-Sn-
glycerol-3-phosphoglycerol)  
Y01 — cholesterol hemisuccinate  
  

> select #1/D:174-2635

10737 atoms, 10967 bonds, 25 pseudobonds, 1344 residues, 2 models selected  

> open Q9P2D8 fromDatabase uniprot associate #1/D

Summary of feedback from opening Q9P2D8 fetched from uniprot  
---  
notes | Fetching compressed Q9P2D8 UniProt info from https://www.uniprot.org/uniprot/Q9P2D8.xml  
Alignment identifier is Q9P2D8  
Associated 7sx3 chain D to Q9P2D8 with 99 mismatches and/or gaps  
  
Opened UniProt Q9P2D8  

> select /D:174-2626

10737 atoms, 10967 bonds, 25 pseudobonds, 1344 residues, 2 models selected  

> ~select

Nothing selected  

> ~select

Nothing selected  

> ~select

Nothing selected  

> ~select

Nothing selected  

> select
> /D:176-183,187-199,201-205,222-233,237-250,254-264,267-280,282-284,435-437,440-455,498-510,520-537,554-567,569-576,596-613,619-634,644-651,663-666,668-674,679-694,701-715,797-815,826-840,860-878,951-963,969-983,988-995,997-1006,1008-1014,1021-1035,1040-1052,1057-1072,1076-1079,1083-1099,1105-1115,1120-1136,1138-1154,1163-1180,1209-1219,1228-1231,1273-1289,1299-1315,1329-1334,1336-1351,1353-1371,1376-1379,1387-1389,1392-1408,1414-1432,2020-2036,2038-2057,2074-2092,2094-2099,2108-2115,2124-2137,2145-2158,2170-2186,2195-2205,2210-2214,2217-2230,2235-2248,2252-2270,2278-2292,2319-2333,2337-2342,2357-2377,2383-2387,2389-2392,2407-2428,2441-2457,2471-2488,2504-2526,2531-2547,2551-2565,2576-2584,2605-2624

7971 atoms, 8079 bonds, 988 residues, 1 model selected  

> select /D:174-2626

10737 atoms, 10967 bonds, 25 pseudobonds, 1344 residues, 2 models selected  

> select /D:174-2626

10737 atoms, 10967 bonds, 25 pseudobonds, 1344 residues, 2 models selected  

> ~select

Nothing selected  

> ~select

Nothing selected  

> ~select

Nothing selected  

> ~select

Nothing selected  

> ~select

Nothing selected  

> ~select

Nothing selected  

> ~select

Nothing selected  

> select /D:1284

8 atoms, 7 bonds, 1 residue, 1 model selected  

> select /D:2183,2444

13 atoms, 11 bonds, 2 residues, 1 model selected  

> select /D:174-177

34 atoms, 35 bonds, 4 residues, 1 model selected  

> select /D:652-655,675-677,1325-1327,1380-1382,2120-2122,2138-2140

162 atoms, 160 bonds, 19 residues, 1 model selected  

> select /D:2223-2243,2468-2486

324 atoms, 333 bonds, 40 residues, 1 model selected  

> select /D:638-641,984-986,1221-1223,1320-1322

106 atoms, 103 bonds, 13 residues, 1 model selected  

> select /C:5-147

926 atoms, 932 bonds, 5 pseudobonds, 114 residues, 2 models selected  

> select /C:5-147

926 atoms, 932 bonds, 5 pseudobonds, 114 residues, 2 models selected  

> ui tool show "Blast Protein"

> blastprotein /C database esmfold cutoff 1e-3 matrix BLOSUM62 maxSeqs 100
> version 0 name bp1

Webservices job id: NYJHI1LX2XPT0KEP  

> esmfold fetch version 0 alignTo #1/C

Expected a keyword  

> esmfold fetch version 0 alignTo #1/C

Expected a keyword  




OpenGL version: 4.1 Metal - 88.1
OpenGL renderer: Apple M1 Max
OpenGL vendor: Apple

Python: 3.11.4
Locale: UTF-8
Qt version: PyQt6 6.7.0, Qt 6.7.1
Qt runtime version: 6.7.2
Qt platform: cocoa
Hardware:

    Hardware Overview:

      Model Name: MacBook Pro
      Model Identifier: MacBookPro18,2
      Model Number: MK1H3LL/A
      Chip: Apple M1 Max
      Total Number of Cores: 10 (8 performance and 2 efficiency)
      Memory: 32 GB
      System Firmware Version: 10151.121.1
      OS Loader Version: 10151.121.1

Software:

    System Software Overview:

      System Version: macOS 14.5 (23F79)
      Kernel Version: Darwin 23.5.0
      Time since boot: 14 days, 7 hours, 49 minutes

Graphics/Displays:

    Apple M1 Max:

      Chipset Model: Apple M1 Max
      Type: GPU
      Bus: Built-In
      Total Number of Cores: 32
      Vendor: Apple (0x106b)
      Metal Support: Metal 3
      Displays:
        Color LCD:
          Display Type: Built-in Liquid Retina XDR Display
          Resolution: 3456 x 2234 Retina
          Main Display: Yes
          Mirror: Off
          Online: Yes
          Automatically Adjust Brightness: No
          Connection Type: Internal


Installed Packages:
    alabaster: 0.7.16
    appdirs: 1.4.4
    appnope: 0.1.4
    asttokens: 2.4.1
    Babel: 2.15.0
    beautifulsoup4: 4.12.3
    blockdiag: 3.0.0
    blosc2: 2.0.0
    build: 1.2.1
    certifi: 2023.11.17
    cftime: 1.6.4
    charset-normalizer: 3.3.2
    ChimeraX-AddCharge: 1.5.17
    ChimeraX-AddH: 2.2.6
    ChimeraX-AlignmentAlgorithms: 2.0.2
    ChimeraX-AlignmentHdrs: 3.5
    ChimeraX-AlignmentMatrices: 2.1
    ChimeraX-Alignments: 2.14
    ChimeraX-AlphaFold: 1.0.1
    ChimeraX-AltlocExplorer: 1.1.1
    ChimeraX-AmberInfo: 1.0
    ChimeraX-Arrays: 1.1
    ChimeraX-Atomic: 1.58.4
    ChimeraX-AtomicLibrary: 14.1.3
    ChimeraX-AtomSearch: 2.0.1
    ChimeraX-AxesPlanes: 2.4
    ChimeraX-BasicActions: 1.1.2
    ChimeraX-BILD: 1.0
    ChimeraX-BlastProtein: 2.4.6
    ChimeraX-BondRot: 2.0.4
    ChimeraX-BugReporter: 1.0.1
    ChimeraX-BuildStructure: 2.13
    ChimeraX-Bumps: 1.0
    ChimeraX-BundleBuilder: 1.2.7
    ChimeraX-ButtonPanel: 1.0.1
    ChimeraX-CageBuilder: 1.0.1
    ChimeraX-CellPack: 1.0
    ChimeraX-Centroids: 1.4
    ChimeraX-ChangeChains: 1.1
    ChimeraX-CheckWaters: 1.4
    ChimeraX-ChemGroup: 2.0.1
    ChimeraX-Clashes: 2.2.4
    ChimeraX-ColorActions: 1.0.5
    ChimeraX-ColorGlobe: 1.0
    ChimeraX-ColorKey: 1.5.6
    ChimeraX-CommandLine: 1.2.5
    ChimeraX-ConnectStructure: 2.0.1
    ChimeraX-Contacts: 1.0.1
    ChimeraX-Core: 1.9.dev202407250342
    ChimeraX-CoreFormats: 1.2
    ChimeraX-coulombic: 1.4.4
    ChimeraX-Crosslinks: 1.0
    ChimeraX-Crystal: 1.0
    ChimeraX-CrystalContacts: 1.0.1
    ChimeraX-DataFormats: 1.2.3
    ChimeraX-DeepMutationalScan: 1.0
    ChimeraX-Dicom: 1.2.4
    ChimeraX-DiffPlot: 1.0
    ChimeraX-DistMonitor: 1.4.2
    ChimeraX-DockPrep: 1.1.3
    ChimeraX-Dssp: 2.0
    ChimeraX-EMDB-SFF: 1.0
    ChimeraX-ESMFold: 1.0
    ChimeraX-FileHistory: 1.0.1
    ChimeraX-Foldseek: 1.0.1
    ChimeraX-FunctionKey: 1.0.1
    ChimeraX-Geometry: 1.3
    ChimeraX-gltf: 1.0
    ChimeraX-Graphics: 1.3
    ChimeraX-Hbonds: 2.4
    ChimeraX-Help: 1.3
    ChimeraX-HKCage: 1.3
    ChimeraX-IHM: 1.1
    ChimeraX-ImageFormats: 1.2
    ChimeraX-IMOD: 1.0
    ChimeraX-IO: 1.0.1
    ChimeraX-ItemsInspection: 1.0.1
    ChimeraX-IUPAC: 1.0
    ChimeraX-Label: 1.1.10
    ChimeraX-ListInfo: 1.2.2
    ChimeraX-Log: 1.1.7
    ChimeraX-LookingGlass: 1.1
    ChimeraX-Maestro: 1.9.1
    ChimeraX-Map: 1.2
    ChimeraX-MapData: 2.0
    ChimeraX-MapEraser: 1.0.1
    ChimeraX-MapFilter: 2.0.1
    ChimeraX-MapFit: 2.0
    ChimeraX-MapSeries: 2.1.1
    ChimeraX-Markers: 1.0.1
    ChimeraX-Mask: 1.0.2
    ChimeraX-MatchMaker: 2.1.5
    ChimeraX-MCopy: 1.0
    ChimeraX-MDcrds: 2.7.1
    ChimeraX-MedicalToolbar: 1.0.3
    ChimeraX-Meeting: 1.0.1
    ChimeraX-MLP: 1.1.1
    ChimeraX-mmCIF: 2.14.1
    ChimeraX-MMTF: 2.2
    ChimeraX-Modeller: 1.5.17
    ChimeraX-ModelPanel: 1.5
    ChimeraX-ModelSeries: 1.0.1
    ChimeraX-Mol2: 2.0.3
    ChimeraX-Mole: 1.0
    ChimeraX-Morph: 1.0.2
    ChimeraX-MouseModes: 1.2
    ChimeraX-Movie: 1.0
    ChimeraX-Neuron: 1.0
    ChimeraX-Nifti: 1.2
    ChimeraX-NIHPresets: 1.1.18
    ChimeraX-NMRSTAR: 1.0.2
    ChimeraX-NRRD: 1.2
    ChimeraX-Nucleotides: 2.0.3
    ChimeraX-OpenCommand: 1.13.5
    ChimeraX-OrthoPick: 1.0.1
    ChimeraX-PDB: 2.7.6
    ChimeraX-PDBBio: 1.0.1
    ChimeraX-PDBLibrary: 1.0.4
    ChimeraX-PDBMatrices: 1.0
    ChimeraX-PickBlobs: 1.0.1
    ChimeraX-Positions: 1.0
    ChimeraX-PresetMgr: 1.1.2
    ChimeraX-PubChem: 2.2
    ChimeraX-ReadPbonds: 1.0.1
    ChimeraX-Registration: 1.1.2
    ChimeraX-RemoteControl: 1.0
    ChimeraX-RenderByAttr: 1.4.2
    ChimeraX-RenumberResidues: 1.1
    ChimeraX-ResidueFit: 1.0.1
    ChimeraX-RestServer: 1.3
    ChimeraX-RNALayout: 1.0
    ChimeraX-RotamerLibMgr: 4.0
    ChimeraX-RotamerLibsDunbrack: 2.0
    ChimeraX-RotamerLibsDynameomics: 2.0
    ChimeraX-RotamerLibsRichardson: 2.0
    ChimeraX-SaveCommand: 1.5.1
    ChimeraX-SchemeMgr: 1.0
    ChimeraX-SDF: 2.0.2
    ChimeraX-Segger: 1.0
    ChimeraX-Segment: 1.0.1
    ChimeraX-Segmentations: 3.1.5
    ChimeraX-SelInspector: 1.0
    ChimeraX-SeqView: 2.13
    ChimeraX-Shape: 1.0.1
    ChimeraX-Shell: 1.0.1
    ChimeraX-Shortcuts: 1.1.3
    ChimeraX-ShowSequences: 1.0.3
    ChimeraX-SideView: 1.0.1
    ChimeraX-Smiles: 2.1.2
    ChimeraX-SmoothLines: 1.0
    ChimeraX-SpaceNavigator: 1.0
    ChimeraX-StdCommands: 1.18
    ChimeraX-STL: 1.0.1
    ChimeraX-Storm: 1.0
    ChimeraX-StructMeasure: 1.2.1
    ChimeraX-Struts: 1.0.1
    ChimeraX-Surface: 1.0.1
    ChimeraX-SwapAA: 2.0.1
    ChimeraX-SwapRes: 2.5
    ChimeraX-TapeMeasure: 1.0
    ChimeraX-TaskManager: 1.0
    ChimeraX-Test: 1.0
    ChimeraX-Toolbar: 1.2.3
    ChimeraX-ToolshedUtils: 1.2.4
    ChimeraX-Topography: 1.0
    ChimeraX-ToQuest: 1.0
    ChimeraX-Tug: 1.0.1
    ChimeraX-UI: 1.39.8
    ChimeraX-Ummbas-Anaglyph: 0.1
    ChimeraX-uniprot: 2.3.1
    ChimeraX-UnitCell: 1.0.1
    ChimeraX-ViewDockX: 1.4.3
    ChimeraX-VIPERdb: 1.0
    ChimeraX-Vive: 1.1
    ChimeraX-VolumeMenu: 1.0.1
    ChimeraX-vrml: 1.0
    ChimeraX-VTK: 1.0
    ChimeraX-WavefrontOBJ: 1.0
    ChimeraX-WebCam: 1.0.2
    ChimeraX-WebServices: 1.1.4
    ChimeraX-Zone: 1.0.1
    colorama: 0.4.6
    comm: 0.2.2
    contourpy: 1.2.1
    cxservices: 1.2.2
    cycler: 0.12.1
    Cython: 3.0.10
    debugpy: 1.8.2
    decorator: 5.1.1
    docutils: 0.20.1
    executing: 2.0.1
    filelock: 3.13.4
    fonttools: 4.53.1
    funcparserlib: 2.0.0a0
    glfw: 2.7.0
    grako: 3.16.5
    h5py: 3.11.0
    html2text: 2024.2.26
    idna: 3.7
    ihm: 1.0
    imagecodecs: 2024.1.1
    imagesize: 1.4.1
    ipykernel: 6.29.5
    ipython: 8.26.0
    ipywidgets: 8.1.3
    jedi: 0.19.1
    Jinja2: 3.1.4
    joblib: 1.4.2
    jupyter_client: 8.6.2
    jupyter_core: 5.7.2
    jupyterlab_widgets: 3.0.11
    kiwisolver: 1.4.5
    line-profiler: 4.1.2
    llvmlite: 0.43.0
    lxml: 5.2.1
    lz4: 4.3.3
    MarkupSafe: 2.1.5
    matplotlib: 3.8.4
    matplotlib-inline: 0.1.7
    msgpack: 1.0.8
    nest-asyncio: 1.6.0
    netCDF4: 1.6.5
    networkx: 3.3
    nibabel: 5.2.0
    nptyping: 2.5.0
    numba: 0.60.0
    numexpr: 2.10.1
    numpy: 1.26.4
    openvr: 1.26.701
    packaging: 23.2
    ParmEd: 4.2.2
    parso: 0.8.4
    pep517: 0.13.1
    pexpect: 4.9.0
    pillow: 10.3.0
    pip: 24.1.2
    pkginfo: 1.10.0
    platformdirs: 4.2.2
    prompt_toolkit: 3.0.47
    psutil: 5.9.8
    ptyprocess: 0.7.0
    pure_eval: 0.2.3
    py-cpuinfo: 9.0.0
    pycollada: 0.8
    pydicom: 2.4.4
    Pygments: 2.17.2
    pynmrstar: 3.3.4
    pynndescent: 0.5.13
    pynrrd: 1.0.0
    PyOpenGL: 3.1.7
    PyOpenGL-accelerate: 3.1.7
    pyopenxr: 1.0.3401
    pyparsing: 3.1.2
    pyproject_hooks: 1.1.0
    PyQt6: 6.7.0
    PyQt6-Qt6: 6.7.2
    PyQt6-sip: 13.6.0
    PyQt6-WebEngine: 6.7.0
    PyQt6-WebEngine-Qt6: 6.7.2
    PyQt6-WebEngineSubwheel-Qt6: 6.7.2
    python-dateutil: 2.9.0.post0
    pytz: 2024.1
    pyzmq: 26.0.3
    qtconsole: 5.5.2
    QtPy: 2.4.1
    RandomWords: 0.4.0
    requests: 2.32.3
    scikit-learn: 1.5.1
    scipy: 1.13.0
    setuptools: 70.3.0
    setuptools-scm: 8.0.4
    sfftk-rw: 0.8.1
    six: 1.16.0
    snowballstemmer: 2.2.0
    sortedcontainers: 2.4.0
    soupsieve: 2.5
    Sphinx: 7.2.6
    sphinx-autodoc-typehints: 2.0.1
    sphinxcontrib-applehelp: 1.0.8
    sphinxcontrib-blockdiag: 3.0.0
    sphinxcontrib-devhelp: 1.0.6
    sphinxcontrib-htmlhelp: 2.0.6
    sphinxcontrib-jsmath: 1.0.1
    sphinxcontrib-qthelp: 1.0.8
    sphinxcontrib-serializinghtml: 1.1.10
    stack-data: 0.6.3
    superqt: 0.6.3
    tables: 3.8.0
    tcia_utils: 1.5.1
    threadpoolctl: 3.5.0
    tifffile: 2024.1.30
    tinyarray: 1.2.4
    tornado: 6.4.1
    tqdm: 4.66.4
    traitlets: 5.14.2
    typing_extensions: 4.12.2
    tzdata: 2024.1
    umap-learn: 0.5.6
    urllib3: 2.2.2
    wcwidth: 0.2.13
    webcolors: 1.13
    wheel: 0.43.0
    wheel-filename: 1.4.1
    widgetsnbextension: 4.0.11

Change History (4)

comment:1 by Tom Goddard, 15 months ago

Component: Unassigned → Sequence
Owner: set to Tom Goddard
Platform: all
Project: ChimeraX
Status: new → assigned
Summary: ChimeraX bug report submission → Loading an ESMFold BLAST result gives error "Expected a keyword"

When the BLAST Protein tool tries to load an ESMFold hit, it uses the "description" field of the PDB FASTA format, which is the third "|"-separated field. For example, here is a PDB FASTA format sequence title line:

>7SX3_1|Chain A|Sodium leak channel non-selective protein,Enhanced green fluorescent protein|Homo sapiens (9606)

But the ESMFold fasta sequences here (from ticket https://www.rbvi.ucsf.edu/trac/ChimeraX/ticket/7970#comment:13)

https://dl.fbaipublicfiles.com/esmatlas/v0/full/atlas.fasta

have sequence title lines like

MGYP0123456
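
The mismatch between the two title formats can be sketched in a few lines of Python. This is an illustration only: the function names are hypothetical and are not taken from the ChimeraX source, and the bare-ID layout is assumed from the example title line above.

```python
def split_title(title):
    """Split a FASTA title line into its "|"-separated fields."""
    return [field.strip() for field in title.lstrip(">").split("|")]


def description_field(title):
    """Third "|"-separated field, or "" when the title has fewer than three fields."""
    fields = split_title(title)
    return fields[2] if len(fields) > 2 else ""


def bare_id(title):
    """First whitespace-delimited token of the first field (assumed to be the ID)."""
    tokens = split_title(title)[0].split()
    return tokens[0] if tokens else ""


pdb_title = (">7SX3_1|Chain A|Sodium leak channel non-selective protein,"
             "Enhanced green fluorescent protein|Homo sapiens (9606)")
esm_title = "MGYP0123456"

print(repr(description_field(pdb_title)))  # the third field is present
print(repr(description_field(esm_title)))  # no third field, so nothing to use
print(repr(bare_id(esm_title)))            # the ID is the whole title
```

With a bare-ID title there is no third field to read, which is consistent with the fetch command in the log being issued without an ID.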

I rebuilt the ESMFold BLAST database today because it was accidentally deleted (ticket #15734). Apparently the original ESMFold BLAST database was built from a different FASTA file that used the PDB title format. Maybe the original FASTA was the MGnify database FASTA. The BLAST results table shows columns for the MGnify ID and the Name, so I guess there were multiple fields.

I have a vague memory that I started with MGnify and only later got the ESMFold fasta and possibly I filtered MGnify using the ESMFold ids.

comment:2 by Tom Goddard, 15 months ago

There are a couple of possible fixes. I could just fix ChimeraX to get the correct MGnify ID. Old ChimeraX versions won't work when loading an esmfold BLAST hit, but new ChimeraX will. I suspect BLAST searches of esmfold are rare enough that this is a reasonable route, minimizing effort.

Another possibility is to try to recreate what I did before by merging the MGnify fasta title lines with the ESM fasta. That has the virtue that there will probably be a better description of the entries and old ChimeraX will continue to work. Might be at least worth seeing what was in those MGnify fasta sequence titles.

comment:3 by Tom Goddard, 15 months ago

I was not able to find any per-sequence descriptions for MGnify, which makes sense since it is just sequences from metagenomic sequencing. So I'm not sure there is any additional info beyond the MGYP01234... ID to provide, and I am inclined to just fix the ChimeraX code to expect just that ID in the BLAST hit results.

comment:4 by Tom Goddard, 15 months ago

Resolution: fixed
Status: assigned → closed

Fixed.

Updated ChimeraX BLAST Protein tool to expect just the MGnify ID in the results. This means that older ChimeraX will continue to give an error fetching an esmfold blast result structure. I suspect this feature is almost never used so the fix in the daily build is adequate.
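
As a rough sketch of the behavior described here (not the actual ChimeraX change), the command string could be built from the bare MGnify ID with a guard against an empty ID. The function name and arguments are hypothetical, and the position of the ID after "esmfold fetch" is inferred from the failing commands in the log, which have nothing between "fetch" and "version".

```python
def esmfold_fetch_command(mgnify_id, chain_spec, version=0):
    """Build an "esmfold fetch" command string for one BLAST hit.

    Raises ValueError instead of emitting a command with the ID missing.
    """
    mgnify_id = mgnify_id.strip()
    if not mgnify_id:
        raise ValueError("BLAST hit has no MGnify ID; cannot build esmfold fetch command")
    return f"esmfold fetch {mgnify_id} version {version} alignTo {chain_spec}"


print(esmfold_fetch_command("MGYP0123456", "#1/C"))
```

Failing early on an empty ID would report the real problem instead of the less informative "Expected a keyword" error that the incomplete command produced.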
