Opened 15 months ago
Closed 15 months ago
#15739 closed defect (fixed)
Loading an ESMFold BLAST result gives error "Expected a keyword"
| Reported by: | | Owned by: | Tom Goddard |
|---|---|---|---|
| Priority: | normal | Milestone: | |
| Component: | Sequence | Version: | |
| Keywords: | | Cc: | |
| Blocked By: | | Blocking: | |
| Notify when closed: | | Platform: | all |
| Project: | ChimeraX | | |
Description
The following bug report has been submitted:
Platform: macOS-14.5-arm64-arm-64bit
ChimeraX Version: 1.9.dev202407250342 (2024-07-25 03:42:42 UTC)
Description
Fetching structures from the BLAST output for esmfold is broken. It does not put the ID into the esmfold fetch command.
Log:
UCSF ChimeraX version: 1.9.dev202407250342 (2024-07-25)
© 2016-2024 Regents of the University of California. All rights reserved.
How to cite UCSF ChimeraX
> open 7sx3
7sx3 title:
Human NALCN-FAM155A-UNC79-UNC80 channelosome with CaM bound, conformation 1/2
[more info...]
Chain information for 7sx3 #1
---
Chain | Description | UniProt
A | Sodium leak channel non-selective protein,Enhanced green fluorescent protein | NALCN_HUMAN 1-1738, A0A7G8ZY66_MUHV1 1760-2000
B | Transmembrane protein FAM155A | F155A_HUMAN 1-458
C | Calmodulin-1 | CALM1_HUMAN 1-149
D | UNC79,Protein unc-79 homolog,Protein unc-79 homolog | UNC79_HUMAN 174-2635
E | Protein unc-80 homolog | UNC80_HUMAN 1-3258
Non-standard residues in 7sx3 #1
---
NAG — 2-acetamido-2-deoxy-beta-D-glucopyranose (N-acetyl-beta-D-glucosamine;
2-acetamido-2-deoxy-beta-D-glucose; 2-acetamido-2-deoxy-D-glucose;
2-acetamido-2-deoxy-glucose; N-ACETYL-D-GLUCOSAMINE)
PEV —
(1S)-2-{[(2-aminoethoxy)(hydroxy)phosphoryl]oxy}-1-[(palmitoyloxy)methyl]ethyl
stearate (phosphatidylethanolamine; 1-palmitoyl-2-oleoyl-Sn-
glycero-3-phosphoethanolamine)
PGV —
(1R)-2-{[{[(2S)-2,3-dihydroxypropyl]oxy}(hydroxy)phosphoryl]oxy}-1-[(palmitoyloxy)methyl]ethyl
(11E)-octadec-11-enoate (phosphatidylglycerol; 2-vaccenoyl-1-palmitoyl-Sn-
glycerol-3-phosphoglycerol)
Y01 — cholesterol hemisuccinate
> select #1/D:174-2635
10737 atoms, 10967 bonds, 25 pseudobonds, 1344 residues, 2 models selected
> open Q9P2D8 fromDatabase uniprot associate #1/D
Summary of feedback from opening Q9P2D8 fetched from uniprot
---
notes | Fetching compressed Q9P2D8 UniProt info from https://www.uniprot.org/uniprot/Q9P2D8.xml
Alignment identifier is Q9P2D8
Associated 7sx3 chain D to Q9P2D8 with 99 mismatches and/or gaps
Opened UniProt Q9P2D8
> select /D:174-2626
10737 atoms, 10967 bonds, 25 pseudobonds, 1344 residues, 2 models selected
> ~select
Nothing selected
> ~select
Nothing selected
> ~select
Nothing selected
> ~select
Nothing selected
> select
> /D:176-183,187-199,201-205,222-233,237-250,254-264,267-280,282-284,435-437,440-455,498-510,520-537,554-567,569-576,596-613,619-634,644-651,663-666,668-674,679-694,701-715,797-815,826-840,860-878,951-963,969-983,988-995,997-1006,1008-1014,1021-1035,1040-1052,1057-1072,1076-1079,1083-1099,1105-1115,1120-1136,1138-1154,1163-1180,1209-1219,1228-1231,1273-1289,1299-1315,1329-1334,1336-1351,1353-1371,1376-1379,1387-1389,1392-1408,1414-1432,2020-2036,2038-2057,2074-2092,2094-2099,2108-2115,2124-2137,2145-2158,2170-2186,2195-2205,2210-2214,2217-2230,2235-2248,2252-2270,2278-2292,2319-2333,2337-2342,2357-2377,2383-2387,2389-2392,2407-2428,2441-2457,2471-2488,2504-2526,2531-2547,2551-2565,2576-2584,2605-2624
7971 atoms, 8079 bonds, 988 residues, 1 model selected
> select /D:174-2626
10737 atoms, 10967 bonds, 25 pseudobonds, 1344 residues, 2 models selected
> select /D:174-2626
10737 atoms, 10967 bonds, 25 pseudobonds, 1344 residues, 2 models selected
> ~select
Nothing selected
> ~select
Nothing selected
> ~select
Nothing selected
> ~select
Nothing selected
> ~select
Nothing selected
> ~select
Nothing selected
> ~select
Nothing selected
> select /D:1284
8 atoms, 7 bonds, 1 residue, 1 model selected
> select /D:2183,2444
13 atoms, 11 bonds, 2 residues, 1 model selected
> select /D:174-177
34 atoms, 35 bonds, 4 residues, 1 model selected
> select /D:652-655,675-677,1325-1327,1380-1382,2120-2122,2138-2140
162 atoms, 160 bonds, 19 residues, 1 model selected
> select /D:2223-2243,2468-2486
324 atoms, 333 bonds, 40 residues, 1 model selected
> select /D:638-641,984-986,1221-1223,1320-1322
106 atoms, 103 bonds, 13 residues, 1 model selected
> select /C:5-147
926 atoms, 932 bonds, 5 pseudobonds, 114 residues, 2 models selected
> select /C:5-147
926 atoms, 932 bonds, 5 pseudobonds, 114 residues, 2 models selected
> ui tool show "Blast Protein"
> blastprotein /C database esmfold cutoff 1e-3 matrix BLOSUM62 maxSeqs 100
> version 0 name bp1
Webservices job id: NYJHI1LX2XPT0KEP
> esmfold fetch version 0 alignTo #1/C
Expected a keyword
> esmfold fetch version 0 alignTo #1/C
Expected a keyword
OpenGL version: 4.1 Metal - 88.1
OpenGL renderer: Apple M1 Max
OpenGL vendor: Apple
Python: 3.11.4
Locale: UTF-8
Qt version: PyQt6 6.7.0, Qt 6.7.1
Qt runtime version: 6.7.2
Qt platform: cocoa
Hardware:
Hardware Overview:
Model Name: MacBook Pro
Model Identifier: MacBookPro18,2
Model Number: MK1H3LL/A
Chip: Apple M1 Max
Total Number of Cores: 10 (8 performance and 2 efficiency)
Memory: 32 GB
System Firmware Version: 10151.121.1
OS Loader Version: 10151.121.1
Software:
System Software Overview:
System Version: macOS 14.5 (23F79)
Kernel Version: Darwin 23.5.0
Time since boot: 14 days, 7 hours, 49 minutes
Graphics/Displays:
Apple M1 Max:
Chipset Model: Apple M1 Max
Type: GPU
Bus: Built-In
Total Number of Cores: 32
Vendor: Apple (0x106b)
Metal Support: Metal 3
Displays:
Color LCD:
Display Type: Built-in Liquid Retina XDR Display
Resolution: 3456 x 2234 Retina
Main Display: Yes
Mirror: Off
Online: Yes
Automatically Adjust Brightness: No
Connection Type: Internal
Installed Packages:
alabaster: 0.7.16
appdirs: 1.4.4
appnope: 0.1.4
asttokens: 2.4.1
Babel: 2.15.0
beautifulsoup4: 4.12.3
blockdiag: 3.0.0
blosc2: 2.0.0
build: 1.2.1
certifi: 2023.11.17
cftime: 1.6.4
charset-normalizer: 3.3.2
ChimeraX-AddCharge: 1.5.17
ChimeraX-AddH: 2.2.6
ChimeraX-AlignmentAlgorithms: 2.0.2
ChimeraX-AlignmentHdrs: 3.5
ChimeraX-AlignmentMatrices: 2.1
ChimeraX-Alignments: 2.14
ChimeraX-AlphaFold: 1.0.1
ChimeraX-AltlocExplorer: 1.1.1
ChimeraX-AmberInfo: 1.0
ChimeraX-Arrays: 1.1
ChimeraX-Atomic: 1.58.4
ChimeraX-AtomicLibrary: 14.1.3
ChimeraX-AtomSearch: 2.0.1
ChimeraX-AxesPlanes: 2.4
ChimeraX-BasicActions: 1.1.2
ChimeraX-BILD: 1.0
ChimeraX-BlastProtein: 2.4.6
ChimeraX-BondRot: 2.0.4
ChimeraX-BugReporter: 1.0.1
ChimeraX-BuildStructure: 2.13
ChimeraX-Bumps: 1.0
ChimeraX-BundleBuilder: 1.2.7
ChimeraX-ButtonPanel: 1.0.1
ChimeraX-CageBuilder: 1.0.1
ChimeraX-CellPack: 1.0
ChimeraX-Centroids: 1.4
ChimeraX-ChangeChains: 1.1
ChimeraX-CheckWaters: 1.4
ChimeraX-ChemGroup: 2.0.1
ChimeraX-Clashes: 2.2.4
ChimeraX-ColorActions: 1.0.5
ChimeraX-ColorGlobe: 1.0
ChimeraX-ColorKey: 1.5.6
ChimeraX-CommandLine: 1.2.5
ChimeraX-ConnectStructure: 2.0.1
ChimeraX-Contacts: 1.0.1
ChimeraX-Core: 1.9.dev202407250342
ChimeraX-CoreFormats: 1.2
ChimeraX-coulombic: 1.4.4
ChimeraX-Crosslinks: 1.0
ChimeraX-Crystal: 1.0
ChimeraX-CrystalContacts: 1.0.1
ChimeraX-DataFormats: 1.2.3
ChimeraX-DeepMutationalScan: 1.0
ChimeraX-Dicom: 1.2.4
ChimeraX-DiffPlot: 1.0
ChimeraX-DistMonitor: 1.4.2
ChimeraX-DockPrep: 1.1.3
ChimeraX-Dssp: 2.0
ChimeraX-EMDB-SFF: 1.0
ChimeraX-ESMFold: 1.0
ChimeraX-FileHistory: 1.0.1
ChimeraX-Foldseek: 1.0.1
ChimeraX-FunctionKey: 1.0.1
ChimeraX-Geometry: 1.3
ChimeraX-gltf: 1.0
ChimeraX-Graphics: 1.3
ChimeraX-Hbonds: 2.4
ChimeraX-Help: 1.3
ChimeraX-HKCage: 1.3
ChimeraX-IHM: 1.1
ChimeraX-ImageFormats: 1.2
ChimeraX-IMOD: 1.0
ChimeraX-IO: 1.0.1
ChimeraX-ItemsInspection: 1.0.1
ChimeraX-IUPAC: 1.0
ChimeraX-Label: 1.1.10
ChimeraX-ListInfo: 1.2.2
ChimeraX-Log: 1.1.7
ChimeraX-LookingGlass: 1.1
ChimeraX-Maestro: 1.9.1
ChimeraX-Map: 1.2
ChimeraX-MapData: 2.0
ChimeraX-MapEraser: 1.0.1
ChimeraX-MapFilter: 2.0.1
ChimeraX-MapFit: 2.0
ChimeraX-MapSeries: 2.1.1
ChimeraX-Markers: 1.0.1
ChimeraX-Mask: 1.0.2
ChimeraX-MatchMaker: 2.1.5
ChimeraX-MCopy: 1.0
ChimeraX-MDcrds: 2.7.1
ChimeraX-MedicalToolbar: 1.0.3
ChimeraX-Meeting: 1.0.1
ChimeraX-MLP: 1.1.1
ChimeraX-mmCIF: 2.14.1
ChimeraX-MMTF: 2.2
ChimeraX-Modeller: 1.5.17
ChimeraX-ModelPanel: 1.5
ChimeraX-ModelSeries: 1.0.1
ChimeraX-Mol2: 2.0.3
ChimeraX-Mole: 1.0
ChimeraX-Morph: 1.0.2
ChimeraX-MouseModes: 1.2
ChimeraX-Movie: 1.0
ChimeraX-Neuron: 1.0
ChimeraX-Nifti: 1.2
ChimeraX-NIHPresets: 1.1.18
ChimeraX-NMRSTAR: 1.0.2
ChimeraX-NRRD: 1.2
ChimeraX-Nucleotides: 2.0.3
ChimeraX-OpenCommand: 1.13.5
ChimeraX-OrthoPick: 1.0.1
ChimeraX-PDB: 2.7.6
ChimeraX-PDBBio: 1.0.1
ChimeraX-PDBLibrary: 1.0.4
ChimeraX-PDBMatrices: 1.0
ChimeraX-PickBlobs: 1.0.1
ChimeraX-Positions: 1.0
ChimeraX-PresetMgr: 1.1.2
ChimeraX-PubChem: 2.2
ChimeraX-ReadPbonds: 1.0.1
ChimeraX-Registration: 1.1.2
ChimeraX-RemoteControl: 1.0
ChimeraX-RenderByAttr: 1.4.2
ChimeraX-RenumberResidues: 1.1
ChimeraX-ResidueFit: 1.0.1
ChimeraX-RestServer: 1.3
ChimeraX-RNALayout: 1.0
ChimeraX-RotamerLibMgr: 4.0
ChimeraX-RotamerLibsDunbrack: 2.0
ChimeraX-RotamerLibsDynameomics: 2.0
ChimeraX-RotamerLibsRichardson: 2.0
ChimeraX-SaveCommand: 1.5.1
ChimeraX-SchemeMgr: 1.0
ChimeraX-SDF: 2.0.2
ChimeraX-Segger: 1.0
ChimeraX-Segment: 1.0.1
ChimeraX-Segmentations: 3.1.5
ChimeraX-SelInspector: 1.0
ChimeraX-SeqView: 2.13
ChimeraX-Shape: 1.0.1
ChimeraX-Shell: 1.0.1
ChimeraX-Shortcuts: 1.1.3
ChimeraX-ShowSequences: 1.0.3
ChimeraX-SideView: 1.0.1
ChimeraX-Smiles: 2.1.2
ChimeraX-SmoothLines: 1.0
ChimeraX-SpaceNavigator: 1.0
ChimeraX-StdCommands: 1.18
ChimeraX-STL: 1.0.1
ChimeraX-Storm: 1.0
ChimeraX-StructMeasure: 1.2.1
ChimeraX-Struts: 1.0.1
ChimeraX-Surface: 1.0.1
ChimeraX-SwapAA: 2.0.1
ChimeraX-SwapRes: 2.5
ChimeraX-TapeMeasure: 1.0
ChimeraX-TaskManager: 1.0
ChimeraX-Test: 1.0
ChimeraX-Toolbar: 1.2.3
ChimeraX-ToolshedUtils: 1.2.4
ChimeraX-Topography: 1.0
ChimeraX-ToQuest: 1.0
ChimeraX-Tug: 1.0.1
ChimeraX-UI: 1.39.8
ChimeraX-Ummbas-Anaglyph: 0.1
ChimeraX-uniprot: 2.3.1
ChimeraX-UnitCell: 1.0.1
ChimeraX-ViewDockX: 1.4.3
ChimeraX-VIPERdb: 1.0
ChimeraX-Vive: 1.1
ChimeraX-VolumeMenu: 1.0.1
ChimeraX-vrml: 1.0
ChimeraX-VTK: 1.0
ChimeraX-WavefrontOBJ: 1.0
ChimeraX-WebCam: 1.0.2
ChimeraX-WebServices: 1.1.4
ChimeraX-Zone: 1.0.1
colorama: 0.4.6
comm: 0.2.2
contourpy: 1.2.1
cxservices: 1.2.2
cycler: 0.12.1
Cython: 3.0.10
debugpy: 1.8.2
decorator: 5.1.1
docutils: 0.20.1
executing: 2.0.1
filelock: 3.13.4
fonttools: 4.53.1
funcparserlib: 2.0.0a0
glfw: 2.7.0
grako: 3.16.5
h5py: 3.11.0
html2text: 2024.2.26
idna: 3.7
ihm: 1.0
imagecodecs: 2024.1.1
imagesize: 1.4.1
ipykernel: 6.29.5
ipython: 8.26.0
ipywidgets: 8.1.3
jedi: 0.19.1
Jinja2: 3.1.4
joblib: 1.4.2
jupyter_client: 8.6.2
jupyter_core: 5.7.2
jupyterlab_widgets: 3.0.11
kiwisolver: 1.4.5
line-profiler: 4.1.2
llvmlite: 0.43.0
lxml: 5.2.1
lz4: 4.3.3
MarkupSafe: 2.1.5
matplotlib: 3.8.4
matplotlib-inline: 0.1.7
msgpack: 1.0.8
nest-asyncio: 1.6.0
netCDF4: 1.6.5
networkx: 3.3
nibabel: 5.2.0
nptyping: 2.5.0
numba: 0.60.0
numexpr: 2.10.1
numpy: 1.26.4
openvr: 1.26.701
packaging: 23.2
ParmEd: 4.2.2
parso: 0.8.4
pep517: 0.13.1
pexpect: 4.9.0
pillow: 10.3.0
pip: 24.1.2
pkginfo: 1.10.0
platformdirs: 4.2.2
prompt_toolkit: 3.0.47
psutil: 5.9.8
ptyprocess: 0.7.0
pure_eval: 0.2.3
py-cpuinfo: 9.0.0
pycollada: 0.8
pydicom: 2.4.4
Pygments: 2.17.2
pynmrstar: 3.3.4
pynndescent: 0.5.13
pynrrd: 1.0.0
PyOpenGL: 3.1.7
PyOpenGL-accelerate: 3.1.7
pyopenxr: 1.0.3401
pyparsing: 3.1.2
pyproject_hooks: 1.1.0
PyQt6: 6.7.0
PyQt6-Qt6: 6.7.2
PyQt6-sip: 13.6.0
PyQt6-WebEngine: 6.7.0
PyQt6-WebEngine-Qt6: 6.7.2
PyQt6-WebEngineSubwheel-Qt6: 6.7.2
python-dateutil: 2.9.0.post0
pytz: 2024.1
pyzmq: 26.0.3
qtconsole: 5.5.2
QtPy: 2.4.1
RandomWords: 0.4.0
requests: 2.32.3
scikit-learn: 1.5.1
scipy: 1.13.0
setuptools: 70.3.0
setuptools-scm: 8.0.4
sfftk-rw: 0.8.1
six: 1.16.0
snowballstemmer: 2.2.0
sortedcontainers: 2.4.0
soupsieve: 2.5
Sphinx: 7.2.6
sphinx-autodoc-typehints: 2.0.1
sphinxcontrib-applehelp: 1.0.8
sphinxcontrib-blockdiag: 3.0.0
sphinxcontrib-devhelp: 1.0.6
sphinxcontrib-htmlhelp: 2.0.6
sphinxcontrib-jsmath: 1.0.1
sphinxcontrib-qthelp: 1.0.8
sphinxcontrib-serializinghtml: 1.1.10
stack-data: 0.6.3
superqt: 0.6.3
tables: 3.8.0
tcia_utils: 1.5.1
threadpoolctl: 3.5.0
tifffile: 2024.1.30
tinyarray: 1.2.4
tornado: 6.4.1
tqdm: 4.66.4
traitlets: 5.14.2
typing_extensions: 4.12.2
tzdata: 2024.1
umap-learn: 0.5.6
urllib3: 2.2.2
wcwidth: 0.2.13
webcolors: 1.13
wheel: 0.43.0
wheel-filename: 1.4.1
widgetsnbextension: 4.0.11
Change History (4)
comment:1 by , 15 months ago
| Component: | Unassigned → Sequence |
|---|---|
| Owner: | set to |
| Platform: | → all |
| Project: | → ChimeraX |
| Status: | new → assigned |
| Summary: | ChimeraX bug report submission → Loading an ESMFold BLAST result gives error "Expected a keyword" |
comment:2 by , 15 months ago
There are a couple of possible fixes. I could just fix ChimeraX to get the correct MGnify ID. Old ChimeraX versions would still fail when loading an ESMFold BLAST hit, but new ChimeraX would work. I suspect BLAST against ESMFold is used so rarely that this is a reasonable route that minimizes effort.
Another possibility is to recreate what I did before by merging the MGnify FASTA title lines with the ESM FASTA. That has the virtue that the entries would probably have better descriptions and old ChimeraX would continue to work. It might at least be worth seeing what was in those MGnify FASTA sequence titles.
comment:3 by , 15 months ago
I was not able to find any per-sequence descriptions for MGnify, which makes sense since it is just sequences from metagenomic sequencing. So I'm not sure there is any additional info to provide beyond the MGYP01234... ID. I am inclined to just fix the ChimeraX code to expect only that ID in the BLAST hit results.
comment:4 by , 15 months ago
| Resolution: | → fixed |
|---|---|
| Status: | assigned → closed |
Fixed.
Updated the ChimeraX BLAST Protein tool to expect just the MGnify ID in the results. This means that older ChimeraX versions will continue to give an error when fetching an ESMFold BLAST result structure. I suspect this feature is almost never used, so the fix in the daily build is adequate.
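As a rough illustration of "expect just the MGnify ID", here is a minimal Python sketch. It is not the actual ChimeraX change: the function name `mgnify_id_from_hit` is hypothetical, and the pattern assumes an MGnify ID is the prefix `MGYP` followed by digits, as the MGYP01234... IDs mentioned above suggest. The sample titles are invented.

```python
import re
from typing import Optional

# Assumption: an MGnify protein ID is "MGYP" followed by one or more digits.
MGNIFY_ID = re.compile(r"MGYP\d+")


def mgnify_id_from_hit(title: str) -> Optional[str]:
    """Return the first MGnify-style ID found in a BLAST hit title, or None.

    Searching the whole title (rather than a fixed field) accepts a title that
    is only the ID as well as a title that embeds the ID among other fields.
    """
    match = MGNIFY_ID.search(title)
    return match.group(0) if match else None


# Invented hit titles, for illustration only.
print(mgnify_id_from_hit("MGYP000000000001"))
print(mgnify_id_from_hit("db|entry1|MGYP000000000002 some description"))
print(mgnify_id_from_hit("no identifier here"))
```

Returning `None` when no ID is found lets a caller report a clear error instead of issuing a fetch command with the ID missing.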
When the BLAST Protein tool tries to load an ESMFold hit, it uses the "description" field of the PDB FASTA format, which is the third "|"-separated field. For example, here is a PDB FASTA format sequence title line:
But the ESMFold FASTA sequences here (from ticket https://www.rbvi.ucsf.edu/trac/ChimeraX/ticket/7970#comment:13)
have sequence title lines like
I rebuilt the ESMFold BLAST database today because it was accidentally deleted (ticket #15734). Apparently the original ESMFold BLAST database was built from a different FASTA file that used the PDB format. Maybe the original FASTA was the MGnify database FASTA. The BLAST results table shows columns for the MGnify ID and the Name, so I guess there were multiple fields.
I have a vague memory that I started with MGnify and only later got the ESMFold FASTA, and possibly I filtered MGnify using the ESMFold IDs.
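The third-field lookup described above can be sketched in Python to show how an ID-only title line leads to a fetch command with no ID. This is an illustration under assumptions, not the BLAST Protein tool's code: the helper names `description_field` and `fetch_command` are hypothetical, both title lines are invented, and the position of the ID right after `esmfold fetch` is assumed.

```python
def description_field(title_line: str) -> str:
    """Return the third '|'-separated field of a FASTA title line, or '' if absent."""
    fields = title_line.lstrip(">").split("|")
    return fields[2].strip() if len(fields) >= 3 else ""


def fetch_command(structure_id: str) -> str:
    """Build the fetch command text; an empty ID simply drops out of the command."""
    words = ["esmfold", "fetch", structure_id, "version", "0", "alignTo", "#1/C"]
    return " ".join(word for word in words if word)


# Both title lines are invented for illustration.
three_field_title = ">db|entry1|MGYP000000000001"
id_only_title = ">MGYP000000000001"

# With three fields the ID is found and placed in the command.
print(fetch_command(description_field(three_field_title)))

# With an ID-only title there is no third field, so the command has no ID.
print(fetch_command(description_field(id_only_title)))
```

The second command printed has the same text as the failing command in the log, `esmfold fetch version 0 alignTo #1/C`, which is consistent with the report that the ID was not being put into the command.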