Opened 2 years ago

Closed 2 years ago

Last modified 2 years ago

#9170 closed defect (fixed)

SMILES needs escape-sequence insertion

Reported by: Tristan Croll
Owned by: pett
Priority: normal
Milestone:
Component: Input/Output
Version:
Keywords:
Cc:
Blocked By:
Blocking:
Notify when closed:
Platform: all
Project: ChimeraX

Description

The following bug report has been submitted:
Platform:        Linux-5.19.0-43-generic-x86_64-with-glibc2.35
ChimeraX Version: 1.6.1 (2023-05-09 17:57:07 UTC)
Description
Trying to create a molecule from a SMILES string with explicit hydrogens didn't go so well, due to mis-parsing of the string at the command-line stage. Trying to fetch it via the `smiles.fetch_smiles` API also failed at first, with the string being rejected by the server. Entering the string directly at https://cactus.nci.nih.gov/translate/ worked, and inspection of the result showed it expected most of the symbols to be escaped. This works:

{{{
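def escape_for_html(string):
    # Percent-encode (URL-escape) the SMILES punctuation the server rejected.
    # Despite the name, this is URL escaping rather than HTML escaping.
    string = string.replace('[','%5B')
    string = string.replace(']','%5D')
    string = string.replace('(','%28')
    string = string.replace(')','%29')
    string = string.replace('@','%40')
    string = string.replace('+','%2B')
    return string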

smiles = '[H:10][C:1]1([C:2]([C:4]([C:7]([C:5]([C:3]1([H:14])[H:15])([H:18])[H:19])([H:23])[C@:8]([H:24])([C:6]([H:20])([H:21])[H:22])[N+:9]([H:25])([H:26])[H:27])([H:16])[H:17])([H:12])[H:13])[H:11]'

from chimerax.smiles.smiles import fetch_smiles
m = fetch_smiles(session, escape_for_html(smiles))[0]

}}}

... although I'm not sure this example contains *every* character that needs to be escaped.
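
One rough way to see which characters in a given SMILES string are candidates for escaping is to list everything outside the RFC 3986 "unreserved" set. This is only a sketch: `chars_needing_escape` is a hypothetical helper, not part of ChimeraX, and a given server may well accept some of these characters (such as `:`) unescaped.

```python
import string

# RFC 3986 "unreserved" characters never need percent-encoding in a URL.
UNRESERVED = set(string.ascii_letters + string.digits + "-._~")


def chars_needing_escape(smiles):
    """Return the sorted characters in `smiles` outside the unreserved set.

    Hypothetical helper for illustration only; not a ChimeraX function.
    """
    return sorted(set(smiles) - UNRESERVED)


# A short fragment of the SMILES string from the report above.
fragment = "[C@:8]([H:24])[N+:9]"
print(chars_needing_escape(fragment))
# ['(', ')', '+', ':', '@', '[', ']']
```

Other SMILES strings can also contain `#`, `=`, `/` and `\`, so a fixed list of replacements is easy to leave incomplete.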

Log:
> isolde shorthand
    
    
    Initialising ISOLDE-specific command aliases:
    Alias	Equivalent full command
    -------------------------------------------------
    st	isolde step {arguments}
    aw	isolde add water {arguments}
    awsf	isolde add water {arguments} sim false
    al	isolde add ligand {arguments}
    aa	isolde add aa $1 sel {arguments}
    ht	isolde mod his sel {arguments}
    so	setattr sel atoms occupancy {arguments}
    ab	isolde adjust bfactors {arguments}
    ss	isolde sim start sel
    rt	isolde release torsions sel {arguments}
    rd	isolde release distances sel {arguments}
    ra	rd; rt
    pf	isolde pepflip sel
    cf	isolde cisflip sel
    cbb	color bfactor {arguments}
    cbo	color byattr occupancy {arguments}
    cbc	color {arguments} bychain; color {arguments} byhet
    cs	clipper set contourSensitivity {arguments}
    

  
UCSF ChimeraX version: 1.6.1 (2023-05-09)  
© 2016-2023 Regents of the University of California. All rights reserved.  
How to cite UCSF ChimeraX  

> ui tool show Shell

> usage smiles

"smiles" is not a command name  

> open
> b'[H:10][C:1]1([C:2]([C:4]([C:7]([C:5]([C:3]1([H:14])[H:15])([H:18])[H:19])([H:23])[C@:8]([H:24])([C:6]([H:20])([H:21])[H:22])[N+:9]([H:25])([H:26])[H:27])([H:16])[H:17])([H:12])[H:13])[H:11]'
> format smiles

> open
> [H:10][C:1]1([C:2]([C:4]([C:7]([C:5]([C:3]1([H:14])[H:15])([H:18])[H:19])([H:23])[C@:8]([H:24])([C:6]([H:20])([H:21])[H:22])[N+:9]([H:25])([H:26])[H:27])([H:16])[H:17])([H:12])[H:13])[H:11]
> format smiles

[Repeated 1 time(s)]

> usage open format smiles

'smiles' is a database format type; use the command 'usage open database' with
the corresponding database type instead  

> open
> [H:10][C:1]1([C:2]([C:4]([C:7]([C:5]([C:3]1([H:14])[H:15])([H:18])[H:19])([H:23])[C@:8]([H:24])([C:6]([H:20])([H:21])[H:22])[N+:9]([H:25])([H:26])[H:27])([H:16])[H:17])([H:12])[H:13])[H:11]
> from smiles

> usage open database

Missing or invalid "database" argument: Expected one of 'alphafold', 'ccd',
'cellpack', 'cod', 'eds', 'edsdiff', 'emdb', 'emdb_china', 'emdb_europe',
'emdb_japan', 'emdb_us', 'esmfold', 'ftp', 'help', 'http', 'https', 'pcod',
'pdb', 'pdbdev', 'pdbe', 'pdbe_bio', 'pdbj', 'pubchem', 'rcsb_bio', 'smiles',
'tcia', or 'uniprot'  

> usage open smiles

Expected fewer arguments  

> usage open

open names [format format] [fromDatabase fromDatabase] [name a text string]
[format/database-specific arguments]  
— read and display data  
names: file names to open; a name of 'browse' will bring up a file browser  
format: one of aln, amber, amira, apbs, bild, brix, ccd, ccp4, cellpack,
clustal, cmap, cmd, cod, collada, com, compiled python, coords, corecif, cube,
dat, dcd, defattr, delphi, dicom, dock, dsn6, dv, dynamo, emanhdf, fasta,
fchk, fsc, generic_particles, geomodel, gltf, gopenmol, gro, hdf, hssp, html,
ihm, images, imagic, imod, imodmap, ims, log, macmolplt, markers, mmcif, mmtf,
model2point, mol2, mole, motivelist, motl, mrc, msf, mtz, nbo31, nbo47,
netcdfmap, nifti, nrrd, obj, out, pcod, pdb, pdbqt, peet, pfam, photo, pif,
pir, positions, priism, profec, pseudobonds, psf, python, qout, relion, rmf,
rsf, schrodinger maestro, sdf, segger, session, sff, situs, smallcif, smiles,
spider, sqmout, star, stl, stockholm, storm, swc, swissdock, tbl, tom_em, trr,
uhbd, uniprot, viperdb, vtk, web fetch, xplor, xtc, xyz, or zdock  
fromDatabase: one of alphafold, ccd, cellpack, cod, eds, edsdiff, emdb,
emdb_china, emdb_europe, emdb_japan, emdb_us, esmfold, ftp, help, http, https,
pcod, pdb, pdbdev, pdbe, pdbe_bio, pdbj, pubchem, rcsb_bio, smiles, tcia, or
uniprot  
format/database-specific arguments: format- or database-specific arguments; to
see their syntax use 'usage open format format' or 'usage open database
database' commands respectively, where format and database are as per the
above  

> open
> [H:10][C:1]1([C:2]([C:4]([C:7]([C:5]([C:3]1([H:14])[H:15])([H:18])[H:19])([H:23])[C@:8]([H:24])([C:6]([H:20])([H:21])[H:22])[N+:9]([H:25])([H:26])[H:27])([H:16])[H:17])([H:12])[H:13])[H:11]
> format smiles

No such database '[h'  

> open
> [H:10][C:1]1([C:2]([C:4]([C:7]([C:5]([C:3]1([H:14])[H:15])([H:18])[H:19])([H:23])[C@:8]([H:24])([C:6]([H:20])([H:21])[H:22])[N+:9]([H:25])([H:26])[H:27])([H:16])[H:17])([H:12])[H:13])[H:11]
> format smiles

No such database '[h'  

> usage open format smiles

'smiles' is a database format type; use the command 'usage open database' with
the corresponding database type instead  

> usage open smiles

Expected fewer arguments  

> usage open database smiles

open names [ignoreCache true or false] [resName a text string]  
— read and display data  
names: file names to open; a name of 'browse' will bring up a file browser  

> open
> \\[H:10\\]\\[C:1\\]1(\\[C:2\\](\\[C:4\\](\\[C:7\\](\\[C:5\\](\\[C:3\\]1(\\[H:14\\])\\[H:15\\])(\\[H:18\\])\\[H:19\\])(\\[H:23\\])\\[C@:8\\](\\[H:24\\])(\\[C:6\\](\\[H:20\\])(\\[H:21\\])\\[H:22\\])\\[N+:9\\](\\[H:25\\])(\\[H:26\\])\\[H:27\\])(\\[H:16\\])\\[H:17\\])(\\[H:12\\])\\[H:13\\])\\[H:11\\]
> from smiles

> open CCCC fromDatabase smiles

Translated SMILES to 3D structure via NCI web service (SMILES: CCCC)  

> close

> open
> [H:10][C:1]1([C:2]([C:4]([C:7]([C:5]([C:3]1([H:14])[H:15])([H:18])[H:19])([H:23])[C@:8]([H:24])([C:6]([H:20])([H:21])[H:22])[N+:9]([H:25])([H:26])[H:27])([H:16])[H:17])([H:12])[H:13])[H:11]
> from smiles

No such database '[h'  

> open
> r'[H:10][C:1]1([C:2]([C:4]([C:7]([C:5]([C:3]1([H:14])[H:15])([H:18])[H:19])([H:23])[C@:8]([H:24])([C:6]([H:20])([H:21])[H:22])[N+:9]([H:25])([H:26])[H:27])([H:16])[H:17])([H:12])[H:13])[H:11]'
> from smiles

No such database 'r'[h'  
Failed to translate SMILES to 3D structure via NCI web service(SMILES:
[H:10][C:1]1([C:2]([C:4]([C:7]([C:5]([C:3]1([H:14])[H:15])([H:18])[H:19])([H:23])[C@:8]([H:24])([C:6]([H:20])([H:21])[H:22])[N+:9]([H:25])([H:26])[H:27])([H:16])[H:17])([H:12])[H:13])[H:11])  

> open
> %5BH:10%5D%5BC:1%5D1%28%5BC:2%5D%28%5BC:4%5D%28%5BC:7%5D%28%5BC:5%5D%28%5BC:3%5D1%28%5BH:14%5D%29%5BH:15%5D%29%28%5BH:18%5D%29%5BH:19%5D%29%28%5BH:23%5D%29%5BC%40:8%5D%28%5BH:24%5D%29%28%5BC:6%5D%28%5BH:20%5D%29%28%5BH:21%5D%29%5BH:22%5D%29%5BN%2B:9%5D%28%5BH:25%5D%29%28%5BH:26%5D%29%5BH:27%5D%29%28%5BH:16%5D%29%5BH:17%5D%29%28%5BH:12%5D%29%5BH:13%5D%29%5BH:11%5D
> from smiles

[Repeated 1 time(s)]




OpenGL version: 3.3.0 NVIDIA 515.105.01
OpenGL renderer: NVIDIA GeForce RTX 3070/PCIe/SSE2
OpenGL vendor: NVIDIA Corporation

Python: 3.9.11
Locale: en_GB.UTF-8
Qt version: PyQt6 6.4.2, Qt 6.4.2
Qt runtime version: 6.4.3
Qt platform: xcb

XDG_SESSION_TYPE=x11
DESKTOP_SESSION=ubuntu
XDG_SESSION_DESKTOP=ubuntu
XDG_CURRENT_DESKTOP=ubuntu:GNOME
DISPLAY=:1
Manufacturer: Dell Inc.
Model: XPS 8950
OS: Ubuntu 22.04 Jammy Jellyfish
Architecture: 64bit ELF
Virtual Machine: none
CPU: 20 12th Gen Intel(R) Core(TM) i7-12700
Cache Size: 25600 KB
Memory:
	               total        used        free      shared  buff/cache   available
	Mem:            31Gi       6.0Gi       7.6Gi       225Mi        17Gi        24Gi
	Swap:          2.0Gi          0B       2.0Gi

Graphics:
	0000:01:00.0 VGA compatible controller [0300]: NVIDIA Corporation GA104 [GeForce RTX 3070 Lite Hash Rate] [10de:2488] (rev a1)	
	Subsystem: Dell GA104 [GeForce RTX 3070 Lite Hash Rate] [1028:c903]	
	Kernel driver in use: nvidia

Installed Packages:
    alabaster: 0.7.13
    appdirs: 1.4.4
    asttokens: 2.2.1
    Babel: 2.12.1
    backcall: 0.2.0
    beautifulsoup4: 4.11.2
    blockdiag: 3.0.0
    build: 0.10.0
    certifi: 2023.5.7
    cftime: 1.6.2
    charset-normalizer: 3.1.0
    ChimeraX-AddCharge: 1.5.9.1
    ChimeraX-AddH: 2.2.5
    ChimeraX-AlignmentAlgorithms: 2.0.1
    ChimeraX-AlignmentHdrs: 3.3.1
    ChimeraX-AlignmentMatrices: 2.1
    ChimeraX-Alignments: 2.9.3
    ChimeraX-AlphaFold: 1.0
    ChimeraX-AltlocExplorer: 1.0.3
    ChimeraX-AmberInfo: 1.0
    ChimeraX-Arrays: 1.1
    ChimeraX-Atomic: 1.43.10
    ChimeraX-AtomicLibrary: 10.0.6
    ChimeraX-AtomSearch: 2.0.1
    ChimeraX-AxesPlanes: 2.3.2
    ChimeraX-BasicActions: 1.1.2
    ChimeraX-BILD: 1.0
    ChimeraX-BlastProtein: 2.1.2
    ChimeraX-BondRot: 2.0.1
    ChimeraX-BugReporter: 1.0.1
    ChimeraX-BuildStructure: 2.8
    ChimeraX-Bumps: 1.0
    ChimeraX-BundleBuilder: 1.2.2
    ChimeraX-ButtonPanel: 1.0.1
    ChimeraX-CageBuilder: 1.0.1
    ChimeraX-CellPack: 1.0
    ChimeraX-Centroids: 1.3.2
    ChimeraX-ChangeChains: 1.0.2
    ChimeraX-CheckWaters: 1.3.1
    ChimeraX-ChemGroup: 2.0.1
    ChimeraX-Clashes: 2.2.4
    ChimeraX-Clipper: 0.21.0
    ChimeraX-ColorActions: 1.0.3
    ChimeraX-ColorGlobe: 1.0
    ChimeraX-ColorKey: 1.5.3
    ChimeraX-CommandLine: 1.2.5
    ChimeraX-ConnectStructure: 2.0.1
    ChimeraX-Contacts: 1.0.1
    ChimeraX-Core: 1.6.1
    ChimeraX-CoreFormats: 1.1
    ChimeraX-coulombic: 1.4.2
    ChimeraX-Crosslinks: 1.0
    ChimeraX-Crystal: 1.0
    ChimeraX-CrystalContacts: 1.0.1
    ChimeraX-DataFormats: 1.2.3
    ChimeraX-Dicom: 1.2
    ChimeraX-DistMonitor: 1.4
    ChimeraX-DockPrep: 1.1.1
    ChimeraX-Dssp: 2.0
    ChimeraX-EMDB-SFF: 1.0
    ChimeraX-ESMFold: 1.0
    ChimeraX-FileHistory: 1.0.1
    ChimeraX-FunctionKey: 1.0.1
    ChimeraX-Geometry: 1.3
    ChimeraX-gltf: 1.0
    ChimeraX-Graphics: 1.1.1
    ChimeraX-Hbonds: 2.4
    ChimeraX-Help: 1.2.1
    ChimeraX-HKCage: 1.3
    ChimeraX-IHM: 1.1
    ChimeraX-ImageFormats: 1.2
    ChimeraX-IMOD: 1.0
    ChimeraX-IO: 1.0.1
    ChimeraX-ISOLDE: 1.6.0
    ChimeraX-ItemsInspection: 1.0.1
    ChimeraX-Label: 1.1.7
    ChimeraX-LinuxSupport: 1.0.1
    ChimeraX-ListInfo: 1.1.1
    ChimeraX-Log: 1.1.5
    ChimeraX-LookingGlass: 1.1
    ChimeraX-Maestro: 1.8.2
    ChimeraX-Map: 1.1.4
    ChimeraX-MapData: 2.0
    ChimeraX-MapEraser: 1.0.1
    ChimeraX-MapFilter: 2.0.1
    ChimeraX-MapFit: 2.0
    ChimeraX-MapSeries: 2.1.1
    ChimeraX-Markers: 1.0.1
    ChimeraX-Mask: 1.0.2
    ChimeraX-MatchMaker: 2.0.12
    ChimeraX-MDcrds: 2.6
    ChimeraX-MedicalToolbar: 1.0.2
    ChimeraX-Meeting: 1.0.1
    ChimeraX-MLP: 1.1.1
    ChimeraX-mmCIF: 2.12
    ChimeraX-MMTF: 2.2
    ChimeraX-Modeller: 1.5.9
    ChimeraX-ModelPanel: 1.3.7
    ChimeraX-ModelSeries: 1.0.1
    ChimeraX-Mol2: 2.0
    ChimeraX-Mole: 1.0
    ChimeraX-Morph: 1.0.2
    ChimeraX-MouseModes: 1.2
    ChimeraX-Movie: 1.0
    ChimeraX-Neuron: 1.0
    ChimeraX-Nifti: 1.0
    ChimeraX-NRRD: 1.0
    ChimeraX-Nucleotides: 2.0.3
    ChimeraX-OpenCommand: 1.10.1
    ChimeraX-PDB: 2.7.2
    ChimeraX-PDBBio: 1.0
    ChimeraX-PDBLibrary: 1.0.2
    ChimeraX-PDBMatrices: 1.0
    ChimeraX-PickBlobs: 1.0.1
    ChimeraX-Positions: 1.0
    ChimeraX-PresetMgr: 1.1
    ChimeraX-PubChem: 2.1
    ChimeraX-QScore: 1.0
    ChimeraX-ReadPbonds: 1.0.1
    ChimeraX-Registration: 1.1.1
    ChimeraX-RemoteControl: 1.0
    ChimeraX-RenderByAttr: 1.1
    ChimeraX-RenumberResidues: 1.1
    ChimeraX-ResidueFit: 1.0.1
    ChimeraX-RestServer: 1.1
    ChimeraX-RNALayout: 1.0
    ChimeraX-RotamerLibMgr: 3.0
    ChimeraX-RotamerLibsDunbrack: 2.0
    ChimeraX-RotamerLibsDynameomics: 2.0
    ChimeraX-RotamerLibsRichardson: 2.0
    ChimeraX-SaveCommand: 1.5.1
    ChimeraX-SchemeMgr: 1.0
    ChimeraX-SDF: 2.0.1
    ChimeraX-Segger: 1.0
    ChimeraX-Segment: 1.0.1
    ChimeraX-SelInspector: 1.0
    ChimeraX-SeqView: 2.8.3
    ChimeraX-Shape: 1.0.1
    ChimeraX-Shell: 1.0.1
    ChimeraX-Shortcuts: 1.1.1
    ChimeraX-ShowSequences: 1.0.1
    ChimeraX-SideView: 1.0.1
    ChimeraX-Smiles: 2.1
    ChimeraX-SmoothLines: 1.0
    ChimeraX-SpaceNavigator: 1.0
    ChimeraX-StdCommands: 1.10.3
    ChimeraX-STL: 1.0.1
    ChimeraX-Storm: 1.0
    ChimeraX-StructMeasure: 1.1.2
    ChimeraX-Struts: 1.0.1
    ChimeraX-Surface: 1.0.1
    ChimeraX-SwapAA: 2.0.1
    ChimeraX-SwapRes: 2.2.1
    ChimeraX-TapeMeasure: 1.0
    ChimeraX-Test: 1.0
    ChimeraX-Toolbar: 1.1.2
    ChimeraX-ToolshedUtils: 1.2.1
    ChimeraX-Topography: 1.0
    ChimeraX-Tug: 1.0.1
    ChimeraX-UI: 1.28.4
    ChimeraX-uniprot: 2.2.2
    ChimeraX-UnitCell: 1.0.1
    ChimeraX-ViewDockX: 1.2
    ChimeraX-VIPERdb: 1.0
    ChimeraX-Vive: 1.1
    ChimeraX-VolumeMenu: 1.0.1
    ChimeraX-VTK: 1.0
    ChimeraX-WavefrontOBJ: 1.0
    ChimeraX-WebCam: 1.0.2
    ChimeraX-WebServices: 1.1.1
    ChimeraX-Zone: 1.0.1
    colorama: 0.4.6
    comm: 0.1.3
    contourpy: 1.0.7
    cxservices: 1.2.2
    cycler: 0.11.0
    Cython: 0.29.33
    debugpy: 1.6.7
    decorator: 5.1.1
    distro: 1.7.0
    docutils: 0.19
    executing: 1.2.0
    filelock: 3.9.0
    fonttools: 4.39.3
    funcparserlib: 1.0.1
    grako: 3.16.5
    h5py: 3.8.0
    html2text: 2020.1.16
    idna: 3.4
    ihm: 0.35
    imagecodecs: 2022.9.26
    imagesize: 1.4.1
    importlib-metadata: 6.6.0
    ipykernel: 6.21.1
    ipython: 8.10.0
    ipython-genutils: 0.2.0
    ipywidgets: 8.0.6
    jedi: 0.18.2
    Jinja2: 3.1.2
    jupyter-client: 8.0.2
    jupyter-core: 5.3.0
    jupyterlab-widgets: 3.0.7
    kiwisolver: 1.4.4
    line-profiler: 4.0.2
    lxml: 4.9.2
    lz4: 4.3.2
    MarkupSafe: 2.1.2
    matplotlib: 3.6.3
    matplotlib-inline: 0.1.6
    msgpack: 1.0.4
    nest-asyncio: 1.5.6
    netCDF4: 1.6.2
    networkx: 2.8.8
    nibabel: 5.0.1
    nptyping: 2.5.0
    numexpr: 2.8.4
    numpy: 1.23.5
    openvr: 1.23.701
    packaging: 23.1
    ParmEd: 3.4.3
    parso: 0.8.3
    pep517: 0.13.0
    pexpect: 4.8.0
    pickleshare: 0.7.5
    Pillow: 9.3.0
    pip: 23.0
    pkginfo: 1.9.6
    platformdirs: 3.5.0
    prompt-toolkit: 3.0.38
    psutil: 5.9.4
    ptyprocess: 0.7.0
    pure-eval: 0.2.2
    pycollada: 0.7.2
    pydicom: 2.3.0
    Pygments: 2.14.0
    pynrrd: 1.0.0
    PyOpenGL: 3.1.5
    PyOpenGL-accelerate: 3.1.5
    pyparsing: 3.0.9
    pyproject-hooks: 1.0.0
    PyQt6-commercial: 6.4.2
    PyQt6-Qt6: 6.4.3
    PyQt6-sip: 13.4.1
    PyQt6-WebEngine-commercial: 6.4.0
    PyQt6-WebEngine-Qt6: 6.4.3
    python-dateutil: 2.8.2
    pytz: 2023.3
    pyzmq: 25.0.2
    qtconsole: 5.4.0
    QtPy: 2.3.1
    RandomWords: 0.4.0
    rdkit-pypi: 2022.9.2
    requests: 2.28.2
    scipy: 1.9.3
    setuptools: 67.4.0
    sfftk-rw: 0.7.3
    six: 1.16.0
    snowballstemmer: 2.2.0
    sortedcontainers: 2.4.0
    soupsieve: 2.4.1
    sphinx: 6.1.3
    sphinx-autodoc-typehints: 1.22
    sphinxcontrib-applehelp: 1.0.4
    sphinxcontrib-blockdiag: 3.0.0
    sphinxcontrib-devhelp: 1.0.2
    sphinxcontrib-htmlhelp: 2.0.1
    sphinxcontrib-jsmath: 1.0.1
    sphinxcontrib-qthelp: 1.0.3
    sphinxcontrib-serializinghtml: 1.1.5
    stack-data: 0.6.2
    tables: 3.7.0
    tcia-utils: 1.2.0
    tifffile: 2022.10.10
    tinyarray: 1.2.4
    tomli: 2.0.1
    tornado: 6.3.1
    traitlets: 5.9.0
    typing-extensions: 4.5.0
    tzdata: 2023.3
    urllib3: 1.26.15
    wcwidth: 0.2.6
    webcolors: 1.12
    wheel: 0.38.4
    wheel-filename: 1.4.1
    widgetsnbextension: 4.0.7
    zipp: 3.15.0

Change History (9)

comment:1 by pett, 2 years ago

Component: Unassigned → Input/Output
Owner: set to pett
Platform: all
Project: ChimeraX
Status: new → accepted
Summary: ChimeraX bug report submission → SMILES needs escape-sequence insertion

comment:2 by pett, 2 years ago

Resolution: fixed
Status: accepted → closed

The SMILES code already did some bespoke quoting of the SMILES string before sending it. It now uses the more comprehensive urllib.parse.quote() function to do the quoting.

Fix: https://github.com/RBVI/ChimeraX/commit/fd5df299e02ba0bce4d090374355cb3cf7112636
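
For reference, a minimal sketch of what quoting with `urllib.parse.quote()` looks like for a SMILES fragment. The `quote_smiles` wrapper and the choice of `safe=""` are assumptions made for illustration; the linked commit may call `quote()` with different arguments.

```python
from urllib.parse import quote, unquote


def quote_smiles(smiles):
    """Percent-encode a SMILES string for use in a URL.

    Hypothetical wrapper; safe="" (also encode '/') is an assumption,
    not necessarily what the ChimeraX fix does.
    """
    return quote(smiles, safe="")


fragment = "[C@:8]([H:24])[N+:9]"
quoted = quote_smiles(fragment)
print(quoted)
# %5BC%40%3A8%5D%28%5BH%3A24%5D%29%5BN%2B%3A9%5D

# Quoting is reversible, so the server sees the original string.
assert unquote(quoted) == fragment
```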

comment:3 by Tristan Croll, 2 years ago

I figured there must be something like that, but my google-fu only got me
as far as `html.escape()` which did nothing. :)

A follow-on problem: a particularly long SMILES string leads to failure
because the temp file name is too long (I guess mostly because I'm
currently passing the pre-processed string to `fetch_smiles()` rather than
modifying the method to process it internally... but it could in theory
happen regardless).

[Errno 36] File name too long:
'/tmp/tmp2nfd85co%5BH%3A24%5D%5Bc%3A9%5D1%5Bc%3A10%5D%28%5Bc%3A14%5D2%5Bc%3A13%5D%28%5Bc%3A12%5D%28%5Bc%3A11%5D1%5BO%3A7%5D%5BC%3A16%5D%28%5BH%3A27%5D%29%28%5BH%3A28%5D%29%5BC%3A20%5D3%28%5BC%3A18%5D%28%5BC%3A19%5D3%28%5BH%3A32%5D%29%5BH%3A33%5D%29%28%5BH%3A30%5D%29%5BH%3A31%5D%29%5BC%3A6%5D%23%5BN%3A1%5D%29%5BC%3A15%5D%28%5BH%3A26%5D%29%28%5BF%3A4%5D%29%5BF%3A5%5D%29%5BC%3A17%5D%28%5BC%3A21%5D%28%5BS%3A22%5D2%28%3D%5BO%3A2%5D%29%3D%5BO%3A3%5D%29%28%5BH%3A34%5D%29%5BH%3A35%5D%29%28%5BH%3A29%5D%29%5BO%3A8%5D%5BH%3A23%5D%29%5BH%3A25%5D'


comment:4 by pett, 2 years ago

Is there a stack trace that goes along with that error? My reading of the code suggests that it is urllib.request.urlopen() using a temp file under the hood -- which would be difficult for me to do anything about. But that's somewhat guesswork and a stack trace would clarify whether my analysis was in fact correct...

comment:5 by Tristan Croll, 2 years ago

That was running in a loop where I was just catching and reporting the text
of any errors before continuing, so I’m afraid not. But I can generate it
tomorrow. It was not much more than a call to `fetch_smiles(session,
smiles=already_quoted_string)`.
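
A loop of that kind can keep the full stack trace for each failure and still carry on, by recording `traceback.format_exc()` instead of only the error text. This is a generic sketch: `run_all` and `_demo` are hypothetical names, and `_demo` merely stands in for a call such as `fetch_smiles(session, s)`.

```python
import traceback


def run_all(items, func):
    """Apply func to each item; map each failing item to its full traceback."""
    failures = {}
    for item in items:
        try:
            func(item)
        except Exception:
            # format_exc() returns the same text the interpreter would print.
            failures[item] = traceback.format_exc()
    return failures


def _demo(smiles):
    # Hypothetical stand-in for the real per-item work.
    if len(smiles) > 10:
        raise OSError(36, "File name too long")


failures = run_all(["CCCC", "C" * 20], _demo)
for item, trace in failures.items():
    print(item)
    print(trace)
```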


comment:6 by pett, 2 years ago

Well, if my guess was wrong then the stack trace will point me in the right direction! :-)

comment:7 by Tristan Croll, 2 years ago

Sorry it took a little longer than expected to get back to this, but here
you go:

{{{
smiles_str = '''%5BH%3A32%5D%5Bc%3A14%5D1%5Bc%3A17%5D%28%5Bc%3A18%5D2%5Bc%3A19%5D%28%5Bc%3A15%5D%28%5Bc%3A16%5D1%5BO%3A5%5D%5BH%3A27%5D%29%5BCl%3A3%5D%29%5BC%3A21%5D%28%2F%5BC%3A13%5D%28%3D%5BC%3A12%5D%28%5C%5BC%3A20%5D%28%2F%5BC%3A10%5D%28%3D%5BC%3A11%5D%28%5C%5BC%40%40%3A25%5D3%28%5BC%40%3A26%5D%28%5BO%3A8%5D3%29%28%5BC%3A23%5D%28%5BC%40%40%3A24%5D%28%5BO%3A7%5D%5BC%3A9%5D2%3D%5BO%3A2%5D%29%28%5BH%3A42%5D%29%5BC%3A22%5D%28%5BH%3A37%5D%29%28%5BH%3A38%5D%29%5BH%3A39%5D%29%28%5BH%3A40%5D%29%5BH%3A41%5D%29%5BH%3A44%5D%29%5BH%3A43%5D%29%2F%5BH%3A30%5D%29%2F%5BH%3A29%5D%29%28%5BH%3A33%5D%29%5BH%3A34%5D%29%2F%5BH%3A31%5D%29%2F%5BN%3A4%5D%3D%5BO%3A1%5D%29%28%5BH%3A35%5D%29%5BH%3A36%5D%29%5BO%3A6%5D%5BH%3A28%5D'''

fetch_smiles(session, smiles_str)
---------------------------------------------------------------------------
OSError                                   Traceback (most recent call last)
Cell In[5], line 1
----> 1 fetch_smiles(session, smiles_str)

File /usr/lib/ucsf-chimerax/lib/python3.9/site-packages/chimerax/smiles/smiles.py:39, in fetch_smiles(session, smiles_string, res_name, **kw)
     37 for fetcher, moniker, ack_name, info_url in fetcher_info:
     38     try:
---> 39         path = fetcher(session, smiles_string, web_smiles)
     40     except SmilesTranslationError:
     41         pass

File /usr/lib/ucsf-chimerax/lib/python3.9/site-packages/chimerax/smiles/smiles.py:81, in _indiana_fetch(session, smiles, web_smiles)
     79 from chimerax.core.fetch import fetch_file
     80 import os
---> 81 filename = fetch_file(session, "http://cheminfov.informatics.indiana.edu/rest/thread/d3.py/"
     82     "SMILES/%s" % smiles, 'SMILES %s' % smiles, web_smiles, None)
     83 return filename

File /usr/lib/ucsf-chimerax/lib/python3.9/site-packages/chimerax/core/fetch.py:79, in fetch_file(session, url, name, save_name, save_dir, uncompress, transmit_compressed, ignore_cache, check_certificates, timeout, error_status)
     77 if save_dir is None:
     78     import tempfile
---> 79     f = tempfile.NamedTemporaryFile(suffix=save_name)
     80     filename = f.name
     81     f.close()

File /usr/lib/ucsf-chimerax/lib/python3.9/tempfile.py:545, in NamedTemporaryFile(mode, buffering, encoding, newline, suffix, prefix, dir, delete, errors)
    542 if _os.name == 'nt' and delete:
    543     flags |= _os.O_TEMPORARY
--> 545 (fd, name) = _mkstemp_inner(dir, prefix, suffix, flags, output_type)
    546 try:
    547     file = _io.open(fd, mode, buffering=buffering,
    548                     newline=newline, encoding=encoding, errors=errors)

File /usr/lib/ucsf-chimerax/lib/python3.9/tempfile.py:255, in _mkstemp_inner(dir, pre, suf, flags, output_type)
    253 _sys.audit("tempfile.mkstemp", file)
    254 try:
--> 255     fd = _os.open(file, flags, 0o600)
    256 except FileExistsError:
    257     continue    # try again

OSError: [Errno 36] File name too long:
'/tmp/tmp28e4vnq0%5BH%3A32%5D%5Bc%3A14%5D1%5Bc%3A17%5D%28%5Bc%3A18%5D2%5Bc%3A19%5D%28%5Bc%3A15%5D%28%5Bc%3A16%5D1%5BO%3A5%5D%5BH%3A27%5D%29%5BCl%3A3%5D%29%5BC%3A21%5D%28%2F%5BC%3A13%5D%28%3D%5BC%3A12%5D%28%5C%5BC%3A20%5D%28%2F%5BC%3A10%5D%28%3D%5BC%3A11%5D%28%5C%5BC%40%40%3A25%5D3%28%5BC%40%3A26%5D%28%5BO%3A8%5D3%29%28%5BC%3A23%5D%28%5BC%40%40%3A24%5D%28%5BO%3A7%5D%5BC%3A9%5D2%3D%5BO%3A2%5D%29%28%5BH%3A42%5D%29%5BC%3A22%5D%28%5BH%3A37%5D%29%28%5BH%3A38%5D%29%5BH%3A39%5D%29%28%5BH%3A40%5D%29%5BH%3A41%5D%29%5BH%3A44%5D%29%5BH%3A43%5D%29%2F%5BH%3A30%5D%29%2F%5BH%3A29%5D%29%28%5BH%3A33%5D%29%5BH%3A34%5D%29%2F%5BH%3A31%5D%29%2F%5BN%3A4%5D%3D%5BO%3A1%5D%29%28%5BH%3A35%5D%29%5BH%3A36%5D%29%5BO%3A6%5D%5BH%3A28%5D'
}}}


comment:8 by pett, 2 years ago

Well, two things:

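(1) This is a limitation of tempfile.NamedTemporaryFile(): the whole SMILES string is passed as the file-name suffix, so a long string exceeds the file system's file-name length limit (Errno 36 in the Linux traceback above). I've now truncated the file-name parameter passed into the fetch_file() routine (which calls NamedTemporaryFile()), since we are not going to cache the result.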

(2) The value passed into fetch_smiles() should be the normal unquoted SMILES string, since fetch_smiles() will do its own quoting and you don't want two layers of quoting.
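
Both points can be illustrated with a small sketch. The names `temp_path_for` and `MAX_SUFFIX`, and the cap of 50 characters, are hypothetical and chosen only for illustration; the actual truncation in ChimeraX may differ.

```python
import os
import tempfile
from urllib.parse import quote

MAX_SUFFIX = 50  # hypothetical cap on the suffix length


def temp_path_for(name, max_suffix=MAX_SUFFIX):
    """Return a temp file path whose suffix is `name`, truncated.

    Mirrors the pattern in the traceback (create, read the name, close),
    but cuts the suffix short so the file name stays within limits.
    """
    suffix = name[:max_suffix]
    with tempfile.NamedTemporaryFile(suffix=suffix) as f:
        return f.name


smiles = "[N+:9]([H:25])" * 40       # deliberately long SMILES-like string
once = quote(smiles, safe="")        # one layer of quoting
twice = quote(once, safe="")         # quoting an already-quoted string

# Point (2): a second layer turns every '%' into '%25', so the server
# would receive a different string and the name gets longer still.
assert once.startswith("%5BN%2B")
assert twice.startswith("%255BN%252B")

# Point (1): with a truncated suffix the temp file name stays short.
path = temp_path_for(once)
print(len(once), len(os.path.basename(path)))
```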

comment:9 by Tristan Croll, 2 years ago

Well... it will now. :) It didn't in the version I was using. I realise
it's much, much less likely to hit the filename length limit after that
change.
