Opened 4 years ago

Closed 4 years ago

#6026 closed defect (nonchimerax)

Incomplete retrieval of AlphaFold prediction

Reported by: thomas.lenz@…
Owned by: Tom Goddard
Priority: normal
Milestone:
Component: Structure Prediction
Version:
Keywords:
Cc:
Blocked By:
Blocking:
Notify when closed:
Platform: all
Project: ChimeraX

Description

The following bug report has been submitted:
Platform:        Windows-10-10.0.14393
ChimeraX Version: 1.4.dev202201260121 (2022-01-26 01:21:26 UTC)
Description
ChimeraX 1.4.dev202201260121
Tools --> Structure Prediction --> AlphaFold
Sequence: Uniprot identifier: Q16539,P49137
--> Predict

sequences:
MSQERPTFYRQELNKTIWEVPERYQNLSPVGSGAYGSVCAAFDTKTGLRVAVKKLSRPFQSIIHAKRTYRELRLLKHMKHENVIGLLDVFTPARSLEEFNDVYLVTHLMGADLNNIVKCQKLTDDHVQFLIYQILRGLKYIHSADIIHRDLKPSNLAVNEDCELKILDFGLARHTDDEMTGYVATRWYRAPEIMLNWMHYNQTVDIWSVGCIMAELLTGRTLFPGTDHIDQLKLILRLVGTPGAELLKKISSESARNYIQSLTQMPKMNFANVFIGANPLAVDLLEKMLVLDSDKRITAAQALAHAYFAQYHDPDDEPVADPYDQSFESRDLLIDEWKSLTYDEVISFVPPPLDQEEMES,
MLSNSQGQSPPVPFPAPAPPPQPPTPALPHPPAQPPPPPPQQFPQFHVKSGLQIKKNAIIDDYKVTSQVLGLGINGKVLQIFNKRTQEKFALKMLQDCPKARREVELHWRASQCPHIVRIVDVYENLYAGRKCLLIVMECLDGGELFSRIQDRGDQAFTEREASEIMKSIGEAIQYLHSINIAHRDVKPENLLYTSKRPNAILKLTDFGFAKETTSHNSLTTPCYTPYYVAPEVLGPEKYDKSCDMWSLGVIMYILLCGYPPFYSNHGLAISPGMKTRIRMGQYEFPNPEWSEVSEEVKMLIRNLLKTEPTQRMTITEFMNHPWIMQSTKVPQTPLHTSRVLKEDKERWEDVKEEMTSALATMRVDYEQIKIKKIEDASNPLLLKRRKKARALEAAALAH

2 sequences, total length 760 (= 360 + 400)
Have Colab GPU runtime
Searching sequence databases (245 Gbytes).
Search will take 49 minutes or more.
Finding fastest mirror for sequence databases using europe
Searching uniprot sequence database, 98 Gbytes
 1 2 3 4 5
---------------------------------------------------------------------------
ContentTooShortError                      Traceback (most recent call last)
<ipython-input-2-477933d0a925> in <module>()
    649     seq_list = seq_list[1:]
    650 
--> 651 run_prediction(seq_list, is_prokaryote=is_prokaryote)

8 frames
/usr/lib/python3.7/urllib/request.py in urlretrieve(url, filename, reporthook, data)
    286         raise ContentTooShortError(
    287             "retrieval incomplete: got only %i out of %i bytes"
--> 288             % (read, size), result)
    289 
    290     return result

ContentTooShortError: <urlopen error retrieval incomplete: got only 838709867 out of 1073742050 bytes>

Log:
Startup Messages  
---  
note | available bundle cache has not been initialized yet  
  
UCSF ChimeraX version: 1.4.dev202201260121 (2022-01-26)  
© 2016-2021 Regents of the University of California. All rights reserved.  
How to cite UCSF ChimeraX  

> ui tool show AlphaFold

> alphafold match Q16539,P49137

Fetching AlphaFold database settings from
https://www.rbvi.ucsf.edu/chimerax/data/status/alphafold_database.json  
Fetching compressed AlphaFold Q16539 from
https://alphafold.ebi.ac.uk/files/AF-Q16539-F1-model_v2.cif  
Fetching compressed AlphaFold P49137 from
https://alphafold.ebi.ac.uk/files/AF-P49137-F1-model_v2.cif  
2 AlphaFold models found using UniProt identifiers: Q16539 (UniProt Q16539),
P49137 (UniProt P49137)  
Opened 2 AlphaFold models  

> alphafold predict Q16539,P49137

Running AlphaFold prediction  
[Repeated 2 time(s)]

> color bfactor #1 palette alphafold

2907 atoms, 360 residues, atom bfactor range 27.1 to 98.9  

> hide #2 models

> color bfactor #2 palette alphafold

3202 atoms, 400 residues, atom bfactor range 30.9 to 98.8  

> hide #1 models

> show #2 models

> matchmaker #2 #1

Missing required "to" argument  

> matchmaker #2 to #1

Parameters  
---  
Chain pairing | bb  
Alignment algorithm | Needleman-Wunsch  
Similarity matrix | BLOSUM-62  
SS fraction | 0.3  
Gap open (HH/SS/other) | 18/18/6  
Gap extend | 1  
SS matrix |  |  | H | S | O  
---|---|---|---  
H | 6 | -9 | -6  
S |  | 6 | -6  
O |  |  | 4  
Iteration cutoff | 2  
  
Matchmaker AlphaFold Q16539, chain A (#1) with AlphaFold P49137, chain A (#2),
sequence alignment score = 365.6  
RMSD between 119 pruned atom pairs is 1.085 angstroms; (across all 314 pairs:
18.825)  
  

> show #1 models

> hide #1 models

> show #1 models

> hide #2 models

> show #2 models

> hide #2 models

> show #2 models

> hide #2 models

> show #2 models

> hide #2 models

> color #1 blue

> color #2 red

> show #2 models

> load Q16644

Unknown command: load Q16644  

> open Q16644

'Q16644' has no suffix  
Fetching Q16644 UniProt info from https://www.uniprot.org/uniprot/Q16644.xml  

> alphafold match Q16644

Fetching compressed AlphaFold Q16644 from
https://alphafold.ebi.ac.uk/files/AF-Q16644-F1-model_v2.cif  
1 AlphaFold model found using UniProt identifier: Q16644 (UniProt Q16644)  
Opened 1 AlphaFold model  
Fetching Q15746 UniProt info from https://www.uniprot.org/uniprot/Q15746.xml  

> alphafold match Q15746

Fetching compressed AlphaFold Q15746 from
https://alphafold.ebi.ac.uk/files/AF-Q15746-F1-model_v2.cif  
1 AlphaFold model found using UniProt identifier: Q15746 (UniProt Q15746)  
Opened 1 AlphaFold model  

> color #3 green

> color #4 yellow

> matchmaker #3 to #1

Parameters  
---  
Chain pairing | bb  
Alignment algorithm | Needleman-Wunsch  
Similarity matrix | BLOSUM-62  
SS fraction | 0.3  
Gap open (HH/SS/other) | 18/18/6  
Gap extend | 1  
SS matrix |  |  | H | S | O  
---|---|---|---  
H | 6 | -9 | -6  
S |  | 6 | -6  
O |  |  | 4  
Iteration cutoff | 2  
  
Matchmaker AlphaFold Q16539, chain A (#1) with AlphaFold Q16644, chain A (#3),
sequence alignment score = 378.5  
RMSD between 135 pruned atom pairs is 1.136 angstroms; (across all 309 pairs:
17.338)  
  

> matchmaker #4 to #1

Parameters  
---  
Chain pairing | bb  
Alignment algorithm | Needleman-Wunsch  
Similarity matrix | BLOSUM-62  
SS fraction | 0.3  
Gap open (HH/SS/other) | 18/18/6  
Gap extend | 1  
SS matrix |  |  | H | S | O  
---|---|---|---  
H | 6 | -9 | -6  
S |  | 6 | -6  
O |  |  | 4  
Iteration cutoff | 2  
  
Matchmaker AlphaFold Q16539, chain A (#1) with AlphaFold Q15746, chain A (#4),
sequence alignment score = 414  
RMSD between 124 pruned atom pairs is 0.970 angstroms; (across all 306 pairs:
20.792)  
  

> hide cartoons

> show cartoons

> hide atoms

> show atoms

> hide atoms

> hide #1 models

> hide #4 models

Drag select of 5 residues  
Drag select of 28 residues  

> select #3

3016 atoms, 3084 bonds, 382 residues, 1 model selected  

> ~select #3

Nothing selected  

> hide #3 models

> show #1 models

> show #4 models

> hide #2 models

> hide #1 models

> show #1 models

> hide #1 models

> show #2 models

> show #1 models

> hide #4 models

> show #4 models

> hide #4 models

> show #3 models

> show #4 models

> alphafold predict Q16539,P49137

Running AlphaFold prediction  

> help help:credits.html




OpenGL version: 3.3.0 NVIDIA 353.62
OpenGL renderer: GeForce GT 630M/PCIe/SSE2
OpenGL vendor: NVIDIA Corporation

Locale: de_DE.cp1252
Qt version: PyQt5 5.15.2, Qt 5.15.2
Qt platform: windows

Manufacturer: Dell Inc.
Model: XPS L421X
OS: Microsoft Windows 10 Enterprise 2016 LTSB (Build 14393)
Memory: 8,461,746,176
MaxProcessMemory: 137,438,953,344
CPU: 4 Intel(R) Core(TM) i7-3667U CPU @ 2.00GHz
OSLanguage: de-DE

Installed Packages:
    alabaster: 0.7.12
    appdirs: 1.4.4
    Babel: 2.9.1
    backcall: 0.2.0
    blockdiag: 3.0.0
    certifi: 2021.10.8
    cftime: 1.5.2
    charset-normalizer: 2.0.10
    ChimeraX-AddCharge: 1.2.2
    ChimeraX-AddH: 2.1.11
    ChimeraX-AlignmentAlgorithms: 2.0
    ChimeraX-AlignmentHdrs: 3.2
    ChimeraX-AlignmentMatrices: 2.0
    ChimeraX-Alignments: 2.2.3
    ChimeraX-AlphaFold: 1.0
    ChimeraX-AltlocExplorer: 1.0.1
    ChimeraX-AmberInfo: 1.0
    ChimeraX-Arrays: 1.0
    ChimeraX-Atomic: 1.33.1
    ChimeraX-AtomicLibrary: 5.0
    ChimeraX-AtomSearch: 2.0
    ChimeraX-AtomSearchLibrary: 1.0
    ChimeraX-AxesPlanes: 2.1
    ChimeraX-BasicActions: 1.1
    ChimeraX-BILD: 1.0
    ChimeraX-BlastProtein: 2.0
    ChimeraX-BondRot: 2.0
    ChimeraX-BugReporter: 1.0
    ChimeraX-BuildStructure: 2.6.1
    ChimeraX-Bumps: 1.0
    ChimeraX-BundleBuilder: 1.1
    ChimeraX-ButtonPanel: 1.0
    ChimeraX-CageBuilder: 1.0
    ChimeraX-CellPack: 1.0
    ChimeraX-Centroids: 1.2
    ChimeraX-ChemGroup: 2.0
    ChimeraX-Clashes: 2.2.2
    ChimeraX-ColorActions: 1.0
    ChimeraX-ColorGlobe: 1.0
    ChimeraX-ColorKey: 1.5.1
    ChimeraX-CommandLine: 1.2
    ChimeraX-ConnectStructure: 2.0
    ChimeraX-Contacts: 1.0
    ChimeraX-Core: 1.4.dev202201260121
    ChimeraX-CoreFormats: 1.1
    ChimeraX-coulombic: 1.3.2
    ChimeraX-Crosslinks: 1.0
    ChimeraX-Crystal: 1.0
    ChimeraX-CrystalContacts: 1.0
    ChimeraX-DataFormats: 1.2.2
    ChimeraX-Dicom: 1.0
    ChimeraX-DistMonitor: 1.1.5
    ChimeraX-Dssp: 2.0
    ChimeraX-EMDB-SFF: 1.0
    ChimeraX-ExperimentalCommands: 1.0
    ChimeraX-FileHistory: 1.0
    ChimeraX-FunctionKey: 1.0
    ChimeraX-Geometry: 1.1
    ChimeraX-gltf: 1.0
    ChimeraX-Graphics: 1.1
    ChimeraX-Hbonds: 2.1.2
    ChimeraX-Help: 1.2
    ChimeraX-HKCage: 1.3
    ChimeraX-IHM: 1.1
    ChimeraX-ImageFormats: 1.2
    ChimeraX-IMOD: 1.0
    ChimeraX-IO: 1.0.1
    ChimeraX-ItemsInspection: 1.0
    ChimeraX-Label: 1.1
    ChimeraX-ListInfo: 1.1.1
    ChimeraX-Log: 1.1.5
    ChimeraX-LookingGlass: 1.1
    ChimeraX-Maestro: 1.8.1
    ChimeraX-Map: 1.1
    ChimeraX-MapData: 2.0
    ChimeraX-MapEraser: 1.0
    ChimeraX-MapFilter: 2.0
    ChimeraX-MapFit: 2.0
    ChimeraX-MapSeries: 2.1
    ChimeraX-Markers: 1.0
    ChimeraX-Mask: 1.0
    ChimeraX-MatchMaker: 2.0.6
    ChimeraX-MDcrds: 2.6
    ChimeraX-MedicalToolbar: 1.0.1
    ChimeraX-Meeting: 1.0
    ChimeraX-MLP: 1.1
    ChimeraX-mmCIF: 2.5
    ChimeraX-MMTF: 2.1
    ChimeraX-Modeller: 1.5.1
    ChimeraX-ModelPanel: 1.3.1
    ChimeraX-ModelSeries: 1.0
    ChimeraX-Mol2: 2.0
    ChimeraX-Morph: 1.0
    ChimeraX-MouseModes: 1.1
    ChimeraX-Movie: 1.0
    ChimeraX-Neuron: 1.0
    ChimeraX-Nucleotides: 2.0.2
    ChimeraX-OpenCommand: 1.8
    ChimeraX-PDB: 2.6.5
    ChimeraX-PDBBio: 1.0
    ChimeraX-PDBLibrary: 1.0.2
    ChimeraX-PDBMatrices: 1.0
    ChimeraX-PickBlobs: 1.0
    ChimeraX-Positions: 1.0
    ChimeraX-PresetMgr: 1.1
    ChimeraX-PubChem: 2.1
    ChimeraX-ReadPbonds: 1.0.1
    ChimeraX-Registration: 1.1
    ChimeraX-RemoteControl: 1.0
    ChimeraX-ResidueFit: 1.0
    ChimeraX-RestServer: 1.1
    ChimeraX-RNALayout: 1.0
    ChimeraX-RotamerLibMgr: 2.0.1
    ChimeraX-RotamerLibsDunbrack: 2.0
    ChimeraX-RotamerLibsDynameomics: 2.0
    ChimeraX-RotamerLibsRichardson: 2.0
    ChimeraX-SaveCommand: 1.5
    ChimeraX-SchemeMgr: 1.0
    ChimeraX-SDF: 2.0
    ChimeraX-Segger: 1.0
    ChimeraX-Segment: 1.0
    ChimeraX-SelInspector: 1.0
    ChimeraX-SeqView: 2.4.6
    ChimeraX-Shape: 1.0.1
    ChimeraX-Shell: 1.0
    ChimeraX-Shortcuts: 1.1
    ChimeraX-ShowAttr: 1.0
    ChimeraX-ShowSequences: 1.0
    ChimeraX-SideView: 1.0
    ChimeraX-Smiles: 2.1
    ChimeraX-SmoothLines: 1.0
    ChimeraX-SpaceNavigator: 1.0
    ChimeraX-StdCommands: 1.7.4
    ChimeraX-STL: 1.0
    ChimeraX-Storm: 1.0
    ChimeraX-StructMeasure: 1.0.1
    ChimeraX-Struts: 1.0.1
    ChimeraX-Surface: 1.0
    ChimeraX-SwapAA: 2.0
    ChimeraX-SwapRes: 2.1.1
    ChimeraX-TapeMeasure: 1.0
    ChimeraX-Test: 1.0
    ChimeraX-Toolbar: 1.1
    ChimeraX-ToolshedUtils: 1.2.1
    ChimeraX-Tug: 1.0
    ChimeraX-UI: 1.16
    ChimeraX-uniprot: 2.2
    ChimeraX-UnitCell: 1.0
    ChimeraX-ViewDockX: 1.0.1
    ChimeraX-VIPERdb: 1.0
    ChimeraX-Vive: 1.1
    ChimeraX-VolumeMenu: 1.0
    ChimeraX-VTK: 1.0
    ChimeraX-WavefrontOBJ: 1.0
    ChimeraX-WebCam: 1.0
    ChimeraX-WebServices: 1.0
    ChimeraX-Zone: 1.0
    colorama: 0.4.4
    comtypes: 1.1.10
    cxservices: 1.1
    cycler: 0.11.0
    Cython: 0.29.26
    debugpy: 1.5.1
    decorator: 5.1.1
    docutils: 0.17.1
    entrypoints: 0.3
    filelock: 3.4.2
    fonttools: 4.29.0
    funcparserlib: 1.0.0a0
    grako: 3.16.5
    h5py: 3.6.0
    html2text: 2020.1.16
    idna: 3.3
    ihm: 0.26
    imagecodecs: 2021.11.20
    imagesize: 1.3.0
    ipykernel: 6.6.1
    ipython: 7.31.1
    ipython-genutils: 0.2.0
    jedi: 0.18.1
    Jinja2: 3.0.3
    jupyter-client: 7.1.0
    jupyter-core: 4.9.1
    kiwisolver: 1.3.2
    line-profiler: 3.4.0
    lxml: 4.7.1
    lz4: 3.1.10
    MarkupSafe: 2.0.1
    matplotlib: 3.5.1
    matplotlib-inline: 0.1.3
    msgpack: 1.0.3
    nest-asyncio: 1.5.4
    netCDF4: 1.5.8
    networkx: 2.6.3
    numexpr: 2.8.1
    numpy: 1.22.1
    openvr: 1.16.802
    packaging: 21.3
    ParmEd: 3.4.3
    parso: 0.8.3
    pickleshare: 0.7.5
    Pillow: 9.0.0
    pip: 21.3.1
    pkginfo: 1.8.2
    prompt-toolkit: 3.0.24
    psutil: 5.9.0
    pycollada: 0.7.2
    pydicom: 2.2.2
    Pygments: 2.11.2
    PyOpenGL: 3.1.5
    PyOpenGL-accelerate: 3.1.5
    pyparsing: 3.0.7
    PyQt5-commercial: 5.15.2
    PyQt5-sip: 12.8.1
    PyQtWebEngine-commercial: 5.15.2
    python-dateutil: 2.8.2
    pytz: 2021.3
    pywin32: 303
    pyzmq: 22.3.0
    qtconsole: 5.2.2
    QtPy: 2.0.0
    RandomWords: 0.3.0
    requests: 2.27.1
    scipy: 1.7.3
    setuptools: 59.8.0
    sfftk-rw: 0.7.1
    six: 1.16.0
    snowballstemmer: 2.2.0
    sortedcontainers: 2.4.0
    Sphinx: 4.3.2
    sphinx-autodoc-typehints: 1.15.2
    sphinxcontrib-applehelp: 1.0.2
    sphinxcontrib-blockdiag: 3.0.0
    sphinxcontrib-devhelp: 1.0.2
    sphinxcontrib-htmlhelp: 2.0.0
    sphinxcontrib-jsmath: 1.0.1
    sphinxcontrib-qthelp: 1.0.3
    sphinxcontrib-serializinghtml: 1.1.5
    suds-community: 1.0.0
    tables: 3.7.0
    tifffile: 2021.11.2
    tinyarray: 1.2.4
    tornado: 6.1
    traitlets: 5.1.1
    urllib3: 1.26.8
    wcwidth: 0.2.5
    webcolors: 1.11.1
    wheel: 0.37.1
    wheel-filename: 1.3.0
    WMI: 1.5.1

Change History (2)

comment:1 by pett, 4 years ago

Component: Unassigned → Structure Prediction
Owner: set to Tom Goddard
Platform: all
Project: ChimeraX
Status: new → assigned
Summary: ChimeraX bug report submission → Incomplete retrieval of AlphaFold prediction

comment:2 by Tom Goddard, 4 years ago

Resolution: nonchimerax
Status: assigned → closed

The error you report in ChimeraX running AlphaFold

"ContentTooShortError: <urlopen error retrieval incomplete: got only 838709867 out of 1073742050 bytes>"

comes from AlphaFold running on the Google Colab server, not from ChimeraX. At the start of a prediction, AlphaFold reads the UniProt database in 1 Gbyte chunks to build a sequence alignment. It had read 5 chunks (5 Gbytes) and gave this error while trying to read the 6th chunk (of 98). It was reading the chunk from the European mirror of the database maintained by Google, and the download returned only 838 Mbytes of the expected 1 Gbyte. Why? I don't know; maybe their server went down.
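
To make the numbers in that message concrete, here is a small sketch that pulls the two byte counts out of the error text and works out how much of the chunk arrived. The function name `parse_short_read` is made up for this illustration; it is not part of ChimeraX or AlphaFold.

```python
import re

# The error text exactly as it appeared in the Colab traceback.
MESSAGE = ("ContentTooShortError: <urlopen error retrieval incomplete: "
           "got only 838709867 out of 1073742050 bytes>")


def parse_short_read(message):
    """Return (bytes_received, bytes_expected) parsed from the error text.

    Hypothetical helper for illustration only.
    """
    match = re.search(r"got only (\d+) out of (\d+) bytes", message)
    if match is None:
        raise ValueError("not a 'retrieval incomplete' message")
    return int(match.group(1)), int(match.group(2))


received, expected = parse_short_read(MESSAGE)
missing = expected - received          # bytes that never arrived
fraction = received / expected         # share of the chunk that was downloaded

print(f"received {received} of {expected} bytes "
      f"({fraction:.1%}), missing {missing} bytes")
```

So the connection dropped a bit more than three quarters of the way through the 1 Gbyte chunk.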

Unfortunately, AlphaFold is very fragile software, so instead of trying to fetch the chunk of the UniProt database again, it simply fails.

The only solution is to try running the prediction again, maybe after waiting a bit for Google's UniProt database server to recover.
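
For illustration only, this is roughly what retrying an incomplete download looks like in Python. It is a minimal sketch, not the code AlphaFold or the ChimeraX Colab notebook uses: the function name `fetch_with_retry`, its parameters, and the back-off delays are all assumptions made for this example. Only `urllib.request.urlretrieve` and `urllib.error.ContentTooShortError` come from the traceback above.

```python
import time
import urllib.request
from urllib.error import ContentTooShortError


def fetch_with_retry(url, filename, attempts=3, delay=5.0,
                     retrieve=urllib.request.urlretrieve, sleep=time.sleep):
    """Download url to filename, retrying when the transfer is cut short.

    Hypothetical helper: `retrieve` and `sleep` are parameters only so the
    function can be exercised without a network connection.
    """
    for attempt in range(1, attempts + 1):
        try:
            # urlretrieve raises ContentTooShortError when fewer bytes
            # arrive than the Content-Length header promised.
            return retrieve(url, filename)
        except ContentTooShortError:
            if attempt == attempts:
                raise  # give up after the last attempt
            # Wait longer after each failure to give the server time to recover.
            sleep(delay * attempt)
```

A call such as `fetch_with_retry(chunk_url, "chunk.fasta")` (both arguments are placeholders) would then survive one or two short reads instead of aborting the whole prediction on the first one.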
