Opened 3 years ago
Closed 3 years ago
#6959 closed defect (fixed)
AlphaFold prediction energy OpenMM minimization always fails
Reported by: | Tom Goddard | Owned by: | Tom Goddard |
---|---|---|---|
Priority: | moderate | Milestone: | |
Component: | Structure Prediction | Version: | |
Keywords: | Cc: | ||
Blocked By: | Blocking: | ||
Notify when closed: | Platform: | all | |
Project: | ChimeraX |
Description
Tomas Fernandez noted in a YouTube comment on a ChimeraX AlphaFold video that AlphaFold runs that used to work are now failing in energy minimization.
Tomas writes:
"Nevertheless, and in order to confirm that this error is dependent on the predicted structure, I tried to predict the structure of a 70 amino acid protein whose model has been calculated several times through AlphaFold in classes by many students, and the error still pops up: ValueError Traceback (most recent call last) in () 692 seq_list = seq_list[1:] 693 --> 694 run_prediction(seq_list, energy_minimize = not dont_minimize) 5 frames /usr/local/lib/python3.7/dist-packages/alphafold/relax/amber_minimize.py in _run_one_iteration(pdb_string, max_iterations, tolerance, stiffness, restraint_set, max_attempts, use_gpu, exclude_residues) 417 logging.info(e) 418 if not minimized: --> 419 raise ValueError(f"Minimization failed after {max_attempts} attempts.") 420 retopt_time = time.time() - start 421 retmin_attempts = attempts ValueError: Minimization failed after 100 attempts."
I also tested 128 amino acid 7mrx chain A and it failed in the same way, although it works correctly with full AlphaFold on minsky.
So it looks like OpenMM is somehow broken now on Google Colab. ChimeraX installs fixed versions of AlphaFold, OpenMM and all dependencies, but possibly some implicit dependency without a pinned version number was updated and broke things. Another possibility is that Google Colab changed, for example updating its CUDA version, and that broke OpenMM.
Change History (12)
comment:1 by , 3 years ago
comment:2 by , 3 years ago
If I change to use_gpu=False in the AmberRelaxation() call in the Colab Python notebook, then the energy minimization succeeds. So the problem may have to do with OpenMM using the GPU.
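For anyone else hitting this, the workaround is a one-keyword change where the notebook constructs the relaxer. A minimal sketch; only use_gpu is the actual change, and the other keyword values here are illustrative AlphaFold defaults, not necessarily the notebook's exact settings:

from alphafold.relax import relax

# use_gpu=False makes OpenMM run the minimization on the CPU platform,
# avoiding the "No compatible CUDA device is available" failure on Colab.
amber_relaxer = relax.AmberRelaxation(
    max_iterations=0,        # illustrative default
    tolerance=2.39,          # illustrative default
    stiffness=10.0,          # illustrative default
    exclude_residues=[],
    max_outer_iterations=3,
    use_gpu=False)           # was use_gpu=True

# unrelaxed_protein here is a placeholder for the predicted structure object.
relaxed_pdb, _, _ = amber_relaxer.process(prot=unrelaxed_protein)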
comment:3 by , 3 years ago
Adding logging to AlphaFold
from absl import logging
logging.set_verbosity(logging.INFO)
shows OpenMM cannot use CUDA
Energy minimizing best structure model_4_ptm with OpenMM and Amber forcefield
INFO:absl:alterations info: {'nonstandard_residues': [], 'removed_heterogens': set(), 'missing_residues': {}, 'missing_heavy_atoms': {}, 'missing_terminals': {<Residue 127 (ARG) of chain 0>: ['OXT']}, 'Se_in_MET': [], 'removed_chains': {0: []}}
INFO:absl:Minimizing protein, attempt 1 of 100.
INFO:absl:Restraining 1022 / 1991 particles.
INFO:absl:No compatible CUDA device is available
INFO:absl:Minimizing protein, attempt 2 of 100.
INFO:absl:Restraining 1022 / 1991 particles.
INFO:absl:No compatible CUDA device is available
... 100 attempts then fails.
Here is the CUDA version according to the Colab terminal
/content# nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2020 NVIDIA Corporation
Built on Mon_Oct_12_20:09:46_PDT_2020
Cuda compilation tools, release 11.1, V11.1.105
Build cuda_11.1.TC455_06.29190527_0
/content# nvidia-smi
Wed May 25 22:36:39 2022
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 460.32.03    Driver Version: 460.32.03    CUDA Version: 11.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  Tesla P100-PCIE...  Off  | 00000000:00:04.0 Off |                    0 |
| N/A   39C    P0    33W / 250W |  16069MiB / 16280MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
+-----------------------------------------------------------------------------+
/content#
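Independent of AlphaFold, a quick Python-level check of what OpenMM itself can see. This is a minimal sketch using the OpenMM 7.5 simtk module, not code from the notebook:

from simtk.openmm import Platform

# List the platforms this OpenMM build loaded (Reference, CPU, CUDA, OpenCL).
for i in range(Platform.getNumPlatforms()):
    print(Platform.getPlatform(i).getName())

# Any platform plugins that failed to load (for example a CUDA plugin built
# against a mismatched cudatoolkit) are reported here.
print(Platform.getPluginLoadFailures())

The more thorough simtk.testInstallation check used in comment 9 below actually exercises each platform by computing forces.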
comment:4 by , 3 years ago
Conda installing OpenMM pulled in cudatoolkit 11.7.0, which was released 14 days ago. Maybe that is incompatible with the old OpenMM 7.5.1.
/opt/conda/bin/conda install -qy -c conda-forge python=3.7 openmm=7.5.1 pdbfixer
Collecting package metadata (current_repodata.json): ...working... done
Solving environment: ...working... failed with initial frozen solve. Retrying with flexible solve.
Solving environment: ...working... failed with repodata from current_repodata.json, will retry with next repodata source.
Collecting package metadata (repodata.json): ...working... done
Solving environment: ...working... done

## Package Plan ##

  environment location: /opt/conda

  added / updated specs:
    - openmm=7.5.1
    - pdbfixer
    - python=3.7

The following packages will be downloaded:

    package                       |  build               |      size
    ------------------------------|----------------------|----------------------
    ca-certificates-2022.5.18.1   | ha878542_0           |   144 KB  conda-forge
    certifi-2022.5.18.1           | py37h89c1867_0       |   150 KB  conda-forge
    cffi-1.14.6                   | py37hc58025e_0       |   225 KB  conda-forge
    colorama-0.4.4                | pyh9f0ad1d_0         |    18 KB  conda-forge
    conda-4.12.0                  | py37h89c1867_0       |   1.0 MB  conda-forge
    conda-package-handling-1.8.1  | py37h540881e_1       |   1.0 MB  conda-forge
    cryptography-37.0.2           | py37h38fbfac_0       |   1.5 MB  conda-forge
    cudatoolkit-11.7.0            | hd8887f6_10          | 831.6 MB  conda-forge
    fftw-3.3.10                   | nompi_h77c792f_102   |   6.4 MB  conda-forge
    libblas-3.9.0                 | 14_linux64_openblas  |    12 KB  conda-forge
    libcblas-3.9.0                | 14_linux64_openblas  |    12 KB  conda-forge
    libgfortran-ng-12.1.0         | h69a702a_16          |    23 KB  conda-forge
    libgfortran5-12.1.0           | hdcd56e2_16          |   1.8 MB  conda-forge
    liblapack-3.9.0               | 14_linux64_openblas  |    12 KB  conda-forge
    libopenblas-0.3.20            | pthreads_h78a6416_0  |  10.1 MB  conda-forge
    numpy-1.21.6                  | py37h976b520_0       |   6.1 MB  conda-forge
    ocl-icd-2.3.1                 | h7f98852_0           |   119 KB  conda-forge
    ocl-icd-system-1.0.0          | 1                    |     4 KB  conda-forge
    openmm-7.5.1                  | py37h96c4ddf_1       |  10.7 MB  conda-forge
    openssl-1.1.1o                | h166bdaf_0           |   2.1 MB  conda-forge
    pdbfixer-1.7                  | pyhd3deb0d_0         |   167 KB  conda-forge
    pip-22.1.1                    | pyhd8ed1ab_0         |   1.5 MB  conda-forge
    pycosat-0.6.3                 | py37h540881e_1010    |   107 KB  conda-forge
    pysocks-1.7.1                 | py37h89c1867_5       |    28 KB  conda-forge
    python-3.7.10                 | hffdb5ce_100_cpython |  57.3 MB  conda-forge
    python_abi-3.7                | 2_cp37m              |     4 KB  conda-forge
    ruamel_yaml-0.15.80           | py37h5e8e339_1006    |   270 KB  conda-forge
    setuptools-62.3.2             | py37h89c1867_0       |   1.4 MB  conda-forge
    six-1.16.0                    | pyh6c4a22f_0         |    14 KB  conda-forge
    tqdm-4.64.0                   | pyhd8ed1ab_0         |    81 KB  conda-forge
    urllib3-1.25.8                | py37hc8dfbb8_1       |   160 KB  conda-forge
    ------------------------------------------------------------
                                                   Total: 934.1 MB

The following NEW packages will be INSTALLED:

  colorama           conda-forge/noarch::colorama-0.4.4-pyh9f0ad1d_0
  cudatoolkit        conda-forge/linux-64::cudatoolkit-11.7.0-hd8887f6_10
  fftw               conda-forge/linux-64::fftw-3.3.10-nompi_h77c792f_102
  libblas            conda-forge/linux-64::libblas-3.9.0-14_linux64_openblas
  libcblas           conda-forge/linux-64::libcblas-3.9.0-14_linux64_openblas
  libgfortran-ng     conda-forge/linux-64::libgfortran-ng-12.1.0-h69a702a_16
  libgfortran5       conda-forge/linux-64::libgfortran5-12.1.0-hdcd56e2_16
  liblapack          conda-forge/linux-64::liblapack-3.9.0-14_linux64_openblas
  libopenblas        conda-forge/linux-64::libopenblas-0.3.20-pthreads_h78a6416_0
  numpy              conda-forge/linux-64::numpy-1.21.6-py37h976b520_0
  ocl-icd            conda-forge/linux-64::ocl-icd-2.3.1-h7f98852_0
  ocl-icd-system     conda-forge/linux-64::ocl-icd-system-1.0.0-1
  openmm             conda-forge/linux-64::openmm-7.5.1-py37h96c4ddf_1
  pdbfixer           conda-forge/noarch::pdbfixer-1.7-pyhd3deb0d_0
  python_abi         conda-forge/linux-64::python_abi-3.7-2_cp37m
  six                conda-forge/noarch::six-1.16.0-pyh6c4a22f_0

The following packages will be REMOVED:

  brotlipy-0.7.0-py39h27cfd23_1003

The following packages will be UPDATED:

  ca-certificates    pkgs/main::ca-certificates-2022.4.26-~ --> conda-forge::ca-certificates-2022.5.18.1-ha878542_0
  conda-package-han~ pkgs/main::conda-package-handling-1.8~ --> conda-forge::conda-package-handling-1.8.1-py37h540881e_1
  cryptography       pkgs/main::cryptography-37.0.1-py39h9~ --> conda-forge::cryptography-37.0.2-py37h38fbfac_0
  pip                pkgs/main/linux-64::pip-21.2.4-py39h0~ --> conda-forge/noarch::pip-22.1.1-pyhd8ed1ab_0
  pycosat            pkgs/main::pycosat-0.6.3-py39h27cfd23~ --> conda-forge::pycosat-0.6.3-py37h540881e_1010
  pysocks            pkgs/main::pysocks-1.7.1-py39h06a4308~ --> conda-forge::pysocks-1.7.1-py37h89c1867_5
  setuptools         pkgs/main::setuptools-61.2.0-py39h06a~ --> conda-forge::setuptools-62.3.2-py37h89c1867_0

The following packages will be SUPERSEDED by a higher-priority channel:

  certifi            pkgs/main::certifi-2022.5.18.1-py39h0~ --> conda-forge::certifi-2022.5.18.1-py37h89c1867_0
  cffi               pkgs/main::cffi-1.15.0-py39hd667e15_1 --> conda-forge::cffi-1.14.6-py37hc58025e_0
  conda              pkgs/main::conda-4.12.0-py39h06a4308_0 --> conda-forge::conda-4.12.0-py37h89c1867_0
  openssl            pkgs/main::openssl-1.1.1o-h7f8727e_0 --> conda-forge::openssl-1.1.1o-h166bdaf_0
  python             pkgs/main::python-3.9.12-h12debd9_0 --> conda-forge::python-3.7.10-hffdb5ce_100_cpython
  ruamel_yaml        pkgs/main::ruamel_yaml-0.15.100-py39h~ --> conda-forge::ruamel_yaml-0.15.80-py37h5e8e339_1006
  tqdm               pkgs/main/linux-64::tqdm-4.64.0-py39h~ --> conda-forge/noarch::tqdm-4.64.0-pyhd8ed1ab_0
  urllib3            pkgs/main::urllib3-1.26.9-py39h06a430~ --> conda-forge::urllib3-1.25.8-py37hc8dfbb8_1

Preparing transaction: ...working... done
Verifying transaction: ...working... done
Executing transaction: ...working...
By downloading and using the CUDA Toolkit conda packages, you accept the terms and conditions of the CUDA End User License Agreement (EULA): https://docs.nvidia.com/cuda/eula/index.html
done
+ cd /opt/conda/lib/python3.7/site-packages/
+ patch -p0
patching file simtk/openmm/app/topology.py
Hunk #1 succeeded at 353 (offset -3 lines).
+ ln -s /opt/conda/lib/python3.7/site-packages/simtk .
+ ln -s /opt/conda/lib/python3.7/site-packages/pdbfixer .
comment:5 by , 3 years ago
I tried conda installing cudatoolkit 11.6.0, which came out 4 months ago, instead of cudatoolkit 11.7.0, but got the same OpenMM error "No compatible CUDA device is available".
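The test amounted to adding a cudatoolkit pin to the conda install line used above; presumably something like (illustrative, not the exact line from the install script):

/opt/conda/bin/conda install -qy -c conda-forge python=3.7 openmm=7.5.1 pdbfixer cudatoolkit=11.6.0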
comment:6 by , 3 years ago
Full AlphaFold has use_gpu_relax off by default, but if I turn it on and run 7mrx chain A on minsky it minimizes without errors. So the problem appears to be with OpenMM using the GPU on Google Colab; something is wrong with the GPU CUDA configuration on Colab.
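For reference, turning GPU relaxation on in full AlphaFold is just a boolean command-line flag to run_alphafold.py; a sketch, with the data paths and other required flags omitted:

python run_alphafold.py --use_gpu_relax=true ...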
comment:7 by , 3 years ago
AlphaFold's standard Google Colab notebook (provided by DeepMind) also fails in OpenMM minimization in the same way. This test was run on my Colab Pro account with an Nvidia P100 GPU.
AMBER relaxation: 86% 6/7 [elapsed: 26:10 remaining: 04:04]
/usr/local/lib/python3.7/dist-packages/jax/_src/tree_util.py:189: FutureWarning: jax.tree_util.tree_multimap() is deprecated. Please use jax.tree_util.tree_map() instead as a drop-in replacement.
  'instead as a drop-in replacement.', FutureWarning)
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-5-f89616eedc2d> in <module>()
     88           max_outer_iterations=3,
     89           use_gpu=True)
---> 90     relaxed_pdb, _, _ = amber_relaxer.process(prot=unrelaxed_proteins[best_model_name])
     91   else:
     92     print('Warning: Running without the relaxation stage.')

2 frames
/opt/conda/lib/python3.7/site-packages/alphafold/relax/relax.py in process(self, prot)
     64         exclude_residues=self._exclude_residues,
     65         max_outer_iterations=self._max_outer_iterations,
---> 66         use_gpu=self._use_gpu)
     67     min_pos = out['pos']
     68     start_pos = out['posinit']

/opt/conda/lib/python3.7/site-packages/alphafold/relax/amber_minimize.py in run_pipeline(prot, stiffness, use_gpu, max_outer_iterations, place_hydrogens_every_iteration, max_iterations, tolerance, restraint_set, max_attempts, checks, exclude_residues)
    481         restraint_set=restraint_set,
    482         max_attempts=max_attempts,
--> 483         use_gpu=use_gpu)
    484     prot = protein.from_pdb_string(ret["min_pdb"])
    485     if place_hydrogens_every_iteration:

/opt/conda/lib/python3.7/site-packages/alphafold/relax/amber_minimize.py in _run_one_iteration(pdb_string, max_iterations, tolerance, stiffness, restraint_set, max_attempts, use_gpu, exclude_residues)
    417       logging.info(e)
    418   if not minimized:
--> 419     raise ValueError(f"Minimization failed after {max_attempts} attempts.")
    420   ret["opt_time"] = time.time() - start
    421   ret["min_attempts"] = attempts

ValueError: Minimization failed after 100 attempts.
comment:8 by , 3 years ago
Testing whether CUDA is working on Google Colab with numba from conda, using the command "numba -s", suggests it is working, though it is not entirely clear. What does "CUDA NVIDIA Bindings Available: False" mean?
# /opt/conda/bin/numba -s
System info:
--------------------------------------------------------------------------------
__Time Stamp__
Report started (local time)                   : 2022-05-26 00:50:04.490246
UTC start time                                : 2022-05-26 00:50:04.490258
Running time (s)                              : 0.661247

__Hardware Information__
Machine                                       : x86_64
CPU Name                                      : skylake-avx512
CPU Count                                     : 2
Number of accessible CPUs                     : 2
List of accessible CPUs cores                 : 0-1
CFS Restrictions (CPUs worth of runtime)      : None
CPU Features                                  : 64bit adx aes avx avx2 avx512bw avx512cd avx512dq avx512f avx512vl bmi bmi2 clflushopt clwb cmov cx16 cx8 f16c fma fsgsbase fxsr invpcid lzcnt mmx movbe pclmul popcnt prfchw rdrnd rdseed rtm sahf sse sse2 sse3 sse4.1 sse4.2 ssse3 xsave xsavec xsaveopt xsaves
Memory Total (MB)                             : 12986
Memory Available (MB)                         : 2171

__OS Information__
Platform Name                                 : Linux-5.4.188+-x86_64-with-debian-buster-sid
Platform Release                              : 5.4.188+
OS Name                                       : Linux
OS Version                                    : #1 SMP Sun Apr 24 10:03:06 PDT 2022
OS Specific Version                           : ?

CUDA NVIDIA Bindings Available                : False
CUDA NVIDIA Bindings In Use                   : False
CUDA Detect Output:
Found 1 CUDA devices
id 0    b'Tesla P100-PCIE-16GB'    [SUPPORTED]
                      Compute Capability: 6.0
                           PCI Device ID: 4
                              PCI Bus ID: 0
                                    UUID: GPU-5f0dd9f9-fda9-7793-69c8-720784ac49d6
                                Watchdog: Disabled
             FP32/FP64 Performance Ratio: 2
Summary:
    1/1 devices are supported

CUDA Libraries Test Output:
Finding nvvm from Conda environment
    named libnvvm.so.4.0.0
    trying to open library... ok
Finding cudart from Conda environment
    named libcudart.so.11.6.55
    trying to open library... ok
Finding cudadevrt from Conda environment
    named libcudadevrt.a
Finding libdevice from Conda environment
    searching for compute_20... ok
    searching for compute_30... ok
    searching for compute_35... ok
    searching for compute_50... ok

__SVML Information__
SVML State, config.USING_SVML                 : False
SVML Library Loaded                           : False
llvmlite Using SVML Patched LLVM              : True
SVML Operational                              : False

__Threading Layer Information__
TBB Threading Layer Available                 : True
+-->TBB imported successfully.
OpenMP Threading Layer Available              : True
+-->Vendor: GNU
Workqueue Threading Layer Available           : True
+-->Workqueue imported successfully.

__Numba Environment Variable Information__
None found.

__Conda Information__
Conda not available.

__Installed Packages__
Package                 Version
----------------------  -----------
brotlipy                0.7.0
certifi                 2022.5.18.1
cffi                    1.15.0
charset-normalizer      2.0.4
conda                   4.12.0
conda-package-handling  1.8.1
cryptography            37.0.1
idna                    3.3
llvmlite                0.38.0
numba                   0.55.1
numpy                   1.21.6
OpenMM                  7.5.1
pdbfixer                1.7
pip                     21.2.2
pycosat                 0.6.3
pycparser               2.21
pyOpenSSL               22.0.0
PySocks                 1.7.1
requests                2.27.1
ruamel-yaml-conda       0.15.100
setuptools              61.2.0
tqdm                    4.64.0
urllib3                 1.26.9
wheel                   0.37.1

No errors reported.

__Warning log__
Warning: Conda not available.
Error was [Errno 2] No such file or directory: 'conda': 'conda'
Warning (psutil): psutil cannot be imported. For more accuracy, consider installing it.
Warning (no file): /sys/fs/cgroup/cpuacct/cpu.cfs_quota_us
Warning (no file): /sys/fs/cgroup/cpuacct/cpu.cfs_period_us
--------------------------------------------------------------------------------
If requested, please copy and paste the information between the dashed (----) lines, or from a given specific section as appropriate.

=============================================================
IMPORTANT: Please ensure that you are happy with sharing the contents of the information present, any information that you wish to keep private you should remove before sharing.
=============================================================
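The same device detection can be run directly from Python with numba; a minimal sketch, separate from the install code. As far as I can tell, "CUDA NVIDIA Bindings Available: False" only means the optional cuda-python bindings package is not installed, so numba falls back to its own ctypes bindings; it does not by itself indicate a broken GPU.

from numba import cuda

# Prints the same device list as the "CUDA Detect Output" section of numba -s
# and returns True if at least one supported CUDA device is found.
cuda.detect()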
comment:9 by , 3 years ago
Testing OpenMM on Google Colab gives the same no compatible CUDA device error
# /opt/conda/bin/python -m simtk.testInstallation

OpenMM Version: 7.5.1
Git Revision: a9cfd7fb9343e21c3dbb76e377c721328830a3ee

There are 4 Platforms available:

1 Reference - Successfully computed forces
2 CPU - Successfully computed forces
3 CUDA - Error computing forces with CUDA platform
4 OpenCL - Error computing forces with OpenCL platform

CUDA platform error: No compatible CUDA device is available
OpenCL platform error: Error initializing context: clCreateContext (-6)

Median difference in forces between platforms:

Reference vs. CPU: 6.29577e-06

All differences are within tolerance.
Testing OpenMM on minsky shows it can use CUDA
goddard@minsky:~/ucsf/af/runs$ python -m simtk.testInstallation

OpenMM Version: 7.5.1
Git Revision: a9cfd7fb9343e21c3dbb76e377c721328830a3ee

There are 4 Platforms available:

1 Reference - Successfully computed forces
2 CPU - Successfully computed forces
3 CUDA - Successfully computed forces
4 OpenCL - Successfully computed forces

Median difference in forces between platforms:

Reference vs. CPU: 6.29406e-06
Reference vs. CUDA: 6.73166e-06
CPU vs. CUDA: 7.34172e-07
Reference vs. OpenCL: 6.75475e-06
CPU vs. OpenCL: 8.05102e-07
CUDA vs. OpenCL: 2.71517e-07

All differences are within tolerance.
comment:10 by , 3 years ago
comment:11 by , 3 years ago
I think the problem is that conda has decided to install cudatoolkit version 11.7.0, which is too new for the Google Colab Nvidia driver. Versions 11.6.0, 11.5.0 and 11.3.1 are also too new, but 11.2.2 and 11.0.3 worked. I tested these different cudatoolkit versions with OpenMM on Google Colab, without AlphaFold, using the following notebook code to install OpenMM and run the OpenMM test code. nvcc reports the Colab CUDA version is 11.1, and maybe that is supposed to match the conda cudatoolkit version.
def run_shell_commands(commands, filename, install_log):
    with open(filename, 'w') as f:
        f.write(commands)
    # The -x option logs each command with a prompt in front of it.
    !bash -x "{filename}" >> "{install_log}" 2>&1
    if _exit_code != 0:
        raise RuntimeError('Error running shell script %s, output in log file %s'
                           % (filename, install_log))

def install_openmm(
        conda_install_sh = 'https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh',
        install_log = 'install_log.txt'):
    '''Must install alphafold first since an openmm patch from alphafold is used.'''

    # Install Conda
    import os.path
    conda_install = os.path.join('/tmp', os.path.basename(conda_install_sh))
    cmds = f'''
# Exit if any command fails
set -e

wget -q -P /tmp {conda_install_sh} \
    && bash "{conda_install}" -b -p /opt/conda -f \
    && rm "{conda_install}"

# Install Python, OpenMM and pdbfixer in Conda
/opt/conda/bin/conda update -qy conda && \
    /opt/conda/bin/conda install -qy -c conda-forge python=3.7 openmm=7.5.1 cudatoolkit=11.0.3
'''
    run_shell_commands(cmds, 'install_openmm.sh', install_log)

install_openmm()

!/opt/conda/bin/python -m simtk.testInstallation
comment:12 by , 3 years ago
Resolution: | → fixed |
---|---|
Status: | assigned → closed |
Fixed.
Specified conda cudatoolkit version 11.2 in the Colab OpenMM install script to ensure it is compatible with the Colab CUDA driver.
It is not clear whether Google Colab or Conda changed so that conda no longer automatically chooses a compatible cudatoolkit version. Also not sure how long it has been broken.
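For reference, the fixed install line is essentially the one from the test notebook above with the pin changed to 11.2; roughly (the exact line in the ChimeraX Colab script may differ slightly):

/opt/conda/bin/conda install -qy -c conda-forge python=3.7 openmm=7.5.1 pdbfixer cudatoolkit=11.2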