Opened 3 years ago

Closed 2 years ago

#8975 closed defect (fixed)

Fix AlphaFold energy minimization

Reported by: Tom Goddard
Owned by: Tom Goddard
Priority: high
Milestone:
Component: Structure Prediction
Version:
Keywords:
Cc:
Blocked By:
Blocking:
Notify when closed:
Platform: all
Project: ChimeraX

Description

Google Colab updated from Python 3.9 to 3.10 on April 28, 2023 (without announcing, grr). That broke ColabFold, which required an old TensorFlow version that is not available with Python 3.10. ColabFold fixed that problem and I updated ChimeraX to use the new ColabFold. The OpenMM 7.5.1 install also did not work with Python 3.10, so ColabFold updated to OpenMM 7.7.0. But now energy minimization fails because an old problem has returned: OpenMM changes the default text encoding from utf-8 to ascii, which breaks all Google Colab shell magic. ChimeraX uses the Google Colab shell magic (e.g. "!zip -r results.zip ..."). I had a work-around in the code to fix the text encoding, but it no longer works with Python 3.10 and the updated OpenMM 7.7.0. I need to figure out a new fix so that OpenMM does not change the text encoding.

In the meantime I changed the ChimeraX AlphaFold script so it warns that energy minimization is not available, and it predicts with no energy minimization even if minimization was requested.

Change History (1)

comment:1 by Tom Goddard, 2 years ago

Resolution: fixed
Status: assigned → closed

Fixed.

I put in another hack, this time replacing locale.getpreferredencoding() so that it always returns 'UTF-8'. My earlier hack replaced _locale.nl_langinfo(), but that no longer worked in Python 3.10. I tested that running two predictions in the same Colab session works correctly.

I did further debugging to try to find a cleaner way to restore the encoding to 'UTF-8' after energy minimization somehow changes it to ANSI_X3.4-1968 (i.e. ASCII). I tried calling the C library setlocale() and nl_langinfo() directly to see whether only the Python calls were broken. But even after setting the locale to en_US.UTF-8 successfully, nl_langinfo() still reported ASCII. The Python calls work as expected when Google Colab starts or if no energy minimization is done. But for unknown reasons the POSIX C library calls seem broken after energy minimization.

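import ctypes
import locale

libc = ctypes.CDLL('libc.so.6')

libc.setlocale.argtypes = [ctypes.c_int, ctypes.c_char_p]
libc.setlocale.restype = ctypes.c_char_p

# c_char_p arguments must be bytes. Returns the new locale name, or None on failure.
libc.setlocale(locale.LC_ALL, b'en_US.UTF-8')

# Ask the C library which character encoding (codeset) is in effect.
libc.nl_langinfo.argtypes = [ctypes.c_int]
libc.nl_langinfo.restype = ctypes.c_char_p
libc.nl_langinfo(locale.CODESET)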

With a lot more work I could put in print statements in the ColabFold energy minimization code and try to find exactly where the nl_langinfo() return value changes. It seems not worth the trouble.
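If someone does want to chase this down, the print-statement approach could be sketched roughly as below. This is an untested idea, not something I ran against ColabFold; the helper names current_codeset and watch_codeset are hypothetical, and it assumes a Unix platform where locale.nl_langinfo() is available (as on Google Colab).

```python
import locale
from contextlib import contextmanager

def current_codeset():
    # Codeset as seen through Python's locale module, e.g. 'UTF-8'.
    return locale.nl_langinfo(locale.CODESET)

@contextmanager
def watch_codeset(label):
    # Report if the codeset differs before and after the wrapped code.
    before = current_codeset()
    try:
        yield
    finally:
        after = current_codeset()
        if after != before:
            print(f'{label}: codeset changed from {before!r} to {after!r}')
```

Wrapping successive steps of the minimization code in "with watch_codeset('step name'):" blocks would narrow down which call changes the value.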
