Changes between Version 1 and Version 2 of Ticket #8313, comment 7


Timestamp:
Jan 18, 2023, 4:51:33 PM (3 years ago)
Author:
Tom Goddard

  • Ticket #8313, comment 7

    v1 v2  
    1   In Python 3.8, locale.getpreferredencoding() calls _locale.nl_langinfo(_locale.CODESET) and this is implemented in C and makes the system call nl_langinfo(CODESET).  That system call indeed returns ANSI_X3.4-1968 (official name for ASCII) after running a prediction with minimization.  Predictions without minimization have it return the correct 'UTF-8'.  Hours of testing and study of system locale documentation did not reveal how this could be.  The setlocale(LC_ALL, "") system call should copy the locale from the environment variables.  Those give LANG=en_US.UTF-8.  But still the nl_langinfo() Python call gives ANSI.  I tried a separate C program compiled and run in the Google Colab terminal for the broken Colab session and it returned "UTF-8".
      1 In Python 3.8, locale.getpreferredencoding() calls _locale.nl_langinfo(_locale.CODESET) and this is implemented in C and makes the C library call nl_langinfo(CODESET).  That system call indeed returns ANSI_X3.4-1968 (official name for ASCII) after running a prediction with minimization.  Predictions without minimization have it return the correct 'UTF-8'.  Hours of testing and study of C library locale documentation did not reveal how this could be.  The setlocale(LC_ALL, "") system call should copy the locale from the environment variables.  Those give LANG=en_US.UTF-8.  But still the nl_langinfo() Python call gives ANSI.  I tried a separate C program compiled and run in the Google Colab terminal for the broken Colab session and it returned "UTF-8".
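
The values compared in this comment can be read from inside the affected Python session with a short script. This is only a minimal sketch, assuming a Unix platform where locale.nl_langinfo() and locale.CODESET are available. The helper name report_codeset is hypothetical and not from the ticket. It only queries the current state and never calls setlocale(LC_ALL, ""), so it should not change what the session sees.

```python
import locale
import os


def report_codeset():
    """Collect the locale values discussed in the comment (hypothetical helper)."""
    return {
        # Environment variables that setlocale(LC_ALL, "") would read.
        "LANG": os.environ.get("LANG"),
        "LC_ALL": os.environ.get("LC_ALL"),
        # Calling setlocale() with no locale argument only queries the setting.
        "LC_CTYPE": locale.setlocale(locale.LC_CTYPE),
        # Python wrapper around the C library nl_langinfo(CODESET) call.
        "nl_langinfo(CODESET)": locale.nl_langinfo(locale.CODESET),
        # do_setlocale=False avoids temporarily switching LC_CTYPE.
        "getpreferredencoding(False)": locale.getpreferredencoding(False),
    }


if __name__ == "__main__":
    for name, value in report_codeset().items():
        print(f"{name}: {value!r}")
```

Running this before and after a prediction with minimization would show whether the reported codeset changes while LANG stays the same.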
    2 2
    3 3 {{{