Opened 6 years ago

Last modified 6 years ago

#2668 assigned enhancement

RFE: open compressed maps directly

Reported by: lpravda@…
Owned by: Tom Goddard
Priority: normal
Milestone:
Component: Volume Data
Version:
Keywords:
Cc: Elaine Meng
Blocked By:
Blocking:
Notify when closed:
Platform: all
Project: ChimeraX

Description

The following bug report has been submitted:
Platform:        Darwin-19.0.0-x86_64-i386-64bit
ChimeraX Version: 0.91 (2019-12-05)
Description
I tried opening a compressed EMDB map downloaded from ftp://ftp.ebi.ac.uk/pub/databases/emdb/structures/EMD-2752/map/emd_2752.map.gz.

Although opening the decompressed map works, I think being able to open compressed maps directly would be very useful as well (if this is presently unsupported).

Log:
UCSF ChimeraX version: 0.91 (2019-12-05)  
© 2016-2019 Regents of the University of California. All rights reserved.  
How to cite UCSF ChimeraX  

> open /Users/lpravda/Downloads/emd_2752.map.gz

Traceback (most recent call last):
  File "/Applications/ChimeraX_Daily.app/Contents/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/chimerax/map/data/fileformats.py", line 157, in open_file
    data.extend(open_func(p, **kw))
  File "/Applications/ChimeraX_Daily.app/Contents/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/chimerax/map/data/ccp4/__init__.py", line 18, in open
    return [CCP4Grid(path)]
  File "/Applications/ChimeraX_Daily.app/Contents/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/chimerax/map/data/ccp4/ccp4_grid.py", line 21, in __init__
    MRCGrid.__init__(self, path, file_type = 'ccp4')
  File "/Applications/ChimeraX_Daily.app/Contents/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/chimerax/map/data/mrc/mrc_grid.py", line 24, in __init__
    d = mrc_format.MRC_Data(path, file_type)
  File "/Applications/ChimeraX_Daily.app/Contents/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/chimerax/map/data/mrc/mrc_format.py", line 43, in __init__
    v = self.read_header_values(file, file_size, file_type)
  File "/Applications/ChimeraX_Daily.app/Contents/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/chimerax/map/data/mrc/mrc_format.py", line 189, in read_header_values
    % v['nsymbt']))
  File "<string>", line None
SyntaxError: MRC header value nsymbt (-1248613872) is invalid
  
During handling of the above exception, another exception occurred:  
  
Traceback (most recent call last):
  File "/Applications/ChimeraX_Daily.app/Contents/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/chimerax/ui/gui.py", line 628, in _qt_safe
    run(session, "open " + " ".join([FileNameArg.unparse(p) for p in paths]))
  File "/Applications/ChimeraX_Daily.app/Contents/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/chimerax/core/commands/run.py", line 31, in run
    results = command.run(text, log=log)
  File "/Applications/ChimeraX_Daily.app/Contents/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/chimerax/core/commands/cli.py", line 2837, in run
    result = ci.function(session, **kw_args)
  File "/Applications/ChimeraX_Daily.app/Contents/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/chimerax/core/commands/open.py", line 68, in open
    path_models = session.models.open(paths, format=format, name=name, **kw)
  File "/Applications/ChimeraX_Daily.app/Contents/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/chimerax/core/models.py", line 700, in open
    session, filenames, format=format, name=name, **kw)
  File "/Applications/ChimeraX_Daily.app/Contents/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/chimerax/core/io.py", line 485, in open_multiple_data
    models, status = open_func(session, paths, mname, **kw)
  File "/Applications/ChimeraX_Daily.app/Contents/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/chimerax/map/volume.py", line 3610, in open_map_format
    return open_map(session, path, name=name, format=format, **kw)
  File "/Applications/ChimeraX_Daily.app/Contents/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/chimerax/map/volume.py", line 3209, in open_map
    verbose = kw.get('verbose'))
  File "/Applications/ChimeraX_Daily.app/Contents/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/chimerax/map/data/fileformats.py", line 159, in open_file
    raise FileFormatError(value)
chimerax.map.data.fileformats.FileFormatError: MRC header value nsymbt (-1248613872) is invalid

chimerax.map.data.fileformats.FileFormatError: MRC header value nsymbt (-1248613872) is invalid

  File "/Applications/ChimeraX_Daily.app/Contents/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/chimerax/map/data/fileformats.py", line 159, in open_file
    raise FileFormatError(value)

See log for complete Python traceback.
  




OpenGL version: 4.1 ATI-3.2.24
OpenGL renderer: AMD Radeon Pro 560 OpenGL Engine
OpenGL vendor: ATI Technologies Inc.

Change History (7)

comment:1 by Eric Pettersen, 6 years ago

Component: Unassigned → Volume Data
Owner: set to Tom Goddard
Platform: all
Project: ChimeraX
Status: new → assigned
Summary: ChimeraX bug report submission → RFE: open compressed maps directly
Type: defect → enhancement

Requested by Lukas Pravda

comment:2 by Tom Goddard, 6 years ago

Currently ChimeraX does not support opening compressed map files. It shouldn't give a traceback -- I'll fix that so it just says it can't open compressed maps.

The map reading code only reads planes and blocks of the map that it needs when it needs them. It can't do that unless the file is fully uncompressed. So to handle reading a compressed map it would have to be fully uncompressed. Would you want ChimeraX to uncompress the file to the same directory or a temporary directory?
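To illustrate why on-demand plane reads depend on an uncompressed file, here is a minimal sketch. It assumes a toy layout (a fixed 1024-byte header followed by float32 planes in native byte order), not the full MRC/CCP4 format, and `read_plane` and `HEADER_BYTES` are hypothetical names rather than ChimeraX code. On a plain file the seek jumps straight to the plane. Passing `gzip.open` as the opener also works, but Python's gzip module emulates the seek by decompressing everything before the requested offset.

```python
import gzip
from array import array

HEADER_BYTES = 1024  # assumed fixed header size for this toy layout


def read_plane(path, k, nx, ny, header_bytes=HEADER_BYTES, opener=open):
    """Read plane k (nx * ny float32 values) by seeking directly to it.

    With opener=open the seek is a cheap file-offset change.
    With opener=gzip.open the seek is emulated: all bytes before the
    offset are decompressed and discarded first.
    """
    plane_bytes = nx * ny * 4  # 4 bytes per float32 value
    with opener(path, "rb") as f:
        f.seek(header_bytes + k * plane_bytes)
        values = array("f")
        values.frombytes(f.read(plane_bytes))
    return values
```

For a large map, the gzip path pays roughly the cost of decompressing the file up to the requested plane on every access, which is why a fully uncompressed copy is needed for this access pattern.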

comment:3 by Tom Goddard, 6 years ago

I fixed the traceback; ChimeraX now simply reports that it cannot read compressed map files.
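One way such a check could work is to look at the first bytes of the file before parsing any header. The invalid nsymbt value in the report presumably came from compressed bytes being interpreted as header fields. The sketch below is an assumption about how to detect this, not the actual ChimeraX fix; `compression_type`, `check_not_compressed` and `CompressedFileError` are hypothetical names. The magic numbers are the standard gzip, bzip2 and xz signatures.

```python
# Leading bytes that identify common compressed formats.
_MAGIC = (
    (b"\x1f\x8b", "gzip"),
    (b"BZh", "bzip2"),
    (b"\xfd7zXZ\x00", "xz"),
)


class CompressedFileError(Exception):
    """Hypothetical error type for compressed input files."""


def compression_type(path):
    """Return 'gzip', 'bzip2', 'xz', or None based on the file's first bytes."""
    with open(path, "rb") as f:
        head = f.read(6)
    for magic, name in _MAGIC:
        if head.startswith(magic):
            return name
    return None


def check_not_compressed(path):
    """Raise a clear error instead of letting a header parser read garbage."""
    kind = compression_type(path)
    if kind is not None:
        raise CompressedFileError(
            "Cannot read %s-compressed map file %s; decompress it first"
            % (kind, path))
```

Checking content rather than the file suffix also catches compressed files that were renamed without their .gz extension.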

comment:4 by Tom Goddard, 6 years ago

Normally a user would open an EMDB map in ChimeraX using the ChimeraX open command:

open 2752 from EMDB

which takes care of the decompression of the file. And after that they'd just use the thumbnail image in the ChimeraX file history.

It is not so clear what to do if the user tries to open an emd_2752.map.gz that they downloaded with a web browser. I think it would be bad to just uncompress to a temporary folder, because the user might come back to view this file on many different days and would have to wait for the whole file to be decompressed every time. So really the user should have a decompressed version in a semi-permanent location. ChimeraX could decompress and put the decompressed file in the same directory (if it has write permission). You would most likely then want to delete the *.gz file, but I don't think it is reasonable for ChimeraX to do that -- opening a file should not delete it! So you'd end up with both compressed and uncompressed files -- probably ok.

A further complication is what to do when the user tries to open the *.gz file again (maybe by just clicking on a file history thumbnail). It has already been decompressed and we can see that a file without the .gz suffix exists -- do we just use that one? It is highly likely that the *.gz file and the decompressed version correspond to each other, but it is not assured. As one sanity check, we could verify that the creation time of the decompressed file is later than that of the *.gz file.
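The decompress-once-and-reuse idea described above could be sketched as follows. This is only an illustration under stated assumptions: `decompressed_copy` is a hypothetical helper, and it compares modification times (`os.path.getmtime`) rather than creation times, since creation time is not available portably. The sanity check is therefore only a heuristic, as noted in the comment.

```python
import gzip
import os
import shutil


def decompressed_copy(gz_path):
    """Return (path, created) for an uncompressed copy next to gz_path.

    Reuses an existing copy if it is at least as new as the .gz file
    (a heuristic only: it does not prove the two files correspond).
    The .gz file itself is never deleted.
    """
    if not gz_path.endswith(".gz"):
        raise ValueError("expected a .gz path: %r" % gz_path)
    out_path = gz_path[:-3]  # strip the ".gz" suffix
    if (os.path.exists(out_path)
            and os.path.getmtime(out_path) >= os.path.getmtime(gz_path)):
        return out_path, False
    # Write to a temporary name first so an interrupted decompression
    # never leaves a truncated file under the final name.
    tmp_path = out_path + ".partial"
    with gzip.open(gz_path, "rb") as src, open(tmp_path, "wb") as dst:
        shutil.copyfileobj(src, dst)
    os.replace(tmp_path, out_path)
    return out_path, True
```

If the directory is not writable, `open` raises `PermissionError`; a caller could catch that and report that the file must be decompressed manually.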

On macOS downloading the *.gz file from EMDB automatically decompresses it. I guess Windows and Linux don't do that. Having ChimeraX uncompress the file is a convenience -- it will use more disk space since both compressed and uncompressed versions will need to be kept if ChimeraX does the decompression. It seems probably worth doing, but not a high priority.

in reply to: 4; comment:5 by lpravda@…, 6 years ago

Hi Tom, thank you for looking into this so quickly. I think just reporting that gzipped files cannot be opened directly, without crashing, is a sensible decision. I thought that ChimeraX could open gzipped maps because I was confused by the documentation stating "Files that are gzipped, as indicated by their additional .gz suffix, can also be read" (https://www.rbvi.ucsf.edu/chimerax/docs/user/commands/open.html).

Best,
Lukas


comment:6 by Tom Goddard, 6 years ago

Cc: Elaine Meng added

Yes, it is much better now that a clear message says compressed maps are not handled.

We will improve the open command documentation so it indicates that compressed maps (and possibly some other file formats) cannot be read directly. Any ChimeraX file reader that can read the data from a stream can handle decompression, but if the reader takes a file path instead of a stream then it cannot handle compression.

comment:7 by Tom Goddard, 6 years ago

Here are the formats that can read compressed files (.gz, .bz2, .xz) and those that cannot.

handles compression:

ALN, BILD, Collada, Compiled Python code, DICOM image, Directional FSC, FASTA, glTF, HSSP, MMTF, MSF, PDB, Pfam, PIR, Pseudobonds, Python code, RSF, StereoLithography, Stockholm, STORM, VTK PolyData, Wavefront OBJ

does not accept compression:

Amira mesh, APBS potential, BRIX density map, CCP4 density map, Chimera map, ChimeraX commands, ChimeraX session, CNS or XPLOR density map, DCD coordinates, DelPhi or GRASP potential, DeltaVision map, DOCK scoring grid, DSN6 density map, EMAN HDF map, Gaussian cube grid, gOpenMol grid, Gromacs compressed coordinates, Gromacs full-precision coordinates, HTML, IHM, Image stack, Imaris map, IMOD map, MacMolPlt grid, Marker model, mmCIF, mol2, MRC density map, NetCDF generic array, pdbqt, Point signals, Priism microscope image, PROFEC free energy grid, Purdue image format, Schrodinger Maestro, Segmentation, SITUS map file, SPIDER volume data, Structure factor MTZ, TOM toolbox EM density map, UHBD grid, binary
