[chimerax-users] Bio-Formats with ChimeraX?

Tom Goddard goddard at sonic.net
Thu Feb 7 12:05:56 PST 2019


Hi Matt,

  ChimeraX tries to handle multi-channel, multi-time TIFFs such as OME TIFF and ImageJ TIFF.  I am interested in seeing examples where that is not working, as I’d definitely like to improve ChimeraX for 3D optical microscopy.  Handling the many optical microscopy formats is a big problem.  I think bioformats reads over 100 formats, many of them proprietary ones from microscope vendors.  The trouble is that bioformats is written in Java and ChimeraX is Python and C++, and it is hard to integrate Java with Python.  I looked at solutions for this specifically to make use of bioformats, but a year ago it looked like this would require shipping some poorly supported Java/Python interoperability layer with ChimeraX, and either requiring users to install Java (which would greatly limit the number of people who would ever use it) or including a Java runtime, which would make ChimeraX downloads even more enormous.  Even accepting all that, the speed of reading the very large optical microscopy data would likely be glacial.  So it didn’t look feasible with the limited resources we have to develop ChimeraX.  If you think there is a way I missed, tell me.

  So my strategy is to go with the lowest common denominator in optical microscopy, which is TIFF.  TIFF is a horrible format for performance with 3D-5D data.  It is designed for sets of 2D images, and those are in a linked list in the file, so you cannot simply jump to the 1000th 2D image without first reading the directory entries of the 999 images before it.  That isn’t always true: some formats (e.g. MicroManager TIFF) create an index in the TIFF header so they can find the 2D images efficiently.  At any rate, TIFF is the common exchange format.

  For higher performance my effort has been to use HDF5, which is specifically intended for high performance with large multidimensional arrays.  There are many formats based on HDF5, like Imaris *.ims files or .minc files, used in optical microscopy.  But there is no standard HDF5 format.  It is basically just a directory structure for numeric arrays in a single file, and to define an HDF5 format you need to know what that directory structure in the file looks like.  About 10 years ago I used HDF5 to make a format called Chimera map format (suffix *.cmap) for electron microscopy.  ChimeraX reads and writes that (and will also read IMS format and EMAN HDF5).  Chimera map format is what I use when I want to process 3D multichannel optical microscopy time series.

  But it turns out HDF5 often has very poor speed, 5 or more times slower than expected.  HDF5 is pretty complex: it tiles the data, can compress individual tiles, and has caches to keep recently used tiles in memory.  Your code can control all the parameters (like tile size and cache sizes) and ChimeraX sometimes sets those, or HDF5 will use defaults.  Either way it is very common to get very poor performance due to reading tiles from all over the disk (fragmentation of 3D arrays), or HDF5 running out of cache and reading tiles over and over and over, or asking for one slice and having HDF5 read 50 times more data because the tiles that cover that slice are 50 slices thick.  Sadly, the main way to compensate for this is to use a fast SSD; current ones can read at 3 Gbytes/second, compared with 0.1 Gbytes/second for spinning drives.
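
The "50 slices thick" case above can be checked with a little arithmetic.  The sketch below is only an illustration, not ChimeraX code.  It assumes uncompressed data, no cache hits, and that every tile (HDF5 calls them chunks) touching the requested slice is read in full.  The volume size, the tile shapes, and the function name slice_read_cost are all made up for the example; the two drive speeds are the ones quoted above.

```python
import math

def slice_read_cost(shape, chunk, bytes_per_value=2):
    """Estimate the cost of reading one z-slice from a tiled 3D array.

    shape and chunk are (z, y, x) sizes.  Assumes every tile that
    intersects the slice is read whole, with no compression and no cache.
    Returns (tiles touched, bytes read, read amplification factor).
    """
    nz, ny, nx = shape
    cz, cy, cx = chunk
    # One z plane crosses a single layer of tiles covering the xy extent.
    tiles_touched = math.ceil(ny / cy) * math.ceil(nx / cx)
    # Each tile is stored at full size, including partial edge tiles.
    bytes_read = tiles_touched * cz * cy * cx * bytes_per_value
    bytes_wanted = ny * nx * bytes_per_value
    return tiles_touched, bytes_read, bytes_read / bytes_wanted

# Hypothetical volume: 200 planes of 1024 x 1024 16-bit values.
shape = (200, 1024, 1024)
for chunk in [(50, 64, 64), (16, 128, 128), (1, 1024, 1024)]:
    n, got, factor = slice_read_cost(shape, chunk)
    print(f"tile {chunk}: {n} tiles, {got / 1e6:.1f} MB read, "
          f"{factor:.0f}x the slice size, "
          f"{got / 3e9:.3f} s at 3 Gbytes/s, "
          f"{got / 0.1e9:.2f} s at 0.1 Gbytes/s")
```

Under these assumptions the amplification factor is just the tile thickness along z, so tiles one plane thick are ideal for slice access and poor for anything that cuts across z.  That is the trade-off behind choosing tile sizes.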

  I’m interested in improving ChimeraX for optical microscopy, but we have little time and little (really no) funding for that, so any suggestions need to confer maximum benefit for minimum effort to have a chance of getting implemented.

	Tom


> On Feb 6, 2019, at 9:02 PM, Matthew Akamatsu wrote:
> 
> Hi,
> 
> I just started using ChimeraX to render and save 3D time lapse microscopy data as a high-performance replacement for Fiji. I am wondering how flexible ChimeraX can be in opening different TIF-derived file formats, particularly hyperstacks (multi color, multi Z, multi time). Other desired formats would be proprietary formats like .nd2 and .sld. Bio-Formats does all of this in an open-source manner. How feasible would it be to have Bio-Formats type functionality in ChimeraX?
> 
> Thanks,
> Matt