{{{ #!html
Tom Goddard
December 3, 2010.
Possible projects related to electron microscopy (EM) and molecular assemblies for the next RBVI NCRR 5-year grant proposal, to be submitted in May 2011.
Will talk about 10 project ideas that fall in 2 broad categories:
Dissemination: Communicating Analysis Results and Analysis Methods
Technology: Advances in Visualization and Analysis Methods
Enable users to easily create web pages showing computer-readable 3d data, analyses, models, and interactive 3d renderings. Establish and document operational file formats (hdf5) to represent volume symmetry, segmentations, coarse grain models, ..., that can be adopted by public databases EMDB, ViPERdb, PDB, CCDB.
Opportunity:
EM and molecular assemblies data and analysis are 95% lost -- only the literature publication (pictures and words) of results remains. Computer readable results are not available except by personal request to the lab. Few EM maps, molecular models, symmetry parameters, sequence alignments, SAXS profiles, lists of interacting residues, ... are put into public archives. This stymies the whole research community's effort to build computational understanding of molecular machines, cells, microbial communities. The build-up of computational knowledge from past decades of work is small at EM resolutions compared to what has been achieved at finer levels: proteins and sequences (PDB and seq databases).
Possible Products:
Extend the community of Chimera developers. Create programmer documentation, simpler APIs, training through collaborations and workshops.
Opportunity:
Chimera contains many libraries to analyze volume data, molecules and assemblies. This is the powerful toolkit I use day-to-day to quickly build new analysis capabilities for collaborators. The ability to write a page or two of Python code greatly extends the analysis capabilities of Chimera. Programming by users can multiply the value of our core Chimera libraries many-fold, extend their lifetime, and avoid others reimplementing the same capabilities (e.g. Gorgon, V3D, UROX).
Possible Products:
Screen-capture videos showing how to do common analysis tasks with Chimera. Currently even advanced users know little about Chimera.
Opportunity:
Most people who have used Chimera hundreds of times know only 1/4 of the capabilities they could productively use. I see this several times per week. (Yesterday's example: Jiang Zhu, NIH, modeling proteins, uses Chimera for volumes and Grasp2 for multiple sequence alignments.) Video how-to documentation can greatly reduce the barrier to learning advanced Chimera techniques. Easy to follow, no missing steps, shows both how and what can be done.
Possible Products:
High performance computing: e.g. multi-threading, instancing, large atomic models, gpu computing, hdf5 files.
Opportunity:
The most widely cited Chimera volume capability is fitting a molecule in a density map. Dozens of programs do this. Our unique advantage is that the fit is done in one second. This allows trying many possibilities. One of the most common reasons verbally given for using Chimera for volume display is "It loads my very large map, and other programs choke". Analysis algorithm literature focuses almost entirely on quality of results, not speed. But in practice, so much goes wrong in analysis that the speed to make many alternate analysis attempts proves more important to whether high quality results are achieved. Where long-running calculations fail to find the right answer, many quick, successively refined analysis attempts with human inspection can often produce it.
Possible Products:
Continuous and direct mouse interaction (mouse modes).
Opportunity:
Continuous hand/eye interaction using mouse dragging is highly valuable in analysis. The most obvious example is rotating a model using a mouse drag. The advantage of 30 frame/sec continuous hand control becomes very apparent when compared to only being able to change view direction with a typed command (as in some older software). Translating, zooming, volume contour level adjustment, rotamer bond rotation, clip plane positioning, hand fitting, volume cropping, molecular dynamics playback, volume morphing are all powerful data exploration methods in Chimera. Many more are possible but not available in Chimera.
Possible Products:
Tools to compare large numbers of objects: e.g. conformations from BLAST pdb, interfaces between virus proteins, bacteria in termite gut, enzyme binding sites (SFLD), alternative fits of molecules in maps.
Opportunity:
Researchers often compare dozens of homologous structures, alternate map fits, segmented volume regions, binding interfaces. I commonly see Chimera users with 10-30 open models. As biology research accumulates more models, analysis of many models becomes as important as the one-at-a-time analysis that Chimera, designed in a more data-poor era, focuses on. Working with many models becomes too time-consuming and tedious to be feasible without multi-model analysis tools. The Chimera View Dock tool is a successful example of multi-model analysis.
Possible Products:
Opportunity:
No one has established even a simple common representation for models at lower than atomic resolution. A simple framework supporting geometric models (spheres, ellipsoids, tubes, connections, coloring, hierarchy) and an exchange file format would allow sharing models of very interesting biology. For example, Davide Bau visited this week and showed a chromatin model in which 50 Kbases of DNA adopts unique shapes during transcription and when inactive (Nature Struct Biol, out next week). He used the Chimera volume tracer. Sali lab IMP collaboration. Auer lab cellular structures collaboration. This may be a subproject of web data publication.
Possible Products:
Opportunity:
3d consumer computer displays and televisions appear to just be taking off. ESPN 3D offers 3d sports broadcasts. Stereo animation may be attractive for web data publication.
Possible Products:
Opportunity:
I think SAXS data and model visualization is an uncolonized niche. Standard molecular viewers are used, with little specialized support. It would be possible to make Chimera the standard SAXS visualization software (similar to our current monopoly on single-particle EM volume display). We don't currently have experimental collaborators (Sali lab methods devel), but have talked with Alex Shkumatov in Dmitri Svergun lab -- world leader in SAXS computational analysis.
Possible Products:
High resolution (3-4 Angstrom) EM model building.
Opportunity:
Single particle EM maps in the 3 to 4 Angstrom resolution range for viruses are becoming common. This is another opportunity to monopolize software for an emerging subfield. I formerly thought existing low-resolution x-ray model building tools would be used for this data. It appears a new generation of model building software is needed. Matt Baker, in collaboration with the U. Washington computer science dept, has been developing Gorgon visualization and model building from scratch for 2 years. It might be disruptive to compete with that project. Also it is a very hard problem, perhaps requiring more resources than we can give it. Gorgon is unlikely to succeed for lack of manpower.
Possible Products:
I favor large-scale incremental software changes, instead of starting over, but we have never made such changes (can't shake OTF, wrappy, Tk, fixed function OpenGL, extend atom specs to surfaces, memory efficient molecules, abstract models are molecules). Our practised accretion method with no major changes offers good stability for outside developers but we don't have many of those for other reasons.
Developing a next-generation Chimera 2 while maintaining Chimera 1 is the pattern we followed with the MidasPlus to Chimera transition, but it required more than 5 years to get the next-generation code into initial distribution. Chimera 2 could leverage much of the existing volume C++ code.
}}}
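The coarse-grain model framework proposed above (spheres, tubes, connections, coloring, hierarchy, and an exchange file format) could look roughly like the following. This is only a minimal sketch to make the idea concrete: the class names `Sphere`, `Tube` and `Group`, their fields, and the choice of JSON as the exchange format are all assumptions made for illustration, not an existing Chimera API or an agreed format. Ellipsoids are left out for brevity.

```python
import json
from dataclasses import dataclass, field, asdict
from typing import List, Tuple


@dataclass
class Sphere:
    center: Tuple[float, float, float]  # x, y, z (units are up to the model)
    radius: float
    color: str = "gray"


@dataclass
class Tube:
    start: int          # index of a sphere in the owning group's sphere list
    end: int            # index of the sphere at the other end
    radius: float = 1.0
    color: str = "gray"


@dataclass
class Group:
    """A named node in the hierarchy holding shapes and child groups."""
    name: str
    spheres: List[Sphere] = field(default_factory=list)
    tubes: List[Tube] = field(default_factory=list)
    children: List["Group"] = field(default_factory=list)


def to_json(group: Group) -> str:
    # asdict() recurses through nested dataclasses and lists.
    return json.dumps(asdict(group), indent=2)


def _group_from_dict(d: dict) -> Group:
    return Group(
        name=d["name"],
        spheres=[Sphere(tuple(s["center"]), s["radius"], s["color"])
                 for s in d["spheres"]],
        tubes=[Tube(**t) for t in d["tubes"]],
        children=[_group_from_dict(c) for c in d["children"]],
    )


def from_json(text: str) -> Group:
    return _group_from_dict(json.loads(text))


# Hypothetical two-bead model nested inside a parent group.
chromatin = Group(
    "chromatin",
    spheres=[Sphere((0.0, 0.0, 0.0), 50.0, "blue"),
             Sphere((120.0, 0.0, 0.0), 50.0, "red")],
    tubes=[Tube(0, 1, radius=10.0)],
)
model = Group("nucleus", children=[chromatin])
restored = from_json(to_json(model))
```

Writing a model out and reading it back gives an equal object, which is the property an exchange format needs. A binary container such as hdf5 could hold the same hierarchy if file size becomes a concern.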