Chimera2 Sessions -- Take 2
Background
From the user's perspective, Chimera2 is an application with a suite of tools. In any given session, there may be some user data (e.g., molecules and volumes) and some active tools (e.g., Reply Log and Command Line). One required function of Chimera2 is to save the current state of the session and then resume the session in the same state at a future time. This sounds simple enough, but some details need to be explicitly enumerated, as not everyone will agree on what constitutes "the state of the session".
From the developer side, we can distinguish two types of data. Some data are the same for all sessions; for example, start-up preferences and the list of installed tools. Some data differ between sessions; for example, the list of _active_ tools and the currently open data files. Let's call the former "initial data" and the latter "session data". Chimera2's "state of the session" consists only of "session data". This means that if the user (a) saves a session, and then (b) changes a start-up preference (e.g., background color) that was not explicitly overridden in the session (e.g., by changing the background color), then (c) on resumption, the session will use the new background color. Clearly, this may be unexpected for the user. We will need to define which data normally treated as "initial data" should always be saved as "session data" so that these surprises are minimized. (The answer is not "all of them" because the list of installed tools may change, and additions and deletions cannot be ignored.)
(Chimera2 data can be categorized as raw data, GUI data, and organizational data. Expound.)
Data Types
All session data is either simple (int, float, string, list, dict, etc.) or an instance of a class provided by a tool.
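To make the rule concrete, here is a minimal sketch of a check for "simple" data. The function name `is_simple` and the example class `Marker` are hypothetical. The exact set of scalar and container types hidden behind "etc." is an assumption, as is recognizing tool-provided classes by a 'tool_info' class attribute. The sketch does not guard against containers that contain themselves.

```python
# Hypothetical sketch: one reading of what counts as "simple" session data.
# The scalar and container types beyond int/float/str/list/dict are assumptions.
SIMPLE_SCALARS = (type(None), bool, int, float, str, bytes)


def is_simple(value):
    """Return True if value is built only from simple types or from
    instances of classes that carry a 'tool_info' class attribute."""
    if isinstance(value, SIMPLE_SCALARS):
        return True
    if isinstance(value, (list, tuple, set, frozenset)):
        return all(is_simple(v) for v in value)
    if isinstance(value, dict):
        return all(is_simple(k) and is_simple(v) for k, v in value.items())
    # Assumption: tool-provided classes are recognizable by 'tool_info'.
    return hasattr(type(value), "tool_info")


class Marker:
    """Stand-in for a class exported by a tool."""
    tool_info = "example-tool"  # placeholder for a ToolInfo instance

    def __init__(self, xyz):
        self.xyz = xyz
```

With this reading, `is_simple({"a": [1, 2.0, "x"]})` and `is_simple(Marker((0, 0, 0)))` are true, while a bare `object()` is rejected.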
Protocols
Saving protocol:
- visit all session state and record:
  - which tool is responsible for the state
  - which data it refers to (only referred-to data needs to be named)
- save the list of used tools
  - this needs each tool's state version, not its tool version
- serialize the state
  - keep track of the named data that was saved
- confirm that all referred-to data was saved
  - if not, then the session is incomplete/corrupt
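
The saving steps above can be sketched roughly as follows. Everything here is hypothetical: the function `save_session`, the record layout (`tool`, `state_version`, `name`, `refers_to`, `data`), and the choice of JSON as the on-disk format are assumptions made for illustration. The sketch also runs the completeness check before writing rather than after, so that a corrupt file is never produced.

```python
# Hypothetical sketch of the saving protocol; none of these names are Chimera2 API.
import json


def save_session(state_items, stream):
    """Write state_items to stream as JSON.

    Each item is a dict with the keys 'tool', 'state_version',
    'name' (or None for unnamed data), 'refers_to' (list of names), 'data'.
    """
    used_tools = {}      # tool name -> state version (not the tool version)
    saved_names = set()  # named data that will be written
    referred = set()     # names that some piece of state refers to
    records = []
    for item in state_items:
        used_tools[item["tool"]] = item["state_version"]
        referred.update(item["refers_to"])
        if item["name"] is not None:
            saved_names.add(item["name"])
        records.append(item)
    # Confirm that all referred-to data is part of the session.
    missing = referred - saved_names
    if missing:
        raise ValueError("incomplete session, unsaved data: %s" % sorted(missing))
    json.dump({"tools": used_tools, "state": records}, stream)
```

A session in which a label refers to a molecule named "mol1" saves cleanly only if the "mol1" record is also among the items; otherwise `ValueError` is raised.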
 
Restoring protocol:
- read the list of used tools
  - if the installed tools are insufficient, then give the user the option to cancel or load tools
- deserialize the state inside a block/unblock of triggers
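
A rough sketch of the restoring steps follows. The `Triggers` class is a hypothetical stand-in for whatever trigger mechanism Chimera2 uses; `restore_session`, the JSON layout, and the rule that a tool is sufficient only when its state version matches exactly are likewise assumptions. Where the text calls for asking the user, the sketch simply raises an exception.

```python
# Hypothetical sketch of the restoring protocol; none of these names are Chimera2 API.
import json
from contextlib import contextmanager


class Triggers:
    """Stand-in trigger system whose notifications can be blocked."""

    def __init__(self):
        self.blocked = 0
        self.pending = []
        self.fired = []

    def fire(self, name):
        if self.blocked:
            self.pending.append(name)  # defer until unblocked
        else:
            self.fired.append(name)

    @contextmanager
    def block(self):
        self.blocked += 1
        try:
            yield
        finally:
            self.blocked -= 1
            if not self.blocked:
                # Deliver everything that was deferred while blocked.
                self.fired.extend(self.pending)
                self.pending.clear()


def restore_session(stream, installed, triggers, restore_item):
    """installed maps tool name -> supported state version;
    restore_item is called once per saved record."""
    saved = json.load(stream)
    missing = [t for t, v in saved["tools"].items() if installed.get(t) != v]
    if missing:
        # A real application would offer to cancel or load the tools.
        raise LookupError("missing or incompatible tools: %s" % sorted(missing))
    with triggers.block():
        for record in saved["state"]:
            restore_item(record)
```

Blocking the triggers means that observers see the restored session only once it is complete, not after each individual record.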
The Problem
The problem is how to serialize and deserialize the session state. Ideally, when restoring a session, all data that is referenced by some other data has already been restored. Therefore, all of the dependencies need to be known by the saving code and there can be no circular dependencies. This avoids two-phase initialization, which is especially important for C++ objects.
Where is Session State?
All session state is accessible via a Session object. The session object has a registry of which of its attributes contain savable state and conform to the State API. Those attributes may contain nested data, and that nested data can use the same State API.
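One way to picture this arrangement is sketched below. The method names `take_snapshot` and `restore_snapshot`, the `register_state` call, and the `View`/`Camera` classes are all assumptions for illustration; the text only says that a registry and a State API exist.

```python
# Hypothetical sketch of a Session registry and a State API.
class State:
    """Anything savable implements these two methods."""

    def take_snapshot(self):
        raise NotImplementedError

    def restore_snapshot(self, data):
        raise NotImplementedError


class Camera(State):
    def __init__(self, position=(0.0, 0.0, 10.0)):
        self.position = tuple(position)

    def take_snapshot(self):
        return {"position": list(self.position)}

    def restore_snapshot(self, data):
        self.position = tuple(data["position"])


class View(State):
    """A session attribute whose nested data uses the same State API."""

    def __init__(self):
        self.background = "black"
        self.camera = Camera()

    def take_snapshot(self):
        return {"background": self.background,
                "camera": self.camera.take_snapshot()}

    def restore_snapshot(self, data):
        self.background = data["background"]
        self.camera.restore_snapshot(data["camera"])


class Session:
    def __init__(self):
        self._state_attrs = []  # registry of savable attribute names

    def register_state(self, name, value):
        if not isinstance(value, State):
            raise TypeError("%r does not conform to the State API" % name)
        setattr(self, name, value)
        self._state_attrs.append(name)

    def take_snapshot(self):
        return {n: getattr(self, n).take_snapshot() for n in self._state_attrs}

    def restore_snapshot(self, data):
        for n in self._state_attrs:
            getattr(self, n).restore_snapshot(data[n])
```

The snapshot of a session is then a plain dict of simple data, keyed by registered attribute name, with nested snapshots inside it.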
Examples of State
Session attributes:
- open models
- open tools
- running tasks
- scenes
- user colors
- user colormaps
- view (camera, lighting, clipping, etc.)
- selections (TODO)
Nested data:
- Tool GUI state
- molecules: atoms/bonds/residues & graphical state
- pseudobonds
- surfaces
- generic 3D graphics (e.g., STL)
- atom/bond annotations
Note that "tool" state may be in a session attribute rather than in the GUI instance.
Issues
- How to give dependencies for ordering session data?
  - naive data with back references is circular
  - may need to give before and after dependencies
- What is the granularity?
- Can we guarantee non-circular references?
- Can we provide a "simple" API for tools?
Possible Solution
Two competing solutions:
- allow circular references via two-phase initialization
  - like Python's pickle
- specify dependencies and save/restore in the right order
Example
- molecular data needs to be saved early
- molecular data is in a model
- session gets a model from its list of models
- a tool may provide a new model that depends on molecular data and tool state
- so some models will need to be saved before others
- so some tools may need to be saved before their associated models
- does that mean that tools are saved first?
How the data is organized in the session does not match the order in which the data needs to be saved.
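The dependency-ordering alternative can be illustrated with a topological sort. This sketch uses `graphlib.TopologicalSorter` from the Python standard library (Python 3.9 or later); the item names and the helper `save_order` are made up for the example. It also shows why back references are a problem for this approach: a cycle cannot be ordered at all.

```python
# Hypothetical sketch: ordering session data by declared dependencies.
from graphlib import CycleError, TopologicalSorter


def save_order(depends_on):
    """depends_on maps each item to the items that must be saved before it.
    Raises graphlib.CycleError if the dependencies are circular."""
    return list(TopologicalSorter(depends_on).static_order())


# A tool-provided model depends on molecular data and on tool state.
deps = {
    "molecule model": [],
    "surface tool state": ["molecule model"],
    "surface model": ["molecule model", "surface tool state"],
}
order = save_order(deps)

# Naive data with back references is circular and cannot be ordered.
try:
    save_order({"atom": ["residue"], "residue": ["atom"]})
    circular = False
except CycleError:
    circular = True
```

Here the molecule model comes first in `order` and the surface model last, regardless of where each item sits in the session's own organization.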
The Solution
Circular dependencies are natural. So use two-phase initialization when deserializing a session.
We really don't want to use two-phase initialization for C++ objects, so we don't. As long as the C++ objects are restored all together (i.e., atomic structures and pseudobonds), we can avoid two-phase initialization for them. If the C++ objects refer to Python objects, that part would need to follow a two-phase protocol.
On the Python side, it is possible to provide an API that takes simple data and implements all of the individual save and restore steps.
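As a minimal sketch of what two-phase initialization could look like on the Python side: phase one creates every object from simple data only, and phase two fills in the references between them. The record layout (`class`, `init`, `refs`), the `restore` helper, and the toy `Residue`/`Atom` classes are assumptions for illustration, not the actual Chimera2 data model.

```python
# Hypothetical sketch of two-phase restoration from simple data.
class Residue:
    def __init__(self, name):
        self.name = name
        self.atoms = []       # filled in during phase two


class Atom:
    def __init__(self, name):
        self.name = name
        self.residue = None   # back reference, filled in during phase two


def restore(data, classes):
    """data maps a key to {'class': ..., 'init': {...}, 'refs': {...}}."""
    # Phase one: create every object from simple data only.
    objs = {key: classes[rec["class"]](**rec["init"])
            for key, rec in data.items()}
    # Phase two: resolve references, which may be circular.
    for key, rec in data.items():
        for attr, ref in rec["refs"].items():
            if isinstance(ref, list):
                value = [objs[r] for r in ref]
            else:
                value = objs[ref]
            setattr(objs[key], attr, value)
    return objs


saved = {
    "r1": {"class": "Residue", "init": {"name": "ALA"}, "refs": {"atoms": ["a1", "a2"]}},
    "a1": {"class": "Atom", "init": {"name": "CA"}, "refs": {"residue": "r1"}},
    "a2": {"class": "Atom", "init": {"name": "CB"}, "refs": {"residue": "r1"}},
}
objs = restore(saved, {"Residue": Residue, "Atom": Atom})
```

Because every object exists before any reference is resolved, the order of the records does not matter, and the atom-to-residue back references restore without special handling.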
Tool Classes
- All classes exported by tools need to have a 'tool_info' attribute, a ToolInfo instance
- ToolInfo needs the State version in addition to the tool version
- When registering a tool's commands, the ToolInfo instance is needed, just as when starting a tool's GUI, so that the 'tool_info' attribute can be set even if there is no GUI
- The tool's module needs a 'get_class' function to return the class associated with a name.
- Tool classes might have an alternate initializer if attributes are not simple
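
Put together, a tool module following these requirements might look roughly like the sketch below. The `ToolInfo` constructor arguments, the `Label` class, and the `from_state` alternate initializer are hypothetical; only the 'tool_info' attribute, the separate state version, and the 'get_class' function come from the list above.

```python
# Hypothetical sketch of a tool module; only 'tool_info' and 'get_class'
# are named in the requirements, the rest is illustrative.
class ToolInfo:
    """Carries the tool version and, separately, the state version."""

    def __init__(self, name, tool_version, state_version):
        self.name = name
        self.tool_version = tool_version
        self.state_version = state_version


_INFO = ToolInfo("example_tool", tool_version="0.3", state_version=1)


class Label:
    tool_info = _INFO  # required on every class the tool exports

    def __init__(self, text, size=12):
        self.text = text
        self.size = size

    @classmethod
    def from_state(cls, state):
        """Alternate initializer that builds an instance from saved state."""
        return cls(state["text"], state.get("size", 12))


def get_class(name):
    """Return the class this tool exports under 'name', or None."""
    return {"Label": Label}.get(name)
```

Session-restoring code could then call `get_class("Label")` on the tool's module and build the instance with `from_state`, without importing the class directly.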
