Opened 8 years ago
Closed 8 years ago
#944 closed enhancement (fixed)
RFE: molc-style API for custom C++ objects
| Reported by: | Tristan Croll | Owned by: | Tom Goddard |
|---|---|---|---|
| Priority: | moderate | Milestone: | |
| Component: | Core | Version: | |
| Keywords: | | Cc: | Eric Pettersen |
| Blocked By: | | Blocking: | |
| Notify when closed: | | Platform: | all |
| Project: | ChimeraX | | |
Description
I'm getting around to something I've been putting off for a while: developing C++ implementations of the various array objects I need in ISOLDE (in particular, dihedrals, rotamers and possibly restraint targets). At present I have moderately-workable Python implementations (slow to set up, fast to use, but not at all robust to addition/deletion of atoms etc.). What I should be doing (and want to do) is make these use essentially the same framework as Atom, Bond, Residue etc. Looks like I can do it by copy-pasting and adapting bits from molc.cpp, molc.py, molobject.py and molarray.py. But it's a really neat and robust scheme you have, and I think it would be really useful if it were adapted into a proper API for future plugin developers. That is, creating a generic header/cpp with all the base methods from molc.cpp, and turning the contents of molc.py into a class. Unless I misunderstand, at present molc.py is inherently limited to the use of a single C++ library for a given session, since _molc_lib is a global variable set by c_function().
I'm also struggling to trace how the automatic deletion system works (e.g. deleting a bond when one of its atoms is deleted), and whether it's possible for me to make use of it for new, non-built-in types. Is there documentation on this aspect anywhere?
As an example of what I'm trying to do, I'll attach the progress I've made today on putting together a C++ dihedral implementation. It compiles, but I haven't gotten as far as connecting it through to Python yet. That I'm pretty sure I can do, but I'd be very grateful for any pointers on connecting it to the ChangeTracker and DestructionCoordinator systems.
Attachments (3)
Change History (12)
by , 8 years ago
| Attachment: | dihedral.h added |
|---|
by , 8 years ago
| Attachment: | dihedral.cpp added |
|---|
by , 8 years ago
| Attachment: | geometry.h added |
|---|
comment:1 by , 8 years ago
| Cc: | added; removed |
|---|---|
| Owner: | changed from to |
comment:2 by , 8 years ago
Ok, I added to atomic/molc.py a CFunctions class so it can be used to access shared libraries besides the atomic library libmolc. This gets rid of the global path to the shared library that was being used previously.
comment:3 by , 8 years ago
Thanks for the details. Yep - it’s the ability to delete itself when any of the constituent atoms are deleted that I’m after. Then at the next step I have to implement Rotamer as a collection of Dihedral objects, which will have the same need.
Tristan Croll, Research Fellow, Cambridge Institute for Medical Research, University of Cambridge CB2 0XY
comment:4 by , 8 years ago
Also, I am working on moving the object_map implementation from molobject.py into C++ for efficiency and so that some C++ functions (i.e. atom.neighbors, atom.bonds) can return Python objects directly (creating them if necessary) rather than returning pointers that get looked up in the Python dict (and then possibly created if needed). So, what I'm saying is that certain parts of this Python/C++ interface mechanism are not completely settled.
As for the magical Collection auto-shrinking behavior, this is set up by the remove_deleted_pointers call in the Collection's init, which eventually adds the Collection's numpy array to the Array_Updater class in molc.cpp. The Array_Updater inherits from DestructionObserver, so it is notified when C++ objects are destroyed. As implemented, Array_Updater only works for 1-dimensional arrays, but maybe it could be enhanced to work on two-dimensional arrays, though the line "*PyArray_DIMS(a) = j; TODO: This hack may break numpy." gives one pause.
--Eric
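The auto-shrinking described above can be pictured as an in-place filter over a one-dimensional pointer array. What follows is a minimal sketch of that idea only, not the Array_Updater code from molc.cpp: `compact_pointers` is a hypothetical helper, and it works on a plain buffer rather than on a numpy array.

```cpp
#include <cstddef>
#include <set>

// Hypothetical helper (not part of ChimeraX). Removes every pointer found in
// `destroyed` from the first `n` slots of `ptrs`, keeps the survivors in their
// original order, and returns the new logical length. Recording the shorter
// length is left to the caller; per the comment above, the real code does
// this by adjusting the numpy array dimension.
std::size_t compact_pointers(void** ptrs, std::size_t n,
                             const std::set<void*>& destroyed)
{
    std::size_t j = 0;  // next free slot for a surviving pointer
    for (std::size_t i = 0; i < n; ++i) {
        if (destroyed.count(ptrs[i]) == 0)
            ptrs[j++] = ptrs[i];
    }
    return j;
}
```

Because survivors are only ever copied toward the front, the pass is a single O(n log m) sweep for n array entries and m destroyed pointers, and it is easy to see why a one-dimensional layout is the simple case.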
comment:5 by , 8 years ago
1D is fine by me. :) I’m trying to build this with future VR-level performance in mind, so I’ll have two alternative pipelines. For “normal” use (e.g. on user demand) there will be a Dihedrals.values call, which can then be piped through to the validation lookups. Where maximum graphics performance is needed (within simulations, or playback of trajectories), I’ve put together a threaded Python scheme, ultimately to be ported to pure C++ once I get around to setting up the n-dimensional interpolator - scipy.interpolate.RegularGridInterpolator is fast, but still has too much Python in it to work really well within threads. My threaded version uses much of the same code, but just takes the coordinates and returns scores, with no ChimeraX objects involved, so thread safety is easy to maintain.
I’ve got the code ready to prepare all the annotation drawings in C++ as well, so ultimately the whole rotamer, Ramachandran, omega (and eventually RNA and sugars, chirality etc.) validation pipeline will be able to run effectively independent of the graphics loop. While it’s fast enough now to run without threading, that will change once all the different validation metrics are brought into play (the MolProbity “suiteness” score for RNA is a 9-dimensional look-up of dihedral values!), so it’s best I get a coherent framework sorted out early.
Tristan Croll, Research Fellow, Cambridge Institute for Medical Research, University of Cambridge CB2 0XY
follow-up: 3 comment:6 by , 8 years ago
Okay, your C++ Dihedral objects will have to inherit from DestructionObserver and implement that class's destruction_done() method to check whether any of the passed-in pointers (it's a set<void*>) is one of the dihedral's atoms, and call its own destructor if so. The Dihedral's destructor will have to create a DestructionUser instance and then destroy it (typically by declaring it as a local variable and simply letting it go out of scope at the end of the destructor). The PBGroup class is a DestructionObserver that you could look at as an example, and most of the atomic classes do the DestructionUser thing (e.g. ~Atom).
If you do that, then I think that implementing Dihedrals as a subclass of Collection in Python will "just work".
--Eric
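The recipe above can be sketched with simplified stand-ins. Everything in the `sketch` namespace below is an assumption made for illustration: the class and method names follow this thread, but the `Coordinator` bookkeeping, the empty `Atom`, and the `live` counter are invented here and are not the ChimeraX implementations.

```cpp
#include <array>
#include <set>
#include <vector>

namespace sketch {  // illustrative stand-ins only, not the ChimeraX classes

class DestructionObserver;

// Collects pointers of objects being destroyed and notifies observers once
// the outermost destructor has finished.
class Coordinator {
public:
    static void add(DestructionObserver* o) { observers().insert(o); }
    static void remove(DestructionObserver* o) { observers().erase(o); }
    static void begin(void* dying) { ++depth(); destroyed().insert(dying); }
    static void end();
private:
    static std::set<DestructionObserver*>& observers() {
        static std::set<DestructionObserver*> s; return s;
    }
    static std::set<void*>& destroyed() { static std::set<void*> s; return s; }
    static int& depth() { static int d = 0; return d; }
};

class DestructionObserver {
public:
    DestructionObserver() { Coordinator::add(this); }
    virtual ~DestructionObserver() { Coordinator::remove(this); }
    virtual void destruction_done(const std::set<void*>& destroyed) = 0;
};

// RAII marker, declared as a local variable inside a destructor.
class DestructionUser {
public:
    explicit DestructionUser(void* dying) { Coordinator::begin(dying); }
    ~DestructionUser() { Coordinator::end(); }
};

inline void Coordinator::end() {
    if (--depth() > 0) return;  // nested destructor: wait for the outermost one
    std::set<void*> batch;
    batch.swap(destroyed());
    // Notify from a snapshot, because observers may delete themselves.
    std::vector<DestructionObserver*> snapshot(observers().begin(), observers().end());
    for (DestructionObserver* o : snapshot)
        if (observers().count(o) != 0)
            o->destruction_done(batch);
}

struct Atom {
    ~Atom() { DestructionUser du(this); }  // notification fires when du goes out of scope
};

class Dihedral : public DestructionObserver {
public:
    static int live;  // count of live Dihedral objects, for demonstration only
    Dihedral(Atom* a, Atom* b, Atom* c, Atom* d) : atoms_{{a, b, c, d}} { ++live; }
    ~Dihedral() override { DestructionUser du(this); --live; }
    void destruction_done(const std::set<void*>& destroyed) override {
        for (Atom* a : atoms_) {
            if (destroyed.count(a) != 0) {
                delete this;  // requires that the object was created with new
                return;       // no member access after this point
            }
        }
    }
private:
    std::array<Atom*, 4> atoms_;
};

int Dihedral::live = 0;

}  // namespace sketch
```

In this sketch, deleting any of the four atoms passed to a heap-allocated `Dihedral` causes that `Dihedral` to delete itself, while deleting an unrelated atom leaves it alone. Notifying from a snapshot is a choice made here to keep self-deletion safe during the notification loop; the real coordinator may handle this differently.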
comment:7 by , 8 years ago
OK, basics are up and working - I can create a Dihedrals object from a list of atoms and get the angles, and it does indeed automatically delete constituent dihedrals when their atoms are deleted. It's blazingly fast - for a typical array of a few hundred dihedrals the best I was able to do previously (by getting the coordinates as a numpy array and passing them through to a separate C++ function to return a numpy array of angles) was ~300 ns per dihedral. For a test case of 441 dihedrals this implementation gets that down to ~150 ns. Beautiful! A fair bit still to do, of course - still have to put together a manager for their creation and deletion, and to make the various subclasses (proper and improper dihedrals, chiral centres), etc. But the path forward is pretty clear now. Thanks for the help!
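For readers unfamiliar with the array-style C interface under discussion, here is a rough sketch of what an angle function over an array of dihedral pointers might look like. It is not the code attached to this ticket: `Vec3`, `DihedralCoords` and `dihedral_angles` are hypothetical names, and the dihedral is reduced to four stored positions instead of four Atom pointers.

```cpp
#include <cmath>
#include <cstddef>

struct Vec3 { double x, y, z; };

static Vec3 sub(const Vec3& a, const Vec3& b) { return {a.x - b.x, a.y - b.y, a.z - b.z}; }
static double dot(const Vec3& a, const Vec3& b) { return a.x * b.x + a.y * b.y + a.z * b.z; }
static Vec3 cross(const Vec3& a, const Vec3& b) {
    return {a.y * b.z - a.z * b.y, a.z * b.x - a.x * b.z, a.x * b.y - a.y * b.x};
}

// Hypothetical stand-in for a dihedral: the positions of its four atoms.
struct DihedralCoords { Vec3 p[4]; };

// Signed dihedral angle in radians, in [-pi, pi], from the three bond vectors.
static double dihedral_angle(const DihedralCoords& d) {
    Vec3 b1 = sub(d.p[1], d.p[0]);
    Vec3 b2 = sub(d.p[2], d.p[1]);
    Vec3 b3 = sub(d.p[3], d.p[2]);
    Vec3 n1 = cross(b1, b2);  // normal of the plane through atoms 0, 1, 2
    Vec3 n2 = cross(b2, b3);  // normal of the plane through atoms 1, 2, 3
    double y = std::sqrt(dot(b2, b2)) * dot(b1, n2);
    double x = dot(n1, n2);
    return std::atan2(y, x);
}

// Array-style entry point with C linkage, so a ctypes wrapper could call it:
// `dihedrals` points to n DihedralCoords pointers, `angles` receives n values.
extern "C" void dihedral_angles(void* dihedrals, std::size_t n, double* angles) {
    DihedralCoords** d = static_cast<DihedralCoords**>(dihedrals);
    for (std::size_t i = 0; i < n; ++i)
        angles[i] = dihedral_angle(*d[i]);
}
```

Taking the whole pointer array in one call keeps the per-object overhead on the C++ side of the boundary, which is the likely reason a per-dihedral cost in the hundreds of nanoseconds is achievable.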
comment:9 by , 8 years ago
| Resolution: | → fixed |
|---|---|
| Status: | assigned → closed |
Eric is the one who wrote the ChangeTracker and DestructionCoordinator classes and all the molecule C++ data structures. Tom wrote the ctypes wrapper including the array interfaces (Collections in atomic/molarray.py).
We are concerned that the ctypes wrapping is not fast enough, particularly when accessing attributes of atoms and bonds in Python code. Using ctypes introduces overhead and is maybe 5x slower than the C++ Python wrapping we used in Chimera. This can make Python analysis code many times slower than in Chimera. So we have it in our plans to revisit whether we use ctypes for accessing atomic data. One idea is to use Cython for the wrapping. In Chimera we used a C++ wrapper generator that Greg Couch developed called Wrappy, and we will not use that. We expect the Python attribute and array access methods to stay the same, but the underlying implementation may all change.
Not sure what kind of deletion management you are asking about. When an atom is deleted, its bonds are deleted too; that is hard-coded into the C++ data structures. Maybe instead you mean how an Atom gets removed from an Atoms collection when the C++ atom is deleted? For that, look at the remove_deleted_pointers(array) function in atomic/molarray.py. Oh, I see your C++ Dihedral class depends on 4 atoms and needs to know when any of those 4 gets deleted. Maybe the DestructionCoordinator class helps with that; Eric will have to advise.
I can fix up atomic/molc.py so the C library that the Python code wraps is not hard-coded.