#3139 closed enhancement (wontfix)
Thread-safe Atom deletion
| Reported by: | | Owned by: | pett |
|---|---|---|---|
| Priority: | moderate | Milestone: | |
| Component: | Structure Editing | Version: | |
| Keywords: | | Cc: | Tom Goddard |
| Blocked By: | | Blocking: | |
| Notify when closed: | | Platform: | all |
| Project: | ChimeraX | | |
Description
Tristan suggests using a shared pointer for Atom to avoid deleting the C++ Atom when a thread (e.g. structure validation during ISOLDE MD) is using it.
Begin forwarded message:
From: Tristan Croll <tic20@…>
Subject: Re: ISOLDE and ChimeraX plan for the future
Date: May 4, 2020 at 3:24:17 PM PDT
To: Tom Goddard <goddard@…>
Hi Tom,
Coming back to your question about things on the ChimeraX end that I think could help ISOLDE: I can think of a few - some quite easy, some a little more complex.
[... building side-chains stuff deleted ...]
More difficult, but I think important for further performance improvements: making the core AtomStruct C++ objects safe to use in threads. Not fully thread-safe (a *lot* of work, and could easily reduce single-thread performance), but ensuring that things don't explode if an atom is deleted while a thread is running. I *think* this might be possible reasonably easily by wrapping the objects currently stored as raw pointers managed with new/delete in std::shared_ptr - a "delete" operation from the GUI would then just drop the shared pointer, but the object's destructor won't actually be called until the last reference to it goes out of scope. You could then give the class a Boolean "deleted" flag so anything using it knows to skip it. There may be other complications I haven't thought of, but if it can be done it would be really useful.

While making a copy of the atom properties in the main thread for other threads to then process certainly works, it does start to get complicated once you have lots of different things all responding to coordinate changes - either they all have to work through a single central dispatch scheme, or stay single-threaded, or each make their own copy of the coordinates to work from. So at the moment I only do the map calculations in separate threads, while the geometric validation and restraint visualisation still happen in the main thread. That's generally fast enough for now, but there are more things to be added.

Sugar validation and restraints, for a start - working over the past week or two with fully-glycosylated models of the spike protein has made me realise just how urgently that's needed; right now, tugging them around leads to some pretty nasty geometry. Nucleic acid validation is also needed, but quite slow/challenging - while rotamer/Ramachandran lookups are pretty straightforward, validation of nucleotide geometry is a 12-dimensional problem requiring some creative approaches to be tractable at all.
Having that minimal level of thread-safety would add a lot of flexibility.
Change History (6)
comment:1 by , 5 years ago
comment:2 by , 5 years ago
Can of worms? Probably. What I was (vaguely) thinking is that the cleanup cascade (DestructionObserver/DestructionBatcher etc.) could be moved into a separate function outside of the actual destructors, so from the main application's perspective everything tidies up exactly as before but the "zombie" objects stay in existence until their shared_ptr reference count goes to zero.

But this would still be problematic, I suppose - as an example, when an Atom is deleted the Residue's std::vector of Atom* changes length, and *that* will cause a segmentation fault in any thread that happened to be iterating through it at the time. So... never mind, I suppose.

What I guess I *can* do safely is put each validation task into its own thread, as long as the main thread remains blocked until they're done. As long as each individual task is no more than a few milliseconds, that should be enough to keep the performance reasonable.

On 2020-05-05 00:44, ChimeraX wrote:
comment:3 by , 5 years ago
| Resolution: | → wontfix |
|---|---|
| Status: | assigned → closed |
Glad I didn't need to add my own comments about what a bad idea this is. :-) Just as an example, there is no correct value you can return for structure.num_atoms when some threads need to treat some atoms as dead and other threads don't -- on top of which you can no longer just return the size() of the atoms container; you have to iterate through it and see which ones are dead.
I should think that a threaded simulation needs to grab the Python GIL in order to make calls into non-thread-safe code. I don't know how expensive that is.
comment:4 by , 5 years ago
Hmm... I thought the point of this was to not block the main thread, so calculations like validation can run in parallel while the graphics are still updating. Of course this is a hard and complex problem.

The easy, reliable solution is that threads only use their own copy of the data. If you don't do that and are, say, checking rotamers, another thread can change the atom coordinates at any point while the validation is running. This can lead to bad results, though probably not fatal ones.

Another common method is to hold a lock: don't allow deleting atoms while the validation runs, because it holds a lock, so the atom-delete code can't acquire the lock and the code trying to delete atoms blocks. This can lead to deadlocks -- one bad piece of code that does not release the lock freezes everything, which is even worse than a crash for debugging.

Being notified that atoms are about to be deleted is another approach. All approaches are difficult, and a specific idea is needed to make any progress.
comment:5 by , 5 years ago
Perhaps a "deferred delete" approach? Require each C++ thread currently using atoms to register itself, and have the main code simply hold off on the deletion until they're done. You could avoid locking up by having the GUI thread check at each iteration of the event loop whether it's safe to apply the delete, and avoid an "infinite deferral" scenario by preventing new threads from registering themselves until the pending deletion is done. Then the only thing to worry about would be a registered thread that forgets to deregister itself... but that would be relatively easy to debug, because the main loop would still be running so you can query what's holding things up.

On 2020-05-05 18:19, ChimeraX wrote:
comment:6 by , 5 years ago
I've thought about the issue of coordinate changes in the middle of a thread, by the way, and come to the conclusion that it shouldn't be a big deal. Yes, it might cause a bit of a blip in that particular calculation, but the change will trigger an automatic *recalculation*, so the effects will be wiped out almost immediately.

There would be other edits that have to be handled carefully, though: adding atoms would obviously also be problematic. Adding/removing altlocs and creating/removing ANISOUs would also risk segmentation faults. Moving to a different coordset... not sure if there are others.

On 2020-05-05 18:19, ChimeraX wrote:
I suspect this suggestion is a can of worms. If an Atom delete caused all cleanup except the actual C++ delete (triggers fire, the C++ change tracking notes the deletion, collections have the atoms removed), then residues can't be deleted, nor structures, nor chains, ... because the Atom can reference all of those. Yet the Python code must not have access to any of these zombie objects, or all sorts of trouble will result.
Maybe some other techniques are simpler. Why wouldn't a thread that operates on atoms just register for a trigger that is called just before the atoms are deleted? I don't think we have that -- I think we only notify after atoms are deleted now. When it gets this trigger, it terminates the thread (though that might also be problematic).
A heavier-handed approach: before starting a thread, flag the structure it is using as not deletable. If the user tries to delete any atoms of that structure, or maybe modify its bonds, residues, or atoms at all, raise a Python error.