Opened 6 years ago
Closed 6 years ago
#2355 closed defect (not a bug)
Suggestion: track visible atoms/ribbons/bonds on C++ side
Reported by: | Tristan Croll | Owned by: | pett |
---|---|---|---|
Priority: | moderate | Milestone: | |
Component: | Core | Version: | |
Keywords: | Cc: | Tom Goddard | |
Blocked By: | Blocking: | ||
Notify when closed: | Platform: | all | |
Project: | ChimeraX |
Description
For big models, there's currently a fairly large fixed overhead to redrawing even when very few atoms are shown. For 4v9o (four ribosomes, just under a million atoms), if I set the lighting to simple and hide all but one atom, then if m
is the model:
%timeit m._update_graphics() 104 ms ± 16.7 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
Most of this can be explained by just four calls:
%timeit m.chain_trace_atoms() 29.7 ms ± 2.12 ms per loop (mean ± std. dev. of 7 runs, 10 loops each) %timeit m.atoms.visibles 12.6 ms ± 857 µs per loop (mean ± std. dev. of 7 runs, 100 loops each) %timeit m.bonds.showns 22.6 ms ± 1.71 ms per loop (mean ± std. dev. of 7 runs, 10 loops each) %timeit m.atoms.num_selected 14.9 ms ± 2.61 ms per loop (mean ± std. dev. of 7 runs, 100 loops each)
... all for essentially the same reason: they're looping over all residues, atoms bonds or atoms (respectively) in the model at each call.
The problem gets compounded when I add ISOLDE's live validation markup, which also needs to know which atoms/bonds are visible (Rama needs to know about backbone atom visibility, Rota display depends on visibility of the CA-CB bond).
What I'd like to suggest is that this information gets cached on the C++ side. That is, AtomicStructure
would provide new methods, e.g. visible_atoms
,'shown_bonds',ribbon_displays
providing the cached data, with any changes with the possibility to change it flagging the need to recalculate on next call. This would avoid any increase of complexity on the Python side - just a trivial adjustment from e.g. m.atoms.visibles
to m.visible_atoms
while vastly cutting down computational cost in these situations - the loop over all objects only gets run once per relevant change, rather than one or more times per coordinate change. Additionally, it would allow for easy future parallelisation of these tasks if you wanted to go down that route.
Change History (11)
comment:1 by , 6 years ago
follow-up: 2 comment:2 by , 6 years ago
I’m talking about the timing of redrawing when you (e.g.) move an atom. Performance when the model is static is just fine.
follow-up: 3 comment:3 by , 6 years ago
Easy illustration: try hiding all but one chain, then using the swapaa mousemode.
comment:4 by , 6 years ago
The thing to look at is why the graphics is updating visible_atoms when just coordinates change. Possibly the code can be improved to not update the list of visible atoms when only a coordinate change happens.
follow-up: 5 comment:5 by , 6 years ago
I thought about that. My concern with the approach is that each object that needs to know about visible atoms will have to independently update its list every time, causing a potentially large hitch in the graphics. Ideally I think you want one central manager that does it once so everything else can just use the result.
comment:6 by , 6 years ago
I don't understand your last comment. visible_atoms is part of AtomsDrawing which is a private class used by Structure. There is no duplication of visible_atoms as far as I know.
follow-up: 7 comment:7 by , 6 years ago
What I mean is, any class that wants to only draw something if some atom(s) or bond(s) are visible currently has to make its own call to atoms.visibles or bonds.showns, and implement its own logic on when/how to update. My Ramachandran, cis peptide and rotamer markups, for example. In these cases I already have that implemented (they only check when the ‘changes’ trigger indicates display or hide events, or added/removed atoms), but even so there’s still duplication of effort. If the structure itself had a call that gave just the visibles without needing to loop over the whole model every time, then I think this would all get both simpler and faster.
comment:8 by , 6 years ago
Ok, so what you want is a Structure.visible_atoms() method that your backbone and rotamer 3d annotations would use and is fast? Even with such a method your code probably wants to know when the visible atoms changed.
The current Atoms.visibles property is C++ and so I am surprised it that is slow. It is just one pass over atoms and then creating a new Atoms collection instance.
Currently AtomsDrawing.visible_atoms is updated only when the graphics is updated. There in general this issue when you are asking about the graphics of whether the property you are asking for is for the last drawn frame, or is it for what would be shown if the frame were drawn now. AtomsDrawing.visible_atoms tells you about the last drawn frame. But Atoms.visible tells you about what would be visible if the frame were drawn now.
In general code that caches state is a massive source of errors and bugs so we are reluctant to add such code unless it is well justified.
comment:9 by , 6 years ago
Well, it *is* fast :m.atoms[m.atoms.visibles]
takes about 13ns per atom, which seems roughly right - but when you're working with 1M atoms, that still adds up to a fair bit of time.
Beyond that, though, my apologies: the timings I gave in the initial post were for the case where everything is flagged as changed. Not really representative of the case where just atoms are moving. To get a better picture of what's happening on a given trigger change, I made some adjustments to triggerset.py
:
class _Trigger: """Keep track of handlers to invoke when activated""" def __init__(self, name, usage_cb, default_one_time, remove_bad_handlers): self._name = name from .orderedset import OrderedSet self._handlers = OrderedSet() # Fire handlers in order they were registered. self._pending_add = set() self._pending_del = set() self._locked = 0 self._blocked = 0 self._need_activate = set() self._need_activate_data = [] self._usage_cb = usage_cb self._default_one_time = default_one_time self._remove_bad_handlers = remove_bad_handlers self._profile_next_run = False self._profile_params = None def profile_next_run(self, sort_by='time', num_entries=20): self._profile_next_run = True self._profile_params = {'sort_by': sort_by, 'num_entries': num_entries} def activate(self, data): if not self._profile_next_run: self._activate(data) return import cProfile, pstats, io params = self._profile_params pr = cProfile.Profile() pr.enable() self._activate(data) pr.disable() s = io.StringIO() ps = pstats.Stats(pr, stream=s).sort_stats(params['sort_by']) ps.print_stats(params['num_entries']) print('Profile for last activation of {} trigger:'.format(self._name)) print(s.getvalue()) self._profile_next_run = False def _activate(self, data): # original, unchanged activate method
Then if I load 4v9o, hide all but chain E5, display that as sticks and select it:
def move_atoms_and_profile_result(selection, trigger): trigger.profile_next_run(sort_by='cumtime') selection.coords += 0 from chimerax.atomic import selected_atoms sel = selected_atoms(session) trigger = session.triggers._triggers['graphics update'] move_atoms_and_profile_result(sel, trigger)
... and the typical result in the log (the first call always seems to take ~twice as long as subsequent ones - no idea why):
Profile for last activation of graphics update trigger: 1191 function calls (1150 primitive calls) in 0.052 seconds Ordered by: cumulative time List reduced from 142 to 20 due to restriction <20> ncalls tottime percall cumtime percall filename:lineno(function) 1 0.000 0.000 0.052 0.052 /opt/UCSF/ChimeraX-daily/lib/python3.7/site-packages/chimerax/core/triggerset.py:201(_activate) 1 0.000 0.000 0.052 0.052 /opt/UCSF/ChimeraX-daily/lib/python3.7/site-packages/chimerax/core/triggerset.py:128(invoke) 1 0.000 0.000 0.052 0.052 /opt/UCSF/ChimeraX-daily/lib/python3.7/site-packages/chimerax/atomic/structure.py:2659(_update_graphics_if_needed) 1 0.000 0.000 0.039 0.039 /opt/UCSF/ChimeraX-daily/lib/python3.7/site-packages/chimerax/atomic/structure.py:303(_update_graphics_if_needed) 1 0.000 0.000 0.038 0.038 /opt/UCSF/ChimeraX-daily/lib/python3.7/site-packages/chimerax/atomic/structure.py:334(_update_graphics) 1 0.000 0.000 0.035 0.035 /opt/UCSF/ChimeraX-daily/lib/python3.7/site-packages/chimerax/atomic/structure.py:1559(_update_ribbon_graphics) 8 0.015 0.002 0.016 0.002 /opt/UCSF/ChimeraX-daily/lib/python3.7/site-packages/chimerax/atomic/molc.py:189(get_prop) 1 0.014 0.014 0.014 0.014 /opt/UCSF/ChimeraX-daily/lib/python3.7/site-packages/chimerax/atomic/molarray.py:596(num_selected) 1 0.000 0.000 0.012 0.012 {built-in method builtins.sum} 2 0.000 0.000 0.012 0.006 /opt/UCSF/ChimeraX-daily/lib/python3.7/site-packages/chimerax/atomic/structure.py:2667(<genexpr>) 6 0.012 0.002 0.012 0.002 /opt/UCSF/ChimeraX-daily/lib/python3.7/site-packages/chimerax/atomic/molc.py:92(get_prop) 2 0.002 0.001 0.003 0.001 /opt/UCSF/ChimeraX-daily/lib/python3.7/site-packages/chimerax/atomic/molc.py:160(get_prop) 25 0.000 0.000 0.002 0.000 /opt/UCSF/ChimeraX-daily/lib/python3.7/site-packages/chimerax/atomic/molc.py:319(pointer) 1 0.000 0.000 0.002 0.002 /opt/UCSF/ChimeraX-daily/lib/python3.7/site-packages/chimerax/atomic/structure.py:343(_update_atom_graphics) 25 0.000 0.000 0.002 0.000 /opt/UCSF/ChimeraX-daily/lib/python3.7/site-packages/numpy/core/_internal.py:291(__init__) 25 0.000 0.000 0.002 0.000 /opt/UCSF/ChimeraX-daily/lib/python3.7/site-packages/numpy/core/_internal.py:274(_get_void_ptr) 11 0.000 0.000 0.001 0.000 /opt/UCSF/ChimeraX-daily/lib/python3.7/site-packages/chimerax/atomic/molarray.py:101(__init__) 4 0.000 0.000 0.001 0.000 /opt/UCSF/ChimeraX-daily/lib/python3.7/site-packages/chimerax/atomic/molarray.py:228(filter) 11 0.000 0.000 0.001 0.000 /opt/UCSF/ChimeraX-daily/lib/python3.7/site-packages/chimerax/atomic/molc.py:236(set_cvec_pointer) 8 0.000 0.000 0.001 0.000 /opt/UCSF/ChimeraX-daily/lib/python3.7/site-packages/chimerax/atomic/molarray.py:1051(__init__)
... so under these circumstances it's really just _update_ribbon_graphics()
that's the time hog - 35 out of the total 38ms for AtomicStructure._update_graphics()
despite no ribbons actually being displayed. Will do some profiling with ISOLDE running to see if I can make improvements there.
An alternative suggestion regarding avoiding duplicating the checks for visible/selected/whatever atoms in different places: what if the "definitive" arrays for these were to be provided in the Changes
object created by the change tracker?
comment:10 by , 6 years ago
Seems another mea culpa is in order. Profiling the AtomicStructure
'changes' trigger with a simulation running immediately picked up that I had Clipper unnecessarily updating all atoms' HIDE_ISOLDE
bits at every coordinate change (it only needs to do that when in "spotlight mode" - i.e. the scrolling sphere of density). After fixing that and turning off the live map recalculation, a typical interactive simulation in 4v9o (3736 atoms visible, 1481 mobile) maintains a quite respectable 15 fps. 43ms on the 'graphics update' trigger (27 of which is _update_ribbon_graphics()
, 7ms for ISOLDE/Clipper's tasks on the 'changes' trigger, ~16ms elsewhere. While some refactoring of _update_ribbon_graphics()
(i.e. filtering up-front to only consider residues with displayed ribbons) could improve it to ~30fps, in all honesty given the stupendous size of this model I'm happy with that as-is.
comment:11 by , 6 years ago
Resolution: | → not a bug |
---|---|
Status: | assigned → closed |
Will open a ticket about optimizing ribbon graphics
When I open 4v90 on my Mac, which is no spring chicken (2012 iMac, GeForce GTX 675MX graphics), I get nearly 40 frames/sec, showing all atoms using full lighting. That's roughly 25ms/frame. The question is why the performance on your machine is more than 4x worse. If I drop to simple lighting, I get the max 60 frames/sec rate.
I do note that if I just sit there doing nothing in ChimeraX, it slowly leaks memory. That probably needs looking into.