Opened 6 years ago

Closed 6 years ago

#2355 closed defect (not a bug)

Suggestion: track visible atoms/ribbons/bonds on C++ side

Reported by: Tristan Croll Owned by: pett
Priority: moderate Milestone:
Component: Core Version:
Keywords: Cc: Tom Goddard
Blocked By: Blocking:
Notify when closed: Platform: all
Project: ChimeraX

Description

For big models, there's currently a fairly large fixed overhead to redrawing even when very few atoms are shown. For 4v9o (four ribosomes, just under a million atoms), if I set the lighting to simple and hide all but one atom, then if m is the model:

%timeit m._update_graphics()
104 ms ± 16.7 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

Most of this can be explained by just four calls:

%timeit m.chain_trace_atoms()
29.7 ms ± 2.12 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
%timeit m.atoms.visibles
12.6 ms ± 857 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
%timeit m.bonds.showns
22.6 ms ± 1.71 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
%timeit m.atoms.num_selected
14.9 ms ± 2.61 ms per loop (mean ± std. dev. of 7 runs, 100 loops each)

... all for essentially the same reason: they're looping over all residues, atoms bonds or atoms (respectively) in the model at each call.

The problem gets compounded when I add ISOLDE's live validation markup, which also needs to know which atoms/bonds are visible (Rama needs to know about backbone atom visibility, Rota display depends on visibility of the CA-CB bond).

What I'd like to suggest is that this information gets cached on the C++ side. That is, AtomicStructure would provide new methods, e.g. visible_atoms,'shown_bonds',ribbon_displays providing the cached data, with any changes with the possibility to change it flagging the need to recalculate on next call. This would avoid any increase of complexity on the Python side - just a trivial adjustment from e.g. m.atoms.visibles to m.visible_atoms while vastly cutting down computational cost in these situations - the loop over all objects only gets run once per relevant change, rather than one or more times per coordinate change. Additionally, it would allow for easy future parallelisation of these tasks if you wanted to go down that route.

Change History (11)

comment:1 by pett, 6 years ago

When I open 4v90 on my Mac, which is no spring chicken (2012 iMac, GeForce GTX 675MX graphics), I get nearly 40 frames/sec, showing all atoms using full lighting. That's roughly 25ms/frame. The question is why the performance on your machine is more than 4x worse. If I drop to simple lighting, I get the max 60 frames/sec rate.

I do note that if I just sit there doing nothing in ChimeraX, it slowly leaks memory. That probably needs looking into.

in reply to:  2 ; comment:2 by Tristan Croll, 6 years ago

I’m talking about the timing of redrawing when you (e.g.) move an atom. Performance when the model is static is just fine.
 

 


in reply to:  3 ; comment:3 by Tristan Croll, 6 years ago

Easy illustration: try hiding all but one chain, then using the swapaa mousemode. 
 


comment:4 by Tom Goddard, 6 years ago

The thing to look at is why the graphics is updating visible_atoms when just coordinates change. Possibly the code can be improved to not update the list of visible atoms when only a coordinate change happens.

in reply to:  5 ; comment:5 by Tristan Croll, 6 years ago

I thought about that. My concern with the approach is that each object that needs to know about visible atoms will have to independently update its list every time, causing a potentially large hitch in the graphics. Ideally I think you want one central manager that does it once so everything else can just use the result. 
 


comment:6 by Tom Goddard, 6 years ago

I don't understand your last comment. visible_atoms is part of AtomsDrawing which is a private class used by Structure. There is no duplication of visible_atoms as far as I know.

in reply to:  7 ; comment:7 by Tristan Croll, 6 years ago

What I mean is, any class that wants to only draw something if some atom(s) or bond(s) are visible currently has to make its own call to atoms.visibles or bonds.showns, and implement its own logic on when/how to update. My Ramachandran, cis peptide and rotamer markups, for example. In these cases I already have that implemented (they only check when the ‘changes’ trigger indicates display or hide events, or added/removed atoms), but even so there’s still duplication of effort. If the structure itself had a call that gave just the visibles without needing to loop over the whole model every time, then I think this would all get both simpler and faster.
 

 


comment:8 by Tom Goddard, 6 years ago

Ok, so what you want is a Structure.visible_atoms() method that your backbone and rotamer 3d annotations would use and is fast? Even with such a method your code probably wants to know when the visible atoms changed.

The current Atoms.visibles property is C++ and so I am surprised it that is slow. It is just one pass over atoms and then creating a new Atoms collection instance.

Currently AtomsDrawing.visible_atoms is updated only when the graphics is updated. There in general this issue when you are asking about the graphics of whether the property you are asking for is for the last drawn frame, or is it for what would be shown if the frame were drawn now. AtomsDrawing.visible_atoms tells you about the last drawn frame. But Atoms.visible tells you about what would be visible if the frame were drawn now.

In general code that caches state is a massive source of errors and bugs so we are reluctant to add such code unless it is well justified.

comment:9 by Tristan Croll, 6 years ago

Well, it *is* fast :m.atoms[m.atoms.visibles] takes about 13ns per atom, which seems roughly right - but when you're working with 1M atoms, that still adds up to a fair bit of time.

Beyond that, though, my apologies: the timings I gave in the initial post were for the case where everything is flagged as changed. Not really representative of the case where just atoms are moving. To get a better picture of what's happening on a given trigger change, I made some adjustments to triggerset.py:

class _Trigger:
    """Keep track of handlers to invoke when activated"""

    def __init__(self, name, usage_cb, default_one_time, remove_bad_handlers):
        self._name = name
        from .orderedset import OrderedSet
        self._handlers = OrderedSet()	# Fire handlers in order they were registered.
        self._pending_add = set()
        self._pending_del = set()
        self._locked = 0
        self._blocked = 0
        self._need_activate = set()
        self._need_activate_data = []
        self._usage_cb = usage_cb
        self._default_one_time = default_one_time
        self._remove_bad_handlers = remove_bad_handlers
        self._profile_next_run = False
        self._profile_params = None

    def profile_next_run(self, sort_by='time', num_entries=20):
        self._profile_next_run = True
        self._profile_params = {'sort_by': sort_by, 'num_entries': num_entries}

    def activate(self, data):
        if not self._profile_next_run:
            self._activate(data)
            return
        import cProfile, pstats, io
        params = self._profile_params
        pr = cProfile.Profile()
        pr.enable()
        self._activate(data)
        pr.disable()
        s = io.StringIO()
        ps = pstats.Stats(pr, stream=s).sort_stats(params['sort_by'])
        ps.print_stats(params['num_entries'])
        print('Profile for last activation of {} trigger:'.format(self._name))
        print(s.getvalue())
        self._profile_next_run = False


    def _activate(self, data):
        # original, unchanged activate method

Then if I load 4v9o, hide all but chain E5, display that as sticks and select it:

def move_atoms_and_profile_result(selection, trigger):
    trigger.profile_next_run(sort_by='cumtime')
    selection.coords += 0

from chimerax.atomic import selected_atoms
sel = selected_atoms(session)
trigger = session.triggers._triggers['graphics update']
move_atoms_and_profile_result(sel, trigger)

... and the typical result in the log (the first call always seems to take ~twice as long as subsequent ones - no idea why):

Profile for last activation of graphics update trigger:
1191 function calls (1150 primitive calls) in 0.052 seconds

Ordered by: cumulative time
List reduced from 142 to 20 due to restriction <20>

ncalls tottime percall cumtime percall filename:lineno(function)
1 0.000 0.000 0.052 0.052 /opt/UCSF/ChimeraX-daily/lib/python3.7/site-packages/chimerax/core/triggerset.py:201(_activate)
1 0.000 0.000 0.052 0.052 /opt/UCSF/ChimeraX-daily/lib/python3.7/site-packages/chimerax/core/triggerset.py:128(invoke)
1 0.000 0.000 0.052 0.052 /opt/UCSF/ChimeraX-daily/lib/python3.7/site-packages/chimerax/atomic/structure.py:2659(_update_graphics_if_needed)
1 0.000 0.000 0.039 0.039 /opt/UCSF/ChimeraX-daily/lib/python3.7/site-packages/chimerax/atomic/structure.py:303(_update_graphics_if_needed)
1 0.000 0.000 0.038 0.038 /opt/UCSF/ChimeraX-daily/lib/python3.7/site-packages/chimerax/atomic/structure.py:334(_update_graphics)
1 0.000 0.000 0.035 0.035 /opt/UCSF/ChimeraX-daily/lib/python3.7/site-packages/chimerax/atomic/structure.py:1559(_update_ribbon_graphics)
8 0.015 0.002 0.016 0.002 /opt/UCSF/ChimeraX-daily/lib/python3.7/site-packages/chimerax/atomic/molc.py:189(get_prop)
1 0.014 0.014 0.014 0.014 /opt/UCSF/ChimeraX-daily/lib/python3.7/site-packages/chimerax/atomic/molarray.py:596(num_selected)
1 0.000 0.000 0.012 0.012 {built-in method builtins.sum}
2 0.000 0.000 0.012 0.006 /opt/UCSF/ChimeraX-daily/lib/python3.7/site-packages/chimerax/atomic/structure.py:2667(<genexpr>)
6 0.012 0.002 0.012 0.002 /opt/UCSF/ChimeraX-daily/lib/python3.7/site-packages/chimerax/atomic/molc.py:92(get_prop)
2 0.002 0.001 0.003 0.001 /opt/UCSF/ChimeraX-daily/lib/python3.7/site-packages/chimerax/atomic/molc.py:160(get_prop)
25 0.000 0.000 0.002 0.000 /opt/UCSF/ChimeraX-daily/lib/python3.7/site-packages/chimerax/atomic/molc.py:319(pointer)
1 0.000 0.000 0.002 0.002 /opt/UCSF/ChimeraX-daily/lib/python3.7/site-packages/chimerax/atomic/structure.py:343(_update_atom_graphics)
25 0.000 0.000 0.002 0.000 /opt/UCSF/ChimeraX-daily/lib/python3.7/site-packages/numpy/core/_internal.py:291(__init__)
25 0.000 0.000 0.002 0.000 /opt/UCSF/ChimeraX-daily/lib/python3.7/site-packages/numpy/core/_internal.py:274(_get_void_ptr)
11 0.000 0.000 0.001 0.000 /opt/UCSF/ChimeraX-daily/lib/python3.7/site-packages/chimerax/atomic/molarray.py:101(__init__)
4 0.000 0.000 0.001 0.000 /opt/UCSF/ChimeraX-daily/lib/python3.7/site-packages/chimerax/atomic/molarray.py:228(filter)
11 0.000 0.000 0.001 0.000 /opt/UCSF/ChimeraX-daily/lib/python3.7/site-packages/chimerax/atomic/molc.py:236(set_cvec_pointer)
8 0.000 0.000 0.001 0.000 /opt/UCSF/ChimeraX-daily/lib/python3.7/site-packages/chimerax/atomic/molarray.py:1051(__init__)

... so under these circumstances it's really just _update_ribbon_graphics() that's the time hog - 35 out of the total 38ms for AtomicStructure._update_graphics() despite no ribbons actually being displayed. Will do some profiling with ISOLDE running to see if I can make improvements there.

An alternative suggestion regarding avoiding duplicating the checks for visible/selected/whatever atoms in different places: what if the "definitive" arrays for these were to be provided in the Changes object created by the change tracker?

comment:10 by Tristan Croll, 6 years ago

Seems another mea culpa is in order. Profiling the AtomicStructure 'changes' trigger with a simulation running immediately picked up that I had Clipper unnecessarily updating all atoms' HIDE_ISOLDE bits at every coordinate change (it only needs to do that when in "spotlight mode" - i.e. the scrolling sphere of density). After fixing that and turning off the live map recalculation, a typical interactive simulation in 4v9o (3736 atoms visible, 1481 mobile) maintains a quite respectable 15 fps. 43ms on the 'graphics update' trigger (27 of which is _update_ribbon_graphics(), 7ms for ISOLDE/Clipper's tasks on the 'changes' trigger, ~16ms elsewhere. While some refactoring of _update_ribbon_graphics() (i.e. filtering up-front to only consider residues with displayed ribbons) could improve it to ~30fps, in all honesty given the stupendous size of this model I'm happy with that as-is.

comment:11 by pett, 6 years ago

Resolution: not a bug
Status: assignedclosed

Will open a ticket about optimizing ribbon graphics

Note: See TracTickets for help on using tickets.