Opened 6 years ago
Closed 6 years ago
#2931 closed enhancement (fixed)
Built in method to profile trigger firings
Reported by: | Tristan Croll | Owned by: | Eric Pettersen |
---|---|---|---|
Priority: | moderate | Milestone: | |
Component: | Performance | Version: | |
Keywords: | Cc: | chimera-programmers | |
Blocked By: | Blocking: | ||
Notify when closed: | Platform: | all | |
Project: | ChimeraX |
Description
I'm once again doing some profiling (trying to work out why adding an amino acid residue to a large model gets *really* slow when ISOLDE is active - ~6s to extend a chain by one residue in a megadalton complex). About 1 second of this is covered if I %prun the method itself (source code at https://github.com/tristanic/isolde/blob/af87760f3793fe9bb2e42ce0b73c8c91094ec9af/isolde/src/atomic/building/build_utils.py#L110), and the rest is presumably in methods firing in response to the changes
trigger. To track down some similar performance issues in #2355 I hacked the below into triggerset.py
... I'm wondering if it would be worth including something like this permanently - perhaps extended to allow accumulating a profile over the next n triggers, or even to profile all triggers for a set period of time? Seems it would be quite useful as the ChimeraX ecosystem becomes more complex.
class _Trigger: """Keep track of handlers to invoke when activated""" def __init__(self, name, usage_cb, default_one_time, remove_bad_handlers): self._name = name from .orderedset import OrderedSet self._handlers = OrderedSet() # Fire handlers in order they were registered. self._pending_add = set() self._pending_del = set() self._locked = 0 self._blocked = 0 self._need_activate = set() self._need_activate_data = [] self._usage_cb = usage_cb self._default_one_time = default_one_time self._remove_bad_handlers = remove_bad_handlers self._profile_next_run = False self._profile_params = None def profile_next_run(self, sort_by='time', num_entries=20): self._profile_next_run = True self._profile_params = {'sort_by': sort_by, 'num_entries': num_entries} def activate(self, data): if not self._profile_next_run: self._activate(data) return import cProfile, pstats, io params = self._profile_params pr = cProfile.Profile() pr.enable() self._activate(data) pr.disable() s = io.StringIO() ps = pstats.Stats(pr, stream=s).sort_stats(params['sort_by']) ps.print_stats(params['num_entries']) print('Profile for last activation of {} trigger:'.format(self._name)) print(s.getvalue()) self._profile_next_run = False def _activate(self, data): # original, unchanged activate method
Attachments (1)
Change History (8)
comment:1 by , 6 years ago
Component: | Unassigned → Performance |
---|---|
Status: | assigned → accepted |
comment:2 by , 6 years ago
Cc: | added |
---|---|
Resolution: | → fixed |
Status: | accepted → closed |
comment:3 by , 6 years ago
Normally I would, but this came up at about 4pm on a Friday and just made me remember that old snippet - but I didn't actually have it in place in the code. Thanks for adding it! I'll see about adding a few convenience functions to access it from the main TriggerSet
API - as written I believe it's only accessible using hidden methods.
by , 6 years ago
Attachment: | triggerset.py.diff added |
---|
comment:4 by , 6 years ago
Resolution: | fixed |
---|---|
Status: | closed → reopened |
Just attached a diff with an extra method exposing it to the TriggerSet
API. Just call triggers.profile_trigger(name)
.
comment:5 by , 6 years ago
... and after all that, I can't reproduce whatever it was that was making adding residues so slow on my colleague's machine. In a much larger system (~3 MDa) with a similar number of chains, in the 0.92 release build adding a residue to a chain takes about 1.3 seconds, almost all of which is in the method call itself (as long as the ribbon isn't drawn, the changes
trigger only amounts to a couple dozen milliseconds). Will see if the delay persists when he updates his ChimeraX, and if so will profile that particular case.
comment:6 by , 6 years ago
Here's the top few entries for a %prun%
of the call to add_amino_acid_residue()
, for what it's worth:
106791 function calls (96161 primitive calls) in 1.763 seconds Ordered by: internal time ncalls tottime percall cumtime percall filename:lineno(function) 8 0.663 0.083 0.684 0.086 molobject.py:1596(polymers) 1 0.267 0.267 0.267 0.267 molarray.py:665(delete) 43 0.228 0.005 0.229 0.005 molarray.py:598(num_selected) 45 0.100 0.002 0.101 0.002 molc.py:202(set_prop) 4 0.081 0.020 0.083 0.021 symmetry.py:100(whole_residue_sym_sphere)
Suggests at least one thing I could improve immediately: avoid adding the OXT, HXT and H2 atoms from the amino acid template in the first place, so I don't have to then delete them. I'm also curious about the 8 calls to polymers
- I'm not calling this anywhere in this method directly, but is something I'm doing triggering extraneous calls? Will look into it.
comment:7 by , 6 years ago
Resolution: | → fixed |
---|---|
Status: | reopened → closed |
Have added the code from your diff.
There are various C++ things that call the C++ polymers() as a side effect, but the calls you're seeing are going through molobject.py and are therefore being generated somewhere in the Python layer. You could put traceback.print_stack() in molobject.polymers() to see what's calling it.
This change seems useful. I have applied it and it will be in the daily build once the 0.92 release is out.
Just for reference, though the full-code listing you gave is useful, it would be good if you also supplied a diff file. That way I can see the exact differences without having to eyeball back and forth between original code and new code, and I can easily apply the changes with 'patch' instead of typing them by hand.
--Eric