Opened 5 years ago

Closed 5 years ago

Last modified 5 years ago

#3290 closed defect (fixed)

selected_residues(session) very slow for large models

Reported by: Tristan Croll Owned by: Eric Pettersen
Priority: moderate Milestone: 1.1
Component: Performance Version:
Keywords: Cc: Tom Goddard
Blocked By: Blocking:
Notify when closed: Platform: all
Project: ChimeraX

Description

From #3285. For 3j3y, with all atoms selected:

m = session.models.list()[0]
from chimerax.atomic import selected_residues
%prun selected_residues(session)

The actual culprit is the call to selected_bonds(session, inter_residue=False).

Change History (5)

comment:1 by Tristan Croll, 5 years ago

Sorry - forgot to include the actual result of the %prun call:

         69281069 function calls (69281067 primitive calls) in 492.462 seconds

   Ordered by: internal time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
 12990102  348.029    0.000  348.029    0.000 {built-in method builtins.dir}
 12990102   61.939    0.000  409.968    0.000 attr_registration.py:60(get_attr)
        1   29.662   29.662  491.837  491.837 structure.py:2156(selected_bonds)
 25980219   26.048    0.000   26.048    0.000 {built-in method builtins.hasattr}
 12990102   22.834    0.000  458.850    0.000 attr_registration.py:35(_getattr_)
  4330034    1.945    0.000    1.945    0.000 {chimerax.atomic.cymol.c_ptr_to_py_inst}

comment:2 by Eric Pettersen, 5 years ago

Status: assignedaccepted

comment:3 by Eric Pettersen, 5 years ago

Cc: Tom Goddard added
Component: CorePerformance
Milestone: 1.1

Hi Tristan,

This change occurred because inter-residue bonds need to _not_ select the residues they connect e.g. without the change "sel #1; ~sel protein; color red sel" would not only color bound ligands and glycosylations red, but also the connected protein residue(s), since the inter-residue bond remained selected. I did not see any way to remove inter-residue bonds from a Collection without resorting to a Python loop, though it's possible that I missed some arcane numpy wizardry.
At any rate, that code needs to be moved to C++ to solve this issue, and with the code freeze that means after the 1.0 release. Probably _immediately_ after, as an Atomic 1.1 bundle on the Toolshed.

--Eric

Eric Pettersen
UCSF Computer Graphics Lab

comment:4 by Eric Pettersen, 5 years ago

Resolution: fixed
Status: acceptedclosed

After looking at the code, I'm not looping in Python. The lesson is that instead of this:

numpy.equal(atoms1.residues, atoms2.residues)

(which calls back into Python for the equality test), use this:

atoms1.residues == atoms2.residues

It reduces the time of selected_residues() on 3j3y to about a third of a second. I'll see If I can get clearance from the other programmers to make this seemingly innocuous change in the 1.0 release.

--Eric

comment:5 by Eric Pettersen, 5 years ago

Okay, got clearance -- will be in tomorrow's build.

BTW, the code review was useful because:

atoms1.residues == atoms2.residues

returns a scalar boolean (as noted by Tom), so the correct change is:

atoms1.residues.pointer == atoms2.residues.pointers

Note: See TracTickets for help on using tickets.