Opened 10 years ago
Closed 9 years ago
#175 closed defect (fixed)
2x slower opening 3j3q
Reported by: | Tom Goddard | Owned by: | Tom Goddard |
---|---|---|---|
Priority: | major | Milestone: | |
Component: | Performance | Version: | |
Keywords: | | Cc: | chimera-programmers@… |
Blocked By: | | Blocking: | |
Notify when closed: | | Platform: | all |
Project: | chimera |
Description
Timing opening 3j3q.cif (2.4 million atoms) until it is displayed now takes 19 seconds on my machine, while it used to take 9 seconds. Here are some times from older Chimera 2 versions
9 seconds, July 15
10 s, Sept 16
15 s, Oct 19
19 s, Oct 30
We should try to get it back to 9 seconds.
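For reference, the slowdown relative to the July 15 build can be worked out from the timings listed above. This is only a small arithmetic sketch; the name `open_times` is made up for illustration.

```python
# Open times for 3j3q.cif in seconds, copied from the list above.
open_times = {"July 15": 9, "Sept 16": 10, "Oct 19": 15, "Oct 30": 19}

baseline = open_times["July 15"]

# Ratio of each build's open time to the July 15 baseline.
slowdown = {build: seconds / baseline for build, seconds in open_times.items()}

for build, factor in slowdown.items():
    print(f"{build}: {factor:.2f}x the July 15 time")
```

The Oct 30 build comes out at a little over 2x, which is where the ticket title comes from.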
Attachments (1)
Change History (19)
comment:1 by , 10 years ago
comment:2 by , 10 years ago
Owner: | changed from | to
---|---|
Status: | new → assigned |
Reassigning to Eric to consider C++ tracker increasing open time almost 50%.
Conrad is working on faster ribbons. Should take almost no time if no ribbons are drawn.
Greg can investigate why mmcif is slower.
comment:3 by , 10 years ago
I tried various things to speed up _start_change_tracking and the change clearing, with no luck. However, by using references instead of copies, I was able to reduce the time for getting changes from 2.78 seconds to 0.12 seconds.
AFAICT, further improvements would require lazy evaluation of the 'created' lists, which may not produce any actual speedup in practice -- depending on what the change-tracking trigger handlers actually do in a mature, production-use ChimeraX. I am therefore not likely to do anything further on this before the December release.
--Eric
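To make the "lazy evaluation of the 'created' lists" idea concrete, here is a minimal Python sketch. It is not the ChimeraX change tracker; `LazyCreatedList` and its `producer` argument are hypothetical names used only to show the pattern of deferring the work until a trigger handler actually asks for the list.

```python
class LazyCreatedList:
    """Hypothetical container that builds its list only on first access."""

    def __init__(self, producer):
        self._producer = producer  # zero-argument callable yielding the items
        self._items = None         # nothing computed yet
        self.evaluations = 0       # how many times the producer has run

    def items(self):
        # Build the list the first time it is requested, then reuse it.
        if self._items is None:
            self._items = list(self._producer())
            self.evaluations += 1
        return self._items


created_atoms = LazyCreatedList(lambda: range(5))
first = created_atoms.items()   # producer runs here
second = created_atoms.items()  # cached list is returned
```

If no handler ever calls `items()`, the producer never runs, which is the only case where this saves time. That matches the caveat above: the benefit depends on what the handlers actually do.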
comment:4 by , 10 years ago
Looked at the smart display time. About a third is setting the model color. I don't think that can be sped up. The other two thirds is asking for the number of chains, which causes the chains to get computed. I could put in code to key off the number of atoms for very large cases, but that just pushes the chain computation off to the creation of the changes trigger, so there is no net win in time to open.
Maybe the chain computation could be sped up? If it knew for a fact that certain chains had identical composition of atoms/residues then it might be possible...
comment:5 by , 10 years ago
Component: | Molecular Viewer → Performance |
---|
comment:6 by , 10 years ago
Owner: | changed from | to
---|
I investigated whether make_chains() could be sped up by detecting identical SEQRES/polymeric-chain pairs and reusing the matching instead of redundantly computing it. However, as I suspected, the bulk of the time spent in make_chains() goes to finding the polymeric chains (~0.43 seconds) and not much to the matching (~0.08 seconds), so this optimization is not worth doing.
So ultimately the only optimization I was able to find (auto& instead of auto in the ctypes copying of the changes object) saved roughly 2.66 seconds. I'm switching the ticket over to Greg now so he can look into the slower mmCIF parsing.
--Eric
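The auto versus auto& difference is a C++ matter, but the cost being avoided can be illustrated with a rough Python analogy. This is not the ChimeraX code; `changes` is just a stand-in for a large change list.

```python
changes = list(range(1_000_000))  # stand-in for a large changes object

# Comparable to `auto x = ...` in C++: a new list is allocated and every
# element is copied into it.
by_copy = list(changes)

# Comparable to `auto& x = ...`: a second name for the same object,
# with no per-element work.
by_ref = changes
```

The copy costs time proportional to the size of the list, while the reference costs essentially nothing, which is consistent with the drop from 2.78 to 0.12 seconds reported earlier in this ticket.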
comment:7 by , 10 years ago
Was able to speed up parse_mmCIF_file by 1% with improvements to the generic table reading code. It turns out that the big cost is tokenizing the pdbx_poly_seq_scheme table. In 3j3q that table has 3,758,832 values, and tokenizing them takes ~3.4% of the total time in parse_mmCIF_file: on my computer that is 0.334 seconds out of 10.1 seconds. For comparison, the 7/15/15 build takes 9.54 seconds, and not tokenizing pdbx_poly_seq_scheme gives similar times, so to Python, the time is closer to 0.6 seconds, i.e., ~6%.
The next step is to try to eliminate the need for that table. It is used to map the mmCIF file's internal chain ids to the author's chain ids that ChimeraX keeps.
comment:8 by , 10 years ago
In my tests the mmcif parsing (_mmcif.parse_mmCIF_file) took 6.95 seconds in the July 15 build and 8.50 seconds in the Oct 30 build, an increase in time of 22% (= (8.50 - 6.95) / 6.95). I don't think it is worth worrying about 1% or 3%, but 20% may be worth some trouble. If you do not see the 22% slowdown then we should figure out why your timings differ from mine. Maybe different compiler optimization settings were used, or different compilers between July and October on Mac or on Linux, and that is confounding us.
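The 22% figure can be checked with a couple of lines of Python. Note the parentheses around the difference; without them the division binds first and the result is meaningless. The variable names are illustrative only.

```python
july_parse = 6.95  # seconds, July 15 build
oct_parse = 8.50   # seconds, Oct 30 build

# Relative increase: difference divided by the old value.
increase = (oct_parse - july_parse) / july_parse

# Without parentheses the division happens first: 8.50 - (6.95 / 6.95).
unparenthesized = oct_parse - july_parse / july_parse

print(f"relative increase: {increase:.1%}")
```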
comment:9 by , 10 years ago
After discussion with Eric and TomG, we decided to defer more parse_mmCIF_file optimization until we can discuss what we want to do at a chimera meeting, hopefully at a time when there are more use cases.
comment:11 by , 10 years ago
I changed change-tracking so that newly created structures don't put all their sub-components (atoms, bonds, residues, etc.) on the change-tracking created lists. This speeds 3j3q up by about a third of a second. The change-tracking trigger API will also be changing to make it simpler to get, for example, all created atoms, or just atoms created in already existing structures and so forth. So there will be functions to call to retrieve data instead of the trigger user looking into and sifting the data directly.
I also made changes that speed up *closing* 3j3q by several seconds.
comment:13 by , 10 years ago
11.96 seconds, including draw time, on my office iMac descartes (like other timings in this ticket).
time open 3j3q
Opened mmCIF data containing 2440800 atoms and 2497752 bonds
command time 11.66 seconds
draw time 0.3048 seconds
comment:14 by , 10 years ago
Owner: | changed from | to
---|
Reassigning to Conrad, so he'll double check that it doesn't take 2.7 seconds to not create ribbons.
comment:15 by , 9 years ago
Resolution: | → fixed |
---|---|
Status: | assigned → closed |
The _create_ribbon_graphics method now shortcuts to return quickly when no ribbon is being computed. In particular, the polymer chains are not computed.
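The shortcut follows the usual early-return pattern. The sketch below is not the actual _create_ribbon_graphics code; `create_ribbon_graphics`, `compute_polymers` and the `calls` counter are hypothetical stand-ins that show the expensive step being skipped when nothing is displayed as ribbon.

```python
calls = {"polymers": 0}  # counts how often the expensive step runs


def compute_polymers(residues):
    """Stand-in for the expensive polymer chain computation."""
    calls["polymers"] += 1
    return [residues]  # pretend all residues form a single chain


def create_ribbon_graphics(residues, ribbon_shown):
    # Early return: if no residue is shown as ribbon, do no further work.
    if not any(ribbon_shown):
        return []
    polymers = compute_polymers(residues)
    # Keep only the residues flagged for ribbon display in each chain.
    return [[r for r, shown in zip(chain, ribbon_shown) if shown]
            for chain in polymers]


nothing = create_ribbon_graphics(["A", "B"], [False, False])
calls_after_shortcut = calls["polymers"]
something = create_ribbon_graphics(["A", "B"], [True, False])
```

Placing the check before the chain computation is the point: the cost of the no-ribbon case then depends only on scanning the display flags.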
comment:16 by , 9 years ago
Resolution: | fixed |
---|---|
Status: | closed → reopened |
It has been 6 months since our last timings were done. Need to re-run "time open 3j3q" to make sure there are no surprises.
comment:17 by , 9 years ago
Owner: | changed from | to
---|---|
Status: | reopened → assigned |
comment:18 by , 9 years ago
Resolution: | → fixed |
---|---|
Status: | assigned → closed |
Current times to open 3j3q on my iMac (Mac OS 10.12.2) are 9.9 seconds with smart initial display, and 9.8 seconds without smart initial display. The time from the July 15, 2015 build (~18 months ago) was 8.5 seconds without smart initial display. When this bug was reported 14 months ago the time was 18 seconds. So we have regained most of the speed. We are still 15% slower than 18 months ago, but so much has changed in that time it would be very hard to track down at this point.
I investigated why current ChimeraX takes 18 seconds to display 3j3q while ChimeraX from July 15, 2015 displayed it in 8.5 seconds, twice as fast. Here is where 9.1 seconds of the extra 9.5 seconds was taken:
C++ change tracker 4.1 seconds (no change tracking in July build)
Ribbon creation 2.7 seconds (no ribbon calc in July build)
Slower mmcif parsing 1.5 seconds
Smart initial display 0.8 seconds (no smart display in July build)
Here is the data from print statements used to derive the above
Today's daily build:
open 3j3q 18.26 sec
July 15, 2015 ChimeraX:
op mmcif:3j3q
parse mmcif 6.95
create atomic model 1.20
open command 8.38
View.draw 0.03
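The breakdown above can be cross-checked against the overall difference. This is just arithmetic on the numbers quoted in this comment; the dictionary name is made up for illustration.

```python
# Extra time per cause in seconds, as listed in the breakdown above.
extra = {
    "C++ change tracker": 4.1,
    "Ribbon creation": 2.7,
    "Slower mmcif parsing": 1.5,
    "Smart initial display": 0.8,
}

accounted = sum(extra.values())   # time explained by the four causes
total_extra = 18.0 - 8.5          # current open time minus July 15 open time
unexplained = total_extra - accounted

print(f"accounted {accounted:.1f} s of {total_extra:.1f} s, "
      f"unexplained {unexplained:.1f} s")
```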