Opened 7 years ago

Closed 7 years ago

Last modified 7 years ago

#1346 closed defect (fixed)

Same model used multiple times in model hierarchy

Reported by: Tristan Croll
Owned by: Eric Pettersen
Priority: major
Milestone:
Component: Input/Output
Version:
Keywords:
Cc: Tom Goddard
Blocked By:
Blocking:
Notify when closed:
Platform: all
Project: ChimeraX

Description

If I do:

ids = '''1a3z 1a4a 1aiz 1cc3 1cuo 1cur 1fwx 1gyc 1iby 1ifp 1iuz 1joi 1kbv 1ov8 1qhq 1qni 1rcy 1rkr 1spf 1uri 1v5u 1vlx 2aan 2aza 2ccw 2cua 2dv6 2iaa 2mry 2n0 2n0m 2plt 2q9o 2qpd 2zoo 3ay2 3ayf 3cg8 3eh5 3fsa 3gyr 3kw8 3msw 3s8f 3s8g 3sbp 3t9w 3tas 3wia 3wkq 3wni 3zbm 4f2f 4gy4 4hcf 4kns 4w1t 4w9z 4ysa 4ysq 4yss 4yst 5akr 5i5m 5tk2 5xnm'''.split()

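from chimerax.core.models import Model
from chimerax.core.commands import open  # note: shadows the builtin open()

# Empty parent model that will hold every fetched structure
model_container = Model('models', session)
session.models.add([model_container])
for pdbid in ids:
    try:
        # open.open() returns a list of models; keep only the first
        m = open.open(session, pdbid)[0]
    except Exception:
        print("Failed on {}".format(pdbid))
        continue
    m.display = False
    # Re-parent the already-open model under the container
    model_container.add([m])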


... then the loop runs crazily slow (after ~15 min it's only up to 2aza), while fetching in a Bash loop using phenix.fetch_pdb is done in ~30 sec.

Change History (12)

comment:1 by Tristan Croll, 7 years ago

Nothing to do with the act of downloading, as such. If I run essentially the same loop over pre-downloaded files:

from glob import glob
for filename in glob('*.pdb'):
    ...

... then it also slows to a crawl. If I instead run ChimeraX *.pdb in the directory, everything's loaded in ~20-30 seconds. But if I then do:

from chimerax.atomic import AtomicStructure
models = session.models.list(type=AtomicStructure)
from chimerax.core.models import Model
container = Model('container', session)
session.models.add([container])

... then the last call adding the empty container takes ~10 seconds, then

container.add([models])

... hangs seemingly indefinitely (with only ~3% CPU usage and no unusual-looking memory use).

comment:2 by Tristan Croll, 7 years ago

Sorry, that last line should obviously be:

container.add(models)

I can do a whole host of processing in between with no worries (in this case, pruning down each model to the specific chain I want) - but that final call gets trapped in some horrendous loop.

comment:3 by Eric Pettersen, 7 years ago

Status: assigned → accepted

comment:4 by Eric Pettersen, 7 years ago

Owner: changed from Eric Pettersen to Tom Goddard
Status: accepted → assigned
Summary: Downloading a list of PDB IDs via a loop runs very slowly → Same model used multiple times in model hierarchy

Possibly this needs to be explicitly prohibited, or to throw an error?

comment:5 by tic20@…, 7 years ago

It used to be that:

model.add([other model(s) already in hierarchy])

... was the accepted way to “transplant” models from one place to the other. Has that changed?

Tristan Croll
Research Fellow
Cambridge Institute for Medical Research
University of Cambridge CB2 0XY

comment:6 by Tom Goddard, 7 years ago

I agree that should move the model in the hierarchy and it used to work. I'll look into it possibly tomorrow.

comment:7 by Eric Pettersen, 7 years ago

All I know is that if you comment out the code that transfers the opened models into the container model, the script runs fast.

—Eric


comment:8 by Tom Goddard, 7 years ago

Owner: changed from Tom Goddard to Eric Pettersen

All the time is taken by Model Panel. I timed 24 models (each 7000 atoms) being moved into a container model using your code from comment 1. Took 37 seconds with Model Panel showing or 0.01 seconds with Model Panel closed.

Commenting out the line in core/models.py which notifies about model id number changes

session.triggers.activate_trigger(MODEL_ID_CHANGED, id_changed_model)

and leaving model panel showing also makes the move take 0.01 seconds. The MODEL_ID_CHANGED trigger is called once for each of the 24 models and for some reason takes incredibly long for Model Panel to process it.

Reassigning to Eric who wrote Model Panel to fix this up.

comment:9 by Tom Goddard, 7 years ago

Interestingly with the MODEL_ID_CHANGED trigger not being fired, when I show the Model Panel "container" sub-models they are all shown with correct ids. Maybe that is because clicking the disclosure triangle to reveal the submodels updates the ids.

comment:10 by Eric Pettersen, 7 years ago



It’s because the open-models notifications are being collated into one table rebuild, and that rebuild has the correct IDs.  I just need to collate MODEL_ID_CHANGED triggers the same way.

—Eric
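The collation idea in comment 10 can be illustrated with a toy example. This is only a sketch in plain Python: `ToyModelPanel`, its handler names, and `flush()` are hypothetical stand-ins, not the actual Model Panel code or ChimeraX trigger API. It compares rebuilding the table on every notification with marking the table dirty and rebuilding once at the end.

```python
class ToyModelPanel:
    """Hypothetical stand-in for a panel that rebuilds a table on model changes."""

    def __init__(self):
        self.rebuild_count = 0
        self._dirty = False

    def on_id_changed_naive(self, model_id):
        # One full table rebuild per notification.
        self._rebuild()

    def on_id_changed_collated(self, model_id):
        # Only record that something changed; defer the expensive work.
        self._dirty = True

    def flush(self):
        # Assumed to be called once per update cycle.
        if self._dirty:
            self._dirty = False
            self._rebuild()

    def _rebuild(self):
        # Placeholder for the expensive table rebuild.
        self.rebuild_count += 1


def move_models(panel, handler_name, n_models):
    """Fire one 'ID changed' notification per model, then end the update cycle."""
    handler = getattr(panel, handler_name)
    for model_id in range(n_models):
        handler(model_id)
    panel.flush()
    return panel.rebuild_count


naive_rebuilds = move_models(ToyModelPanel(), "on_id_changed_naive", 24)
collated_rebuilds = move_models(ToyModelPanel(), "on_id_changed_collated", 24)
print(naive_rebuilds, collated_rebuilds)
```

With 24 notifications, as in the timing test from comment 8, the naive handler rebuilds the table 24 times and the collated one rebuilds it once.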


comment:11 by Eric Pettersen, 7 years ago

Status: assigned → accepted

There were three problems:

1) A geometrically increasing number of "model ID changed" triggers were being fired

2) A model-table rebuild was happening for each trigger

3) Model-table rebuilds were very slow for large numbers of models (> 5 seconds)

I have fixed the first two problems and pushed the fixes, so the script now runs in an acceptable amount of time. I am investigating the third problem.

There is a problem with the script itself: for NMR ensembles it adds only the first model of the ensemble to the container (perhaps that's the desired behavior?), and a container model holding the rest of the ensemble hangs around.

We are discussing whether the API should return the root model of whatever hierarchy is created rather than a list of "data" models.

Last edited 7 years ago by Eric Pettersen

comment:12 by Eric Pettersen, 7 years ago

Cc: Tom Goddard added
Resolution: fixed
Status: accepted → closed

Okay, fixed the slow table rebuild as well (was asking for selected models once per model instead of once per table rebuild).
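The pattern behind that last fix (query once per rebuild rather than once per row) might look roughly like the sketch below. `ToySession`, `selected_models()`, and the two `build_rows_*` functions are hypothetical names used for illustration only, not the real Model Panel or ChimeraX code.

```python
class ToySession:
    """Hypothetical session that counts how often the selection is queried."""

    def __init__(self, models, selected):
        self.models = list(models)
        self._selected = set(selected)
        self.selection_queries = 0

    def selected_models(self):
        # Stand-in for a potentially expensive selection query.
        self.selection_queries += 1
        return set(self._selected)


def build_rows_per_model(session):
    # Slow pattern: one selection query for every table row.
    return [(m, m in session.selected_models()) for m in session.models]


def build_rows_hoisted(session):
    # Fast pattern: query once per rebuild and reuse the result for every row.
    selected = session.selected_models()
    return [(m, m in selected) for m in session.models]


models = ["model{}".format(i) for i in range(24)]
slow = ToySession(models, {"model3"})
fast = ToySession(models, {"model3"})
rows_slow = build_rows_per_model(slow)
rows_fast = build_rows_hoisted(fast)
print(slow.selection_queries, fast.selection_queries)
```

Both functions produce the same rows; only the number of selection queries differs (one per model versus one per rebuild).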

If we change the return value of the open() calls, we will probably send mail out to chimerax-users. No decisions have been made at this point -- very much in the early stages.

Last edited 7 years ago by Eric Pettersen