Opened 11 years ago

Closed 9 years ago

#69 closed defect (fixed)

Atom spec parsing does not preserve model or chain order

Reported by: Tom Goddard
Owned by: Conrad Huang
Priority: major
Milestone: Alpha Release
Component: Command Line
Version:
Keywords:
Cc: meng@…, pett@…, scooter@…
Blocked By:
Blocking:
Notify when closed:
Platform: all
Project: chimera

Description (last modified by Eric Pettersen)

Tom G.:
Atom spec parsing of #1,2 returns the models in reverse order 2,1. Need to preserve order for command "vop maximum #1,2 scale 1.5,2.5" so the two scale factors are applied to the correct models.

Eric:
Also, chain order is not preserved. The command "mmaker #1/V/W to #2/W/V" matches V to V and W to W despite what is specified in the command, because the atoms/residues/chains given to the command callback are ordered as V then W for *both* specs. Scooter needs this to work for stuff he's doing with RINalyzer, though I don't know if he needs it for the site visit. Maybe he can comment on that.

Change History (13)

comment:1 by Tom Goddard, 10 years ago

Another example, the map fitting command

fit #1.1 #1.2 #1.3 #1.4 #1.5 in #2 seq 3 res 5

is supposed to fit the maps in the order specified but the command gets them in order 1.5, 1.1, 1.4, 1.3, 1.2. The same unexpected order is produced by

fit #1.1-5 in #2 seq 3 res 5

comment:2 by Tom Goddard, 10 years ago

Resolution: fixed
Status: new → closed

Fixed. Order was following Models.list() ordering which came from an unordered Python dictionary.

comment:3 by Tom Goddard, 9 years ago

Resolution: fixed
Status: closed → reopened

Only the case of a spec like #1 #2 #3 was fixed 4 months ago, giving models in order. But the case #1,2,3 does not give the models in order. Elaine reports that

vop subtract #1,3

actually subtracts #3 minus #1 because the atomspec parsing does not preserve model order and the command gets them in a list [model #3, model #1]. The order is in fact scrambled and depends on the dictionary order of models in session.models.

From: Elaine Meng
Subject: Quick Start fixes
Date: July 12, 2016 at 4:20:33 PM PDT
To: Tom Goddard <goddard@…>

Hi Tom,
I was fixing up a couple small things in the Quick Start like the change in “vop zone” syntax. However, I became confused by the “vop subtract” result…. it doesn’t look like map 3 was subtracted from 1. It kind of looks the other way around. (instead of like the figure, it is a mesh in the same place as the groEL molmap). Maybe I messed something up, or did the order of which was subtracted from which change between Chimera1 and ChimeraX? Also it’s a mesh, not the solid surface shown in the image.

So I wasn’t exactly sure how to fix up that part. The command executes now (without the “true”) but result is not as expected, at least from the image.
Elaine

comment:4 by Tom Goddard, 9 years ago

Cc: meng@… added

comment:5 by Eric Pettersen, 9 years ago

Cc: pett@… scooter@… added
Description: modified (diff)
Priority: minor → major
Summary: Atom spec parsing does not preserve model order → Atom spec parsing does not preserve model or chain order

comment:6 by Conrad Huang, 9 years ago

The atomspec comma semantics carried over from Chimera 1. There is no ordering among the comma-separated items, so "#1,2" is exactly the same as "#2,1". The way the atomspec evaluation works now is best explained with an example. Suppose we have models 1, 2 and 3 and we are evaluating "#3,1:12".

  1. The atom spec is parsed as "model (3,1) residue 12"
  2. Because there is only one model entry, we go through the session model list only once.
  3. The session model list is computed from the values of a dictionary and therefore not sorted in any meaningful way.
  4. Each model id is compared against (3,1) and the model is saved if there is a match.
  5. Each saved model is then checked for residue 12.

So it's obvious why order is not preserved for "#3,1:12" and also why order is preserved for "#3:12#1:12" (parsed as "model 3 residue 12, model 1 residue 12" so model list is processed twice).
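The difference between the two cases can be sketched as follows. This is only an illustration of the steps listed above, not the ChimeraX code: the function names are hypothetical, and plain integers stand in for model objects.

```python
# Hypothetical stand-ins for illustration; not the ChimeraX API.

def match_single_pass(session_models, wanted_ids):
    """One pass over the session model list: results follow session order."""
    wanted = set(wanted_ids)
    return [m for m in session_models if m in wanted]


def match_per_specifier(session_models, specifiers):
    """One pass per specifier: results follow the order the specifiers were typed."""
    matched = []
    for wanted_ids in specifiers:
        matched.extend(match_single_pass(session_models, wanted_ids))
    return matched


session_models = [1, 2, 3]  # whatever order the session happens to list them in

# "#3,1:12" is one specifier holding two ids, so the session order wins.
print(match_single_pass(session_models, [3, 1]))        # [1, 3]

# "#3:12#1:12" is two specifiers, so the typed order wins.
print(match_per_specifier(session_models, [[3], [1]]))  # [3, 1]
```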

We can change the semantics so that order is always preserved by expanding the parsed specifier so that comma-separated items get split into multiple specifiers. That is, internally translate "model (3,1) residue 12" to "model 3 residue 12, model 1 residue 12".

Does that sound like the right thing to do?
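A rough sketch of that expansion, assuming the parsed specifier can be reduced to a list of model ids and a list of residue ids. `expand_spec` is a hypothetical helper, not an existing function.

```python
from itertools import product


def expand_spec(model_ids, residue_ids):
    """Split comma-separated lists into one (model, residue) specifier per
    combination, keeping the order in which the items were typed."""
    return list(product(model_ids, residue_ids))


# "model (3,1) residue 12" becomes "model 3 residue 12, model 1 residue 12".
print(expand_spec([3, 1], [12]))  # [(3, 12), (1, 12)]
```

With lists at more than one level, the number of expanded specifiers is the product of the list lengths.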

comment:7 by Eric Pettersen, 9 years ago

I'm afraid you seem to be a bit behind the times on Chimera 1. Commas have implied ordering for eight years now (r26245).

> Does that sound like the right thing to do?

Well, you are free to do what you want since you are writing the code, but my feeling is that that approach is going to get _pretty_ complicated when there are multiple lists of comma-separated items at multiple levels in the atom spec. What Chimera1 does is note the index in the comma-separated list that the model (or whatever) matched to and then sort the results afterward.

--Eric
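The index-and-sort approach described in comment:7 might look roughly like the sketch below. It is not the Chimera 1 code: the helper names are made up, and (model, chain) tuples stand in for real objects.

```python
# Hypothetical sketch of "note the matched index, sort afterward".

def first_index(values):
    """Map each value to the position where it first appears in the typed list."""
    rank = {}
    for i, value in enumerate(values):
        rank.setdefault(value, i)
    return rank


def match_in_spec_order(candidates, model_ids, chain_ids):
    """Filter (model, chain) pairs, then sort them by where each level matched."""
    model_rank = first_index(model_ids)
    chain_rank = first_index(chain_ids)
    hits = [(m, c) for (m, c) in candidates
            if m in model_rank and c in chain_rank]
    return sorted(hits, key=lambda mc: (model_rank[mc[0]], chain_rank[mc[1]]))


candidates = [(1, "V"), (1, "W"), (2, "V"), (2, "W")]

# "#2/W/V" should yield W before V.
print(match_in_spec_order(candidates, [2], ["W", "V"]))  # [(2, 'W'), (2, 'V')]
```

Each extra level of the spec adds one more element to the sort key, so the matching pass itself does not change.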

comment:8 by Conrad Huang, 9 years ago

Okay...

Are ranges ordered too? That is, is "#1-3" the same as "#1,2,3"? Ranges get a little strange when you say ":1-20" and there is a residue "10A".

Conrad

comment:9 by Eric Pettersen, 9 years ago

Ranges are not ordered. If ordering is needed, the range would have to be broken down into a comma-separated list.

Insertion codes add considerable complexity to range testing. In Chimera1, :1-20 includes residue 10A obviously. For ranges that start and/or end with insertion codes, the following logic applies:

1) If the insertion codes are the same (e.g. :8P-77P), then _only_ residues in that range with that insertion code match (i.e. :75A does not match, :65 does not match).

2) If the insertion codes differ (e.g. :100A-100D, or :99A-107) then sequence numbers outside that range don't match, and sequence numbers that match one (or both) ends of the range must have insertion codes equal to or greater (for the beginning of the range) or less than or equal to (for the end of the range) the corresponding end of the range. A blank insertion code is "less than" A.

For ChimeraX I suggest dropping 1) above. It's obscure and confusing. People might (rarely) use a range like :99A-202A and be surprised that it only matches a few (if any) of the residues between the endpoints.
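The two rules in comment:9 can be sketched as below. This is an illustration, not the Chimera 1 code. Residues are modeled as hypothetical (number, insertion code) tuples with "" for a blank code, and `same_code_rule=False` corresponds to dropping rule 1. How a plain range such as :1-20 treats a residue 20A is not stated in the ticket; comparing on number alone in that case is an assumption of this sketch.

```python
# Hypothetical sketch of the insertion-code range rules.

def in_range(residue, start, end, same_code_rule=True):
    """Return True if residue falls in the range start-end."""
    num, code = residue
    (s_num, s_code), (e_num, e_code) = start, end

    # Plain range with no insertion codes: assumed to compare on number only.
    if not s_code and not e_code:
        return s_num <= num <= e_num

    # Rule 1: identical codes at both ends match only residues with that code.
    if same_code_rule and s_code == e_code:
        return s_num <= num <= e_num and code == s_code

    # Rule 2: compare (number, code) pairs; a blank code "" sorts before "A".
    return (s_num, s_code) <= (num, code) <= (e_num, e_code)


print(in_range((10, "A"), (1, ""), (20, "")))      # True:  :1-20 includes 10A
print(in_range((75, "A"), (8, "P"), (77, "P")))    # False: rule 1, wrong code
print(in_range((100, "B"), (100, "A"), (100, "D")))  # True: rule 2
print(in_range((99, ""), (99, "A"), (107, "")))    # False: blank is less than A
```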

comment:10 by goddard@…, 9 years ago

I think ranges should be ordered in ChimeraX even if they are not in Chimera 1.  If I say “shape tube :1-35@CA” I really do want the tube to go through the residues in the order specified.

comment:11 by Eric Pettersen, 9 years ago

Well, saying that ranges are "not ordered" in Chimera 1 was a little misleading. They wind up ordered the same as the lists they were checked against, so for instance residue ranges will be ordered the same as the residues were in the input file, models the same as in the model list, etc.

They certainly weren't sorted though. So if residue 7 followed residue 27 in the input file (say, due to a circular permutation) then :1-100 would have 7 immediately after 27, not in sorted numerical order.

--Eric
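A small sketch of the distinction drawn in comment:11, using made-up residue numbers and a hypothetical `select_range` helper: filtering a list by range keeps the list's own order, which is not the same as numerical order.

```python
# Hypothetical illustration: range selection keeps input-file order.

def select_range(residue_numbers, low, high):
    """Return the residues within low..high, in the order they were given."""
    return [r for r in residue_numbers if low <= r <= high]


file_order = [25, 26, 27, 7, 8, 28]   # residue 7 follows 27 in the file

matched = select_range(file_order, 1, 100)
print(matched)          # [25, 26, 27, 7, 8, 28]
print(sorted(matched))  # [7, 8, 25, 26, 27, 28]
```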

comment:12 by Scooter Morris, 9 years ago

Milestone: Alpha Release

comment:13 by Conrad Huang, 9 years ago

Resolution: fixed
Status: reopened → closed

Fixed in dd03a06c8.
