Opened 6 years ago
Closed 6 years ago
#1931 closed defect (fixed)
Atom spec parsing consumes torsion command angle argument
Reported by: | Owned by: | Conrad Huang | |
---|---|---|---|
Priority: | normal | Milestone: | 1.0 |
Component: | Command Line | Version: | |
Keywords: | Cc: | pett, Elaine Meng | |
Blocked By: | Blocking: | ||
Notify when closed: | Platform: | all | |
Project: | ChimeraX |
Description
The following bug report has been submitted: Platform: Darwin-18.5.0-x86_64-i386-64bit ChimeraX Version: 0.9 (2019-05-09) Description open 5c1m torsion /A:84@CD1,CG,CB,CA -109 Must specify exactly 4 atoms for 'torsion' command; you specified 3 torsion /A:84@CD1,CG,CB,CA -109.0 Atom spec parsing gobbles the angle argument in the torsion command if the angle is a negative integer and results in an error where only 3 atoms of the 4 specified in the torsion command are recognized. Works correctly if the angle has a decimal point. Atom spec parsing should not consume the extra argument. Log: UCSF ChimeraX version: 0.9 (2019-05-09) © 2016-2019 Regents of the University of California. All rights reserved. How to cite UCSF ChimeraX > open 5c1m format mmCIF fromDatabase pdb Summary of feedback from opening 5c1m fetched from pdb --- warning | Cannot find consistent set of bond orders for ring system containing atom CAV in residue 4VO /A:401 5c1m title: Crystal structure of active mu-opioid receptor bound to the agonist BU72 [more info...] Chain information for 5c1m #1 --- Chain | Description A | Mu-type opioid receptor B | Nanobody 39 Non-standard residues in 5c1m #1 --- 4VO — (2S,3S,3aR,5aR,6R,11bR,11cS)-3a-methoxy-3,14-dimethyl-2-phenyl-2,3,3a,6,7,11c-hexahydro-1H-6,11b-(epiminoethano)-3,5a-methanonaphtho[2,1-g]indol-10-ol (BU72) CLR — cholesterol OLC — (2R)-2,3-dihydroxypropyl (9Z)-octadec-9-enoate (1-Oleoyl-R-glycerol) P6G — hexaethylene glycol (polyethylene glycol PEG400) PO4 — phosphate ion > torsion /A:84@CD1,CG,CB,CA -109 Must specify exactly 4 atoms for 'torsion' command; you specified 3 > torsion /A:84@CD1,CG,CB,CA -109.0 OpenGL version: 4.1 ATI-2.8.38 OpenGL renderer: AMD Radeon Pro 580 OpenGL Engine OpenGL vendor: ATI Technologies Inc.
Change History (15)
comment:1 by , 6 years ago
Component: | Unassigned → Command Line |
---|---|
Owner: | set to |
Platform: | → all |
Project: | → ChimeraX |
Status: | new → assigned |
Summary: | ChimeraX bug report submission → Atom spec parsing consumes torsion command angle argument |
comment:2 by , 6 years ago
comment:3 by , 6 years ago
Cc: | added |
---|
I don't grasp how the "-109" is being interpreted. It does not seem to be taken as part of the atom name since "109" without the minus sign works.
torsion /A:84@CD1,CG,CB,CA 109
-> works
torsion /A:84@CD1,CG,CB,CA -109
-> fails
select /A:84@CD1,CG,CB,CA -109
-> selects first 3 atoms but not the CA atom.
select /A:84@CD1,CG,CB,CA 109
-> Expected a keyword with 109 higlighted in red.
select -109
-> Expected an objects specifier or a keyword with -109 highlighted in red
And because there is a space and no quotation marks the -109 should have to be an independent atom spec.
comment:4 by , 6 years ago
This seems like a possible failure of the range handling (e.g :15-354) use of "-".
comment:5 by , 6 years ago
It's a "double failure" in that AFAIK we don't allow ranges of atom names in ChimeraX
comment:6 by , 6 years ago
Actually, we allow ranges at all levels syntactically, and then the code for that level (model, residue, atom) decides what to do about the ranges. Right now, at the atom level, they are treated as an alphanumeric range, e.g., CA-CD typically includes CA, CB, CD but not CE or CG.
The generic nature of the parsing explains why "torsion /A:84@CD1,CG,CB,CA -109" is treated as one atomspec (CA-109 is the last atom range), while "torsion /A:84@CD1,CG,CB,CA 109" is treated as an atomspec followed by something else.
If you want to eliminate atom ranges, we would need to change a lot of code. And unlike "color", I don't think of "torsion" as a verb. So "torsion /A:84@CD1,CG,CB,CA set -109" isn't that bad to me.
comment:7 by , 6 years ago
Cc: | added |
---|
I don't what the rules are for allowing spaces within atom specs, but allowing them around the '-' in a range is just problematic. A very common name for a chloride ion is "@CL-"; if the text that follows that atom name gets glommed into a mostly useless atom name range, that is not good.
The simplest fix is that the character that follows a space has to be one of the level indictors (#/:@) to be considered part of the preceding atom spec. That would have the downside of not allowing spaces after commas, but I could live with that.
My alternative solution would be to not allow spaces around the '-' of a range, and also throwing an error if an atom-name range is detected.
follow-up: 8 comment:8 by , 6 years ago
Many people are conditioned by normal English to add spaces after commas, but not hyphens. Since space after comma is such a frequent atomspec error (I remember explaining it repeatedly, as it’s practically finger memory for many people), I’d vote for your second suggestion of not allowing spaces around ‘-‘ of ranges.
comment:9 by , 6 years ago
Right now, commas may be followed by spaces. I think we fixed that very early on after the decision that "whitespace should not be significant" won. Are we proposing that "whitespace should not be allowed around hyphen"?
A completely separate issue is whether atom name ranges should be allowed. If not, we can treat "CL-" as an atom name. "torsion /A:84@CD1,CG,CB,CA -109" would also work. The down side is that it involves a significant implementation change to the atomspec grammar. If that's what is needed, so be it. But it won't get done soon.
comment:10 by , 6 years ago
If the easiest thing is to disallow spaces around hyphens to avoid mis-parsing negative numbers and atom names that end in '-', then that's what I'd run with. If you want to be super fancy, you could make the rule be: don't allow space on only one side of a hyphen (which would allow "57-59" and " 57 - 59" but not "57 -59"), but that would be up to you.
comment:11 by , 6 years ago
Are you proposing that hyphens for all levels (atom, residue, model) be treated this way? All levels share the same code right now, so customizing a single level is problematic.
comment:12 by , 6 years ago
Yes, absolutely, all levels. No way we want to document that "only for _atom_ ranges are you disallowed from having spaces adjacent to the dash". Not to mention that this exact same problem can come into play at the other levels too.
comment:13 by , 6 years ago
Milestone: | → 1.0 |
---|---|
Status: | assigned → accepted |
6fa355afa disallows spaces around hyphens in atomspecs.
Will still need to disallow atom name ranges.
follow-up: 14 comment:14 by , 6 years ago
Documentation only describes the use of ranges for model numbers, residue numbers, and chains (with alphabetical ordering for the latter). So we never “said” atom ranges could be used. For atom names, there isn’t a sensible ordering anyway, since one would really want them in greek letter order: alpha, beta, gamma, delta (CA,CB,CG*,CD* …) and I highly doubt that is implemented. I’ll add the information that hyphens can’t have spaces next to them.
comment:15 by , 6 years ago
Resolution: | → fixed |
---|---|
Status: | accepted → closed |
Fixed in ad8ad0bf6.
Atom name ranges are no longer allowed.
There's no way to parse that correctly without specifying in the command description that "109" cannot be an atom name. Putting in heuristics to guess that 109 is not an atom name seems wrong. Most unambiguous solution is to require a keyword before the angle parameter.