Opened 12 months ago
Last modified 12 months ago
#16249 assigned defect
Similar Structures sequence plot: weirdness coloring with context menu
| Reported by: | Elaine Meng | Owned by: | Tom Goddard | 
|---|---|---|---|
| Priority: | high | Milestone: | |
| Component: | Structure Analysis | Version: | |
| Keywords: | Cc: | ||
| Blocked By: | Blocking: | ||
| Notify when closed: | Platform: | all | |
| Project: | ChimeraX | 
Description
If I use the context menu in the sequence plot to turn on coloring the plot by lddt, it turns off (unchecks) coloring by conservation, which makes sense.  However, if I then turn on (check) coloring by conservation, they are both checked and the coloring is some indecipherable mishmash.  I was trying to figure out the "similar sequences" command options logic and I really don't understand it.  Does conservationColoring true (with default colors) mean the white-black-aqua-gold scheme, with lddt coloring then supposed to completely replace the black-aqua-gold pixels with its own palette?  Why aren't they mutually exclusive?  I understand white stays white because that means there is no aligned residue.
Change History (6)
comment:1 by , 12 months ago
comment:2 by , 12 months ago
It is allowed to use both "Color conserved" and "Color by LDDT" at the same time, and in that case the conserved coloring (aqua for identity and gold for conserved, but not black for aligned) appears on top of the LDDT coloring.  I agree it looks horrible.  It looked better before Scooter had me change the conserved and identity colors for the color-blind.  When the colors were more distinct I liked the overlay because it allowed me to see the conserved positions relative to the good structural match regions (LDDT).  Let me think how I can fix this.  Maybe I could fade the LDDT coloring when both are shown.
Regarding the option names identityColor and conservedColor, identity color means that the similar structure residue is the same amino acid type as the query.  "Identity" seems the right word to describe that.  The conservedColor is used for residue positions where more than 50% of similar structures that have that residue aligned are the same amino acid type.  Again that seems to be well described by "conserved".  I see your confusion in that "conserved" is based on the amino acids in an entire column, while "identity" is for each similar structure at each column.  If you have a better suggestion for the names tell me.  I'm not greatly concerned with those option names because I expect they will almost never be used.
comment:3 by , 12 months ago
I didn't expect you to change the option names, it is more of a complaint that I can't understand what they mean despite spending a long time experimenting and squinting at the plot. So the "identity" one is the same amino acid type as the query. Does it depend on anything else in the column? Is the "conservation" one also same amino acid type as the query, or could it also be a different amino acid type than the query? I had this explanation in the tool docs and now I think it is wrong:
[white] – no aligned residue
[black] – aligned residue of a different amino acid type than the query
[aqua] – aligned residue of the same amino acid type as the query, in column where <0.5 of the residues are the same type
[gold] – aligned residue of the same amino acid type as the query, in column where ≥0.5 of the residues are the same type
Now I'm thinking
[white] – no aligned residue
[black] – aligned residue of a different amino acid type than the query
[aqua] – aligned residue of the same amino acid type as the query (regardless of any other residues in that column??)
[gold] – residue same type as ≥0.5 of the other residues in the column
... but then what happens when both the aqua and gold conditions are true?
comment:4 by , 12 months ago
There are 4 situations I am unclear on, all in a highly conserved column (where over half of queries have same residue type as each other), which is the 2x2 matrix of query having or not having the conserved type vs. the specific hit residue having or not having the conserved type.  Is the whole column gold, or do you get some black and aqua in it?
And then when you also have lddt coloring turned on, the aqua and gold pixels would win out, but all the black positions would get the lddt coloring?
comment:5 by , 12 months ago
Oops.  I didn't even describe it right in my previous message on this ticket.  If menu entry "color conserved" is used then positions are gold if the similar structure amino acid type matches the query, at least 50% of the similar structures that are aligned at that position also match the query, and 10 or more similar structures are aligned at that position.  Otherwise, if the 50% or 10-aligned conditions don't hold but the amino acid type matches, it is aqua.  And if the amino acid type does not match it is black, but black is not shown if LDDT is also shown.  So this differs from what I said before in two ways: the query has to have the amino acid type that is 50% conserved, and there is a minimum of 10 structures aligned at the position.
What I see in the plot matches this.  There is never gold and aqua in the same column.  Basically gold replaces aqua if the 50% and min 10 condition applies.
I'll add this info to my similar structures web page.
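For anyone reading along, the rule described in this comment can be sketched in Python as below. This is only an illustration of the logic as stated here, not the ChimeraX source: the function name, argument names, and the "lddt" return value (meaning the LDDT palette shows through) are all made up for the example.

```python
from typing import Optional, Sequence

MIN_ALIGNED = 10          # minimum aligned structures at a position (from this comment)
CONSERVED_FRACTION = 0.5  # "at least 50%" threshold (from this comment)


def plot_color(query_aa: str,
               hit_aa: Optional[str],
               column: Sequence[Optional[str]],
               lddt_shown: bool = False) -> str:
    """Hypothetical helper: color of one plot pixel under "color conserved".

    column holds the residue type of every similar structure at this
    position, with None where a structure has no aligned residue.
    """
    if hit_aa is None:
        return "white"    # no aligned residue
    if hit_aa != query_aa:
        # Black is suppressed when LDDT coloring is also shown.
        return "lddt" if lddt_shown else "black"
    aligned = [aa for aa in column if aa is not None]
    matching = sum(1 for aa in aligned if aa == query_aa)
    if len(aligned) >= MIN_ALIGNED and matching >= CONSERVED_FRACTION * len(aligned):
        return "gold"     # matches query, and column is conserved relative to the query
    return "aqua"         # matches query, but the 50% or 10-aligned condition fails
```

Because the gold test depends only on the query type and the column, every query-matching pixel in a column gets the same answer, which is why gold and aqua never appear together in one column.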
comment:6 by , 12 months ago
One more point.  The sequence conservation residue attribute that is set for query residues when you use Sequence Plot menu entry "color query structure by conservation" defines the conservation percentage using the number of aligned sequences that match the most common amino acid type at that position (i.e. the consensus amino acid type), which may be different from the query amino acid type.  So this conservation value does not match what is used in the "Color conserved" plot coloring, which counts the sequences with amino acid type matching the query, since the query may not have the consensus amino acid type.
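To make the difference between the two measures concrete, here is a small Python sketch. It assumes both are fractions of the structures aligned at the position (the denominator is not spelled out above), and both function names are invented for illustration rather than taken from ChimeraX.

```python
from collections import Counter
from typing import Optional, Sequence


def consensus_fraction(column: Sequence[Optional[str]]) -> float:
    """Fraction of aligned residues with the most common (consensus) type."""
    aligned = [aa for aa in column if aa is not None]
    if not aligned:
        return 0.0
    top_count = Counter(aligned).most_common(1)[0][1]
    return top_count / len(aligned)


def query_match_fraction(query_aa: str, column: Sequence[Optional[str]]) -> float:
    """Fraction of aligned residues with the same type as the query."""
    aligned = [aa for aa in column if aa is not None]
    if not aligned:
        return 0.0
    return sum(1 for aa in aligned if aa == query_aa) / len(aligned)


# Column where the consensus is L but the query residue is V.
column = ["L"] * 7 + ["V"] * 3
print(consensus_fraction(column))         # attribute-style value, based on consensus
print(query_match_fraction("V", column))  # plot-style value, based on the query
```

In this example the consensus-based value is high while the query-based value is below the 50% threshold, so the query residue would have a high conservation attribute even though no pixel in that column would be gold.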

Oh sorry I meant "showConserved" not "conservationColoring". Also I find it counterintuitive that (if I understand correctly) the most-conserved gold color has option name conservedColor and the less-conserved aqua color has option name identityColor... I would think of "identity" as super-conserved and "conserved" as less so, but maybe that is me-specific.
To summarize the questions: