Opened 12 months ago

Last modified 12 months ago

#16249 assigned defect

Similar Structures sequence plot: weirdness coloring with context menu

Reported by: Elaine Meng Owned by: Tom Goddard
Priority: high Milestone:
Component: Structure Analysis Version:
Keywords: Cc:
Blocked By: Blocking:
Notify when closed: Platform: all
Project: ChimeraX

Description

If, using the context menu in the sequence plot, I turn on coloring the plot by lddt, it turns off (unchecks) coloring by conservation, which makes sense. However, if I then turn on (check) coloring by conservation, they are both checked and the coloring is some indecipherable mishmash. I was trying to figure out the "similar sequences" command options logic and I really don't understand it. Does conservationColoring true (with default colors) mean the white-black-aqua-gold scheme, and is lddt coloring then supposed to completely replace the black-aqua-gold pixels with its own palette? Why aren't they mutually exclusive? I understand white stays white because that means there is no aligned residue.

Change History (6)

comment:1 by Elaine Meng, 12 months ago

Oh sorry I meant "showConserved" not "conservationColoring". Also I find it counterintuitive that (if I understand correctly) the most-conserved gold color has option name conservedColor and the less-conserved aqua color has option name identityColor... I would think of "identity" as super-conserved and "conserved" as less so, but maybe that is me-specific.

To summarize the questions:

  • are showConserved and lddtColoring supposed to be mutually exclusive, and if not, how do you figure out which pixels get which coloring scheme when both of them are true?
  • what exactly does showConserved turn on or off, only the gold coloring, or the gold-aqua-black combined?

comment:2 by Tom Goddard, 12 months ago

It is allowed to use both "Color conserved" and "Color by LDDT" at the same time, and in that case the conserved coloring (aqua for identity and gold for conserved, but not black for aligned) appears on top of the LDDT coloring. I agree it looks horrible. It looked better before Scooter had me change the conserved and identity colors for the color-blind. When the colors were more distinct I liked the overlay because it allowed me to see the conserved positions relative to the good structural match regions (LDDT). Let me think how I can fix this. Maybe I could fade the LDDT coloring when both are shown.

Regarding the option names identityColor and conservedColor, identity color means that the similar structure residue is the same amino acid type as the query. "Identity" seems the right word to describe that. The conservedColor is for residue positions where more than 50% of similar structures that have that residue aligned are the same amino acid type. Again that seems to be well described by "conserved". I see your confusion in that "conserved" is based on the amino acids in an entire column, while "identity" is for each similar structure at each column. If you have a better suggestion for the names, tell me. I'm not greatly concerned with those option names because I expect they will almost never be used.

comment:3 by Elaine Meng, 12 months ago

I didn't expect you to change the option names, it is more of a complaint that I can't understand what they mean despite spending a long time experimenting and squinting at the plot.  So the "identity" one is the same amino acid type as the query.  Does it depend on anything else in the column? Is the "conservation" one also same amino acid type as the query, or could it also be a different amino acid type than the query?  I had this explanation in the tool docs and now I think it is wrong:

   [white]    – no aligned residue
   [black]    – aligned residue of a different amino acid type than the query
   [aqua]    – aligned residue of the same amino acid type as the query, in column where <0.5 of the residues are the same type
   [gold]    – aligned residue of the same amino acid type as the query, in column where ≥0.5 of the residues are the same type

Now I'm thinking 

   [white]    – no aligned residue
   [black]    – aligned residue of a different amino acid type than the query
   [aqua]    – aligned residue of the same amino acid type as the query (regardless of any other residues in that column??)
   [gold]    – residue same type as ≥0.5 of the other residues in the column

... but then what happens when both the aqua and gold conditions are true?

comment:4 by Elaine Meng, 12 months ago

There are 4 situations I am unclear on, all in a highly conserved column (where over half of queries have the same residue type as each other). They form the 2x2 matrix of the query having or not having the conserved type vs. the specific hit residue having or not having the conserved type. Is the whole column gold, or do you get some black and aqua in it?

And then when you also have lddt coloring turned on, the aqua and gold pixels would win out, but all the black positions would get the lddt coloring?


comment:5 by Tom Goddard, 12 months ago

Oops. I didn't even describe it right in my previous message on this ticket. If menu entry "color conserved" is used, then positions are gold if the similar structure amino acid type matches the query, at least 50% of the similar structures that are aligned at that position also match the query, and 10 or more similar structures are aligned at that position. Otherwise, if the 50% or 10 aligned conditions don't hold but the amino acid type matches, then it is aqua. And if the amino acid type does not match it is black, but black is not shown if LDDT is also shown. So this differs from what I said before in two ways: the query has to have the amino acid type that is 50% conserved, and there is the criterion of a minimum of 10 aligned at a position.

What I see in the plot matches this. There is never gold and aqua in the same column. Basically gold replaces aqua if the 50% and min 10 condition applies.
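The rule described in this comment can be sketched roughly as follows. This is only an illustration of the description, not the ChimeraX source: the function name `pixel_color`, its parameters, and the `"lddt"` placeholder return value are hypothetical, and it assumes the column counts include every similar structure aligned at that position.

```python
MIN_ALIGNED = 10     # "10 or more similar structures are aligned at that position"
MIN_FRACTION = 0.5   # "at least 50%" of aligned residues match the query


def pixel_color(hit_aa, query_aa, column_aas, lddt_shown=False):
    """Return a color name for one plot pixel (hypothetical helper).

    hit_aa:      amino acid of this similar structure at this position,
                 or None if it has no aligned residue there.
    query_aa:    amino acid of the query at this position.
    column_aas:  amino acids of all similar structures aligned at this
                 position (assumed to exclude unaligned entries).
    lddt_shown:  True if "Color by LDDT" is also checked.
    """
    if hit_aa is None:
        return "white"  # no aligned residue
    if hit_aa != query_aa:
        # Black is not drawn when LDDT is shown, so the LDDT color shows through.
        return "lddt" if lddt_shown else "black"
    n_aligned = len(column_aas)
    n_match = sum(1 for aa in column_aas if aa == query_aa)
    if n_aligned >= MIN_ALIGNED and n_match >= MIN_FRACTION * n_aligned:
        return "gold"   # matches query, and the column is conserved for the query type
    return "aqua"       # matches query, but the 50% or min-10 condition fails
```

Under this reading, the gold test depends only on the column and the query, so every query-matching pixel in a column gets the same color. That is consistent with gold and aqua never appearing in the same column. It would also mean that in a column where the query lacks the majority type, hits with the majority type are black and hits matching the query are aqua.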

I'll add this info to my similar structures web page.

comment:6 by Tom Goddard, 12 months ago

One more point. The sequence conservation residue attribute that is set for query residues when you use Sequence Plot menu entry "color query structure by conservation" defines the conservation percentage using the number of aligned sequences that match the most common amino acid type at that position (i.e., the consensus amino acid type), which may be different from the query amino acid type. So this conservation value does not match what is used in the "Color conserved" plot coloring, which counts the sequences with amino acid type matching the query, since the query may not have the consensus amino acid type.
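To make the difference between the two measures concrete, here is a small sketch. Both function names are hypothetical and the values are returned as fractions rather than percentages; it illustrates the description above and is not ChimeraX code.

```python
from collections import Counter


def consensus_fraction(column_aas):
    """Fraction of aligned residues that match the most common amino acid type.

    Corresponds to the residue attribute described above (hypothetical helper).
    """
    if not column_aas:
        return 0.0
    _, count = Counter(column_aas).most_common(1)[0]
    return count / len(column_aas)


def query_match_fraction(column_aas, query_aa):
    """Fraction of aligned residues that match the query amino acid type.

    Corresponds to the count used by the "Color conserved" plot coloring.
    """
    if not column_aas:
        return 0.0
    return sum(1 for aa in column_aas if aa == query_aa) / len(column_aas)


# Example column: 8 aligned leucines and 2 valines, with a valine in the query.
column = ["L"] * 8 + ["V"] * 2
attribute_value = consensus_fraction(column)         # based on consensus type L
plot_value = query_match_fraction(column, "V")       # based on query type V
```

In this example the two values differ because the query does not have the consensus type. They coincide whenever the query has the most common type at that position.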
