#5267 closed defect (fixed)
blast jobs given nonunique names
Reported by: | Elaine Meng | Owned by: | Zach Pearson |
---|---|---|---|
Priority: | moderate | Milestone: | 1.3 |
Component: | Sequence | Version: | |
Keywords: | | Cc: | Tom Goddard |
Blocked By: | | Blocking: | |
Notify when closed: | | Platform: | all |
Project: | ChimeraX | | |
Description
If I use AlphaFold search, it always gives the name None. If I use the Blast Protein dialog, it repeats the same name, e.g. sometimes bp1, sometimes bp2, but it doesn't ratchet up the number automatically each time, so you get duplicates.
UCSF ChimeraX version: 1.3.dev202109202344 (2021-09-20)
Attachments (1)
Change History (19)
comment:1 by , 4 years ago
Milestone: | → 1.3 |
---|
comment:2 by , 4 years ago
comment:3 by , 4 years ago
In ChimeraX 1.2.5, the command
blastprotein mav
took a name and then either "selected True" or "selected False". If true, only the checkboxed hits were shown; if false, all of the hits were shown.
What do you think would be the best way to integrate that with the current UI, Elaine? I think commands should work without mouse input where possible, so I'm thinking of changing "selected" into a range parameter that would take row indices (e.g. 1 - 10).
comment:4 by , 4 years ago
Resolution: | → fixed |
---|---|
Status: | assigned → closed |
This should reenable 'blastprotein mav'. Behavior matches ChimeraX 1.2.5: if selected is False take all the hits from the table; if true then only use what's highlighted in the table. We can open another ticket if you think my range suggestion is desirable, or if you have other suggestions.
comment:5 by , 4 years ago
I'm in the middle of writing a long reply. Your comment really didn't make sense to me so it was taking a long time to answer!
comment:6 by , 4 years ago
It's no problem, I just wanted to get the ticket closed out by matching the old functionality so it wouldn't appear to hold up the release. We can still make changes. :)
comment:7 by , 4 years ago
Hi Zach, I don't understand at least half of your question. The "blastprotein" command allows specifying a sequence that is shown in the sequence viewer window (either alone or as part of a multiple alignment) as the query. This has nothing to do with which hits are shown after the search. You cannot use command "blastprotein mav" in 1.2.5 and it also does not have a "selected" option as far as I ever knew. Instead the syntax is something like
blastprotein #1/A
...to use sequence of chain A in model 1 (a currently open structure) as query
blastprotein super8.msf:-2
...to use as query the next-to-last sequence in alignment named super8.msf currently shown in the Sequence Viewer
We want to (eventually) add two more ways to input the query:
(1) uniprot ID or uniprot name
(2) paste in plain text
Chimera's Blast Protein has tabs "From Structure" and "Plain Text" for entering the query. I don't exactly know what the possibilities are in the ChimeraX GUI framework. Maybe a tabbed section, or if that's not doable, something like a menu "Query" with choices "From structure" "From Sequence Viewer" "Plain text" "Uniprot" and then showing additional options for whichever of those was chosen.
If somebody chose "From Sequence Viewer" then they'd have menus containing the currently open alignments, and then after choosing an alignment, either the sequence names or the indices within that alignment. If they choose From structure, then they get a menu of the current structure protein chains, like they do now. If they choose Uniprot or Plain text, they get an entry field for that.
An example of a chooser for an alignment and then a sequence within the alignment is Modeller Comparative (Tools... Structure Predictions... Modeller Comparative). For example, open the attached alignment super8.msf, then you can choose it in the top section, and then in the section below it, choose the sequence. Maybe this answers your question?
None of this has anything to do with this ticket's main subject. The "blastprotein" command has a "toolId" option that specifies a name for the job output window. For this ticket, you would just make the GUIs for Blast Protein and Alphafold (search option) issue a blastprotein command that uses this "toolId" option to specify a unique name. Elaine
comment:8 by , 4 years ago
You could offer a job name field in the respective GUIs (AlphaFold Search "Blast job name" or Blast Protein "Job name") but I really don't think that is necessary if the tools automatically issue a unique job name.
comment:9 by , 4 years ago
Try opening 1.2.5, then running any BLAST job. Tick a couple of checkboxes when the results come up, then type 'blastprotein mav' into the command bar. The entire naming scheme, I discovered, was to enable something like 'blastprotein mav bp1' or 'blastprotein mav bp1 selected false'
comment:10 by , 4 years ago
Nobody ever told me about that option. Conrad and/or Eric specifically chose to not expose it to the user. Don't you already have a GUI way to show the chosen rows as a sequence alignment (in the context menu, right?)? So you already expose that functionality. The context menu could work via issuing the command but we'd need to change the keywords to make it OK for the user to see. I.e. we do not call the sequence viewer in ChimeraX "mav" and "selected" is reserved for actual selection (with the green highlights and all).
comment:11 by , 4 years ago
Also a previous suggestion was to have "Show Alignment" and "Load Structure" buttons that would work on the chosen rows, in addition to or instead of the context menu, since the context menu is somewhat hard to discover. However, that was in another ticket, not this one about blast job naming.
comment:12 by , 4 years ago
I'm guessing that 'blastprotein mav' is the command that gets issued when the user clicks "Show in sequence viewer" in the blast-results panel. It's unclear that there really needs to be a command equivalent for that: it doesn't seem useful for either scripting or typing at the command line, because the user has to make interactive selections first.
comment:13 by , 4 years ago
OK. The code that actually shows the sequence viewer doesn't reference that command at all. I think if we don't want the user to stumble across it we may as well remove it.
comment:14 by , 4 years ago
This subsequent commit removes that command, since the actual code didn't depend on it. Thanks for working it out with me!
comment:15 by , 4 years ago
If I restore a session with bp1 and bp2 results and then run blastprotein again (via command) in the 10/12 daily build, I get another set of results named bp1. So maybe this is still a problem? E.g. I open the attached session and then use "blast #1/e".
comment:16 by , 4 years ago
I'm thinking if sessions start with 'bp1', 'bp2', ..., 'bpN' then we throw away the old name and reassign them as 'bp1', 'bp2', ..., 'bpN', but keep the old name if it's a custom name from the user. That way I can just check if the name is 3 characters long and starts with 'bp', invoke my automatic name machinery, and then it'd count up from the correct number. Does that sound reasonable?
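A fixed length of three characters would only cover bp1 through bp9, so a restored name such as bp10 would be treated as custom. A minimal sketch of the same check using a regular expression is below; the helper name `is_automatic_name` is hypothetical and is not taken from the ChimeraX code.

```python
import re

# Automatic names are assumed to be "bp" followed by one or more digits.
_AUTO_NAME = re.compile(r"bp\d+")


def is_automatic_name(name):
    """Return True if name looks like an automatically assigned blast job name."""
    return _AUTO_NAME.fullmatch(name) is not None
```

With this check, "bp1" and "bp10" both count as automatic, while "bp", "bpx" or "my search" would be kept as user-supplied names.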
comment:17 by , 4 years ago
Not sure I understand your comment, as to what exactly would get thrown away. However, I don't feel strongly about how the naming issue is resolved, as long as the results end up with unique names!
comment:18 by , 4 years ago
I think it is best if the blast result names stay the same when a session is restored. I'd suggest making a routine that creates a unique name when a new job is run. It will make a set of the existing result names, then just test whether bp1, bp2, bp3, ... is in the set until it finds one that is not. Efficiency is not an issue. Simplicity of code and preserving the session without changes are the goals.
This is also important because, after reviewing 1.2.5 code, the purpose of the names is to enable showing mav from the command bar. I'm thinking the control panel should always be just "Blast Protein" and we only give names to results panels.
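The unique-name routine suggested in this comment could look roughly like the sketch below. It is only an illustration: the function name `unique_job_name` and the way existing names are passed in are assumptions, not the code that was committed.

```python
def unique_job_name(existing_names, prefix="bp"):
    """Return the first name prefix + N (N = 1, 2, 3, ...) that is not already in use."""
    taken = set(existing_names)  # names of results already open or restored from a session
    n = 1
    while f"{prefix}{n}" in taken:
        n += 1
    return f"{prefix}{n}"


# Restored session already holds bp1 and bp2, so the next job gets bp3.
print(unique_job_name(["bp1", "bp2"]))
```

Because the routine only reads the existing names, restored results keep whatever names they had in the session, and custom names simply never collide with the bpN sequence.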