Opened 4 years ago

Closed 3 years ago

Last modified 3 years ago

#6599 closed enhancement (fixed)

RFE: Blast protein option to "List only best-matching chain per PDB entry"

Reported by: Elaine Meng
Owned by: Zach Pearson
Priority: moderate
Milestone:
Component: Sequence
Version:
Keywords:
Cc:
Blocked By:
Blocking:
Notify when closed:
Platform: all
Project: ChimeraX

Description

Would like a "List only best-matching chain per PDB entry" option like the one Chimera's Blastprotein had. Currently every hit per PDB entry is listed, for example:

open 4hhb
blastprotein #1/A
... gives several hits that are different chains of the same entry, e.g. 1A00_A and 1A00_C, plus several other such pairs.

Attachments (1)

Screenshot 2023-04-12 at 13.45.39.png (283.3 KB) - added by Zach Pearson 3 years ago.


Change History (9)

comment:1 by Zach Pearson, 3 years ago

How do we determine the best hit? By default we show the e-value and the score -- for your example specifically, the e-values and scores for many of the hits seem to be the same.

comment:2 by Elaine Meng, 3 years ago

It's common for PDB structures to contain many copies of the same peptide that get identical scores, in which case you'd just retain any one of those hits. You can see the results of this option by running a Blast search in Chimera with the option turned on. Chimera just kept the last one, so for an A,B,C,D homotetramer, D would be listed. That would be OK if it's easiest, although for aesthetic reasons A would be nicest (if easy to do).
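The selection rule discussed above could be sketched roughly as follows. This is only an illustration, not the code that went into ChimeraX: the `Hit` record, its field names, and the `best_chain_per_entry` function are hypothetical, and it assumes hit names have the form `ENTRY_CHAIN` (as in 1A00_A), that a higher score is better, and that a lower e-value is better. Ties are broken alphabetically by chain ID, so an A,B,C,D homotetramer with identical scores would list chain A.

```python
from typing import NamedTuple


class Hit(NamedTuple):
    """Hypothetical BLAST hit record; not the ChimeraX data structure."""
    name: str      # assumed to look like "1A00_A" (PDB entry, underscore, chain ID)
    evalue: float  # lower is better
    score: float   # higher is better


def best_chain_per_entry(hits):
    """Keep one hit per PDB entry: best score, then best e-value, then lowest chain ID."""
    best = {}  # entry ID -> (sort key, hit); dicts keep first-insertion order
    for hit in hits:
        entry, _, chain = hit.name.partition("_")
        key = (-hit.score, hit.evalue, chain)
        if entry not in best or key < best[entry][0]:
            best[entry] = (key, hit)
    return [hit for _, hit in best.values()]


# Made-up values for illustration only.
example_hits = [
    Hit("1A00_C", 1e-50, 200.0),
    Hit("1A00_A", 1e-50, 200.0),
    Hit("2HHB_B", 1e-40, 180.0),
    Hit("2HHB_A", 1e-30, 150.0),
]
filtered = best_chain_per_entry(example_hits)
```

With the made-up data above, `filtered` keeps 1A00_A (the tie goes to the alphabetically first chain) and 2HHB_B (the higher-scoring chain), in the order the entries first appeared.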

comment:3 by Zach Pearson, 3 years ago

Resolution: fixed
Status: assigned → closed

Will be in tomorrow's daily build. This was a pretty low-risk patch. May also put on 1.6 branch if desired.

comment:4 by Elaine Meng, 3 years ago

I don't see any such option in the daily build, UCSF ChimeraX version: 1.6.dev202304120334 (2023-04-12).

Did it not make it in?  I am looking in the Blast Protein GUI.  Presumably there would be a command option too, but I don't see any new keywords from using "usage blastprotein":

blastprotein atoms [database database] [cutoff a number] [matrix matrix] [maxSeqs an integer] [version a text string] [log true or false] [name a text string]
    — Search PDB/NR using BLAST

Incidentally, I see there is a "name" keyword which I had documented as "toolId" ... do you know when that changed?  Thanks

Elaine

comment:5 by Zach Pearson, 3 years ago

It's an option on the results UI, not a command line option.

comment:6 by Zach Pearson, 3 years ago

No idea when that changed.

comment:7 by Elaine Meng, 3 years ago

Oh, I see it's an option on the results panel.  In Chimera, it's done differently as an option given at the time of searching.  I'm OK with how you did it, but would like different text, say one of these or some variant combination of them:

Hide duplicate chains per PDB
Show only one chain per PDB
List only best chain per PDB

(only capitalize first word since we also have "Show columns" and "For chosen entries")

comment:8 by Zach Pearson, 3 years ago

Got it, will be fixed tomorrow!
