Opened 3 weeks ago
Last modified 4 days ago
#19201 assigned defect
Somehow make UniProt info more "Claude accessible"
| Reported by: | Tom Goddard | Owned by: | Eric Pettersen |
|---|---|---|---|
| Priority: | moderate | Milestone: | |
| Component: | Sequence | Version: | |
| Keywords: | | Cc: | |
| Blocked By: | | Blocking: | |
| Notify when closed: | | Platform: | all |
| Project: | ChimeraX | | |
Description
I just tried to prod Claude into selecting the S4 voltage sensor helix of PDB 9my3. It tried valiantly but repeatedly came up only with a typical range for these potassium channels of 220-245. Opening the UniProt annotations in ChimeraX shows that, under Features / Transmembrane regions, S4 is 217-232. Claude kept begging me to look at the UniProt page or the ChimeraX UniProt features and tell it the answer, but I refused. I'm trying to think of how Claude can extract this info from ChimeraX commands. It opened the UniProt annotations, but the command return value didn't have the details. Maybe for use with Claude we should have the return_json option provide more detailed info. Another case of this is that Claude often tried the help command, but the return value does not include the text of the help page, so I don't think Claude got much out of it.
Change History (4)
comment:1 by , 3 weeks ago
| Reporter: | changed from to |
|---|
comment:3 by , 7 days ago
This seems somewhat intractable. UniProt annotations aren't associated with chains; they're associated with UniProt sequences. Chains may then be associated with the UniProt sequence, but the numbering of the annotation may vary wildly from the numbering of the chain due to signal peptides and propeptides, and could vary between different depositions of the same structure. Claude would need to know the position of the feature on the UniProt sequence, and then the match map between the associated chain of interest and the UniProt sequence. I don't see that happening as the return value of ChimeraX commands.
If Claude talked to ChimeraX in Python, then things would be different.
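To make the two-step lookup described above concrete, here is a minimal Python sketch. It is not ChimeraX code: `feature_to_chain_numbers` and the dict-based `match_map` are hypothetical names invented for illustration, and the numbers are made up rather than taken from any real structure. It only shows why both the feature range on the UniProt sequence and the match map are needed.

```python
from typing import Dict, List, Tuple


def feature_to_chain_numbers(
    feature_range: Tuple[int, int],
    match_map: Dict[int, int],
) -> List[int]:
    """Translate an inclusive UniProt feature range into chain residue numbers.

    match_map is a hypothetical stand-in for the association between the
    UniProt sequence and a chain: UniProt position -> chain residue number,
    with entries only for positions that have a modeled residue.
    """
    start, end = feature_range
    return [match_map[pos] for pos in range(start, end + 1) if pos in match_map]


# Made-up data: chain numbering is shifted by 20 relative to UniProt
# (as a removed signal peptide might cause), and UniProt positions
# 103 and 104 have no modeled residue in the chain.
match_map = {pos: pos - 20 for pos in range(100, 111) if pos not in (103, 104)}

numbers = feature_to_chain_numbers((101, 106), match_map)
print(numbers)
```

With a mapping like this, a feature range reported in UniProt numbering cannot be used directly in a residue specifier; it has to go through the match map first, and unmodeled positions simply drop out.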
comment:4 by , 4 days ago
It seems like the functionality and data of the UniProt Sequence Features tool are not very accessible via commands. There is no command like
uniprot select "metal ion-binding site / Zinc 2"
or
uniprot list features
I'm not sure these would be useful. But I think the UniProt annotations are very useful, and having them accessible only from the GUI is limiting. For instance, if I wanted to make a tutorial with executable links that used UniProt features, it would be a challenge.
Are there sequence commands to list and operate on the assigned sequence regions?
It seems like this would need to be wedged into "info chains" somehow, since the "open" command only knows about models, not chains, and doesn't ask for JSON info for the models. I mean, it could do that, and the atomic model could then include the UniProt info, but realistically it should put in a lot more info than just the UniProt chain annotations, which would duplicate much of the effort already put into the "info" command.
Specifically, "info chains /a attr features" almost gets you there now, but Sequence.features is a method, not an attribute. A property for getting UniProt annotations would need to be added that in turn calls Sequence.features(...).
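A rough sketch of that property idea, in plain Python, might look like the following. Everything here is a toy stand-in: the `Sequence` and `Feature` classes below are not the ChimeraX classes, the `source` argument to `features()` is an assumption (the arguments are elided above), and `uniprot_features` is a hypothetical property name. The feature data is also made up.

```python
from typing import Dict, List


class Feature:
    """Hypothetical stand-in for one sequence annotation."""

    def __init__(self, name: str, start: int, end: int) -> None:
        self.name = name
        self.start = start
        self.end = end


class Sequence:
    """Toy stand-in for a sequence class; NOT the real ChimeraX Sequence."""

    def __init__(self, features_by_source: Dict[str, List[Feature]]) -> None:
        self._features_by_source = features_by_source

    def features(self, source: str = "uniprot") -> List[Feature]:
        # A method: attribute-style access cannot call it with arguments.
        # The 'source' parameter is assumed for this sketch.
        return self._features_by_source.get(source, [])

    @property
    def uniprot_features(self) -> List[dict]:
        # Hypothetical property that wraps the method and returns plain,
        # JSON-friendly data that attribute-style access could report.
        return [
            {"name": f.name, "start": f.start, "end": f.end}
            for f in self.features("uniprot")
        ]


# Made-up annotation, for illustration only.
seq = Sequence({"uniprot": [Feature("example region", 10, 25)]})
print(seq.uniprot_features)
```

The point of the wrapper is only that a property can be read by name like any other attribute, while the underlying method stays available for callers that need its arguments.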