#14771 closed defect (not a bug)
AlphaFold2 colab notebook error
| Reported by: | | Owned by: | Tom Goddard |
|---|---|---|---|
| Priority: | normal | Milestone: | |
| Component: | Structure Prediction | Version: | |
| Keywords: | | Cc: | |
| Blocked By: | | Blocking: | |
| Notify when closed: | | Platform: | all |
| Project: | ChimeraX | | |
Description
Hello
I've been trying to use the AlphaFold Colab notebook from ChimeraX. I used it earlier this week and it worked perfectly fine, but I'm now getting the following error:
2024-03-14 18:44:43,264 Starting prediction on 2024-03-14 UTC time
2024-03-14 18:44:43,264 Installing ColabFold on Google Colab virtual machine.
Using Tesla T4 graphics processor
2024-03-14 18:44:43,493 Running on GPU
2024-03-14 18:44:43,496 Found 5 citations for tools or databases
2024-03-14 18:44:43,497 Query 1/1: af42 (length 42)
COMPLETE: 100%|██████████| 150/150 [elapsed: 00:01 remaining: 00:00]
2024-03-14 18:44:44,734 Could not generate input features af42: Invalid character in the sequence:
Traceback (most recent call last):
File "/usr/local/lib/python3.10/dist-packages/colabfold/batch.py", line 1512, in run
= generate_input_feature(query_seqs_unique, query_seqs_cardinality, unpaired_msa, paired_msa,
File "/usr/local/lib/python3.10/dist-packages/colabfold/batch.py", line 1039, in generate_input_feature
feature_dict = build_monomer_feature(
File "/usr/local/lib/python3.10/dist-packages/colabfold/batch.py", line 892, in build_monomer_feature
**pipeline.make_sequence_features(
File "/usr/local/lib/python3.10/dist-packages/alphafold/data/pipeline.py", line 40, in make_sequence_features
features['aatype'] = residue_constants.sequence_to_onehot(
File "/usr/local/lib/python3.10/dist-packages/alphafold/common/residue_constants.py", line 580, in sequence_to_onehot
raise ValueError(f'Invalid character in the sequence: {aa_type}')
ValueError: Invalid character in the sequence:
2024-03-14 18:44:44,736 Done
Downloading structure predictions to directory Downloads/ChimeraX/AlphaFold
cp: cannot stat '*_relaxed_rank_001_*.pdb': No such file or directory
cp: cannot stat '*_scores_rank_001_*.json': No such file or directory
When I put the same sequence into the Colab notebook directly, not through ChimeraX, it works. I've tried restarting everything and trying different amino acid sequences, and I just get the same error.
Here is the actual amino acid input sequence: MTTTLPEGVSHRVGFKPHLRVEIVRGEAVYLLSERGTTALQ
I'm not sure how to go about fixing this.
Thank you
Miriam Bregman
Change History (7)
comment:1 by , 20 months ago
| Component: | Unassigned → Structure Prediction |
|---|---|
| Owner: | set to |
| Platform: | → all |
| Project: | → ChimeraX |
| Status: | new → assigned |
comment:2 by , 20 months ago
| Resolution: | → not a bug |
|---|---|
| Status: | assigned → closed |
Your sequence ends in Q which is not a 1 letter code for the 20 standard amino acids that Alphafold allows.
comment:3 by , 20 months ago
Hi
I’m sorry I’m confused. Q is the code for glutamine. What letter code am I supposed to use?
Thanks
Miriam Bregman
comment:4 by , 20 months ago
You are right, Q is fine. The error message says
Invalid character in the sequence:
I do not see any invalid character. Also, the message is supposed to report the invalid character, but it shows nothing after the colon. So I suspect you pasted the sequence into ChimeraX and it contains an invisible character. That could happen if the source you are pasting from does something strange. Try pasting your sequence from some other application, or type it by hand.
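One way to test the invisible-character hypothesis before pasting is to scan the sequence for anything outside the 20 standard one-letter codes and print each offender in escaped form. The following is only a minimal sketch: the helper name `find_invalid_characters` is hypothetical, it is not part of ChimeraX, ColabFold, or AlphaFold, and the zero-width space is just one example of a character that would be invisible in a text field.

```python
import unicodedata

# The 20 standard amino acid one-letter codes (Q is glutamine).
STANDARD_AMINO_ACIDS = set("ACDEFGHIKLMNPQRSTVWY")


def find_invalid_characters(sequence):
    """Hypothetical helper: list characters that are not standard codes.

    Returns (index, escaped form, Unicode name) for each offending
    character, so invisible ones show up explicitly.
    """
    problems = []
    for index, char in enumerate(sequence):
        if char not in STANDARD_AMINO_ACIDS:
            # Control characters such as "\n" have no Unicode name.
            name = unicodedata.name(char, "UNNAMED")
            problems.append((index, repr(char), name))
    return problems


# The sequence as posted in the ticket contains only standard codes.
clean = "MTTTLPEGVSHRVGFKPHLRVEIVRGEAVYLLSERGTTALQ"
print(find_invalid_characters(clean))

# Assumed example: the same sequence with a zero-width space appended,
# which looks identical on screen.
pasted = clean + "\u200b"
print(find_invalid_characters(pasted))
```

An empty list means every character is one of the 20 standard codes. It does not rule out a problem elsewhere in the path between ChimeraX and Google Colab.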
comment:5 by , 20 months ago
Hello
I tried it again without using two sequences and it still does not work.
This is the error I'm getting still
Traceback (most recent call last):
File "/usr/local/lib/python3.10/dist-packages/colabfold/batch.py", line 1512, in run
= generate_input_feature(query_seqs_unique, query_seqs_cardinality, unpaired_msa, paired_msa,
File "/usr/local/lib/python3.10/dist-packages/colabfold/batch.py", line 1039, in generate_input_feature
feature_dict = build_monomer_feature(
File "/usr/local/lib/python3.10/dist-packages/colabfold/batch.py", line 892, in build_monomer_feature
**pipeline.make_sequence_features(
File "/usr/local/lib/python3.10/dist-packages/alphafold/data/pipeline.py", line 40, in make_sequence_features
features['aatype'] = residue_constants.sequence_to_onehot(
File "/usr/local/lib/python3.10/dist-packages/alphafold/common/residue_constants.py", line 580, in sequence_to_onehot
raise ValueError(f'Invalid character in the sequence: {aa_type}')
ValueError: Invalid character in the sequence:
2024-03-15 16:50:32,023 Done
This is the sequence used: MSTALTNARPDVESANAVALANDHRIALLTARTALEPALAQRYTEDPRSLLAEFGLVAVEPAYAAWGTEDDTHLLIEDLDRTGSGGEGFSIVFTKSDVPFPSVGTARR
There are no breaks, spaces, or non-amino-acid characters. This sequence works perfectly fine when I run it in Colab directly through Chrome. When I try to use AlphaFold from ChimeraX, it also crashes the application.
Thanks
Miriam Bregman
comment:6 by , 20 months ago
ChimeraX AlphaFold prediction is broken due to a change in Google Colab. Sorry it took a while to figure out, since I am on vacation and only have a phone. I am not sure when it will get fixed; it may be in a few days, or it may be in April when I am back at work. The problem is that sending the sequence to Google Colab no longer works. I don't see any simple fix.
Reported by Miriam Bregman