#14771 closed defect (not a bug)
AlphaFold2 colab notebook error
Reported by: | | Owned by: | Tom Goddard |
---|---|---|---|
Priority: | normal | Milestone: | |
Component: | Structure Prediction | Version: | |
Keywords: | | Cc: | |
Blocked By: | | Blocking: | |
Notify when closed: | | Platform: | all |
Project: | ChimeraX | | |
Description
Hello, I've been trying to use the colab on ChimeraX, and I used it previously this week and it worked perfectly fine. I'm now getting the following error:

2024-03-14 18:44:43,264 Starting prediction on 2024-03-14 UTC time
2024-03-14 18:44:43,264 Installing ColabFold on Google Colab virtual machine.
Using Tesla T4 graphics processor
2024-03-14 18:44:43,493 Running on GPU
2024-03-14 18:44:43,496 Found 5 citations for tools or databases
2024-03-14 18:44:43,497 Query 1/1: af42 (length 42)
COMPLETE: 100%|██████████| 150/150 [elapsed: 00:01 remaining: 00:00]
2024-03-14 18:44:44,734 Could not generate input features af42: Invalid character in the sequence:
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/dist-packages/colabfold/batch.py", line 1512, in run
    = generate_input_feature(query_seqs_unique, query_seqs_cardinality, unpaired_msa, paired_msa,
  File "/usr/local/lib/python3.10/dist-packages/colabfold/batch.py", line 1039, in generate_input_feature
    feature_dict = build_monomer_feature(
  File "/usr/local/lib/python3.10/dist-packages/colabfold/batch.py", line 892, in build_monomer_feature
    **pipeline.make_sequence_features(
  File "/usr/local/lib/python3.10/dist-packages/alphafold/data/pipeline.py", line 40, in make_sequence_features
    features['aatype'] = residue_constants.sequence_to_onehot(
  File "/usr/local/lib/python3.10/dist-packages/alphafold/common/residue_constants.py", line 580, in sequence_to_onehot
    raise ValueError(f'Invalid character in the sequence: {aa_type}')
ValueError: Invalid character in the sequence:
2024-03-14 18:44:44,736 Done
Downloading structure predictions to directory Downloads/ChimeraX/AlphaFold
cp: cannot stat '*_relaxed_rank_001_*.pdb': No such file or directory
cp: cannot stat '*_scores_rank_001_*.json': No such file or directory

When I put the same sequence into the colab directly, not through ChimeraX, it works. I've tried restarting everything and trying different amino acid sequences, and I just get the same error.
Here is the actual amino acid input sequence:

MTTTLPEGVSHRVGFKPHLRVEIVRGEAVYLLSERGTTALQ

I'm not sure how to go about fixing this.

Thank you
Miriam Bregman
Change History (7)
comment:1 by , 20 months ago
Component: | Unassigned → Structure Prediction |
---|---|
Owner: | set to |
Platform: | → all |
Project: | → ChimeraX |
Status: | new → assigned |
comment:2 by , 20 months ago
Resolution: | → not a bug |
---|---|
Status: | assigned → closed |
Your sequence ends in Q which is not a 1 letter code for the 20 standard amino acids that Alphafold allows.
comment:3 by , 20 months ago
Hi, I’m sorry, I’m confused. Q is the code for glutamine. What letter code am I supposed to use?

Thanks
Miriam Bregman
comment:4 by , 20 months ago
You are right, Q is fine. The error message says "Invalid character in the sequence:" but I do not see any invalid character. Also, the message is supposed to report the invalid character, but it shows nothing after the colon. So I suspect you have pasted the sequence into ChimeraX and it contains an invisible character. That could happen if the source you are pasting from does something strange. So try pasting your sequence from some other application, or type it by hand.
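One way to test the invisible-character suspicion before running a prediction is to check the pasted text locally. The sketch below is not part of ChimeraX, ColabFold, or AlphaFold; the function name `find_unexpected_characters` and the zero-width-space example are hypothetical and chosen only for illustration. It lists every character that is not one of the 20 standard one-letter codes, together with its position and Unicode name. As posted in this ticket, the first sequence contains 41 letters while the log reports "length 42", which would be consistent with one extra character, although the posted text alone cannot confirm that.

```python
import unicodedata

# The 20 standard one-letter amino acid codes (uppercase only).
STANDARD_AMINO_ACIDS = set("ACDEFGHIKLMNPQRSTVWY")


def find_unexpected_characters(sequence):
    """Return (index, repr, Unicode name) for each non-standard character.

    Hypothetical helper for illustration. Lowercase letters are also
    flagged, since only the uppercase codes are listed above.
    """
    problems = []
    for index, char in enumerate(sequence):
        if char not in STANDARD_AMINO_ACIDS:
            # Control characters such as "\n" have no Unicode name,
            # so a placeholder is used instead of raising ValueError.
            name = unicodedata.name(char, "UNNAMED CONTROL CHARACTER")
            problems.append((index, repr(char), name))
    return problems


# The sequence exactly as posted in the ticket description.
posted = "MTTTLPEGVSHRVGFKPHLRVEIVRGEAVYLLSERGTTALQ"
print(len(posted), find_unexpected_characters(posted))

# Assumed example: the same sequence with a zero-width space appended,
# which looks identical on screen but is one character longer.
pasted = posted + "\u200b"
print(len(pasted), find_unexpected_characters(pasted))
```

Using `repr` makes a character visible even when printing it directly would show nothing, which is the same situation as an error message with nothing after the colon.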
comment:5 by , 20 months ago
Hello, I tried it again without using two sequences and it still does not work. This is the error I'm still getting:

Traceback (most recent call last):
  File "/usr/local/lib/python3.10/dist-packages/colabfold/batch.py", line 1512, in run
    = generate_input_feature(query_seqs_unique, query_seqs_cardinality, unpaired_msa, paired_msa,
  File "/usr/local/lib/python3.10/dist-packages/colabfold/batch.py", line 1039, in generate_input_feature
    feature_dict = build_monomer_feature(
  File "/usr/local/lib/python3.10/dist-packages/colabfold/batch.py", line 892, in build_monomer_feature
    **pipeline.make_sequence_features(
  File "/usr/local/lib/python3.10/dist-packages/alphafold/data/pipeline.py", line 40, in make_sequence_features
    features['aatype'] = residue_constants.sequence_to_onehot(
  File "/usr/local/lib/python3.10/dist-packages/alphafold/common/residue_constants.py", line 580, in sequence_to_onehot
    raise ValueError(f'Invalid character in the sequence: {aa_type}')
ValueError: Invalid character in the sequence:
2024-03-15 16:50:32,023 Done

This is the sequence used:

MSTALTNARPDVESANAVALANDHRIALLTARTALEPALAQRYTEDPRSLLAEFGLVAVEPAYAAWGTEDDTHLLIEDLDRTGSGGEGFSIVFTKSDVPFPSVGTARR

There are no breaks, no spaces, and no non-AA characters. This sequence works perfectly fine when I use it just through Chrome on the colab folder. When I try to use AF on ChimeraX it is also crashing the application.
Thanks
Miriam Bregman
comment:6 by , 19 months ago
ChimeraX AlphaFold prediction is broken due to a change in Google Colab. Sorry it took a while to figure out since I am on vacation and only have a phone. I am not sure when it will get fixed, maybe in days or maybe in April when I am back to work. The problem is that sending the sequence to Google Colab no longer works. I don't see any simple fix.
Reported by Miriam Bregman