Opened 3 years ago
Last modified 15 months ago
#7387 reopened enhancement
BLAST of UniRef50, UniRef90, UniRef100 appears to be using old databases from 2012 (at Version 14)
| Reported by: | Tom Goddard | Owned by: | Zach Pearson |
|---|---|---|---|
| Priority: | high | Milestone: | |
| Component: | Sequence | Version: | |
| Keywords: | | Cc: | Elaine Meng, Eric Pettersen, Greg Couch, Scooter |
| Blocked By: | | Blocking: | |
| Notify when closed: | | Platform: | all |
| Project: | ChimeraX | | |
Description
I am not sure whether these are the UniRef databases being used by BLAST Protein. They are from 2012. The UniRef100 file is 8 GB, while the current UniRef100 is 83 GB, so these old databases have only about 1/10 of the sequences. They should be updated.
Change History (14)
comment:1 by , 3 years ago
comment:2 by , 3 years ago
| Cc: | added |
|---|
comment:4 by , 3 years ago
| Cc: | added; removed |
|---|---|
| Owner: | changed from to |
comment:5 by , 3 years ago
My user account is not allowed to write files in that directory. Additionally, that script doesn't exist(!)
Reassigning to Scooter.
follow-up: 6 comment:6 by , 3 years ago
The UniRef databases can be downloaded by FTP here; uniref100, uniref90 and uniref50 are each a single large gzipped FASTA file. A simple makeblastdb command then makes the database files. The database directory is owned by sacsdb with group sacs and is not writable by group sacs.

$ ls -ld /databases/mol/blast/db_uniref
drwxr-xr-x. 2 sacsdb sacs 42 Jul 28 2012 /databases/mol/blast/db_uniref
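The download-and-build steps described in this comment could look roughly like the sketch below. It is not the missing get_uniref_blast script. The function names `uniref_url` and `build_uniref_db` are hypothetical, the base URL is an assumption about the UniProt download layout (the ticket's original link is not preserved here), and the output directory must be one the caller can write to. Because makeblastdb does not read gzip directly, the FASTA is decompressed and streamed on stdin.

```shell
#!/bin/sh
# Sketch only. The base URL is an assumed UniProt download layout, not taken
# from the ticket; override UNIREF_BASE if the files live elsewhere.
UNIREF_BASE="${UNIREF_BASE:-https://ftp.uniprot.org/pub/databases/uniprot/uniref}"

# Print the download URL for one UniRef set (uniref50, uniref90 or uniref100).
uniref_url() {
    printf '%s/%s/%s.fasta.gz\n' "$UNIREF_BASE" "$1" "$1"
}

# Download one set and build a protein BLAST database from it.
#   $1 = set name, e.g. uniref100
#   $2 = output directory (must be writable by the caller)
build_uniref_db() {
    name=$1
    out_dir=$2
    # -f: fail on HTTP errors, -L: follow redirects.
    curl -fL -o "$out_dir/$name.fasta.gz" "$(uniref_url "$name")" || return 1
    # Stream the decompressed FASTA to makeblastdb on stdin ("-in -").
    # -title and -out are given explicitly because there is no input file name.
    gunzip -c "$out_dir/$name.fasta.gz" |
        makeblastdb -in - -dbtype prot -title "$name" -out "$out_dir/$name"
}
```

The block only defines functions. A call such as `build_uniref_db uniref100 /some/writable/dir` would start the download and the build, and given the 83 GB size quoted in the description it needs a correspondingly large amount of free disk space.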
comment:7 by , 3 years ago
| Priority: | moderate → high |
|---|
comment:8 by , 3 years ago
| Milestone: | → 1.5 |
|---|
comment:9 by , 3 years ago
Newer data, from 16 June 2021, is in /wynton/group/databases/UniProt/uniref/uniref{100,50,90}.
comment:11 by , 3 years ago
| Cc: | removed |
|---|---|
| Owner: | changed from to |
comment:12 by , 3 years ago
I've re-created the missing get_uniref_blast script and am testing it now.
comment:13 by , 3 years ago
| Milestone: | 1.6 |
|---|
Forgot to include the path to the 2012 databases on plato