#7435 closed enhancement (fixed)
Speed up Blast Protein search by using 4 threads
| Reported by: | Tom Goddard | Owned by: | Zach Pearson |
|---|---|---|---|
| Priority: | moderate | Milestone: | |
| Component: | Sequence | Version: | |
| Keywords: | Cc: | ||
| Blocked By: | Blocking: | ||
| Notify when closed: | Platform: | all | |
| Project: | ChimeraX |
Description
Have blastp use 4 threads instead of the default 1 thread when running a Blast Protein search of the large databases, NR and AlphaFold v3. Probably would be nice (and simpler) to use 4 threads for all of the databases. The main effect may be parallelizing the database file read speed on the slow beegfs file system. The plato nodes have 56 cores and I think it is rare to see more than one blast job running on plato at a time.
This sped up the sequence search by a factor of 4 in my tests. It requires adding the "--num_threads 4" option to the blastp command in run_job() in file
cxwebservices/task_runners/blast.py
I tested blastp run on watson with ChimeraX defaults on a length 268 sequence (7u0u chain A) searching the AlphaFold version 3 database. Took 19 minutes with 1 thread, 5 minutes with 4 threads.
Change History (3)
comment:1 by , 3 years ago
| Resolution: | → fixed |
|---|---|
| Status: | assigned → closed |
comment:2 by , 3 years ago
Beautiful! I just tested with 7u0u chain A searching AlphaFold database version 3 and now takes under 5 minutes while before it was 19 minutes.
comment:3 by , 3 years ago
Searching 7u0u chain A against NR database took 20 minutes and 30 seconds. I believe this took 50 minutes before. So a little more than a factor of 2 speed up.
I've made the change. :) Glad you found a way to make this faster.