Attendees
Piet, Tom Goddard, Scooter, Greg, Eric, Elaine, Tom Ferrin
August 8, 2024
Agenda
- Showing Martin Steinegger's ISMB keynote talk
Discussion Notes
- SSDs on plato
  - Each of the 4 plato nodes now has a 4 TB SSD mounted at /scratch.
  - Scooter is using this to build BLAST databases with makeblastdb, which fails to run on the BeeGFS file system.
- Martin Steinegger talk
  - Martin created the BFD (Big Fantastic Database) metagenomics database, which DeepMind then used for AlphaFold 2.
  - Trained a network to translate a sequence directly into the 3Di spatial alphabet, so Foldseek can run directly on a sequence. This takes seconds, versus the minutes needed to make a ColabFold prediction.
  - Martin has clustered the AlphaFold database of 214 million structures down to 2-3 million clusters using foldseek-clust.
  - About 35% of clusters have no annotations.
  - Looked at how many clusters are specific to bacteria, eukaryotes, or archaea; see cluster.foldseek.com.
  - Martin made a strong pitch for open source software. He said they made Foldseek open source in 2021, 3 years before publication.
Action Items