
Version 2 (modified by Tom Goddard, 15 months ago)


Attendees

Piet, Tom Goddard, Scooter, Greg, Eric, Elaine, Tom Ferrin

August 8, 2024

Agenda

  • Showing Martin Steinegger's ISMB keynote talk

Discussion Notes

  • SSD drives on plato
    • Each of the 4 plato nodes now has a 4 TB SSD mounted at /scratch.
    • Scooter is using this to build BLAST databases with makeblastdb, which fails to run on the BeeGFS file system.
  • Martin Steinegger talk
    • Martin created the BFD (Big Fantastic Database) metagenomics database, which DeepMind then used for AlphaFold 2.
    • Trained a network that translates a sequence directly into the 3Di spatial alphabet, so Foldseek can be run directly on a sequence. This takes seconds, instead of the minutes needed to first make a ColabFold prediction.
    • Martin has clustered the AlphaFold database of 214 million structures down to 2-3 million clusters using foldseek-clust.
    • About 35% of clusters have no annotations.
    • Looked at how many clusters are specific to bacteria, eukaryotes, or archaea; see cluster.foldseek.com.
    • Martin made a pitch strongly advocating open source software. Said they made Foldseek open source in 2021, 3 years before publication.
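
For the /scratch item under "SSD drives on plato", a minimal sketch of how a makeblastdb run could be pointed at the local SSD rather than the shared file system. The FASTA file name, database name, and the helper function itself are hypothetical and not taken from the meeting; only the /scratch mount point comes from the notes. The sketch only builds and prints the command, it does not run it.

```python
import shlex
from pathlib import PurePosixPath


def makeblastdb_command(fasta, db_name, scratch_dir="/scratch", dbtype="prot"):
    """Build a makeblastdb argument list whose output lands under scratch_dir.

    Hypothetical helper: fasta and db_name are illustrative placeholders.
    """
    if dbtype not in ("nucl", "prot"):
        raise ValueError("dbtype must be 'nucl' or 'prot'")
    # Database files are written with this prefix, i.e. on the local SSD.
    out_prefix = PurePosixPath(scratch_dir) / db_name
    return [
        "makeblastdb",
        "-in", str(fasta),
        "-dbtype", dbtype,
        "-out", str(out_prefix),
    ]


if __name__ == "__main__":
    # Placeholder names; substitute real inputs before running the command.
    cmd = makeblastdb_command("sequences.fasta", "example_db")
    print(shlex.join(cmd))
```

The printed command can be pasted into a shell on a plato node, or passed to subprocess.run(cmd, check=True) once the placeholder names are replaced.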

Action Items
