Graph Machine Learning
Everything about graph theory, computer science, machine learning, etc.


If you have something worth sharing with the community, reach out @gimmeblues, @chaitjo.

Admins: Sergey Ivanov; Michael Galkin; Chaitanya K. Joshi
GraphML News (April 27th) - 🧬 The Protein Edition: OpenCRISPR, Xaira, ScaleFold

✂️ 🧬 Profluent Bio announced OpenCRISPR - an initiative to share CRISPR-Cas-like proteins generated by protein LMs (a-la ESM 2). Profluent managed to generate rather novel proteins hundreds of mutations away from the known ones, and the new proteins work surprisingly well - check out the thread by Ali Madani and a fresh preprint for more details. CRISPR is a genome editing tool that was awarded the 2020 Nobel Prize in Chemistry and was recently approved by the FDA as a therapy for sickle cell disease (with huge potential in other areas as well). Jennifer Doudna, one of the OG authors, gave a keynote at ICML’23 and even attended the graph learning and comp bio workshops!

💸 A new biotech startup Xaira Therapeutics was established with $1B+ funding, with David Baker as a co-founder. Investors include ARCH, Sequoia, Two Sigma, and other prominent VC bros. Perhaps we could hypothesize that the scaled-up technology stack behind RF Diffusion (both ML and lab) is going to play a key role in Xaira. In related news, Max Welling announced his departure from MSR and the co-founding of a new startup on molecular and materials discovery together with Chad Edwards.

📈 ScaleFold: Reducing AlphaFold Initial Training Time to 10 Hours - you only need 2,080 H100s to train AlphaFold in 7 hours (that’s roughly $130M given the $500K price tag for a DGX with 8 H100 GPUs). Gross extrapolation suggests that GPU-rich places like Meta could train a few AlphaFolds in parallel in less than an hour. Next milestone: train an AlphaFold-like model during a coffee break 👀.
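For reference, here is the napkin math behind that estimate as a rough sketch (using the DGX price assumption from the post, not a vendor quote):

```python
# Back-of-the-envelope cost of a 2,080-GPU H100 cluster (assumed DGX list price).
h100_gpus = 2080          # GPUs used in the ScaleFold run
gpus_per_dgx = 8          # one DGX H100 node
dgx_price_usd = 500_000   # assumed price per DGX H100

num_nodes = h100_gpus // gpus_per_dgx        # 260 nodes
cluster_cost = num_nodes * dgx_price_usd     # ≈ $130M
print(f"{num_nodes} DGX nodes ≈ ${cluster_cost / 1e6:.0f}M")
```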

📉 Artificial Intelligence Driving Materials Discovery? Perspective on the Article: Scaling Deep Learning for Materials Discovery - a critical look at the recently published GNoME database of discovered crystalline structures. The two main points are (1) many of those structures contain radioactive elements, making them impractical for real-world use; (2) many of those structures are isomorphic to well-known structures in crystallographic terms, e.g., replacing one element with another from a similar group, which induces pretty much the same crystal structure.

Weekend reading:

The GeometricKernels library by Viacheslav Borovitskiy et al. that implements kernels for Riemannian manifolds, graphs, and meshes with TensorFlow, PyTorch, and JAX bindings.

Learning with 3D rotations, a hitchhiker's guide to SO(3) by A. René Geist et al - a great introductory paper and resource for studying geometric rotations, a perfect companion to the Hitchhiker’s guide to Geometric GNNs

From Local to Global: A Graph RAG Approach to Query-Focused Summarization by Darren Edge and MSR - we mentioned GraphRAG a few times and here is the full preprint.

STaRK: Benchmarking LLM Retrieval on Textual and Relational Knowledge Bases by Shirley Wu, Shiyu Zhao et al feat. Jure Leskovec - a new benchmark for question answering on texts and structured sources
GraphML News (May 3rd) - The ICLR Week, new blogs

🎉 ICLR’24 starts in Vienna next Tuesday (May 7th)! There will be a ton of graph learning papers, geometric DL workshops, and, more importantly, the authors and folks who constitute the community. Michael and Chaitanya will be there, feel free to reach out to chat!

A few new blogposts:

- The TeraHAC algorithm by Google (to be presented at SIGMOD’24) for approximate clustering of graphs with trillions of edges in quasi-linear time.
- Adventures of Pop – the undruggable protein by Dom Beaini (Valence Labs) - a spectacular ELI5 read about drug discovery where a celebrity protein Pop (the cause of a bad disease) has to eat a banana 🍌 (the ligand with a potential drug that would inhibit the protein). With this yummy vocabulary at hand, the post explains several key concepts like protein-ligand binding, free energy, molecular dynamics, DMPK optimization, and more.

Weekend reading:

Uncertainty for Active Learning on Graphs by Dominik Fuchsgruber, Tom Wollschläger et al feat. Stephan Günnemann (all TU Munich)

Parameter-Efficient Tuning Large Language Models for Graph Representation Learning by Qi Zhu and AWS team feat. George Karypis - on using GNNs for producing soft prompts to be sent to LLMs

4DBInfer: A 4D Benchmarking Toolbox for Graph-Centric Predictive Modeling on Relational DBs by Minjie Wang, Quan Gan feat. Muhan Zhang - a new benchmark for graph learning on relational DBs similar to a recent RelBench, but including more tasks like link prediction. Some GNNs seem to outperform XGBoost (Kaggle GMs are anxious and frowning)
GraphML News (May 11th) - AlphaFold 3

🧬 Google DeepMind and Isomorphic Labs announced AlphaFold 3 going beyond proteins and extending structure prediction capabilities to RNA, DNA, and small molecules (ligands). AF3 employs Pairformer (improved Evoformer) as an encoder and a diffusion model for generating 3D coordinates. Yes, AF3 demonstrates huge gains in structural biology tasks compared to previous models, but perhaps the hottest take from the Nature preprint is:

> Similarly to some recent work, we find that no invariance or equivariance with respect to global rotations and translation of the molecule are required in the architecture and so we omit them to simplify the machine learning architecture.

🔥 For reference, AF2 used SE(3)-equivariant attention that spun off a great deal of research in equivariance and geometry for structural biology. The new statement took researchers at ICLR by storm: do we need to invest time and effort into complex math and group theory if a vanilla non-equivariant transformer and diffusion trained on 48 random augmentations can beat other geometric models with baked-in equivariances? AF3 used rather modest compute (compared to LLMs) - 256 A100s for 10 days of pretraining and 10 days of finetuning (overall roughly $420K on Azure) - and that seems to be enough to send a wake-up call to the Geometric DL community.

🤔 Does the bitter lesson strike again? Is it easier to learn symmetries from data and augmentations (see the classical 2016 paper by Taco Cohen and Max Welling) rather than enforcing those constraints in the model? Maybe it’s the task (DNA and RNA structure prediction) that does not have explicit symmetries to bake into a model? It is quite likely that equivariant models can achieve a similar result - but with higher compute and inference costs - is it still worth it? The inference argument looks quite plausible: foundation models (be it LLMs or AF) run billions of inference passes, so if you can save 2x inference time by not doing expensive math and just using longer pre-training, the total serving costs are also reduced.
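For concreteness, here is the simplest form of the augmentation-instead-of-equivariance recipe (a toy sketch under our assumptions, not AF3’s actual pipeline): sample random global rotations and translations per training example and feed the transformed coordinates to a plain, non-equivariant model.

```python
import numpy as np
from scipy.spatial.transform import Rotation

def augment_coords(coords: np.ndarray, n_augs: int = 48) -> np.ndarray:
    """Apply n_augs random global rotations and translations to 3D coordinates
    of shape (N, 3), returning (n_augs, N, 3). A plain transformer trained on
    such copies can learn approximate SE(3) symmetry from data instead of
    having it baked into the architecture."""
    rots = Rotation.random(n_augs).as_matrix()      # (n_augs, 3, 3)
    shifts = np.random.randn(n_augs, 1, 3)          # random global translations
    return np.einsum("bij,nj->bni", rots, coords) + shifts

coords = np.random.randn(100, 3)      # e.g., 100 atoms
augmented = augment_coords(coords)    # (48, 100, 3), ready for a vanilla model
```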

Those will likely be the main questions in the community, on social media and at conferences, throughout 2024.

Besides that, researchers can use the AlphaFold Server for custom inference jobs - we welcome comp bio folks into the world (thanks, OpenAI and Anthropic) of paid API access and proprietary models 😉 Still, given the pace of the open-source community (at least two ongoing re-implementations 1, 2), a relatively simple model, and modest training compute, it might take <6 months to replicate a model similar to AF3 in performance.
GraphML News (May 18th) - MatterSim, new workshops

🔮 Continuing the success of MatterGen, a diffusion model for material generation, MSR announced MatterSim (blog), an ML force field for atomistic simulations. A single MatterSim model supports a wide range of temperatures (0-5000 K) and pressures (up to 1000 GPa) and thus could be seen as a competitor to the recent MACE-MP-0 - in fact, the authors compare against MACE-MP-0 and observe significant improvements on certain tasks. Practically, MatterSim ships with M3GNet or Graphormer backbones (equivariance lives!) so you can select one depending on the available compute. MatterSim could be especially useful in active learning scenarios as a quick proxy when filtering generated candidates.

👷 A few upcoming summer schools and workshops:

- Machine Learning for Chemistry 2024 CZS Summer School (Sept 9-13th in Karlsruhe) with invited speakers from Google, MSR, Mila, TU Munich, KIT, and EPFL. Early bird registration lasts until June 13th.
- 21st Machine Learning on Graphs (MLG) workshop (Sept 9th or 13th, co-located with ECML PKDD 2024 in Vilnius) accepts submissions until June 15th. Invited speakers include Yllka Velaj (Uni Vienna) and Haggai Maron (NVIDIA & Technion).

Weekend reading:

Improving Subgraph-GNNs via Edge-Level Ego-Network Encodings by Nurudin Alvarez-Gonzalez et al.

AdsorbDiff: Adsorbate Placement via Conditional Denoising Diffusion by Adeesh Kolluru and John R. Kitchin - perhaps the first diffusion model for this task (uses EquiformerV2 and GemNet-OC)

MiniMol: A Parameter-Efficient Foundation Model for Molecular Learning by Kerstin Kläser, Błażej Banaszewski, and Valence Labs - a 10M-parameter model encompassing most tasks on 2D molecules (where you have SMILES and graphs)
GraphML News (May 25th) - Aurora, primer on MD, PoET for proteins

The main NeurIPS deadline has finally passed - congrats to those who made it to the submission, you deserve some decompression time! (And reviewers, behold: 20K submissions are coming.) We can probably expect a flurry of preprints on arXiv in the coming weeks - we’ll keep you posted about the most interesting ones.

🌍 MSR AI 4 Science presented Aurora - a foundation model of the atmosphere that works for weather forecasting, air pollution, and predicting rare weather events. Aurora improves over the recent GraphCast and does so with plain vanilla Perceivers and ViTs, no equivariance involved 🥲

⚛️ Abhishaike Mahajan prepared a great primer on molecular dynamics for complete beginners, gradually introducing the most important concepts (with illustrations), from force fields to equilibration to computational simulation methods. Finally, the article touches upon some successful use cases of MD in industry. A highly recommended read to grasp the basics.

✍️ Meanwhile, folks returning from ICLR share some reflections on their fields - for instance, Patrick Schwab (GSK) on ML for Drug Discovery papers, and Lindsay Edwards (Relation) on why AI for DD is difficult.

🧬 OpenProtein released PoET (the Protein Evolution Transformer) - a protein LM that significantly outperforms ESM-2 in zero-shot prediction on ProteinGym while being much smaller. The authors project that a 200M PoET model can be equivalent to a 500B ESM model (by extrapolating scaling laws a bit). The checkpoint and inference code are publicly available.

Weekend reading:

Deep Learning for Protein-Ligand Docking: Are We There Yet? by Alex Morehead et al. - introduces the PoseBench benchmark for docking and evaluates a handful of modern baselines (DiffDock-L leads in most cases)

Explaining Graph Neural Networks via Structure-aware Interaction Index (ICML’24) by Ngoc Bui et al. feat. Rex Ying - the Myerson-Taylor interaction index instead of Shapley-based methods

Fisher Flow Matching for Generative Modeling over Discrete Data by Oscar Davis feat. Michael Bronstein and Joey Bose - flow matching for discrete data, already outperforms a recent discrete FM model DirichletFM
GraphML News (June 1st) - GNNs for Automotive Vision, NeurIPS submissions

A fresh example of applying GNNs to real-world problems is provided in the Nature paper Low-latency automotive vision with event cameras by Daniel Gehrig and Davide Scaramuzza from Uni Zurich. There, GNNs help to parse temporal events (like the appearance of a pedestrian on the road) and save a lot of compute by updating only the local neighborhoods of changed patches. The model (with an efficient CUDA implementation) works in real time in cars! Code and a video demo are available.

The week brought a handful of cool new papers formatted with the NeurIPS template (what could that mean 🤔) - let’s see:

🧬 Genie 2 by AlQuraishi lab - better protein diffusion model now supporting multi-motif scaffolding, outperforms RFDiffusion, FrameFlow, and Chroma, code is available.

🦄 LoGAH: Predicting 774-Million-Parameter Transformers using Graph HyperNetworks with 1/100 Parameters by Xinyu Zhou feat. Boris Knyazev - the next iteration of the Graph HyperNetwork (GHN-3) that directly predicts parameters of neural networks, now with an efficient module for transformer-sized matrices. The model can predict weights of GPT-2- and ViT-sized networks! Code

🍭 Understanding Transformer Reasoning Capabilities via Graph Algorithms by Clayton Sanford and the Google team feat. Anton Tsitsulin and Bryan Perozzi - a theoretical study of transformers and their ability to solve graph problems. The study reveals, e.g., that depth has to scale as O(log(V+E)) with the graph size for parallelizable problems, with additional width scaling for search problems. Besides, there is a comparison between GNNs and Transformers (trained from scratch and fine-tuned T5) on the GraphQA benchmark. Prompting LLMs doesn’t really work.

🤓 Two papers on flow matching from Michael Bronstein’s lab: Fisher Flow Matching for Generative Modeling over Discrete Data by Oscar Davis et al - the best discrete FM model so far, and Metric Flow Matching for Smooth Interpolations on the Data Manifold by Kacper Kapusniak et al - improvement of the OT-CFM (conditional flow matching with optimal transport).

We’ll be posting more new cool papers in the coming days!
GraphAny: A Foundation Model for Node Classification on Any Graph

by Jianan Zhao, Hesham Mostafa, Michael Galkin, Michael Bronstein, Zhaocheng Zhu, Jian Tang

🚀 We have just released a new work!

Pre-trained on one graph (Wisconsin, with 120 labeled nodes), GraphAny generalizes to any unseen graph with arbitrary feature and label spaces - 30 new graphs - with an average accuracy of 67.26% in a fully inductive manner, surpassing GCN and GAT trained individually on each graph in the supervised regime.

GraphAny treats inference on a new graph as an analytical solution to LinearGNNs and thus enjoys inductive (training-free) inference on arbitrary feature and label spaces. The model learns inductive attention scores for each node to fuse the predictions of multiple LinearGNNs. It adaptively picks the most important LinearGNN channels by transforming distance features between LinearGNN predictions, e.g., high-pass filters are preferred on heterophilic graphs.
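A minimal sketch of the idea (our simplified reading with made-up names, not the official implementation): each LinearGNN channel is a closed-form ridge regression on propagated features, and a learned attention MLP fuses the per-channel predictions.

```python
import torch

def lineargnn_predict(A_hat, X, Y, train_mask, reg=1e-2, hops=1):
    """One 'LinearGNN' channel: propagate features `hops` times with a
    normalized adjacency A_hat, then solve ridge regression analytically on
    the labeled nodes and predict soft labels for all nodes."""
    H = X
    for _ in range(hops):
        H = A_hat @ H                                   # feature propagation
    H_tr, Y_tr = H[train_mask], Y[train_mask]
    d = H_tr.shape[1]
    W = torch.linalg.solve(H_tr.T @ H_tr + reg * torch.eye(d), H_tr.T @ Y_tr)
    return torch.softmax(H @ W, dim=-1)                 # (num_nodes, num_classes)

def fuse_channels(preds, dist_feats, attn_mlp):
    """Fuse predictions of several channels (e.g., low-pass, high-pass, identity)
    with per-node attention computed from prediction-distance features -
    the only learnable (and hence inductive) part."""
    scores = torch.softmax(attn_mlp(dist_feats), dim=-1)   # (num_nodes, num_channels)
    stacked = torch.stack(preds, dim=1)                    # (num_nodes, num_channels, num_classes)
    return (scores.unsqueeze(-1) * stacked).sum(dim=1)
```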

Unlike LLM-based models that can’t scale to large graphs, GraphAny can be efficiently trained on one graph and evaluated on 30 others - 3M nodes & 244M edges in total - in just 10 minutes. It works great on any 16GB GPU or even a CPU.

Finally, you can train a model on Cora and run inductive node classification on Citeseer, Pubmed, and actually any graph!

Paper, Code
Internship/Visiting period at NEC Labs Europe, Heidelberg

Guest post by Federico Errica

Who: Federico Errica is hiring a PhD student or Postdoc for a 6-month collaboration in the form of an internship or a visiting research period.

What: the collaboration will focus on improving and designing message passing methods that address long-range propagation issues, with applications to computational science problems.

How to apply: through the official website or LinkedIn
GraphML News (June 8th) - LOG’24, FoldFlow 2, more new papers

🎙️ The biggest announcement of the week is that LOG’24 will happen virtually once more before going physical at UCLA in 2025. The dates are Nov 26-29th, 2024, and the submission deadline is September 11th. LOG is known for much higher review quality - a considerable part of the whole budget is dedicated to monetary rewards for reviewers (one of the few venues that actually appreciates good reviews).

🧬 The Dreamfold team announced FoldFlow 2 - an improved version of the protein structure generative model that made Riemannian flow matching a mainstream topic. FoldFlow 2 adds an ESM2 encoder for protein sequences and is trained on a much bigger dataset (featuring filtered synthetic structures from SwissProt and AlphaFold 2 DB). Experimentally, FoldFlow 2 substantially improves over previous SOTA big guys, RFDiffusion and Chroma, on unconditional and conditional (motif scaffolding) generation tasks.

Besides, it’s never too late to remind you that Federico Errica is hiring interns and visiting researchers at NEC Labs in Heidelberg.

📚 The weeks after the NeurIPS deadline continue to bring cool submissions and accepted ICML papers!

- Topological GNNs went equivariant all the way:

Topological Neural Networks go Persistent, Equivariant, and Continuous (ICML’24) by Yogesh Verma et al
E(n) Equivariant Topological Neural Networks by Claudio Battiloro et al
E(n) Equivariant Message Passing Cellular Networks by Veljko Kovač et al feat. Erik Bekkers

- Theory on graph transformers and spectral GNNs (all will be at ICML’24)

What Improves the Generalization of Graph Transformers? A Theoretical Dive into the Self-attention and Positional Encoding by Hongkang Li et al
Aligning Transformers with Weisfeiler–Leman by Luis Müller and Chris Morris
On the Expressive Power of Spectral Invariant Graph Neural Networks by Bohang Zhang et al feat. Haggai Maron

- Transformers through the graph lens (both featuring Petar Veličković)

Transformers need glasses! Information over-squashing in language tasks by Federico Barbero et al - the old friend over-squashing is confirmed to be present in transformers
The CLRS-Text Algorithmic Reasoning Language Benchmark by Markeeva, McLeish, Ibarz et al - the text version of CLRS for all you LLM folks, a fresh unsaturated benchmark

- Combinatorial optimization with GNNs

Towards a General GNN Framework for Combinatorial Optimization by Frederik Wenkel, Semih Cantürk, et al
A Diffusion Model Framework for Unsupervised Neural Combinatorial Optimization by Sebastian Sanokowski et al
GraphML News (June 15th) - ICML’24 graph papers, musings on AF3, more Flow Matching

🎉 ICML 2024 papers (including orals and spotlights) are now visible on OpenReview (however, without search). If you don’t want to scroll through 100 pages of accepted papers manually or write a custom parser, Azmine Toushik Wasi compiled a collection of accepted Graph ML papers with a nice categorization.

👨‍🔬 More blogs on AlphaFold 3 and reflections on the future of TechBio: Charlie Harris focuses more on the technical side, whereas Carlos Outeiral presents the CompBio perspective, highlighting some cases where AF3 still underperforms.

🔀 Flow Matching continues to reach new heights with recently released papers: Variational Flow Matching (you didn’t forget ELBO and KL divergence, right?) by the UvA team of Floor Eijkelboom, Grigory Bartosh, et al (feat. Max Welling) derives a generalized flow matching formulation that naturally allows for categorical data (😼 CatFlow) and graph generation - the model outperforms DiGress and other diffusion baselines. At the same time, the NYU team of Boffi et al proposes Flow Map Matching - pretty much the Consistency Models of FMs, enabling generation in one step instead of 20-100. Finally, Ross Irwin et al from AstraZeneca come up with MolFlow - flow matching for generating 3D conformations of molecules, showing compelling results on QM9 and GEOM-Drugs.

📚 Weekend reading (no flow matching):

GraphStorm: all-in-one graph machine learning framework for industry applications by Da Zheng and AWS - we wrote about this GNN framework for enterprises back in 2023; here is the full paper with details.

CRAG - Comprehensive RAG Benchmark from Meta (and a Kaggle competition for $30K) - a factual QA benchmark that simulates queries to knowledge graphs and APIs. Vanilla RAG yields only 44% accuracy and fancy industrial models barely reach 63% - so there is plenty of room for improvement.

Explainable Graph Neural Networks Under Fire by Zhong Li et al. feat. Stephan Günnemann. Turns out most GNN explainers utterly fail and cannot be trusted in the presence of simple adversarial perturbations. Let us know if you ever found a case where GNN explainers actually work 🤭
GraphML News (June 22nd) - $30M seed for CuspAI, Graph Foundation Models, MoML 2024

💸 A new startup CuspAI by Max Welling and Chad Edwards, focusing on materials discovery and design for clean energy and sustainability, raised a $30M seed round (led by Hoxton, Basis Set, and Lightspeed). The support from the godfathers is significant - Geoff Hinton is a board advisor, and Yann LeCun commented on the collaboration with the FAIR and OpenCatalyst teams on OpenDAC. The materials design area is getting hotter - not as hot as drug discovery and protein design, but steadily growing: in addition to Radical AI, Orbital Materials, and the new CuspAI, a fresh Entalpic founded by ex-Mila folks raised $5M+.

🔖 Together with Michael Bronstein, we released a new blog post on Graph Foundation Models. First, we define what GFMs are and what the key design challenges are, covering heterogeneous model expressivity, scaling laws, and data scarcity. Then, we describe several successful examples of recent generalist models that can be considered GFMs in a particular area, e.g., GraphAny for node classification, ULTRA for KG reasoning, and MACE-MP-0 for universal potentials. We made sure to include all the recent references, including position papers to appear at ICML’24!

🧬 The Molecular ML 2024 conference took place in Montreal this week (concluding the ML for Drug Discovery summer school) and featured talks on drug discovery and drug design. The recording is already available - check out talks by Jian Tang (BioGeometry) on geometric DL for proteins and by Max Jaderberg (Chief AI Officer at Isomorphic Labs) on AlphaFold 3. Might be one of the first public talks on AF3!

Weekend reading:

More benchmarks (brought to you by the NeurIPS Datasets & Benchmarking track deadline).

Temporal Graph Benchmark 2.0 by Gastinger, Huang et al - the first large-scale benchmark for temporal KGs and heterogeneous graphs

Text-space Graph Foundation Models by Chen et al feat. Anton Tsitsulin and Bryan Perozzi - a collection of text-attributed graphs for node classification, link prediction, and graph-level tasks

Towards Neural Scaling Laws for Foundation Models on Temporal Graphs by Shirzadkhani, Ngo, Shamsi et al - perhaps the first evidence that one temporal GNN can generalize to different temporal graphs (here those are token transactions in Ethereum)

RNA-FrameFlow: Flow Matching for de novo 3D RNA Backbone Design by Rishabh Anand, our own Chaitanya K. Joshi, et al - equivariant flow matching for generating 3D RNA structures.
GraphML News (June 29th) - ESM 3, TDC 2, AI 4 Genomics Conference

🧬 EvolutionaryScale (formerly a team at Meta, now a standalone startup) released ESM 3 - the next version of the SOTA protein LM, pretty much the GPT-4 of pLMs. Now it’s not only a sequence model, but also a structure and function model. Following best LLM practices, ESM 3 even employs RLHF for alignment! Besides, the model features SE(3)-invariant geometric attention based on distances between frames (equivariance is not dead!) and a VQ-VAE to tokenize structures and functions. The ESM 3 family comes in three sizes: 1.4B with open weights, 8B and 98B available via the API (it’s time to embrace that). The preprint is quite informative about training data and pre-/post-training and RLHF details - kudos for not sweeping them under the rug. The model code is also available, so you only need 10,000 H100s to train it on your own 🙂

💊 The team of Harvard, MIT, and Stanford researchers led by Marinka Zitnik released Therapeutics Data Commons 2, adding even more datasets and modalities: over 1,000 single-cell datasets spanning 85M+ cells, the first protein-peptide binding dataset, drug-target interaction data, clinical trials data, and much more, covering 10+ modalities. TDC-2 packages several pre-trained embeddings and can be used for evaluating a variety of models - from LLMs to GNNs. TDC-1 received some criticism from drug discovery folks back in the day, let’s see if TDC-2 closes those gaps.

The AI for Genomics and Health conference will be held in Boston, Oct 17-18th, with a stellar lineup of speakers including Shekoofeh Azizi (DeepMind, the author of Med-PaLM), Mo Lotfollahi (Sanger Institute), Sergey Ovchinnikov (MIT), Marinka Zitnik (Harvard), and James Zou (Stanford).

🔮 A small update: MatterSim by MSR AI4Science became SOTA on MatBench Discovery, beating the recent GNoME from DeepMind - competition works wonders even in such advanced scientific topics as materials discovery, ML potentials, and molecular dynamics.

Weekend reading:

Multimodal Graph Benchmark by Zhu et al feat. Danai Koutra - three datasets combining graphs, texts, and images for node classification and link prediction tasks.

Transformers meet Neural Algorithmic Reasoners by Bounsi et al feat. Petar Veličković - a Transformer cross-attending to the pre-trained Triplet-GMPNN solves algorithmic reasoning tasks (CLRS-Text) better than the vanilla Transformer (but still struggles with OOD generalization)

Clifford-Steerable Convolutional Neural Networks by Maksim Zhdanov et al - ConvNets go spacetime: equivariant to the Lorentz group and useful for electrodynamics. The thread by Maurice Weiler explains the work in much more detail. Someday (after another PhD in math and physics) I will be able to understand the math behind this paper.
GraphML News (July 7th) - ICML Workshops, AI4Science Lectures, GraphRAG release

📚 ICML workshops started publishing their accepted papers on OpenReview. Remember that workshop papers are a good signal of upcoming full papers at the next big conferences, so you might find something interesting! Among others, check out:

- GRaM workshop (Geometry-grounded Representation Learning and Generative Modeling),
- AI4Science,
- TF2M (Theoretical Foundations of Foundation Models)
- SPIGM (Structured Probabilistic Inference & Generative Modeling)

📺 The Simons Institute at Berkeley recently organized a workshop, AI≡Science: Strengthening the Bond Between the Sciences and Artificial Intelligence, with a stellar lineup including Tess Smidt, Mohammed AlQuraishi, Rafael Gomez-Bombarelli, and many others. All lecture recordings are now available.

🚒 Microsoft Research released GraphRAG, their take on graph-enriched RAG, on GitHub along with an accompanying blogpost. The repo received 6K stars in just 5 days 📈.

Weekend reading:

Foundations and Frontiers of Graph Learning Theory by Yu Huang et al. feat. Muhan Zhang - a survey on GNN theory that can accompany the recent ICML position paper by Morris et al.

Aligning Target-Aware Molecule Diffusion Models with Exact Energy Optimization by Siyi Gu, Minkai Xu et al feat. Jure Leskovec - perhaps the first diffusion model for ligand generation (conditioned on the pocket) with the DPO alignment (RLHF without H).
This year's ICML will finally have a tutorial on graphs! Adrian Arnaiz-Rodriguez and Ameya Velingker will present a tutorial on Graph Learning: Principles, Challenges, and Open Directions.
🗓️ Date: Monday, July 22
🕒 Time: 15:30 CEST - 17:30 CEST
📍 ICML In-person Event: Hall A8, ICML Venue
📍 Virtual attendance: https://icml.cc/virtual/2024/tutorial/35233

What to expect?
- Intro to Graph Learning and GNNs: introduction to traditional graph representations, Graph Neural Networks (GNNs), Message Passing Neural Networks (MPNNs), Graph Transformers (GTs), and spectral quantities.
- Expressiveness and Generalizability: GNN expressivity linked with the WL test, generalizability of MPNNs, and their performance implications.
- Challenges in GNNs: Understanding and addressing under-reaching, over-smoothing, over-squashing, and graph rewiring techniques.
- Panel Discussion on Future Directions: a panel with Michael Bronstein, Bryan Perozzi, Christopher Morris, and more panelists TBC. We will discuss GNN limitations, graph foundation models, and integrating GNNs with large language models (LLMs).

This tutorial balances introductory content and advanced insights, aimed at both general audiences and experts. Don’t miss this opportunity to deepen your understanding of GNNs!
GraphML News (July 13th) - Recursion goes brrr, Acquisition of Graphcore, Illustrated AF3

💸 Recursion and NVIDIA launched BioHive-2, a GPU cluster made of 504 H100s, which is roughly equivalent to an exaflop in FP16/BF16 and perhaps sub-$50M in cost. Some napkin math indicates it could train and fine-tune a full AlphaFold 3-like model in about 4 days. Except for ESM-3, we haven’t yet seen drug discovery models trained on such compute - congrats to Recursion, Valence, and the researchers and engineers who can now really go brrr.
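The napkin math behind that figure, as a rough sketch (the per-GPU peak number is an assumption based on published specs):

```python
# Back-of-the-envelope peak throughput of 504 H100s (assumed dense BF16 peak per GPU).
h100_bf16_dense_tflops = 989          # ~0.99 PFLOPS dense BF16 per H100 SXM (assumed)
gpus = 504
cluster_pflops = gpus * h100_bf16_dense_tflops / 1000
print(f"~{cluster_pflops:.0f} PFLOPS dense BF16")
# ≈ 500 PFLOPS dense, i.e. roughly an exaflop when counting 2:4 structured sparsity.
```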

💸 Graphcore, a UK hardware startup offering its own hardware platform (BOW IPUs), was acquired by SoftBank for a rumored $500M (back in 2020 its valuation was about $2.8B). Former employees likely lost their vested options ($500M is still less than the $600M originally invested into the company), but let’s hope the future will now be more stable for Graphcore and we will see more successful products.

🧬 The Illustrated AlphaFold by Elana Simon and Jake Silberg from Stanford (inspired by the Illustrated Transformer) visually explains the main building blocks of the model - from the input data through Pairformer and triangular attention to the diffusion module and the training losses. Things get much simpler indeed when you know which tensor shapes are involved at each particular step.

Weekend reading:

Link Prediction with Untrained Message Passing Layers by Lisi Qarkaxhija, Anatol E. Wegner, and Ingo Scholtes - the unreasonable effectiveness of untrained MPNNs strikes back

SE(3)-Hyena Operator for Scalable Equivariant Learning by Artem Moskalev et al - FFT with Clifford MLPs enables an equivariant Hyena on long sequences of up to 3.5M tokens on a single GPU

On the Expressive Power of Sparse Geometric MPNNs by Yonatan Sverdlov, Nadav Dym - enabling equivariant GNNs on sparse graphs (usually EGNNs work on fully-connected graphs)
GraphML News (July 20th) - Pinder and Plinder, LAB bench, ICML 2024

🎙️ ICML 2024 starts next week - enjoy the conference and Vienna if you are participating this year! Besides the main program, Monday will feature the graph learning tutorial, and Thursday and Friday have a handful of graph-related workshops.

🧬 VantAI, together with MIT, NVIDIA, UniBasel, and SIB, introduces two novel large-scale benchmarks: Pinder (Protein INteraction Dataset and Evaluation Resource) and Plinder (Protein-Ligand Interaction Dataset and Evaluation Resource). Pinder includes 500x more data than PPIRef, and Plinder is roughly 10x larger than DockGen - the previously largest datasets in the area, which were susceptible to test set leakage. Re-training SOTA diffusion models on Pinder and Plinder yields much lower results, indicating that saturation is far away (at least for the coming year). Besides, it is great to see an industrial company (from a highly competitive CompBio area) contributing to the field with open datasets. Pinder and Plinder will be the main datasets for the upcoming ML for Structural Bio challenge at NeurIPS 2024, so prepare your GPUs and diffusion models.

🔬 FutureHouse released LAB-Bench for studying LLMs in biology and chemistry. The benchmark includes 8 categories where LLMs have to deal with figures, images, scientific literature, databases, and designing protocols. Recent LLMs and VLMs (GPT-4o, Claude, and Llama-3) all show rather underwhelming results on those tasks - finally, a new unsaturated benchmark for the LLM crowd! The authors held out some data to check training contamination of future models (e.g., whether the training data of the next generation of such models includes validation and test splits of the datasets).

Weekend reading:

Beyond Euclid: An Illustrated Guide to Modern Machine Learning with Geometric, Topological, and Algebraic Structures by Sophia Sanborn, Johan Mathe, Mathilde Papillon, et al - a massive survey with amazing illustrations

PINDER: The protein interaction dataset and evaluation resource by Daniel Kovtun, Mehmet Akdel, and VantAI folks feat. Michael Bronstein

PLINDER: The protein-ligand interactions dataset and evaluation resource by Janani Durairaj, Yusuf Adeshina, and VantAI folks

LAB-Bench: Measuring Capabilities of Language Models for Biology Research by Jon M. Laurent, Joseph D. Janizek, et al feat. Andrew White
GraphML News (July 27th) - LLMs in Chemistry, Discrete Flow Matching

ICML kept most of the community busy (Saturday is the last day of workshops), while in other news Llama 3.1, SearchGPT, AlphaProof, and AlphaGeometry 2 took the headlines of the approaching AGI singularity. Anyway, August will likely be a quieter month.

Some fresh works for the weekend reading:

A Review of Large Language Models and Autonomous Agents in Chemistry by Mayk Ramos, Christopher Collison, and Andrew White - a massive survey on what the current generation of LLMs can do in chemistry - from property and synthesis prediction to tool-augmented and multi-modal frontier models for orchestrating automated discovery labs (paying respects to the LLM week).

Discrete Flow Matching by Itai Gat and Meta FAIR, including Ricky Chen and Yaron Lipman - the OG authors of (Riemannian) Flow Matching. Discrete FM is now competitive with Llama 2/3 on coding tasks - so we should expect this module in all generative models for molecules, proteins, and crystals around ICLR’25 submissions and later.

Generative Modeling of Molecular Dynamics Trajectories by Bowen Jing and Hannes Stärk - MD via stochastic interpolants, supports accurate forward simulation, upsampling, interpolation between two states in the trajectory, and even inpainting of the simulated structure.
Seminar on Graph-based Causal Discovery in Computational Biology

🎓 Topic: "Causal discovery from multivariate information in biological and biomedical data"
👨‍🔬 Who: Hervé Isambert, The Isambert Lab, CNRS, Institut Curie, Paris
When: Monday, July 29th, 5pm CEST

Abstract: In this webinar, I will present the principles and limitations of graph-based causal discovery methods and their improvement using multivariate information decomposition, recently developed in my lab. Applications will range from gene expression data in single cells to nationwide medical databases of cancer patients. I will then discuss the theoretical link between graph-based causality and temporal (Granger-Schreiber) causality, which can both be expressed in terms of conditional multivariate information. While temporal causality is shown to imply graph-based causality, the converse may not be true (see Figure). An application to time series data concerns the analysis of video images of reconstituted tumor ecosystems, which uncovered a novel antagonistic effect of cell-cell interactions under therapeutically relevant conditions.

The Zoom link will appear in this channel shortly before 5pm
GraphML News (August 3rd) - NeurIPS workshops, MoML @ MIT, RUM and GraM

⛷️ NeurIPS’24 announced 56 accepted workshops (brace yourself, Vancouver convention center). In addition to a good bunch of LLM, VLM, and foundation model-focused events, graph and geometric learning folks might be interested in:

- AI for New Drug Modalities
- Machine Learning in Structural Biology
- Symmetry and Geometry in Neural Representations
- Multimodal Algorithmic Reasoning
- Machine Learning and the Physical Sciences
- AI for Accelerated Materials Design

🧬 The second part of MoML 2024 (Molecular ML) will be happening at MIT on November 5th; you can submit short papers until October 10th. The authors of accepted papers get free admission!

💎 The GRaM workshop at ICML’24 published accepted blogposts with some hidden gems like a JAX implementation of EGNN, an intro to equivariant neural fields, and a study of how consistency models don’t work for 3D molecule generation. Check out the others as well - most of them require only entry-level background.

📈 Non-convolutional Graph Neural Networks by Yuanqing Wang and Kyunghyun Cho (the OG of GRUs) introduces RUM (random walk with unified memory) nets, free of convolutions. Practically, the RUM recipe is: sample random walks together with anonymous node-ID sequences (tracking the first occurrence of each node ID in the walk), encode both sequences via RNNs (sure, you can drop in your fav Mamba here), and concat both vectors with an MLP on top - a rough sketch is below. The authors show RUMs are more expressive than 1-WL GNNs while not suffering from oversmoothing and oversquashing (and beat the baselines on a bunch of benchmarks). Interestingly, RUMs look like DeepWalk on steroids with several improvements. Is Bryan Perozzi the Noam Shazeer of graph learning? 🤔
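A toy paraphrase of that recipe (made-up names and shapes, not the authors' code):

```python
import torch
import torch.nn as nn

def sample_walk(adj_list, start, length):
    """Sample one random walk plus its anonymized version, where every node
    is replaced by the index of its first occurrence in the walk."""
    walk = [start]
    for _ in range(length - 1):
        nbrs = adj_list[walk[-1]]
        walk.append(nbrs[torch.randint(len(nbrs), (1,)).item()])
    first_seen, anon = {}, []
    for v in walk:
        first_seen.setdefault(v, len(first_seen))
        anon.append(first_seen[v])
    return walk, anon

class RUMLayer(nn.Module):
    """Encode the node-feature sequence and the anonymous-ID sequence of a walk
    with two RNNs, then merge both summaries with an MLP."""
    def __init__(self, in_dim, hid_dim, max_walk_len=16):
        super().__init__()
        self.feat_rnn = nn.GRU(in_dim, hid_dim, batch_first=True)
        self.anon_emb = nn.Embedding(max_walk_len, hid_dim)
        self.anon_rnn = nn.GRU(hid_dim, hid_dim, batch_first=True)
        self.mlp = nn.Sequential(nn.Linear(2 * hid_dim, hid_dim), nn.ReLU(),
                                 nn.Linear(hid_dim, hid_dim))

    def forward(self, walk_feats, anon_ids):
        # walk_feats: (batch, walk_len, in_dim); anon_ids: (batch, walk_len)
        _, h_feat = self.feat_rnn(walk_feats)
        _, h_anon = self.anon_rnn(self.anon_emb(anon_ids))
        return self.mlp(torch.cat([h_feat[-1], h_anon[-1]], dim=-1))
```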

More weekend reading:

Spatio-Spectral Graph Neural Networks by Simon Geisler et al feat. Stephan Günnemann - spectral GNNs can be strong performers, too - just to contrast with RUMs

Learning production functions for supply chains with graph neural networks by Serina Chang et al feat Jure Leskovec - a cool work that frames supply chains as temporal graphs, shows significant gains in prediction accuracy, and releases the data simulator

What Are Good Positional Encodings for Directed Graphs? by Yinan Huang, Haoyu Wang, and Pan Li. The answer is the Magnetic Laplacian with multiple potential factors (multi-q) - your best choice for DAGs.
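For reference, a small numpy sketch of the (single-q) magnetic Laplacian whose eigenvectors serve as directed positional encodings (our simplified version; the paper uses multiple potentials q):

```python
import numpy as np

def magnetic_laplacian(A, q=0.25):
    """Magnetic Laplacian of a directed graph with adjacency A (n x n, 0/1).
    The symmetrized graph gets a complex phase encoding edge direction; the
    eigenvectors of this Hermitian matrix can be used as positional encodings."""
    A_sym = ((A + A.T) > 0).astype(float)     # undirected support
    theta = 2 * np.pi * q * (A - A.T)         # +/- phase depending on edge direction
    H = A_sym * np.exp(1j * theta)            # Hermitian "magnetic" adjacency
    D = np.diag(A_sym.sum(axis=1))
    return D - H

A = np.array([[0, 1, 0],
              [0, 0, 1],
              [0, 0, 0]], dtype=float)        # tiny DAG: 0 -> 1 -> 2
eigvals, eigvecs = np.linalg.eigh(magnetic_laplacian(A, q=0.25))
```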
GraphML News (August 10th) - Summer School recordings, DD merger

🖥️ Recordings from the ML for Drug Discovery Summer School are now available covering 5 days of talks with 28 videos - from basics of GNNs for chemistry and equivariance to protein folding, ML potentials, simulations, protein-protein (-ligand) binding, to generative modeling and causal discovery.

🖥️ The Eastern European ML Summer School’24 also published their recordings - 25 videos covering a more general area of deep learning including LLMs, reasoning, VLMs, RL, generative models, Bayesian DL, and many more. Notebooks from the practical sessions are available on GitHub.

Both schools feature the most up-to-date material from the top experts in the field, quite the gems to watch during the summer break 💎.

⚛️ Continuing with the quality content, Sophia Tang published a massive, 2.5-hour-read guide to spherical equivariant graph transformers, deriving them from first principles - from spherical harmonics to Tensor Field Networks to the SE(3)-Transformer. Lots of illustrations with accompanying code. The best tutorial on the topic so far.

💸 News from the Geometric Wall Street Journal: a huge merger between Recursion and Exscientia (focusing on precision oncology) - actually, Recursion bought Exscientia for $688M in stock, continuing its acquisition spree (besides BioHive-2 with its 504 H100s). (Not stonks advice.)

Weekend reading:

The Heterophilic Graph Learning Handbook: Benchmarks, Models, Theoretical Analysis, Applications and Challenges by Sitao Luan feat. Rex Ying and Stefanie Jegelka - everything you wanted to know about heterophilic graphs in 2024

When Heterophily Meets Heterogeneity: New Graph Benchmarks and Effective Methods by Junhong Lin et al - introduces H2DB, a collection of known and new heterophilic and heterogeneous graphs, much larger than existing datasets.
GraphML News (August 17th) - Spanner Graph, some new papers

🔧 Google announced Spanner Graph - an infinitely scalable graph database (like vanilla Spanner) with all the bells and whistles GDBMSs have in 2024: support for both the Graph Query Language (GQL, finally standardized by ISO in April after 8 years of work) and SQL, vector search and full-text search, and basic graph algorithms at query time.

Otherwise, it’s mid-August and vacation time, so probably no major news for the next few weeks.

Weekend reading:

Training Language Models on the Knowledge Graph: Insights on Hallucinations and Their Detectability by a large DeepMind team - turns out that reducing hallucinations when training LLMs on KGs (i.e., recalling training triples) requires an order of magnitude more compute than Chinchilla scaling laws suggest. Lots of qualitative results - have a look! Besides, it is one of the accepted papers at COLM - a new conference specifically tailored for LLM research (rip, ACL/EMNLP).

Topological Blind Spots: Understanding and Extending Topological Deep Learning Through the Lens of Expressivity by Yam Eitan et al. feat. Haggai Maron - one of the first studies of the expressive power of topological (higher-order) MPNNs. Turns out standard models based on simplicial or cellular complexes cannot distinguish many common topological patterns, like a Möbius strip vs a cylinder. The authors then derive provably more powerful, scalable multi-cell networks.

Tokenized and Continuous Embedding Compressions of Protein Sequence and Structure by Amy X. Lu et al feat. Pieter Abbeel and Kyunghyun Cho - a deep dive into the latent space of ESMFold, which happens to be quite sparse: it can be reduced by 128x without losing prediction performance.