Graph Machine Learning
Everything about graph theory, computer science, machine learning, etc.


If you have something worth sharing with the community, reach out @gimmeblues, @chaitjo.

Admins: Sergey Ivanov; Michael Galkin; Chaitanya K. Joshi
​​Learning on Graphs @ NYC meetup (Feb 29th - March 1st) online streaming

The 2-day LoG meetup taking place in Jersey City will be streamed online, openly for everyone! The speakers include the Google Research team (who will surely talk like a graph), Ricky Chen and Brandon Amos from Meta AI, a biotech presence with Matthew McPartlon and Luca Naef from VantAI and Samuel Stanton from Genentech, and many more (see the schedule attached).
GraphML News (March 2nd) - Categorical Deep Learning, Evo, and NeuralPlexer 2

🔀 A fresh look at deep learning from the category theory perspective: Categorical Deep Learning: An Algebraic Theory of Architectures by Bruno Gavranović, Paul Lessard, Andrew Dudzik, featuring Petar Veličković. The position paper attempts to generalize Geometric Deep Learning even further - by means of monad algebras that generalize invariance, equivariance, and symmetries (🍞 and 🧈 of GDL). The main part quickly ramps up to some advanced category theory concepts but the appendix covers the basics (still recommend Cats4AI as a pre-requisite though).
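For reference (my addition, not from the paper): the textbook definition of an algebra over a monad (T, η, μ) that the abstract builds on - an object A with a structure map α: TA → A satisfying the unit and multiplication laws:

```latex
\alpha \circ \eta_A = \mathrm{id}_A,
\qquad
\alpha \circ \mu_A = \alpha \circ T\alpha .
```

The high-level pitch is that familiar GDL constraints (invariance, equivariance) then arise as algebras for particular monads encoding the symmetry action.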

🧬 Evo - a foundation model by Arc Institute for RNA/DNA/protein sequences based on the StripedHyena architecture (state space models and convolutions) with a context length of 131K tokens. Some applications include zero-shot function prediction for ncRNA and regulatory DNA, CRISPR system generation, generating whole genome sequences, and many more. Adepts of the church of scaling laws might be interested in the promising scaling capabilities of Evo, which seems to outperform Transformers and the recent Mamba.

🪢 NeuralPlexer 2, a generative model for protein-ligand docking from Iambic, Caltech, and NVIDIA, challenges AlphaFold-latest in several benchmarks: 75.4% RMSD <2Å on PoseBusters vs 73.6% of AlphaFold-latest without site specification, and up to 93.8% with site specification, while being about 50x faster than AlphaFold. The race in comp bio intensifies, moats are challenged, and for us it means we’ll see more cool results - at the cost of more proprietary models and closed data though.

Weekend reading:

Graph Learning under Distribution Shifts: A Comprehensive Survey on Domain Adaptation, Out-of-distribution, and Continual Learning by Man Wu et al.

TorchMD-Net 2.0: Fast Neural Network Potentials for Molecular Simulations by Raul P. Pelaez, Guillem Simeon, et al - the next version of the popular ML potential package, now up to 10x faster thanks to torch compile! (from that perspective, a switch to JAX seems inevitable)

Weisfeiler-Leman at the margin: When more expressivity matters by Billy Franks, Chris Morris, Ameya Velingker, and Floris Geerts - a new study on expressivity and generalization of MPNNs that continues WL meet VC
GraphML News (March 10th) - Protein Design Community Principles, RF All Atom weights, ICLR workshops

🤝 More than 100 prominent researchers in protein design, structural biology, and geometric deep learning committed to the principles of Responsible AI in Biodesign. Recognizing the increasing capabilities of deep learning models in designing functional biological molecules, the community came up with several core values and principles such as benefiting society, safety and security, openness, equity, international collaboration, and responsibility. Particular commitments include more scrutiny of hazardous biomolecules before manufacturing and better evaluation and risk assessment of DL models. Good for the protein design community - let’s hope these are implemented in practice!

🧬 Committing to the newly introduced principles, Baker’s lab released RosettaFold All-Atom and RFDiffusion All-Atom together with their model weights and several inference examples. Folks on Twitter who interpret the principles as “closed-source AI taking over” are obviously wrong 😛

📚 ICLR 2024 workshops started posting accepted papers - so far we see papers from AI 4 Differential Equations, Representational Alignment, and Time Series for Health. ICLR workshop papers are usually good proxies for ICML and NeurIPS submissions, so you might want to check those in your domain.

Weekend reading:

A Survey of Graph Neural Networks in Real world: Imbalance, Noise, Privacy and OOD Challenges by Wei Ju et al

Graph neural network outputs are almost surely asymptotically constant by Sam Adam-Day et al. feat. Ismail Ilkan Ceylan

Pairwise Alignment Improves Graph Domain Adaptation by Shikun Liu et al feat. Pan Li

Understanding Biology in the Age of Artificial Intelligence by Elsa Lawrence, Adham El-Shazly, Srijit Seal feat. our own Chaitanya K. Joshi
GraphML News (March 16th) - RelationRx round, Caduceus, Blogposts, WholeGraph

💸 Relation Therapeutics, the drug discovery company, raises $35M seed funding led by DCVC and NVentures (VC arm of NVIDIA) - making it $60M in total after factoring in the previous round in 2022. Relation is developing treatments for osteoporosis and other bone-related diseases.

⚕️ The race between Mamba and Hyena-like architectures for long-context DNA modeling is heating up: Caduceus by Yair Schiff featuring Tri Dao and Albert Gu is the first bi-directional Mamba equivariant to the reverse complement (RC) symmetry of DNA. Similarly to the recent Evo, it supports sequence lengths up to 131k. In turn, a new blog post by Hazy Research on Evo hinted at the new Mechanistic Architecture Design framework that employs synthetic probes to check long-range modeling capabilities.
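For those unfamiliar with the RC symmetry: DNA can be read from either strand, so predictions should be consistent under reverse-complementation. A toy illustration (my own sketch, not from the Caduceus code):

```python
# Watson-Crick base pairing
COMP = {"A": "T", "T": "A", "C": "G", "G": "C"}

def reverse_complement(seq: str) -> str:
    # The same physical DNA molecule read from the opposite strand.
    return "".join(COMP[b] for b in reversed(seq))

# RC-equivariance (roughly): for per-position outputs f,
# f(reverse_complement(x)) should equal f(x) with positions reversed
# (and strand-specific channels swapped), not something arbitrary.
print(reverse_complement("ACCGT"))  # -> "ACGGT"
```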

💬 A new Medium blogpost by Xiaoxin He (NUS Singapore) on chatting with your graph - dedicated to the recent G-Retriever paper on graph-based RAG for question answering tasks. The post goes through the technical details (perhaps the most interesting part is prize-collecting Steiner Tree for subgraph retrieval) and positions the work in the flurry of recent Graph + LLM approaches including Talk Like a Graph (highlighted in the recent Google Research blogpost) and Let the Graph do the Talking. Fun fact: now we have 2 different datasets named GraphQA with completely different contents and tasks (one from G-Retriever, another one from the Google papers).

💽 The WholeGraph Storage by NVIDIA for PyG and DGL - a handy way for distributed setups to keep a single graph in the shared storage accessible by the workers. WholeGraph comes in three flavors: continuous, chunked, and distributed.

Weekend reading:

Personalized Audiobook Recommendations at Spotify Through Graph Neural Networks by Marco De Nadai, Francesco Fabbri, and the Spotify team - Heterogeneous GNNs + The Two (MLP) Towers for SOTA RecSys.

Universal Representation of Permutation-Invariant Functions on Vectors and Tensors by Puoya Tabaghi and Yusu Wang (UCSD) - when encoding sets of N D-dimensional vectors, DeepSets require a latent dimension of N^D. This cool work reduces this bound to 2ND 👀 (a quick sketch of the sum-decomposition setup is below, after this reading list).

Generalizing Denoising to Non-Equilibrium Structures Improves Equivariant Force Fields by Yi-Lun Liao, Tess Smidt, Abhishek Das - the success of a Noisy Nodes-like auxiliary denoising objective is extended to non-equilibrium structures by additionally encoding their forces. Yields SOTA on OpenCatalyst (if you have 16-128 V100s though).
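Regarding the DeepSets item above - a reminder of the sum-decomposition setup behind that latent-dimension bound (my notation, sketching the setup rather than the proof):

```latex
f(x_1,\dots,x_N) \;=\; \rho\Big(\sum_{i=1}^{N} \phi(x_i)\Big),
\qquad \phi:\mathbb{R}^{D}\to\mathbb{R}^{L},\quad \rho:\mathbb{R}^{L}\to\mathbb{R},
```

where L is the latent dimension of the summed embeddings: the previously known universality guarantees quoted in the paper need L on the order of N^D, and the new result shows 2ND suffices.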
GraphML News (March 23rd) - Profluent round, Biology 2.0, TacticAI

💸 Profluent, a Berkeley biotech startup founded in 2022, raises $35M (overall $44M so far). The company focuses on protein generation models in the context of CRISPR gene editing. VC funding in the biotech industry is on fire in 2024!

🧬 A huge blogpost The Road to Biology 2.0 Will Pass Through Black-Box Data by Michael Bronstein and Luca Naef offers a new perspective on the area of ML for biology and its common problem of lacking large amounts of labeled data. The idea is to leverage low-cost high-throughput data (eg, obtained from experimental facilities), coined as “black-box data”, that might not be directly understandable by humans (or experts) but can be used for training large-scale ML models even in the self-supervised regime. It is then hypothesized that the competitive edge would belong to the companies that manage to build such data pipelines and models. Time to convince old-school chemists about the benefits of black-box data.

Google DeepMind officially introduced TacticAI with a publication in Nature Communications (we wrote about it in the End-Of-The-Year post a few months ago, at the preprint stage). TacticAI uses group-equivariant convnets and is designed for football, giving tactical insights in many practical cases such as corner kicks. Interestingly, experts prefer TacticAI outputs 90% of the time. Equivariance + ⚽ = 📈

Weekend reading:

Atomically accurate de novo design of single-domain antibodies from the Baker Lab - RFDiffusion for antibodies

Weisfeiler and Leman Go Loopy: A New Hierarchy for Graph Representational Learning by Raffaele Paolino, Sohir Maskey, Pascal Welke, and Gitta Kutyniok - WL visited one more location
GraphML News (March 30th) - AlphaFold course, Upcoming Summer Schools

The first week of ICML rebuttals has passed, one week to go - good luck everyone 💪

EMBL-EBI, together with Google DeepMind, released a free entry-level course on the basics of protein folding and using AlphaFold for structure prediction. The course helps you understand the inputs and outputs of AlphaFold and how to interpret its metrics and predictions, and covers a bit of more advanced usage.

A handful of summer schools covering lots of Graph and Geometric DL were announced recently:

- Eastern European ML Summer School | 15-20 July 2024, Novi Sad, Serbia
- ELLIS Summer School on Machine Learning for Healthcare and Biology | 11-13 June 2024, Manchester, UK
- Generative Modeling Summer School | 24-28th June 2024, Eindhoven, Netherlands
- The workshop on mining and learning with graphs (MLG) will be co-located with ECML PKDD in Vilnius, Lithuania in September 2024 featuring keynotes by Yllka Velaj and Haggai Maron.

Weekend reading:

A new version of the Hitchhiker’s guide on Geometric GNNs featuring frame-based invariant GNNs and unconstrained GNNs (btw, the paper will be presented at the next LoGaG reading group on Monday, April 1st)

Space Group Informed Transformer for Crystalline Materials Generation - autoregressive, transformer-based crystal generation that takes into account space groups and Wyckoff positions (a competing diffusion model DiffCSP++ was accepted at ICLR’24)

Graphs Generalization under Distribution Shifts by Tian et al

Addressing heterophily in node classification with graph echo state networks by Alessio Micheli and Domenico Tortorella - applies a reservoir computing approach, that is, randomly initializing GNN weights so as to obtain a desired Lipschitz constant (see the sketch right below)
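A rough sketch of the reservoir idea (simplified to a feedforward stack here; the actual graph echo state networks use recurrent reservoir dynamics):

```python
import torch

def reservoir_gnn_embed(adj_norm, X, hidden=64, layers=4, rho=0.9, seed=0):
    # Reservoir computing: the propagation weights are random and never trained.
    # Rescaling each W to spectral norm `rho` < 1 keeps every layer contractive -
    # this is the Lipschitz / echo-state knob the paper tunes.
    g = torch.Generator().manual_seed(seed)
    H = X
    for _ in range(layers):
        W = torch.randn(H.shape[1], hidden, generator=g)
        W = W * (rho / torch.linalg.matrix_norm(W, ord=2))
        H = torch.tanh(adj_norm @ H @ W)   # message passing with frozen weights
    return H  # only a linear readout (e.g. ridge regression) on H gets trained
```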
GraphML News (April 6th) - Leash Bio Round, The BELKA Kaggle Competition, Sparse Ops speedups

💸 Leash Biosciences (founded by ex-Recursion folks) announced a $9.3M seed round led by Springtide Ventures. Leash focuses on building huge proprietary datasets for protein-molecule interactions.

🐿️ At the same time, Leash launched a new Kaggle competition on predicting the binding affinity of small molecules to proteins using the Big Encoded Library for Chemical Assessment (BELKA). The dataset contains about 133M small molecules vs 3 proteins (sEH, BRD4 and HSA). Protein-ligand binding diffusion models like DiffDock are allowed as well. Who will win: comp bio folks with domain expertise or Kaggle grandmasters with expertise on finding data leakages? 🤔 We’ll see in 3 months.

📈 Zhongming Yu and the team from UCSD, Stanford, and Intel released GeoT - a tensor-centric library for GNNs via efficient segment reduction on GPU. The library ships efficient CUDA kernels for sparse operations like scatter summation and fused message-aggregation kernels. On average, GeoT brings 1.7-3.5x speedups over PyG sparse ops, and 2.3-3.6x over PyG dense ops. Looking forward to seeing the kernels in major libraries and, hopefully, a Triton version.

🧜‍♂️ Recently, I played around quite a lot with writing Triton kernels that fuse the message and aggregation steps of several GNN architectures into one kernel call, and I can highly recommend trying to speed up your models with them. Triton kernels are written in Python (no more shooting yourself in the foot with C++), are compiled automatically into efficient code on several platforms (CUDA, ROCm, and even Intel GPUs), and are often faster than CUDA kernels.
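To give a flavor - a minimal sketch (not the exact kernels I used) of a fused sum-aggregation where the "message" is simply the source-node feature, scatter-added into the destination rows with atomics:

```python
import torch
import triton
import triton.language as tl

@triton.jit
def fused_msg_agg_kernel(x_ptr, src_ptr, dst_ptr, out_ptr,
                         num_edges, feat_dim, BLOCK: tl.constexpr):
    pid, feat = tl.program_id(0), tl.program_id(1)   # a block of edges, one feature column
    offs = pid * BLOCK + tl.arange(0, BLOCK)
    mask = offs < num_edges
    src = tl.load(src_ptr + offs, mask=mask, other=0)
    dst = tl.load(dst_ptr + offs, mask=mask, other=0)
    msg = tl.load(x_ptr + src * feat_dim + feat, mask=mask, other=0.0)  # "message" = x[src]
    tl.atomic_add(out_ptr + dst * feat_dim + feat, msg, mask=mask)      # "aggregate" = scatter-add

def fused_sum_aggregation(x: torch.Tensor, edge_index: torch.Tensor) -> torch.Tensor:
    # x: [num_nodes, feat_dim], edge_index: [2, num_edges] in COO format, as in PyG.
    x = x.contiguous()
    src, dst = edge_index[0].contiguous(), edge_index[1].contiguous()
    out = torch.zeros_like(x)
    num_edges, feat_dim = src.numel(), x.shape[1]
    grid = (triton.cdiv(num_edges, 128), feat_dim)
    fused_msg_agg_kernel[grid](x, src, dst, out, num_edges, feat_dim, BLOCK=128)
    return out
```

A production kernel would tile over features and avoid one atomic per element, but even this naive version skips materializing per-edge messages in global memory.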

Weekend reading (UC San Diego was on fire this week):

GeoT: Tensor Centric Library for Graph Neural Network via Efficient Segment Reduction on GPU

On the Theoretical Expressive Power and the Design Space of Higher-Order Graph Transformers by Cai Zhou, Rose Yu, Yusu Wang - the authors prove that k-order graph transformers are not more expressive than k-WL unless positional encodings are supplied. The results nicely extend our work Attending to Graph Transformers (recently accepted to TMLR)

DE-HNN: An effective neural model for Circuit Netlist representation by Zhishang Luo feat. Yusu Wang - properly representing analog and digital circuits is a big pain in the chip design community. This work demonstrates the benefits of using directed hypergraphs for netlists and proposes a new big dataset for experiments.
​​Deep learning for dynamic graphs: models and benchmarks

Guest post by Alessio Gravina

Published in IEEE Transactions on Neural Networks and Learning Systems
📜 arxiv preprint: link
🛠️ code: GitHub

Recent progress in research on Deep Graph Networks (DGNs) has led to a maturation of the domain of learning on graphs. Despite the growth of this research field, there are still important challenges that remain unsolved. Specifically, there is an urgent need to make DGNs suitable for predictive tasks on real-world systems of interconnected entities that evolve over time.

In light of this, in this paper we first provide a survey of recent representation learning techniques for dynamic graphs under a uniform formalism consolidated from the existing literature. Second, we provide the research community with a fair performance comparison among the most popular methods across the three families of dynamic graph problems, leveraging a reproducible experimental environment.

We believe that this work will help foster research on dynamic graphs by providing a clear picture of the current development status and a good baseline to test new architectures and approaches.
GraphML News (April 13th) - MoML’24, ICML workshops, ICLR blogposts

🏆 Big news: Avi Wigderson received the 2023 Turing Award (announced this April) for his contributions to the theory of randomness in computation along with other works in complexity theory, cryptography, and graph theory. Particularly in graph theory, Avi is well-known for studying expander graphs, which recently became quite popular in Graph ML, eg, with Expander Graph Propagation and Exphormer as a sparse attention mechanism in graph transformers. Read more about Avi in this Quanta article.

🧬 Valence Labs and Mila announced the Molecular ML Conference 2024 (MoML) (June 19th) as the key part of the larger 2-week program on structural biology and geometric DL including the Drug Discovery Summer School (June 12-18) and Hackathon (June 20-21). All events will take place in Montreal (and June is the best time to be in Montreal). MoML will feature talks by Dominique Beaini (Valence), Jian Tang (Mila), Christine Allen (U of Toronto), and Max Jaderberg (Isomorphic Labs). The summer school will feature talks by Michael Bronstein, Mario Geiger, Yoshua Bengio, Connor Coley, Charlotte Bunne, and other prominent researchers. A perfect event for ML folks to learn bio, and for biologists to learn SOTA ML methods.

🎤 ICML’24 published the list of accepted workshops; you might be interested in:

- Geometry-grounded Representation Learning and Generative Modeling (GRaM)
- Structured Probabilistic Inference and Generative Modeling
- AI for Science: Scaling in AI for Scientific Discovery
- ML for Life and Material Science: From Theory to Industry Applications

Besides, ICLR published the blog posts accepted to the Blog Post track (a hidden treasure of ICLR) - check out the posts on deriving diffusion models, flow matching, equilibrium models for algorithmic reasoning, and even on computing Hessian-vector products.
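On the last one, the core trick (as I understand it, sketched from memory) is that you never materialize the Hessian - you differentiate the gradient-vector inner product:

```python
import torch

def hvp(f, x, v):
    # Hessian-vector product H_f(x) @ v via double backprop, no explicit Hessian.
    g = torch.autograd.grad(f(x), x, create_graph=True)[0]
    return torch.autograd.grad(torch.dot(g, v), x)[0]

x = torch.randn(5, requires_grad=True)
v = torch.randn(5)
print(hvp(lambda z: (z ** 4).sum(), x, v))   # equals 12 * x**2 * v elementwise here
```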

📚 Weekend reading:

Simplicial Representation Learning with Neural k-Forms (ICLR 2024) by Kelly Maggs, Celia Hacker, Bastian Rieck - an alternative to message passing using neural k-forms and simplicial complexes

Benchmarking ChatGPT on Algorithmic Reasoning by Sean McLeish, Avi Schwarzschild, Tom Goldstein - turns out that ChatGPT with code interpreter can beat many GNNs on the CLRS benchmark when posing questions and data in natural language (who knew that quickselect, unsolvable by GNNs, could be almost perfectly solved by an LLM?). To be fair, the paper generated quite active discussions on Twitter as to the OOD generalization aspect of CLRS and the fact that LLMs saw all those algorithms many times during pre-training.

Empowering Biomedical Discovery with AI Agents by Shanghua Gao feat. Marinka Zitnik - a survey on advances of AI agents in biomedical discovery and open challenges
GraphML News (April 20th) - Near-Linear Min Cut, New blog posts, scaling GNNs

LLaMa 3 dominated the ML media this week but let’s try to see through it to find some graph gems.

✂️ Google Research published a new blog post on the recently proposed near-linear min-cut algorithm for weighted graphs. Existing near-linear algorithms are either randomized or work on rather simple graphs. In contrast, the proposed algorithm is deterministic and supports weighted graphs. The key points of the devised approach:
(1) the observation that cuts likely won’t change if we sparsify the graph a bit;
(2) min-cuts must have low graph conductance, hence partitioning algorithms (producing well-connected clusters) might be approximately consistent with min-cuts;
(3) the theory is actually applicable to weighted graphs.
The work received the best paper at SODA’24 👏
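For a refresher on the conductance notion in point (2), here is a quick sketch with networkx (which also ships a built-in nx.conductance):

```python
import networkx as nx

def conductance(G, S):
    # Conductance of the cut (S, V \ S): cut weight over the smaller side's volume.
    S, T = set(S), set(G) - set(S)
    cut = nx.cut_size(G, S, T, weight="weight")
    vol = lambda U: sum(d for _, d in G.degree(U, weight="weight"))
    return cut / min(vol(S), vol(T))

G = nx.barbell_graph(5, 1)              # two cliques joined by a short path
print(conductance(G, set(range(5))))    # the natural min cut has low conductance
```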

🌊 Tor Fjelde, Emile Mathieu and Vincent Dutordoir released an insightful introduction to flow matching starting from the basics of conditional normalizing flows up to the most recent stochastic interpolants and mini-batch optimal transport coupling. We are a little late to the party (the post dates to January) but it’s never too late to catch up with generative modeling and flow matching.
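If you want the one-screen version of what the post builds up to, here is the basic conditional flow matching objective with linear interpolation paths (my sketch; `velocity_net(x_t, t)` is a placeholder for your model):

```python
import torch

def cfm_loss(velocity_net, x1):
    # x1: a batch of data samples [B, D]; x0: Gaussian noise; x_t interpolates linearly.
    x0 = torch.randn_like(x1)
    t = torch.rand(x1.shape[0], 1)          # per-sample time in [0, 1]
    xt = (1 - t) * x0 + t * x1
    target = x1 - x0                        # velocity of the straight-line path
    return ((velocity_net(xt, t) - target) ** 2).mean()
```

Mini-batch OT coupling replaces the independent (x0, x1) pairing with an optimal-transport matching within the batch; stochastic interpolants generalize the path itself.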

On the Scalability of GNNs for Molecular Graphs by Maciej Sypetkowski, Frederik Wenkel, and Valence / Mila folks - one of the first in-depth studies of scaling GNNs and Transformers for molecular tasks. In particular, they trained modified versions of MPNN, GPS, and vanilla Transformer models (with structural encodings of course) varying the size from 1M to 1B parameters on the LargeMix dataset of 5M molecules. Scaling does improve pretraining and downstream performance of all models but there is a clear signal that pre-training dataset size is not enough - experiments on the UltraLarge dataset with 83M molecules are likely in the works.

Weekend reading:

HelixFold-Multimer: Elevating Protein Complex Structure Prediction to New Heights by Xiaomin Fang and Baidu - a contender to AlphaFold 2.3 showing strong results on antibody-antigen and nanobody-antigen docking.

Graph Reinforcement Learning for Combinatorial Optimization: A Survey and Unifying Perspective by Victor-Alexandru Darvariu and UCL

VN-EGNN: E(3)-Equivariant Graph Neural Networks with Virtual Nodes Enhance Protein Binding Site Identification by Florian Sestak and ELLIS Linz - virtual nodes encode representations of the whole binding site
GraphML News (April 27th) - 🧬 The Protein Edition: OpenCRISPR, Xaira, ScaleFold

✂️ 🧬 Profluent Bio announced OpenCRISPR - an initiative to share CRISPR-Cas-like proteins generated by protein LMs (a-la ESM-2). Profluent managed to generate rather novel proteins hundreds of mutations away from the known ones, and the new ones work surprisingly well - check out the thread by Ali Madani and a fresh preprint for more details. CRISPR is a genome editing tool that was awarded the 2020 Nobel Prize in Chemistry and was recently approved by the FDA as a therapy for sickle cell disease (with huge potential in other areas as well). Jennifer Doudna, one of the OG authors, gave a keynote at ICML’23 and even attended the graph learning and comp bio workshops!

💸 A new biotech startup Xaira Therapeutics was established with $1B+ funding with David Baker as a co-founder. Investors include ARCH, Sequoia, Two Sigma, and other prominent VC bros. Perhaps we could hypothesize that the scaled up technology stack behind RF Diffusion (both ML and lab) is going to play a key role in Xaira. In related news, Max Welling announced his departure from MSR and co-founding of a new startup on molecular and materials discovery together with Chad Edwards.

📈 ScaleFold: Reducing AlphaFold Initial Training Time to 10 Hours - you only need 2080 H100s to train AlphaFold in 7 hours (that’s roughly $130M given the $500k price tag for a DGX with 8 H100 GPUs). Gross extrapolation suggests that GPU-rich places like Meta could train a few AlphaFolds in less than an hour at the same time. Next milestone: train an AlphaFold-like model during a coffee break 👀.

📉 Artificial Intelligence Driving Materials Discovery? Perspective on the Article: Scaling Deep Learning for Materials Discovery - a critical look at the recently published GNoME database of discovered crystalline structures. The two main points are (1) many of those structures contain radioactive elements, making them impractical for real-world use; (2) many of those structures are isomorphic to well-known structures in crystallographic terms, eg, replacing one element with another from a similar group, which induces pretty much the same crystal structure.

Weekend reading:

The GeometricKernels library by Viacheslav Borovitskiy et al that implements kernels for Riemannian manifolds, graphs, and meshes with TF, PT, and Jax bindings.

Learning with 3D rotations, a hitchhiker's guide to SO(3) by A. René Geist et al - a great introductory paper and resource for studying geometric rotations, a perfect companion to the Hitchhiker’s guide to Geometric GNNs

From Local to Global: A Graph RAG Approach to Query-Focused Summarization by Darren Edge and MSR - we mentioned GraphRAG a few times and here is the full preprint.

STaRK: Benchmarking LLM Retrieval on Textual and Relational Knowledge Bases by Shirley Wu, Shiyu Zhao et al feat. Jure Leskovec - a new benchmark for question answering on texts and structured sources
GraphML News (May 3rd) - The ICLR Week, new blogs

🎉 ICLR’24 starts in Vienna next Tuesday (May 7th)! There will be a ton of graph learning papers, geometric DL workshops, and, more importantly, the authors and folks who constitute the community. Michael and Chaitanya will be there, feel free to reach out to chat!

A few new blogposts:

- The TeraHAC algorithm by Google (to be presented at SIGMOD’24) for approximate clustering graphs with trillions of edges in quasi-linear time.
- Adventures of Pop – the undruggable protein by Dom Beaini (Valence Labs) - a spectacular ELI5 read about drug discovery where a celebrity protein Pop (the cause of a bad disease) has to eat a banana 🍌 (the ligand with a potential drug that would inhibit the protein). With this yummy vocabulary at hand, the post explains several key concepts like protein-ligand binding, free energy, molecular dynamics, DMPK optimization, and more.

Weekend reading:

Uncertainty for Active Learning on Graphs by Dominik Fuchsgruber, Tom Wollschläger et al feat. Stephan Günnemann (all TU Munich)

Parameter-Efficient Tuning Large Language Models for Graph Representation Learning by Qi Zhu and AWS team feat. George Karypis - on using GNNs for producing soft prompts to be sent to LLMs

4DBInfer: A 4D Benchmarking Toolbox for Graph-Centric Predictive Modeling on Relational DBs by Minjie Wang, Quan Gan feat. Muhan Zhang - a new benchmark for graph learning on relational DBs similar to a recent RelBench, but including more tasks like link prediction. Some GNNs seem to outperform XGBoost (Kaggle GMs are anxious and frowning)
GraphML News (May 11th) - AlphaFold 3

🧬 Google DeepMind and Isomorphic Labs announced AlphaFold 3 going beyond proteins and extending structure prediction capabilities to RNA, DNA, and small molecules (ligands). AF3 employs Pairformer (improved Evoformer) as an encoder and a diffusion model for generating 3D coordinates. Yes, AF3 demonstrates huge gains in structural biology tasks compared to previous models, but perhaps the hottest take from the Nature preprint is:

> Similarly to some recent work, we find that no invariance or equivariance with respect to global rotations and translation of the molecule are required in the architecture and so we omit them to simplify the machine learning architecture.

🔥 For reference, AF2 used SE(3)-equivariant attention that spun off a great deal of research in equivariance and geometry for structural biology. The new statement took researchers at ICLR by storm: do we need to invest time and effort into complex math and group theory if a vanilla non-equivariant transformer and diffusion model trained on 48 random augmentations can beat other geometric models with baked-in equivariances? AF3 used rather modest compute (compared to LLMs) - 256 A100s for 10 days of pretraining and 10 days of finetuning (overall roughly $420K on Azure) - and it seems to be enough to send a wake-up call to the Geometric DL community.
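For context, "learning equivariance from augmentations" here simply means sampling random rigid transforms of the structure during training, along these lines (my illustration, not the AF3 recipe):

```python
import torch

def random_rigid_augment(coords):
    # coords: [N, 3] atom positions. Sample a (roughly uniform) random rotation
    # via QR of a Gaussian matrix, fix the sign so det = +1, add a random shift.
    Q, _ = torch.linalg.qr(torch.randn(3, 3))
    if torch.det(Q) < 0:
        Q[:, 0] = -Q[:, 0]
    t = torch.randn(3)
    return coords @ Q.T + t

# A non-equivariant transformer/diffusion model just sees many augmented copies
# (48 of them, per the AF3 report) instead of having SE(3) symmetry baked in.
```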

🤔 Does the bitter lesson strike again? Is it easier to learn symmetries from data and augmentations (the classical 2016 paper by Taco Cohen and Max Welling) than to enforce those constraints in the model? Maybe it’s the task (DNA and RNA structure prediction) that does not have explicit symmetries to bake into a model? It is quite likely that equivariant models can achieve a similar result - but with higher compute and inference costs - is it still worth it? The inference argument looks quite plausible: foundation models (be it LLMs or AF) run billions of inference passes, and if you can save 2x inference time by skipping the expensive math and using longer pre-training instead, the total serving costs are reduced as well.

Those will be the main questions in the community on social media and conferences in 2024.

Besides that, researchers can use the AlphaFold Server for custom inference jobs - we welcome comp bio folks into the world (thanks OpenAI and Anthropic) of paid API access and proprietary models 😉 Still, given the pace of the open-source community (at least two ongoing re-implementations 1, 2), a relatively simple model, and modest training compute, it might take <6 months to replicate a model similar to AF3 in performance.
GraphML News (May 18th) - MatterSim, new workshops

🔮 Continuing the success of MatterGen, a diffusion model for material generation, MSR announced MatterSim (blog), an ML force field for atomistic simulations. A single MatterSim model supports a wide range of temperatures (0-5000 K) and pressures (up to 1000 GPa) and thus could be seen as a competitor to the recent MACE MP-0 - in fact, the authors compare against MACE MP-0 and observe significant improvements in certain tasks. Practically, MatterSim ships with M3GNet or Graphormer backbones (equivariance lives!) so you can select one depending on the available compute. MatterSim could be especially useful in active learning scenarios as a quick proxy when filtering generated candidates.

👷 A few upcoming summer schools and workshops:

- Machine Learning for Chemistry 2024 CZS Summer School (Sept 9-13th in Karlsruhe) with invited speakers from Google, MSR, Mila, TU Munich, KIT, and EPFL. Early bird registration lasts until June 13th.
- 21st Machine Learning on Graphs (MLG) workshop (Sept 9th or 13th, co-located with ECML PKDD 2024 in Vilnius) accepts submissions until June 15th. Invited speakers include Yllka Velaj (Uni Vienna) and Haggai Maron (NVIDIA & Technion).

Weekend reading:

Improving Subgraph-GNNs via Edge-Level Ego-Network Encodings by Nurudin Alvarez-Gonzalez et al

AdsorbDiff: Adsorbate Placement via Conditional Denoising Diffusion by Adeesh Kolluru and John R. Kitchin - perhaps the first diffusion model for this task (uses Equiformer V2 and GemNet OC)

MiniMol: A Parameter-Efficient Foundation Model for Molecular Learning by Kerstin Kläser, Błazej Banaszewski, and Valence labs - a 10M-parameter model encompassing most tasks on 2D molecules (where you have SMILES and graphs)
GraphML News (May 25th) - Aurora, primer on MD, PoET for proteins

The main NeurIPS deadline has finally passed - congrats to those who made it to submission, you deserve some decompression time! (and reviewers, behold: 20k submissions are coming). We can probably expect a flurry of preprints on arXiv in the coming weeks - we’ll keep you posted about the most interesting ones.

🌍 MSR AI 4 Science presented Aurora - a foundation model of the atmosphere that works for weather forecasting, air pollution, and predicting rare weather events. Aurora improves over the recent GraphCast and does so with plain vanilla Perceivers and ViTs, no equivariance involved 🥲

⚛️ Abishaike Mahajan prepared a great primer on molecular dynamics for complete beginners, gradually introducing the most important concepts (with illustrations) from force fields to equilibration to computational simulation methods. Finally, the article touches upon some successful use-cases of MD in industry. A highly recommended read to grasp the basics.

✍️ Meanwhile, folks returning from ICLR share some reflections on their fields - for instance, Patrick Schwab (GSK) on the papers for ML for Drug Discovery, and Lindsay Edwards (Relation) on why AI for DD is difficult.

🧬 OpenProtein released PoET (the protein evolution transformer) - a protein LM that significantly outperforms ESM-2 in zero-shot prediction on ProteinGym while being much smaller. The authors project that a 200M PoET model can be equivalent to a 500B ESM model (by extrapolating scaling laws a bit). The checkpoint and inference code are publicly available.

Weekend reading:

Deep Learning for Protein-Ligand Docking: Are We There Yet? by Alex Morehead et al. - introduces the PoseBench benchmark for docking and evaluates a handful of modern baselines (DiffDock-L leads in most cases)

Explaining Graph Neural Networks via Structure-aware Interaction Index (ICML’24) by Ngoc Bui et al. feat Rex Ying: Myerson-Taylor instead of Shapley methods

Fisher Flow Matching for Generative Modeling over Discrete Data by Oscar Davis feat. Michael Bronstein and Joey Bose - flow matching for discrete data, already outperforms a recent discrete FM model DirichletFM
GraphML News (June 1st) - GNNs for Automotive Vision, NeurIPS submissions

A fresh example of applying GNNs to real-world problems is provided in the Nature paper Low-latency automotive vision with event cameras by Daniel Gehrig and Davide Scaramuzza from Uni Zurich. There, GNNs help to parse temporal events (like the appearance of a pedestrian on the road) and save a lot of compute by updating only the local neighborhood of changed patches. The model (with an efficient CUDA implementation) works in real time in cars! Code and a video demo are available.
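The compute savings come from the locality of message passing: an event can only change the outputs of an L-layer GNN within L hops of the affected node, so only that ball needs recomputation. A toy sketch of the bookkeeping (not the authors' CUDA implementation):

```python
def affected_nodes(adj_list, changed, num_layers):
    # After an event updates node `changed`, an L-layer GNN's outputs can only
    # differ inside the L-hop ball around it - recompute messages just there.
    frontier, affected = {changed}, {changed}
    for _ in range(num_layers):
        frontier = {v for u in frontier for v in adj_list[u]} - affected
        affected |= frontier
    return affected

adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
print(affected_nodes(adj, changed=1, num_layers=1))   # {0, 1, 2}
```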

The week brought a handful of cool new papers formatted with the NeurIPS template (what could that mean 🤔) - let’s see:

🧬 Genie 2 by AlQuraishi lab - better protein diffusion model now supporting multi-motif scaffolding, outperforms RFDiffusion, FrameFlow, and Chroma, code is available.

🦄 LoGAH: Predicting 774-Million-Parameter Transformers using Graph HyperNetworks with 1/100 Parameters by Xinyu Zhou feat. Boris Knyazev - the next iteration of the Graph Hypernetwork (GHN-3) that directly predicts parameters of neural networks, now with an efficient module for transformer-sized matrices. The model can predict weights of GPT-2 and ViT-sized networks! Code

🍭 Understanding Transformer Reasoning Capabilities via Graph Algorithms by Clayton Sanford and the Google team feat. Anton Tsitsulin and Bryan Perozzi - a theoretical study of transformers and their ability to solve graph problems. The study reveals that, eg, depth has to scale as O(log(V+E)) with the graph size for parallelizable problems, with additional width scaling for search problems. Besides, there is a comparison between GNNs and Transformers (trained from scratch and fine-tuned T5) on the GraphQA benchmark. Prompting LLMs doesn’t really work.

🤓 Two papers on flow matching from Michael Bronstein’s lab: Fisher Flow Matching for Generative Modeling over Discrete Data by Oscar Davis et al - the best discrete FM model so far, and Metric Flow Matching for Smooth Interpolations on the Data Manifold by Kacper Kapusniak et al - improvement of the OT-CFM (conditional flow matching with optimal transport).

We’ll be posting more new cool papers in the coming days!
​​GraphAny: A Foundation Model for Node Classification on Any Graph

by Jianan Zhao, Hesham Mostafa, Michael Galkin, Michael Bronstein, Zhaocheng Zhu, Jian Tang

🚀 We have just released a new work!

Pre-trained on one graph (Wisconsin with 120 labeled nodes), GraphAny generalizes to any unseen graph with arbitrary feature and label spaces - 30 new graphs - with an average accuracy of 67.26% in the inductive setting, surpassing GCN and GAT trained individually in the supervised regime.

GraphAny performs inference on a new graph via analytical solutions of LinearGNNs, which enables inductive (training-free) inference on arbitrary feature and label spaces. The model learns inductive attention scores for each node to fuse the predictions of multiple LinearGNNs. It adaptively predicts the most important LinearGNN channels by transforming the distance features between LinearGNN predictions; eg, high-pass filters are preferred on heterophilic graphs.
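A rough sketch of a single LinearGNN channel (my simplification, not the released code): propagate features a few hops and fit a closed-form ridge regression on the labeled nodes, so "inference" on a new graph needs no gradient training.

```python
import torch

def linear_gnn_predict(adj_norm, X, Y, train_mask, hops=2, lam=1e-2):
    # adj_norm: [N, N] normalized adjacency, X: [N, d] features, Y: [N, C] one-hot labels.
    F = X
    for _ in range(hops):            # eg, a low-pass channel; other channels would use
        F = adj_norm @ F             # the identity or high-pass filters instead
    F_tr, Y_tr = F[train_mask], Y[train_mask]
    d = F_tr.shape[1]
    W = torch.linalg.solve(F_tr.T @ F_tr + lam * torch.eye(d), F_tr.T @ Y_tr)
    return F @ W                     # predictions for every node, labeled or not
```

GraphAny then fuses the predictions of several such channels with the learned attention described above.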

Unlike LLM-based models that can’t scale to large graphs, GraphAny can be efficiently trained on 1 graph and evaluated on 30 others (3M nodes & 244M edges in total) in just 10 mins. It works great on any 16GB GPU or even a CPU.

Finally, you can train a model on Cora and run inductive node classification on Citeseer, Pubmed, and actually any graph!

Paper, Code
Internship/Visiting period at NEC Labs Europe, Heidelberg

Guest post by Federico Errica

Who: Federico Errica is hiring a PhD student or Postdoc for a 6-month collaboration in the form of an internship or a visiting research period.

What: the collaboration will be focused on improving and designing message passing methods that address long-range propagation issues, with application to computational science problems.

How to apply: through the official website or LinkedIn
GraphML News (June 8th) - LOG’24, FoldFlow 2, more new papers

🎙️ The biggest announcement of the week: LOG’24 will once again be virtual before going physical at UCLA in 2025. The dates are Nov 26-29th 2024, and the submission deadline is September 11th. LOG is known for much higher review quality - a considerable part of the budget is dedicated to monetary rewards for reviewers (one of the few venues that actually appreciates good reviews).

🧬 The Dreamfold team announced FoldFlow 2 - an improved version of the protein structure generative model that made Riemannian flow matching a mainstream topic. FoldFlow 2 adds an ESM2 encoder for protein sequences and is trained on a much bigger dataset (featuring filtered synthetic structures from SwissProt and AlphaFold 2 DB). Experimentally, FoldFlow 2 substantially improves over previous SOTA big guys, RFDiffusion and Chroma, on unconditional and conditional (motif scaffolding) generation tasks.

Besides, it’s never too late to remind that Federico Errica is hiring interns and visiting researchers at NEC Labs in Heidelberg.

📚 The weeks after the NeurIPS deadline continue to bring cool submissions and accepted ICML papers!

- Topological GNNs went equivariant all the way:

Topological Neural Networks go Persistent, Equivariant, and Continuous (ICML’24) by Yogesh Verma et al
E(n) Equivariant Topological Neural Networks by Claudio Battiloro et al
E(n) Equivariant Message Passing Cellular Networks by Veljko Kovač et al feat Erik Bekkers

- Theory on graph transformers and spectral GNNs (all will be at ICML’24)

What Improves the Generalization of Graph Transformers? A Theoretical Dive into the Self-attention and Positional Encoding by Hongkang Li et al
Aligning Transformers with Weisfeiler–Leman by Luis Müller and Chris Morris
On the Expressive Power of Spectral Invariant Graph Neural Networks by Bohang Zhang et al feat. Haggai Maron

- Transformers through the graph lens (both featuring Petar Veličković)

Transformers need glasses! Information over-squashing in language tasks by Federico Barbero et al - the old friend over-squashing is confirmed to be present in transformers
The CLRS-Text Algorithmic Reasoning Language Benchmark by Markeeva, McLeish, Ibarz et al - the text version of CLRS for all you LLM folks, a fresh unsaturated benchmark

- Combinatorial optimization with GNNs

Towards a General GNN Framework for Combinatorial Optimization by Frederik Wenkel, Semih Cantürk, et al
A Diffusion Model Framework for Unsupervised Neural Combinatorial Optimization by Sebastian Sanokowski et al
GraphML News (June 15th) - ICML’24 graph papers, musings on AF3, more Flow Matching

🎉 ICML 2024 papers (including orals and spotlights) are now visible on OpenReview (however, without search). If you don’t want to scroll through 100 pages of accepted papers manually or write a custom parser, Azmine Toushik Wasi compiled a collection of accepted Graph ML papers with a nice categorization.

👨‍🔬 More blogs on AlphaFold 3 and reflections about the future of TechBio: Charlie Harris focuses more on the technical side whereas Carlos Outeiral presents the CompBio perspective, highlighting some cases where AF3 still underperforms.

🔀 Flow Matching continues to reach new heights with recently released papers: Variational Flow Matching (you didn’t forget ELBO and KL divergence, right?) by the UvA team of Floor Eijkelboom, Grigory Bartosh, et al (feat. Max Welling) derives a generalized flow matching formulation that naturally allows for categorical data (😼 CatFlow) and graph generation - the model outperforms DiGress and other diffusion baselines. At the same time, the NYU team of Boffi et al propose Flow Map Matching - pretty much the Consistency Models of flow matching, enabling generation in one step instead of 20-100. Finally, Ross Irwin et al from AstraZeneca come up with MolFlow - flow matching for generating 3D conformations of molecules, showing compelling results on QM9 and GEOM-Drugs.

📚 Weekend reading (no flow matching):

GraphStorm: all-in-one graph machine learning framework for industry applications by Da Zheng and AWS - we wrote about a new GNN framework for enterprises back in 2023, here is the full paper with details.

CRAG -- Comprehensive RAG Benchmark from Meta (and a Kaggle competition for $30k) - a factual QA benchmark that simulates queries to knowledge graphs and APIs. Vanilla RAG yields only 44% accuracy and fancy industrial models barely reach 63%, so there’s plenty of room for improvement.

Explainable Graph Neural Networks Under Fire - by Zhong Li feat Stephan Günnemann. Turns out most GNN explainers utterly fail and cannot be trusted in the presence of simple adversarial perturbations. Let us know if you ever found a successful working case for GNN explainers 🤭
GraphML News (June 22nd) - $30M seed for CuspAI, Graph Foundation Models, MoML 2024

💸 A new startup CuspAI by Max Welling and Chad Edwards, focusing on materials discovery and design for clean energy and sustainability, raised a $30M seed round (led by Hoxton, Basis Set, and Lightspeed). The support from the godfathers is significant - Geoff Hinton is a board advisor and Yann LeCun commented on the collaboration with the FAIR and OpenCatalyst teams on OpenDAC. The materials design area is getting hotter - not as hot as drug discovery and protein design though - but is steadily growing: in addition to Radical AI, Orbital Materials, and now CuspAI, a fresh Entalpic founded by ex-Mila folks raised $5M+.

🔖 Together with Michael Bronstein, we released a new blog post on Graph Foundation Models. First, we define what GFMs are and outline the key design challenges, covering heterogeneous model expressivity, scaling laws, and data scarcity. Then, we describe several successful examples of recent generalist models that can be considered GFMs in a particular area, eg, GraphAny for node classification, ULTRA for KG reasoning, and MACE MP-0 as a universal potential. We made sure to include all the recent references, including position papers to appear at ICML’24!

🧬 The Molecular ML 2024 conference took place in Montreal this week (concluding the ML for Drug Discovery summer school) and featured talks on drug discovery and drug design. The recording is already available - check out talks by Jian Tang (BioGeometry) on geometric DL for proteins and by Max Jaderberg (Chief AI Officer at Isomorphic Labs) on AlphaFold 3. Might be one of the first public talks on AF3!

Weekend reading:

More benchmarks (brought to you by the NeurIPS Datasets & Benchmarks track deadline).

Temporal Graph Benchmark 2.0 by Gastinger, Huang et al - the first large-scale benchmark for temporal KGs and heterogeneous graphs

Text-space Graph Foundation Models by Chen et al feat. Anton Tsitsulin and Bryan Perozzi - a collection of text-attributed graphs for node classification, link prediction, and graph-level tasks

Towards Neural Scaling Laws for Foundation Models on Temporal Graphs by Shirzadkhani, Ngo, Shamsi et al - perhaps the first evidence that one temporal GNN can generalize to different temporal graphs (here those are token transactions in Ethereum)

RNA-FrameFlow: Flow Matching for de novo 3D RNA Backbone Design by Rishabh Anand, our own Chaitanya K. Joshi, et al - equivariant flow matching for generating 3D RNA structures.