Graph Machine Learning
Everything about graph theory, computer science, machine learning, etc.


If you have something worth sharing with the community, reach out to @gimmeblues or @chaitjo.

Admins: Sergey Ivanov; Michael Galkin; Chaitanya K. Joshi
This year's ICML will finally have a tutorial on graphs! Adrian Arnaiz-Rodriguez and Ameya Velingker will present a tutorial on Graph Learning: Principles, Challenges, and Open Directions.
๐Ÿ—“๏ธ Date: Monday, July 22
๐Ÿ•’ Time: 15:30 CEST - 17:30 CEST
๐Ÿ“ ICML In-person Event: Hall A8, ICML Venue
๐Ÿ“ Virtual attendance: https://icml.cc/virtual/2024/tutorial/35233

What to expect?
- Intro to Graph Learning and GNNs: traditional graph representations, Graph Neural Networks (GNNs), Message Passing Neural Networks (MPNNs), Graph Transformers (GTs), and spectral quantities.
- Expressiveness and Generalizability: GNN expressivity linked with the WL test, generalizability of MPNNs, and their performance implications.
- Challenges in GNNs: Understanding and addressing under-reaching, over-smoothing, over-squashing, and graph rewiring techniques.
- Panel Discussion on Future Directions: a panel with Michael Bronstein, Bryan Perozzi, Christopher Morris, and more panelists TBC. We will discuss GNN limitations, graph foundation models, and integrating GNNs with large language models (LLMs).

This tutorial balances introductory content and advanced insights, aimed at both general audiences and experts. Don't miss this opportunity to deepen your understanding of GNNs!
GraphML News (July 13th) - Recursion goes brrr, Acquisition of Graphcore, Illustrated AF3

💸 Recursion and NVIDIA launched BioHive-2, a GPU cluster of 504 H100s, which is roughly an exaflop of FP16 / BF16 compute and perhaps sub-$50M in cost. Some napkin math indicates it could train and fine-tune a full AlphaFold 3-like model in about 4 days. Except for ESM-3, we haven't yet seen drug discovery models trained on such compute - congrats to Recursion, Valence, and the researchers and engineers who can now really go brrr.
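
Some hedged napkin math behind that figure, assuming NVIDIA's published ~1 PFLOP/s dense FP16/BF16 per H100 SXM (roughly 2 PFLOP/s with 2:4 sparsity):

```python
# Back-of-the-envelope compute for BioHive-2.
# Assumptions: 504 H100 SXM GPUs, ~0.99 PFLOP/s dense FP16/BF16 each,
# roughly doubled with 2:4 structured sparsity.
n_gpus = 504
pflops_per_gpu_dense = 0.99

cluster_pflops = n_gpus * pflops_per_gpu_dense
print(f"Dense FP16/BF16: ~{cluster_pflops:.0f} PFLOP/s (~{cluster_pflops / 1000:.1f} EFLOP/s)")
print(f"With 2:4 sparsity: ~{2 * cluster_pflops / 1000:.1f} EFLOP/s")
```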

💸 Graphcore, a UK hardware startup offering its own hardware platform (BOW IPUs), was acquired by SoftBank for a rumored $500M (its 2020 valuation was about $2.8B). Former employees likely lost their vested options ($500M is still less than the $600M originally invested in the company), but let's hope the future will be more stable for Graphcore and that we will see more successful products.

🧬 The Illustrated AlphaFold by Elana Simon and Jake Silberg from Stanford (inspired by the Illustrated Transformer) visually explains the main building blocks of the model - from the input data to the PairFormer and triangular attention, to the diffusion module and the training losses. Things get much simpler indeed when you know which shapes are involved at each particular step.

Weekend reading:

Link Prediction with Untrained Message Passing Layers by Lisi Qarkaxhija, Anatol E. Wegner, and Ingo Scholtes - the unreasonable effectiveness of untrained MPNNs strikes back

SE(3)-Hyena Operator for Scalable Equivariant Learning by Artem Moskalev et al - FFT with Clifford MLPs enables equivariant Hyena on long sequences of up to 3.5M tokens on a single GPU

On the Expressive Power of Sparse Geometric MPNNs by Yonatan Sverdlov, Nadav Dym - enabling equivariant GNNs on sparse graphs (usually EGNNs work on fully-connected graphs)
GraphML News (July 20th) - Pinder and Plinder, LAB bench, ICML 2024

๐ŸŽ™๏ธ ICML 2024 starts next week - enjoy the conference and Vienna if you are participating this year! Beside the main program, Monday will feature the Graph learning tutorial, Thursday and Friday have a handful of graph-related workshops.

🧬 VantAI, together with MIT, NVIDIA, UniBasel, and SIB, introduced two novel large-scale benchmarks: Pinder (Protein INteraction Dataset and Evaluation Resource) and Plinder (Protein-Ligand Interaction Dataset and Evaluation Resource). Pinder includes 500x more data than PPIRef, and Plinder is roughly 10x larger than DockGen - the previously largest datasets in the area, which were susceptible to test-set leakage. Re-training SOTA diffusion models on Pinder and Plinder yields much lower numbers, indicating that saturation is far away (at least for the coming year). Besides, it is great to see an industrial company (from a highly competitive CompBio area) contributing to the field with open datasets. Pinder and Plinder will be the main datasets for the upcoming ML for Structural Bio challenge at NeurIPS 2024, so prepare your GPUs and diffusion models.

🔬 FutureHouse released LAB-Bench for studying LLMs in biology and chemistry. The benchmark includes 8 categories where LLMs have to deal with figures, images, scientific literature, databases, and designing protocols. Recent LLMs and VLMs (GPT-4o, Claude, and Llama 3) all show rather underwhelming results on those tasks - finally a new unsaturated benchmark for the LLM crowd! The authors held out some data to check training contamination of future models (e.g., when the training data for the next generation of such models will include the validation and test splits of the datasets).

Weekend reading:

Beyond Euclid: An Illustrated Guide to Modern Machine Learning with Geometric, Topological, and Algebraic Structures by Sophia Sanborn, Johan Mathe, Mathilde Papillon, et al - a massive survey with amazing illustrations

PINDER: The protein interaction dataset and evaluation resource by Daniel Kovtun, Mehmet Akdel, and VantAI folks feat. Michael Bronstein

PLINDER: The protein-ligand interactions dataset and evaluation resource by Janani Durairaj, Yusuf Adeshina, and VantAI folks

LAB-Bench: Measuring Capabilities of Language Models for Biology Research by Jon M. Laurent, Joseph D. Janizek, et al feat. Andrew White
GraphML News (July 27th) - LLMs in Chemistry, Discrete Flow Matching

ICML kept most of the community busy (Saturday is the last day of workshops), while in other news Llama 3.1, SearchGPT, AlphaProof, and AlphaGeometry 2 took the headlines of the approaching AGI singularity. Anyway, August will likely be a quieter month.

Some fresh works for the weekend reading:

A Review of Large Language Models and Autonomous Agents in Chemistry by Mayk Ramos, Christopher Collison, and Andrew White - a massive survey on what the current generation of LLMs can do in chemistry - from property and synthesis prediction to tool-augmented and multi-modal frontier models for orchestrating automated discovery labs. (Paying respects to the LLM week.)

Discrete Flow Matching by Itai Gat and Meta FAIR, including Ricky Chen and Yaron Lipman - the OG authors of (Riemannian) Flow Matching. Discrete FM is now competitive with Llama 2/3 on coding tasks - so expect this module to appear in all generative models for molecules, proteins, and crystals around ICLR'25 submissions and later.

Generative Modeling of Molecular Dynamics Trajectories by Bowen Jing and Hannes Stärk - MD via stochastic interpolants; supports accurate forward simulation, upsampling, interpolation between two states in the trajectory, and even inpainting of the simulated structure.
Seminar on Graph-based Causal Discovery in Computational Biology

🎓 Topic: "Causal discovery from multivariate information in biological and biomedical data"
👨‍🔬 Who: Hervé Isambert, The Isambert Lab, CNRS, Institut Curie, Paris
⌚ When: Monday, July 29th, 5pm CEST

Abstract: In this webinar, I will present the principles and limitations of graph-based causal discovery methods and their improvement using multivariate information decomposition, recently developed in my lab. Applications will range from gene expression data in single cells to nationwide medical databases of cancer patients. I will then discuss the theoretical link between graph-based causality and temporal (Granger-Schreiber) causality, which can both be expressed in terms of conditional multivariate information. While temporal causality is shown to imply graph-based causality, the converse may not be true (see Figure). An application to time series data concerns the analysis of video images of reconstituted tumor ecosystems, which uncovered a novel antagonistic effect of cell-cell interactions under therapeutically relevant conditions.

The Zoom link will appear in this channel shortly before 5pm
GraphML News (August 3rd) - NeurIPS workshops, MoML @ MIT, RUM and GraM

โ›ท๏ธ NeurIPSโ€™24 announced 56 accepted workshops (brace yourself, Vancouver convention center). In addition to a good bunch of LLM, VLM, and foundation model-focused events, graph and geometric learning folks might be interested in:

- AI for New Drug Modalities
- Machine Learning in Structural Biology
- Symmetry and Geometry in Neural Representations
- Multimodal Algorithmic Reasoning
- Machine Learning and the Physical Sciences
- AI for Accelerated Materials Design

🧬 The second part of MoML 2024 (Molecular ML) will be happening at MIT on November 5; you can submit short papers until October 10th. The authors of accepted papers get free admission!

💎 The GraM workshop at ICML'24 published the accepted blog posts, with some hidden gems like a JAX implementation of EGNN, an intro to equivariant neural fields, and a study of how consistency models don't work for 3D molecule generation. Check out the others as well - most of them require only entry-level background.

📈 Non-convolutional Graph Neural Networks by Yuanqing Wang and Kyunghyun Cho (the OG of GRUs) introduces RUM (random walk with unified memory) nets, free of convolutions. Practically, the RUM recipe is: sample random walks together with their anonymized node-ID sequences (tracking the first occurrence of each node ID in the walk), encode both sequences via RNNs (sure, you can drop in your favorite Mamba here), and concatenate both vectors with an MLP on top (a minimal sketch below). The authors show RUMs are more expressive than 1-WL GNNs while not suffering from over-smoothing and over-squashing (and beat the baselines on a bunch of benchmarks). Interestingly, RUMs look like DeepWalk on steroids with several improvements. Is Bryan Perozzi the Noam Shazeer of graph learning? 🤔
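
A minimal sketch of that recipe (hypothetical shapes and names, not the authors' code): sample a walk, anonymize node IDs by first occurrence, encode the feature walk and the ID walk with two GRUs, concatenate, and classify with an MLP.

```python
import random
import torch
import torch.nn as nn

def random_walk(adj: dict, start: int, length: int) -> list:
    """Uniform random walk over an adjacency-list graph."""
    walk = [start]
    for _ in range(length - 1):
        walk.append(random.choice(adj[walk[-1]]))
    return walk

def anonymize(walk: list) -> list:
    """Replace node IDs by the index of their first occurrence in the walk."""
    first_seen, anon = {}, []
    for v in walk:
        first_seen.setdefault(v, len(first_seen))
        anon.append(first_seen[v])
    return anon

class TinyRUM(nn.Module):
    """Encode the feature walk and the anonymized ID walk with two GRUs,
    concatenate the final states, and apply an MLP (sketch of the RUM idea)."""
    def __init__(self, feat_dim, hidden, max_walk_len, num_classes):
        super().__init__()
        self.id_emb = nn.Embedding(max_walk_len, hidden)
        self.feat_rnn = nn.GRU(feat_dim, hidden, batch_first=True)
        self.id_rnn = nn.GRU(hidden, hidden, batch_first=True)
        self.mlp = nn.Sequential(nn.Linear(2 * hidden, hidden), nn.ReLU(),
                                 nn.Linear(hidden, num_classes))

    def forward(self, walk_feats, anon_ids):
        # walk_feats: (batch, walk_len, feat_dim); anon_ids: (batch, walk_len)
        _, h_feat = self.feat_rnn(walk_feats)
        _, h_id = self.id_rnn(self.id_emb(anon_ids))
        return self.mlp(torch.cat([h_feat[-1], h_id[-1]], dim=-1))

# Toy usage: a 4-cycle graph, one walk of length 6, random node features.
adj = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
x = torch.randn(4, 8)                                  # node features
walk = random_walk(adj, start=0, length=6)
anon = anonymize(walk)
model = TinyRUM(feat_dim=8, hidden=16, max_walk_len=6, num_classes=3)
logits = model(x[walk].unsqueeze(0), torch.tensor([anon]))
```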

More weekend reading:

Spatio-Spectral Graph Neural Networks by Simon Geisler et al feat. Stephan Günnemann - spectral GNNs can be strong performers, too - just to contrast with RUMs

Learning production functions for supply chains with graph neural networks by Serina Chang et al feat. Jure Leskovec - a cool work that frames supply chains as temporal graphs, shows significant gains in prediction accuracy, and releases a data simulator

What Are Good Positional Encodings for Directed Graphs? by Yinan Huang, Haoyu Wang, and Pan Li. The answer is the Magnetic Laplacian with multiple potential factors (multi-q) - your best choice for DAGs.
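
For the curious, a small NumPy sketch of the (unnormalized) magnetic Laplacian for a single potential q - the construction whose eigenvectors give direction-aware positional encodings; multi-q simply stacks several values of q. This is the generic textbook construction, not the authors' code.

```python
import numpy as np

def magnetic_laplacian(A: np.ndarray, q: float) -> np.ndarray:
    """Unnormalized magnetic Laplacian L_q = D_s - A_s * exp(i * 2*pi*q * (A - A^T)),
    where A_s is the symmetrized adjacency and D_s its degree matrix."""
    A_sym = (A + A.T) / 2                     # symmetrized edge weights
    theta = 2 * np.pi * q * (A - A.T)         # antisymmetric phase encoding direction
    H = A_sym * np.exp(1j * theta)            # Hermitian "magnetic" adjacency
    D = np.diag(A_sym.sum(axis=1))
    return D - H

# Toy directed 3-cycle: the eigenvectors of L_q are complex and encode edge direction.
A = np.array([[0, 1, 0], [0, 0, 1], [1, 0, 0]], dtype=float)
L = magnetic_laplacian(A, q=0.25)
eigvals, eigvecs = np.linalg.eigh(L)          # L_q is Hermitian, so eigh applies
```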
GraphML News (August 10th) - Summer School recordings, DD merger

๐Ÿ–ฅ๏ธ Recordings from the ML for Drug Discovery Summer School are now available covering 5 days of talks with 28 videos - from basics of GNNs for chemistry and equivariance to protein folding, ML potentials, simulations, protein-protein (-ligand) binding, to generative modeling and causal discovery.

๐Ÿ–ฅ๏ธ The Eastern European ML Summer Schoolโ€™24 also published their recordings - 25 videos covering a more general area of deep learning including LLMs, reasoning, VLMs, RL, generative models, Bayesian DL, and many more. Notebooks from the practical sessions are available on GitHub.

Both schools feature the most up-to-date material from top experts in the field - quite the gems to watch during the summer break 💎.

โš›๏ธ Continuing with the quality content, Sophia Tang published a massive, 2.5h-read guide to spherical equivariant graph transformers deriving them from the first principles and spherical harmonics to TensorField nets to the SE(3)-Transformer. Lots of illustrations with the code going along. The best tutorial so far.

💸 News from the Geometric Wall Street Journal: a huge merger between Recursion and Exscientia (focusing on precision oncology) - actually, Recursion bought Exscientia for $688M in stock, continuing its acquisition spree (besides the BioHive-2 cluster with its 504 H100s). (Not stonks advice.)

Weekend reading:

The Heterophilic Graph Learning Handbook: Benchmarks, Models, Theoretical Analysis, Applications and Challenges by Sitao Luan feat. Rex Ying and Stefanie Jegelka - everything you wanted to know about heterophilic graphs in 2024

When Heterophily Meets Heterogeneity: New Graph Benchmarks and Effective Methods by Junhong Lin et al - introduces H2DB, a collection of known and new heterophilic and heterogeneous graphs, much larger than existing datasets.
GraphML News (August 17th) - Spanner Graph, some new papers

🔧 Google announced Spanner Graph - an infinitely scalable graph database (like vanilla Spanner) with all the bells and whistles a GDBMS should have in 2024: support for both Graph Query Language (GQL, finally standardized by ISO in April after 8 years of work) and SQL, vector search and full-text search, and basic graph algorithms at query time.

Otherwise, it's mid-August and vacation time, so probably no major news for the next few weeks.

Weekend reading:

Training Language Models on the Knowledge Graph: Insights on Hallucinations and Their Detectability from a large DeepMind team - it turns out that reducing hallucinations when training LLMs on KGs (i.e., recalling training triples) requires an order of magnitude more compute than Chinchilla scaling laws would suggest. Lots of qualitative results - have a look! Besides, it is one of the accepted papers at COLM - a new conference specifically tailored for LLM research (RIP, ACL/EMNLP).

Topological Blind Spots: Understanding and Extending Topological Deep Learning Through the Lens of Expressivity by Yam Eitan et al. feat. Haggai Maron - one of the first studies of the expressive power of topological (higher-order) MPNNs. It turns out that standard models based on simplicial or cellular complexes cannot distinguish many common topological patterns like a Möbius strip vs a cylinder. The authors then derive provably more powerful, scalable multi-cell networks.

Tokenized and Continuous Embedding Compressions of Protein Sequence and Structure by Amy X. Lu et al feat. Pieter Abbeel and Kyunghyun Cho - a deep dive into the latent space of ESMFold, which happens to be quite sparse: it can be reduced by 128x without losing prediction performance.
GraphML News (August 24th) - Psiformer, ML potentials arena, Single-cell foundation models

โš›๏ธ DeepMind announced the updated version of Psiformer (together with the paper in Science, twitter thread, and source code in Jax) - a transformer for quantum physics tasks. The new model can approximate excited states of molecules on par or better than existing gold standard models. Excited energy states are responsible for lasers, semiconductors, solar panels, fluorescence, and many other phenomena - a huge potential for Psiformer in industrial applications.

๐Ÿ† Continuing with energy states - you probably know that the ultimate LLM benchmark those days is the ELO rating on the Chatbot Arena. Yuan Chiang started a similar effort for ML potential models (MLIP Arena) featuring 3 tasks: two atoms of the same type (the only LB for now) and two molecular dynamics tasks (loading time is slow). The supported models for now are Equiformer V2, CHGNet, MACE MP, M3GNet, SevenNet, and the GPAW DFT baseline from the DFT world.

🎻 Single-cell foundation models are getting more attention. The new scCello by Mila is a transformer trained on the masked LM task together with an alignment loss using the Cell Ontology. scCello in the zero-shot inference regime outperforms end-to-end trained models on tasks like cell type classification, marker gene prediction, and batch integration. If you want to learn more, have a look at the fresh survey on transformers in single-cell (SC) omics.

Weekend reading: more foundation models and materials science:

A foundation model for clinician-centered drug repurposing by Kexin Huang et al feat. Jure Leskovec and Marinka Zitnik - introduces TxGNN, a graph foundation model for drug repurposing trained on a medical KG of 17k diseases and 8k drugs, strong zero-shot performance included. The model and example weights are already on Github.

Microsoft published the source code of Aurora - FM for atmospheric forecasting, consists of Perceiver encoder/decoder and SwinTransformer as the backbone.

Crystalline Material Discovery in the Era of Artificial Intelligence by Zhenzhong Wang et al (thanks to Wanyu Lin for highlighting the work) - a survey on predictive and generative models for crystals, with the github repo of relevant papers

From Text to Insight: Large Language Models for Materials Science Data Extraction, plus an online tutorial book, by Mara Schilling-Wilhelmi, Martiño Ríos-García et al. LLMs are surprisingly strong at generating 3D structures of solid-state materials (ICLR 2024), on par with fancy equivariant diffusion models; this survey studies how much MatSci data LLMs could possibly help extract.
GraphML News (August 31st) - When GNNs help, randomized transformers, and new papers

August is a dry month in terms of news, but soon we'll start to see upcoming ICLR submissions!

🔨 Meanwhile, have a look at the Measuring and Exploiting Network Usable Information blog post by Meng-Chieh Lee (based on the spotlight ICLR 2024 paper) that touches upon a question asked every day in industrial labs - will GNNs outperform MLPs on my data? Are there any hints or data characteristics (well, apart from the homophily ratio) that could indicate which model would be better without training one? The authors introduce the notion of Network Usable Information (NUI) as a function of structural embeddings, node features, and neighbors' features and find some correlations between the new score and performance on node classification and link prediction.

We submitted a position paper to ICML'24 studying a similar question, but it didn't get through because reviewers demanded more experiments (in the position paper track, yeah).

🎰 Learning Randomized Algorithms with Transformers by Google and ETH Zurich - an intriguing blend of theoretical CS, math, and randomized algorithms with the expressiveness of transformers. Experiments show that randomized transformers can solve graph coloring problems on small sizes and explore grid worlds.

More weekend reading:

💊 Graph Artificial Intelligence in Medicine by Ruth Johnson, Michelle Li, feat. Marinka Zitnik - a massive survey on GNNs in clinical applications.

Do Graph Neural Networks Work for High Entropy Alloys? by Zhang et al - the answer is yes, but with proper modeling. High-entropy alloys are disordered at the atomic scale but can be represented as sets of graphs (each graph is a local environment of the alloy). Practically, adding a set pooling function like DeepSet(GNN(set of graphs)) is what we are looking for (a minimal sketch below).
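
A minimal sketch of the DeepSet(GNN(set of graphs)) recipe with a toy mean-aggregation GNN - names and shapes are illustrative, not the paper's code:

```python
import torch
import torch.nn as nn

class TinyGNN(nn.Module):
    """One round of mean-neighbor message passing followed by mean pooling."""
    def __init__(self, dim):
        super().__init__()
        self.lin = nn.Linear(2 * dim, dim)

    def forward(self, x, adj):
        # x: (n_nodes, dim), adj: (n_nodes, n_nodes) dense adjacency of one local environment
        deg = adj.sum(dim=1, keepdim=True).clamp(min=1)
        neigh = adj @ x / deg                       # mean over neighbors
        h = torch.relu(self.lin(torch.cat([x, neigh], dim=-1)))
        return h.mean(dim=0)                        # graph-level embedding

class DeepSetOverGraphs(nn.Module):
    """DeepSet(GNN(set of graphs)): phi = shared GNN, rho = MLP over summed embeddings."""
    def __init__(self, dim, out_dim):
        super().__init__()
        self.gnn = TinyGNN(dim)
        self.rho = nn.Sequential(nn.Linear(dim, dim), nn.ReLU(), nn.Linear(dim, out_dim))

    def forward(self, graphs):
        # graphs: list of (node_features, adjacency) pairs, one per local environment
        embeddings = torch.stack([self.gnn(x, adj) for x, adj in graphs])
        return self.rho(embeddings.sum(dim=0))      # permutation-invariant set pooling

# Toy alloy represented as 3 local-environment graphs with 5 atoms each.
graphs = [(torch.randn(5, 16), torch.ones(5, 5)) for _ in range(3)]
pred = DeepSetOverGraphs(dim=16, out_dim=1)(graphs)
```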

Expressive Power of Temporal Message Passing by Przemysław Wałęga and Michael Rawson - Weisfeiler and Leman Go Temporal! Another fun fact about temporal GNNs: two models named DyG-Mamba (one, two - both add Mamba on top of GNN encoders) were submitted to arXiv just a few days apart.
GraphML News (September 7th) - AF 3 reproductions, AlphaProteo, ORB, Entalpic round

Just the first week of September, and there is already so much news in protein design and materials science!

🧬 Two AlphaFold 3 reproductions are now available: HelixFold 3 from Baidu (tech report) and AF3 from Ligo Bioscience (no tech report yet). Training HelixFold 3 on PDB and custom data yields results roughly similar to the OG AlphaFold 3 on PoseBusters and CASP 15 - good news for science and reproducibility (and for Nature editors, hehe). Getting more data will be the key to a full reproduction - probably no other lab has as large and diverse a dataset as DM and Iso.

Meanwhile, Google DeepMind announced AlphaProteo - a generative model for binders conditioned on the target protein and possible binding sites. The preprint has no information about the generative model itself (an educated guess would be either an autoregressive transformer or discrete diffusion as the backbone), but the training dataset is similar to that of the full AlphaFold 3. Experimentally, AlphaProteo generates plausible binders in several use cases, like the Epstein-Barr virus protein, the COVID-19 spike protein, and proteins involved in cancer.

🔮 In computational materials science, Orbital Materials announced ORB - a family of forcefield models to compute energies, forces, and stresses of atomistic systems (like bulk materials or semiconductors). ORB, trained on Alexandria and Materials Project trajectories with a denoising objective (improved Noisy Nodes), yields SOTA on MatBench Discovery, outperforming the big boys MatterSim from MSR and GNoME from DeepMind. The authors highlight that ORB models are non-equivariant GNNs - in fact, the backbone is very similar to the Graph Network Simulator from 2020 with an optional attention interaction. It will be fun to watch equivariant vs non-equivariant folks beating each other's SOTA in the next few months 🍿
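
For reference, the denoising objective mentioned above in a nutshell - a generic Noisy-Nodes-style sketch (the model name and output head are assumptions, not ORB's actual training code): perturb atomic coordinates with Gaussian noise and train a per-atom head to predict that noise.

```python
import torch

def denoising_loss(model, positions, atom_types, sigma=0.1):
    """Generic Noisy-Nodes-style auxiliary loss: add Gaussian noise to coordinates
    and train the model's per-atom output head to recover that noise."""
    noise = sigma * torch.randn_like(positions)
    pred_noise = model(atom_types, positions + noise)   # assumed per-atom 3D output head
    return ((pred_noise - noise) ** 2).mean()

# Toy usage with a stand-in "model" that maps noisy coordinates to per-atom vectors.
toy_model = lambda z, pos: torch.zeros_like(pos)
loss = denoising_loss(toy_model, torch.randn(8, 3), torch.zeros(8, dtype=torch.long))
```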

💸 Entalpic, a French materials discovery startup whose founders graduated from Mila, announced an €8.5M seed round co-led by Breega, Cathay Innovation, and Felicis - congrats to Mathieu, Victor, and Alexandre! Entalpic joins CuspAI and Orbital Materials in the emerging market of DL-based materials discovery companies - we'll be keeping an eye on their advances.

Weekend reading:

Two papers from Shuiwang Ji's lab on SE(3)-invariant 1D tokenization of 3D molecules for autoregressive generation:

Geometry Informed Tokenization of Molecules for Language Model Generation - for small molecules on QM9 and Geom-Drugs.

Fragment and Geometry Aware Tokenization of Molecules for Structure-Based Drug Design Using Language Models - for generating ligands for protein pockets.

Talking about autoregressive molecule generation,

Any-Property-Conditional Molecule Generation with Self-Criticism using Spanning Trees is another strong baseline improving spanning tree-based graph generation.
GraphML News (September 17th) - Chai-1, GenMS

๐Ÿ“ This week offered a significant portion of strawberries that might result in major improvements in scientific applications. For now, letโ€™s try to check whatโ€™s there beyond the berries.

🧬 Chai Discovery emerged from stealth and released Chai-1 - a reproduction of AlphaFold 3 with trained weights (thanks to a month on 128 A100s, which saved you roughly $500k), a tech report, an open inference server, and inference code (interestingly, no model code). Initial experiments report numbers close to AF3. Chai is backed by OpenAI and many famous VCs, so it might emerge as a new strong player in the industry - we'll keep an eye on it.

🔮 Google DeepMind announced GenMS: Generative Hierarchical Materials Search, by Sherry Yang, Simon Batzner, and the team that brought us UniMat last year. GenMS employs three components: (1) Gemini 1.5 samples candidate formulae from a natural language query, e.g., "give me the formula for a stable chalcogenide with atom ratio 1:1:2 that's not in the ICSD database"; the samples are filtered through rule-based heuristics and re-ranked by an LLM; (2) the best candidates are sent to a diffusion model (a non-equivariant, attention-based 3D U-Net) to generate 3D structures; (3) the structures are scored by a pre-trained ML potential (NequIP) - if they are stable and exhibit the target characteristics, they are added as a tree branch for the next iteration by the LLM. GenMS excels at perovskite, pyrochlore, and spinel crystals, with structures confirmed by DFT formation energy calculations. Almost no geometric DL whatsoever 🙀 (a hedged sketch of the search loop below).
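
A hedged sketch of the hierarchical search loop described above; every callable here is a stub standing in for Gemini, the diffusion model, and the ML potential (names and signatures are illustrative, not GenMS code):

```python
def generative_materials_search(query, llm, heuristics_ok, rerank, diffusion, potential,
                                n_rounds=3, n_formulae=16, top_k=4):
    """Hierarchical search: LLM proposes formulae -> filter & rerank -> diffusion
    generates 3D structures -> ML potential scores them -> best ones seed the next round."""
    context, found = [], []
    for _ in range(n_rounds):
        formulae = [llm(query, context) for _ in range(n_formulae)]   # step 1: sample candidates
        formulae = [f for f in formulae if heuristics_ok(f)]          # rule-based filtering
        formulae = rerank(formulae)[:top_k]                           # LLM re-ranking
        for f in formulae:
            structure = diffusion(f)                                  # step 2: 3D structure
            energy, stable = potential(structure)                     # step 3: score stability
            if stable:
                found.append((f, structure, energy))
                context.append(f)                                     # expand this branch next round
    return sorted(found, key=lambda t: t[2])
```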

Weekend reading:

Recurrent Aggregators in Neural Algorithmic Reasoning by Kaijia Xu and Petar Veličković - the first model capable of solving quickselect from the CLRS benchmark happened to be a Triplet-MPNN with a non-permutation-invariant LSTM aggregator (GraphSAGE vibes; a minimal sketch below). Back in January, in our annual review post, quickselect was the most unlikely candidate for traction, and it looks like it is almost solved now!
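
A minimal sketch of a deliberately non-permutation-invariant LSTM aggregator over incoming messages, in the GraphSAGE-LSTM spirit the post alludes to (illustrative, not the paper's code):

```python
import torch
import torch.nn as nn

class LSTMAggregator(nn.Module):
    """Aggregate a node's incoming messages with an LSTM: the result depends on the
    order in which neighbors are fed in, i.e. it is not permutation invariant."""
    def __init__(self, msg_dim, hidden_dim):
        super().__init__()
        self.lstm = nn.LSTM(msg_dim, hidden_dim, batch_first=True)

    def forward(self, messages):
        # messages: (num_neighbors, msg_dim) for one node
        _, (h_n, _) = self.lstm(messages.unsqueeze(0))
        return h_n[-1].squeeze(0)                     # (hidden_dim,)

agg = LSTMAggregator(msg_dim=8, hidden_dim=16)
msgs = torch.randn(5, 8)                              # 5 incoming messages
out_a = agg(msgs)
out_b = agg(msgs[torch.randperm(5)])                  # a permuted order gives a different result
print(torch.allclose(out_a, out_b))                   # almost surely False
```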

On the design space between molecular mechanics and machine learning force fields by Yuanqing Wang and a huge collaboration of physicists and chemists led by NYU (feat. Kyunghyun Cho) - a nice intro to molecular mechanics, force fields, and potentials, approachable for folks without a degree in physics. The survey includes a discussion of foundational ML potential models and "a nihilistic epilogue" worth checking out.
GraphML News (September 21st) - AITHYRA, Fragrance 2o, LOG meetups

🧬 The Austrian Academy of Sciences, together with the Boehringer Ingelheim Foundation, launched AITHYRA - the Institute for Biomedical AI - with a generous €150M in funding over the next 12 years as part of the Vienna BioCenter, with Michael Bronstein as its first scientific director! AITHYRA plans to host 10-15 research groups, supporting them with compute resources and a robotic lab. Chances are AITHYRA might become the European version of the Institute for Protein Design (behold, David Baker) and the hub for geometric deep learning research. Big win for Vienna 👏

👃 Osmo, a generative fragrance startup founded by ex-Google researchers who worked on the Principal Odor Map, revealed a few more details about the Fragrance 2o platform - essentially, it is molecule search / generation for potential fragrance molecules with further conditional generation capabilities. It would certainly be exciting to discover a personalized scent like "of a sweaty researcher submitting an ICLR paper while camping in Yosemite forests". We will keep you up to date on whether GNNs conquer the perfume world and beauty industry and when Fragrantica starts to list LLM prompts as ingredients.

๐Ÿป One of the unique ideas of the Learning on Graphs conference are local meetups about graph learning research. To date, seven meetups spanning October-December have been announced: Tel Aviv, New Jersey, Aachen, Amsterdam, Paris, Kunshan, and Siena - feel free to attend or organize one at your place!

Weekend reading:

Accelerating Training with Neuron Interaction and Nowcasting Networks by Boris Knyazev et al, a collaboration between Samsung and Mila - pretty amazing work where, every k-th optimization step, model weights are predicted by a graph transformer conditioned on the neural net architecture (supports convnets, GPT-2, BERT, Llama, and ViTs); brings up to 50% speedups in optimization.

The Empirical Impact of Neural Parameter Symmetries, or Lack Thereof by Derek Lim, Moe Putterman, feat. Haggai Maron - another interesting work on neural parameter symmetries. It turns out that fixing weights in MLPs via freezing or non-linearities breaks parameter symmetries and enables better model merging (you can interpolate between pre-trained models to get even better performance; a tiny sketch below).
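
The merging trick in one line, for context - generic linear interpolation of parameters, not the paper's exact procedure:

```python
import torch

def interpolate_state_dicts(sd_a, sd_b, alpha=0.5):
    """Linearly interpolate two models' parameters: theta = (1 - alpha) * a + alpha * b."""
    return {k: (1 - alpha) * sd_a[k] + alpha * sd_b[k] for k in sd_a}

# Toy usage with two identically-shaped linear layers.
net_a = torch.nn.Linear(4, 2)
net_b = torch.nn.Linear(4, 2)
merged = torch.nn.Linear(4, 2)
merged.load_state_dict(interpolate_state_dicts(net_a.state_dict(), net_b.state_dict(), alpha=0.5))
```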

Can Graph Reordering Speed Up Graph Neural Network Training? An Experimental Study by Nikolai Merkel et al (VLDB 2025) - the answer is yes, with an average speedup of 25%. The idea of partitioning the graph into several components to optimize memory reads is similar to the findings of Graph Segment Training (by Google) and Sequential Aggregation and Rematerialization (Intel).

Deep learning-based predictions of gene perturbation effects do not yet outperform simple linear methods by Constantin Ahlmann-Eltze et al 🫳🎤
Discrete Neural Algorithmic Reasoning

Guest post by Gleb Rodionov

Paper: https://www.arxiv.org/abs/2402.11628
Blog: https://research.yandex.com/blog/discrete-neural-algorithmic-reasoning
Code: https://github.com/yandex-research/dnar

In this paper, we focus on generalizable and interpretable neural algorithmic reasoners. Starting with an attention-based GNN, we inspect the reasons for generalization errors and propose several architectural modifications: feature discretization, hard attention, and separating discrete and continuous data flows. All of these blocks are important for generalization (a hedged sketch of the first two follows the list below):

โƒ State discretization prevents the model to use complex and redundant dependencies in the data;

โƒ Hard attention ensures that attention weights are not annealed for larger graphs. Also, hard attention limits the set of possible messages that each node can receive;

โƒ Separating discrete and continuous flows is needed to ensure that state discretization does not lose information about continuous data.

As a result, we achieve a model that provably imitates the execution of several algorithms for any test data when trained with hints. Practically, on SALSA-CLRS, trained on problem sizes of 16 nodes, the model demonstrates perfect graph- and node-level scores generalizing to problems of up to 1600 nodes.

For future work, it would be interesting to extend the expressivity of the proposed model to a broader set of algorithms and to investigate whether it is possible to train these models without hints.
GraphML News (September 28th) - AlphaChip, Generate + Novartis deal, MolPhenix

NeurIPS results for both tracks have arrived - congrats to those who made it; the datasets track this year was particularly egregious, with a hard cut for papers scoring below an average of 6.3. Good luck with the final ICLR push and see you in Vancouver!

💻 Google DeepMind presented AlphaChip - an improved version of the famous 2021 Nature paper that introduced an RL agent using edge-level GNNs for chip placement - that is, placing dozens of smaller blocks (each often implementing a certain logical function) on a canvas to optimize common design metrics like HPWL or PPA. The addendum highlights that pre-training with large compute is rather crucial and reports that AlphaChip has been successfully used for several generations of TPUs (25 RL-designed blocks in the latest TPU) as well as for external customers like MediaTek. The paper gained a somewhat controversial reputation in the chip design community, and some professors even argued for retracting the work from Nature for lack of clarity and reproducibility. Over time, however, it seems more like a skill issue of those who tried to replicate it - generally, the level of ML expertise in the chip design community is pretty low (some accepted papers at top venues like DAC are just 🫣) and most university teams are stuck somewhere between MLPs and convnets. Professors gonna hate, Google gonna continue making impactful real-world products, and we will get new pre-trained checkpoints of AlphaChip with some Colab tutorials 🍿.

💸 Generate:Biomedicines (the authors of Chroma, a generative model for protein design) announced a collaboration with Novartis resulting in $65M in upfront payments and $1B in biobucks (royalties and other performance-based milestones, typically split across many years).

๐Ÿฆ Valence Labs announced MolPhenix, a CLIP-like model to study phenomics (how cells respond to perturbations). Practically, it is trained on pairs of microscopy images and molecules using ViT as image encoder and MolGPS for molecules. Experiments report massive 10x improvements in Top-1% recall of active molecules over previous SOTA ๐Ÿ‘.

Weekend reading:

TabGraphs: A Benchmark and Strong Baselines for Learning on Graphs with Tabular Node Features by Gleb Bazhenov et al - a fresh collection of new graph datasets where features are interpretable (numerical, categorical) - a stark contrast to boring text-attributed graphs or Planetoid datasets with bag-of-words as features.

Design of Ligand-Binding Proteins with Atomic Flow Matching by Junqi Liu et al feat. Jian Tang - generates a docked protein-ligand 3D structure conditioned only on the 2D ligand graph and protein sequence using flow matching. Outperforms RFDiffusionAA on several metrics.
GraphML News (Oct 5th) - ICLR 2025 Graph and Geometric DL Submissions

📚 Brace yourselves, for your browser is about to endure 50+ new tabs. All accepted NeurIPS 2024 papers are now visible (titles and abstracts), and a new batch of goodies from ICLR'25 has just arrived. We tried to select papers that haven't yet appeared during the ICML/NeurIPS cycles. PDFs will be available on the respective OpenReview pages shortly:

Towards Graph Foundation Models:

GraphProp: Training the Graph Foundation Models using Graph Properties
GFSE: A Foundational Model For Graph Structural Encoding
Towards Neural Scaling Laws for Foundation Models on Temporal Graphs

Graph Generative Models:

Quality Measures for Dynamic Graph Generative Models
Improving Graph Generation with Flow Matching and Optimal Transport
Equivariant Denoisers Cannot Copy Graphs: Align Your Graph Diffusion Models
Topology-aware Graph Diffusion Model with Persistent Homology
Hierarchical Equivariant Graph Generation
Smooth Probabilistic Interpolation Benefits Generative Modeling for Discrete Graphs

GNN Theory:

Towards a Complete Logical Framework for GNN Expressiveness
Rethinking the Expressiveness of GNNs: A Computational Model Perspective
Learning Efficient Positional Encodings with Graph Neural Networks

Equivariant GNNs:

Improving Equivariant Networks with Probabilistic Symmetry Breaking
Does equivariance matter at scale?
Beyond Canonicalization: How Tensorial Messages Improve Equivariant Message Passing
Spacetime E(n) Transformer: Equivariant Attention for Spatio-temporal Graphs
Rethinking Efficient 3D Equivariant Graph Neural Networks

Generative modeling with molecules (hundreds of them actually):

AssembleFlow: Rigid Flow Matching with Inertial Frames for Molecular Assembly
RoFt-Mol: Benchmarking Robust Fine-tuning with Molecular Graph Foundation Models
Multi-Modal Foundation Models Induce Interpretable Molecular Graph Languages
MolSpectra: Pre-training 3D Molecular Representation with Multi-modal Energy Spectra
Reaction Graph: Toward Modeling Chemical Reactions with 3D Molecular Structures
Accelerating 3D Molecule Generation via Jointly Geometric Optimal Transport
GraphML News (Oct 12th) - Nobel Prizes, Mediterranean ML Summer School

๐Ÿ…If you lived under the rock this week, Deep Learning got two Nobel Prizes this year: Geoff Hinton and John Hopfield got the physics prize (less expected), and David Baker, John Jumper, and Demis Hassabis got the chemistry prize (more than expected after AF 2 received almost all other scientific awards). The acknowledgement of deep learning advancements was not rushed as it might seem - it took already 10+ years since the ImageNet revolution and the entire new industry has grown on top of it. It roughly took the same time for CRISPR (another chemistry Nobel Prize in 2020) to get acknowledged. What does the prizes mean for the field and industry (other than DL researchers could claim to be a bit of physicists and chemists themselves)?

It is likely that AI 4 Science as a field will receive significantly more attention, with more researchers entering the area and more funding for commercializing some of the tech behind it. The potential of using DL methods to accelerate scientific discovery is still largely untapped (yes, geometric DL did enable the recent successes in protein design and pharma, but, for example, we can't say that protein generative models truly learn the underlying physical phenomena for now), so it is as exciting a time as ever to start your research journey in this area. There is plenty of space to do impactful research, and we'll probably see more labs and companies pivoting there. (Fun fact - brace yourselves, as every 2nd talk at NeurIPS 2024 will probably start with the same Nobel Prize slides.)

📺 The recordings of the Mediterranean ML Summer School are finally available! The school took place in September in Milan, packing in a week of talks on transformers, reasoning, diffusion models and flow matching, GNNs, RL, RLHF, optimization, and many more.

Weekend reading (while waiting for ICLR papers to go public) features a fresh lineup of works by Google DeepMind studying the guts of transformers:

softmax is not enough (for sharp out-of-distribution) by Petar Veličković et al, arguing that softmax necessarily loses sharpness on longer OOD inputs (a quick numeric check below)
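
A quick numeric check of that dispersion effect: with bounded logits, the weight on the single relevant item decays toward uniform as the input length grows (illustrative, not the paper's experiment):

```python
import torch

def max_softmax_weight(n, logit_gap=5.0):
    """One 'relevant' logit exceeds the rest by a fixed gap; with n items the winner's
    softmax weight shrinks as the other n-1 items soak up probability mass."""
    logits = torch.zeros(n)
    logits[0] = logit_gap
    return torch.softmax(logits, dim=0)[0].item()

for n in (16, 256, 4096, 65536):
    print(n, round(max_softmax_weight(n), 4))
# Weight on the relevant item: ~0.908 -> ~0.368 -> ~0.035 -> ~0.0023
```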

Positional Attention: Out-of-Distribution Generalization and Expressivity for Neural Algorithmic Reasoning by Artur Back de Luca, George Giapitzakis, Shenghao Yang et al

Round and Round We Go! What makes Rotary Positional Encodings useful? by Federico Barbero et al
GraphML News (Oct 19th) - Orb-v2, OMat24, Stanford Graph Learning Seminar, new PhD positions

🔥 The competition in materials science heats up: ML potentials (models that estimate the potential energy of an atomistic system and often predict energy, forces, and stresses) are one of the main drivers in the field as they can significantly speed up expensive molecular dynamics (MD) calculations. Matbench Discovery is one of the main benchmarks for ML potentials.

🔮 During the week, Orbital Materials released the code and weights of Orb-v2, the next version of their non-equivariant MPNN (Orbital folks explicitly bet against equivariant GNNs), which outperforms the mighty MatterSim from MSR with just 25M parameters. Besides, Orb-v2 offers increased stability during MD calculations.

📈 A few days later, FAIR Chemistry released OMat24, a new large dataset with 100M+ structures for training ML potentials (much larger than existing datasets) that required 400M+ core hours of DFT calculations (preprint). Together with OMat24, FAIR released EquiformerV2, an equivariant transformer pre-trained on this dataset and fine-tuned on MatBench Discovery (using just 64 A100s - 🌚 an entry level 🌚 of compute these days), and claimed SOTA on Matbench Discovery. Interestingly, Equiformer got a significant performance boost when trained with the denoising objective - similar to what Orb models are trained with. It is likely that the benchmark will be fully saturated next year.

Meanwhile, Google DeepMind together with Japanese institutes released a paper on applying GNoME (the flagship tool for materials discovery introduced last year) to synthesizing cesium chlorides.

๐ŸŽ™๏ธ The Stanford Graph Learning Workshop will take place on November 5th physically at Stanford with the online stream, expect some new announcements and releases!

🎓 Finally, the application season for PhD positions and internships is open: we'd highlight the call for fully-funded PhD positions with Viacheslav Borovitskiy at the University of Edinburgh on geometric learning and uncertainty quantification (Geometric Kernels is one of their most recent works). Application deadline: Dec 15th; start date: September 2025.

Let us know if your lab is hiring this season and we'll compile a larger list of open geometric learning positions!

Weekend reading:

PDFs of ICLR 2025 submissions are now visible - you can open and read everything from the list we prepared a few weeks ago.
GraphML News (Oct 26th) - LOG meetups, Orbital round, ESANN 2025

๐Ÿป The Learning of Graphs conference continues to update the list of local meetups - the networks already includes 13 places from well-known graph learning places like Stanford, NYC, Paris, Oxford, Aachen, Amsterdam, Tel Aviv down to Tromsรธ, Uppsala, Siena, New Delhi, Suzhou, and Vancouver (Late November in Tromsรธ, talking graphs with a cup of glรผhwein and snow outside must be a quite a cozy venue). The call for meetups is still open!

On this note, the European Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning (ESANN 2025) will host 3 special sessions on graph learning: Foundation and Generative Models for Graphs, Graph Generation for Life Sciences, and Network Science Meets AI. The submission deadline is November 20, 6 pages tops. Thanks to Manuel Dileo for the pointer! ESANN 2025 will take place on April 23-25, 2025, in Bruges (jokes about the movie and Tottenham are welcome).

💸 Orbital Materials secured a new funding round led by NVIDIA Ventures (financial details undisclosed), nicely timed to coincide with the recent release of the Orb-v2 ML potential GNN. A new AI 4 Science unicorn in the making? 🤔

Weekend reading:

Learning Graph Quantized Tokenizers for Transformers by Limei Wang, Kaveh Hassani et al and Meta - an unorthodox approach to graph tokenization via vector quantization and codebook learning, conceptually similar to VQ-GNN (NeurIPS 2021); strange not to see this older paper cited

Relaxed Equivariance via Multitask Learning by Ahmed Elhag et al feat. Michael Bronstein - instead of baking equivariance right into models, add it as a loss component and allow a model to learn and use as much equivariance as necessary; brings 10x inference speedups (a hedged sketch below).
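
A hedged sketch of an equivariance penalty added as a loss term, in the spirit described above (rotation case; a generic formulation rather than the paper's exact multitask setup, and it assumes a model with per-point vector outputs):

```python
import torch

def equivariance_penalty(model, x, n_samples=4):
    """Soft equivariance: penalize || f(R x) - R f(x) || for random orthogonal R,
    letting the model trade off exact symmetry against the main task loss.
    Assumes x is (n_points, 3) and model maps it to per-point 3D vectors."""
    penalty = 0.0
    for _ in range(n_samples):
        R, _ = torch.linalg.qr(torch.randn(3, 3))          # random orthogonal matrix
        penalty = penalty + ((model(x @ R.T) - model(x) @ R.T) ** 2).mean()
    return penalty / n_samples

# Typical usage: total_loss = task_loss + lambda_equiv * equivariance_penalty(model, coords)
```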

Homomorphism Counts as Structural Encodings for Graph Learning by Linus Bao, Emily Jin, et al - introduces motif structural encoding (MoSE) for graph transformers. Paired with GraphGPS, brings MAE on ZINC from 0.07 down to 0.062 and to 0.056 with GRIT.
GraphML News (Nov 2nd) - The Debate on Equivariance, MoML and Stanford Graph workshop

🎃 Writing ICLR reviews and LOG rebuttals might have delivered enough Halloween spirit with spooky papers and (semi-)undead reviewers - it's almost over though!

🥊 The debate on equivariance - namely, whether it is worth baking symmetries right into the model or learning them from data - remains a hot topic in the community, with new evidence appearing every week supporting both sides. Is torch.nn.TransformerEncoder all you need?

In the blue corner, Does equivariance matter at scale? by Johann Brehmer et al compares a vanilla transformer with the E(3)-equivariant Geometric Algebra Transformer (GATr) on a rigid-body modelling task across a wide range of model sizes to derive scaling laws (akin to the Kaplan and Chinchilla laws) and finds that the equivariant transformer scales better overall.

In the red corner, we have The Importance of Being Scalable: Improving the Speed and Accuracy of Neural Network Interatomic Potentials Across Chemical Domains by Eric Qu et al, who modify a vanilla transformer and outperform the hefty Equiformer, GemNet, and MACE on ML potential benchmarks for molecules and crystals. Another wrestler in the red corner is the tech report on ORB v2 by Mark Neumann and Orbital Materials - ORB v2 is a vanilla MPNN potential trained with a denoising objective that delivers SOTA (or close to it) performance while being trained on only 8 A100s (compared to the 64+ GPUs needed for EquiformerV2, though on different training datasets).

๐Ÿ† Overall, โ€œno equivarianceโ€ wins this week 2-1 (2.5 - 1 if including a recent work on relaxed equivariance).

🎤 Next Tuesday, Nov 5th, is not just election day in the US, but also the day of two graph learning events: MoML 2024 at MIT and the Graph Learning Workshop 2024 at Stanford. The programs of both events are now available, and there might be livestreams as well - keep an eye on the announcements.

Weekend reading:

Generator Matching: Generative modeling with arbitrary Markov processes by Peter Holderrieth feat. Ricky Chen and Yaron Lipman - a generalization of diffusion, flow matching (both continuous and discrete), and jump processes (outstanding paper award at ICLR'24). Expect a new generation of generative models for images / proteins / molecules / SBDD / RNAs / crystals to adopt this next year.

Long-context Protein Language Model by Yingheng Wang and (surprisingly) Amazon team - introduces a Mamba-based bidirectional protein LM that outperforms ESM-2 on a variety of tasks while being much smaller and faster.

Iambic announced NeuralPLexer 3 competitive with AlphaFold 3. While we are waiting for the tech report and more experiments, it seems that NP3 features Triton kernels for efficient triangular attention akin to FlashAttention but on triples of nodes.