Probabilistic Learning on Graphs via Contextual Architectures
This is a guest post by Federico Errica ([email protected]) about their new JMLR work called “Probabilistic Learning on Graphs via Contextual Architectures”.
Intro/TL;DR:
We propose a probabilistic methodology for representation learning on graph-structured data, in which a stack of Bayesian networks learns different distributions of a vertex's neighbourhood. The main characteristics of our approach are: (i) it is unsupervised, as it models the generation of node attributes; (ii) it is trained layer-wise; (iii) it follows an incremental construction policy; and (iv) parameters are estimated by maximum likelihood via Expectation-Maximization. The model, called Contextual Graph Markov Model (CGMM), can be regarded as a probabilistic version of Deep Graph Networks (DGNs).
Each layer of the model implements a probabilistic version of neighbourhood aggregation. The hidden representation of each node is modelled as a categorical distribution. When aggregating neighbours, the incoming messages are the *frozen* posterior probabilities computed when training the previous layers. When discrete edge types are available, we can weight the contribution of nodes in different ways using the Switching Parent approximation. Moreover, each neighbour aggregation can be conditioned on an arbitrary subset of the previous layers.
By design, this incremental construction policy avoids the exploding/vanishing gradient effect. As a result, each layer exploits a different set of statistics when trying to maximize the likelihood of the nodes in each graph. We test the model on node and graph classification tasks: first, we generate unsupervised node/graph representations; then, we apply a standard ML classifier to predict the class. In turn, this leads to a critical analysis of some benchmarks used in the literature. Finally, we show that the performance of the model increases as we add more layers (up to 20).
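To make the layer-wise, frozen-posterior training more concrete, here is a minimal sketch of the construction as I read it; the mean aggregation of neighbour posteriors, the uniform messages at the first layer, and the `em_fit_layer` stub are illustrative assumptions, not the official implementation (that is linked below).

```python
import numpy as np

def aggregate_neighbours(frozen_posteriors, adj):
    """Average the frozen posteriors of each node's neighbours.
    frozen_posteriors: (num_nodes, C) array from a previously trained layer.
    adj: (num_nodes, num_nodes) binary adjacency matrix."""
    deg = adj.sum(axis=1, keepdims=True).clip(min=1)
    return (adj @ frozen_posteriors) / deg

def em_fit_layer(node_features, messages, num_states, num_iters=10):
    """Stub for fitting one layer with Expectation-Maximization.
    A real implementation would maximize the likelihood of node_features
    conditioned on the aggregated messages and return the resulting
    posterior over the layer's hidden (categorical) states for every node."""
    num_nodes = node_features.shape[0]
    posteriors = np.random.dirichlet(np.ones(num_states), size=num_nodes)
    return posteriors

def train_incrementally(node_features, adj, num_layers, num_states):
    """Train layers one at a time; earlier layers are frozen and never
    revisited, so there is no end-to-end gradient to explode or vanish."""
    num_nodes = node_features.shape[0]
    # No previous posteriors at the first layer: use uniform messages.
    frozen = [np.full((num_nodes, num_states), 1.0 / num_states)]
    for _ in range(num_layers):
        messages = aggregate_neighbours(frozen[-1], adj)
        frozen.append(em_fit_layer(node_features, messages, num_states))
    # Node representation: concatenation of the per-layer posteriors,
    # which a standard downstream classifier can consume.
    return np.concatenate(frozen[1:], axis=1)
```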
Paper: https://www.jmlr.org/papers/v21/19-470.html
Code: https://github.com/diningphil/CGMM
Related reads: (i) https://doi.org/10.1016/j.neunet.2020.06.006 (ii) https://proceedings.mlr.press/v80/bacciu18a.html
Issue #1: Introduction, PAC Isometry, Over-Smoothing, and Evolution of the Field
Finally, the first issue of the newsletter is out, and I hope there will be many more in the future. The most difficult part of this is finding good stories for the email: it's somewhat different from posting on Telegram and Twitter, as you need to have more insight in a single story. So if you find something that could be relevant to the community, definitely send me a message.
Simple scalable graph neural networks
Michael Bronstein continues a marathon of great blog posts on GML. In a new post he describes their recent work on scaling GNNs to large networks. There is a good introduction to sampling-based methods (e.g. SAGE, GraphSAINT, ClusterGCN), which sample a subgraph of a large graph and then train the GNN only on that subgraph.
Then, he describes that it can be beneficial to simply precompute the r-hop matrices A^r X and apply an MLP on these features. This way, you still use the topology of your graph, but you can train the MLP with ordinary mini-batches.
What's cool is that the algorithm is already available in pytorch-geometric as a transform, implemented via SparseTensor matrix multiplication.
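A rough sketch of that precomputation (my own simplified version under assumed conventions, not the actual pytorch-geometric transform): build a normalized adjacency once, propagate the node features r times, and then train any off-the-shelf MLP on the concatenated result with ordinary mini-batches.

```python
import torch

def precompute_propagated_features(edge_index, x, r):
    """Return [X, AX, A^2 X, ..., A^r X] concatenated along the feature axis,
    where A is the symmetrically normalized adjacency. Done once as
    preprocessing, so training reduces to an MLP over node feature rows.
    Assumes edge_index (2, E) already contains both directions of each edge."""
    num_nodes = x.size(0)
    row, col = edge_index
    deg = torch.bincount(row, minlength=num_nodes).float().clamp(min=1)
    vals = (deg[row] * deg[col]).rsqrt()          # entries of D^{-1/2} A D^{-1/2}
    adj = torch.sparse_coo_tensor(edge_index, vals, (num_nodes, num_nodes)).coalesce()
    feats, cur = [x], x
    for _ in range(r):
        cur = torch.sparse.mm(adj, cur)           # one propagation hop
        feats.append(cur)
    return torch.cat(feats, dim=1)                # feed this to a plain MLP
```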
MLSS Indo 2020: Equivariance
MLSS-Indonesia is part of the MLSS series, one of the top summer schools on ML. Here are a lecture and slides on equivariance and group theory, and how they are used for convolutions.
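As a toy illustration of the equivariance idea the lecture builds on (my own example, not taken from the slides): a 1-D circular convolution commutes with cyclic shifts of its input, which is exactly translation equivariance.

```python
import numpy as np

def circular_conv(x, w):
    """1-D circular cross-correlation of signal x with kernel w."""
    n, k = len(x), len(w)
    return np.array([sum(w[j] * x[(i + j) % n] for j in range(k)) for i in range(n)])

x = np.random.randn(8)      # a signal on the cyclic group Z_8
w = np.random.randn(3)      # a small convolution kernel
shift = 2

# Equivariance: convolving a shifted input equals shifting the convolved output.
lhs = circular_conv(np.roll(x, shift), w)
rhs = np.roll(circular_conv(x, w), shift)
assert np.allclose(lhs, rhs)
```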
Fresh picks from ArXiv
This week discusses ways to compute a distance between graphs, fast triangle counting, and a new i.i.d. test 🧪
GNN
• Adversarial Directed Graph Embedding with Philip S. Yu
• CoCoS: Fast and Accurate Distributed Triangle Counting in Graph Streams with Christos Faloutsos
• Network comparison and the within-ensemble graph distance
Math
• Using Expander Graphs to test whether samples are i.i.d
Surveys
• Graph Signal Processing for Geometric Data and Beyond: Theory and Applications
• Graph Neural Networks: Architectures, Stability and Transferability
• Random Walks: A Review of Algorithms and Applications
• Big Networks: A Survey
Israeli Geometric Deep Learning Workshop
Many cool presentations at the recent iGDL workshop, including talks by Yaron Lipman, Gal Chechik, and many other experienced people in this field. The video is on YouTube.
The Quantum Graph Recurrent Neural Network
This demonstration by PennyLane investigates quantum graph recurrent neural networks (QGRNN), which are the quantum analogue of classical graph recurrent neural networks and a subclass of the more general quantum graph neural network ansatz. Both the QGNN and the QGRNN were introduced in this paper (2019) by Google X.
Drawing neural networks in LaTeX
There is a repo of good examples by Petar Veličković showing how you can draw TikZ figures in LaTeX. Here is an example of a 1-layer GNN by Matthias Fey.
Graph Machine Learning research groups: Xavier Bresson
I do a series of posts on the groups in graph research; the previous post is here. The 12th is Xavier Bresson, an organizer of conferences and tutorials on graph machine learning.
Xavier Bresson (~1975)
- Affiliation: NTU Singapore
- Education: Ph.D. at EPFL, Switzerland in 2005 (supervised by Jean-Philippe Thiran);
- h-index: 38;
- Awards: Singapore NRF fellowship, best paper at IVMSP;
- Interests: signal processing on graphs, GNN, combinatorial optimization.
Temporal Graph Networks
Another blog post by Michael Bronstein, together with Emanuele Rossi, on applying graph nets to dynamic graphs (represented as a stream of edges). Apparently this problem is much more realistic in many business contexts, such as social networks, and it had not been studied in depth until that paper.
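For context, a dynamic graph in this setting is simply a time-ordered stream of interaction events. A minimal, hypothetical way to store and batch such a stream (unrelated to the actual TGN code) could look like this:

```python
from dataclasses import dataclass, field
from typing import Iterator, List

@dataclass
class Interaction:
    src: int                      # source node id
    dst: int                      # destination node id
    t: float                      # timestamp of the interaction
    feat: List[float] = field(default_factory=list)  # optional edge features

def temporal_batches(events: List[Interaction], batch_size: int) -> Iterator[List[Interaction]]:
    """Yield chronologically ordered mini-batches; a temporal GNN must only
    condition on events whose timestamps precede the ones it predicts."""
    events = sorted(events, key=lambda e: e.t)
    for i in range(0, len(events), batch_size):
        yield events[i:i + batch_size]
```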
Fresh picks from ArXiv
This week we have novel GNN architectures, Turing test for graph drawing, and graph exploration of git repositories 💻
GNN
• Quaternion Graph Neural Networks
• Representative Graph Neural Network (ECCV 2020)
• DensE: An Enhanced Non-Abelian Group Representation for Knowledge Graph Embedding
Applications
• Graph Edit Distance Reward: Learning to Edit Scene Graph
• HOSE-Net: Higher Order Structure Embedded Network for Scene Graph Generation
• Bipartite Graph Reasoning GANs for Person Image Generation
• Graph Polish: A Novel Graph Generation Paradigm for Molecular Optimization
• Graph Drawing via Gradient Descent, (GD)2
• The Turing Test for Graph Drawing Algorithms
• GraphRepo: Fast Exploration in Software Repository Mining
Node regression problem
I asked on Twitter what node regression data sets are available and got quite a few interesting responses (a minimal node-regression training sketch follows this list).
1. There are pure node regression data sets, but not so many. One can use Wikipedia, Pokec, or data sets from this paper. I hope to release a couple more data sets like these soon.
2. You can also find data sets for spatiotemporal prediction on graphs (e.g. traffic forecasting). You are given a graph plus the velocity on each lane, and you are asked to predict the velocity in the future. My opinion is that this is a toy problem: there are no features associated with the nodes (except for speed). But you can take a look at DCRNN, STGCN, GaAN, Graph WaveNet, STGRAT, and other models that deal with it.
3. You can find node regression in work on simulating physics. A node is a particle with a few features (e.g. position and velocity), and you are asked to predict its acceleration. This is an interesting problem, but I haven't found data sets; you probably need to write your own simulator.
4. Next scene prediction. Essentially the same as the previous one, but the objects can be anything, for example the objects in a camera view of a self-driving car. You are asked to predict the next position of every object. I don't know if anyone has tried to solve this problem.
5. Action prediction for an RL agent. NerveNet did it: the agent is represented as a graph and you predict an action for each node.
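To make the task concrete, here is a hedged sketch of plain transductive node regression with PyTorch Geometric; the two-layer GCN, the train mask, and the hyper-parameters are placeholders, not a recipe for any particular data set mentioned above.

```python
import torch
import torch.nn.functional as F
from torch_geometric.nn import GCNConv

class NodeRegressor(torch.nn.Module):
    """Two-layer GCN that outputs one real value per node."""
    def __init__(self, in_dim, hidden_dim=64):
        super().__init__()
        self.conv1 = GCNConv(in_dim, hidden_dim)
        self.conv2 = GCNConv(hidden_dim, 1)

    def forward(self, x, edge_index):
        h = F.relu(self.conv1(x, edge_index))
        return self.conv2(h, edge_index).squeeze(-1)

def train(data, epochs=200):
    """`data` is a torch_geometric.data.Data object with real-valued targets
    in data.y and a boolean data.train_mask selecting the training nodes."""
    model = NodeRegressor(data.num_features)
    opt = torch.optim.Adam(model.parameters(), lr=0.01)
    for _ in range(epochs):
        opt.zero_grad()
        pred = model(data.x, data.edge_index)
        loss = F.mse_loss(pred[data.train_mask], data.y[data.train_mask])
        loss.backward()
        opt.step()
    return model
```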
Brief analytics of KDD papers
Here are some plots for the upcoming KDD 2020 (research track only). It's interesting to compare it against ICML 2020. You can check out the git repo with the analysis. Here are the highlights:
1. Top affiliations at KDD are different from those at ICML, with several "small" names at the top.
2. Leading authors are almost all from China.
3. There are more authors per paper (~4-5) at KDD than at ICML (~3-4). There is only a single paper with a single author.
4. There are ~65 graph papers, with a handful on pure graph algorithms.
5. In total there are 217 research papers, so graphs comprise about 30% of all papers.
6. The word cloud confirms it: "graph" is the most used word in titles.
7. "Geodesic Forests" is the shortest title that appeared.