On the universal equivariant functions
Yesterday's post was followed by a conversation with Andreas Loukas on the power of graph neural networks. There is one detail in the current analysis of GNNs that I didn't pay much attention to before, even though I had encountered it (a recurring phenomenon for me: a major insight gets a single sentence in a paper instead of being highlighted in CAPITAL letters).
The insight is that there are two types of GNNs, anonymous and non-anonymous. In the anonymous case the model is invariant to the order of the nodes, for example when you use sum aggregation over nodes. It was shown that the anonymous case is equivalent to the WL algorithm and therefore has a lot of limitations, such as not being able to count subgraphs or distinguish graphons, etc. So the current anonymous models are not universal: they cannot compute all functions of the input. This is a weaker model than the non-anonymous case, where you give some ordering to the nodes and then iterate over all orderings.
Non-anonymous models have additional node features, for example one-hot encodings of node positions in the adjacency matrix, which is one of the sufficient conditions for a GNN to be universal. There are then two scenarios. Either you consider all possible permutations, in which case the cost grows factorially and it is essentially a cheat; or you resort to a single permutation, but then you no longer enjoy the invariance property of the GNN, i.e. different orderings can give different sets of embeddings for the same nodes.
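To make the distinction tangible, here is a minimal sketch (a toy example of my own, not taken from any paper) showing both effects: with sum aggregation the graph-level output is identical under any node ordering, while appending one-hot positional features ties node embeddings to a particular ordering.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy graph: adjacency matrix and node features.
A = np.array([[0, 1, 1],
              [1, 0, 0],
              [1, 0, 0]], dtype=float)
X = rng.normal(size=(3, 4))
W = rng.normal(size=(4, 4))        # weights of one message-passing layer

def layer(A, X, W):
    # One anonymous message-passing step: sum the neighbours' features.
    return np.tanh(A @ X @ W)

def readout(H):
    # Permutation-invariant graph-level readout.
    return H.sum(axis=0)

# Permute the nodes.
P = np.eye(3)[[2, 0, 1]]
A_p, X_p = P @ A @ P.T, P @ X

# Anonymous GNN: the graph-level output does not change under the permutation.
assert np.allclose(readout(layer(A, X, W)), readout(layer(A_p, X_p, W)))

# Non-anonymous GNN: append one-hot node ids (positions in the adjacency matrix).
Xa   = np.hstack([X,   np.eye(3)])
Xa_p = np.hstack([X_p, np.eye(3)])      # the ids are NOT permuted with the nodes
W_a = rng.normal(size=(7, 4))
H, H_p = layer(A, Xa, W_a), layer(A_p, Xa_p, W_a)

# The same node now gets a different embedding under a different ordering.
print(np.allclose(P @ H, H_p))   # False: invariance/equivariance is lost
```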
So it's interesting to see whether there are universal equivariant functions that do not use all node permutations; this is still an open question.
Other telegram channels
There are many other interesting channels; below are the English-speaking ones that I follow. If I missed something valuable, feel free to send it to me and I will update the post.
https://t.iss.one/opendatascience
https://t.iss.one/ArtificialIntelligencedl
https://t.iss.one/ai_machinelearning_big_data
https://t.iss.one/j_links
https://t.iss.one/ArtificialIntelligenceArticles
https://t.iss.one/snakers4
https://t.iss.one/loss_function_porn
https://t.iss.one/yegor256news
Also, quite a useful aggregator: https://infomate.club/ml/
Why `True is False is False` -> False?
Stumbled upon this little question: why does `True is False is False` evaluate to False in Python? The answer is as simple as the question, but it is not obvious and many people could not answer it. So I wrote a quick post about it on Medium.
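For the impatient, the answer boils down to Python's comparison chaining: `a op1 b op2 c` is evaluated as `(a op1 b) and (b op2 c)`, and `is` chains just like `<` or `==`. A quick check:

```python
# Chained comparisons: `a op b op c` means `(a op b) and (b op c)`,
# with `b` evaluated only once. `is` participates in chaining too.
print(True is False is False)                  # False
print((True is False) and (False is False))    # False -- the same thing
print((True is False) is False)                # True  -- what people often expect it to mean
```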
Fresh picks from ArXiv
This week has GNN variants for various types of graphs, NP-completeness of the MaxCut problem, and a survey on graph data management.
GNN
• Understanding the Message Passing in Graph Neural Networks via Power Iteration
• Interpretable and Efficient Heterogeneous Graph Convolutional Network
• Graph Neural Network for Hamiltonian-Based Material Property Prediction
• Non-IID Graph Neural Networks
• Hierarchical Fashion Graph Network for Personalized Outfit Recommendation
• Non-Local Graph Neural Networks
Graph Theory
• Complexity of Maximum Cut on Interval Graphs
• Planar Graphs that Need Four Pages
Surveys
• Benchmarking Graph Data Management and Processing Systems: A Survey
ICML 2020 stats
ICML is the top conference in ML.
Dates: July 12-18
Where: Online
• 4990 submissions (vs 3424 in 2019)
• 1088 accepted (vs 774 in 2019)
• 21.8% acceptance rate (vs 22.6% in 2019)
• 53 graph papers (5% of total)
Online Seminar on Mathematical Foundations of Data Science
Virtual weekly seminars with an excellent list of speakers, open to the public, on mathematics and statistics in ML.
Graph Machine Learning research groups: Joan Bruna
I am doing a series of posts on research groups in graph machine learning; the previous post is here. The seventh is Joan Bruna. He was one of the authors of the survey Geometric Deep Learning: Going Beyond Euclidean Data and now has more and more papers on theoretical explanations of GNNs.
Joan Bruna (~1981)
- Affiliation: New York University
- Education: Ph.D. at Ecole Polytechnique, France in 2013 (supervised by Stephane Mallat);
- h-index: 27;
- Awards: NSF career award, Sloan fellowship, ICMLA best paper;
- Interests: GNN theory, equivariant networks
ICML 2020 arxiv links
There is a nice website that gathers all (available) links to papers at ICML. There are some interesting insights.
First, it's interesting to see which is the oldest paper accepted to ICML this year. Apparently it was first posted on arXiv in September 2017, waiting a little less than 3 years to get accepted. There are 8 papers from 2018. And the authors probably started working on these papers 6-9 months before the publication date. It's brutal.
Another interesting observation is that the word "graph" is among the top-5 words in paper titles, which shows the increased interest in graphs at ICML.
conference-viz.now.sh
Statistics, Paper Links and Visualizations of Machine Learning Conferences
Fresh picks from ArXiv
This week is rich in explainable-GNN literature, as well as papers on graph compression, combinatorial optimization, and recommendation.
GNN
• Learning to Solve Combinatorial Optimization Problems on Real-World Graphs in Linear Time
• Are Graph Convolutional Networks Fully Exploiting Graph Structure?
• Accurately Solving Physical Systems with Graph Learning
• Single-Layer Graph Convolutional Networks For Recommendation
• XAI for Graphs: Explaining Graph Neural Network Predictions by Identifying Relevant Walks
• XGNN: Towards Model-Level Explanations of Graph Neural Networks
• Universal Graph Compression: Stochastic Block Models
• Convergence and Stability of Graph Convolutional Networks on Large Random Graphs
• Fairness-Aware Explainable Recommendation over Knowledge Graphs
Graph Theory
• Hierarchical hyperbolicity of graph products
• Tree-Projected Gradient Descent for Estimating Gradient-Sparse Parameters on Graphs
• The Weisfeiler-Leman dimension of chordal bipartite graphs without bipartite claw
Conferences
• Multi-level Graph Convolutional Networks for Cross-platform Anchor Link Prediction (KDD 20)
Surveys
• Generate FAIR Literature Surveys with Scholarly Knowledge Graphs
• A Comprehensive Survey of Neural Architecture Search: Challenges and Solutions
Gradient Boosting Meets Graph Neural Networks for Heterogeneous Data
We have two short paper submissions to the GRL workshop this year. One of them is about applying gradient boosted decision trees (GBDT) to graphs. We know that XGBoost, LightGBM, and CatBoost perform extremely well on tabular data and are the preferred methods for competitions like Kaggle. But how do you generalize them to graph-structured data?
A naïve approach is to first train a GBDT on node features only, ignoring the graph topology, and then use its predictions as additional features for your model. But that misses the graph information, possibly leading to inaccurate predictions. Instead, we propose to train the GBDT and the GNN end-to-end, such that each new tree of the GBDT approximates the mistakes made by the GNN in the forward pass. We call the model Boosted Graph Neural Network and show that it can lead to a significant uplift in performance on node regression tasks, while being very efficient.
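A very rough sketch of the training scheme (my own toy reconstruction with made-up data and a linear one-layer "GNN", not the paper's actual code): each boosting round fits a new tree to the residual left by the combined GBDT + GNN prediction, and the GNN is then updated on the remaining error.

```python
# Minimal sketch of the idea behind a boosted GNN (not the paper's implementation).
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)

# Hypothetical toy data: a random graph with node features and node-level targets.
n, d = 200, 8
A = (rng.random((n, n)) < 0.05).astype(float)
A = np.maximum(A, A.T)                                # symmetric adjacency
A_hat = A / np.maximum(A.sum(1, keepdims=True), 1)    # row-normalised
X = rng.normal(size=(n, d))
y = A_hat @ X @ rng.normal(size=d) + 0.1 * rng.normal(size=n)

W = rng.normal(size=(d, 1)) * 0.01                    # one linear "GNN" layer

def gnn(features):
    return (A_hat @ features @ W).ravel()             # neighbourhood-averaging layer

boost, lr, trees = np.zeros(n), 0.1, []
for t in range(50):
    # 1) GBDT step: fit one tree to the residual left by GBDT + GNN.
    residual = y - (boost + gnn(X))
    tree = DecisionTreeRegressor(max_depth=3).fit(X, residual)
    trees.append(tree)
    boost += lr * tree.predict(X)

    # 2) GNN step: a few gradient updates on the remaining error.
    for _ in range(10):
        err = (boost + gnn(X)) - y
        grad = (A_hat @ X).T @ err[:, None] / n
        W -= 0.05 * grad

print("final MSE:", np.mean((boost + gnn(X) - y) ** 2))
```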
grlplus.github.io
Call for Papers
ICML 2020 Workshop
Are Hyperbolic Representations in Graphs Created Equal?
The second submission to the GRL workshop was on hyperbolic embeddings for graphs. We first give a good introduction to distances and dot products in the κ-stereographic model (a Riemannian manifold with constant curvature) and fix the issue with taking gradients at zero curvature by using a Taylor series expansion around the origin. This allows seamless gradient descent optimization in non-Euclidean space.
Then we run experiments on node and graph classification, link prediction, and a graph embedding task (i.e. preserving distances in the latent space) and show that for link prediction and graph embedding there is an uplift from using hyperbolic manifolds, while for node and graph classification Euclidean models work better.
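To give a flavour of the zero-curvature fix (a generic sketch of the trick, not our exact code; the function name `tan_k` and the threshold `eps` are my own choices): the curvature-dependent functions of the κ-stereographic model divide by sqrt(|κ|), which breaks gradients as κ → 0, so near zero curvature one switches to their Taylor expansion around κ = 0.

```python
import torch

def tan_k(x: torch.Tensor, k: torch.Tensor, eps: float = 1e-4) -> torch.Tensor:
    """Curvature-dependent tangent used in the kappa-stereographic model.

    tan_k(x) = tan(sqrt(k) x) / sqrt(k)     for k > 0  (spherical)
             = tanh(sqrt(-k) x) / sqrt(-k)  for k < 0  (hyperbolic)
             -> x + k x^3/3 + 2 k^2 x^5/15 + ...  as k -> 0  (Euclidean limit)

    The closed forms divide by sqrt(|k|), which blows up gradients at k = 0,
    so for |k| < eps we use the Taylor expansion around zero curvature instead.
    """
    sqrt_abs_k = k.abs().clamp_min(eps).sqrt()
    pos = torch.tan(sqrt_abs_k * x) / sqrt_abs_k       # k > 0 branch
    neg = torch.tanh(sqrt_abs_k * x) / sqrt_abs_k      # k < 0 branch
    taylor = x + k * x.pow(3) / 3 + 2 * k.pow(2) * x.pow(5) / 15
    exact = torch.where(k > 0, pos, neg)
    return torch.where(k.abs() < eps, taylor, exact)

# Smoothly interpolates between hyperbolic, Euclidean, and spherical geometry:
x = torch.tensor(0.3)
for k in (-1.0, -1e-6, 0.0, 1e-6, 1.0):
    print(k, tan_k(x, torch.tensor(k)).item())
```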
ICML 2020 collaboration graph
As a preview of my future post (next week) about ICML 2020, I want to share a collaboration graph between different organizations. The final graph has 429 nodes (organizations) and 1206 edges (collaborations). Each edge has a weight: the number of papers the two organizations collaborated on. As the final graph is too big to display nicely, you can also look at the subgraph of the organizations that collaborated the most (at least 30 collaborations). I will release a Colab notebook so that you can play with it.
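If you want to build something similar yourself before the notebook is out, here is a minimal sketch (networkx, with made-up toy data; the actual notebook may differ) of how such a weighted collaboration graph and its heavily-connected subgraph can be constructed:

```python
# A minimal sketch (not the actual notebook): build a weighted collaboration
# graph from per-paper affiliation lists and extract the strongest collaborations.
from itertools import combinations
import networkx as nx

# Hypothetical input: one list of organizations per accepted paper.
papers = [
    ["MIT", "Google"],
    ["MIT", "Google", "Stanford"],
    ["DeepMind", "Oxford"],
]

G = nx.Graph()
for orgs in papers:
    for u, v in combinations(sorted(set(orgs)), 2):
        w = G.get_edge_data(u, v, {"weight": 0})["weight"]
        G.add_edge(u, v, weight=w + 1)    # edge weight = number of joint papers

print(G.number_of_nodes(), G.number_of_edges())

# Subgraph of the strongest collaborations (e.g. at least 30 joint papers);
# empty on this toy data, of course.
heavy_edges = [(u, v) for u, v, d in G.edges(data=True) if d["weight"] >= 30]
heavy = G.edge_subgraph(heavy_edges)
print(heavy.number_of_nodes(), heavy.number_of_edges())
```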
Covid Knowledge Graph
A knowledge graph on COVID-19 that integrates various public datasets. This includes relevant publications, case statistics, genes and functions, molecular data, and much more. It is implemented in Neo4j and can be accessed via a browser.
The βUselessβ Perspective That Transformed Mathematics
Matrix algebra is well understood, while group theory, which is used in many proofs in graph theory and other fields, is much harder to study. Representation theory builds a bridge between group theory and linear algebra by assigning a matrix to each element of a group, according to certain rules. This nice article is an introduction to the world of representation theory.
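As a tiny concrete example of "assigning a matrix to each group element according to certain rules": the symmetric group S_3 can be represented by permutation matrices, and the rule is that the matrix of a composition must equal the product of the matrices (a group homomorphism). A quick check:

```python
# Map each permutation g in S_3 to its permutation matrix rho(g) and verify
# the homomorphism rule rho(g o h) = rho(g) @ rho(h) on all pairs.
from itertools import permutations
import numpy as np

def rho(perm):
    """Permutation matrix of `perm`: column i has a 1 in row perm[i]."""
    n = len(perm)
    M = np.zeros((n, n), dtype=int)
    M[np.asarray(perm), np.arange(n)] = 1
    return M

def compose(g, h):
    """(g o h)(i) = g(h(i))."""
    return tuple(g[h[i]] for i in range(len(g)))

S3 = list(permutations(range(3)))
for g in S3:
    for h in S3:
        assert np.array_equal(rho(compose(g, h)), rho(g) @ rho(h))
print("rho is a homomorphism on all", len(S3) ** 2, "pairs")
```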
Fresh picks from ArXiv
This week people are sharing the work they submitted to NeurIPS; there are a lot of interesting papers from top people in GML.
GNN
• From Graph Low-Rank Global Attention to 2-FWL Approximation with Yaron Lipman
• Graph Meta Learning via Local Subgraphs with Marinka Zitnik
• Data Augmentation for Graph Neural Networks
• Towards Deeper Graph Neural Networks with Differentiable Group Normalization
• Learning Graph Models for Template-Free Retrosynthesis with Regina Barzilay
• Wide and Deep Graph Neural Networks with Distributed Online Learning
• Pointer Graph Networks with Petar Veličković
• Optimal Transport Graph Neural Networks with Octavian-Eugen Ganea, Regina Barzilay, Tommi Jaakkola
• Manifold structure in graph embeddings
• On the Bottleneck of Graph Neural Networks and its Practical Implications
Conferences
• Exploring Algorithmic Fairness in Robust Graph Covering Problems (NeurIPS 2019)
Surveys
• A Survey on Generative Adversarial Networks: Variants, Applications, and Training
ICML 2020. Comprehensive analysis of authors, organizations, and countries.
Finally, here is my post with an analysis of ICML 2020. There are several things I learned from it. For example, that the USA participates in 3/4 of the papers. Or that DeepMind accounts for approximately half of all papers from the UK. Or that Google does not collaborate with other companies. Or that, apart from the USA, only China can boast several companies that publish regularly. Or that a Japanese professor published 12 papers. And much more.
The code and data are on GitHub, but the cool part is that you can make your own interactive plots in a Colab notebook (with no installation required), including a collaboration graph between universities and companies.