Graph Machine Learning research groups: Albert-László Barabási
I am doing a series of posts on research groups in graph ML; the previous post is here. The 18th is Albert-László Barabási, who introduced scale-free networks, in particular the Barabási–Albert model, and popularized network science through several books and courses.
Albert-László Barabási (1967)
- Affiliation: Northeastern University
- Education: Ph.D. at Boston University in 1994 (advisor: H. Eugene Stanley)
- h-index 145
- Awards: John von Neumann Medal, C&C Prize, Prima Primissima Award, Cozzarelli Prize, Bolyai Prize
- Interests: network science, network structure, network control
Graph ML Newsletter: NeurIPS 2020
This issue is devoted to the upcoming NeurIPS 2020: general trends and exciting topics in graph ML, including oversmoothing, adversarial attacks, explainability, scalability, and more.
Fresh picks from ArXiv
Today on arXiv, three interesting surveys: an overview of the graph kernel literature from the last 15 years, GNNs for recommendations, and a digest of the Babai and Luks algorithms for the graph isomorphism problem.
If I forgot to mention your paper, please shoot me a message and I will update the post.
Graphs
- Single-Node Attack for Fooling Graph Neural Networks
- Learning with Molecules beyond Graph Neural Networks Workshop NeurIPS 2020
- Massively Parallel Graph Drawing and Representation Learning BigData 2020
- Deoscillated Graph Collaborative Filtering with Philip S. Yu
- On Self-Distilling Graph Neural Network
- Finding Friends and Flipping Frenemies: Automatic Paraphrase Dataset Augmentation Using Graph Theory EMNLP 2020
Survey
- Graph Kernels: State-of-the-Art and Future Challenges with Karsten Borgwardt
- Graph Neural Networks in Recommender Systems: A Survey
- Recent Advances on the Graph Isomorphism Problem
Erdős goes neural
This is a post by Andreas Loukas about their recent work on solving combinatorial problems, accepted to NeurIPS 2020.
There has been great progress in solving combinatorial optimization (CO) problems with neural networks. Unsupervised learning (UL) is a particularly appealing direction, as it aspires to avoid common drawbacks (computational cost, limited generalization) associated with the use of labeled instances. A major challenge in UL is to ensure that the discretized output respects the CO constraints. Researchers have often tackled this with various ad-hoc heuristics that repair or improve upon the fractional solutions.
Our paper proposes a principled approach to use UL GNNs for CO that ensures that constraints will be met without breaking differentiability. Our idea is to utilize the probabilistic method pioneered by Paul Erdős to bridge the gap between the discrete and continuous worlds.
The probabilistic method. To prove the existence of an object with a property π one invents a distribution over possible objects. Showing that P(object has π)>0 is a certificate for the object's existence. For example, we can solve the maximum cut problem by using a fair coin toss to decide in which partition each node is placed. It is easy to show that, on average, half the edges of the graph will be cut, implying the existence of a 1/2-approximation for the maxcut problem.
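To make the coin-toss argument concrete, here is a minimal sketch (not from the paper) that checks E[cut] = |E|/2 on a small random graph by simulation; the graph size and the use of networkx are my own choices.

```python
import random
import networkx as nx

# Probabilistic-method argument for max cut: put each node on either side
# with a fair coin; each edge is cut with probability 1/2, so the expected
# cut size is |E| / 2.
G = nx.erdos_renyi_graph(n=50, p=0.1, seed=0)
m = G.number_of_edges()

def random_cut_size(graph):
    side = {v: random.random() < 0.5 for v in graph.nodes}  # fair coin per node
    return sum(side[u] != side[v] for u, v in graph.edges)

trials = 2000
avg_cut = sum(random_cut_size(G) for _ in range(trials)) / trials
print(f"|E| / 2 = {m / 2:.1f}, average random cut over {trials} trials = {avg_cut:.1f}")
# Since the average is |E|/2, some assignment must cut at least |E|/2 edges.
```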
Erdős goes neural. We utilize the same procedure but learn the node probabilities with a GNN. To achieve this, we analytically construct a differentiable "probabilistic penalty loss" for the discrete problem. As shown, by minimizing the latter, the GNN provides a probabilistic certificate for the existence of a low cost and feasible solution. At inference time, discrete solutions of guaranteed cost are deterministically obtained with the method of conditional expectation.
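Below is a minimal sketch of the conditional-expectation decoding for the max cut example above (not the paper's max-clique or min-cut settings), using E[cut] = sum over edges (u,v) of p_u(1-p_v) + p_v(1-p_u). In the paper the probabilities would come from a trained GNN and the objective would be the problem-specific probabilistic penalty loss; the function names here are my own.

```python
import networkx as nx

def expected_cut(G, p):
    """E[cut] when node u goes to side 1 independently with probability p[u]."""
    return sum(p[u] * (1 - p[v]) + p[v] * (1 - p[u]) for u, v in G.edges)

def decode_by_conditional_expectation(G, p):
    """Fix nodes one by one to 0/1, never letting the expected cut decrease."""
    p = dict(p)
    for v in G.nodes:
        best_val, best_exp = None, -1.0
        for val in (0.0, 1.0):          # try both assignments for v
            p[v] = val
            e = expected_cut(G, p)
            if e > best_exp:
                best_val, best_exp = val, e
        p[v] = best_val                  # keep the better assignment
    return {v: int(p[v]) for v in G.nodes}

G = nx.erdos_renyi_graph(n=30, p=0.2, seed=1)
p0 = {v: 0.5 for v in G.nodes}           # a GNN would output these probabilities
assignment = decode_by_conditional_expectation(G, p0)
cut = sum(assignment[u] != assignment[v] for u, v in G.edges)
print(f"expected cut at start: {expected_cut(G, p0):.1f}, decoded cut: {cut}")
# The decoded cut is guaranteed to be at least the initial expected cut.
```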
In practice. We demonstrated the efficacy of this approach on two NP-hard problems: the maximum clique problem and the constrained min-cut problem. In our experiments, our approach always yields feasible solutions and is competitive with neural baselines, MIP solvers, and heuristics.
Future work. There are subtleties in our framework that need to be carefully addressed. On the theoretical front, it can be challenging to derive a probabilistic penalty loss for certain types of constraints (such as TSP). On the computational front, finding ways to accelerate our decoding module could significantly improve the running time during inference.
TLDR:
* We propose a differentiable way to solve CO problems with unsupervised GNNs.
* In our approach, the neural network is learning a distribution of solutions. Successfully minimizing our loss provides a certificate of existence of low-cost feasible solutions in the learned distribution.
* We obtain competitive results against neural baselines, heuristics, and solvers.
blogpost: https://andreasloukas.blog/2020/10/31/erdos-goes-neural
paper: https://rb.gy/uktdfa
Papers and Scores at ICLR 2021
You can see a ranked list of all papers at ICLR 2021 here (~3000 papers) and a list of only the graph papers here (200+ papers). These scores are before the rebuttal, so they can change in the final ranking.
With a 20% acceptance rate (which is a bit low for ICLR), a paper needs an average score of about 6 to get accepted.
The current median score is 5.25.
Many graph papers are at the top of the list, including the number-one paper.
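If you want to reproduce such statistics yourself, here is a hedged sketch assuming you export the linked sheet to a local CSV with columns title, url, avg_rating, and ratings; the file name and the keyword filter for graph papers are my own.

```python
import pandas as pd

# Hypothetical local export of the linked review-score spreadsheet.
df = pd.read_csv("iclr2021_review_scores.csv")  # columns: title, url, avg_rating, ratings

print("median average score:", df["avg_rating"].median())
print("share of papers with average score >= 6:", (df["avg_rating"] >= 6).mean())

# Rough filter for graph papers by title keywords (my own heuristic, not an official list).
graph_mask = df["title"].str.contains("graph|GNN", case=False, regex=True)
print("graph papers:", int(graph_mask.sum()))
print("median score of graph papers:", df.loc[graph_mask, "avg_rating"].median())
```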
And here are some insights from last year:
(1) Every third paper on graphs is accepted, a clear indication that GML is becoming popular;
(2) On average, scores of [6,6,8] are needed to get accepted; [6,6,6] would be borderline;
(3) The AC can sometimes save a paper even if it got low scores. This is rather good: it means reviewers are not the only ones who decide;
(4) Likewise, the AC can reject a paper even if the reviewers unanimously accept it. I think that happens mostly because the paper does not present enough experimental comparison to SOTA.
EMNLP 2020 stats
Dates: Nov 16-18
Where: Online
Price: $200 ($75 students)
Graph papers can be found at paper digest.
• 3359 submissions (vs 2905 in 2019)
• 754/520 accepted EMNLP/Findings (vs 660 in 2019)
• 22.4% / 20.5% acceptance rate (vs 22.7% in 2019)
• ~104 total graph papers (8% of total)
Workshops at NeurIPS 2020
There are more than 60 workshops at NeurIPS this year. Some relevant ones (with accepted papers already available) are Learning Meets Combinatorial Algorithms (LMCA) on ML + NP-hard problems, and Differential Geometry meets Deep Learning (DiffGeo4DL) on geometry and manifolds.
Fresh picks from ArXiv
Today on arXiv: a new large graph dataset for graph classification, fast clustering, and reconstructing ancient documents with GNNs ⚰️
If I forgot to mention your paper, please shoot me a message and I will update the post.
Graphs
- A Large-Scale Database for Graph Representation Learning
- Using Graph Neural Networks to Reconstruct Ancient Documents
- Learning to Drop: Robust Graph Neural Network via Topological Denoising
- Towards Better Approximation of Graph Crossing Number
- Distill2Vec: Dynamic Graph Representation Learning with Knowledge Distillation
- Node Attribute Completion in Knowledge Graphs with Multi-Relational Propagation
Conferences
- Duality-Induced Regularizer for Tensor Factorization Based Knowledge Graph Completion NeurIPS 2020
- Higher-Order Spectral Clustering of Directed Graphs NeurIPS 2020
- Molecular Mechanics-Driven Graph Neural Network with Multiplex Graph for Molecular Structures Workshop NeurIPS 2020
- IGSQL: Database Schema Interaction Graph Based Neural Model for Context-Dependent Text-to-SQL Generation EMNLP 2020
Combining Label Propagation and Simple Models Out-performs Graph Neural Networks
This paper by Cornell and Facebook made a lot of noise on Twitter recently. In short, it shows that GNNs can be outperformed by simpler models such as MLP + Label Propagation (LP) on several large datasets.
They use LP (actually twice) to propagate labels from training nodes to test nodes. LP has been used successfully for two decades (NIPS 2004 as well as this survey); it just was not directly compared to GNNs. Unfortunately, LP does not use node features, so the authors propose to first run an MLP on the node features and then apply LP to the MLP predictions and to the labels.
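Here is a minimal sketch of the general recipe (base predictions from a feature-only model, then label propagation over the normalized adjacency with training labels clamped); it is not the paper's exact two-stage procedure, and all function and variable names below are my own.

```python
import numpy as np

def normalized_adjacency(A):
    """Symmetrically normalized adjacency D^-1/2 A D^-1/2."""
    d = A.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(np.maximum(d, 1e-12))
    return A * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def label_propagation(A, base_probs, train_idx, train_labels, alpha=0.9, iters=50):
    """Smooth base predictions over the graph, clamping known training labels."""
    S = normalized_adjacency(A)
    n, c = base_probs.shape
    Y = base_probs.copy()
    Y[train_idx] = np.eye(c)[train_labels]       # clamp one-hot training labels
    Y0 = Y.copy()
    for _ in range(iters):
        Y = alpha * S @ Y + (1 - alpha) * Y0     # propagate + pull back to the seed
        Y[train_idx] = np.eye(c)[train_labels]   # re-clamp after each step
    return Y.argmax(axis=1)

# Toy usage: 4-node path graph, 2 classes; base_probs would come from a feature-only model such as an MLP.
A = np.array([[0, 1, 0, 0], [1, 0, 1, 0], [0, 1, 0, 1], [0, 0, 1, 0]], dtype=float)
base = np.full((4, 2), 0.5)                      # stand-in for MLP class probabilities
pred = label_propagation(A, base, train_idx=np.array([0]), train_labels=np.array([1]))
print(pred)
```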
This work only applies to transductive node classification, not to inductive node classification (applying a trained model to new graphs), nor to link prediction or graph classification. But for node classification it shows very good results in terms of both speed and quality.
Another detail is that LP usually works on homophilous graphs, i.e. graphs where nodes with the same labels have a higher chance of being connected. While this assumption is reasonable, not all graphs have this type of connectivity: for example, mail that goes from a person to a post office to an aggregator to the recipient connects nodes of different classes. Petar Veličković talks about this in more detail.
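One simple way to quantify this is the edge homophily ratio, i.e. the fraction of edges whose endpoints share a label; here is a minimal helper (my own, not from the paper):

```python
import numpy as np

def edge_homophily(edge_index, labels):
    """Fraction of edges that connect two nodes with the same label."""
    src, dst = edge_index
    return float(np.mean(labels[src] == labels[dst]))

# Toy example: a triangle of class-0 nodes attached to one class-1 node.
edge_index = np.array([[0, 1, 2, 2],
                       [1, 2, 0, 3]])
labels = np.array([0, 0, 0, 1])
print(edge_homophily(edge_index, labels))  # 0.75 -> fairly homophilous
```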
I must add that it's not the first time we see GNNs being outperformed by simple models on existing graph datasets. A year ago there were many works showing that an MLP works better than a GNN on many graph classification datasets (e.g. this paper). An MLP alone does not work really well on OGB datasets, but MLP + LP does. Hopefully this will lead to more graph datasets and, subsequently, to more insights about which tools are best for graph prediction problems.
Video: Recent Developments of Graph Network Architectures
(video) I already did a post with the slides from a great lecture by Xavier Bresson on the latest GNNs. Here is also a video recording of it.
Weisfeiler and Leman go sparse: Towards scalable higher-order graph embeddings
This is a guest post by Christopher Morris about their recent work accepted to NeurIPS 2020 that deals with higher-order WL algorithms.
Motivation: Since the power of GNNs is upper-bounded by the 1-dimensional Weisfeiler-Leman algorithm (WL) (Xu et al. 2019, Morris et al. 2019), it is natural to design GNNs based on insights from the k-dimensional WL (k-WL), a strictly more powerful heuristic for the graph isomorphism problem. Instead of computing colors or features for single vertices, the k-WL gains power by computing colors for k-tuples defined over the vertex set and defining a suitable adjacency notion between them to perform a message-passing-style update. Hence, it accounts for higher-order interactions between vertices. However, it does not scale and may suffer from overfitting when used in a machine learning setting. Hence, it remains an important open problem to design WL-based graph learning methods that are simultaneously expressive, scalable, and resistant to overfitting.
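For context, here is a minimal sketch of the 1-dimensional WL algorithm (color refinement) that upper-bounds standard GNNs; the paper's local k-WL variants generalize this to k-tuples, which is not shown here.

```python
from collections import Counter

def wl_colors(adj, num_iters=3):
    """1-WL color refinement: repeatedly relabel each node by its color plus the multiset of neighbor colors."""
    colors = {v: 0 for v in adj}                       # start from a uniform coloring
    for _ in range(num_iters):
        signatures = {
            v: (colors[v], tuple(sorted(colors[u] for u in adj[v]))) for v in adj
        }
        # Assign a fresh integer color to each distinct signature.
        palette = {sig: i for i, sig in enumerate(sorted(set(signatures.values())))}
        colors = {v: palette[signatures[v]] for v in adj}
    return colors

def wl_histogram(adj, num_iters=3):
    return Counter(wl_colors(adj, num_iters).values())

# Two graphs 1-WL cannot distinguish: a 6-cycle vs. two triangles (both 2-regular).
cycle6 = {i: [(i - 1) % 6, (i + 1) % 6] for i in range(6)}
two_triangles = {0: [1, 2], 1: [0, 2], 2: [0, 1], 3: [4, 5], 4: [3, 5], 5: [3, 4]}
print(wl_histogram(cycle6) == wl_histogram(two_triangles))  # True, yet the graphs are not isomorphic
```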
Methodological Contribution: In our paper, we propose local variants of the k-WL and corresponding neural architectures, which consider only a subset of the original neighborhood, making them more scalable and less prone to overfitting. Surprisingly, the expressive power of (one of) our algorithms is strictly higher than that of the original algorithm in terms of the ability to distinguish non-isomorphic graphs. We then lift our results to the neural setting and connect our findings to recent learning-theoretic results for GNNs (Garg et al., 2020), showing that our architectures offer better generalization errors.
Empirical results: Our experimental study confirms that the local algorithms, in both their kernel and neural variants, lead to vastly reduced computation times and prevent overfitting. The kernel version establishes a new state of the art for graph classification on a wide range of benchmark datasets, while the neural version shows promising performance on large-scale molecular regression tasks.
Future Challenges: While our new sparse architecture leads to a boost in expressive power over standard GNNs and is less prone to overfitting than dense architectures, it still does not scale to truly large graphs. The main reason is the exponential dependence on k, i.e., the algorithm still considers all n**k tuples. Hence, designing scalable (higher-order) GNNs that can provably capture graph structure is an important future goal.
In general, we believe that moving away from the restrictive graph isomorphism objective and deriving a deeper understanding of our architectures when optimized with stochastic gradient descent are important future goals.
Knowledge Graphs in NLP @ EMNLP 2020
A new digest from Michael Galkin on the applications of knowledge graphs in NLP at the last EMNLP conference. Much bigger models (6.5B parameters), more languages (100 languages for entity linking), more complex tasks (data to text).
Network Repository
A cool interactive repository of about a thousand different graphs. It could be useful if you need graphs with specific properties for specific tasks.
Fresh picks from ArXiv
Today on arXiv: smoothing for link prediction, the design space of GNNs, and scalable networks for heterogeneous graphs ⛰
If I forgot to mention your paper, please shoot me a message and I will update the post.
Conferences
- Node Similarity Preserving Graph Convolutional Networks WSDM 2021
- Design Space for Graph Neural Networks with Jure Leskovec, NeurIPS 2020
Graphs
- VLG-Net: Video-Language Graph Matching Network for Video Grounding
- Scalable Graph Neural Networks for Heterogeneous Graphs
- Graph embeddings via matrix factorization for link prediction: smoothing or truncating negatives?
- Reinforcement Learning of Graph Neural Networks for Service Function Chaining
- Quantum algorithms for learning graphs and beyond
Survey
- Survey and Open Problems in Privacy Preserving Knowledge Graph: Merging, Query, Representation, Completion and Applications
- Subpath Queries on Compressed Graphs: a Survey
How do node features affect the performance of GNNs?
This is an open question that I have been thinking about a bit recently. In particular, what surprised me are the results from a recent paper on Label Propagation on one particular dataset, Rice31 (table below).
You can see that some models achieve 80% accuracy while others achieve 10% (a random guess). The paper says the node features are heterogeneous features such as gender or major, but after speaking with the authors it seems they use spectral embeddings instead.
I have tried this dataset with GNNs, and my results are close to a random guess (10%). I tried several variations of GNNs as well as of node features, but didn't get much higher than 15%. Then I tried GBDT with spectral embeddings, and it gave me about 50% accuracy. I haven't tried LP on this dataset yet, but it would be remarkable if LP with spectral embeddings showed such a drastic difference from GNNs.
This and other experiments led me to think that the message-passing paradigm may be too strong an assumption, i.e. aggregating information simultaneously over all your neighbors may not be a good idea in general. The inductive bias such a model has could be wrong for a particular graph dataset. GNNs work on those graph datasets where the way node labels depend on the graph structure closely matches how message passing works. In other words, if you were to create a dataset where a node's label equals the average label of its neighbors, then a GNN with average aggregation would easily learn such a dependency. But if node labels depend on the structure in some counter-intuitive way (for example, by picking a neighbor at random and assigning its label), then a GNN with average aggregation would fail, as the sketch below illustrates. In other words, GNN models don't have to follow the message-passing paradigm; they can have very different design principles, and that's something I think we will see in the coming years.
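Here is a minimal sketch of this thought experiment (my own construction): the same graph with two labelings, one where a node's label is the majority of its neighbors' attributes and one where it is the attribute of a single random neighbor, both evaluated with a plain majority-over-neighbors predictor.

```python
import numpy as np
import networkx as nx

rng = np.random.default_rng(0)
G = nx.erdos_renyi_graph(n=500, p=0.02, seed=0)
G.remove_nodes_from([v for v in list(G.nodes) if G.degree(v) == 0])  # keep nodes with neighbors
nodes = list(G.nodes)
attr = {v: rng.integers(0, 2) for v in nodes}                  # random binary node attribute

def majority_of_neighbors(v):
    vals = [attr[u] for u in G.neighbors(v)]
    return int(sum(vals) * 2 >= len(vals))

# Labeling 1: label = majority of neighbor attributes (matches mean aggregation).
y_smooth = {v: majority_of_neighbors(v) for v in nodes}
# Labeling 2: label = attribute of one randomly chosen neighbor (hard for mean aggregation).
y_random = {v: attr[rng.choice(list(G.neighbors(v)))] for v in nodes}

pred = {v: majority_of_neighbors(v) for v in nodes}            # a mean-aggregation "GNN layer"
acc = lambda y: np.mean([pred[v] == y[v] for v in nodes])
print(f"accuracy on majority labels: {acc(y_smooth):.2f}")         # 1.00 by construction
print(f"accuracy on random-neighbor labels: {acc(y_random):.2f}")  # noticeably lower
```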
Jraph - A library for graph neural networks in jax.
Jraph is a new library by DeepMind for constructing GNNs with JAX (autograd computation) and Haiku (writing neural network layers). It could be useful if you cannot use PyTorch.
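A minimal sketch of what working with the library looks like, assuming you just want to hold a graph and run one hand-rolled aggregation step; the aggregation below uses jax.ops.segment_sum rather than a built-in jraph module, so check the repo for the proper layers.

```python
import jax
import jax.numpy as jnp
import jraph

# A directed triangle: edges 0->1, 1->2, 2->0, with 2-dimensional node features.
graph = jraph.GraphsTuple(
    nodes=jnp.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]),
    edges=None,
    senders=jnp.array([0, 1, 2]),
    receivers=jnp.array([1, 2, 0]),
    globals=None,
    n_node=jnp.array([3]),
    n_edge=jnp.array([3]),
)

# One hand-rolled aggregation step: each node receives the mean of its in-neighbors' features.
num_nodes = graph.nodes.shape[0]
summed = jax.ops.segment_sum(graph.nodes[graph.senders], graph.receivers, num_nodes)
in_degree = jax.ops.segment_sum(jnp.ones(graph.senders.shape[0]), graph.receivers, num_nodes)
aggregated = summed / jnp.maximum(in_degree, 1.0)[:, None]
print(aggregated)
```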
Graph Machine Learning research groups: Jimeng Sun
I am doing a series of posts on research groups in graph ML; the previous post is here. The 19th is Jimeng Sun, the head of SunLab at UIUC, who teaches the courses Big Data Analytics and Healthcare as well as Computing and Society.
Jimeng Sun (~1981)
- Affiliation: University of Illinois Urbana-Champaign
- Education: Ph.D. at CMU in 2002 (advisor: Christos Faloutsos)
- h-index 66
- Awards: KDD, ICDM, SDM best paper awards
- Interests: drug discovery, GNNs, graph mining
Planarity game
If you need some time to procrastinate and want to do it with graphs, here is a fun game to play called Tronix2: you just need to make the drawn graphs planar. There are several clones of this game (here and here), which even explain how to generate planar graphs. And here is a Numberphile video about planar graphs.
Golden Knowledge Graph
Golden is a Silicon Valley startup building a knowledge database (similar to Wikipedia): a good example of how knowledge graphs can be commercialized.