Graph Machine Learning
Everything about graph theory, computer science, machine learning, etc.


If you have something worth sharing with the community, reach out @gimmeblues, @chaitjo.

Admins: Sergey Ivanov; Michael Galkin; Chaitanya K. Joshi
Papers and Scores at ICLR 2021

You can see a ranked list of all papers at ICLR 2021 here (~3000 papers) and a list of only the graph papers here (200+ papers). These scores are before rebuttal, so they can change in the final ranking.

With a 20% acceptance rate (which is a bit low for ICLR), a paper needs an average score of about 6 to get accepted.
The current median score is 5.25.

Many graph papers are at the top of the list, including the top-ranked paper.

And here are some insights from last year:
(1) Every third paper on graphs is accepted, a clear indication that GML is becoming popular;
(2) On average, scores of [6,6,8] are needed to get accepted; [6,6,6] would be borderline;
(3) An AC can sometimes save a paper even if it got low scores. This is rather good, meaning that reviewers are not the only ones who decide;
(4) Likewise, an AC can reject a paper even if the reviewers unanimously vote to accept. I think that happens mostly because the paper does not present enough experimental comparison to SOTA.
EMNLP 2020 stats

Dates: Nov 16-18
Where: Online
Price: $200 ($75 students)

Graph papers can be found at paper digest.

• 3359 submissions (vs 2905 in 2019)
• 754/520 accepted EMNLP/Findings (vs 660 in 2019)
• 22.4% / 20.5% acceptance rate (vs 22.7% in 2019)
• ~104 total graph papers (8% of total)
Workshops at NeurIPS 2020

There are more than 60 workshops at NeurIPS this year. Some relevant ones (with accepted papers already available) are Learning Meets Combinatorial Algorithms (LMCA) on ML + NP-hard problems, and Differential Geometry meets Deep Learning (DiffGeo4DL) on geometry and manifolds.
Combining Label Propagation and Simple Models Out-performs Graph Neural Networks

This paper by Cornell and Facebook made a lot of noise on Twitter recently. In short, it shows that GNNs can be outperformed by simpler models such as MLP + Label Propagation (LP) on several large datasets.

They use LP (actually twice) to propagate labels from training nodes to test nodes. LP has been used successfully for two decades (NIPS 2004 as well as this survey); it just was not directly compared to GNNs. Unfortunately, LP does not use node features, so the authors propose to first apply an MLP to the node features and then run LP on the MLP predictions and on the labels.
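
For intuition, here is a minimal sketch of the basic label propagation step (my own toy implementation, not the authors' code; the choice of alpha and the number of iterations are illustrative):

```python
import numpy as np

def label_propagation(A, Y_seed, train_mask, alpha=0.9, num_iters=50):
    """Minimal label propagation sketch.

    A          -- (n, n) adjacency matrix
    Y_seed     -- (n, c) one-hot labels; rows of test nodes are zero
    train_mask -- (n,) boolean mask of training nodes
    """
    # Symmetrically normalized adjacency D^{-1/2} A D^{-1/2}
    deg = np.maximum(A.sum(axis=1), 1e-12)
    d = 1.0 / np.sqrt(deg)
    S = A * d[:, None] * d[None, :]

    Y = Y_seed.astype(float).copy()
    for _ in range(num_iters):
        Y = alpha * (S @ Y) + (1 - alpha) * Y_seed  # smooth over the graph
        Y[train_mask] = Y_seed[train_mask]          # clamp the known labels
    return Y.argmax(axis=1)
```

Roughly, the paper applies this kind of propagation twice: once to the residual errors of the MLP (to correct its predictions) and once to the corrected predictions themselves (to smooth them).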

This work applies only to transductive node classification, not to inductive node classification (applying a trained model to new graphs), link prediction, or graph classification. But for node classification it shows pretty good results in terms of both speed and quality.

Another detail is that LP usually works on homophilous graphs, i.e. graphs where nodes with the same label have a higher chance of being connected. While this assumption is reasonable, not all graphs have this type of connectivity: for example, mail that goes from a person to a post office to an aggregator to the recipient connects nodes of different classes. Petar Veličković talks about this in more detail.
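
To make "homophilous" concrete, one common proxy is the edge homophily ratio, the fraction of edges whose endpoints share a label (the helper below is my own, not from the paper):

```python
import numpy as np

def edge_homophily(edges, labels):
    """Fraction of edges that connect nodes with the same label."""
    edges = np.asarray(edges)
    return float((labels[edges[:, 0]] == labels[edges[:, 1]]).mean())

# Values close to 1 indicate a homophilous graph, where LP tends to work well;
# low values indicate heterophily, as in the mail-routing example above.
```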

I must add that it's not the first time we see simple models outperform GNNs on existing graph datasets. A year ago there were many works showing that an MLP works better than a GNN on many graph classification datasets (e.g. this paper). MLPs alone don't work really well on OGB datasets, but MLP + LP does. Hopefully this will lead to more graph datasets and subsequently to more insights about which tools are best for graph prediction problems.
Weisfeiler and Leman go sparse: Towards scalable higher-order graph embeddings

This is a guest post by Christopher Morris about their recent work accepted to NeurIPS 2020 that deals with higher-order WL algorithms.

Motivation: Since the power of GNNs is upper-bounded by the 1-dimensional Weisfeiler-Leman algorithm (WL) (Xu et al. 2019, Morris et al. 2019), it is natural to design GNNs based on insights from the k-dimensional WL (k-WL), which is a strictly more powerful heuristic for the graph isomorphism problem. Instead of computing colors or features for single vertices, the k-WL gets more powerful by computing colors for k-tuples defined over the vertex set and defining a suitable notion of adjacency between them to do a message-passing-style update. Hence, it accounts for higher-order interactions between vertices. However, it does not scale and may suffer from overfitting when used in a machine learning setting. Hence, it remains an important open problem to design WL-based graph learning methods that are simultaneously expressive, scalable, and not prone to overfitting.
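
For reference, here is a minimal sketch of the 1-dimensional WL (color refinement) mentioned above; the k-WL performs the analogous update over k-tuples of vertices instead of single vertices (this is my own illustration, not code from the paper):

```python
from collections import Counter

def wl_colors(adj, num_iters=3):
    """1-WL / color refinement on an adjacency list {node: [neighbors]}."""
    colors = {v: 0 for v in adj}  # start from a uniform coloring
    for _ in range(num_iters):
        # New signature = own color plus the multiset of neighbor colors
        sig = {v: (colors[v], tuple(sorted(colors[u] for u in adj[v]))) for v in adj}
        # Relabel signatures with small integers
        palette = {s: i for i, s in enumerate(sorted(set(sig.values())))}
        colors = {v: palette[sig[v]] for v in adj}
    return colors

# The multiset of color-class sizes is an isomorphism invariant: if it differs
# between two graphs, they cannot be isomorphic (the converse does not hold).
triangle = {0: [1, 2], 1: [0, 2], 2: [0, 1]}
path = {0: [1], 1: [0, 2], 2: [1]}
print(Counter(wl_colors(triangle).values()))  # one color class of size 3
print(Counter(wl_colors(path).values()))      # two classes of sizes 2 and 1
```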

Methodological Contribution: In our paper, we propose local variants of the k-WL and corresponding neural architectures, which consider only a subset of the original neighborhood, making them more scalable and less prone to overfitting. Surprisingly, the expressive power of (one of) our algorithms is strictly higher than that of the original algorithm in terms of the ability to distinguish non-isomorphic graphs. We then lift our results to the neural setting and connect our findings to recent learning-theoretic results for GNNs (Garg et al., 2020), showing that our architectures offer lower generalization errors.

Empirical results: Our experimental study confirms that the local algorithms, both the kernel and the neural architectures, lead to vastly reduced computation times and prevent overfitting. The kernel version establishes a new state of the art for graph classification on a wide range of benchmark datasets, while the neural version shows promising performance on large-scale molecular regression tasks.

Future Challenges: While our new sparse architecture leads to a boost in expressive power over standard GNNs and is less prone to overfitting than dense architectures, it still does not scale to truly large graphs. The main reason for this is the exponential dependence on k, i.e., the algorithm still considers all n^k tuples. Hence, designing scalable (higher-order) GNNs that can provably capture graph structure is an important future goal.

In general, we believe that moving away from the restrictive graph isomorphism objective and deriving a deeper understanding of our architectures, when optimized with stochastic gradient descent, are important future goals.
Knowledge Graphs in NLP @ EMNLP 2020

A new digest from Michael Galkin on the applications of knowledge graphs in NLP at the latest EMNLP conference. Much bigger models (6.5B parameters), more languages (entity linking in 100 languages), more complex tasks (data-to-text).
How do node features affect the performance of GNNs?

This is an open question that I have been thinking about a bit recently. In particular, what surprised me are the results from a recent paper on Label Propagation on a particular dataset, Rice31 (table below).

You can see that some models achieve 80% accuracy, while others get 10% (random guess). In the paper they say that the node features are heterogeneous features such as gender or major, but after speaking with the authors it seems they use spectral embeddings instead.

I have tried this dataset with GNNs and my results are close to random guess (10%). I tried several variations of GNNs as well as node features, but didn't get much higher than 15%. Then I tried GBDT with spectral embeddings and it gave me about 50% accuracy. I haven't tried LP on this dataset yet, but it would be remarkable to see LP with spectral embeddings make such a drastic difference compared to GNNs.

This and other experiments led me to think that the paradigm of message passing is too strong, i.e. aggregating information simultaneously over all your neighbors may not be a good idea in general. The inductive bias of such a model could be wrong for a particular graph dataset. GNNs work on those graph datasets where the way node labels depend on the graph structure closely matches how message passing works. In other words, if you were to create a dataset where a node's label equals the average label of its neighbors, then a GNN with average aggregation would easily learn such a dependency. But if your node labels depend on the structure in some counter-intuitive way (for example, by picking a neighbor at random and assigning its label), then your GNN with average aggregation would fail. In other words, GNN models don't have to follow the message-passing paradigm; they can have very different design principles, and that's something I think we will see in the coming years.
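
A tiny synthetic illustration of this point (the setup and numbers below are my own toy example, not from any paper): one labeling rule matches mean aggregation and is recovered exactly, the other deliberately breaks it.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1000
# Random undirected graph where each node has ~10 neighbors on average
A = (rng.random((n, n)) < 10 / n).astype(float)
A = np.maximum(A, A.T)
np.fill_diagonal(A, 0)

feat = rng.normal(size=n)          # one scalar feature per node
deg = np.maximum(A.sum(1), 1)
neigh_mean = A @ feat / deg        # what a mean-aggregation layer computes

# Rule 1: label = sign of the neighborhood mean of the feature
y_smooth = np.sign(neigh_mean)

# Rule 2: label = sign of the feature of ONE randomly chosen neighbor
pick = np.array([rng.choice(np.flatnonzero(A[i])) if A[i].any() else i
                 for i in range(n)])
y_random = np.sign(feat[pick])

pred = np.sign(neigh_mean)
print("accuracy on smooth labels:", (pred == y_smooth).mean())            # 1.0 by construction
print("accuracy on random-neighbor labels:", (pred == y_random).mean())   # much lower
```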
Jraph - A library for graph neural networks in JAX.

Jraph is a new library by DeepMind for constructing GNNs in JAX (for autograd computation) and Haiku (for writing neural network layers). It could be useful if you cannot use PyTorch.
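
A minimal sketch of the core data structure, the GraphsTuple container that Jraph models operate on (the field values here are purely illustrative):

```python
import jax.numpy as jnp
import jraph

# One graph with 3 nodes and 2 directed edges: 0 -> 1 and 1 -> 2
graph = jraph.GraphsTuple(
    nodes=jnp.ones((3, 4)),      # 3 nodes with 4-dimensional features
    edges=jnp.ones((2, 2)),      # 2 edges with 2-dimensional features
    senders=jnp.array([0, 1]),
    receivers=jnp.array([1, 2]),
    n_node=jnp.array([3]),
    n_edge=jnp.array([2]),
    globals=jnp.ones((1, 1)),    # one global feature vector for the graph
)
```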
Graph Machine Learning research groups: Jimeng Sun

I do a series of posts on the groups in graph research; the previous post is here. The 19th is Jimeng Sun, head of SunLab at UIUC, who teaches the courses Big Data Analytics and Healthcare as well as Computing and Society.

Jimeng Sun (~1981)
- Affiliation: University of Illinois Urbana-Champaign
- Education: Ph.D. at CMU in 2002 (advisor: Christos Faloutsos)
- h-index 66
- Awards: KDD, ICDM, SDM best paper awards
- Interests: drug discovery, GNNs, graph mining
Planarity game

If you need some time to procrastinate and want to do it with graphs, here is a fun game to play, called Tronix2. You just need to make the drawn graphs planar. There are several clones of this game (here and here), which even explain how to generate planar graphs. And here is a Numberphile video about planar graphs.
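
If you prefer to check planarity programmatically rather than by dragging vertices, networkx has a built-in planarity test (a quick sketch):

```python
import networkx as nx

K4 = nx.complete_graph(4)  # planar
K5 = nx.complete_graph(5)  # the classic non-planar graph

for name, G in [("K4", K4), ("K5", K5)]:
    is_planar, _ = nx.check_planarity(G)
    print(name, "planar:", is_planar)
```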
Golden Knowledge Graph

Golden is a Silicon Valley startup building a knowledge database (similar to Wikipedia), a good example of how knowledge graphs can be commercialized.
Undergraduate Math Student Pushes Frontier of Graph Theory

A new article at Quanta Magazine about a 21-year-old who improved the results of Erdős and Szekeres on the upper bound for two-color Ramsey numbers. Informally, Ramsey numbers capture "how big graphs can get before patterns inevitably emerge". This is in addition to the recent proof for lower bounds, also covered in Quanta.
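
As a concrete example, R(3,3) = 6: every 2-coloring of the edges of K6 contains a monochromatic triangle, while K5 admits a coloring that avoids one. A brute-force check (my own toy script, feasible only for such tiny cases):

```python
from itertools import combinations, product

def has_mono_triangle(n, coloring):
    """coloring maps each edge (i, j) with i < j to 0 or 1."""
    return any(coloring[(a, b)] == coloring[(a, c)] == coloring[(b, c)]
               for a, b, c in combinations(range(n), 3))

def every_coloring_has_triangle(n):
    edges = list(combinations(range(n), 2))
    return all(has_mono_triangle(n, dict(zip(edges, colors)))
               for colors in product((0, 1), repeat=len(edges)))

print(every_coloring_has_triangle(5))  # False: e.g. the 5-cycle coloring avoids one
print(every_coloring_has_triangle(6))  # True, so R(3,3) = 6
```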
ICLR 2021 Graph Papers

Here is a list of graph papers with their final scores. This is in addition to the list for all the papers. Overall, 74 papers (out of 208 graph papers) increased their scores and 18 decreased.
Open Access Theses and Dissertations

If you are seeking inspiration for your dissertation or want to check the latest monolithic works in the graph community, take a look at the OATD portal. Here is, for example, a search for all dissertations that have "graph" in their title, resulting in ~400 PhD and ~100 MSc theses just in the 2016-2020 period.