Graph Machine Learning
6.7K subscribers
53 photos
11 files
808 links
Everything about graph theory, computer science, machine learning, etc.


If you have something worth sharing with the community, reach out @gimmeblues, @chaitjo.

Admins: Sergey Ivanov; Michael Galkin; Chaitanya K. Joshi
Download Telegram
NeurIPS 2019 stats

6743 number of submissions
1428 accepted
21% acceptance rate
75 graph papers (5% of accepted)
Ringel’s conjecture is proved.

Ringel's conjecture states that every complete graph with 2n+1 nodes can be decomposed into a set of any identical non-overlapping trees of order n. In other words, take any tree with n nodes, place it on the complete graph with 2n+1 nodes, remove the edges your tree covers, and continue with the remaining graph. No matter which tree you have started with, there is a procedure to remove all the edges in a complete graph by placing your tree step by step.

This conjecture was known for 60 years and finally has been proved last month. At last this article makes a good job explaining how it was done.
Reinforcement Learning for Combinatorial Optimization: A Survey

Our new submission to IJCAI survey track. We surveyed all of the literature we found on applying RL methods for combinatorial optimization problems (e.g. TSP, Knapsack, MaxCut).

There are three types of the RL approaches we categorized the papers: Value-based, Policy-based, and Monte-Carlo Tree Search based. This is one of the domains that appeared very recently, a few years ago, and has an increasing number of successful applications to traditional problems each year. I would say it's a good topic for a fresh Ph.D. student to start working on.
Recent approaches to Graph Convolutional Networks, Graph Representation Learning and Reinforcement Learning

Surprisingly discovered a local workshop on GML with strong list of keynote speakers. Free of charge 🤫

https://gcn-grl-rl.sciencesconf.org/
ML and NLP Publications in 2019

One of the reasons why I have this channel is to encourage people do more research in general. And looking at the recent analysis of the number of accepted papers by countries you can understand why.

USA has almost as much publications last year as all other countries combined.

In particular, it's 2.5x > China, 6x > UK, and 100x > Russia.

The reasons are obvious, USA spends more budget on research from both government and industry, but also the culture of doing research is different: making a publication at top conferences is considered a big deal and a lot of efforts is put on this, — which can definitely adopted by other countries.
Understanding the Representation Power of Graph Neural Networks in Learning Graph Topology

This is NeurIPS 2019 work by the group of Albert-László Barabási, — the one who invented a famous Barabási–Albert graph model.

In this work, they study if GNN can compute the function that approximates graph moments, i.e. powers of adjacency matrix. The key result is that if GNN has number of layers more than the power of the matrix then it can learn a corresponding graph moment. For those who followed previous post, which describes a paper What graph neural networks cannot learn: depth vs width by Loukas, will see that Barabási's paper is a precursor and a partial result of Loukas' paper that states the condition on GNN for computability of any function (not just graph moments).
Transformers are Graph Neural Networks

It seems today everyone is talking about this blog post. It describes how NLP's Transformers are just part of a more general concept we call Graph Neural Networks.

In Transformer, we have an encoder that for every word gives an embedding. The idea is very simple: let's the embedding of word A be a weighted average of embeddings of other words in a sentence. That's it. That's what everyone calls attention. Attention = weighted average. The weights are trained together with embeddings, you also add normalization tricks, positional encodings, MLP, and other deep learning mumbles, but this is like in any other model. What's novel is that you use weighted average, instead of aggregation word by word sequentially.

In GNN we also have an encoder, this encoder just takes a graph in and outputs embeddings of nodes. The most popular way in GNN to do this is through message-passing mechanism, the idea of which is also simple: the embedding of node A is a function of embeddings of its neighbors. Two changes from Transformer:
1) We can have any function, instead of weighted average.
2) We aggregate over the neighbors, instead of all nodes.

If you take weighted average function, i.e. attention, then you have Graph Attention Network. Surprise.

You could also aggregate the nodes beyond your neighbors and I believe there are works that do this, but common assumption is that graph connections are already constructed in such a way that node is most influenced by the incoming edges and not by some distant nodes.

So Transformers are not GNNs, at most Transformers are Graph Attention Networks. This is important if we want to draw some conclusions for NLP from GNN area. For example, Graph Attention Network is less powerful than many other GNNs such as GIN. It also does not have function approximation guarantees that I described in previous posts.

But sure the connection between NLP and GML exists and is not fully explored. I bet you can leverage graphs when you analyze text or you can use GNN for better encoding. GML on the other hand benefited from NLP in many ways already: usually we first see some results in NLP (RNN, Transformers, BERT) and then see its adaptation in GML. So maybe it's the moment we'll see more contributions to NLP with graphs.
Channel photo updated
Comparing Stars: On Approximating Graph Edit Distance

Good old VLDB'09 paper on the computation of graph edit distance.

* It shows the exact distance computation is NP-hard;
* It discusses connections with graph matching;
* And it provides lower and upper bound algorithms.
Fresh picks from ArXiv
This week is full of CVPR and AISTATS 20 accepted papers, new surveys, more submissions to ICML and KDD, and new GNN models 📚

CVPR 20
* Unbiased Scene Graph Generation from Biased Training
* Social-STGCNN: A Social Spatio-Temporal Graph Convolutional Neural Network for Human Trajectory Prediction
* 4D Association Graph for Realtime Multi-person Motion Capture Using Multiple Video Cameras
* Representations, Metrics and Statistics For Shape Analysis of Elastic Graphs
* Say As You Wish: Fine-grained Control of Image Caption Generation with Abstract Scene Graphs
* Fine-grained Video-Text Retrieval with Hierarchical Graph Reasoning
* SketchGCN: Semantic Sketch Segmentation with Graph Convolutional Networks

Survey
* Bridging the Gap between Spatial and Spectral Domains: A Survey on Graph Neural Networks
* Graph Neural Networks Meet Neural-Symbolic Computing: A Survey and Perspective
* Adversarial Attacks and Defenses on Graphs: A Review and Empirical Study
* Knowledge Graphs on the Web -- an Overview

GNN
* Infinitely Wide Graph Convolutional Networks: Semi-supervised Learning via Gaussian Processes
* Can graph neural networks count substructures? by group of Joan Bruna
* Heterogeneous Graph Neural Networks for Malicious Account Detection by group of Le Song

AISTATS 20
* Permutation Invariant Graph Generation via Score-Based Generative Modeling

KDD 20
* PM2.5-GNN: A Domain Knowledge Enhanced Graph Neural Network For PM2.5 Forecasting

ICML 20
* Semi-supervised Anomaly Detection on Attributed Graphs
* Inverse Graphics GAN: Learning to Generate 3D Shapes from Unstructured 2D Data
* Permutohedral-GCN: Graph Convolutional Networks with Global Attention
* Benchmarking Graph Neural Networks by group of Yoshua Bengio


Graph Theory
* Finding large matchings in 1-planar graphs of minimum degree 3
* Trapping problem on star-type graphs with applications
* On Fast Computation of Directed Graph Laplacian Pseudo-Inverse
Bridging the Gap between Spatial and Spectral Domains: A Survey on Graph Neural Networks

A new short survey on GNNs, submission to IJCAI survey track.

* It formulates popular GNNs in the same context, i.e. Z = f(G)X, where Z is matrix of embeddings, f(G) is GNN function, and X is matrix of features. This helps to compare different GNNs by purely looking at f(G) function.
* It provides examples for different local aggregation schemes, order of neighborhoods, and directions on the edges.
* It discusses common GNNs from spectral perspective.
Benchmarking Graph Neural Networks

This is a recent submission to ICML that compares existing GNN on multiple data sets. There was already an accepted paper at ICLR 2020 so the two works largely overlap (and I guess that would decrease the chances for ICML submissions to be accepted).

Good points:
* The authors propose 6 new data sets with tens of thousands graphs.
* They provide learned lessons:
1. GNNs are better than MLP for their proposed data sets.
2. Adding weights for each edge helps.
3. Residual connections and normalization helps.

Bad points:
* The data sets are still small in terms of the size of the graphs, ~100 of nodes.
* No metrics beyond accuracy.
AISTATS 2020 stats

1490 number of submissions
423 accepted papers
28% acceptance rate
15 graph papers (4% of accepted)
​​My thoughts on Kelly's conjecture.

A Kelly's conjecture (a.k.a. reconstruction conjecture) states that for a graph with n vertices, a deck with n graphs on (n-1) vertices can reconstruct exactly a graph.
In other words, pick a graph, remove one vertex (with all its outgoing edges), and put to the set. Then start with the original graph, remove another vertex, and put it to the set. Repeat for all vertices. Finally, you will have a set of n graphs with n-1 vertices, which is called a deck.
Now, the conjecture states that deck uniquely determines the original graph, hence, it's a complete graph invariant, i.e. two graphs have the same deck if and only if, they are isomorphic.

This conjecture has been known since 1957 and many results have been achieved. For example, it's known that regular graphs or trees can be reconstructed uniquely with a deck, but directed graphs are not. For general undirected case it's still not clear. There is even an ArXiv paper from 2017 that claimed to resolve the conjecture, but after communication with the authors, they admit there is a mistake in the proof.

This idea has been in graph representation too. In particular, there is a graphlet kernel that decomposes a graph into a set of small graphlets/motifs. This is quite expressive kernel, i.e. it can recognize a lot of graphs, but whether it recognizes all of the graphs is still a question.

For a long time, I believed that if Kelly's conjecture is true (which is quite likely), then not only a deck of graphs on (n-1) vertices is unique, but also of smaller graphs. The assumption was that you can in turn decompose a (n-1) graph into a set of (n-2) deck and move deeper to any number of vertices that you want.

But yesterday I found that there are non-isomorphic graphs that have the same decomposition on smaller graphs. In particular, if you take a cycle of 4k+2 vertices as G1 and two cycles of 2k+1 as G2, then a decomposition into k-size graphlets will be the same.

For example, for k=3, we have graphlets, i.e. triplets which are connected in all possible ways. In this case, G1 is a cycle on 10 vertices and G2 is two cycles on 5 vertices. The number of closed triangles would be 0 for G1 and G2, the number of 2 connected points and one disconnected would be 60 for G1 and G2, and so on. In the end, the decomposition would be the same, while the graphs are different.
Moreover, in recent paper IJCAI '18 it's shown that graphlet kernel cannot identify some planar from non-planar, bijective from non-bijective, and connected from disconnected graphs. This is quite interesting as it shows the limitations of graphlet kernel and in general graphlet decomposition.