Graph Machine Learning
Everything about graph theory, computer science, machine learning, etc.


If you have something worth sharing with the community, reach out to @gimmeblues or @chaitjo.

Admins: Sergey Ivanov; Michael Galkin; Chaitanya K. Joshi
What should be the order of authors in your ML paper?

The more you write papers, the more you ask questions like this.

On some occasions, it's a bit more subtle than you expect. For example, one person did the experiments and another did all the theory. Who should go first? It's not clear.

So I asked a few experienced professors and here are the insights:

* The first author is the one who did the most for the paper. If there is more than one, mark them with a corresponding equal-contribution sign.
* The last places are reserved for supervisors.
* The middle authors are sorted by contribution.

But no one has any precise formula for computing contribution. So I proposed one.

Check it out in my latest blog post and clap if you like it 👏
WebConf 2020 stats (April 20-24, Taipei)

1129 submissions
217 accepted
19% acceptance rate
~30% graph papers
Network Science Institute at Northeastern University
networkscienceinstitute.org

With the director Albert-László Barabási, the focus is on biological networks, epidemiology, and network formation.
They also have a YouTube channel with guest presentations on graph theory.
Fresh picks from ArXiv
ICML and KDD 20 submissions, AISTATS 20, Graph Isomorphism, and Review


ICML 20 submissions
Graph Convolutional Gaussian Processes For Link Prediction
When Labelled Data Hurts: Deep Semi-Supervised Classification with the Graph 1-Laplacian
Improving Graph Neural Network Representations of Logical Formulae with Subgraph Pooling
Differentiable Graph Module (DGM) for Graph Convolutional Network by the group of Michael Bronstein
Deep Multi-Task Augmented Feature Learning via Hierarchical Graph Neural Network
Graph Universal Adversarial Attacks: A Few Bad Actors Ruin Graph Learning Models
Towards Similarity Graphs Constructed by Deep Reinforcement Learning by the Yandex team
Connectivity-driven Communication in Multi-agent Reinforcement Learning through Diffusion Processes on Graphs
Explainable Deep Modeling of Tabular Data using TableGraphNet
Graph Filtration Learning
Graph Prolongation Convolutional Networks
Deep Coordination Graphs
Unifying Graph Convolutional Neural Networks and Label Propagation by the group of Jure Leskovec

KDD 20 submissions
Disease State Prediction From Single-Cell Data Using Graph Attention Networks
Entity Context and Relational Paths for Knowledge Graph Completion by the group of Jure Leskovec

Theory
Generalization and Representational Limits of Graph Neural Networks by the group of Tommi Jaakkola

Graph Isomorphism
A polynomial time parallel algorithm for graph isomorphism using a quasipolynomial number of processors
Isomorphism for Random k-Uniform Hypergraphs

Review
Hypergraphs: an introduction and review
Do Deep Graph Neural Networks exist?

One of the open questions in the GNN literature is whether deep GNNs, i.e. GNNs with many layers (e.g. more than 10), are useful.

There is a theoretical paper, What graph neural networks cannot learn: depth vs width, which proves that the number of layers d times the embedding size w of each layer must be at least proportional to the number of nodes n in the graph for a GNN to compute many Turing-computable functions. So if a graph has 10K nodes, then d*w must be on the order of 10K. For example, a common embedding size w is 128 or 256, which means the number of layers should be around 40.
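A back-of-the-envelope sketch of this bound (the proportionality constant is an assumption, set to 1 purely to reproduce the numbers above):

```python
import math

# rough depth-times-width requirement d * w >= c * n from the depth-vs-width argument;
# the constant c is assumed to be 1 just for illustration
def min_depth(n_nodes: int, width: int, c: float = 1.0) -> int:
    return math.ceil(c * n_nodes / width)

print(min_depth(10_000, 256))  # -> 40 layers for a 10K-node graph at width 256
print(min_depth(10_000, 128))  # -> 79 layers at width 128
```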

There is a cost associated with each layer: each node has to look at every neighbor and aggregate its information. So most implementations use up to 5 layers, for the obvious reason that deeper models become very time-consuming to compute.

Somewhat to the contrary, another theoretical paper, Graph Neural Networks Exponentially Lose Expressive Power for Node Classification, shows that under certain conditions on the graph, a GNN will essentially carry only degree information for each node, which is the most local property a node can have. This does not contradict the previous paper because (1) this paper works in the limit of many layers, (2) the previous paper says that if d*w is below the bound then there exists an instance of a graph on which the GNN fails, which does not mean the result is universal for all graphs, and (3) this paper requires certain conditions that hold only for a narrow family of graphs.

Beyond this, there is the question of double descent and whether it occurs in the GNN setting, which is yet another open question.

So, my response is that for now we still have little understanding of whether deep GNNs are useful and, if so, how to make them efficient in practice.
NeurIPS 2019 stats

6743 submissions
1428 accepted
21% acceptance rate
75 graph papers (5% of accepted)
Ringel’s conjecture is proved.

Ringel's conjecture states that every complete graph on 2n+1 nodes can be decomposed into 2n+1 non-overlapping copies of any given tree with n edges. In other words, take any tree with n edges, place it on the complete graph with 2n+1 nodes, remove the edges your tree covers, and continue with the remaining graph. No matter which tree you started with, there is a procedure that removes all the edges of the complete graph by placing your tree step by step.
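A quick sanity check of the counting behind this (a minimal networkx sketch; it only verifies the edge count and does not construct the decomposition itself):

```python
import networkx as nx

n = 5
K = nx.complete_graph(2 * n + 1)                 # complete graph on 11 nodes
# K_{2n+1} has (2n+1)*n edges, i.e. room for exactly 2n+1 copies of a tree with n edges
assert K.number_of_edges() == (2 * n + 1) * n    # 55 = 11 * 5
```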

This conjecture stood open for 60 years and was finally proved last month. This article does a good job of explaining how it was done.
Reinforcement Learning for Combinatorial Optimization: A Survey

Our new submission to the IJCAI survey track. We surveyed all of the literature we could find on applying RL methods to combinatorial optimization problems (e.g. TSP, Knapsack, MaxCut).

We categorized the papers into three types of RL approaches: value-based, policy-based, and Monte Carlo Tree Search based. This domain appeared only a few years ago and has an increasing number of successful applications to traditional problems each year. I would say it's a good topic for a fresh Ph.D. student to start working on.
Recent approaches to Graph Convolutional Networks, Graph Representation Learning and Reinforcement Learning

Surprisingly, I discovered a local workshop on GML with a strong list of keynote speakers. Free of charge 🤫

https://gcn-grl-rl.sciencesconf.org/
ML and NLP Publications in 2019

One of the reasons I run this channel is to encourage people to do more research in general. Looking at the recent analysis of the number of accepted papers by country, you can understand why.

The USA had almost as many publications last year as all other countries combined.

In particular, that's 2.5x more than China, 6x more than the UK, and 100x more than Russia.

The reasons are obvious: the USA spends more on research, from both government and industry, but the culture of doing research is also different: publishing at top conferences is considered a big deal and a lot of effort is put into it, which other countries could definitely adopt.
Understanding the Representation Power of Graph Neural Networks in Learning Graph Topology

This is a NeurIPS 2019 work by the group of Albert-László Barabási, the one who co-invented the famous Barabási–Albert graph model.

In this work, they study whether a GNN can compute functions that approximate graph moments, i.e. powers of the adjacency matrix. The key result is that if a GNN has more layers than the power of the matrix, then it can learn the corresponding graph moment. Those who followed the previous post, which describes the paper What graph neural networks cannot learn: depth vs width by Loukas, will see that Barabási's paper is a precursor to, and a partial result of, Loukas' paper, which states the conditions on a GNN for computability of any function (not just graph moments).
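A minimal numpy sketch of a graph moment in this sense (a power of the adjacency matrix, as described above; the paper's exact normalization may differ):

```python
import numpy as np

A = np.array([[0, 1, 1],
              [1, 0, 1],
              [1, 1, 0]])                 # adjacency matrix of a triangle

def graph_moment(A, k):
    # the (i, j) entry of A^k counts walks of length k from node i to node j
    return np.linalg.matrix_power(A, k)

# the key result above: a GNN needs at least k message-passing layers to learn
# the k-th moment, since each layer propagates information one hop further
for k in range(1, 4):
    print(f"A^{k} =\n{graph_moment(A, k)}")
```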
Transformers are Graph Neural Networks

It seems today everyone is talking about this blog post. It describes how NLP's Transformers are just part of a more general concept we call Graph Neural Networks.

In a Transformer, we have an encoder that gives an embedding for every word. The idea is very simple: let the embedding of word A be a weighted average of the embeddings of the other words in the sentence. That's it. That's what everyone calls attention. Attention = weighted average. The weights are trained together with the embeddings; you also add normalization tricks, positional encodings, MLPs, and other deep learning bells and whistles, but that is like in any other model. What's novel is that you use a weighted average instead of aggregating word by word sequentially.

In a GNN we also have an encoder: it takes a graph in and outputs embeddings of the nodes. The most popular way to do this in GNNs is through the message-passing mechanism, the idea of which is also simple: the embedding of node A is a function of the embeddings of its neighbors. Two changes from the Transformer:
1) We can have any function, instead of a weighted average.
2) We aggregate over the neighbors, instead of all nodes.

If you take the weighted average function, i.e. attention, then you have a Graph Attention Network. Surprise.
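To make the analogy concrete, here is a minimal numpy sketch (toy code, not any library's actual layer): the same masked weighted average gives Transformer-style self-attention when the mask is the complete graph, and a GAT-style layer when the mask is the adjacency matrix.

```python
import numpy as np

def attention_layer(X, A, seed=0):
    # X: (n, d) word/node embeddings, A: (n, n) 0/1 mask of who attends to whom
    n, d = X.shape
    rng = np.random.default_rng(seed)
    Wq, Wk, Wv = (rng.standard_normal((d, d)) / np.sqrt(d) for _ in range(3))  # toy, untrained projections
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(d)                    # pairwise compatibility
    scores = np.where(A > 0, scores, -1e9)           # mask out non-neighbors
    weights = np.exp(scores - scores.max(axis=1, keepdims=True))
    weights /= weights.sum(axis=1, keepdims=True)    # softmax -> attention weights
    return weights @ V                               # weighted average of attended values

n, d = 5, 8
X = np.random.default_rng(1).standard_normal((n, d))
full = np.ones((n, n))                               # Transformer: every token attends to every token
path = np.eye(n) + np.eye(n, k=1) + np.eye(n, k=-1)  # GAT-style: attend only to graph neighbors (path graph + self-loops)
transformer_out = attention_layer(X, full)
gat_like_out = attention_layer(X, path)
```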

You could also aggregate over nodes beyond your neighbors, and I believe there are works that do this, but the common assumption is that the graph connections are already constructed in such a way that a node is most influenced by its incoming edges and not by some distant nodes.

So Transformers are not GNNs in general; at most, Transformers are Graph Attention Networks. This is important if we want to draw conclusions for NLP from the GNN area. For example, the Graph Attention Network is less powerful than many other GNNs such as GIN. It also does not have the function approximation guarantees that I described in previous posts.

But the connection between NLP and GML certainly exists and is not fully explored. I bet you can leverage graphs when you analyze text, or use GNNs for better encoding. GML, on the other hand, has already benefited from NLP in many ways: usually we first see some results in NLP (RNNs, Transformers, BERT) and then see their adaptation in GML. So maybe this is the moment we'll see more contributions to NLP with graphs.
Comparing Stars: On Approximating Graph Edit Distance

Good old VLDB'09 paper on the computation of graph edit distance.

* It shows the exact distance computation is NP-hard;
* It discusses connections with graph matching;
* And it provides lower and upper bound algorithms.
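Not the paper's star-based bounds, just a reminder of why bounds matter: exact GED search (here via networkx) is only feasible on tiny graphs, which is exactly what the NP-hardness result predicts.

```python
import networkx as nx

G1 = nx.cycle_graph(4)   # a square
G2 = nx.path_graph(4)    # a path on 4 nodes
# exact, exponential-time search; fine for 4 nodes, hopeless for large graphs
print(nx.graph_edit_distance(G1, G2))
# for anything bigger you would fall back on lower/upper-bound approximations
```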
Fresh picks from ArXiv
This week is full of CVPR and AISTATS 20 accepted papers, new surveys, more submissions to ICML and KDD, and new GNN models 📚

CVPR 20
* Unbiased Scene Graph Generation from Biased Training
* Social-STGCNN: A Social Spatio-Temporal Graph Convolutional Neural Network for Human Trajectory Prediction
* 4D Association Graph for Realtime Multi-person Motion Capture Using Multiple Video Cameras
* Representations, Metrics and Statistics For Shape Analysis of Elastic Graphs
* Say As You Wish: Fine-grained Control of Image Caption Generation with Abstract Scene Graphs
* Fine-grained Video-Text Retrieval with Hierarchical Graph Reasoning
* SketchGCN: Semantic Sketch Segmentation with Graph Convolutional Networks

Survey
* Bridging the Gap between Spatial and Spectral Domains: A Survey on Graph Neural Networks
* Graph Neural Networks Meet Neural-Symbolic Computing: A Survey and Perspective
* Adversarial Attacks and Defenses on Graphs: A Review and Empirical Study
* Knowledge Graphs on the Web -- an Overview

GNN
* Infinitely Wide Graph Convolutional Networks: Semi-supervised Learning via Gaussian Processes
* Can graph neural networks count substructures? by the group of Joan Bruna
* Heterogeneous Graph Neural Networks for Malicious Account Detection by the group of Le Song

AISTATS 20
* Permutation Invariant Graph Generation via Score-Based Generative Modeling

KDD 20
* PM2.5-GNN: A Domain Knowledge Enhanced Graph Neural Network For PM2.5 Forecasting

ICML 20
* Semi-supervised Anomaly Detection on Attributed Graphs
* Inverse Graphics GAN: Learning to Generate 3D Shapes from Unstructured 2D Data
* Permutohedral-GCN: Graph Convolutional Networks with Global Attention
* Benchmarking Graph Neural Networks by the group of Yoshua Bengio


Graph Theory
* Finding large matchings in 1-planar graphs of minimum degree 3
* Trapping problem on star-type graphs with applications
* On Fast Computation of Directed Graph Laplacian Pseudo-Inverse