Graph Machine Learning
Everything about graph theory, computer science, machine learning, etc.


If you have something worth sharing with the community, reach out @gimmeblues, @chaitjo.

Admins: Sergey Ivanov; Michael Galkin; Chaitanya K. Joshi
Optimal transport: a hidden gem that empowers today’s machine learning

A very simple explanation of what the optimal transport problem is and how it can be applied to various domains such as computer vision. Interestingly, just yesterday there was a paper on an optimal transport GNN.
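For the curious, here is a tiny numpy sketch of the entropic-regularized version solved by Sinkhorn iterations (a toy illustration under assumed distributions and costs, not the solver from the post; for real use see the POT library, https://pythonot.github.io):

import numpy as np

a = np.ones(4) / 4                 # source distribution
b = np.ones(3) / 3                 # target distribution
C = np.random.rand(4, 3)           # cost of moving mass from i to j
K = np.exp(-C / 0.1)               # Gibbs kernel, regularization eps = 0.1

u = np.ones(4)
for _ in range(100):               # alternate scalings until marginals match
    v = b / (K.T @ u)
    u = a / (K @ v)

P = np.diag(u) @ K @ np.diag(v)    # transport plan with marginals ~a and ~b
print(P.sum(axis=1), P.sum(axis=0))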
June ArXiv: how many graph papers?

From 18 March to 17 April there were 282 new and 98 updated papers in the ArXiv CS section. This is 18 papers fewer than in the previous period.
Graph Machine Learning research groups: Tommi Jaakkola

I am doing a series of posts on research groups in graph ML; the previous post is here. The eighth one is Tommi Jaakkola. He has 7 papers at the upcoming ICML 2020. His recent interests include molecular graph design, and he maintains an AI initiative for finding promising antiviral molecules against COVID-19.


Tommi Jaakkola (~1971)
- Affiliation: MIT
- Education: Ph.D. at MIT in 1997 (supervised by Michael Jordan);
- h-index: 76;
- Awards: Sloan research fellowship, AAAI Fellow;
- Interests: molecular generation, GNN models
DeepSnap

There is a new release of DeepSnap by the Stanford group. I have not tested it, but it should allow applying graph algorithms from networkx to pytorch-geometric graphs.
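A minimal sketch of what that round trip might look like, based on the DeepSnap docs (untested, like the library itself here; the node_feature attribute name is an assumption about what the wrapper expects):

import networkx as nx
import torch
from deepsnap.graph import Graph

nx_g = nx.karate_club_graph()
for v in nx_g.nodes:                           # DeepSnap stores tensor attributes
    nx_g.nodes[v]["node_feature"] = torch.ones(4)

graph = Graph(nx_g)                            # networkx-backed, PyG-compatible graph
print(graph.num_nodes, graph.edge_index.shape)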
Fresh picks from ArXiv
This week's highlights include applications of GNNs to molecules, contagion, NLP, recommender systems, and more.

GNN
Generalizing Graph Neural Networks Beyond Homophily
Finding Patient Zero: Learning Contagion Source with Graph Neural Networks with Albert-László Barabási
MoFlow: An Invertible Flow Model for Generating Molecular Graphs
Quantifying Challenges in the Application of Graph Representation Learning
Neural Architecture Optimization with Graph VAE
Interactive Recommender System via Knowledge Graph-enhanced Reinforcement Learning
Subgraph Neural Networks with Marinka Zitnik
Temporal Graph Networks for Deep Learning on Dynamic Graphs with Michael Bronstein
Erdos Goes Neural: an Unsupervised Learning Framework for Combinatorial Optimization on Graphs with Andreas Loukas
Walk Message Passing Neural Networks and Second-Order Graph Neural Networks
Isometric Graph Neural Networks
Modeling Graph Structure via Relative Position for Better Text Generation from Knowledge Graphs
Improving Graph Neural Network Expressivity via Subgraph Isomorphism Counting with Michael Bronstein

Math
Local limit theorems for subgraph counts
Longest and shortest cycles in random planar graphs


Conferences
How to Count Triangles, without Seeing the Whole Graph (KDD 2020)
GCC: Graph Contrastive Coding for Graph Neural Network Pre-Training (KDD 2020)

Surveys
Localized Spectral Graph Filter Frames: A Unifying Framework, Survey of Design Considerations, and Numerical Comparison
Spektral

Spektral is a library for building GNNs in TensorFlow 2 and Keras. The new version includes:

- a unified message-passing interface based on gather-scatter (sketched below)
- 7 new GNN layers
- Huge performance improvements
- Improved utils, docs, and examples

The paper will be presented at the GRL workshop.
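As a toy illustration of the gather-scatter pattern such message-passing interfaces are built on (plain TensorFlow, not Spektral's actual internals):

import tensorflow as tf

x = tf.random.normal((5, 8))                  # node features [num_nodes, channels]
edge_index = tf.constant([[0, 1, 2, 3],       # source node of each edge
                          [1, 2, 3, 4]])      # target node of each edge

messages = tf.gather(x, edge_index[0])        # gather: one message per edge
aggregated = tf.math.unsorted_segment_sum(    # scatter: sum messages at each target
    messages, edge_index[1], num_segments=tf.shape(x)[0])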
Criteo papers at ICML 2020

Criteo, where I work, has a record number of accepted papers at ICML this year. We have 9 papers on various topics, from online learning to optimization theory to GANs. That makes us the 1st company in the EU and a top-7 company worldwide (among the 134 companies that have accepted papers). So I wrote a short description of each paper in a new blog post.
Top number of submissions at NeurIPS 2020

The most prolific authors are the following:

* Peter Richtárik (KAUST) 14
* Bernhard Schölkopf (MPI) 13
* Sergey Levine (UC Berkeley) 12
* Masashi Sugiyama (RIKEN) 11
* Yoshua Bengio (MILA) 11

This is based on 2313 arXiv papers that were submitted to NeurIPS 2020.

Last year there were people with 15 submissions, so these numbers are probably underestimates. Also, last year 54% of the papers had appeared on arXiv by the time of the conference; today only 25% of this year's submissions are on arXiv, which means not everyone has posted their papers to arXiv yet.
Bringing Light Into the Dark: A Large-scale Evaluation of Knowledge Graph Embedding Models Under a Unified Framework

This is a post by Michael Galkin (@gimmeblues) about their new work on comprehensive evaluation of knowledge graph embeddings. A lot of interesting insights about knowledge graphs.

Today we are publishing the results of our large-scale benchmarking study of knowledge graph (KG) embedding approaches. Further, we are releasing the code of PyKEEN 1.0 - the library behind the study (in PyTorch)! What makes KGs special: they often have hundreds or thousands of different relations (edge types), and having good representations is essential for reasoning in embedding spaces as well as for numerous NLP tasks.

We often evaluate KG embeddings on the link prediction task: given subject+predicate, the model has to predict the most plausible objects. As typical KGs contain 50k-100k different entities, you can guess the top-1/top-10 ranking task is quite complex!
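To make the ranking setup concrete, here is a toy numpy illustration (hypothetical scores, not PyKEEN code): the model scores every candidate object for a (subject, predicate) query, and we check where the true object lands.

import numpy as np

num_entities = 50_000
scores = np.random.randn(num_entities)           # model's score for every object
true_object = 123                                # index of the ground-truth object
rank = 1 + np.sum(scores > scores[true_object])  # how many candidates beat it
print("hits@10:", rank <= 10)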

Why benchmarking is important: currently, there are no baseline numbers to refer to. Lots of papers in the domain are not reproducible, or the authors simply take metric values as reported in other papers without reproducing the results.

In this study, we ran 65K+ experiments and spent 21K+ GPU hours evaluating 19 models, spanning from RESCAL (first published in 2011) to the late-2019 RotatE and TuckER, 5 loss functions, training strategies with/without negative sampling, and many more hyper-parameters that turn out to be important to consider.

Key findings:
- Careful hyper-parameter optimization brings new SOTA results, with significant gains of 4-5% compared to the results reported in the respective papers (btw, we used Optuna for HPO);
- Properly tuned classical models (TransE, DistMult) are still good and actually outperform several newer models;
- There is no best-of-the-best, silver-bullet model that beats all others across all tasks: some models better capture transitivity, whereas others better capture symmetric relations;
- Surprisingly, for this inherently ranking task, the ranking loss (MarginRankingLoss in PyTorch) is suboptimal. Instead, cross-entropy and its variations show better results;
- Using all entities for negative sampling, i.e., a sigmoid/softmax distribution over all entities, works well but can be quite expensive on large KGs. Stochastic negative sampling is the way to go then;
- Computationally expensive, bigger models do not yield drastically better performance. In fact, 64-d RotatE is better than most 500-d models.


Paper: https://arxiv.org/abs/2006.13365
Code: https://github.com/pykeen/pykeen
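If you want to try the library from the code link above, here is a hedged sketch of PyKEEN 1.0's pipeline entry point (argument spellings are assumptions from the README; check the docs for exact options):

from pykeen.pipeline import pipeline

result = pipeline(
    model="TransE",          # one of the 19 benchmarked models
    dataset="FB15k-237",     # a standard link-prediction benchmark
    loss="crossentropy",     # the study found CE variants work well
)
result.save_to_directory("transe_fb15k237")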
Manually-curated List of Combinatorial Conferences

Mostly mathematical, with occasional CS venues, here is a manually curated list of upcoming conferences, workshops, and symposia on combinatorics, among which you can find graph-related topics.
UAI 2020 stats

UAI is a small but strong conference on AI.

Dates: 3-6 Aug
Where: Online
Cost: $125
Papers available online.

• 580 submissions (vs 450 in 2019)
• 140 accepted (vs 118 in 2019)
• 24.1% acceptance rate (vs 26% in 2019)
• 5 graph papers (4% of total)
Open Problems - Graph Theory and Combinatorics

In addition to the Open Problem Garden, there is a list of open problems in graph theory and a corresponding old archive. Sometimes the proof of one of these is just a specific graph, which even people without a background in the field may find.
Graph Machine Learning research groups: Stephan Günnemann

I am doing a series of posts on research groups in graph ML; the previous post is here. The ninth one is Stephan Günnemann. His research interests include adversarial attacks on graphs and graph generative models.


Stephan Günnemann (~1984)
- Affiliation: Technical University of Munich
- Education: Ph.D. at RWTH Aachen University in 2012 (supervised by Thomas Seidl);
- h-index: 30;
- Awards: best paper at KDD;
- Interests: graph adversarial attacks; clustering; graph generative models