Graph Machine Learning
Everything about graph theory, computer science, machine learning, etc.


If you have something worth sharing with the community, reach out @gimmeblues, @chaitjo.

Admins: Sergey Ivanov; Michael Galkin; Chaitanya K. Joshi
Graphs and Networks Workshop

There is a one-day free online workshop for those who love network science, happening this Friday, July 10.
ICML GRL Workshop Papers

There are 73 interesting short papers on various GML topics at the ICML GRL workshop.
Easiest unsolved case of reconstruction conjecture

We spent some time thinking about the reconstruction conjecture and came to the conclusion that no one has yet solved the following simple case.

You have a graph composed of two parts, X and Y. X has n vertices, while Y has only 2. The vertices of Y are connected to X in such a way that every vertex in X has degree 5 and both vertices in Y have degree 4.

So the final graph has only two distinct degrees, 4 and 5. It's known that a regular graph can be reconstructed. Here only 2 vertices have a different degree, and nonetheless the problem becomes harder. In the more general case you have 2 vertices of degree a and n vertices of degree a+1, but that seems much harder to reason about.
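If you want to play with the conjecture yourself, here is a minimal sketch (using networkx; the function names are illustrative and the construction of the graph above is left out) of building the "deck" of vertex-deleted subgraphs and comparing two decks up to isomorphism — the conjecture says the deck determines the graph:

```python
import networkx as nx

def deck(G):
    """Return the deck of G: the list of all vertex-deleted subgraphs."""
    cards = []
    for v in list(G.nodes):
        card = G.copy()
        card.remove_node(v)
        cards.append(card)
    return cards

def same_deck(G, H):
    """Check whether G and H have the same deck, matching cards up to
    isomorphism (expensive in general; fine for small graphs)."""
    if G.number_of_nodes() != H.number_of_nodes():
        return False
    remaining = deck(H)
    for card in deck(G):
        match = next((i for i, other in enumerate(remaining)
                      if nx.is_isomorphic(card, other)), None)
        if match is None:
            return False
        remaining.pop(match)
    return True
```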
Knowledge Graphs at ACL 2020

Another brilliant post by Michael Galkin on the use of knowledge graphs in NLP at ACL 2020.

"Knowledge graphs demonstrate better capabilities to reveal higher-order interdependencies in otherwise unstructured data."

Content:
1. Question Answering over Structured Data
2. KG Embeddings: Hyperbolic and Hyper-relational
3. Data-to-text NLG: Prepare your Transformer
4. Conversational AI: Improving Goal-Oriented Bots
5. Information Extraction: OpenIE and Link Prediction
Beyond Weisfeiler-Lehman: approximate isomorphisms and metric embeddings

The last post in a series by M. Bronstein about how we can reformulate the framework for designing GNNs. The most popular approach currently is to show that a GNN is equivalent to the WL algorithm, which in turn implies that it can detect isomorphism for most graphs. However, this is not very valuable in practice, as exactly isomorphic graphs are rarely encountered in real data. Instead, it would be great to have a GNN that preserves some kind of distance between graphs (e.g. edit distance) in Euclidean space, up to some error. This can be seen as a generalization of the current framework, which only cares about the case when two graphs are isomorphic.
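For context, here is a minimal sketch of the 1-WL color refinement test that these expressiveness results refer to — it gives a necessary but not sufficient condition for isomorphism:

```python
import networkx as nx

def wl_colors(G, iterations=3):
    """1-WL color refinement: start from node degrees, then repeatedly
    hash each node's color together with the sorted multiset of its
    neighbors' colors."""
    colors = {v: str(G.degree(v)) for v in G}
    for _ in range(iterations):
        colors = {v: str(hash((colors[v],
                               tuple(sorted(colors[u] for u in G[v])))))
                  for v in G}
    return sorted(colors.values())

def maybe_isomorphic(G, H, iterations=3):
    """If the color histograms differ, G and H are certainly not
    isomorphic; if they agree, the test is inconclusive."""
    return wl_colors(G, iterations) == wl_colors(H, iterations)
```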
Graph Clustering with Graph Neural Networks

This is a guest post by Anton Tsitsulin (@xgfsru) about their new work called “Graph Clustering with Graph Neural Networks”.

We propose a new perspective on unsupervised training of Graph Neural Networks (GNNs) through the lens of graph clustering. Attributed graph clustering is a popular task that poses a unique challenge: we have to balance the information we get from the graph itself and from node attributes (features). Can we create an end-to-end GNN model that automatically balances graph and feature structure?

Intuitively, graph clustering with graph neural networks should be connected to graph pooling: clusters of nodes that are close in terms of their features can be safely merged together. Not all pooling methods are useful for clustering, however, as some of them (e.g. top-k pooling) do not collapse nodes together. The state of the art there is MinCut pooling, which appeared at ICML'20.

We present DMoN, a GNN loss function that directly optimizes graph modularity in its spectral formulation. Modularity is known to perform better than cut-based metrics on real-world graphs; thanks to that, we obtain on average 30% better label correlation than the best other neural method, and 45% better graph clustering scores across 7 datasets.
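To make the spectral formulation concrete, here is a minimal PyTorch sketch of a soft modularity objective. This only illustrates the idea; the exact loss in the paper differs (e.g. it adds regularization against degenerate assignments):

```python
import torch

def soft_modularity_loss(adj, C):
    """Negative soft modularity Q = (1/2m) * tr(C^T B C), where
    B = A - d d^T / 2m is the modularity matrix.
    adj: dense (n, n) adjacency matrix; C: (n, k) soft cluster
    assignments whose rows sum to 1 (e.g. a softmax over GNN outputs)."""
    d = adj.sum(dim=1, keepdim=True)        # node degrees, shape (n, 1)
    two_m = d.sum()                         # 2 * number of edges
    B = adj - d @ d.t() / two_m             # modularity matrix
    Q = torch.trace(C.t() @ B @ C) / two_m  # soft modularity
    return -Q                               # minimize the negative
```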

To better investigate the performance of GNNs for clustering, we also propose a set of synthetic benchmarks. For example, in the attached image we show how the performance of different methods changes with varying signal strength in either graph or node features. We hope that this methodology will allow more thorough evaluation of different models in the future.

TL;DR:
- Graph pooling is very similar to graph clustering, so it's a good idea to evaluate pooling methods on clustering tasks.
- We show how to do clustering in an end-to-end way with GNNs via spectral modularity optimization.
- DMoN offers substantial (30-50% across different metrics) quality increases on 7 datasets.

Paper: https://arxiv.org/abs/2006.16904
Code will be available here soon.
Graph of words

GPT-3 is the new SOTA for text generation, and we thought it would be cool if someone produced a graph of words from it. I think no one has done it yet, and it could bring some insights into how language is structured. For example, we could do graph clustering and see if we obtain the same clusters as in pure text; or find rare combinations of words; or discover how such a graph differs from a graph built from n-grams. Potentially, this could be a hypergraph with multiple nodes connected at once, corresponding to entire phrases.
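As a possible starting point, here is a minimal sketch (names are illustrative) of building a weighted word co-occurrence graph from generated text with networkx; its clusters could then be compared against clusters obtained from the raw text:

```python
import networkx as nx

def word_graph(sentences, window=2):
    """Build a word co-occurrence graph: nodes are words; an edge
    connects two words appearing within `window` tokens of each other,
    weighted by co-occurrence count. `sentences` is an iterable of
    token lists (e.g. tokenized GPT-3 samples)."""
    G = nx.Graph()
    for tokens in sentences:
        for i, w in enumerate(tokens):
            for u in tokens[i + 1 : i + 1 + window]:
                if w == u:
                    continue
                prev = G.get_edge_data(w, u, default={"weight": 0})
                G.add_edge(w, u, weight=prev["weight"] + 1)
    return G
```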
Graph Machine Learning research groups: Jian Tang

I'm doing a series of posts on research groups in graph ML; the previous post is here. The tenth is Jian Tang, a young professor at Mila with 4 papers at ICML and 2 at ICLR in 2020, who is looking for graduate students (in case you want to pursue a PhD).


Jian Tang (~1987)
- Affiliation: Mila-Quebec AI Institute and HEC Montreal
- Education: Ph.D. at Peking University, China, in 2014 (supervised by Ming Zhang);
- h-index: 19;
- Awards: Amazon faculty, best paper at KDD workshop;
- Interests: graph molecular design; graph generative models; knowledge graphs
How Gödel’s Proof Works

A down-to-earth explanation by QuantaMagazine of how Gödel proved his famous incompleteness theorems. To many, these results are a profound illustration that mathematics is doomed never to capture all truths: there are true statements that we will never be able to prove. One example of such an "unprovable" problem is the Continuum Hypothesis — the statement that there is no infinite set whose size lies strictly between that of the integers (countable) and that of the reals (uncountable) — which can be neither proved nor disproved from the standard axioms of set theory.

In some sense, Gödel's results killed the dream of a complete and consistent mathematics and, luckily, paved the way for the emergence of computer science. There is a very good comic (I'm not a fan of comics, but this one is about mathematics and short) called Logicomix that explains well the events at the start of the 20th century in mathematics, highlighting how great thinkers such as Russell, Cantor, Boole, Wittgenstein, Hilbert, Frege, Poincaré, Gödel, and Turing approached the creation of new foundations for math and eventually failed.
KDD 2020 stats

KDD is one of the biggest and most prestigious conferences in GML.

Dates: Aug 23-27
Where: Online
Cost: $250

• 1279 research track / 756 applied data science track submissions (vs 1200 research-track submissions in 2019)
• 217 / 121 accepted (vs 170 in 2019)
• 16.9% / 16.0% acceptance rate (vs 14.2% in 2019)
• 70 / 14 graph papers (32% / 11% of total)
Do we need deep graph neural networks?

A very frequent research question is discussed in a new blog post by Michael Bronstein. The problem with deep GNNs is called over-smoothing: as more layers are added, a GNN tends to produce embeddings that become identical across all nodes.

This problem was highlighted in an AAAI'20 paper and has since received a lot of attention (I'd say this is the 2nd most popular theoretical question about GNNs after expressiveness), with different methods proposed to tackle over-smoothing. It seems that if the graph/node labels depend on high-order information such as graphlets, then depth is necessary; however, encountering such datasets in real life may not be common.
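Here is a toy sketch of the effect, using only mean aggregation (the propagation step of a GCN, without weights or nonlinearities) to isolate the smoothing operator: on a connected graph, repeated neighborhood averaging collapses all node features to the same vector.

```python
import torch

def smooth(adj, x, layers):
    """Apply `layers` rounds of mean aggregation over neighbors."""
    P = adj / adj.sum(dim=1, keepdim=True)  # row-normalized adjacency
    for _ in range(layers):
        x = P @ x
    return x

# 4-node path graph with self-loops
adj = torch.tensor([[1., 1., 0., 0.],
                    [1., 1., 1., 0.],
                    [0., 1., 1., 1.],
                    [0., 0., 1., 1.]])
x = torch.randn(4, 2)
print(smooth(adj, x, 1).std(dim=0))   # features still spread out
print(smooth(adj, x, 50).std(dim=0))  # near zero: embeddings collapsed
```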
ECCV 2020 stats

ECCV is among the best conferences in computer vision.

Dates: Aug 23-28
Where: Online
Cost: £150
Link to papers

• 5025 submissions (vs 2439 in 2018)
• 1361 accepted (vs 776 in 2018)
• 27.1% acceptance rate (vs 31.8% in 2018)
• 4 graph papers
IJCAI 2020 stats

IJCAI moved its dates to Jan 2021.

Dates: Jan 2021
Where: Japan/Online
Link to papers

• 4717 submissions (vs 4752 in 2019)
• 592 accepted (vs 850 in 2019)
• 12.6% acceptance rate (vs 17.9% in 2019)
• 55 graph papers
Opening-Remarks-GRL+.pdf
6.6 MB
Opening slides from GRL+ workshop (ICML 20) by Petar Veličković.
Trends in GML

I think the GRL+ workshop is really cool: it gathers people in GML and discusses ideas that are not yet fully developed but soon will be. It's like peeking into a crystal ball. Petar Veličković, one of the organizers of the workshop, outlined the following trends:

- Emerging work on performance / scalability (e.g. SIGN, Weisfeiler & Leman go sparse)
- KG embeddings are as strong as ever (e.g. neural multi-hop reasoning, MPQE, Stay Positive, UniKER)
- Proposals of many datasets/benchmarks/libraries (Wiki-CS, TUDataset, Spektral, Graphein, Geo2DR, Geoopt)
- Work on computational chemistry (with applications to drug design/repurposing), such as the Retrosynthesis paper (which won the best paper award)
- Applications of GRL to algorithmic reasoning (e.g. Neural Bipartite Matching, planning with neuro-algorithmic policies, and PGNs)

But the obvious standout, not only in the papers but also in most of our invited talks, is the explicit consideration of structure.