Graph Machine Learning
Everything about graph theory, computer science, machine learning, etc.


If you have something worth sharing with the community, reach out to @gimmeblues or @chaitjo.

Admins: Sergey Ivanov; Michael Galkin; Chaitanya K. Joshi
Open Problems - Graph Theory and Combinatorics

In addition to the Open Problem Garden, there is a list of open problems in graph theory and a corresponding old archive. Sometimes the proof of one of these is just a specific graph that even people without a background in the field may find.
Graph Machine Learning research groups: Stephan Günnemann

I do a series of posts on the groups in graph research; the previous post is here. The ninth is Stephan Günnemann. His research interests include adversarial attacks on graphs and graph generative models.


Stephan Günnemann (~1984)
- Affiliation: Technical University of Munich
- Education: Ph.D. at RWTH Aachen University in 2012 (supervised by Thomas Seidl);
- h-index: 30;
- Awards: best paper at KDD;
- Interests: graph adversarial attacks; clustering; graph generative models
Graphs with the same degree distribution

Degree distribution plays a key role in distinguishing graphs. In network science there are models that generate a graph according to a given distribution of degrees. For example, scale-free networks are those with a power-law degree distribution, which we observe in the real world (e.g. social networks). Scale-free models use a preferential attachment mechanism that mimics the way people connect in a new society: newcomers tend to connect to people with high degree and to people they already know. The Barabási–Albert model is the most famous example of such a model.
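
To get a feel for the mechanism, here is a minimal sketch using networkx; the graph size and the number of edges per new node are arbitrary illustrative choices.

import networkx as nx

# Grow a scale-free graph via preferential attachment (Barabási–Albert):
# each new node brings m edges, preferring existing high-degree nodes.
g = nx.barabasi_albert_graph(n=10_000, m=3, seed=0)
degrees = sorted((d for _, d in g.degree()), reverse=True)
print(degrees[:5])                   # a handful of large hubs...
print(sum(degrees) / len(degrees))   # ...despite a small average degree (about 6)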

In some cases it is interesting to explicitly provide the degrees that you expect to see in a graph and then generate a graph with this sequence of degrees. There is a model for that too: the Chung-Lu model, which matches the given degrees in expectation. Yet in other cases you want to generate a graph with exactly some degree sequence. This is quite simple: you connect pairs of vertices one by one until you realize the desired degree sequence (this is essentially the Havel-Hakimi algorithm). It is remarkable how many different graphs with the same degree sequence there actually are. Here is an explanation of this.
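
To make the distinction concrete, a short sketch with networkx: Chung-Lu matches a degree sequence only in expectation, while the configuration model and Havel-Hakimi realize it exactly. The sequence below is an arbitrary graphical example.

import networkx as nx

degrees = [5, 5, 5, 5, 4, 4, 3, 3]  # an arbitrary graphical degree sequence

# Chung-Lu: the given degrees are matched only in expectation.
g_cl = nx.expected_degree_graph(degrees, selfloops=False)

# Configuration model: exact degrees, but may create multi-edges and self-loops.
g_cm = nx.configuration_model(degrees)

# Havel-Hakimi: a simple graph realizing the sequence exactly, if one exists.
assert nx.is_graphical(degrees)
g_hh = nx.havel_hakimi_graph(degrees)
print(sorted((d for _, d in g_hh.degree()), reverse=True))  # [5, 5, 5, 5, 4, 4, 3, 3]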
Beyond Weisfeiler-Lehman: using substructures for provably expressive graph neural networks

This is the second post by Michael Bronstein, where he discusses his recent GNN architecture. In one sentence: they append counts of graph substructures, such as the number of 4-cliques, to the message-passing mechanism and show that this is theoretically as expressive as k-WL, at a fraction of its cost.

For more than 6 months I have wondered why we try to design GNNs that can solve graph isomorphism (GI), if in the best case we only match already known GI algorithms. What if we just took the automorphism group of a graph and appended this information to the GNN, hoping it helps on downstream tasks? This way GI is handled by default via the automorphism group, and we simply measure the effectiveness of the GNN on the tasks that matter.
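
As a toy illustration of the substructure idea (a hedged sketch, not the architecture from the post; triangles stand in for the richer counts such as 4-cliques), one can precompute per-node counts and concatenate them to the input features before message passing:

import networkx as nx
import numpy as np

def with_substructure_features(g: nx.Graph, x: np.ndarray) -> np.ndarray:
    """Concatenate each node's triangle count to its feature vector."""
    tri = nx.triangles(g)  # {node: number of triangles containing it}
    counts = np.array([[tri[v]] for v in g.nodes()], dtype=np.float32)
    return np.concatenate([x, counts], axis=1)  # shape (n, d + 1)

g = nx.karate_club_graph()
x = np.eye(g.number_of_nodes(), dtype=np.float32)  # placeholder one-hot features
print(with_substructure_features(g, x).shape)      # (34, 35)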
Graphs and Networks Workshop

There is a one-day free online workshop for those who love network science, happening this Friday, July 10.
ICML GRL Workshop Papers

There are 73 interesting short papers on various GML topics at the ICML GRL workshop.
Easiest unsolved case of reconstruction conjecture

We spent some time thinking about the reconstruction conjecture and came to the conclusion that no one has yet solved the following simple case.

You have a graph composed of two parts, X and Y. X has n vertices, while Y has only 2 vertices. Vertices in Y are connected to X so that all vertices in X have degree 5 and all vertices in Y have degree 4.

So the final graph has only two degree values, 4 and 5. It's known that a regular graph can be reconstructed. Here only 2 vertices have a different degree, yet the problem becomes harder. In the more general case you have 2 vertices of degree a and n vertices of degree a+1, but that seems to be much harder to reason about.
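
For concreteness, a small sketch of the setup: realize such a degree sequence with networkx (this matches the degrees, though not necessarily the exact X/Y wiring described above; n is an arbitrary choice) and form the deck of vertex-deleted subgraphs that the conjecture claims determines the graph.

import networkx as nx

n = 8
seq = [5] * n + [4] * 2      # n vertices of degree 5, two of degree 4
assert nx.is_graphical(seq)
g = nx.havel_hakimi_graph(seq)

deck = []                    # the "deck": one card per deleted vertex
for v in list(g.nodes()):
    card = g.copy()
    card.remove_node(v)
    deck.append(card)
print(len(deck))             # n + 2 cards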
Knowledge Graphs at ACL 2020

Another brilliant post by Michael Galkin on the use of knowledge graphs in NLP at ACL 2020.

"Knowledge graphs demonstrate better capabilities to reveal higher-order interdependencies in otherwise unstructured data."

Content:
1. Question Answering over Structured Data
2. KG Embeddings: Hyperbolic and Hyper-relational
3. Data-to-text NLG: Prepare your Transformer
4. Conversational AI: Improving Goal-Oriented Bots
5. Information Extraction: OpenIE and Link Prediction
Beyond Weisfeiler-Lehman: approximate isomorphisms and metric embeddings

The last post in a series by M. Bronstein about how we can reformulate the framework for designing GNNs. The most popular approach currently is to show that a GNN can be equivalent to the WL algorithm, which in turn implies that it can detect isomorphism for most graphs. However, this is not very valuable in practice, as we rarely encounter exactly isomorphic graphs. Instead, it would be great to have GNNs that preserve some kind of distance between graphs (e.g. edit distance) in Euclidean space, up to some error. This can be seen as a generalization of the current framework, which cares only about the case when two graphs are isomorphic.
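
A hedged sketch of what such a training objective could look like; the encoder producing the embeddings and the (approximate) edit-distance targets are assumptions outside this snippet.

import torch

def metric_embedding_loss(z1: torch.Tensor, z2: torch.Tensor,
                          ged: torch.Tensor) -> torch.Tensor:
    """z1, z2: graph embeddings of shape [batch, d]; ged: edit distances, [batch].
    Penalize the gap between Euclidean distance and graph edit distance."""
    dist = torch.norm(z1 - z2, dim=-1)
    return torch.mean((dist - ged) ** 2)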
Graph Clustering with Graph Neural Networks

This is a guest post by Anton Tsitsulin (@xgfsru) about their new work called “Graph Clustering with Graph Neural Networks”.

We propose a new perspective on unsupervised training of Graph Neural Networks (GNNs) through the lens of graph clustering. Attributed graph clustering is a popular task that poses a unique challenge: we have to balance the information we get from the graph itself and from node attributes (features). Can we create an end-to-end GNN model that automatically balances graph and feature structure?

Intuitively, graph clustering with graph neural networks should be connected to graph pooling: graph clusters that are close in terms of their features can be safely merged together. Not all pooling methods are useful for clustering, however, as some of them (e.g. top-k pooling) do not collapse nodes together. The state of the art there is MinCut pooling, which appeared at ICML'20.

We present DMoN, a GNN loss function that directly optimizes graph modularity in its spectral formulation. Modularity is known to perform better than cut-based metrics on real-world graphs; thanks to that, we obtain on average 30% better label correlation than the best other neural method and 45% better graph clustering scores across 7 datasets.
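
To unpack the "spectral formulation": with modularity matrix B = A - d dᵀ / 2m and soft assignments C, modularity is tr(CᵀBC) / 2m, which can be maximized directly. Below is a hedged sketch of that objective, not the authors' code; DMoN also adds a collapse regularizer that this omits.

import torch

def modularity_loss(adj: torch.Tensor, c: torch.Tensor) -> torch.Tensor:
    """adj: dense adjacency [n, n]; c: soft cluster assignments [n, k], rows sum to 1."""
    deg = adj.sum(dim=1, keepdim=True)      # degree vector d, shape [n, 1]
    two_m = adj.sum()                        # 2m: twice the number of edges
    b = adj - (deg @ deg.t()) / two_m        # modularity matrix B
    q = torch.trace(c.t() @ b @ c) / two_m   # modularity of the assignment
    return -q                                # minimize -Q to maximize modularity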

To better investigate the performance of GNNs for clustering, we also propose a set of synthetic benchmarks. For example, in the attached image we show how the performance of different methods changes with varying signal strength in either the graph or the node features. We hope that this methodology will allow a more thorough evaluation of different models in the future.

TL;DR:
- Graph pooling is very similar to graph clustering, so it's a good idea to evaluate pooling methods on clustering tasks.
- We show how to do clustering in an end-to-end way with GNNs via spectral modularity optimization.
- DMoN offers substantial (30-50% across different metrics) quality increases on 7 datasets.

Paper: https://arxiv.org/abs/2006.16904
Code will be available here soon.
Graph of words

GPT-3 is the new SOTA for text generation, and we thought it would be cool if someone produced a graph of words from it. I think no one has done it yet, and it could bring insights into how language is structured. For example: can we do graph clustering and obtain the same clusters as in pure text; can we find rare combinations of words; what is the difference between such a graph and a graph built from n-grams? Potentially, this could be a hypergraph, with multiple nodes connected at once corresponding to entire phrases.
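
One simple construction, as a sketch rather than a proposal of the final design: nodes are tokens and edges connect words that co-occur within a sliding window over the generated text.

from itertools import combinations
import networkx as nx

def word_graph(text: str, window: int = 3) -> nx.Graph:
    """Link words co-occurring in a sliding window; edge weight counts co-occurrences."""
    tokens = text.lower().split()
    g = nx.Graph()
    for i in range(len(tokens)):
        for u, v in combinations(tokens[i:i + window], 2):
            if u != v:
                w = g.get_edge_data(u, v, default={"weight": 0})["weight"]
                g.add_edge(u, v, weight=w + 1)
    return g

g = word_graph("the cat sat on the mat and the cat likes the mat")
print(g.number_of_nodes(), g.number_of_edges())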
Graph Machine Learning research groups: Jian Tang

I do a series of posts on the groups in graph research; the previous post is here. The tenth is Jian Tang, a young professor at Mila with 4 papers at ICML and 2 at ICLR in 2020, who is looking for graduate students (in case you want to pursue a PhD).


Jian Tang (~1987)
- Affiliation: Mila-Quebec AI Institute and HEC Montreal
- Education: Ph.D. at Peking University, China in 2014 (supervised by Ming Zhang);
- h-index: 19;
- Awards: Amazon faculty, best paper at KDD workshop;
- Interests: graph molecular design; graph generative models; knowledge graphs
How Gödel’s Proof Works

A down-to-earth explanation by QuantaMagazine of how Gödel proved his famous incompleteness theorems. To many, these results are a profound illustration that mathematics is doomed in its attempt to capture all truths: there are true statements that we will never prove. One example of such an "unprovable" problem is the Continuum Hypothesis, the claim that there is no infinite set whose size lies strictly between the set of integers (countable) and the set of reals (uncountable): it can be neither proved nor disproved from the standard axioms of set theory.

In some sense, Gödel's results killed the dream of a complete mathematics and, luckily, paved the way for the emergence of computer science. There is a very good comic (I'm not a fan of comics, but this one is about mathematics and short) called Logicomix that explains well the events at the start of the 20th century in mathematics, highlighting how great thinkers such as Russell, Cantor, Boole, Wittgenstein, Hilbert, Frege, Poincaré, Gödel, and Turing approached the creation of new foundations for math and eventually failed.
KDD 2020 stats

KDD is one of the biggest top-tier conferences in GML.

Dates: Aug 23-27
Where: Online
Cost: $250

• Submissions: 1279 research track / 756 applied data science track (vs 1200 research-track submissions in 2019)
• Accepted: 217 / 121 (vs 170 in 2019)
• Acceptance rate: 16.9% / 16.0% (vs 14.2% in 2019)
• Graph papers: 70 / 14 (32% / 11% of accepted papers)