My thoughts on Kelly's conjecture.
Kelly's conjecture (a.k.a. the reconstruction conjecture) states that a graph with n vertices can be reconstructed exactly from a deck of n graphs on (n-1) vertices. In other words, pick a graph, remove one vertex (together with all its incident edges), and put the result into a set. Then start again from the original graph, remove another vertex, and add the result to the set. Repeat for all vertices. In the end you have a collection of n graphs on n-1 vertices, which is called the deck. The conjecture states that the deck uniquely determines the original graph; hence it would be a complete graph invariant, i.e. two graphs have the same deck if and only if they are isomorphic.
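To make the construction concrete, here is a minimal sketch of building a deck with networkx (my own illustration; the function name `deck` is arbitrary):

```python
import networkx as nx

def deck(G):
    """Return the deck of G: one vertex-deleted subgraph per vertex."""
    cards = []
    for v in G.nodes():
        card = G.copy()
        card.remove_node(v)  # removes v together with all its incident edges
        cards.append(card)
    return cards

G = nx.cycle_graph(5)                          # C_5 as a toy example
print(len(deck(G)))                            # 5 cards
print([c.number_of_nodes() for c in deck(G)])  # [4, 4, 4, 4, 4]
```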
The conjecture has been open since 1957 and many partial results have been obtained. For example, it is known that regular graphs and trees can be reconstructed uniquely from their decks, while directed graphs in general cannot. For the general undirected case the question is still open. There is even an arXiv paper from 2017 that claimed to resolve the conjecture, but after communication with the authors, they admitted there is a mistake in the proof.
This idea has made its way into graph representation learning too. In particular, there is the graphlet kernel, which decomposes a graph into a set of small graphlets/motifs. It is quite an expressive kernel, i.e. it can distinguish a lot of graphs, but whether it distinguishes all graphs is still an open question.
For a long time, I believed that if Kelly's conjecture is true (which is quite likely), then not only would the deck of graphs on (n-1) vertices be unique, but so would decompositions into even smaller graphs. The assumption was that you can in turn decompose each (n-1)-vertex card into its own (n-2)-vertex deck and go as deep as you want. But yesterday I found that there are non-isomorphic graphs that have the same decomposition into smaller graphs. In particular, if you take a cycle on 2n vertices as G1 and two disjoint cycles on n vertices as G2, then their decompositions into graphlets on fewer than n vertices coincide. For example, take n=5 and graphlets on 3 vertices, i.e. all triplets of nodes connected in every possible way: G1 is a cycle on 10 vertices and G2 is two cycles on 5 vertices. The number of triangles is 0 for both G1 and G2, the number of graphlets with two connected vertices and one isolated vertex is 60 for both, and so on. In the end the decompositions are the same, while the graphs are different.
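The claim is easy to check numerically. Below is a hedged sketch, assuming the two graphs above, that enumerates all induced 3-vertex subgraphs and counts them by type (number of edges):

```python
from itertools import combinations
from collections import Counter
import networkx as nx

def graphlet_counts_3(G):
    """Count induced 3-vertex subgraphs of G, keyed by their number of edges
    (0 = three isolated nodes, 1 = one edge, 2 = path, 3 = triangle)."""
    counts = Counter()
    for nodes in combinations(G.nodes(), 3):
        counts[G.subgraph(nodes).number_of_edges()] += 1
    return counts

G1 = nx.cycle_graph(10)                                       # one 10-cycle
G2 = nx.disjoint_union(nx.cycle_graph(5), nx.cycle_graph(5))  # two 5-cycles

print(graphlet_counts_3(G1))  # Counter({1: 60, 0: 50, 2: 10})
print(graphlet_counts_3(G2))  # same counts, yet G1 and G2 are not isomorphic
```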
Moreover, a recent IJCAI '18 paper shows that the graphlet kernel cannot distinguish some planar from non-planar, bipartite from non-bipartite, and connected from disconnected graphs. This is quite interesting, as it shows the limitations of the graphlet kernel and of graphlet decompositions in general.
Learning to Simulate Complex Physics with Graph Networks
For those who are interested in the overlap of physics and graph machine learning, there is a nice video by Peter Battaglia (DeepMind), who has been working on this topic for years. His group also has a recent submission to ICML '20 with some cool videos of GNN predictions of real physics.
Beyond the cool research they do, it's incredible to see the animations they accompany their papers with. You know what they say: it's more important how you present a paper than what the paper is about.
Video: Peter Battaglia (DeepMind Technologies), "Learning structured models of physics", Machine Learning for Physics and the Physics of Learning 2019, Workshop II: Interpretable Learning in Physical Sciences (YouTube).
A Survey on The Expressive Power of Graph Neural Networks
This is the best survey on the theory of GNNs I'm aware of. It provides many illustrative examples of what GNNs can and cannot distinguish.
It's funny: it's written by Ryoma Sato, whom I already knew from other works on GNNs, and I imagined one of those old Japanese professors with a long beard and strict habits, but he turned out to be a first-year MSc student 🇯🇵
Random Features Strengthen Graph Neural Networks
A continuation of the previous post: a recent submission to ICML '20 by Sato et al. It's quite cool, as it shows that you can add a random number to each node's embedding and the GNN becomes provably more powerful. They improve approximation bounds for the Minimum Dominating Set and Maximum Matching problems and show that such a GNN can recognize graphs that other GNNs cannot.
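The core trick is tiny; here is a hedged sketch of what "adding a random number to the node embedding" could look like (my own illustration, assuming node features stored as a NumPy array):

```python
import numpy as np

def add_random_feature(X, rng=None):
    """Append one uniform random scalar per node to the feature matrix X of shape (n, d)."""
    if rng is None:
        rng = np.random.default_rng()
    r = rng.random((X.shape[0], 1))        # a fresh random value for every node
    return np.concatenate([X, r], axis=1)  # shape (n, d + 1)

X = np.ones((4, 3))                 # 4 nodes with identical features
print(add_random_feature(X).shape)  # (4, 4): identical nodes become distinguishable
```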
Graph Machine Learning research groups: Jure Leskovec
I will do a series of posts on research groups in graph machine learning. The first one is about the most famous researcher in GML, Jure Leskovec.
Once I had a chance to visit his lab at Stanford and talk to his students. At the time I was already working at a company, and when I asked whether there were topics we could work on together, they said that a company typically needs to make a "gift" to start working with them. Naively, I asked what the gift was. They answered: "something around 200-300K USD". So I'm still waiting for my publications with them :)
Jure Leskovec (1980)
- Affiliation: Stanford, Pinterest;
- Education: Ph.D. at CMU in 2008 (supervised by Christos Faloutsos);
- h-index: 103;
- Awards: best papers at KDD, ICDM, WSDM; Lagrange prize;
- Interests: new models for GNN, social network analysis, diffusion models.
Fresh picks from ArXiv
This week brings accepted papers at CVPR and WebConf, submissions to ICML, lots of interesting surveys, and even happiness in graphs ☺️
Survey
• Reconfiguration of Colourings and Dominating Sets in Graphs: a Survey
• A Survey of Adversarial Learning on Graphs
• Graph Spanners: A Tutorial Review
• A Survey on Contextual Embeddings
• Hyper-Parameter Optimization: A Review of Algorithms and Applications
• Integrating Physics-Based Modeling with Machine Learning: A Survey
CVPR 20
• Towards High-Fidelity 3D Face Reconstruction from In-the-Wild Images Using Graph Convolutional Networks
• VSGNet: Spatial Attention Network for Detecting Human Object Interactions Using Graph Convolutions
• Adaptive Graph Convolutional Network with Attention Graph Clustering for Co-saliency Detection
WebConf 20
• Reinforced Negative Sampling over Knowledge Graph for Recommendation
ICML 20
• Gauge Equivariant Mesh CNNs: Anisotropic convolutions on geometric graphs, by the group of Max Welling
• Learning Algebraic Multigrid Using Graph Neural Networks
• Wasserstein-based Graph Alignment
• Evaluating Logical Generalization in Graph Neural Networks, by the group of William L. Hamilton
• Universal Function Approximation on Graphs using Multivalued Functions
Physics
• Characterization of solvable spin models via graph invariants
Graph Theory
• Maximizing Happiness in Graphs of Bounded Clique-Width
• The Reconstruction Conjecture for finite simple graphs and associated directed graphs
Libraries
• An API Oriented Open-source Python Framework for Unsupervised Learning on Graphs
How many graph papers are there?
From 14 February to 17 March there were 335 new and 152 updated graph papers in the arXiv CS section, which is usually where all GML work appears.
I didn't count the more theoretical section, arXiv math, which usually has about the same number of papers as the CS section.
How powerful are graph neural networks?
I spent last night studying the proofs from the ground-breaking ICLR 2019 paper How Powerful are Graph Neural Networks? by Xu et al. (a collaboration between MIT and Stanford). I had almost convinced myself that everything is correct when I got a major insight that I think I need to share.
This paper is revolutionary because it shows that you can build a GNN that solves the graph isomorphism problem for almost all graphs. In particular, their model, the Graph Isomorphism Network (GIN), is as powerful as the 1-WL algorithm, which has been known for 50 years and distinguishes almost all non-isomorphic graphs, except for some corner cases. I think this is how everyone in the field thinks about this paper: there is an architecture that you can train so that it solves a very hard combinatorial problem. But I realized it's not quite the case: there are situations when GIN cannot distinguish non-isomorphic graphs that 1-WL can, essentially meaning that GIN can sometimes be less powerful than 1-WL.
In the paper they show that if the aggregation and readout functions are injective, then the GNN is as powerful as the 1-WL algorithm (see the corresponding theorem in the paper).
But then they say that GIN in fact uses an MLP as the aggregation function \phi and they only hope that it will learn an injective function. There is no constraint on the MLP, so in principle it can easily learn a non-injective function, in which case the network will be less powerful than 1-WL.
This is very interesting: essentially it means that in the best-case scenario GIN is as powerful as 1-WL, but in general it is not, and this is what most people, including myself, don't get right about this work. It would be very interesting to see a counterexample where a trained GIN cannot distinguish graphs that 1-WL can.
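To make the discussion concrete, here is a minimal NumPy sketch of a single GIN layer (my own illustration, not the authors' code); the small MLP here plays the role of \phi, and nothing in it enforces injectivity:

```python
import numpy as np

def gin_layer(A, H, W1, b1, W2, b2, eps=0.0):
    """One GIN update: h_v <- MLP((1 + eps) * h_v + sum of neighbour features).
    A: (n, n) adjacency matrix, H: (n, d) node features, W1/b1/W2/b2: MLP weights."""
    agg = (1.0 + eps) * H + A @ H          # sum aggregation over neighbours
    hidden = np.maximum(agg @ W1 + b1, 0)  # ReLU; this learned MLP may or may not be injective
    return hidden @ W2 + b2

# Toy usage on a triangle graph with 2-dimensional features.
rng = np.random.default_rng(0)
A = np.array([[0, 1, 1], [1, 0, 1], [1, 1, 0]], dtype=float)
H = rng.normal(size=(3, 2))
W1, b1 = rng.normal(size=(2, 8)), np.zeros(8)
W2, b2 = rng.normal(size=(8, 8)), np.zeros(8)
print(gin_layer(A, H, W1, b1, W2, b2).shape)  # (3, 8)
```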
Twitter conference
There is an interesting format of a neuroscience conference happening right now: 24 hours of papers on Twitter. Each paper is presented as a series of tweets in an allocated time slot. No registration, no videos, but you can still talk to the authors. To get an idea, take a look at this graph paper, Characterising group-level brain connectivity using exponential random graph models. You get a limited understanding of the paper, but what's cool is that you have a permanent opportunity to ping the authors about their paper even after the conference ends.
Scene Graph Generation by Iterative Message Passing
There are numerous applications of graphs in computer vision, which usually boil down to reducing an image to a sparse graph. This has found applications in image captioning, human action recognition, trajectory prediction, and transferring knowledge to new data sets, among others (see the Applications section of this survey). One of the recent interesting advances is the 2017 CVPR paper (from the group of Fei-Fei Li) Scene Graph Generation by Iterative Message Passing, where the authors propose an image encoder that produces, essentially, a knowledge graph of the objects in the image and their relationships, which can then be used for downstream tasks.
Applications of graphs in NLP
In addition to computer vision, there are many applications of graphs in NLP too. For example, knowledge graphs have been used extensively in research to bring factual knowledge to dialogue and question-answering systems, as well as in approaches to commonsense reasoning and named entity recognition (see these conference reviews of the NLP+KG domain by Michael Galkin). And sometimes you even get more mathematical papers in this area, like this video of the ACL '19 paper Relation Embedding with Dihedral Group in Knowledge Graph, where the authors model the composition of relations with some group theory.
The forbidden sidetrip by László Babai.
László Babai, the top mathematician in the field of graph isomorphism, turns out to have a zingy autobiographical story called "The forbidden sidetrip" about his trip to Minsk, Moscow, and Leningrad in 1978, when he was 28. It's fun on its own, but it also contains lots of names of Soviet mathematicians and details about the places where and how mathematics was being done in the late USSR. To my surprise, he knew Russian (he is originally Hungarian) well enough to follow lectures at the universities.
There are stories about how he despises the anti-semitism of the talented Russian mathematician Pontryagin; how he would join Soviet queues before even knowing what was being sold; how you could give "chervonets sverhu" (i.e. grease someone's palm) to get a ticket when they were sold out; and lots of other Soviet-era details that I really love asking the older generation about. But above all, it's just interesting to look at the young, not-yet-famous guy who would later climb to the Olympus of mathematics. I wish there were a movie about it.
How powerful are graph neural networks? Part II
This is the second insight I recently got from the paper How Powerful are Graph Neural Networks? It's about the initial node features. The key lemma in the paper assumes that the input node features come from a countable set \mathcal{X}.
Here \mathcal{X} is the set of all possible input features, and the assumption says that the input features have to be countable. Countable means, for instance, rational or categorical values, but not arbitrary real numbers. So this condition "prohibits" "continuous" features such as real-valued node attributes. If the set of features is uncountable, then the sum over neighbours' features is no longer guaranteed to be injective, and hence the GNN becomes less powerful than the 1-WL test. That's also why one-hot encoding of features makes your GNN theoretically more powerful.
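A toy illustration of why the encoding matters (my own example, not from the paper, which argues more generally about the existence of an injective mapping for countable feature sets): summing raw scalar features can collapse two different neighbourhoods into the same value, while summing one-hot encodings keeps them apart, because the sum of one-hot vectors is simply the histogram of the multiset.

```python
import numpy as np

# Two different multisets of neighbour features (some scalar node attribute).
neigh_a = [1.0, 3.0]
neigh_b = [2.0, 2.0]

# Sum aggregation over raw values: a collision, the GNN cannot tell them apart.
print(sum(neigh_a), sum(neigh_b))  # 4.0 4.0

# One-hot encoding over a countable set of possible values {1, 2, 3}.
index = {1.0: 0, 2.0: 1, 3.0: 2}
def one_hot(x):
    v = np.zeros(len(index))
    v[index[x]] = 1.0
    return v

# The sum of one-hot vectors is the histogram of the multiset, hence injective.
print(np.sum([one_hot(x) for x in neigh_a], axis=0))  # [1. 0. 1.]
print(np.sum([one_hot(x) for x in neigh_b], axis=0))  # [0. 2. 0.]
```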
GraphChallenge
Challenges such as YOHO, MNIST, HPC Challenge, ImageNet, and VAST have played important roles in driving progress in fields as diverse as machine learning, high performance computing, and visual analytics.
GraphChallenge encourages community approaches to developing new solutions for analyzing graphs and sparse data derived from social media, sensor feeds, and scientific data to enable relationships between events to be discovered as they unfold in the field.
Fresh picks from ArXiv
This week brings a new challenge on triangle counting, a connection between graphs and prime numbers, and, as usual, submissions to ICML 📕
Survey
• Pre-trained Models for Natural Language Processing: A Survey
• Survey of Privacy-Preserving Collaborative Filtering
WebConf 20
• What is Normal, What is Strange, and What is Missing in a Knowledge Graph: Unified Characterization via Inductive Summarization
ICML 20
• Scattering GCN: Overcoming Oversmoothness in Graph Convolutional Networks
• Probabilistic Dual Network Architecture Search on Graphs
Applications
• GFTE: Graph-based Financial Table Extraction
• GraphChallenge.org Triangle Counting Performance
Graph Theory
• Classification of vertex-transitive digraphs via automorphism group
• A Graph Theoretic Formula for the Number of Primes π(n)
• Large cycles in essentially 4-connected graphs
• Countable graphs are majority 3-choosable