Graph Machine Learning
Everything about graph theory, computer science, machine learning, etc.


If you have something worth sharing with the community, reach out @gimmeblues, @chaitjo.

Admins: Sergey Ivanov; Michael Galkin; Chaitanya K. Joshi
Book: The Atlas for the Aspiring Network Scientist

A new introductory book on network science by Michele Coscia. Its 760 pages cover the hitting time matrix, the Kronecker graph model, network measurement error, graph embedding techniques, and more. As the author describes, he aims for it to be broad rather than deep, so it does not involve much math.
Graph Machine Learning research groups: Michele Coscia

I continue my series of posts on research groups in graph ML; the previous post is here. The 21st is Michele Coscia, the author of the network science atlas above.

Michele Coscia (~1985)
- Affiliation: IT University of Copenhagen
- Education: Ph.D. at University of Pisa in 2012 (advisor: Dino Pedreschi)
- h-index 22
- Awards: KDD Dissertation Award, ERCIM Cor Baayen Award
- Interests: homophily, community detection, network science
Survey: Utilising Graph Machine Learning within Drug Discovery and Development

A new survey with Michael Bronstein and his colleagues on the application of GNNs to drug discovery. This is a very exciting line of research, and I bet there will be much more effort in 2021, not only from academia but also from startups and big pharma companies. In this domain, graphs appear as a natural structure for modeling relationships in molecules or more complex biological entities, for example protein-protein interactions. There are also many valuable tasks such as target identification, molecule property prediction, de novo drug design, and more. Relation Therapeutics, a London-based startup that also participated in writing this survey, even has an opening for a Graph ML researcher.
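
To make the "molecules as graphs" point concrete, here is a minimal sketch (my own illustration, not code from the survey) of turning a molecule into the node/edge representation a GNN consumes, assuming RDKit is available:

```python
# A minimal sketch (my illustration, not from the survey): atoms become
# nodes, bonds become edges. Assumes RDKit; the SMILES string is arbitrary.
from rdkit import Chem

mol = Chem.MolFromSmiles("CC(=O)Oc1ccccc1C(=O)O")  # aspirin

# Node features: one entry per atom (here simply the atomic number).
nodes = [atom.GetAtomicNum() for atom in mol.GetAtoms()]

# Edge list: one (i, j) pair per bond, which is exactly the edge index a GNN consumes.
edges = [(b.GetBeginAtomIdx(), b.GetEndAtomIdx()) for b in mol.GetBonds()]

print(len(nodes), "atoms,", len(edges), "bonds")
```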
Graph Papers at ICLR 2021: Decisions

Here is an updated list of graph papers with decisions and keywords at ICLR 2021.

There were 201 graph paper submissions, of which 50 were accepted: 1 oral, 9 spotlights, and 40 posters.

Among the most common topics are generalization bounds, equivariance, knowledge graphs, and applications to physics, biology, RL, and videos.
Post: Top Applications of Graph Neural Networks 2021

In my new post I discuss applications of GNNs in real-world settings. There are ~100 new papers about GNNs on arXiv each month, indicating that it's a very hot topic 🔥 However, until recently there were not many applications of GNNs in industry.

I gathered the most interesting applications of GNNs, including discovering new medicine 💊, optimizing the power of computer chips 🖥, and approximating chemical reactions for renewable energy 💨. I really hope this list will grow in 2021, with more people using GNNs as a default tool for graph-structured data.
Podcast: Twiml with Michael Bronstein and Taco Cohen

There are two recent podcasts on Twiml. One is with Taco Cohen, a researcher at Qualcomm, on his NeurIPS 2020 work with Max Welling and Pim de Haan, called Natural Graph Networks.

The second is with Michael Bronstein, who looks back at the ML achievements of 2020, such as GPT-3 and neural implicit representations. He also discusses the landscape of the Graph ML field going into 2021.
Post: AlphaFold 2 & Equivariance

"AlphaFold 2 probably used some network/algorithm to map graph features to obtain the initial XYZ coordinates. Later in the pipeline, they improved their initial prediction by iteratively running their structure module."

Scrutiny of the AlphaFold 2 inner workings by Justas Dauparas & Fabian Fuchs.
GML Newsletter: Do we do it right?

A new issue of my Graph ML newsletter: looking back and ahead for the field. This time I want to raise the point that, for all the great research we have in GML, we have comparatively few applications of it in the real world, and that it is perhaps up to us to pitch and use these developments for the good of people.

Also, I moved the newsletter to Substack, where there is a wonderful button to support my writing. When I moved, I was really just trying to eliminate the costs of the previous platform, but a few people became paying subscribers right from the start (which pleasantly surprised me, of course). There are some perks to being a paying subscriber (no T-shirts yet :), but my plan is to continue writing for everybody, so hopefully it's a win-win for me and the readers.
AAAI 2021 stats

Dates: Feb 2-9
Where: Online
Price: ~$300

All papers can be found here. Graph categorization can be found here.

• 9034 submissions (vs 7737 in 2020)
• 1692 accepted (vs 1591 in 2020)
• 21% acceptance rate (vs 21% in 2020)
• 141 graph papers (8% of total)
ICLR 2021 stats

Dates: May 4-8
Where: Online

All papers can be found here. Graph papers can be found here.

• 2997 submissions (vs 2594 in 2020)
• 860 accepted (vs 687 in 2020)
• 29% acceptance rate (vs 26.5% in 2020)
• 50 graph papers (6% of total)
S+SSPR Workshop

An online event on GNNs, adversarial learning, and other topics, happening today and tomorrow, with a great list of keynotes: Nicholas Carlini, Michael Bronstein, Max Welling, and Fabio Roli. The program can be found here, and the stream is on YouTube (resuming at 15:00 European time).
Boost then Convolve: Gradient Boosting Meets Graph Neural Networks

In our new work at ICLR 2021, we explore how to apply gradient boosted decision trees (GBDTs) to graphs. Surprisingly, I have not come across papers that test the performance of pure GBDT on graphs before, for example for node classification.

GBDTs are usually used for heterogeneous data (e.g. in Kaggle competitions): the columns can be categorical and of different scale and meaning (e.g. an income column vs. an age column). Such data is quite common in the real world, but most research graph datasets have sparse, homogeneous node features (e.g. bag-of-words features or word embeddings). So we asked whether GNNs are effective on graphs with heterogeneous features.

The first insight is that you can simply pretrain a GBDT on the node features and use the GBDT's predictions as input for training the GNN model. This alone already gives the GNN a boost.
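
A rough sketch of this first scheme on synthetic data, assuming LightGBM for the GBDT and a standard two-layer GCN from PyTorch Geometric (names and hyperparameters are illustrative, not the exact setup from the paper):

```python
# Sketch of the "pretrain GBDT, feed its predictions to a GNN" scheme on
# random synthetic data. Assumes lightgbm and torch_geometric are installed;
# all names and hyperparameters are illustrative.
import lightgbm as lgb
import numpy as np
import torch
import torch.nn.functional as F
from torch_geometric.nn import GCNConv

num_nodes, num_feats, num_classes = 100, 8, 3
X = np.random.randn(num_nodes, num_feats)            # tabular node features
y = np.random.randint(0, num_classes, num_nodes)     # node labels
edge_index = torch.randint(0, num_nodes, (2, 400))   # random connectivity
train_mask = np.zeros(num_nodes, dtype=bool)
train_mask[:60] = True

# 1) Pretrain a GBDT on the raw (possibly heterogeneous) node features.
gbdt = lgb.LGBMClassifier(n_estimators=100)
gbdt.fit(X[train_mask], y[train_mask])

# 2) Concatenate the GBDT's class probabilities to the node features and
#    train a GNN on the result.
gbdt_preds = torch.tensor(gbdt.predict_proba(X), dtype=torch.float)
gnn_input = torch.cat([torch.tensor(X, dtype=torch.float), gbdt_preds], dim=1)

class GCN(torch.nn.Module):
    def __init__(self, in_dim, hidden_dim, out_dim):
        super().__init__()
        self.conv1 = GCNConv(in_dim, hidden_dim)
        self.conv2 = GCNConv(hidden_dim, out_dim)

    def forward(self, x, edge_index):
        return self.conv2(F.relu(self.conv1(x, edge_index)), edge_index)

model = GCN(gnn_input.size(1), 64, num_classes)
opt = torch.optim.Adam(model.parameters(), lr=0.01)
labels, mask = torch.tensor(y), torch.tensor(train_mask)
for epoch in range(100):  # the usual transductive node-classification loop
    opt.zero_grad()
    loss = F.cross_entropy(model(gnn_input, edge_index)[mask], labels[mask])
    loss.backward()
    opt.step()
```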

Second, we propose a scheme to train the GBDT and the GNN end-to-end, which additionally boosts performance; see the sketch below.
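
Conceptually, one end-to-end round looks like the following simplified sketch (not the actual implementation; it reuses the data and GCN from the snippet above, the GNN consumes the GBDT output directly as node features, and the GBDT is refit from scratch rather than grown incrementally as in the paper):

```python
# Simplified sketch of one end-to-end GBDT+GNN round (illustrative, not the
# actual implementation). The idea: move the GBDT output along the negative
# gradient of the GNN loss w.r.t. the GNN's input features. Reuses X, y,
# edge_index, labels, mask, num_nodes, num_classes, and GCN from above.
from sklearn.multioutput import MultiOutputRegressor

feat_dim = 16
gbdt = MultiOutputRegressor(lgb.LGBMRegressor(n_estimators=50))
gbdt.fit(X, 0.01 * np.random.randn(num_nodes, feat_dim))  # near-zero init
model = GCN(feat_dim, 64, num_classes)
opt = torch.optim.Adam(model.parameters(), lr=0.01)

for _ in range(5):  # a few end-to-end rounds
    # 1) GNN step on the current GBDT output, keeping the input gradient.
    node_feats = torch.tensor(gbdt.predict(X),
                              dtype=torch.float).requires_grad_(True)
    opt.zero_grad()
    loss = F.cross_entropy(model(node_feats, edge_index)[mask], labels[mask])
    loss.backward()
    opt.step()

    # 2) Refit the GBDT toward features shifted along the negative input
    #    gradient, so its output lowers the GNN loss on the next round.
    target = (node_feats - 0.1 * node_feats.grad).detach().numpy()
    gbdt.fit(X, target)
```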

Third, this combination of GBDT and GNN, which we call BGNN, converges much faster than a GNN alone and is therefore usually faster to train than a pure GNN.

Some limitations:
* BGNN works well with heterogeneous features, so Cora and other datasets with homogeneous features are still better off with a plain GNN.
* The approach works for node regression and classification. We have some ideas on how to extend it to link prediction and graph classification, but have not worked them out yet. If you are interested in continuing this line of work, let me know.

The code and datasets are available here.
Graph Machine Learning research groups: Stefanie Jegelka

I continue my series of posts on research groups in graph ML; the previous post is here. The 22nd is Stefanie Jegelka, a professor at MIT working on submodular functions, DPPs, and, more recently, theoretical aspects of GNNs.

Stefanie Jegelka (~1986)
- Affiliation: MIT
- Education: Ph.D. at the Max Planck Institute for Intelligent Systems, Tübingen, and ETH Zurich in 2012 (advisors: Jeff Bilmes, Bernhard Schölkopf, Andreas Krause)
- h-index 33
- Awards: Joseph A. Martore Award, NSF CAREER Award, best paper awards at ICML and NeurIPS
- Interests: generalization and expressivity of GNNs, clustering and graph partitioning
PhD position in Graph Neural Networks Modelling

The Norwegian University of Science and Technology has opened a PhD position on the thesis topic Interpretable Models with Graph Neural Networks to support the Green Transition of Critical Infrastructures. The deadline is 1 Feb 2021; it is a 3-year contract at ~500K NOK per year before tax.