
paper

TL;DR

  • I read this because : NeurIPS 2023
  • task : graph representation learning
  • problem : message-passing approaches (MPNNs) suffer from over-smoothing and over-squashing, problems analogous to long-term dependency in NLP. Applying a Transformer directly to the graph requires global attention, which is quadratic in the number of nodes.
  • idea : MPNN + Transformer hybrid; organize the existing positional encodings (PE) and structural encodings (SE) and measure how each one affects the MPNN.
  • architecture : global attention and an MPNN run in parallel in each layer
  • baselines : GCN, GAT, SAN, Graphormer, …
  • data : ZINC, PATTERN, CLUSTER, MNIST, CIFAR10, …
  • evaluation : MAE, accuracy, …
  • result : SOTA on several benchmarks, competitive on the rest

Details

The first fully Transformer-based graph network: https://arxiv.org/pdf/2012.09699.pdf

Positional Encoding(PE)

  • local : encodes a node's position within a local cluster or substructure.
  • global : encodes a node's position within the whole graph.
  • relative : encodes the relative distance between a pair of nodes.
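As a concrete example of a global PE, the eigenvectors of the graph Laplacian give each node coordinates in the graph's "spectral" space. Below is a minimal NumPy sketch of Laplacian-eigenvector PE (one of the PE families the paper catalogs); the function name and `k` parameter are illustrative, not from the paper.

```python
import numpy as np

def laplacian_pe(A, k=4):
    """Global positional encoding: the k smallest non-trivial eigenvectors
    of the symmetric normalized graph Laplacian L = I - D^{-1/2} A D^{-1/2}."""
    deg = A.sum(axis=1)
    d_inv_sqrt = np.where(deg > 0, deg ** -0.5, 0.0)
    L = np.eye(len(A)) - d_inv_sqrt[:, None] * A * d_inv_sqrt[None, :]
    eigvals, eigvecs = np.linalg.eigh(L)   # eigenvalues in ascending order
    return eigvecs[:, 1:k + 1]             # drop the trivial constant mode
```

Note that eigenvector signs are arbitrary, which is why methods using LapPE typically randomize signs during training.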

Structural Encoding(SE)

Aims to increase the expressiveness and generalization of GNNs by embedding the structure of graphs or subgraphs.
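A representative local SE is the random-walk structural encoding (RWSE): for each node, the probability of returning to itself after 1, 2, …, k steps of a random walk, which captures local substructure (e.g. ring membership). A minimal NumPy sketch, with the function name chosen here for illustration:

```python
import numpy as np

def rwse(A, steps=8):
    """Structural encoding: per-node return probabilities of a k-step
    random walk (the diagonals of P, P^2, ..., P^steps)."""
    deg = A.sum(axis=1, keepdims=True)
    P = A / np.clip(deg, 1, None)       # row-stochastic transition matrix
    Pk = np.eye(len(A))
    feats = []
    for _ in range(steps):
        Pk = Pk @ P
        feats.append(np.diag(Pk))
    return np.stack(feats, axis=1)      # shape (N, steps)
```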

GPS Layer: an MPNN + Transformer Hybrid

  • $A\in\mathbb{R}^{N\times N}$ : adjacency matrix of a graph with $N$ nodes and $E$ edges
  • $X^l\in \mathbb{R}^{N\times d_l}$ : $d_l$-dimensional node features at layer $l$
  • $E^l\in \mathbb{R}^{E\times d_l}$ : $d_l$-dimensional edge features at layer $l$
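The layer's wiring can be sketched as follows: the same node features feed a local MPNN branch and a global self-attention branch in parallel, their outputs are summed, and a 2-layer MLP produces the layer output. This is a simplified NumPy sketch, not the paper's implementation: the GCN-style local branch stands in for an arbitrary MPNN, edge features, residual connections, and normalization are omitted, and all parameter names are illustrative.

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def gps_layer(X, A, params):
    """One simplified GPS layer: local MPNN branch + global attention
    branch on the same input, summed, then a 2-layer MLP."""
    W_m, W_q, W_k, W_v, W_1, W_2 = params
    d = X.shape[1]
    # Local branch: one GCN-style message-passing step over the adjacency.
    deg = A.sum(axis=1, keepdims=True)
    X_local = (A / np.clip(deg, 1, None)) @ X @ W_m
    # Global branch: full self-attention over all node pairs (O(N^2)).
    attn = softmax((X @ W_q) @ (X @ W_k).T / np.sqrt(d))
    X_global = attn @ (X @ W_v)
    # Combine the two branches and apply the MLP.
    H = X_local + X_global
    return np.maximum(H @ W_1, 0) @ W_2
```

Running both branches in parallel (rather than stacking attention on top of the MPNN) is what lets PE/SE choices be studied independently of the attention mechanism.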

Result


Ablation
