Subgraph Invariant Learning Towards Large-Scale Graph Node Classification
Abstract
Graph Neural Networks (GNNs) have shown efficacy in graph node classification but face computational challenges on large-scale graphs. Although existing graph reduction methods mitigate these issues, they still demand substantial computational resources and fail to prioritize robust performance on out-of-distribution data. To tackle these challenges, we introduce the subgraph invariant learning paradigm, inspired by the small-world phenomenon. This approach enables models trained on specific subgraphs to generalize across diverse subgraphs, reducing computational demands and enhancing scalability. To promote generalization, we maximize the invariance log-likelihood by deriving a theoretical lower bound on it and formulating the InVar loss. This loss minimizes the discrepancy between node representations and their corresponding invariance representations while maximizing the entropy of the node representations. To optimize the InVar loss, we propose the Invariance Facilitation Model (IFM), comprising an Invariance Representation Encoder (IRE) and a Node Representation Encoder (NRE). IRE captures the invariance representations, using Invariance ATTention (InvarATT) to compress long-range dependencies, while NRE learns the node representations by integrating the invariance representations via Telematic ATTention (TeleATT) and exchanging local information within each subgraph through GNNs. Evaluations on four large-scale graph datasets demonstrate the effectiveness, computational efficiency, and interpretability of IFM for large-scale graph node classification.
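The abstract does not state the InVar loss in closed form, so the following is a minimal PyTorch sketch of the two terms it describes: a discrepancy term pulling node representations toward their invariance representations, and an entropy term pushed in the opposite direction. The mean-squared discrepancy, the Shannon entropy over softmax-normalized representations, and the weight `entropy_weight` are all illustrative assumptions, not the paper's derived formulation.

```python
import torch
import torch.nn.functional as F


def invar_loss(node_repr: torch.Tensor,
               invariance_repr: torch.Tensor,
               entropy_weight: float = 1.0) -> torch.Tensor:
    """Hedged sketch of the InVar loss described in the abstract.

    node_repr:       (num_nodes, dim) node representations from NRE.
    invariance_repr: (num_nodes, dim) corresponding invariance
                     representations from IRE.
    """
    # Discrepancy term: mean squared distance between each node's
    # representation and its invariance representation (the exact
    # discrepancy measure is an assumption here).
    discrepancy = F.mse_loss(node_repr, invariance_repr)

    # Entropy term: Shannon entropy of the softmax-normalized node
    # representations, averaged over nodes. Maximizing entropy is
    # realized by subtracting it from the minimized loss.
    probs = F.softmax(node_repr, dim=-1)
    entropy = -(probs * probs.clamp_min(1e-12).log()).sum(dim=-1).mean()

    return discrepancy - entropy_weight * entropy
```

In the paper's framing, minimizing such a loss corresponds to maximizing a lower bound on the invariance log-likelihood; this sketch only mirrors the verbal description of the two competing terms.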