Optimizing Graph Neural Network Training with DiskGNN: A Leap Toward Efficient Large-Scale Learning

Graph Neural Networks (GNNs) are widely used to process data from domains such as e-commerce and social networks because they can model the complex relationships in graph-structured data. Traditionally, GNNs operate on data that fits within a system’s main memory. However, as graph data grows in scale, many datasets now exceed memory limits, creating the need for out-of-core solutions where data resides on disk.

Despite their necessity, existing out-of-core GNN systems struggle to balance efficient data access with model accuracy. They face a trade-off: either suffer from slow input/output operations caused by small, frequent disk reads, or compromise accuracy by handling graph data in disconnected chunks. Previous solutions such as Ginex and MariusGNN, while pioneering, have been limited by these challenges and show significant drawbacks in either training speed or accuracy.

The DiskGNN framework, developed by researchers from Southern University of Science and Technology, Shanghai Jiao Tong University, Centre for Perceptual and Interactive Intelligence, AWS Shanghai AI Lab, and New York University, is designed to optimize both the speed and the accuracy of GNN training on large datasets. The system uses an offline sampling technique that prepares data for quick access during training. By preprocessing and arranging graph data based on expected access patterns, DiskGNN reduces unnecessary disk reads and significantly improves training efficiency.
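To make the idea of arranging data by expected access patterns concrete, here is a toy sketch in Python. It is not DiskGNN's actual algorithm or API: the function names, the block size, and the frequency-based ordering are all assumptions chosen for illustration. It only shows why knowing the mini-batches ahead of time can cut the number of disk blocks a training step has to touch.

```python
from collections import Counter

def offline_sample(batches):
    """Count how many pre-sampled mini-batches request each node ID."""
    counts = Counter()
    for batch in batches:
        counts.update(set(batch))
    return counts

def build_layout(counts, num_nodes):
    """Place the most frequently accessed nodes first (hypothetical policy).

    Returns a dict mapping node ID -> slot position in the feature file.
    """
    order = sorted(range(num_nodes), key=lambda n: (-counts.get(n, 0), n))
    return {node: slot for slot, node in enumerate(order)}

def blocks_read(batch, layout, block_size):
    """Number of distinct disk blocks touched to fetch one batch's features."""
    return len({layout[n] // block_size for n in set(batch)})

# Mini-batches assumed to be known before training (the "offline" step).
batches = [[0, 7, 14], [7, 14, 21], [0, 14, 21], [0, 7, 21]]
num_nodes = 24
block_size = 4  # features per disk block (illustrative value)

# Baseline: features stored in node-ID order.
identity = {n: n for n in range(num_nodes)}
# Rearranged: features packed according to the observed access counts.
packed = build_layout(offline_sample(batches), num_nodes)

naive_total = sum(blocks_read(b, identity, block_size) for b in batches)
packed_total = sum(blocks_read(b, packed, block_size) for b in batches)
print(naive_total, packed_total)
```

In this small example the four hot nodes end up in a single block after rearrangement, so each batch touches one block instead of three. A real system has to deal with far larger graphs, caching, and layouts that serve many batches at once, none of which this sketch attempts.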
