|
1 | 1 | ---
| 2 | +- title: "Over-squashing in Spatiotemporal Graph Neural Networks"
| 3 | + links:
| 4 | + paper: https://arxiv.org/abs/2506.15507
| 5 | + venue: Preprint
| 6 | + year: 2025
| 7 | + authors:
| 8 | + - id:imarisca
| 9 | + - J. Bamberger
| 10 | + - id:calippi
| 11 | + - M. Bronstein
| 12 | + keywords:
| 13 | + - spatiotemporal data
| 14 | + - graph neural networks
| 15 | + - forecasting
| 16 | + - over-squashing
| 17 | + abstract: >
| 18 | + Graph Neural Networks (GNNs) have achieved remarkable success across various domains. However, recent theoretical advances have identified fundamental limitations in their information propagation capabilities, such as over-squashing, where distant nodes fail to effectively exchange information. While extensively studied in static contexts, this issue remains unexplored in Spatiotemporal GNNs (STGNNs), which process sequences associated with graph nodes. Nonetheless, the temporal dimension amplifies this challenge by increasing the information that must be propagated. In this work, we formalize the spatiotemporal over-squashing problem and demonstrate its distinct characteristics compared to the static case. Our analysis reveals that counterintuitively, convolutional STGNNs favor information propagation from points temporally distant rather than close in time. Moreover, we prove that architectures that follow either time-and-space or time-then-space processing paradigms are equally affected by this phenomenon, providing theoretical justification for computationally efficient implementations. We validate our findings on synthetic and real-world datasets, providing deeper insights into their operational dynamics and principled guidance for more effective designs.
| 19 | + bibtex: >
| 20 | + @article{marisca2025oversquashing,
| 21 | + title = {Over-squashing in Spatiotemporal Graph Neural Networks},
| 22 | + author = {Ivan Marisca and Jacob Bamberger and Cesare Alippi and Michael M. Bronstein},
| 23 | + year = {2025},
| 24 | + journal = {arXiv preprint arXiv:2506.15507},
| 25 | + url = {https://arxiv.org/abs/2506.15507},
| 26 | + }
2 | 27 | - title: 'PeakWeather: MeteoSwiss Weather Station Measurements for Spatiotemporal Deep Learning' |
3 | 28 | links: |
4 | 29 | paper: https://arxiv.org/abs/2506.13652 |
|
92 | 117 | - D. Bacciu |
93 | 118 | - id:calippi |
94 | 119 | keywords: |
95 | | - - spatiotemporal graphs |
| 120 | + - spatiotemporal data
96 | 121 | - graph neural networks |
97 | 122 | - irregular sampling |
98 | 123 | abstract: Modern graph representation learning works mostly under the assumption of dealing with regularly sampled temporal graph snapshots, which is far from realistic, e.g., social networks and physical systems are characterized by continuous dynamics and sporadic observations. To address this limitation, we introduce the Temporal Graph Ordinary Differential Equation (TG-ODE) framework, which learns both the temporal and spatial dynamics from graph streams where the intervals between observations are not regularly spaced. We empirically validate the proposed approach on several graph benchmarks, showing that TG-ODE can achieve state-of-the-art performance in irregular graph stream tasks. |
|
119 | 144 | - V. V. Gusev |
120 | 145 | - id:calippi |
121 | 146 | keywords: |
122 | | - - spatiotemporal graphs |
| 147 | + - spatiotemporal data
123 | 148 | - graph neural networks |
124 | 149 | - imputation |
125 | 150 | - graph structure learning |
|
142 | 167 | - id:calippi |
143 | 168 | - id:fmbianchi |
144 | 169 | keywords: |
145 | | - - spatiotemporal graphs |
| 170 | + - spatiotemporal data
146 | 171 | - forecasting |
| 172 | + - pooling
| 173 | + - irregular sampling
147 | 174 | abstract: Given a set of synchronous time series, each associated with a sensor-point in space and characterized by inter-series relationships, the problem of spatiotemporal forecasting consists of predicting future observations for each point. Spatiotemporal graph neural networks achieve striking results by representing the relationships across time series as a graph. Nonetheless, most existing methods rely on the often unrealistic assumption that inputs are always available and fail to capture hidden spatiotemporal dynamics when part of the data is missing. In this work, we tackle this problem through hierarchical spatiotemporal downsampling. The input time series are progressively coarsened over time and space, obtaining a pool of representations that capture heterogeneous temporal and spatial dynamics. Conditioned on observations and missing data patterns, such representations are combined by an interpretable attention mechanism to generate the forecasts. Our approach outperforms state-of-the-art methods on synthetic and real-world benchmarks under different missing data distributions, particularly in the presence of contiguous blocks of missing values. |
148 | 175 | - title: Graph Deep Learning for Time Series Forecasting |
149 | 176 | links: |
|
157 | 184 | - id:dzambon |
158 | 185 | - id:calippi |
159 | 186 | keywords: |
160 | | - - spatiotemporal graphs |
| 187 | + - spatiotemporal data
161 | 188 | - forecasting |
162 | 189 | abstract: Graph-based deep learning methods have become popular tools to process collections of correlated time series. Differently from traditional multivariate forecasting methods, neural graph-based predictors take advantage of pairwise relationships by conditioning forecasts on a (possibly dynamic) graph spanning the time series collection. The conditioning can take the form of an architectural inductive bias on the neural forecasting architecture, resulting in a family of deep learning models called spatiotemporal graph neural networks. Such relational inductive biases enable the training of global forecasting models on large time-series collections, while at the same time localizing predictions w.r.t. each element in the set (i.e., graph nodes) by accounting for local correlations among them (i.e., graph edges). Indeed, recent theoretical and practical advances in graph neural networks and deep learning for time series forecasting make the adoption of such processing frameworks appealing and timely. However, most of the studies in the literature focus on proposing variations of existing neural architectures by taking advantage of modern deep learning practices, while foundational and methodological aspects have not been subject to systematic investigation. To fill the gap, this paper aims to introduce a comprehensive methodological framework that formalizes the forecasting problem and provides design principles for graph-based predictive models and methods to assess their performance. At the same time, together with an overview of the field, we provide design guidelines, recommendations, and best practices, as well as an in-depth discussion of open challenges and future research directions. |
163 | 190 | bibtex: > |
|
216 | 243 | - I. King |
217 | 244 | - S. Pan |
218 | 245 | keywords: |
219 | | - - spatiotemporal graphs |
| 246 | + - spatiotemporal data
220 | 247 | - graph neural networks |
221 | 248 | - forecasting |
222 | 249 | - imputation |
|
243 | 270 | - D. Mandic |
244 | 271 | - id:calippi |
245 | 272 | keywords: |
246 | | - - spatiotemporal graphs |
| 273 | + - spatiotemporal data
247 | 274 | - forecasting |
248 | 275 | - time series clustering |
249 | 276 | - pooling |
|
309 | 336 | - id:dzambon |
310 | 337 | first_authors: 2 |
311 | 338 | keywords: |
312 | | - - spatiotemporal graphs |
| 339 | + - spatiotemporal data
313 | 340 | - state-space models |
314 | 341 | - graph structure learning |
315 | 342 | abstract: The well-known Kalman filters model dynamical systems by relying on state-space representations with the next state updated, and its uncertainty controlled, by fresh information associated with newly observed system outputs. This paper generalizes, for the first time in the literature, Kalman and extended Kalman filters to discrete-time settings where inputs, states, and outputs are represented as attributed graphs whose topology and attributes can change with time. The setup allows us to adapt the framework to cases where the output is a vector or a scalar too (node/graph level tasks). Within the proposed theoretical framework, the unknown state-transition and the readout functions are learned end-to-end along with the downstream prediction task. |
|
326 | 353 | - id:calippi |
327 | 354 | first_authors: 2 |
328 | 355 | keywords: |
329 | | - - spatiotemporal graphs |
| 356 | + - spatiotemporal data
330 | 357 | - forecasting |
331 | 358 | - embeddings |
332 | 359 | abstract: Spatiotemporal graph neural networks have shown to be effective in time series forecasting applications, achieving better performance than standard univariate predictors in several settings. These architectures take advantage of a graph structure and relational inductive biases to learn a single (global) inductive model to predict any number of the input time series, each associated with a graph node. Despite the gain achieved in computational and data efficiency w.r.t. fitting a set of local models, relying on a single global model can be a limitation whenever some of the time series are generated by a different spatiotemporal stochastic process. The main objective of this paper is to understand the interplay between globality and locality in graph-based spatiotemporal forecasting, while contextually proposing a methodological framework to rationalize the practice of including trainable node embeddings in such architectures. We ascribe to trainable node embeddings the role of amortizing the learning of specialized components. Moreover, embeddings allow for 1) effectively combining the advantages of shared message-passing layers with node-specific parameters and 2) efficiently transferring the learned model to new node sets. Supported by strong empirical evidence, we provide insights and guidelines for specializing graph-based models to the dynamics of each time series and show how this aspect plays a crucial role in obtaining accurate predictions. |
|
349 | 376 | - id:dzambon |
350 | 377 | - id:calippi |
351 | 378 | keywords: |
352 | | - - spatiotemporal graphs |
| 379 | + - spatiotemporal data
353 | 380 | - residual analysis |
354 | 381 | abstract: This paper introduces a novel residual correlation analysis, called AZ-analysis, to assess the optimality of spatio-temporal predictive models. The proposed AZ-analysis constitutes a valuable asset for discovering and highlighting those space-time regions where the model can be improved with respect to performance. The AZ-analysis operates under very mild assumptions and is based on a spatio-temporal graph that encodes serial and functional dependencies in the data; asymptotically distribution-free summary statistics identify existing residual correlation in space and time regions, hence localizing time frames and/or communities of sensors, where the predictor can be improved. |
355 | 382 | - title: "Peak shaving in distribution networks using stationary energy storage systems: A Swiss case study" |
|
391 | 418 | - id:llivi |
392 | 419 | - id:calippi |
393 | 420 | keywords: |
394 | | - - spatiotemporal graphs |
| 421 | + - spatiotemporal data
395 | 422 | - state-space models |
396 | 423 | - graph structure learning |
397 | 424 | abstract: State-space models constitute an effective modeling tool to describe multivariate time series and operate by maintaining an updated representation of the system state from which predictions are made. Within this framework, relational inductive biases, e.g., associated with functional dependencies existing among signals, are not explicitly exploited leaving unattended great opportunities for effective modeling approaches. The manuscript aims, for the first time, at filling this gap by matching state-space modeling and spatio-temporal data where the relational information, say the functional graph capturing latent dependencies, is learned directly from data and is allowed to change over time. Within a probabilistic formulation that accounts for the uncertainty in the data-generating process, an encoder-decoder architecture is proposed to learn the state-space model end-to-end on a downstream task. The proposed methodological framework generalizes several state-of-the-art methods and demonstrates to be effective in extracting meaningful relational information while achieving optimal forecasting performance in controlled environments. |
|
408 | 435 | - id:calippi |
409 | 436 | first_authors: 2 |
410 | 437 | keywords: |
411 | | - - spatiotemporal graphs |
| 438 | + - spatiotemporal data
412 | 439 | - forecasting |
413 | 440 | - reservoir computing |
414 | 441 | abstract: Neural forecasting of spatiotemporal time series drives both research |
|
475 | 502 | - id:calippi |
476 | 503 | first_authors: 2 |
477 | 504 | keywords: |
478 | | - - spatiotemporal graphs |
| 505 | + - spatiotemporal data
479 | 506 | - imputation |
480 | 507 | abstract: Modeling multivariate time series as temporal signals over a (possibly |
481 | 508 | dynamic) graph is an effective representational framework that allows for developing |
|
506 | 533 | - id:dzambon |
507 | 534 | - id:calippi |
508 | 535 | keywords: |
509 | | - - spatiotemporal graphs |
| 536 | + - spatiotemporal data
510 | 537 | - residual analysis |
511 | 538 | abstract: We present the first whiteness test for graphs, i.e., a whiteness test |
512 | 539 | for multivariate time series associated with the nodes of a dynamic graph. The |
|
545 | 572 | - id:dzambon |
546 | 573 | - id:calippi |
547 | 574 | keywords: |
548 | | - - spatiotemporal graphs |
| 575 | + - spatiotemporal data
549 | 576 | - graph structure learning |
550 | 577 | - forecasting |
551 | 578 | abstract: Outstanding achievements of graph neural networks for spatiotemporal time series analysis show that relational constraints introduce an effective inductive bias into neural forecasting architectures. Often, however, the relational information characterizing the underlying data-generating process is unavailable and the practitioner is left with the problem of inferring from data which relational graph to use in the subsequent processing stages. We propose novel, principled - yet practical - probabilistic score-based methods that learn the relational dependencies as distributions over graphs while maximizing end-to-end the performance at task. The proposed graph learning framework is based on consolidated variance reduction techniques for Monte Carlo score-based gradient estimation, is theoretically grounded, and, as we show, effective in practice. In this paper, we focus on the time series forecasting problem and show that, by tailoring the gradient estimators to the graph learning problem, we are able to achieve state-of-the-art performance while controlling the sparsity of the learned graph and the computational scalability. We empirically assess the effectiveness of the proposed method on synthetic and real-world benchmarks, showing that the proposed solution can be used as a stand-alone graph identification procedure as well as a graph learning component of an end-to-end forecasting architecture. |
|
605 | 632 | - id:slukovic |
606 | 633 | - id:calippi |
607 | 634 | keywords: |
608 | | - - spatiotemporal graphs |
| 635 | + - spatiotemporal data
609 | 636 | - forecasting |
610 | 637 | - energy analytics |
611 | 638 | abstract: Accurate forecasting of electricity demand is a core component in many |
|
989 | 1016 | - id:calippi |
990 | 1017 | first_authors: 2 |
991 | 1018 | keywords: |
992 | | - - spatiotemporal graphs |
| 1019 | + - spatiotemporal data
993 | 1020 | - imputation |
994 | 1021 | abstract: Dealing with missing values and incomplete time series is a labor-intensive, |
995 | 1022 | tedious, inevitable task when handling data coming from real-world applications. |
|