Paper Title
Granger Causality using Neural Networks
Paper Authors
Paper Abstract
Dependence between nodes in a network is an important concept that pervades many areas, including finance, politics, sociology, genomics, and the brain sciences. One way to characterize dependence between the components of a multivariate time series is via Granger Causality (GC). Standard approaches to GC estimation and inference commonly assume linear dynamics; however, this simplification does not hold in many real-world applications where signals are inherently non-linear. In such cases, imposing linear models such as vector autoregressive (VAR) models can lead to mischaracterization of the true Granger-causal interactions. To overcome this limitation, Tank et al. (IEEE Transactions on Pattern Analysis and Machine Intelligence, 2022) proposed a solution that uses neural networks with sparse regularization penalties; the regularization encourages the learnable weights to be sparse, which enables inference of the GC structure. This paper overcomes the limitations of current methods by leveraging advances in machine learning and deep learning, which have been shown to learn hidden patterns in data. We propose novel classes of models that can handle underlying non-linearity in a computationally efficient manner while simultaneously providing GC estimation and lag-order selection. First, we present the Learned Kernel VAR (LeKVAR) model, which learns a kernel parameterized by a shared neural network and then penalizes the learnable weights to discover the GC structure. Second, we show that the importance of individual lags and individual time series can be decoupled directly via separate penalties; this matters because we want to select the lag order during GC estimation itself. This decoupling acts as a filter and can be extended to any deep learning model, including Multi-Layer Perceptrons (MLPs), Recurrent Neural Networks (RNNs), Long Short-Term Memory networks (LSTMs), and Transformers, for simultaneous GC estimation and lag selection.
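To make the LeKVAR idea concrete, here is a minimal PyTorch sketch of one plausible reading of the abstract: a shared scalar kernel network is applied element-wise to every lagged observation, and a VAR-style coefficient tensor mixes the transformed lags, with a group-lasso penalty over the lag axis used to read off GC structure. The class and function names (`LeKVAR`, `group_penalty`), the architecture, and the penalty placement are illustrative assumptions, not the authors' released implementation.

```python
import torch
import torch.nn as nn

class LeKVAR(nn.Module):
    """Sketch: VAR on lagged values passed through a shared learned kernel phi."""
    def __init__(self, p, K, kernel_width=8):
        super().__init__()
        # Shared scalar kernel phi: R -> R, applied to every lagged value.
        self.phi = nn.Sequential(
            nn.Linear(1, kernel_width), nn.Tanh(), nn.Linear(kernel_width, 1)
        )
        # One VAR coefficient per (target, source, lag) triple.
        self.A = nn.Parameter(0.1 * torch.randn(p, p, K))

    def forward(self, x):                        # x: (batch, p, K) lagged inputs
        z = self.phi(x.unsqueeze(-1)).squeeze(-1)        # element-wise kernel
        return torch.einsum('ijk,bjk->bi', self.A, z)    # (batch, p) predictions

def group_penalty(A, lam):
    # Group lasso over the lag axis: driving A[i, j, :] to zero removes any
    # Granger-causal influence of series j on series i.
    return lam * A.pow(2).sum(dim=2).sqrt().sum()

# Usage sketch on synthetic data (p series, K lags; values are placeholders).
p, K = 5, 4
model = LeKVAR(p, K)
X, Y = torch.randn(256, p, K), torch.randn(256, p)
opt = torch.optim.Adam(model.parameters(), lr=1e-2)
for _ in range(200):
    opt.zero_grad()
    loss = ((model(X) - Y) ** 2).mean() + group_penalty(model.A, 0.05)
    loss.backward()
    opt.step()
print(model.A.pow(2).sum(dim=2).sqrt())   # (p, p) estimated GC strengths
```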
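The decoupled-penalty idea can likewise be sketched on a component-wise MLP in the spirit of Tank et al.: the first-layer weight tensor keeps the (series, lag) structure explicit, so one group norm per candidate series drives GC selection while a separate group norm per lag drives lag-order selection. Again, the names (`CMLP`, `decoupled_penalty`) and hyperparameters below are hypothetical placeholders.

```python
import torch
import torch.nn as nn

class CMLP(nn.Module):
    """Sketch: component-wise MLP for one target series, p candidates, K lags."""
    def __init__(self, p, K, hidden=16):
        super().__init__()
        # First-layer weights keep the (series, lag) structure explicit so the
        # decoupled group penalties can act on each axis separately.
        self.W1 = nn.Parameter(0.1 * torch.randn(hidden, p, K))
        self.b1 = nn.Parameter(torch.zeros(hidden))
        self.out = nn.Linear(hidden, 1)

    def forward(self, x):                        # x: (batch, p, K) lagged inputs
        h = torch.relu(torch.einsum('hpk,bpk->bh', self.W1, x) + self.b1)
        return self.out(h).squeeze(-1)

def decoupled_penalty(W1, lam_series, lam_lag):
    series_norms = W1.pow(2).sum(dim=(0, 2)).sqrt()  # one norm per series (GC)
    lag_norms = W1.pow(2).sum(dim=(0, 1)).sqrt()     # one norm per lag (order)
    return lam_series * series_norms.sum() + lam_lag * lag_norms.sum()

# Usage sketch: fit one target series from lagged history X of shape (batch, p, K).
p, K = 5, 4
model = CMLP(p, K)
X, y = torch.randn(256, p, K), torch.randn(256)
opt = torch.optim.Adam(model.parameters(), lr=1e-2)
for _ in range(200):
    opt.zero_grad()
    loss = ((model(X) - y) ** 2).mean() + decoupled_penalty(model.W1, 0.05, 0.05)
    loss.backward()
    opt.step()
print(model.W1.pow(2).sum(dim=(0, 2)).sqrt())  # per-series GC strengths
print(model.W1.pow(2).sum(dim=(0, 1)).sqrt())  # per-lag strengths for order selection
```

Note that plain Adam on a group-lasso subgradient yields approximately rather than exactly zero groups, so in practice a proximal update or post-hoc thresholding of the group norms would typically be used to declare a series non-causal or a lag inactive.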