Paper title
Dynamic transformation of prior knowledge into Bayesian models for data streams
Paper authors
Paper abstract
We consider how to effectively use prior knowledge when learning a Bayesian model in streaming environments, where data arrive sequentially and without end. This problem is highly important in the era of data explosion and rich sources of valuable external knowledge such as pre-trained models, ontologies, Wikipedia, etc. We show that some existing approaches can forget any provided knowledge very quickly. We then propose a novel framework that enables prior knowledge of different forms to be incorporated into a base Bayesian model for data streams. Our framework subsumes some existing popular models for time-series/dynamic data. Extensive experiments show that our framework outperforms existing methods by a large margin. In particular, our framework can help Bayesian models generalize well on extremely short texts while other methods overfit. The implementation of our framework is available at https://github.com/bachtranxuan/TPS.git.
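The abstract notes that naive streaming updates can quickly forget the prior knowledge supplied to the model. The sketch below is not the paper's TPS method; it is a minimal, self-contained Beta-Bernoulli illustration of that phenomenon under assumed settings (the decay factor `rho` and the "knowledge-preserving" variant are hypothetical, introduced only to contrast with the plain recursive update).

```python
# Minimal sketch (NOT the paper's TPS framework): in a plain streaming Bayesian
# update for a Beta-Bernoulli model, the informative prior's pseudo-counts stay
# fixed while the accumulated data counts grow without bound, so the prior's
# relative influence on the posterior shrinks toward zero ("forgetting").
# The second variant is a hypothetical alternative that decays old data counts
# so the prior keeps a non-vanishing share of influence.
import numpy as np

rng = np.random.default_rng(0)

# Informative prior knowledge: a strong belief that the coin favors heads.
prior_a, prior_b = 50.0, 10.0            # Beta(50, 10) pseudo-counts

# The true data-generating coin disagrees only mildly with the prior.
true_p = 0.6
minibatches = [rng.binomial(1, true_p, size=100) for _ in range(50)]

# --- Plain streaming update: posterior at step t becomes the prior at step t+1 ---
a, b = prior_a, prior_b
for x in minibatches:
    a += x.sum()
    b += len(x) - x.sum()
plain_mean = a / (a + b)
prior_share = (prior_a + prior_b) / (a + b)   # fraction of pseudo-counts due to the prior

# --- Hypothetical knowledge-preserving variant: decay old data counts, keep the prior ---
rho = 0.9                                 # decay factor for past data counts (assumption)
a_d, b_d = 0.0, 0.0                       # data-driven counts only
for x in minibatches:
    a_d = rho * a_d + x.sum()
    b_d = rho * b_d + (len(x) - x.sum())
kept_mean = (prior_a + a_d) / (prior_a + a_d + prior_b + b_d)

print(f"prior mean           : {prior_a / (prior_a + prior_b):.3f}")
print(f"plain streaming mean : {plain_mean:.3f}  (prior influence ~ {prior_share:.1%})")
print(f"knowledge-kept mean  : {kept_mean:.3f}")
```

After 50 minibatches the prior contributes only about 1% of the pseudo-counts in the plain update, whereas the decayed variant keeps it at a roughly constant share; this is only meant to make the "forgetting" issue concrete, not to reproduce the framework described in the paper.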