Paper Title

A Tweet-based Dataset for Company-Level Stock Return Prediction

Authors

Karolina Sowinska, Pranava Madhyastha

Abstract

Public opinion influences events, especially those related to stock market movement, where a subtle hint can influence the local outcome of the market. In this paper, we present a dataset that allows for company-level analysis of the tweet-based impact on one-, two-, three-, and seven-day stock returns. Our dataset consists of 862,231 labelled instances from Twitter in English; we also release a cleaned subset of 85,176 labelled instances to the community. We also provide baselines using standard machine learning algorithms and a multi-view learning based approach that makes use of different types of features. Our dataset, scripts and models are publicly available at: https://github.com/ImperialNLP/stockreturnpred.
