FacebookVideoLive18: A Live Video Streaming Dataset for Streams Metadata and Online Viewers Locations

Paper Title

FacebookVideoLive18: A Live Video Streaming Dataset for Streams Metadata and Online Viewers Locations

Authors

Emna Baccour, Aiman Erbad, Kashif Bilal, Amr Mohamed, Mohsen Guizani, Mounir Hamdi

Abstract

With the advancement of personal smart devices and pervasive network connectivity, users are no longer passive content consumers but also contributors who produce new content. This expansion in live services requires a detailed analysis of broadcasters' and viewers' behavior to maximize users' Quality of Experience (QoE). In this paper, we present a dataset gathered from one of the popular live streaming platforms: Facebook. The dataset contains more than 1,500,000 live stream records collected in June and July 2018, covering public live videos from all over the world. The Facebook Live API does not offer a way to collect online videos together with their fine-grained data; it returns the general data of a stream only if the stream's ID (identifier) is known. Therefore, we used the live map website provided by Facebook, which shows the locations of online streams and of their viewers, to extract video IDs and the corresponding coordinates along with general metadata. With these IDs and the API, we can then collect the fine-grained metadata of public videos, which might be useful for the research community. We also present several preliminary analyses to describe and identify the patterns of the streams and viewers. Such fine-grained details will enable the multimedia community to recreate real-world scenarios, particularly for resource allocation, caching, computation, and transcoding in edge networks. Existing datasets do not provide the locations of the viewers, which limits efforts to allocate multimedia resources as close as possible to viewers and to offer better QoE.
