Paper Title
A Probabilistic Approach for Data Management in Pervasive Computing Applications
Paper Authors
Paper Abstract
Current advances in Pervasive Computing (PC) involve the adoption of the huge infrastructures of the Internet of Things (IoT) and Edge Computing (EC). Both IoT and EC can support innovative applications around end users to facilitate their activities. Such applications are built upon the collected data and the appropriate processing demanded in the form of requests. To limit latency, instead of relying on the Cloud for data storage and processing, the research community provides a number of models for data management at the EC. Requests, usually defined in the form of tasks or queries, demand the processing of specific data. A model that pre-processes the data, preparing them and detecting their statistics before requests arrive, is therefore necessary. In this paper, we propose a promising and easy-to-implement scheme for selecting the appropriate host of incoming data based on a probabilistic approach. Our aim is to store similar data in the same distributed datasets so that we have, beforehand, knowledge of their statistics while keeping their solidity at high levels. As solidity, we consider the limited statistical deviation of the data; thus, we can support the storage of highly correlated data in the same dataset. Additionally, we propose an aggregation mechanism for outlier detection applied just after the arrival of data. Outliers are transferred to the Cloud for further processing. When data are accepted for local storage, we propose a model for selecting the appropriate datasets where they will be replicated to build a fault-tolerant system. We analytically describe our model and evaluate it through extensive simulations, presenting its pros and cons.
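For illustration only, the following Python sketch shows one way the idea described in the abstract could look in practice; it is not the paper's actual model, and all function names, the softmax-style weighting, and the z-score threshold are assumptions made here. Incoming values are probabilistically routed toward the dataset whose statistics they deviate from the least (preserving solidity), and values that deviate too strongly from the chosen dataset are treated as outliers and forwarded to the Cloud.

```python
import math
import random

def solidity_deviation(value, dataset):
    """Absolute deviation of an incoming value from the dataset mean."""
    mean = sum(dataset) / len(dataset)
    return abs(value - mean)

def select_host(value, datasets, temperature=1.0):
    """Probabilistically pick a dataset index, favouring datasets whose
    statistics the incoming value deviates from the least (solidity)."""
    deviations = [solidity_deviation(value, ds) for ds in datasets]
    weights = [math.exp(-d / temperature) for d in deviations]
    return random.choices(range(len(datasets)), weights=weights, k=1)[0]

def is_outlier(value, dataset, z_threshold=3.0):
    """Flag values deviating too strongly from the dataset statistics;
    such values would be offloaded to the Cloud for further processing."""
    n = len(dataset)
    mean = sum(dataset) / n
    std = math.sqrt(sum((x - mean) ** 2 for x in dataset) / n)
    return abs(value - mean) / (std or 1e-9) > z_threshold

# Example: three local datasets held at an edge node (hypothetical data).
datasets = [[1.0, 1.2, 0.9], [10.0, 10.5, 9.8], [100.0, 101.0, 99.5]]
incoming = 10.2

idx = select_host(incoming, datasets)
if is_outlier(incoming, datasets[idx]):
    print("outlier -> forward to the Cloud")
else:
    datasets[idx].append(incoming)
    print(f"value stored in dataset {idx}")
```

In this sketch the replication step for fault tolerance is omitted; under the abstract's description, an accepted value would additionally be copied to one or more other suitable datasets.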