Paper Title
OBDA for the Web: Creating Virtual RDF Graphs On Top of Web Data Sources
Paper Authors
Abstract
Due to Variety, Web data come in many different structures and formats, with HTML tables and REST APIs (e.g., social media APIs) being among the most popular ones. A big subset of Web data is also characterised by Velocity, as data gets frequently updated so that consumers can obtain the most up-to-date version of the respective datasets. At the moment, though, these data sources are not effectively supported by Semantic Web tools. To address variety and velocity, we propose Ontop4theWeb, a system that maps Web data of various formats into virtual RDF triples, thus allowing for querying them on-the-fly without materializing them as RDF. We demonstrate how Ontop4theWeb can use SPARQL to uniformly query popular, but heterogeneous Web data sources, like HTML tables and Web APIs. We showcase our approach in a number of use cases, such as Twitter, Foursquare, Yelp and HTML tables. We carried out a thorough experimental evaluation which verifies the high efficiency of our framework, which goes beyond the current state-of-the-art in this area, in terms of both functionality and performance.