Paper Title

Network Report: A Structured Description for Network Datasets

Authors

Xinyi Zheng, Ryan A. Rossi, Nesreen Ahmed, Dominik Moritz

Abstract

The rapid development of network science and technologies depends on shareable datasets. Currently, there is no standard practice for reporting and sharing network datasets. Some network dataset providers only share links, while others provide some context or basic statistics. As a result, critical information may be unintentionally dropped, and network dataset consumers may misunderstand or overlook critical aspects. Using a network dataset inappropriately can lead to severe consequences (e.g., discrimination), especially when machine learning models on networks are deployed in high-stakes domains. Challenges arise because networks are used across different domains (e.g., network science, physics, etc.) and have complex structures. To facilitate communication between network dataset providers and consumers, we propose the network report. A network report is a structured description that summarizes and contextualizes a network dataset. The network report extends the idea of dataset reports from prior work (e.g., Datasheets for Datasets) with network-specific descriptions of the non-i.i.d. nature, demographic information, network characteristics, etc. We hope network reports encourage transparency and accountability in network research and development across different fields.
