Paper title
Incentives for Responsiveness, Instrumental Control and Impact
Paper authors
Paper abstract
We introduce three concepts that describe an agent's incentives: response incentives indicate which variables in the environment, such as sensitive demographic information, affect the decision under the optimal policy. Instrumental control incentives indicate whether an agent's policy is chosen to manipulate part of its environment, such as the preferences or instructions of a user. Impact incentives indicate which variables an agent will affect, intentionally or otherwise. For each concept, we establish sound and complete graphical criteria, and discuss general classes of techniques that may be used to produce incentives for safe and fair agent behaviour. Finally, we outline how these notions may be generalised to multi-decision settings. This journal-length paper extends our conference publications "Incentives for Responsiveness, Instrumental Control and Impact" and "Agent Incentives: A Causal Perspective": the material on response incentives and instrumental control incentives is updated, while the work on impact incentives and multi-decision settings is entirely new.