Paper title
Counterfactual Programming for Optimal Control
Paper authors
Abstract
In recent years, considerable work has been done to tackle the issue of designing control laws based on observations to allow unknown dynamical systems to perform pre-specified tasks. At least as important for autonomy, however, is the issue of learning which tasks can be performed in the first place. This is particularly critical in situations where multiple (possibly conflicting) tasks and requirements are demanded from the agent, resulting in infeasible specifications. Such situations arise due to over-specification or dynamic operating conditions and are only aggravated when the dynamical system model is learned through simulations. Often, these issues are tackled using regularization and penalties tuned based on application-specific expert knowledge. Nevertheless, this solution becomes impractical for large-scale systems, unknown operating conditions, and/or in online settings where expert input would be needed during the system operation. Instead, this work enables agents to autonomously pose, tune, and solve optimal control problems by compromising between performance and specification costs. Leveraging duality theory, it puts forward a counterfactual optimization algorithm that directly determines the specification trade-off while solving the optimal control problem.
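The abstract does not spell out the algorithm, so the following is only a toy sketch of the general idea of trading performance against specification costs through duality, not the paper's method. It assumes a scalar decision variable `x` with cost `x**2`, two deliberately conflicting specifications (`x >= 1` and `x <= -1`), slack variables `s1`, `s2` that relax them, and a quadratic relaxation cost with hypothetical weights `c1`, `c2`. Under these assumptions, projected dual ascent finds the multipliers, and the slacks follow from the stationarity condition `c_i * s_i = lam_i`, so each specification is relaxed until its marginal relaxation cost matches its dual variable.

```python
def relaxed_dual_ascent(weights=(1.0, 4.0), step=0.5, iters=500):
    """Toy problem (all modelling choices here are illustrative assumptions):

        minimize   x**2 + (c1/2)*s1**2 + (c2/2)*s2**2
        subject to 1 - x <= s1     (relaxed version of x >= 1)
                   1 + x <= s2     (relaxed version of x <= -1)

    The unrelaxed specifications are infeasible together; the slacks s1, s2
    measure how much each one is violated.
    """
    c1, c2 = weights
    lam1 = lam2 = 0.0  # dual variables, one per specification

    def primal(l1, l2):
        # Minimizers of the Lagrangian for fixed multipliers:
        #   d/dx : 2x - l1 + l2 = 0   ->  x  = (l1 - l2) / 2
        #   d/ds : c_i * s_i - l_i = 0 ->  s_i = l_i / c_i
        return (l1 - l2) / 2.0, l1 / c1, l2 / c2

    for _ in range(iters):
        x, s1, s2 = primal(lam1, lam2)
        # Projected gradient ascent on the dual: move along the constraint
        # violation and clip at zero to keep the multipliers nonnegative.
        lam1 = max(0.0, lam1 + step * ((1.0 - x) - s1))
        lam2 = max(0.0, lam2 + step * ((1.0 + x) - s2))

    x, s1, s2 = primal(lam1, lam2)
    return x, (s1, s2), (lam1, lam2)


if __name__ == "__main__":
    x, slacks, duals = relaxed_dual_ascent()
    print("x =", x, "slacks =", slacks, "duals =", duals)
```

With the assumed weights `(1.0, 4.0)`, violating the second specification is four times as costly, so the solution settles closer to satisfying it: the first specification ends up with the larger slack. Changing the weights shifts this compromise without any manual tuning of the slacks themselves.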