Semi-supervised Learning From Demonstration Through Program Synthesis: An Inspection Robot Case Study

Paper title

Semi-supervised Learning From Demonstration Through Program Synthesis: An Inspection Robot Case Study

Authors

Smith, Simón C.; Ramamoorthy, Subramanian

Abstract

Semi-supervised learning improves the performance of supervised machine learning by leveraging methods from unsupervised learning to extract information not explicitly available in the labels. Through the design of a system that enables a robot to learn inspection strategies from a human operator, we present a hybrid semi-supervised system capable of learning interpretable and verifiable models from demonstrations. The system induces a controller program by learning from immersive demonstrations using sequential importance sampling. These visual servo controllers are parametrised by proportional gains and are visually verifiable through observation of the position of the robot in the environment. Clustering and effective particle size filtering allow the system to discover goals in the state space. These goals are used to label the original demonstration for end-to-end learning of behavioural models. The behavioural models are used for autonomous model predictive control and scrutinised for explanations. We implement causal sensitivity analysis to identify salient objects and generate counterfactual conditional explanations. These features enable interpretation of decision making and post hoc discovery of the causes of a failure. The proposed system expands on previous approaches to program synthesis by incorporating repellers in the attribution prior of the sampling process. We successfully learn the hybrid system from an inspection scenario where an unmanned ground vehicle has to inspect different areas of the environment in a specific order. The system induces an interpretable computer program of the demonstration that can be synthesised to produce novel inspection behaviours. Importantly, the robot successfully runs the synthesised program on an unseen configuration of the environment while presenting explanations of its autonomous behaviour.
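The abstract's idea of inferring a proportional controller from a demonstration by sequential importance sampling can be illustrated with a toy example. The sketch below is not the authors' implementation: it assumes a one-dimensional robot, a noise-free demonstration, a uniform prior, a Gaussian likelihood, and no resampling step. All names (`simulate_demo`, `infer_controller`, `effective_sample_size`) and all numeric settings are hypothetical. It also reports the effective sample size, 1 / sum(w_i^2) for normalised weights, a standard measure of how many particles carry meaningful weight.

```python
import math
import random


def simulate_demo(goal, gain, x0=0.0, dt=0.1, steps=50):
    """Generate (position, velocity) pairs from v = gain * (goal - x)."""
    demo, x = [], x0
    for _ in range(steps):
        v = gain * (goal - x)
        demo.append((x, v))
        x += v * dt
    return demo


def effective_sample_size(weights):
    """ESS = 1 / sum(w_i^2), for weights that already sum to one."""
    return 1.0 / sum(w * w for w in weights)


def infer_controller(demo, n_particles=5000, noise=0.2, seed=0):
    """Weight (goal, gain) particles step by step against the demonstration."""
    rng = random.Random(seed)
    # Assumed uniform prior over goal position and proportional gain.
    particles = [(rng.uniform(0.0, 4.0), rng.uniform(0.1, 1.0))
                 for _ in range(n_particles)]
    log_w = [0.0] * n_particles
    # Sequential update: each observed step multiplies in one likelihood term.
    for x, v in demo:
        for i, (goal, gain) in enumerate(particles):
            err = v - gain * (goal - x)
            log_w[i] += -0.5 * (err / noise) ** 2
    # Normalise in log space to avoid overflow.
    top = max(log_w)
    w = [math.exp(lw - top) for lw in log_w]
    total = sum(w)
    w = [wi / total for wi in w]
    goal_est = sum(wi * p[0] for wi, p in zip(w, particles))
    gain_est = sum(wi * p[1] for wi, p in zip(w, particles))
    return goal_est, gain_est, effective_sample_size(w)


demo = simulate_demo(goal=2.0, gain=0.5)
goal_est, gain_est, ess = infer_controller(demo)
print(f"goal ~ {goal_est:.2f}, gain ~ {gain_est:.2f}, ESS ~ {ess:.0f}")
```

A low effective sample size relative to the number of particles indicates that the weights have concentrated on a few hypotheses; how the paper uses this quantity to filter candidate goals is not specified in the abstract.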

