Paper title
Leveraging Multiple Environments for Learning and Decision Making: A Dismantling Use Case
Paper authors
Paper abstract
Learning is usually performed by observing real robot executions. Physics-based simulators are a good alternative: they provide highly valuable information while avoiding costly and potentially destructive robot executions. We present a novel approach for learning the probabilities of symbolic robot action outcomes by leveraging different environments, such as physics-based simulators, at execution time. To this end, we propose MENID (Multiple Environment Noise Indeterministic Deictic) rules, a novel representation able to cope with the uncertainties inherent in robotic tasks. MENID rules explicitly represent each possible outcome of an action, keep a record of the source of each experience, and maintain the probability of success of each outcome. We also introduce an algorithm that distributes actions among environments based on previous experiences and expected gain. Before using physics-based simulations, we propose a methodology for evaluating different simulation settings and determining the least time-consuming model that still produces coherent results. We demonstrate the validity of the approach in a dismantling use case, using a reduced-quality simulation as the simulated system and a full-resolution simulation, with noise added to the trajectories and to some physical parameters, as a stand-in for the real system.
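As a rough illustration of the bookkeeping the abstract describes (outcomes per action, the environment each experience came from, and a probability per outcome), a minimal sketch might look like the following. The class name `RuleSketch`, the `trust` weights, and the outcome and environment labels are all hypothetical and chosen only for illustration; this is not the paper's definition of MENID rules nor its action-distribution algorithm.

```python
from collections import defaultdict


class RuleSketch:
    """Hypothetical stand-in for a MENID-style rule (not the paper's definition)."""

    def __init__(self, action):
        self.action = action
        # counts[outcome][environment] -> number of times that outcome was observed there
        self.counts = defaultdict(lambda: defaultdict(int))

    def record(self, outcome, environment):
        """Store one experience together with the environment it came from."""
        self.counts[outcome][environment] += 1

    def probability(self, outcome, trust=None):
        """Trust-weighted relative frequency of `outcome` (assumed estimator).

        `trust` maps an environment name to a weight; missing entries default to 1.0.
        """
        trust = trust or {}

        def weighted(o):
            per_env = self.counts.get(o, {})
            return sum(n * trust.get(env, 1.0) for env, n in per_env.items())

        total = sum(weighted(o) for o in self.counts)
        return weighted(outcome) / total if total else 0.0


# Usage: three successes and one failure in simulation, one success on the real system.
rule = RuleSketch("remove_lever")
for _ in range(3):
    rule.record("removed", "simulation")
rule.record("stuck", "simulation")
rule.record("removed", "real")

print(rule.probability("removed"))
print(rule.probability("removed", trust={"simulation": 0.5}))
```

Keeping the counts separated by environment is what allows simulated experience to be discounted or re-weighted later without discarding it; the particular weighting used here is an assumption.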