Paper title
Active Improvement of Control Policies with Bayesian Gaussian Mixture Model
Paper authors
Abstract
Learning from demonstration (LfD) is an intuitive framework that allows non-expert users to easily (re-)program robots. However, the quality and quantity of demonstrations strongly influence the generalization performance of LfD approaches. In this paper, we introduce a novel active learning framework to improve the generalization capabilities of control policies. The proposed approach is based on the epistemic uncertainties of Bayesian Gaussian mixture models (BGMMs). We determine the location of each new query point by optimizing a closed-form information-density cost based on the quadratic Rényi entropy. Furthermore, to better represent uncertain regions and to avoid local optima, we propose to approximate the active learning cost with a Gaussian mixture model (GMM). We demonstrate our active learning framework on a reaching task in a cluttered environment, with an illustrative toy example and a real experiment with a Panda robot.
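The abstract does not spell out the cost itself, but the reason a quadratic Rényi entropy yields a closed form for mixtures is standard: for a GMM with weights w_i, means mu_i, and covariances S_i, the integral of the squared density is a double sum of Gaussian evaluations, so H_2 = -log sum_ij w_i w_j N(mu_i | mu_j, S_i + S_j). The following is a minimal sketch of that identity only, not the paper's information-density cost; the function names `gaussian_pdf` and `quadratic_renyi_entropy` are hypothetical and chosen for illustration.

```python
import numpy as np


def gaussian_pdf(x, mean, cov):
    """Density of a multivariate normal N(x | mean, cov)."""
    d = mean.shape[0]
    diff = x - mean
    # Solve cov @ sol = diff instead of inverting cov explicitly.
    sol = np.linalg.solve(cov, diff)
    log_det = np.linalg.slogdet(cov)[1]
    log_pdf = -0.5 * (diff @ sol + d * np.log(2.0 * np.pi) + log_det)
    return float(np.exp(log_pdf))


def quadratic_renyi_entropy(weights, means, covs):
    """Closed-form quadratic Renyi entropy of a Gaussian mixture.

    weights: (K,), means: (K, d), covs: (K, d, d).
    Uses H_2 = -log sum_ij w_i w_j N(mu_i | mu_j, S_i + S_j).
    """
    total = 0.0
    for wi, mi, ci in zip(weights, means, covs):
        for wj, mj, cj in zip(weights, means, covs):
            total += wi * wj * gaussian_pdf(mi, mj, ci + cj)
    return float(-np.log(total))


# Illustrative values only (assumed, not from the paper).
weights = np.array([0.5, 0.5])
means = np.array([[0.0, 0.0], [3.0, 0.0]])
covs = np.array([np.eye(2), 2.0 * np.eye(2)])
entropy = quadratic_renyi_entropy(weights, means, covs)
```

A quick sanity check of the formula: for a single one-dimensional Gaussian with unit variance it reduces to 0.5 * log(4 * pi), and widening the covariance increases the entropy, which matches the intuition that more spread-out (less certain) distributions score higher.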