Paper Title
Statistically Model Checking PCTL Specifications on Markov Decision Processes via Reinforcement Learning
Paper Authors
Paper Abstract
Probabilistic Computation Tree Logic (PCTL) is frequently used to formally specify control objectives such as probabilistic reachability and safety. In this work, we focus on statistically model checking PCTL specifications on Markov Decision Processes (MDPs) by sampling, e.g., checking whether there exists a feasible policy such that the probability of reaching certain goal states exceeds a threshold. We use reinforcement learning to search for such a feasible policy for PCTL specifications, and then develop a statistical model checking (SMC) method with provable guarantees on its error. Specifically, we first use upper-confidence-bound (UCB) based Q-learning to design an SMC algorithm for bounded-time PCTL specifications, and then extend this algorithm to unbounded-time specifications by identifying a proper truncation time, found by checking the PCTL specification and its negation simultaneously. Finally, we evaluate the proposed method on case studies.
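To make the core idea concrete, the following is a minimal sketch (not the authors' implementation) of UCB-based sampling to estimate the maximal probability of reaching a goal state and compare it against a threshold. The toy MDP, its transition probabilities, the threshold, and all parameter values below are illustrative assumptions; with a single decision state and a one-step horizon, UCB Q-learning reduces to a UCB bandit over actions.

```python
import math
import random

random.seed(0)

# Hypothetical toy MDP: state 0 = start, state 1 = trap, state 2 = goal.
# Action 0 reaches the goal w.p. 0.8; action 1 reaches it w.p. 0.3.
GOAL, TRAP = 2, 1
ACTIONS = [0, 1]
THRESHOLD = 0.5      # query: does some policy reach the goal w.p. > 0.5?
EPISODES = 5000
C = 1.0              # UCB exploration constant (assumed value)

def step(state, action):
    """Sample the next state under the (assumed) transition kernel."""
    p_goal = 0.8 if action == 0 else 0.3
    return GOAL if random.random() < p_goal else TRAP

# Q[(s, a)] estimates the probability of reaching the goal from s under a.
Q = {(0, a): 0.0 for a in ACTIONS}
counts = {(0, a): 0 for a in ACTIONS}

for ep in range(1, EPISODES + 1):
    s = 0
    # UCB action selection: empirical estimate plus an exploration bonus.
    def ucb(a):
        n = counts[(s, a)]
        if n == 0:
            return float("inf")  # try each action at least once
        return Q[(s, a)] + C * math.sqrt(2 * math.log(ep) / n)

    a = max(ACTIONS, key=ucb)
    counts[(s, a)] += 1
    s2 = step(s, a)
    reward = 1.0 if s2 == GOAL else 0.0   # reward 1 iff the goal is reached
    # Sample-average update: Q converges to the empirical reach probability.
    Q[(s, a)] += (reward - Q[(s, a)]) / counts[(s, a)]

best = max(ACTIONS, key=lambda a: Q[(0, a)])
feasible = Q[(0, best)] > THRESHOLD
print(best, round(Q[(0, best)], 2), feasible)
```

Under these assumptions, UCB concentrates samples on the better action, the estimate for action 0 approaches 0.8, and the feasibility check returns True; the paper's method additionally bounds the error of this statistical check, which the sketch does not attempt.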