Monte Carlo Tree Search for Verifying Reachability in Markov Decision Processes

Logo poskytovatele

Varování

Publikace nespadá pod Filozofickou fakultu, ale pod Fakultu informatiky. Oficiální stránka publikace je na webu muni.cz.
Autoři

ASHOK Pranav BRÁZDIL Tomáš KŘETÍNSKÝ Jan SLÁMEČKA Ondřej

Rok publikování 2018
Druh Článek ve sborníku
Konference Leveraging Applications of Formal Methods, Verification and Validation (ISoLA 2018)
Fakulta / Pracoviště MU

Fakulta informatiky

Citace
Doi http://dx.doi.org/10.1007/978-3-030-03421-4_21
Klíčová slova Monte Carlo Tree Search; Reachability; Markov Decision Processes
Popis The maximum reachability probabilities in a Markov decision process can be computed using value iteration (VI). Recently, simulation-based heuristic extensions of VI have been introduced, such as bounded real-time dynamic programming (BRTDP), which often manage to avoid explicit analysis of the whole state space while preserving guarantees on the computed result. In this paper, we introduce a new class of such heuristics, based on Monte Carlo tree search (MCTS), a technique celebrated in various machine-learning settings. We provide a spectrum of algorithms ranging from MCTS to BRTDP. We evaluate these techniques and show that for larger examples, where VI is no more applicable, our techniques are more broadly applicable than BRTDP with only a minor additional overhead.
Související projekty:

Používáte starou verzi internetového prohlížeče. Doporučujeme aktualizovat Váš prohlížeč na nejnovější verzi.