Efficient robotic behaviors require robustness and adaptation to dynamic environments whose characteristics vary rapidly during robot operation. Planning and learning techniques have shown the most promising results for generating effective robot action policies; however, taken individually, each has limitations. Planning techniques do not generalize across similar states and require experts to define behavioral routines at different levels of abstraction. Conversely, learning methods usually require a considerable number of training samples and algorithm iterations. To overcome these issues and efficiently generate robot behaviors, we introduce LOOP, an iterative learning algorithm for optimistic planning that combines state-of-the-art planning and learning techniques to generate action policies. The main contribution of LOOP is the combination of Monte-Carlo Search Planning and Q-learning, which enables focused exploration during policy refinement in different robotic applications. We demonstrate the robustness and flexibility of LOOP across multiple domains and robotic platforms through an extensive experimental evaluation.
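The abstract gives only a high-level description of the method, and the paper itself should be consulted for LOOP's actual algorithm. As a rough, hypothetical sketch of the underlying idea — combining Monte-Carlo rollouts with Q-learning so that learned value estimates focus exploration — one might write the following. The toy chain environment, function names, and hyperparameters are illustrative assumptions, not the paper's implementation:

```python
import random
from collections import defaultdict

# Toy deterministic chain MDP: states 0..N, action 0 moves left, 1 moves right.
# Reward 1.0 on reaching the goal state N; episodes start at state 0.
N = 5
ACTIONS = (0, 1)

def step(state, action):
    nxt = max(0, state - 1) if action == 0 else min(N, state + 1)
    reward = 1.0 if nxt == N else 0.0
    return nxt, reward, nxt == N

def greedy(Q, state):
    # Pick the action with the highest current Q estimate.
    return max(ACTIONS, key=lambda a: Q[(state, a)])

def train(episodes=500, alpha=0.5, gamma=0.95, epsilon=0.1, max_steps=200):
    # Optimistic initialization: untried actions look valuable, which
    # drives the greedy policy to explore them (the "optimism" here is
    # an illustrative stand-in for the optimistic planning of the paper).
    Q = defaultdict(lambda: 1.0)
    for _ in range(episodes):
        state, done, steps = 0, False, 0
        while not done and steps < max_steps:
            action = (random.choice(ACTIONS)
                      if random.random() < epsilon else greedy(Q, state))
            nxt, reward, done = step(state, action)
            # Standard Q-learning update; no bootstrap at terminal states.
            target = reward if done else (
                reward + gamma * max(Q[(nxt, a)] for a in ACTIONS))
            Q[(state, action)] += alpha * (target - Q[(state, action)])
            state, steps = nxt, steps + 1
    return Q

def rollout(Q, state, max_steps=50):
    # Monte-Carlo-style rollout whose action choices are biased by the
    # learned Q estimates; returns the accumulated reward.
    total, done, steps = 0.0, False, 0
    while not done and steps < max_steps:
        state, reward, done = step(state, greedy(Q, state))
        total += reward
        steps += 1
    return total

# Example: learn on the toy chain, then evaluate a greedy rollout.
random.seed(0)
Q = train()
print(rollout(Q, 0))  # expected to reach the goal, i.e. total reward 1.0
```

In this sketch the Q-table plays the role of a learned prior for the rollout policy: optimistically initialized values pull rollouts toward unexplored actions, while repeated Q-learning updates refine the policy between planning episodes.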
Publication details
2021, ROBOTICS AND AUTONOMOUS SYSTEMS, vol. 136, p. 103693
LOOP: Iterative learning for optimistic planning on robots (01a Journal article)
Riccio F., Capobianco R., Nardi D.
Research group: Artificial Intelligence and Robotics