Non-Markovian Reinforcement Learning (RL) tasks are extremely hard to solve because intelligent
agents must consider the entire history of state-action pairs to act rationally in the environment. Most
works use Linear Temporal Logic (LTL) to specify temporally extended tasks. This approach applies
only in finite, discrete-state environments, or in continuous problems for which a mapping between
the continuous state and a symbolic interpretation, known as a symbol grounding function, is available. In this
work, we define Visual Reward Machines (VRMs), an automata-based neurosymbolic framework that can
be used for both reasoning and learning in non-symbolic non-Markovian RL domains. A VRM is a fully
neural but interpretable system based on the probabilistic relaxation of Moore Machines. Results
show that VRMs can exploit ungrounded symbolic temporal knowledge to outperform baseline methods
based on RNNs in non-Markovian RL tasks.
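
To make the idea of a probabilistic relaxation of a Moore machine concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the authors' implementation: the function names `step` and `expected_output`, the two-state machine, and all probabilities are hypothetical. The machine keeps a probability distribution over its states (a belief) instead of a single state, and reads a probability distribution over symbols (such as a learned grounding of an image might produce) instead of a single symbol.

```python
from typing import List

Vector = List[float]


def step(belief: Vector, symbol_probs: Vector,
         transition: List[List[Vector]]) -> Vector:
    """One soft transition of a probabilistic Moore machine.

    transition[a][i][j] is the probability of moving from state i to
    state j when symbol a is read (hypothetical layout for this sketch).
    """
    n_states = len(belief)
    next_belief = [0.0] * n_states
    for a, p_a in enumerate(symbol_probs):
        for i, b_i in enumerate(belief):
            for j in range(n_states):
                next_belief[j] += p_a * b_i * transition[a][i][j]
    return next_belief


def expected_output(belief: Vector, outputs: Vector) -> float:
    """Moore-style output: depends only on the (soft) current state."""
    return sum(b * r for b, r in zip(belief, outputs))


# Hypothetical two-state machine: state 0 = "goal not yet seen",
# state 1 = "goal seen". Symbol 0 = "goal visible", symbol 1 = "nothing".
TRANSITION = [
    [[0.0, 1.0], [0.0, 1.0]],  # symbol 0: always move to state 1
    [[1.0, 0.0], [0.0, 1.0]],  # symbol 1: stay in the current state
]
OUTPUTS = [0.0, 1.0]  # assumed reward attached to each state

belief = [1.0, 0.0]                            # start in state 0
belief = step(belief, [0.8, 0.2], TRANSITION)  # uncertain symbol reading
reward = expected_output(belief, OUTPUTS)
```

With one-hot beliefs and one-hot symbol readings, `step` reduces to the transition function of an ordinary deterministic Moore machine, which is what keeps a relaxed machine of this kind readable as an automaton.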
Publication details
2023, 17th International Workshop on Neural-Symbolic Learning and Reasoning
Visual reward machines (04b Conference paper in volume)
Umili Elena, Argenziano Francesco, Barbin Aymeric, Capobianco Roberto
Research group: Artificial Intelligence and Robotics
keywords