Paper Summary: Solving K-MDPs
Abstract: Markov Decision Processes (MDPs) are employed to model sequential decision-making problems under uncertainty. Traditionally, algorithms to solve MDPs have focused on solving large state or action spaces. With increasing applications of MDPs to human-operated domains such as conservation of biodiversity and health, developing easy-to-interpret solutions is of paramount importance ...
Paper Summary: Differentiable Logic Machines
Abstract: The integration of reasoning, learning, and decision-making is key to build more general artificial intelligence systems. As a step in this direction, we propose a novel neural-logic architecture, called differentiable logic machine (DLM), that can solve both inductive logic programming (ILP) and reinforcement learning (RL) problems, where the solution can be interpreted as a f...
Paper Summary: Relational Deep Reinforcement Learning
Abstract: We introduce an approach for deep reinforcement learning (RL) that improves upon the efficiency, generalization capacity, and interpretability of conventional approaches through structured perception and relational reasoning. It uses self-attention to iteratively reason about the relations between entities in a scene and to guide a model-free policy. Our results show that in a ...
Paper Summary: Causal Imitation Learning via Unobserved Confounders
Abstract: One of the common ways children learn is by mimicking adults. Imitation learning focuses on learning policies with suitable performance from demonstrations generated by an expert, with an unspecified performance measure, and unobserved reward signal. Popular methods for imitation learning start by either directly mimicking the behavior policy of an expert (behavior cloning) or ...
Paper Summary: Fitted Q-Learning for Relational Domains
Abstract: We consider the problem of Approximate Dynamic Programming in relational domains. Inspired by the success of fitted Q-learning methods in propositional settings, we develop the first relational fitted Q-learning algorithms by representing the value function and Bellman residuals. When we fit the Q-functions, we show how the two steps of the Bellman operator, application and project...