Multi-Armed Bandits for Intelligent Tutoring Systems

11 October 2013

Abstract

We present an approach to Intelligent Tutoring Systems which adaptively personalizes sequences of learning activities to maximize the skills acquired by each individual student, taking into account limited time and motivational resources. At a given point in time, the system tries to propose to the student the activity that yields the most progress, hence the name of the approach: the "Right Activity at the Right Time" (RiARiT). The system is based on the combination of three approaches. First, it leverages recent models of intrinsically motivated learning by transposing them to active teaching, relying on empirical estimates of the learning progress provided by specific activities to particular students. Second, it uses state-of-the-art Multi-Armed Bandit (MAB) techniques to efficiently manage the exploration/exploitation challenge of this optimization process. Third, it leverages expert knowledge to constrain and bootstrap the initial exploration of the MAB, while requiring only coarse guidance information from the expert and allowing the system to deal with didactic gaps in its knowledge. The system is evaluated in a scenario where 7-8 year old schoolchildren learn how to decompose numbers while manipulating money. Systematic experiments are presented with simulated students, followed by results of a pilot study across a population of 100 schoolchildren.
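
To make the three ingredients above concrete, here is a minimal sketch of a bandit that rewards activities by an empirical estimate of learning progress. It is not the algorithm from the paper: the class name ActivityBandit, the parameters beta, eta, gamma and window, the weight-update rule, and the windowed progress estimate are all assumptions chosen for illustration. Expert knowledge is represented only as optional coarse initial weights.

```python
import random


class ActivityBandit:
    """Illustrative bandit over learning activities (hypothetical, not the paper's algorithm)."""

    def __init__(self, activities, initial_weights=None,
                 beta=0.8, eta=0.2, gamma=0.1, window=4, seed=None):
        self.activities = list(activities)
        # Assumption: the expert supplies coarse starting weights; default is 1.0.
        prior = initial_weights or {}
        self.weights = {a: float(prior.get(a, 1.0)) for a in self.activities}
        self.history = {a: [] for a in self.activities}
        self.beta = beta      # how much of the old weight is kept
        self.eta = eta        # how strongly new progress is added
        self.gamma = gamma    # share of probability mass spent on uniform exploration
        self.window = window  # number of recent outcomes used to estimate progress
        self.rng = random.Random(seed)

    def probabilities(self):
        """Mix normalized weights (exploitation) with a uniform draw (exploration)."""
        k = len(self.activities)
        total = sum(self.weights.values())
        if total <= 0:
            return {a: 1.0 / k for a in self.activities}
        return {a: (1 - self.gamma) * w / total + self.gamma / k
                for a, w in self.weights.items()}

    def select(self):
        """Sample the next activity to propose to the student."""
        probs = self.probabilities()
        return self.rng.choices(self.activities,
                                weights=[probs[a] for a in self.activities])[0]

    def update(self, activity, success):
        """Record an outcome and reward the activity by its recent progress."""
        outcomes = self.history[activity]
        outcomes.append(1.0 if success else 0.0)
        recent = outcomes[-self.window:]
        half = len(recent) // 2
        if half == 0:
            progress = 0.0
        else:
            # Progress = success rate in the newer half minus the older half.
            progress = sum(recent[-half:]) / half - sum(recent[:half]) / half
        reward = max(0.0, progress)  # only improvement counts as reward
        self.weights[activity] = self.beta * self.weights[activity] + self.eta * reward
        return reward
```

In this sketch an activity the student already masters (constant success) produces no progress, so its weight decays and it is proposed less often, while an activity on which the student is currently improving keeps a higher weight. The gamma term keeps every activity reachable, which is one simple way to handle the exploration/exploitation trade-off.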
