Authors
- Fricker, Nicolai Benjamin
- Krüger, Nicolai
- Schubart, Constantin
Abstract
Learning a task through behavior cloning requires a dataset of state-action pairs. However, this kind of data is often not available in sufficient quantity or quality. Consequently, several publications have addressed the problem of inferring actions from a sequence of states, thereby converting the sequence into corresponding state-action pairs (Torabi et al., 2018; Edwards et al., 2019; Baker et al., 2022; Bruce et al., 2024). Using the resulting dataset, an agent can then be trained via behavior cloning. For instance, this approach was applied to games such as Cartpole and Mountain Car (Edwards et al., 2019). Additionally, actions were extracted from videos of Minecraft (Baker et al., 2022) and jump 'n' run games (Edwards et al., 2019; Bruce et al., 2024) to train deep neural network models to play these games. In this work, videos from YouTube as well as synthetic videos of the game Sokoban were analyzed. Sokoban is a single-player, turn-based game in which the player has to push boxes onto target squares (Murase et al., 1996). The actions that a user performs in the videos were extracted using a modified version of the training procedure described by Edwards et al. (2019). The resulting state-action pairs were used to train deep neural network models to play Sokoban. These models were further improved with reinforcement learning, using Monte Carlo tree search as a planning step. The resulting agent demonstrated moderate playing strength. In addition to learning how to solve a Sokoban puzzle, the rules of Sokoban were learned from videos. This enabled the creation of a Sokoban simulator, which was used to carry out model-based reinforcement learning. This work serves as a proof of concept, demonstrating that it is possible to extract actions from videos of a strategy game, perform behavior cloning, infer the rules of the game, and perform model-based reinforcement learning, all without direct interaction with the game environment.
Code and models are available at https://github.com/loanMaster/sokoban_learning.
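To make the terms in the abstract concrete, the following is a minimal sketch of the Sokoban rules (the player moves one square at a time and can push a single box onto a free square) and of what a list of state-action pairs looks like. It is not the authors' code and not the latent-action procedure of Edwards et al. (2019). The tile encoding, the action names, and the functions `step`, `is_solved`, and `label_transitions` are assumptions made for illustration only. In particular, `label_transitions` relies on a hand-written simulator, whereas the paper infers both the actions and the rules from video.

```python
from typing import FrozenSet, List, Optional, Tuple

# Hypothetical tile encoding, chosen for this sketch only.
WALL, FLOOR, TARGET = "#", " ", "."

# Hypothetical action names mapped to (row, column) offsets.
ACTIONS = {"up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1)}

Pos = Tuple[int, int]
State = Tuple[Pos, FrozenSet[Pos]]  # (player position, box positions)


def step(grid: List[str], state: State, action: str) -> State:
    """Apply one move; return the state unchanged if the move is illegal.

    Assumes the grid is fully enclosed by walls, so indices stay in range.
    """
    (r, c), boxes = state
    dr, dc = ACTIONS[action]
    nr, nc = r + dr, c + dc
    if grid[nr][nc] == WALL:
        return state
    if (nr, nc) in boxes:
        # A box can only be pushed onto a square that is neither wall nor box.
        br, bc = nr + dr, nc + dc
        if grid[br][bc] == WALL or (br, bc) in boxes:
            return state
        boxes = (boxes - {(nr, nc)}) | {(br, bc)}
    return (nr, nc), boxes


def is_solved(grid: List[str], state: State) -> bool:
    """A puzzle is solved when every box rests on a target square."""
    _, boxes = state
    return all(grid[r][c] == TARGET for r, c in boxes)


def label_transitions(
    grid: List[str], states: List[State]
) -> List[Tuple[State, Optional[str]]]:
    """Turn a sequence of states into (state, action) pairs.

    For each consecutive pair, pick the first action whose simulated result
    matches the next state, or None if no action explains the transition.
    Unchanged states are ambiguous: any blocked move reproduces them.
    """
    pairs = []
    for s, s_next in zip(states, states[1:]):
        action = next((a for a in ACTIONS if step(grid, s, a) == s_next), None)
        pairs.append((s, action))
    return pairs
```

Given a wall-enclosed grid as a list of strings and a list of observed states, `label_transitions(grid, states)` returns the kind of dataset that behavior cloning consumes, under the assumptions stated above.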
Suggested Citation
Fricker, Nicolai Benjamin & Krüger, Nicolai & Schubart, Constantin, 2024.
"Learning to play Sokoban from videos,"
IU Discussion Papers - IT & Engineering
3 (October 2024), IU International University of Applied Sciences.
Handle:
RePEc:zbw:iubhit:303523