CIRCLES, A PROPOSED ENVIRONMENT FOR RL APPLICATIONS
WHERE DO CIRCLES COME FROM?
A few years ago I decided to design and program Circles 1.0, a computer
game for Windows in which circles of different sizes and velocities
bounced around a square 2D field, and a little creature (the player)
had to capture a little power package located within the field without
bumping into any circle. The circles never stopped bouncing or changed
their velocity. When the player reached the package, his power and his
score (both always displayed on the screen) each increased by one unit;
when a circle reached the package first, the power decreased by one.
Either way, the package then reappeared at another random location in
the field. When the power reached zero or when the player bumped into a
circle, the player lost one of his three lives. So the game turned into
a race between environment and player for the package. Every 15 points
(score % 15 == 0) the lost lives were automatically restored (number of
lives := 3). The player's moves were up, down, left, and right.
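The scoring and life rules above can be sketched in C roughly as follows. The original Circles 1.0 source was lost, so all names and structures here are illustrative reconstructions, not the game's actual code:

```c
#define MAX_LIVES 3

/* Hypothetical player state reconstructed from the description
   above; the original source no longer exists. */
typedef struct {
    int power;   /* always displayed on screen */
    int score;   /* always displayed on screen */
    int lives;   /* starts at 3 */
} PlayerState;

/* The player reaches the package first: power and score each
   increase by one; every 15 points the lost lives are restored. */
void on_player_capture(PlayerState *p) {
    p->power++;
    p->score++;
    if (p->score % 15 == 0)
        p->lives = MAX_LIVES;
}

/* A circle reaches the package first: power decreases by one. */
void on_circle_capture(PlayerState *p) {
    p->power--;
}

/* Bumping into a circle, or power reaching zero, costs one life. */
void check_life_loss(PlayerState *p, int bumped_circle) {
    if (bumped_circle || p->power <= 0)
        p->lives--;
}
```

In both capture cases the package would then be respawned at a random location in the field.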
[Fig 1.] Fighting to survive (power = 1!)
[Fig 2.] Captured by a circle
Circles 1.0's source code in C was lost in a PC crash. The exe file is no
longer very interesting because I naively made screen updating dependent
on the PC's speed, so today it is too difficult to play unless you have a
slow computer (a PII at 266 MHz or so). However, here is the
DOS exe file circles_1.0.zip (55 KB).
WHAT CAN I USE THEM FOR?
I was wondering how good it would be to replace the human player with an
RL agent in this game. Although the environment is not stochastic, it is
difficult for a human (at least for my friends and everyone else who has
played Circles) to maintain a "good" behavior without much experience.
Many strategies emerge, but nobody has been able to describe an optimal
algorithm, so reinforcement learning could help us find one. The state
space here is (supposed to be) continuous, which should catch the
attention of those who like generalization applied to RL.
WHAT I HAVE DONE...
I was able to rebuild the environment. Here you can obtain a copy of the non-graphic simulation using a trivial agent (it neither knows nor learns anything). Just modify the random-agent.h file to produce something better.
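As an illustration, the core of such a random agent could look like the sketch below. Note that the actual contents and interface of random-agent.h may differ; the type and function names here are assumptions:

```c
#include <stdlib.h>

/* Sketch of what a random agent like the one in random-agent.h
   might do; the real file's interface may differ. */
typedef enum { MOVE_UP, MOVE_DOWN, MOVE_LEFT, MOVE_RIGHT } Action;

/* The agent ignores the observation entirely: it neither knows
   nor learns anything, and just picks a uniformly random move. */
static Action agent_choose_action(const double *observation, int obs_len) {
    (void)observation;
    (void)obs_len;
    return (Action)(rand() % 4);
}
```

Replacing this function with one that actually uses the observation (and updates some internal value estimates) is exactly the modification the environment invites.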
I have created a graphic version and some agents that I will be uploading
to this site soon. They haven't learned anything really surprising yet,
but I have high hopes for my next agent. I am using the standard proposed
by Richard Sutton and Juan Carlos Santamaría for programming RL
applications. They suggest that RL people could build libraries of
environments and libraries of agents separately and share them. What I
want is to contribute the Circles environment and, of course, to use it
for my own experiments with RL agents.
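The point of that separation is that an experiment loop can couple any agent to any environment through a small set of calls, without either side seeing the other's internals. The sketch below illustrates the idea with a toy environment and a trivial agent; the function names and signatures are illustrative, not the exact API of the Sutton/Santamaría standard:

```c
/* Illustrative agent/environment separation in the spirit of the
   Sutton-Santamaría standard; names here are assumptions. */
typedef struct { double x, y; } Observation;

/* --- A toy environment: ends after 5 steps, reward 1 per step. --- */
static int env_steps;

static Observation env_start(void) {
    Observation o = {0.0, 0.0};
    env_steps = 0;
    return o;
}

static Observation env_step(int action, double *reward, int *terminal) {
    Observation o = {0.0, 0.0};
    (void)action;
    env_steps++;
    *reward = 1.0;
    *terminal = (env_steps >= 5);
    return o;
}

/* --- A trivial agent: always picks action 0. --- */
static int agent_start(Observation obs) { (void)obs; return 0; }
static int agent_step(double reward, Observation obs) {
    (void)reward; (void)obs;
    return 0;
}

/* The experiment loop knows nothing about either implementation:
   swapping in the Circles environment, or a learning agent, would
   leave this function unchanged. */
static double run_episode(void) {
    double total = 0.0, reward;
    int terminal = 0;
    Observation obs = env_start();
    int action = agent_start(obs);
    while (!terminal) {
        obs = env_step(action, &reward, &terminal);
        total += reward;
        if (!terminal)
            action = agent_step(reward, obs);
    }
    return total;
}
```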