CIRCLES, A PROPOSED ENVIRONMENT FOR RL APPLICATIONS
WHERE DO CIRCLES COME FROM?
A few years ago I decided to design and program Circles 1.0, a computer
game for Windows in which circles of different sizes and velocities
bounced around a square 2D field, and a little creature (the player)
had to capture a little power package located within the field without
bumping into any circle. The circles never stopped bouncing or changed
their velocity. When the player reached the package, his power and his
score (both always displayed on the screen) each increased by one unit;
when a circle reached the package first, the power decreased by one.
Either way, the package then reappeared at another random location in
the field. When the power reached zero or when the player bumped into a
circle, the player lost one of his three lives. So the game turned into
a race between environment and player for the package. Every 15 points
(score % 15 == 0) the lost lives were automatically restored (number of
lives := 3). The player's moves were up, down, left, and right.
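The scoring and life rules above can be sketched in C roughly as follows. The original Circles 1.0 source was lost, so all names and structures here are illustrative reconstructions, not the game's actual code:

```c
#define MAX_LIVES 3

/* Hypothetical player state reconstructed from the description
   above; the original source no longer exists. */
typedef struct {
    int power;   /* always displayed on screen */
    int score;   /* always displayed on screen */
    int lives;   /* starts at 3 */
} PlayerState;

/* The player reaches the package first: power and score each
   increase by one; every 15 points the lost lives are restored. */
void on_player_capture(PlayerState *p) {
    p->power++;
    p->score++;
    if (p->score % 15 == 0)
        p->lives = MAX_LIVES;
}

/* A circle reaches the package first: power decreases by one. */
void on_circle_capture(PlayerState *p) {
    p->power--;
}

/* Bumping into a circle, or power reaching zero, costs one life. */
void check_life_loss(PlayerState *p, int bumped_circle) {
    if (bumped_circle || p->power <= 0)
        p->lives--;
}
```

In both capture cases the package would then be respawned at a random location in the field.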
[Fig 1.] Fighting to survive (power = 1!)
[Fig 2.] Captured by a circle
Circles 1.0's source code in C was lost in a PC crash. The exe file is no
longer very interesting because I naively made screen updating dependent
on the PC's speed, so today it is too difficult to play unless you have a
slow computer (a PII at 266 MHz or so). However, here is the
DOS exe file circles_1.0.zip (55 KB).
WHAT CAN I USE THEM FOR?
I was wondering how good it would be to replace the human player with an
RL agent in this game. Although the environment is not stochastic, it is
difficult for a human (at least for my friends and everyone else who has
played Circles) to maintain a "good" behavior without much experience.
Many strategies emerge, but nobody has been able to describe an optimal
algorithm, so reinforcement learning could help us find one. The state
space here is (supposed to be) continuous, which should catch the
attention of those who like generalization applied to RL.
WHAT I HAVE DONE...
I was able to rebuild the environment. Here you can obtain a copy of the non-graphic simulation using a trivial agent (it neither knows nor learns anything). Just modify the random-agent.h file to produce something better.
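As an illustration, the core of such a random agent could look like the sketch below. Note that the actual contents and interface of random-agent.h may differ; the type and function names here are assumptions:

```c
#include <stdlib.h>

/* Sketch of what a random agent like the one in random-agent.h
   might do; the real file's interface may differ. */
typedef enum { MOVE_UP, MOVE_DOWN, MOVE_LEFT, MOVE_RIGHT } Action;

/* The agent ignores the observation entirely: it neither knows
   nor learns anything, and just picks a uniformly random move. */
static Action agent_choose_action(const double *observation, int obs_len) {
    (void)observation;
    (void)obs_len;
    return (Action)(rand() % 4);
}
```

Replacing this function with one that actually uses the observation (and updates some internal value estimates) is exactly the modification the environment invites.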
I have created a graphic version and some agents that I will be uploading
to this site soon. They haven't learned anything really surprising yet,
but I have high hopes for my next agent. I am using the standard proposed
by Richard Sutton and Juan Carlos Santamaría for programming RL
applications. They suggest that RL people could build libraries of
environments and libraries of agents separately and share them. What I
want is to contribute the Circles environment and, of course, to use it
for my own experiments with RL agents.
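The point of that separation is that an experiment loop can couple any agent to any environment through a small set of calls, without either side seeing the other's internals. The sketch below illustrates the idea with a toy environment and a trivial agent; the function names and signatures are illustrative, not the exact API of the Sutton/Santamaría standard:

```c
/* Illustrative agent/environment separation in the spirit of the
   Sutton-Santamaría standard; names here are assumptions. */
typedef struct { double x, y; } Observation;

/* --- A toy environment: ends after 5 steps, reward 1 per step. --- */
static int env_steps;

static Observation env_start(void) {
    Observation o = {0.0, 0.0};
    env_steps = 0;
    return o;
}

static Observation env_step(int action, double *reward, int *terminal) {
    Observation o = {0.0, 0.0};
    (void)action;
    env_steps++;
    *reward = 1.0;
    *terminal = (env_steps >= 5);
    return o;
}

/* --- A trivial agent: always picks action 0. --- */
static int agent_start(Observation obs) { (void)obs; return 0; }
static int agent_step(double reward, Observation obs) {
    (void)reward; (void)obs;
    return 0;
}

/* The experiment loop knows nothing about either implementation:
   swapping in the Circles environment, or a learning agent, would
   leave this function unchanged. */
static double run_episode(void) {
    double total = 0.0, reward;
    int terminal = 0;
    Observation obs = env_start();
    int action = agent_start(obs);
    while (!terminal) {
        obs = env_step(action, &reward, &terminal);
        total += reward;
        if (!terminal)
            action = agent_step(reward, obs);
    }
    return total;
}
```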