Anna Shleyfman
CS 7495 – Computer Vision
Final Project Description
Outline:
For my project, I worked on the Open Inventor demo game called Maze. The goal of the game is to move the ball from its initial position to the final one by tilting the floor with the left mouse button, without letting the ball fall into any of the holes. The following picture shows the maze.
My goal was to track the ball’s position in the maze from the rendered image in the animation, rather than to grab it from the simulation. This was an individual project.
To complete the project I had to learn the Open Inventor interface; I had worked with Open Inventor only once before, two years ago. My task was to grab the image array from the running animation as it rendered the frames for the game. Unfortunately, after spending several days studying the Open Inventor software package, I still had not found a way to access that information. Instead, I had to re-render every single frame using the SoOffscreenRenderer class to be able to read the image buffer.
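Once a frame is available as a raw buffer, locating the ball can be sketched as a scan for pixels close to the ball's color. This is only an illustration, not the code used in the project: the function name find_ball_centroid, the buffer layout (row-major, three bytes per pixel in R, G, B order), and the color tolerance are all assumptions.

```python
def find_ball_centroid(buffer, width, height, ball_rgb, tol=10):
    """Return the (x, y) centroid of pixels within `tol` of `ball_rgb`.

    `buffer` is assumed to be a flat, row-major sequence of bytes with
    three bytes (R, G, B) per pixel. Returns None if no pixel matches.
    """
    sum_x = sum_y = count = 0
    for y in range(height):
        for x in range(width):
            i = 3 * (y * width + x)  # index of this pixel's red byte
            r, g, b = buffer[i], buffer[i + 1], buffer[i + 2]
            if (abs(r - ball_rgb[0]) <= tol
                    and abs(g - ball_rgb[1]) <= tol
                    and abs(b - ball_rgb[2]) <= tol):
                sum_x += x
                sum_y += y
                count += 1
    if count == 0:
        return None
    return (sum_x / count, sum_y / count)
```

Averaging all matching pixels gives a single representative pixel position for the ball; a stricter tolerance or a different color test may be needed if the walls or floor share similar colors.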
Once I had the image, I realized that the maze is viewed through a perspective camera, and therefore the picture of the maze is skewed. This made it difficult to translate pixel coordinates into maze coordinates. To cope with this problem, I rendered a modified version of the maze, divided into a grid by its walls (see the picture below).
From this grid, I generated the matrix that described the pixel coordinates for each of the squares in the grid. This information was used to assign the ball the correct position in the maze.
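A lookup of this kind can be sketched as follows. The sketch assumes that each grid square is stored as the four pixel corners of its projected quadrilateral, listed in a consistent order around the square; the names point_in_quad and locate_cell and the dictionary layout are hypothetical and do not come from the project code.

```python
def cross(o, a, b):
    """Z component of the cross product of vectors o->a and o->b."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def point_in_quad(p, quad):
    """True if pixel p lies inside (or on the border of) a convex quad.

    `quad` holds four (x, y) corners in consistent order. The point is
    inside when it is on the same side of all four edges.
    """
    signs = [cross(quad[i], quad[(i + 1) % 4], p) for i in range(4)]
    return all(s >= 0 for s in signs) or all(s <= 0 for s in signs)


def locate_cell(p, grid):
    """Return the (column, row) key of the first cell containing p."""
    for cell, quad in grid.items():
        if point_in_quad(p, quad):
            return cell
    return None


# Two neighbouring cells that appear as trapezoids under perspective.
grid = {
    (1, 1): [(0, 0), (10, 0), (12, 10), (-2, 10)],
    (2, 1): [(10, 0), (20, 0), (26, 10), (12, 10)],
}
print(locate_cell((15, 5), grid))
```

A pixel that falls exactly on a shared edge is assigned to whichever cell is checked first, which is acceptable when only a coarse cell index is needed.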
I took the following steps to complete the project:
My program currently identifies the ball's position in maze space correctly, but rounds it down to an integer. If most of the ball's body lies in one cell of the grid, my program outputs the coordinates of that cell. Because of this limitation, I decided not to feed my coordinates back into the game, since the maze simulation expects floating-point numbers.
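One way to obtain the "cell that holds most of the ball" behavior is a simple majority vote over the ball's pixels. The sketch below is an assumption about how this could be done rather than a description of the actual program; majority_cell is a hypothetical name, and locate stands for any function that maps a pixel to a grid cell or to None.

```python
from collections import Counter


def majority_cell(ball_pixels, locate):
    """Return the grid cell that contains the most ball pixels.

    `ball_pixels` is an iterable of (x, y) pixels classified as ball.
    `locate` maps a pixel to a cell key, or to None when the pixel is
    outside the grid. Returns None if no pixel lands in any cell.
    """
    votes = Counter()
    for p in ball_pixels:
        cell = locate(p)
        if cell is not None:
            votes[cell] += 1
    if not votes:
        return None
    return votes.most_common(1)[0][0]


def square_cells(p):
    """Toy locator for the example: 10-pixel square cells, 1-based."""
    return (p[0] // 10 + 1, p[1] // 10 + 1)


pixels = [(8, 5), (9, 5), (10, 5), (11, 5), (12, 5)]
print(majority_cell(pixels, square_cells))
```

In this example two pixels fall in cell (1, 1) and three in cell (2, 1), so the ball is reported in cell (2, 1) even though it straddles the wall between them.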
Because I re-render every frame of the animation twice, and regular animations are produced at 30 frames per second, this rendering slows the program down by at least a factor of two. The slowdown is especially noticeable on the SGI workstations.
Example of program output:
Found ball pixel position from image: RGB=243 (295, 102)
Original programmed ball position: (4.400000, 1.600000)
My estimated position of the ball from image: (5, 2)
Limitations of approach/implementation
Currently, my technique uses the generated grid to identify the ball's position in the matrix. Originally, I thought that my program would give correct output only if the maze was positioned horizontally. I was pleased to find that it works fine when the maze is tilted as well, because the game constrains the amount of tilt of the maze.
Areas for improvements or future work
I could use a vision technique to detect the maze tilt, based on the maze edges, to make sure that the coordinates of the maze are identified correctly.
To match my results to the original game coordinates, I need to convert the ball's location in the grid to floating point to match the simulation output. To do that, the program needs to be studied thoroughly to understand how that number is computed.
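One possible route to a floating-point position, independent of how the simulation computes its own number, is to fit a projective mapping (homography) from four known pixel corners to their floor coordinates and apply it to the ball's pixel position. The sketch below is an assumption about future work, not something the project implements; all function names are hypothetical, and the destination coordinates would have to be replaced by whatever convention the game actually uses.

```python
def solve_linear(a, b):
    """Solve a*x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("degenerate point correspondences")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                for c in range(col, n + 1):
                    m[r][c] -= f * m[col][c]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_homography(src, dst):
    """Fit the 8 homography parameters from 4 pixel->floor point pairs."""
    rows, rhs = [], []
    for (x, y), (u, v) in zip(src, dst):
        rows.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        rhs.append(u)
        rows.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        rhs.append(v)
    return solve_linear(rows, rhs)


def apply_homography(h, p):
    """Map pixel p to floor coordinates using parameters h."""
    x, y = p
    w = h[6] * x + h[7] * y + 1.0
    return ((h[0] * x + h[1] * y + h[2]) / w,
            (h[3] * x + h[4] * y + h[5]) / w)


# Pixel corners of one projected square and the unit square they map to.
corners_px = [(0, 0), (10, 0), (12, 10), (-2, 10)]
corners_uv = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
h = fit_homography(corners_px, corners_uv)
print(apply_homography(h, (5, 50 / 12)))  # pixel where the diagonals cross
```

Fitting the mapping to the four outer corners of the maze, with the maze's real dimensions as the destination, would give a continuous position for every pixel on the floor. The fit would have to be redone whenever the floor tilts, since the projected corners move with it.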
I need to study the Open Inventor software package more to find a way of accessing the rendered image buffer directly, without re-rendering every frame of the animation. This change would speed up program execution tremendously.
Because of the slowdown caused by re-rendering, the ball movement no longer agrees with the maze rotation during the demo mode that I added to the game. I would have to modify the path file or the timer settings of the simulation to fix these disagreements.