INTERACT IST-2006-031215 (2007-2010).
Keywords: EC VI Framework. IST. Holographic Displays. Gesture Tracking. Natural Speech Processing. Real-Time. Fusion.  


Project Summary

The real world that surrounds us is three-dimensional, and 3D information is therefore becoming increasingly important in today’s Information Society. Unfortunately, existing computer user interaction methods are fairly inadequate, since 3D information is inherently difficult to visualize, comprehend and manipulate. Most existing display devices are designed for 2D operation and struggle to represent the third dimension realistically. On the other hand, there is no satisfactory solution for the interaction with and manipulation of 3D objects, because existing technologies are expensive, cumbersome, time consuming and require expert knowledge. As a consequence, the massive dissemination of 3D applications is being jeopardised by the lack of adequate interfacing solutions.

INTERACT combines three modalities — vision, speech and touch — with the aim of overcoming existing limitations in the way information is presented and interacted with in the new generation of computer platforms. The emphasis is placed on 3D information visualization and manipulation based on the fusion of three core technologies: Holographic Displays, a Computer Vision based 3D Hands Tracker and Speech Processing. The Holographic Display provides highly realistic, naked-eye representations of 3D information to multiple simultaneous users, each viewing a different perspective. 3D objects may appear floating behind or in front of the display, and users are able to walk around the screen within a wide field of view, seeing objects and shadows move continuously as in normal perspective, without strain. The Hands Tracker uses advanced computer vision techniques to detect the movements of the user’s hands and fingers in 3D space with high accuracy, enabling precise manipulation of virtual objects and advanced gesture interpretation. Finally, the Speech Interface allows intelligent dialogue exchange with the machine using plain words.

The main result of the INTERACT project will be an advanced 3D Information Access Point in which the three interfacing technologies mentioned above are fully integrated, with the aim of achieving improved levels of realism and interactivity. The project also provides the tools necessary to integrate the individual interfacing solutions into third-party applications. Within the INTERACT 3D Information Access Point, virtual representations are displayed realistically in true 3D. The user can then manipulate the virtual objects in real time with his own hands, while issuing gesture commands or conversing with the speech interface for more complex interaction requests. INTERACT interfacing technologies operate naked-eye, do not require specially prepared laboratories and avoid the use of uncomfortable devices such as markers, wearable devices or cabling.

Finally, the INTERACT project produces two demonstration applications: the 3D Human Body Simulator and the Circuit Board Design training application. Both demonstration applications exercise the most advanced capabilities of INTERACT’s interfacing technologies, with the aim of providing a demonstration platform for the most promising market fields targeted in the project: gaming and education, medical applications, CAD and design, visualization of scientific data, kiosks and museums, virtual training and military applications.

Project Objectives

The objective of the INTERACT project is to develop new technologies and methods that will help overcome existing limitations in the way information is presented and interacted with in existing and future computer interfaces. The emphasis is placed on visualizing and manipulating 3D information within a computer platform in the most realistic and natural way. This aim is achieved through the fusion of three core technologies:

  • For the purpose of true 3D visualization, INTERACT develops a highly immersive Holographic 3D Display capable of materialising realistic, naked-eye 3D representations for multiple simultaneous users.
  • For precise manipulation of the presented 3D information, INTERACT develops a high-precision computer vision based Hands Tracker, capable of capturing in real time a detailed 3D representation of the movements of the complete hand (hands and fingers) of the user.
  • Finally, for natural dialogue exchange with the machine using plain words, INTERACT develops a Speech Processing interface capable of recognising the human voice and interpreting the commands of the user.

In INTERACT we develop a contact-less, computer vision based Hands Tracker able to capture in real time a detailed 3D representation of the movements of the complete hand (hands and fingers) of the user, while avoiding the use of uncomfortable wearable devices. For that purpose, a complete kinematic model of the user’s hands is constructed, with the aim of tracking the movements of his fingers and hands precisely enough to enable the development of sophisticated 3D user interfaces, highly interactive manipulation of 3D virtual objects and rich gesture interpretation. The system is person independent, and the capturing environment is very simple: the Hands Tracker does not require any special preparation or decoration of the scene, and the system operates under normal ambient light conditions.
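To make the kinematic-model idea concrete, the following is a minimal sketch (not the project’s actual model) of the kind of forward kinematics such a tracker fits to the images: each finger is treated as a chain of revolute joints whose flexion angles are the parameters estimated from the cameras. The segment lengths and joint names below are illustrative assumptions.

```python
import math

# Illustrative phalanx lengths in metres (MCP->PIP, PIP->DIP, DIP->tip).
# Real trackers would calibrate these per user; values here are assumptions.
SEGMENT_LENGTHS = [0.045, 0.025, 0.018]

def fingertip_position(angles):
    """Forward kinematics for one finger in its flexion plane.

    `angles` are joint flexions in radians (MCP, PIP, DIP).
    Returns the (x, y) fingertip position relative to the knuckle.
    """
    x = y = 0.0
    total = 0.0
    for length, theta in zip(SEGMENT_LENGTHS, angles):
        total += theta            # joint angles accumulate along the chain
        x += length * math.cos(total)
        y += length * math.sin(total)
    return x, y

# Fully extended finger: tip lies on the x-axis at the summed segment length.
tip = fingertip_position([0.0, 0.0, 0.0])
```

A model-based tracker would invert this mapping: it searches for the joint angles whose predicted hand silhouette best matches what the cameras observe.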

On the other hand, INTERACT adopts the recently developed holographic display technology as the right solution to the visualisation problem. Holographic displays can provide a naturally interactive and immersive 3D experience without the need to wear any kind of display device, helmet, glasses or similar headgear. Holographic displays are truly multi-person devices: several people are able to walk around the screen within a wide field of view. Each user sees different details of the objects, with shadows moving continuously as in normal perspective. It is also possible to look behind the objects, with hidden details appearing while others disappear, as in motion parallax.
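The multi-user behaviour described above can be pictured as the display emitting many discrete horizontal-parallax views across its field of view, with each observer’s position selecting which rendered perspective they see. The sketch below illustrates only this view-selection geometry; the view count and field of view are assumed values, not the specifications of the INTERACT display.

```python
# Hypothetical parameters of a horizontal-parallax display (assumptions).
NUM_VIEWS = 64            # number of discrete rendered perspectives
FIELD_OF_VIEW_DEG = 50.0  # horizontal field of view of the screen

def view_index(observer_azimuth_deg):
    """Map an observer azimuth (degrees, 0 = screen normal) to a view slot.

    Observers outside the field of view are clamped to the edge views,
    which is why walking around the screen reveals different details.
    """
    half = FIELD_OF_VIEW_DEG / 2.0
    a = max(-half, min(half, observer_azimuth_deg))  # clamp to the FOV
    return int((a + half) / FIELD_OF_VIEW_DEG * (NUM_VIEWS - 1))
```

Two users standing at different azimuths thus receive different view indices, i.e. genuinely different perspectives of the same scene, with no headgear involved.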

In addition, in the INTERACT framework, speech recognition is used in a multimodal environment where gesture and speech are combined to manipulate 3D objects. The basic objective of the Speech Processing Module of the INTERACT project is to develop innovative voice interfaces that allow more intuitive control of the system and permit a natural dialogue between man and machine. In more concrete terms, the objective is twofold: to allow the user to issue voice commands that modify the behaviour of the 3D environment, and to search for contextual information related to the objects or applications in the scene.
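The command-interpretation step that follows recognition could, in its simplest form, map a recognised utterance onto one of the two kinds of request above: a command that modifies the 3D environment, or a query for contextual information. The phrases and action names below are purely illustrative; they are not INTERACT’s actual vocabulary.

```python
# Hypothetical command vocabulary (assumption, for illustration only).
COMMANDS = {
    "rotate": "ROTATE_OBJECT",      # modifies the 3D environment
    "zoom in": "ZOOM_IN",
    "zoom out": "ZOOM_OUT",
    "show information": "QUERY_CONTEXT",  # contextual information request
}

def interpret(utterance):
    """Return (action, argument) for the first matching command phrase."""
    text = utterance.lower().strip()
    for phrase, action in COMMANDS.items():
        if text.startswith(phrase):
            argument = text[len(phrase):].strip() or None
            return action, argument
    return "UNKNOWN", None
```

A real dialogue system would of course use a richer grammar and confidence scores from the recogniser, but the split between environment-modifying commands and contextual queries is the same.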

The technologies described above are fused into a common 3D workspace with the aim of implementing application scenarios where traditional input and visualization devices are discarded in favour of methods for visualizing and manipulating 3D information using 3D vision, gestures and speech. In that sense, the system is based on highly immersive visualization devices where users can share their experiences and are no longer constrained by annoying cabling, primitive markers, uncomfortable wearable devices, special lighting, limited freedom to move, etc. In addition, the user can manipulate 3D objects in space with his own hands or command the user interface using simple gestures. Finally, the system must respond intelligently to sophisticated user requests by means of the bundled speech processing interface.
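One common way to realise this kind of fusion (a sketch under assumed names and timings, not a description of INTERACT’s internals) is late fusion of timestamped events: a speech command such as “move this” is bound to whatever object the hand was pointing at within a small time window around the utterance.

```python
# Hypothetical tolerance for pairing speech with gesture (assumption).
FUSION_WINDOW = 0.5  # seconds

def fuse(speech_event, gesture_events):
    """Pair a speech command with the nearest-in-time pointing gesture.

    speech_event:   (timestamp, command)
    gesture_events: list of (timestamp, pointed_object)
    Returns (command, target) where target is None if no gesture
    falls inside the fusion window.
    """
    t_cmd, command = speech_event
    candidates = [
        (abs(t - t_cmd), obj)
        for t, obj in gesture_events
        if abs(t - t_cmd) <= FUSION_WINDOW
    ]
    if not candidates:
        return command, None
    _, target = min(candidates)   # closest gesture in time wins
    return command, target

# "move" at t=10.2 s binds to the object pointed at t=10.1 s.
resolved = fuse((10.2, "move"), [(9.9, "cube"), (10.1, "sphere"), (12.0, "cone")])
```

The design choice here is deliberate: resolving the deictic reference (“this”) in time rather than asking the user to name objects is what makes the combined interface feel natural.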

All these technological capabilities are integrated into a working prototype implementing an advanced 3D access point suitable for visualizing and manipulating 3D information.



Last Updated 12/03/2007
