Projects

04/04/04

Some of my major projects are listed below:

I. Graphics

I have written my own graphics library with fast drawing primitives (lines, circles, boxes, spheres, etc.), texture mapping, image transformations (filtering and deformation), image and movie compression, palette optimization and transformation, and mouse-driven graphic controls (DOS).

1. Ray tracing. A demo: four balls on a chessboard with multiple mutual reflections, a light source and shadows. The animation is compressed with my own coding algorithm. The codec supports real-time decompression on a 40 MHz Intel386. Graphics mode: 320x200 with 256 colors (DOS).

2. Intersection of two convex polyhedra (a demo is available). I plan to use this for brush editing in a 3D level editor (DOS).

3. A “VisualWave” program: a DOS CD player with a real-time stereo 256-band sound spectrum analyzer. Graphic controls (buttons, check-boxes, radio buttons, sliders, bordered windows, text edit, etc.). Console interface to RS-232. Display modes: 640x480, 800x600 and 1024x768 with 256 colors (DOS).

4. A “FireWave” program: a DOS CD player with very cool real-time graphic effects (similar to the Winamp visualization plug-ins that appeared later). The effects include image filtering, various shape transformations, texture mapping onto walls, spheres, cones, etc., textured lines and circles (image mapping onto concentric circles), palette optimization, and a fast true-color blur displayed in 256 colors (DOS).

5. Highly optimized Fast Fourier Transform (FFT) and wavelet transform algorithms (C++ implementation) for sound and image processing.

6. Image enhancement for video frames: NTSC-to-PAL conversion and single-field frame deinterlacing using nonlinear trainable filtering. You can see some additional information here.

7. Video capture and linking of video frames to speech timbre. The goal was a real-time, speech-synchronized talking head.

8. Various types of vector quantization (including neural-network-based methods) used for palette optimization, image enhancement and so on.

9. Delphi component for OpenGL encapsulation.

10. A program that creates a tapestry pattern from a given picture.

11. 3D level editor/game engine (not finished). Import of 3D Studio Max (3DS) and Unreal T3D formats, light-map creation with ray tracing, texture packing. Supports portals, octrees, multi-texturing, collision detection and a skybox. The source code can be compiled with MSVC 6.0, Borland C++ Builder 5.0 and MinGW gcc. Only OpenGL is supported at this time.
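
The palette optimization by vector quantization mentioned in items 4 and 8 can be illustrated with a plain k-means (Lloyd) quantizer. The Python sketch below is only an illustration of the general idea, not the code used in these projects; the names nearest and optimize_palette, the random initial codebook and the iteration count are all assumptions.

```python
import random

def nearest(palette, pixel):
    """Index of the palette color closest to pixel (squared RGB distance)."""
    return min(range(len(palette)),
               key=lambda i: sum((a - b) ** 2 for a, b in zip(palette[i], pixel)))

def optimize_palette(pixels, k, iterations=10, seed=0):
    """Plain k-means over RGB triples; returns a list of k palette colors."""
    rng = random.Random(seed)
    palette = rng.sample(pixels, k)  # assumed initial codebook: k random pixels
    for _ in range(iterations):
        sums = [[0, 0, 0] for _ in range(k)]
        counts = [0] * k
        # Assignment step: attach every pixel to its nearest palette entry.
        for p in pixels:
            i = nearest(palette, p)
            counts[i] += 1
            for c in range(3):
                sums[i][c] += p[c]
        # Update step: move each entry to the mean of its cluster.
        new_palette = []
        for i in range(k):
            if counts[i]:
                new_palette.append(tuple(s // counts[i] for s in sums[i]))
            else:
                new_palette.append(palette[i])  # empty cluster: keep old color
        palette = new_palette
    return palette

# Usage: reduce a tiny red/blue "image" to a 2-color palette.
pixels = [(250, 0, 0), (240, 10, 0)] * 20 + [(0, 0, 250), (10, 0, 240)] * 20
palette = optimize_palette(pixels, 2)
indexed = [nearest(palette, p) for p in pixels]  # the image as palette indices
```

A neural-network-based quantizer, such as a competitive network, would replace the batch update step with incremental updates of the winning entry; the assignment step stays the same.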

II. Sound

1. A DOS sound library that supports various types of sound cards: WSS, ESS, SB(Pro), SB16. I used the full-duplex mode of my WSS sound card to do real-time filtering with a 256-band stereo equalizer based on FFT. The Fourier transform is written in assembly language and is highly optimized for fixed-point arithmetic on the Intel486 (Borland Pascal 7.0/Assembler).

2. A sound playback/capture library for Win32, multithreaded and supporting full-duplex operation (C++ Builder 5).

3. A sound analysis/synthesis library, featuring:

Real-time (guitar) effects (echo, distortion, amplitude compensation, vibrato, etc.);

Multi-channel mixing (32 channels on an Intel486);

Sound/speech pitch estimation;

Filtering: Infinite Impulse Response (IIR), and Finite Impulse Response (FIR) based on FFT and wavelet transform;

Speech (timbre, pitch and rate) morphing, real-time noise analysis and filtering.

4. Speech coding at a very low bit rate (below 2 kbit/s).

5. An MP3 player with fixed-point arithmetic, optimized under Watcom C++ 10.5.

6. A text-to-speech synthesizer for the Bulgarian language and a speech segmentation editor (FFT and linear prediction are used). Supports pitch, rate and power modifications per phoneme. Di-phonemes and tri-phonemes are used. You can download a demo.

7. Phoneme recognition.

8. Isolated word speech recognition.
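
As an illustration of the pitch estimation listed in item 3, here is a minimal Python sketch of the classic autocorrelation method. It is not the algorithm from my library, only a common textbook approach; the function name estimate_pitch and the 60-500 Hz search range are assumptions chosen for this example.

```python
import math

def estimate_pitch(samples, sample_rate, fmin=60.0, fmax=500.0):
    """Estimate pitch in Hz from the strongest autocorrelation peak.

    Returns 0.0 when no positive correlation is found in the search range.
    """
    min_lag = int(sample_rate / fmax)  # shortest period considered
    max_lag = int(sample_rate / fmin)  # longest period considered
    best_lag, best_score = 0, 0.0
    for lag in range(min_lag, max_lag + 1):
        n = len(samples) - lag
        if n <= 0:
            break
        # Unnormalized autocorrelation: the shrinking overlap slightly favors
        # shorter lags, which helps against picking a multiple of the period.
        score = sum(samples[i] * samples[i + lag] for i in range(n))
        if score > best_score:
            best_lag, best_score = lag, score
    return sample_rate / best_lag if best_lag else 0.0

# Usage: a synthetic 200 Hz tone sampled at 8 kHz.
sample_rate = 8000
tone = [math.sin(2 * math.pi * 200 * i / sample_rate) for i in range(1024)]
pitch = estimate_pitch(tone, sample_rate)
print(pitch)
```

Real speech needs more care than this sketch (voiced/unvoiced detection, windowing, smoothing of the pitch track across frames), and the resolution is limited to whole-sample lags.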

III. Artificial Intelligence

I have extensive experience with various types of neural networks (NN) and neuron models.

1. Conventional neurons - Cascade Correlation NN (clustering, pattern recognition), Time Delay NN (speech recognition), several types of competitive (self-training) neural networks (CNN) – CNN with assisted training, ART1, ART2, cascade ART2 plus additional training modifications (clustering – in speech coding), Kohonen’s NN (phoneme recognition).

2. Spiking neurons: my own training rules and a cascade network of spiking neurons. Recognition of complex patterns and in-pattern relations, event prediction. Isolated word recognition with only a few neurons (1-3). The word to be recognized may vary in timbre, speech rate, speaker pronunciation, etc. My PhD research is based on this.

You can see some additional information here.

3. A PDF article analyzer: the goal is automated search over a large collection of articles in PDF format. The analyzer extracts the article title, authors, abstract, introduction and body sections.