The glh Library


Thanks to
OpenGL Logo
by Silicon Graphics Inc.
www.opengl.org

A Few ScreenShots
Source + Executable for Depth Of Field Effect (52.5 KB)
Normalization cubemap generated with glhBuildNormalizationCubeMap (417 KB, 6 PNG files)

Link to NEW page

Introduction
This page contains a library called "glh", which stands for Graphics Library Helper. It is similar to OpenGL's GLU but adds extra functions and optimized versions of some existing ones. The optimizations are written in assembly for the x86 architecture. Note that not all parts of the library are written in assembly.

glhlib is still freeware and still contains the very fast image scaling function glhScaleImage_asm386, the function that started this project.

Log

Saturday, May 1, 2004
This is an announcement.
Since it would be nice to have some features similar to what D3DX offers, I have decided to go in this direction.
The next version will be 1.51 and will feature a reader/writer for DDS files with support for many formats (if not all), support for 2D, 3D and Cubemap textures.

The new functions may look something like this :

glhReadFile_DDS(const char *pfilePath, GLint dataAlignment, GLint *width, GLint *height, GLint *depth, GLint *format, GLint *textureType, GLint desiredFormat, GLenum type, void *pixels);
glhWriteFile_DDS(const char *pfilePath, GLint dataAlignment, GLint width, GLint height, GLint depth, GLint format, GLenum type, void *pixels);

I intend to support the following formats :

B2_G3_R3 = 8 bit
B2_G3_R3_A8 = 16 bit
B5_G5_R5_A1 = 16 bit
B5_G5_R5_X1 = 16 bit
B4_G4_R4_A4 = 16 bit
B4_G4_R4_X4 = 16 bit
B5_G6_R5 = 16 bit
BGR8 = 24 bit
RGB8 = 24 bit
BGRA8 = 32 bit
RGBA8 = 32 bit
BGRX8 = 32 bit
RGBX8 = 32 bit
R10_G10_B10_A2 = 32 bit
B10_G10_R10_A2 = 32 bit
B16_G16_R16_A16 = 64 bit
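
As a rough illustration of what reading one of these packed formats involves, here is a minimal sketch that expands a single B5_G6_R5 pixel to three 8-bit channels. It is not taken from glhlib: the helper name unpack_565_to_rgb8 is hypothetical, and the bit layout (blue in bits 0-4, green in bits 5-10, red in bits 11-15) is an assumption about how the format name maps to bits.

```c
#include <stdint.h>

/* Hypothetical helper, not part of glhlib: expands one 16-bit 5:6:5 pixel
   to three 8-bit channels written as R, G, B.
   Assumed layout: blue in bits 0-4, green in bits 5-10, red in bits 11-15. */
static void unpack_565_to_rgb8(uint16_t pixel, uint8_t rgb[3])
{
    uint8_t r5 = (uint8_t)((pixel >> 11) & 0x1F);
    uint8_t g6 = (uint8_t)((pixel >> 5) & 0x3F);
    uint8_t b5 = (uint8_t)(pixel & 0x1F);

    /* Replicate the high bits into the low bits so that 0 maps to 0 and the
       largest field value maps to 255. */
    rgb[0] = (uint8_t)((r5 << 3) | (r5 >> 2));
    rgb[1] = (uint8_t)((g6 << 2) | (g6 >> 4));
    rgb[2] = (uint8_t)((b5 << 3) | (b5 >> 2));
}
```

The other 16-bit formats in the list would follow the same shift-and-mask pattern with different field widths.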

Monday, Nov 24, 2003
Version 1.50 is ready for download.
This update contains some functions that take advantage of SSE instructions.
Intel created these instructions and released them with the launch of the Pentium 3.
Other CPU makers, like AMD, have also added these instructions to their architectures.
It's time to add them to glhlib. Future additions will also get an SSE version when possible.

In the header file, Block 16 has been added.
The new functions are (See the header file for more details):
NOTE: These functions are for mass processing.
glhProjectFLOAT_2 (Similar to gluProject but uses float and vertices are 4D)
glhProjectFLOAT_3 (Just like glhProjectFLOAT_2, but instead of 4D data, it takes 3D data)
glhUnProjectFLOAT_2 (Similar to gluUnProject but uses float and is intended for mass processing)
glhUnProjectFLOAT_3 (Just like glhUnProjectFLOAT_2, but instead of 4D data, it takes 3D data)

glhProjectFLOAT_SSE_Aligned_2 (Just like glhProjectFLOAT_2, but uses SSE and some x86_fpu)
glhProjectFLOAT_SSE_Aligned_WarmCache_2 (Do I need to explain?)
glhProjectFLOAT_SSE_Unaligned_2 (Just like glhProjectFLOAT_SSE_Aligned_2, but data need not be 16 byte aligned)
glhUnProjectFLOAT_SSE_Aligned_2 (The unproject version)
glhUnProjectFLOAT_SSE_Aligned_WarmCache_2 (The unproject version)
glhUnProjectFLOAT_SSE_Unaligned_2 (The unproject version)
glhMultiplyMatrixByVector4by4FLOAT_1 (Does a mult_matrix with 4D vectors)
glhMultiplyMatrixByVector4by4FLOAT_2 (Does a mult_matrix with 3D vectors)
glhMultiplyMatrixByVector4by4FLOAT_SSE_Aligned_1 (Does a mult_matrix with 4D vectors with SSE)
glhMultiplyMatrixByVector4by4FLOAT_SSE_Aligned_WarmCache_1 (Does a mult_matrix with 4D vectors with SSE)
glhDoesProcessorSupportMMX (Obvious)
glhDoesProcessorSupportSSE (Obvious)
glhDoesOSSupportSSE (Obvious)
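
To make "similar to gluProject but uses float" and "mass processing" concrete, here is a minimal plain-C sketch of projecting a whole array of 3D vertices in one call. It is only an illustration of the gluProject math, not the glhlib implementation: the name project_vertices, its parameter order and its return value are all assumptions, so see the header file for the real signatures.

```c
#include <stddef.h>

/* out = m * v for a column-major 4x4 matrix, which is how OpenGL stores them. */
static void mat4_mul_vec4(const float m[16], const float v[4], float out[4])
{
    for (int row = 0; row < 4; ++row) {
        out[row] = m[row] * v[0] + m[4 + row] * v[1]
                 + m[8 + row] * v[2] + m[12 + row] * v[3];
    }
}

/* Hypothetical batch projection in the style of gluProject.
   objects: count * 3 floats (x, y, z), w is assumed to be 1.
   windows: count * 3 floats receiving window x, y and depth in [0, 1].
   Returns how many vertices were projected; a vertex whose clip-space w
   is zero is written as (0, 0, 0) and not counted. */
static size_t project_vertices(const float *objects, size_t count,
                               const float modelview[16],
                               const float projection[16],
                               const int viewport[4], float *windows)
{
    size_t projected = 0;

    for (size_t i = 0; i < count; ++i) {
        const float *obj = objects + 3 * i;
        float *win = windows + 3 * i;
        float in[4] = { obj[0], obj[1], obj[2], 1.0f };
        float eye[4];
        float clip[4];

        mat4_mul_vec4(modelview, in, eye);
        mat4_mul_vec4(projection, eye, clip);

        if (clip[3] == 0.0f) {
            win[0] = win[1] = win[2] = 0.0f;
            continue;
        }

        /* Perspective divide, then map [-1, 1] to the viewport and to [0, 1]. */
        float ndc_x = clip[0] / clip[3];
        float ndc_y = clip[1] / clip[3];
        float ndc_z = clip[2] / clip[3];

        win[0] = (float)viewport[0] + (ndc_x * 0.5f + 0.5f) * (float)viewport[2];
        win[1] = (float)viewport[1] + (ndc_y * 0.5f + 0.5f) * (float)viewport[3];
        win[2] = ndc_z * 0.5f + 0.5f;
        ++projected;
    }
    return projected;
}
```

Passing the matrices and viewport once and looping inside the function is what makes this style suitable for large vertex arrays, compared with calling gluProject once per vertex.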

Here is one benchmark :

(Processing 1.6 MB of vertices)
glhProjectFLOAT_2 vs. glhProjectFLOAT_SSE_Aligned_2
glhProjectFLOAT_SSE_Aligned_2 is ~1.3 times faster

glhProjectFLOAT_2 vs. glhProjectFLOAT_SSE_Aligned_WarmCache_2
glhProjectFLOAT_SSE_Aligned_WarmCache_2 is ~1.8 times faster

glhProjectFLOAT_2 vs. glhProjectFLOAT_SSE_Unaligned_2
glhProjectFLOAT_SSE_Unaligned_2 is ~1.2 times faster

(Processing 4.0 MB of vertices)
glhMultiplyMatrixByVector4by4FLOAT_1 vs. glhMultiplyMatrixByVector4by4FLOAT_SSE_Aligned_1
glhMultiplyMatrixByVector4by4FLOAT_SSE_Aligned_1 is ~1.6 times faster

glhMultiplyMatrixByVector4by4FLOAT_1 vs. glhMultiplyMatrixByVector4by4FLOAT_SSE_Aligned_WarmCache_1
glhMultiplyMatrixByVector4by4FLOAT_SSE_Aligned_WarmCache_1 is ~1.7 times faster

Conclusion
Theoretically, since SSE allows us to process 4 numbers at a time, it should improve performance by a factor of 4 over standard x86_fpu.
In one artificial test, I attained 3.5 times the performance.
In the above glh functions the maximum is probably 2x, since there is some overhead, and perhaps there is still room for improvement.
The glh functions that prefetch data into L1 and L2 cache and try to avoid cache pollution are slightly faster.
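
The "4 numbers at a time" idea can be sketched with compiler intrinsics. This is not the assembly used inside glhlib, only an illustration: the name transform_vec4_array is hypothetical, the matrix is assumed to be column-major as in OpenGL, and unaligned loads and stores are used so the data does not need to be 16 byte aligned. When the compiler does not target SSE, the sketch falls back to a scalar loop.

```c
#include <stddef.h>

#if defined(__SSE__) || defined(_M_X64) || (defined(_M_IX86_FP) && _M_IX86_FP >= 1)
#include <xmmintrin.h>
#define SKETCH_HAVE_SSE 1
#else
#define SKETCH_HAVE_SSE 0
#endif

/* Hypothetical helper: out[i] = m * in[i] for count 4D vectors.
   m is a column-major 4x4 matrix; in and out hold count * 4 floats. */
static void transform_vec4_array(const float m[16], const float *in,
                                 float *out, size_t count)
{
#if SKETCH_HAVE_SSE
    /* Each register holds one matrix column (4 floats). */
    const __m128 c0 = _mm_loadu_ps(m);
    const __m128 c1 = _mm_loadu_ps(m + 4);
    const __m128 c2 = _mm_loadu_ps(m + 8);
    const __m128 c3 = _mm_loadu_ps(m + 12);

    for (size_t i = 0; i < count; ++i) {
        const float *v = in + 4 * i;
        /* result = c0*x + c1*y + c2*z + c3*w, four lanes per instruction. */
        __m128 r = _mm_mul_ps(c0, _mm_set1_ps(v[0]));
        r = _mm_add_ps(r, _mm_mul_ps(c1, _mm_set1_ps(v[1])));
        r = _mm_add_ps(r, _mm_mul_ps(c2, _mm_set1_ps(v[2])));
        r = _mm_add_ps(r, _mm_mul_ps(c3, _mm_set1_ps(v[3])));
        _mm_storeu_ps(out + 4 * i, r);
    }
#else
    /* Scalar fallback with the same result. */
    for (size_t i = 0; i < count; ++i) {
        const float *v = in + 4 * i;
        for (int row = 0; row < 4; ++row) {
            out[4 * i + row] = m[row] * v[0] + m[4 + row] * v[1]
                             + m[8 + row] * v[2] + m[12 + row] * v[3];
        }
    }
#endif
}
```

An aligned variant would swap _mm_loadu_ps and _mm_storeu_ps for _mm_load_ps and _mm_store_ps and require 16 byte aligned buffers. The broadcasts and the loads and stores around the four multiplies are part of the overhead that keeps real gains below the theoretical 4x.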

In the future, I might release sorting, intersection testing, CSG, and image processing functions that use SSE.

Friday, Sept 12, 2003
Version 1.41 is ready for download.

In version 1.40 (or another version), I had made a change to functions :
- glhUnProjectFLOAT_1
- glhUnProjectDOUBLE_1

in order to optimize them, but unfortunately I had put the wrong function in there, so the result they calculated was wrong. Now both functions give correct results, comparable to gluUnProject in the GLU library.

Also, I have removed my personal versions of gl.h, glext.h, glu.h and wglext.h from the download, so the file is now smaller: 59.1 KB, down from over 100 KB.

Tuesday, June 24, 2003
Version 1.40 is ready for download.

There is a set of functions added and some changes. Check out the header file to learn about the functions and their usage.

Added support for data alignment == 4 in glhScaleImage_asm386, thus any
functions using it in this library also support it.

Made small change to glhScaleImage_asm386 to improve performance for
both 24 and 32 bpp images.

Corrected bug in glhScaleImage_asm386 that caused data corruption on
some color channels for 32 bit images (data alignment == 1).

Corrected bug in glhScaleImage_asm386_MMX that caused data corruption on
some color channels for 32 bit images (data alignment == 1). (Do not use this function)

Corrected bug in glhScaleImage2_asm386 that caused data corruption on
some color channels for 32 bit images (data alignment == 1). (Do not use this function)

Improved glhBuild2DMipmaps so that it wouldn't modify the supplied data
as it created the mipmaps. Also supports data alignment == 4 due to update made
to glhScaleImage_asm386.
Fixed small bug that caused it to reject data format GL_BGR, GL_BGRA,
and GL_ABGR.

glhRenderWith_DepthOfField_SceneAntialiased_FLOAT
has been renamed to
glhRender_DOF_SceneAA_FLOAT
to avoid a compiler bug in Visual C++ 6.
The problem is that VC++ produces a faulty
lib file, which causes problems with the VC++ linker.
Really weird, since other functions seem fine.

Section 12 has a complete set of matrix functions for doing calculations in software.

Section 13 :
glhBuildCubeMapMipmaps (Has the same benefits as glhBuild2DMipmaps)
glhLowerPowerOfTwo2
glhHigherPowerOfTwo2
Decided not to add glhScaleImage3D_asm386 and glhBuild3DMipmaps.

Section 14 :
glhFrustumd
glhFrustumf
glhOrthod
glhOrthof
glhMergedFrustumd
glhMergedFrustumf
glhMergedPerspectived
glhMergedPerspectivef
glhFrustumInfiniteFarPlaned
glhFrustumInfiniteFarPlanef
glhPerspectiveInfiniteFarPlaned
glhPerspectiveInfiniteFarPlanef
glhLookAtd
glhLookAtf
glhIsMatrixRotationMatrixd
glhIsMatrixRotationMatrixf
glhExtractAnglesFromRotationMatrixd2
glhExtractAnglesFromRotationMatrixf2
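
For readers unfamiliar with the "infinite far plane" variants, here is a minimal sketch of the standard construction: take the usual glFrustum matrix and let the far distance go to infinity. It is not the glhlib source. The name frustum_infinite_far and its parameter list are hypothetical, so check the header file for the real signature of glhFrustumInfiniteFarPlanef.

```c
/* Hypothetical sketch: fills a column-major 4x4 matrix equal to the
   glFrustum matrix in the limit far -> infinity.
   In that limit -(far + near) / (far - near) becomes -1 and
   -2 * far * near / (far - near) becomes -2 * near. */
static void frustum_infinite_far(float m[16],
                                 float left, float right,
                                 float bottom, float top,
                                 float znear)
{
    for (int i = 0; i < 16; ++i) {
        m[i] = 0.0f;
    }
    m[0]  = (2.0f * znear) / (right - left);
    m[5]  = (2.0f * znear) / (top - bottom);
    m[8]  = (right + left) / (right - left);
    m[9]  = (top + bottom) / (top - bottom);
    m[10] = -1.0f;
    m[11] = -1.0f;
    m[14] = -2.0f * znear;
}
```

Such a matrix is useful when geometry may extend arbitrarily far from the viewer and must not be clipped by the far plane, at the cost of some depth precision.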

Section 15 :
glhBuild2DNormalMipmaps
glhBuildCubeMapNormalMipmaps
glhBuildNormalizationCubeMap
glhBuildNormalizationCubeMap_FLOAT

Tuesday, August 20, 2002
Version 1.30 is up for download. I made some important changes to the glh. As usual, there is a set of new functions added, but nothing seriously big. Look at Block 11 and 12 in the header file.

Saturday, June 1, 2002
Wow, it's been a long time since I updated this page. Notice the nice pic I made for the title. I have decided to bring my web pages up to date (read: make them nicer looking by putting my artistic talents to good use :))
So it's time for an update to the glhlib.dll and this version is numbered 1.20.
It can do plenty more than the original version, which could only do image scaling. It can update all the mipmaps when only a portion of the base texture needs to be updated. It can tell if you're running on Microsoft GDI or an ICD. It can get you the GLU and GL version numbers as floats for easy version checking. It has a "Depth Of Field" and FSAA function that uses the accumulation buffer (gives great results). It also has a bunch of other GL related and mathematics related functions to make coding easier.

Sunday, Feb 3, 2002
A better release. Added the glhGetString function.

Saturday, July 28, 2001
A first release of the library and its source code, which only contains an optimized version of gluScaleImage called glhScaleImage_asm386.
I have been able to attain drastic performance gains compared to the gluScaleImage present in glu32.dll.

Here is one benchmark :

Beginning benchmarking of gluScaleImage:
1024 x 1024 (24 bit) --> 400 x 400 (24 bit)
time = 1502.00012 milliseconds

Beginning benchmarking of glhScaleImage_asm386 with point filtering:
1024 x 1024 (24 bit) --> 400 x 400 (24 bit)
time = 19.99996 milliseconds

Beginning benchmarking of glhScaleImage_asm386 with linear filtering:
1024 x 1024 (24 bit) --> 400 x 400 (24 bit)
time = 120.00000 milliseconds

First / Second = 75.10016
First / Third = 12.51667
Third / Second = 6.00001

By using point filtering, the algorithm is 75 times faster than gluScaleImage. By using linear filtering, the algorithm is 12 times faster than gluScaleImage. Very impressive numbers, I'd say. The catch is that the image's alignment must be 1, the image must be 24 or 32 bit (GL_RGB or GL_RGB8 or GL_RGBA or GL_RGBA8), and the buffers must be of type GLubyte (unsigned char). That's what most people use (as do I), so that's what I optimized for.
Just remember that your mileage may vary, that the algorithm may have bugs, that the results it generates may not match that of the original gluScaleImage.
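
For reference, point filtering under the constraints above (alignment of 1, 24 or 32 bit pixels, unsigned char buffers) can be sketched in plain C as follows. This is not glhScaleImage_asm386, which is written in assembly. The name scale_image_point and its parameters are hypothetical and only show what the filter does: each destination pixel copies the single nearest source pixel.

```c
#include <stddef.h>

/* Hypothetical sketch of point (nearest-neighbour) scaling.
   Rows are tightly packed (alignment 1); bytes_per_pixel is 3 for
   24 bit RGB and 4 for 32 bit RGBA. dst must hold
   dst_w * dst_h * bytes_per_pixel bytes. */
static void scale_image_point(const unsigned char *src, int src_w, int src_h,
                              unsigned char *dst, int dst_w, int dst_h,
                              int bytes_per_pixel)
{
    for (int y = 0; y < dst_h; ++y) {
        /* Map the destination row to a source row with integer math. */
        int sy = (int)(((long long)y * src_h) / dst_h);
        const unsigned char *src_row =
            src + (size_t)sy * (size_t)src_w * (size_t)bytes_per_pixel;
        unsigned char *dst_row =
            dst + (size_t)y * (size_t)dst_w * (size_t)bytes_per_pixel;

        for (int x = 0; x < dst_w; ++x) {
            int sx = (int)(((long long)x * src_w) / dst_w);
            for (int c = 0; c < bytes_per_pixel; ++c) {
                dst_row[x * bytes_per_pixel + c] =
                    src_row[sx * bytes_per_pixel + c];
            }
        }
    }
}
```

Linear filtering reads and blends several source pixels for every destination pixel, which is consistent with it being about 6 times slower than point filtering in the benchmark above.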

Links

Home Page My Home Page, the root of everything on this server.

The GLU Library The GLU library for OpenGL. Download the latest version!

* OpenGL(R) is a registered trademark of Silicon Graphics, Inc.

This page is http://www.oocities.org/vmelkon/glhlibrary.html
This page is http://ee.1asphost.com/vmelkon/glhlibrary.html
Graphics Library Helper aka glh
Copyright (C) 2001-2005 Vrej M. All Rights Reserved.