Video Core and Memory Speed of GeForce3
tnaw_xtennis@usa.net
2001.02.21
NVIDIA will release its GeForce3 (card picture here) in a few days. Several leaks (1, 2) indicate that the video core and memory on the GeForce3 will run at 200MHz and 230MHz respectively. Neither is any faster than on the GeForce2 Ultra (core/memory = 250MHz/230MHz). We all know that memory bandwidth is the main factor limiting the performance of video cards today. So, with the same memory bandwidth and a lower core speed, how can the GeForce3 outperform its predecessor?
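The "same memory bandwidth" can be checked from the memory clock alone. The sketch below assumes a 128-bit memory bus and double data rate, an assumption that matches the 7.360 GB/s listed for 230MHz memory in the appendix. The function name and its defaults are illustrative, not taken from any NVIDIA documentation.

```python
def memory_bandwidth_gb_s(clock_mhz, transfers_per_clock=2, bus_width_bits=128):
    """Peak memory bandwidth in GB/s, counting 1 GB as 1000 MB as the tables do."""
    bytes_per_transfer = bus_width_bits / 8  # assumed 128-bit bus -> 16 bytes
    return clock_mhz * transfers_per_clock * bytes_per_transfer / 1000


# GeForce2 Ultra and the rumoured GeForce3 both use 230 MHz DDR memory.
print(round(memory_bandwidth_gb_s(230), 3))

# An SDR card transfers once per clock, e.g. the TNT2 Ultra at 183 MHz.
print(round(memory_bandwidth_gb_s(183, transfers_per_clock=1), 3))
```

Under these assumptions both cards come out at the same peak figure, so any GeForce3 advantage has to come from how the bandwidth is used rather than from how much of it there is.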
Video core speed: The 3D rendering capacity of the fastest NVIDIA graphics processor has reached 1000 mega pixels per second. However, this fill rate is never realised in practice, because either the CPU or the memory bandwidth becomes the limit first.
Table 1 shows the results of an analysis using the method introduced in Calculating the Potentiality of Your GeForce2. At 640 x 480 x 32bit, the performance of the GeForce2 Ultra is limited by the CPU, even in a Pentium 4 1.5 GHz system. Only 23% of the GeForce2 Ultra's available fill rate comes into play, which equals the rendering output of a GeForce2 Ultra running at 58MHz. At 1600 x 1200 x 32bit, the performance of the GeForce2 Ultra is limited by video memory bandwidth, and we get the same performance as a GeForce2 Ultra running at a 110MHz core speed. Therefore, whether the GeForce3 core runs at 200MHz or 250MHz makes no performance difference for us end users, just as with the GeForce2 series. It is only a specification printed on paper.
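The 58MHz and 110MHz figures follow from simple proportion. Here is a minimal sketch, assuming that fill rate scales linearly with core clock and that Table 1's fill-rate figures can be compared directly with the chip's peak fill rate, as the percentages in the table do. The function names are illustrative.

```python
def share_of_peak(needed_fill_rate, peak_fill_rate):
    """Fraction of the chip's peak fill rate that the game actually needs."""
    return needed_fill_rate / peak_fill_rate


def equivalent_core_mhz(needed_fill_rate, peak_fill_rate, core_mhz):
    """Core clock that would just deliver the needed fill rate.

    Assumes fill rate scales linearly with core clock (an assumption of
    this sketch, in line with the article's reasoning).
    """
    return needed_fill_rate * core_mhz / peak_fill_rate


# GeForce2 Ultra values from Table 1: peak fill rate 1000, core clock 250 MHz.
for resolution, needed in (("640x480", 233), ("1600x1200", 438)):
    share = share_of_peak(needed, 1000)
    clock = equivalent_core_mhz(needed, 1000, 250)
    print(f"{resolution}: {share:.0%} of peak, about {clock:.2f} MHz")
```

The two clocks come out at 58.25 and 109.50 MHz, which round to the 58MHz and 110MHz quoted above. Both are far below 200MHz.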
Video memory speed: The system bottleneck that limits the performance of most 3D games at higher resolutions, such as 1024 x 768 x 32bit, is the memory bandwidth of the video card. But one must keep in mind that a video processor with a more advanced core architecture can manage memory bandwidth more efficiently and thus deliver better performance with the same memory bandwidth. This is shown clearly by the GeForce SDR vs. TNT2 Ultra comparison in Table 2: the GeForce SDR has less memory bandwidth (2.656 vs. 2.928 GB/s) yet a higher Quake 3 score (50 vs. 39 fps). More efficient use of memory bandwidth is one of the main points that lets the GeForce3 shine over the GeForce2 Ultra.
It is also worth noting that several rendering mechanisms that save video memory bandwidth, such as Hidden Surface Removal (HSR) and HyperZ, are being applied on video cards today. With a fixed memory bandwidth, these features can give a performance boost of 20% to 40%.
Table 2 and Table 3 show some analysis results concerning memory bandwidth on different NVIDIA based cards. It is interesting to see from these results (highlighted in light blue) that for the same video chip, the performance obtained per unit of memory bandwidth decreases as memory bandwidth increases: frames per GB drop by 19% from GeForce SDR to GeForce DDR, and by 8% from GeForce2 Pro to GeForce2 Ultra. This is why we can't make a great video card just by increasing its memory bandwidth without an advanced video core, even though memory bandwidth is the main factor limiting the performance of video cards today.
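The efficiency figures discussed here are plain ratios. The sketch below recomputes frames per GB and the card-to-card change from the Quake 3 scores and bandwidths listed in the appendix. It rounds frames per GB to one decimal before comparing, since the published percentages appear to have been derived from the rounded values. The dictionary layout and function names are illustrative assumptions.

```python
# (Quake 3 fps, memory bandwidth in GB/s), copied from the appendix table.
CARDS = {
    "TNT2 Ultra": (39, 2.928),
    "GeForce SDR": (50, 2.656),
    "GeForce DDR": (73, 4.8),
    "GeForce2": (100, 5.312),
    "GeForce2 Pro": (115, 6.4),
    "GeForce2 Ultra": (122, 7.36),
}


def frames_per_gb(card):
    """Quake 3 frames per GB of memory bandwidth, rounded to one decimal."""
    fps, bandwidth = CARDS[card]
    return round(fps / bandwidth, 1)


def improvement(new, old):
    """Relative change in frames per GB from one card to the next."""
    return frames_per_gb(new) / frames_per_gb(old) - 1


names = list(CARDS)  # dicts keep insertion order, so this is oldest to newest
for old, new in zip(names, names[1:]):
    print(f"{new} / {old}: {frames_per_gb(new)} frames/GB, {improvement(new, old):+.0%}")
```

A new chip (GeForce SDR, GeForce2) raises the ratio, while simply adding bandwidth to the same chip (GeForce DDR, GeForce2 Pro, GeForce2 Ultra) lowers it.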
With the ability to execute programmable instructions much like a CPU, the GeForce3 strides into a new stage of 3D graphics processor development. Though we will probably have to wait half a year before games are released that take advantage of this new feature, the GeForce3's new core architecture and advanced memory interface will bring a distinct performance boost over its predecessor in today's games.
We also note that the low core and memory speeds of the GeForce3 at introduction, together with the 300 MHz DDR SDRAM already available, will let NVIDIA easily follow the GeForce3 with refresh products, GeForce3 Pro (?) and Ultra (?), just as it did with the GeForce2 Pro and Ultra.
Appendix
Table 1. Analyzing the GeForce2 based cards by the Quake 3 testing results
| | GeForce2 GTS | GeForce2 Pro | GeForce2 Ultra |
| Video Card Memory Bandwidth (GB/s) | 5.312 | 6.400 | 7.360 |
| Video Processor Fill Rate (Mega Pixels/s) | 800 | 800 | 1000 |
| 640x480 | | | |
| Quake 3 Frame_rate (fps) * | 190 | 190 | 190 |
| Fill_rate corresponding to the Quake 3 Frame_rate (MB/s) | 233 | 233 | 233 |
| Fill Rate of GeForce2 that brings into play (%) | 29% | 29% | 23% |
| Video Core Speed Needed Corresponding to the Quake 3 Frame_rate (MHz) | 58 | 58 | 58 |
| 3D_rendering_memory_bandwidth of GeForce2 that brings into play (%) | 71% | 58% | 51% |
| theory un-CPU-limited Quake 3 Frame_rate (fps) | 269 | 325 | 374 |
| 1600x1200 | | | |
| Quake 3 Frame_rate (fps) * | 39 | 47 | 57 |
| Fill_rate corresponding to the Quake 3 Frame_rate (MB/s) | 300 | 361 | 438 |
| Fill Rate of GeForce2 that brings into play (%) | 38% | 45% | 44% |
| Video Core Speed Needed Corresponding to the Quake 3 Frame_rate (MHz) | 76 | 90 | 110 |
| theory un-memory_bandwidth-limited Quake 3 Frame_rate (fps) | 104 | 104 | 130 |
| theory memory_bandwidth needed (GB/s) | 13.196 | 13.196 | 16.351 |
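As a consistency check on Table 1, the percentage rows appear to equal the measured frame rate divided by the corresponding theoretical frame rate. A minimal sketch with the table's own values; the function name is an illustrative assumption, and the relationship is inferred from the numbers, not stated in the table.

```python
def share_in_use(measured_fps, ceiling_fps):
    """Measured frame rate as a fraction of the theoretical ceiling."""
    return measured_fps / ceiling_fps


# 640x480: measured 190 fps against the un-CPU-limited frame rates.
for card, ceiling in (("GTS", 269), ("Pro", 325), ("Ultra", 374)):
    print("640x480", card, f"{share_in_use(190, ceiling):.0%}")

# 1600x1200: measured fps against the un-memory-bandwidth-limited frame rates.
for card, fps, ceiling in (("GTS", 39, 104), ("Pro", 47, 104), ("Ultra", 57, 130)):
    print("1600x1200", card, f"{share_in_use(fps, ceiling):.0%}")
```

Read this way, the 640x480 rows give the share of memory bandwidth in use while the CPU is the limit, and the 1600x1200 rows give the share of fill rate in use while memory bandwidth is the limit.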
Table 2. Analyzing the memory bandwidth of NVIDIA based cards
| Video Card | Core Speed (MHz) | Memory Bandwidth (GB/sec) / Memory Speed (MHz) | Pixel Fill Rate / Texel Fill Rate (MegaPixels/sec / MegaTexels/sec) | Quake 3 Performance Index * (FPS) | Quake 3 Performance per Memory Bandwidth (Frame / GB) |
| TNT2 Ultra | 150 | 2.928 / 183 | 300 / 300 | 39 | 13.3 |
| GeForce SDR | 120 | 2.656 / 166 | 480 / 480 | 50 | 18.8 |
| GeForce DDR | 120 | 4.8 / 150x2 | 480 / 480 | 73 | 15.2 |
| GeForce2 | 200 | 5.312 / 166x2 | 800 / 1600 | 100 | 18.8 |
| GeForce2 Pro | 200 | 6.4 / 200x2 | 800 / 1600 | 115 | 18.0 |
| GeForce2 Ultra | 250 | 7.360 / 230x2 | 1000 / 2000 | 122 | 16.6 |
| NV20 | 200 | 7.360 / 230x2 | 800 / 3200 | | |
Table 3. Improvement from each NVIDIA based card to the next
| | Quake 3 Performance Improvement | Performance per Memory Bandwidth Improvement |
| GeForce SDR / TNT2 Ultra | +28% | +41% |
| GeForce DDR / GeForce SDR | +46% | -19% |
| GeForce2 / GeForce DDR | +37% | +24% |
| GeForce2 Pro / GeForce2 | +15% | -4% |
| GeForce2 Ultra / GeForce2 Pro | +6% | -8% |
| NV20 / GeForce2 Ultra | | |
Turning your GeForce/GeForce2 into a Quadro/Quadro2
Pentium 4 1.5GHz or Athlon Thunderbird 1.2GHz (DDR) ( 2000/11/28 )
Calculating the Potentiality of Your GeForce2 ( 2000/12/26 )
A Preview of NV20 ( 2001/01/18 )
Sidelights of Voodoo5 6000 ( 2001/02/08 )