NVIDIA GeForce GTX 460 Review - GF104 and the budget Fermi
More than just a shrink
The Need for GF104
This is the original GF100 GPU that consists of four GPCs (Graphics Processing Clusters), each with four SMs (Streaming Multiprocessors), each with 32 CUDA cores. We never saw that magical and elusive 512-core part from NVIDIA; the GeForce GTX 480 has a single SM disabled for a total of 480 cores.
Now here is the GF104 GPU, which looks surprisingly different. There are two GPCs and four SMs per GPC, but there are some interesting changes, like the move from 32 to 48 CUDA cores in each SM. Most of the other components scale accordingly, including the move to a 256-bit memory bus (though there is a 192-bit option we'll discuss as well).
Also worth noting is that just like the GF100, the GF104 is being released with one SM completely disabled - the remaining seven SMs add up to the total 336 CUDA cores.
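The core counts quoted above follow directly from the GPC / SM / core layout. A minimal sketch of that arithmetic is below; the `GpuLayout` class and its field names are hypothetical, introduced only for illustration, and the numbers are the ones given in the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GpuLayout:
    """Hypothetical container for the layout figures quoted in the article."""
    name: str
    gpcs: int
    sms_per_gpc: int
    cores_per_sm: int
    disabled_sms: int

    @property
    def total_sms(self) -> int:
        # SMs physically present on the die
        return self.gpcs * self.sms_per_gpc

    @property
    def full_cores(self) -> int:
        # CUDA cores if every SM were enabled
        return self.total_sms * self.cores_per_sm

    @property
    def enabled_cores(self) -> int:
        # CUDA cores on the shipping card, with disabled SMs removed
        return (self.total_sms - self.disabled_sms) * self.cores_per_sm


GTX_480 = GpuLayout("GTX 480 (GF100)", gpcs=4, sms_per_gpc=4, cores_per_sm=32, disabled_sms=1)
GTX_460 = GpuLayout("GTX 460 (GF104)", gpcs=2, sms_per_gpc=4, cores_per_sm=48, disabled_sms=1)

for gpu in (GTX_480, GTX_460):
    print(f"{gpu.name}: {gpu.enabled_cores} of {gpu.full_cores} cores enabled")
```

Run as written, this reports 480 of 512 cores for the GTX 480 and 336 of 384 for the GTX 460, so a fully enabled GF104 would carry 384 CUDA cores.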
Taking a closer look at the SM of the new GF104 we can see some other interesting changes. The increase in the number of CUDA cores in the SM is balanced by doubling the number of instruction dispatch units to four. NVIDIA has also doubled the number of texture units for the SM to 8 and this indicates a higher concentration of texture performance than shader performance when compared to the GF100 design.
Most interesting is the fact that the PolyMorph Engines are now balanced quite differently, with one per 48 CUDA cores rather than one per 32. Remember that NVIDIA has been highly touting its advantage in tessellation performance and in games that use the technology, so this change moves the tessellation performance per CUDA core down somewhat. With one PolyMorph Engine in each SM, the GTX 480 has 32 cores per tessellation engine while the new GTX 460 has 48.
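To put rough numbers on the rebalancing inside the SM, the sketch below compares fixed-function units per CUDA core for the two designs. It assumes one PolyMorph Engine per SM, and it infers the GF100 figure of 4 texture units per SM from the statement that GF104 doubled the count to 8. The dictionary and function names are hypothetical.

```python
from fractions import Fraction

# Per-SM resources as described in the text. The GF100 texture-unit count (4)
# is inferred from "doubled ... to 8"; one PolyMorph Engine per SM is assumed.
SM_CONFIGS = {
    "GF100": {"cuda_cores": 32, "texture_units": 4, "polymorph_engines": 1},
    "GF104": {"cuda_cores": 48, "texture_units": 8, "polymorph_engines": 1},
}


def per_core(sm: dict, unit: str) -> Fraction:
    """Units of the given type available per CUDA core within one SM."""
    return Fraction(sm[unit], sm["cuda_cores"])


def relative_change(unit: str) -> Fraction:
    """GF104 per-core ratio divided by the GF100 per-core ratio."""
    return per_core(SM_CONFIGS["GF104"], unit) / per_core(SM_CONFIGS["GF100"], unit)


for unit in ("texture_units", "polymorph_engines"):
    change = relative_change(unit)
    print(f"{unit}: GF104 has {float(change):.2f}x the per-core ratio of GF100")
```

Under these assumptions the texture ratio comes out at 4/3 of GF100's and the PolyMorph ratio at 2/3, which matches the direction of the shift described above: more texturing and less tessellation hardware per shader core. These are unit counts only, not measured performance.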
As you would expect the new GF104 GPU is quite a bit smaller than the original GF100 - and it is an odd rectangular shape based on the dimensions of the heat spreader resting over it. Also note that we are testing A1 silicon; a notable achievement for anyone familiar with processor design.
For reference, here is a GF100 GPU with the same quarter size comparison.
While NVIDIA wouldn't tell us the exact die size for the new GF104, we do know that it has a transistor count of 1.95 billion compared to the 3.0 billion of GF100, or 35% fewer transistors.
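The 35% figure is a transistor-count comparison rather than a measured die area. A quick sketch of the calculation, using only the two counts quoted above (the constant and function names are hypothetical):

```python
GF104_TRANSISTORS = 1.95e9  # quoted transistor count for GF104
GF100_TRANSISTORS = 3.0e9   # quoted transistor count for GF100


def transistor_reduction(small: float, large: float) -> float:
    """Fractional reduction in transistor count going from `large` to `small`."""
    return 1.0 - small / large


reduction = transistor_reduction(GF104_TRANSISTORS, GF100_TRANSISTORS)
print(f"GF104 has {reduction:.0%} fewer transistors than GF100")
```

Since both chips are built on the same process, transistor count is a reasonable proxy for area, but the actual die size may not shrink by exactly the same proportion.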
If we look at the rest of the specifications for the GTX 460 / GF104 GPU, a few other notable items stand out. I already covered the texture unit count per SM, but it's nice to see a total of 56 of them on such a low cost graphics card when the GTX 480 has only slightly more with 60. The ROP count comes in at 24 or 32 depending on the memory configuration, and total memory bandwidth comes in at 86.4 GB/s or 115.2 GB/s compared to the 177.4 GB/s of the GF100. The drop in L2 cache size going from the 256-bit GTX 460 (512KB) to the 192-bit version (384KB) will also affect performance in games as well as in non-gaming applications, and I'll be curious how that pans out for CUDA-enabled programs.
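The two GTX 460 memory configurations can be sanity-checked against each other: ROPs, L2 cache and bandwidth all appear to scale with bus width. The sketch below uses only the figures quoted above, pairing the lower number of each spec with the 192-bit bus, and derives the per-pin data rate implied by each bandwidth figure. The dictionary layout and function names are hypothetical.

```python
import math

# Figures quoted in the text, keyed by memory bus width.
GTX_460_CONFIGS = {
    256: {"rops": 32, "l2_kb": 512, "bandwidth_gbs": 115.2},
    192: {"rops": 24, "l2_kb": 384, "bandwidth_gbs": 86.4},
}


def implied_data_rate_gbps(bus_bits: int, bandwidth_gbs: float) -> float:
    """Per-pin data rate (Gbit/s) implied by a bandwidth and bus width."""
    bus_bytes = bus_bits / 8
    return bandwidth_gbs / bus_bytes


def scaling_ratios(narrow: int, wide: int) -> dict:
    """Ratio of each spec on the narrow-bus card to the wide-bus card."""
    n, w = GTX_460_CONFIGS[narrow], GTX_460_CONFIGS[wide]
    ratios = {key: n[key] / w[key] for key in n}
    ratios["bus_bits"] = narrow / wide
    return ratios


for bus_bits, cfg in GTX_460_CONFIGS.items():
    rate = implied_data_rate_gbps(bus_bits, cfg["bandwidth_gbs"])
    print(f"{bus_bits}-bit: implied data rate {rate:.1f} Gbit/s per pin")

ratios = scaling_ratios(192, 256)
print({key: round(value, 2) for key, value in ratios.items()})
print("all specs scale with bus width:",
      all(math.isclose(v, ratios["bus_bits"]) for v in ratios.values()))
```

Both configurations imply the same per-pin data rate, and every listed spec on the 192-bit card is three quarters of its 256-bit counterpart. That is consistent with the narrower card simply dropping one quarter of the memory back end rather than running slower memory.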
With such a dramatic shuffling of the GF100 architecture, it seems obvious that NVIDIA found a couple of things worth changing as they built the GF104 chip for use in the GTX 460. By reducing the number of PolyMorph Engines per CUDA core, NVIDIA has lowered tessellation throughput relative to shader throughput in comparison to the GTX 480 and GTX 470. I doubt they have lowered it too much, but NVIDIA obviously thought the tessellation engines were idling a bit too often, so increasing the number of shader cores per PolyMorph Engine should shift the balance in the other direction. This type of adjustment happens pretty often as GPU companies move from process node to process node or between redesigns like this. NVIDIA is simply adjusting its estimates for GPU performance and utilization across games, GPGPU functionality, etc., in hopes of finding better power efficiency in the long run.