Nvidia Kepler GPU Tested: Smaller, Stronger, Made for Ultrabooks

Nvidia today officially announced its new 600M Series of GPUs, which feature the company's new Kepler architecture. The chip promises 10X the speed of Intel's integrated graphics, but with the size and efficiency needed to squeeze into slim Ultrabooks like the Acer Aspire Timeline Ultra M3. We got a sneak peek at the new GPU and took one for a spin, too. What does Nvidia promise, and does it deliver?

What is it?

Kepler is the code name for Nvidia's new GPU architecture. It is built on a 28-nanometer process, which is smaller than the 40-nm process used in the previous generation, known as Fermi.

Inside, the GPU has 8 geometry units, 32 ROP units, 4 raster units, and a 256-bit GDDR5 memory interface.

Additionally, the 8 streaming multiprocessors (SMX) have been redesigned for greater power efficiency. Inside each SMX are 192 CUDA cores (for a total of 1,536), 16 texture units (for a total of 128), and Polymorph Engine 2.0.
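The chip-wide totals follow directly from the per-SMX counts. As a quick sanity check, here is a minimal Python sketch; the constant and function names are ours for illustration, not Nvidia's.

```python
# Per-SMX figures as quoted in the article.
SMX_COUNT = 8
CUDA_CORES_PER_SMX = 192
TEXTURE_UNITS_PER_SMX = 16


def chip_totals(smx_count, cores_per_smx, texture_units_per_smx):
    """Multiply per-SMX resources by the number of SMX blocks."""
    return {
        "cuda_cores": smx_count * cores_per_smx,
        "texture_units": smx_count * texture_units_per_smx,
    }


totals = chip_totals(SMX_COUNT, CUDA_CORES_PER_SMX, TEXTURE_UNITS_PER_SMX)
print(totals)  # {'cuda_cores': 1536, 'texture_units': 128}
```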

One way that Nvidia has increased power efficiency is by streamlining the scheduling process. In Fermi, the scheduler included a hardware-based step to ensure that the data being sent through was valid. However, Nvidia realized this was redundant, and was able to remove it. 

Polymorph Engine 2.0 improves the GPU's performance on DX11 tessellation, enabling it to deliver double the per-clock performance of the engine in Fermi GPUs.

What can it do?

According to Nvidia, the GT 640M GPU is more than twice as efficient as the GT 540M, and ten times faster than integrated graphics.

The Kepler GPUs will have roughly twice the performance per watt of Fermi GPUs. For example, a 600M GPU has a TDP (thermal design power) of about 25 watts, whereas the 500M has a TDP of about 50 watts. That means notebook makers will be able to fit this GPU into systems with thinner profiles, such as the Acer Aspire Timeline Ultra M3, and still keep all the parts cool. However, don't expect discrete GPUs in Ultrabooks as thin as the MacBook Air; that's a little bit beyond the laws of physics for now.
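To see where "twice the performance per watt" comes from, here is a rough Python sketch. It assumes the two parts deliver about the same performance, which is our simplification and not a figure from Nvidia; the function and parameter names are likewise ours.

```python
def perf_per_watt_ratio(tdp_new_watts, tdp_old_watts, relative_performance=1.0):
    """Return how many times more performance per watt the newer part delivers.

    relative_performance is (new performance / old performance).
    The default of 1.0 (equal performance) is an assumption for illustration.
    """
    return relative_performance * tdp_old_watts / tdp_new_watts


# Approximate TDPs quoted in the article: about 25 W (600M) vs. about 50 W (500M).
ratio = perf_per_watt_ratio(tdp_new_watts=25, tdp_old_watts=50)
print(ratio)  # 2.0
```

If the newer chip were also faster at its lower TDP, the ratio would rise accordingly.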

Another feature that will be available to all GeForce 600M GPUs is support for FXAA, Nvidia's new anti-aliasing technology, which, according to Nvidia, can provide frame rates twice as high as when using 4xMSAA. The Kepler-based 600M GPUs will be able to do one better, supporting TXAA, which combines anti-aliasing with other technologies to achieve even smoother lines. According to Nvidia, TXAA1 will be visually equal to 8xMSAA, but use only as many resources as 2xMSAA. Games and developers that have committed to offering TXAA support include "MechWarrior Online," "Secret World," "Eve Online," "Borderlands 2," Unreal Engine 4, BitSquid, Slant Six Games, and Crytek.

Kepler GPUs have a new hardware-based H.264 encoding engine, called NVENC, which is almost four times faster than the CUDA-based encoder, yet consumes less power.

As with the previous generation, Kepler GPUs will also support DirectX 11, Optimus graphics-switching, PhysX, Verde, CUDA, 3D Vision, and 3DTV Play. Unlike the GTX 680 desktop GPU, the mobile processors won't have Nvidia GPU Boost, which can dynamically overclock the GPU as needed.  

How powerful is it?

Pretty powerful. Let's compare the Acer Aspire Timeline Ultra M3, which has a 1.7-GHz Intel Core i7-2637M, 4GB of RAM, an Nvidia GeForce GT 640M GPU, and a 256GB SSD, with some heavy hitters:

  • Alienware M14x: 2.3-GHz Intel Core i7-2820QM, 8GB RAM, Nvidia GeForce GT 555M, 1.5GB VRAM, 750GB, 7,200-rpm hard drive, 1600 x 900 display.
  • Apple MacBook Pro (15-inch, 2011): 2.2-GHz Intel Core i7-2720QM, 8GB RAM, AMD Radeon HD 6750M, 1GB VRAM, 750GB, 5,400-rpm hard drive, 1440 x 900 display.
  • HP Pavilion dv7t Quad Edition: 2.0-GHz Intel Core i7-2630QM, 8GB RAM, AMD Radeon HD 6770M, 1GB VRAM, 120GB solid state drive and a 540GB, 7,200-rpm hard drive, 1600 x 900 display.

On the graphics benchmark 3DMark06, the GT 640M in the Acer M3 performed better than the AMD Radeon HD 6750M in the MacBook Pro, and came in a hair behind the AMD Radeon HD 6770M GPU in the HP Pavilion dv7t. The Alienware M14x, which has an Nvidia GT 555M GPU, finished about 1,400 points higher.

In our "World of Warcraft" test, where we max out all the settings, the Acer Aspire M3 came out on top with an average of 80 frames per second, beating even the Alienware M14x (77 fps). However, it should be noted that the M3 has the lowest-resolution display of the bunch (1366 x 768), so at equal resolutions we would expect the M14x to come out slightly ahead. Still, 80 fps is nothing to sneeze at.

Finally, in the more demanding "Far Cry 2" benchmark, the Acer Aspire M3 again came out on top. 

We also ran our "World of Warcraft" test on the Acer Aspire M3 using the discrete Nvidia GPU and the integrated Intel HD Graphics 3000 GPU. The results, as you can imagine, were quite disparate. 

With the settings on autodetect, and the screen resolution set to 1366 x 768, the integrated GPU notched 35 fps, which is playable, but the Nvidia GPU scored 155 fps, more than four times as high. When we upped the effects to max, the integrated GPU couldn't handle it, but the discrete GPU clocked in at 80 fps. 
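For readers who want to check the "more than four times" figure, the short Python sketch below computes the speedup from the two autodetect results above, along with the equivalent time per frame. The helper names are ours.

```python
# Autodetect results at 1366 x 768, as measured in the article.
INTEGRATED_FPS = 35   # Intel HD Graphics 3000
DISCRETE_FPS = 155    # Nvidia GeForce GT 640M


def speedup(fast_fps, slow_fps):
    """Ratio of two frame rates."""
    return fast_fps / slow_fps


def frame_time_ms(fps):
    """Average time spent rendering one frame, in milliseconds."""
    return 1000.0 / fps


print(f"Speedup: {speedup(DISCRETE_FPS, INTEGRATED_FPS):.2f}x")     # Speedup: 4.43x
print(f"Integrated: {frame_time_ms(INTEGRATED_FPS):.1f} ms/frame")  # 28.6 ms/frame
print(f"Discrete: {frame_time_ms(DISCRETE_FPS):.1f} ms/frame")      # 6.5 ms/frame
```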

We also tested the NVENC encoding engine by converting a 5-minute 1080p MPEG-4 video into an iPod touch format, using a beta version of CyberLink MediaEspresso. Indeed, the Nvidia GPU was fast, but Intel's Quick Sync technology was even faster.

Where can I find it?

All nine mobile GPUs announced today will be in the GeForce 600M series. However, not all use the new Kepler architecture; some use the current Fermi architecture, as you can see in the charts below.

Mainstream: GeForce GT 620M

Performance: GeForce GT 630M, GT 635M, GT 640M LE, GT 640M, GT 650M

Enthusiast: GeForce GTX 660M, GTX 670M, GTX 675M 

Kepler Specs

| | GT 620M | GT 630M | GT 635M | GT 640M LE | GT 640M | GT 650M |
|---|---|---|---|---|---|---|
| Process | 28 nm | 28/40 nm | 40 nm | 28 nm | 28 nm | 28 nm |
| Architecture | Fermi | Fermi | Fermi | Kepler | Kepler | Kepler |
| Cores | Up to 96 | Up to 96 | Up to 144 | Up to 384 | Up to 384 | Up to 384 |
| Features | Optimus, PhysX, Verde, CUDA, 3DTV Play | Optimus, PhysX, Verde, CUDA, 3D Vision, 3DTV Play | Optimus, PhysX, Verde, CUDA, 3D Vision, 3DTV Play | Optimus, PhysX, Verde, CUDA, 3D Vision, 3DTV Play | Optimus, PhysX, Verde, CUDA, 3D Vision, 3DTV Play | Optimus, PhysX, Verde, CUDA, 3D Vision, 3DTV Play |
| Clock | Up to 625 MHz | Up to 800 MHz | Up to 675 MHz | Up to 500 MHz | Up to 625 MHz | Up to 850 MHz |
| Memory Interface | Up to 1GB GDDR3 | Up to 2GB GDDR3 | Up to 2GB GDDR5 | Up to 2GB GDDR3 | Up to 2GB GDDR3 or GDDR5 | Up to 2GB GDDR3 or GDDR5 |
| Memory Width | Up to 128-bit | Up to 128-bit | Up to 192-bit | Up to 128-bit | Up to 128-bit | Up to 128-bit |
| Bandwidth (GB/s) | Up to 28.8 | Up to 32.0 | Up to 43.2 | Up to 28.8 | Up to 64.0 | Up to 64.0 |

| | GTX 660M | GTX 670M | GTX 675M |
|---|---|---|---|
| Process | 28 nm | 40 nm | 40 nm |
| Architecture | Kepler | Fermi | Fermi |
| Cores | Up to 384 | Up to 336 | Up to 384 |
| Features | Optimus, SLI, PhysX, Verde, CUDA, 3DTV Play | Optimus, SLI, PhysX, Verde, CUDA, 3DTV Play | Optimus, SLI, PhysX, Verde, CUDA, 3DTV Play |
| Processor Clock | Up to 835 MHz | Up to 598 MHz | Up to 620 MHz |
| Memory Clock | Up to 2000 MHz | Up to 1500 MHz | Up to 1500 MHz |
| Memory Interface | Up to 2GB GDDR5 | Up to 3GB GDDR5 | Up to 2GB GDDR5 |
| Memory Width | Up to 128-bit | Up to 192-bit | Up to 256-bit |
| Bandwidth (GB/s) | Up to 64.0 | Up to 72.0 | Up to 96.0 |
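
The bandwidth figures for the three GTX parts can be reproduced from the memory clock and memory width listed for them. The Python sketch below assumes two transfers per listed memory clock; that factor is our inference from the numbers in the table, not something Nvidia states, and the function name is ours.

```python
def memory_bandwidth_gb_per_s(memory_clock_mhz, bus_width_bits, transfers_per_clock=2):
    """Peak memory bandwidth in GB/s.

    transfers_per_clock=2 is an assumption that happens to match the
    table's figures; it is not taken from Nvidia's documentation.
    """
    bytes_per_transfer = bus_width_bits / 8
    return memory_clock_mhz * 1e6 * transfers_per_clock * bytes_per_transfer / 1e9


# (memory clock in MHz, memory width in bits), as listed for each GPU.
gtx_specs = {
    "GTX 660M": (2000, 128),
    "GTX 670M": (1500, 192),
    "GTX 675M": (1500, 256),
}

for name, (clock_mhz, width_bits) in gtx_specs.items():
    print(f"{name}: {memory_bandwidth_gb_per_s(clock_mhz, width_bits):.1f} GB/s")
# GTX 660M: 64.0 GB/s
# GTX 670M: 72.0 GB/s
# GTX 675M: 96.0 GB/s
```

The results show why the Fermi-based GTX 675M tops the bandwidth row despite its slower memory clock: its 256-bit bus moves twice as many bytes per transfer as the GTX 660M's 128-bit bus.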