For well over a decade, LAPTOP has been testing notebooks in our lab to help you decide which ones rise above the rest. During that time we’ve both adopted synthetic benchmarks and created real-world tests to give shoppers the most complete picture of a given laptop’s performance. Today, we evaluate everything from speed and battery life to display brightness, speaker volume and system heat. We then combine this data with other factors, such as design, usability and value, to determine a system’s rating.
Below is a breakdown of all the tests we use to evaluate laptops and how we use the results to compare similar systems before handing down our final verdict.
Each score is recorded and compared with the averaged scores of all notebooks in the same category. Those categories include:
● Desktop Replacements (16-inch and bigger displays, weighing 7 pounds or more)
● Mainstream (15- and 16-inch displays, weighing less than 7 pounds)
● Thin-and-Lights (12- to 14-inch displays, weighing less than 6 pounds)
● Ultraportables (10- to 13-inch displays, weighing less than 4 pounds)
● Netbooks (low-cost, highly portable systems, usually 10 to 12 inches and based on low-power processors)
A notebook’s results on each test are compared to results from other systems in its category. The category average for any given test and category (example: battery life test for netbooks) is calculated by taking the mean score from the prior 18 months of test results.
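As a rough illustration of the averaging described above, here is a minimal sketch in Python. The function name, the sample data and the choice of 548 days as an approximation of 18 months are all assumptions made for the example, not a description of our internal tools.

```python
from datetime import date, timedelta
from statistics import mean

def category_average(results, today, window_days=548):
    """Mean score of the results dated within the window ending at `today`.

    `results` is a list of (test_date, score) pairs for one test in one
    category. 548 days is an assumed stand-in for "18 months".
    """
    cutoff = today - timedelta(days=window_days)
    recent = [score for test_date, score in results if cutoff <= test_date <= today]
    if not recent:
        raise ValueError("no results in the averaging window")
    return mean(recent)

# Hypothetical netbook battery-life results, in minutes
netbook_battery = [
    (date(2012, 1, 10), 300),  # older than the window, so excluded
    (date(2013, 6, 1), 360),
    (date(2014, 2, 15), 420),
]
average = category_average(netbook_battery, today=date(2014, 3, 1))
```

A notebook's own score on the test would then be compared against `average`.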
These tests measure overall system performance in a single score, stressing the processor, graphics, and storage drive.
This benchmarking suite, developed by Futuremark, runs on 32-bit and 64-bit Windows 8 systems. It is most often used to test a netbook’s CPU, memory, and 2D and 3D graphics through video playback and transcoding, image manipulation, web browsing and decrypting, importing pictures and other real-world tasks. It puts the CPU, graphics card, and hard drive through the wringer.
Developed by Maxon, this benchmarking suite tests the performance of a notebook’s CPU (by rendering a high-resolution image using every core) and graphics card (by rendering 3D images using the OpenGL API). It runs on both Windows and Mac notebooks.
Geekbench, developed by Primate Labs, runs on a variety of platforms including Apple and 32-bit or 64-bit Windows machines. It tests the performance and speed of each of the processor cores of the notebook, as well as memory performance. This test returns both a single- and a multi-core score; we only use the latter.
These benchmarks measure the ability of a notebook to provide smooth video playback, gaming, and other video-centric tasks.
Developed by Futuremark, this benchmarking suite tests a notebook’s DirectX 11 graphics performance by using features such as tessellation, shaders and multi-threading to stress the graphics card and CPU. This benchmark only works on GPUs that support DirectX 11.
Futuremark’s latest version of its 3D benchmarking tool, 3DMark Pro tests notebooks on a variety of graphics-related tasks, including physics rendering, real-time lighting and heavy particle effects at 720p and 1080p resolutions. 3DMark Pro is better suited to testing a range of systems because it offers three different tests: Ice Storm, Cloud Gate and Fire Strike, each more demanding than the last. For instance, Ice Storm best measures entry-level machines with integrated graphics chips, whereas Fire Strike is aimed at testing the most powerful dedicated GPUs.
World of Warcraft
A key ingredient in the enduring popularity of “World of Warcraft” is the fact that even relatively low-end notebooks can run the game with ease. On both Windows and Mac, we use the game’s built-in benchmarking script, first with the graphics effects set to autodetect, and then at the maximum. We run the test at up to three different resolutions: 1366 x 768, 1920 x 1080, and then at the notebook’s native resolution, if it’s higher. We consider a frame rate of 30 fps or higher to be playable. A score of 30 to 50 frames per second on a mainstream notebook is relatively good.
Developed by Irrational Games, “Bioshock Infinite” is one of the more graphically demanding games of this generation. We use the game’s built-in benchmarking tool to measure the performance of notebooks with dedicated graphics processors. The benchmark runs through two separate scenes at a steady clip. We run this test six times: three times at the lowest settings (at 1366 x 768, 1080p and native resolution, if applicable) and three times at the highest settings with DirectX 11 enabled (at the same three resolutions).
A frame rate of 30 frames per second is generally accepted as playable. At native resolutions and high settings, a score between 35 and 45 fps is considered respectable.
Metro: Last Light
This game, developed by 4A Games, is an even more intense test of a notebook’s graphics performance than “Bioshock Infinite.” Just as before, we run the included benchmark tool, which runs through a single scene three times and provides the average frame rate across all three runs. To give a rounded idea of performance, we run the benchmark six times: twice at 1366 x 768, twice at 1920 x 1080 and twice at the notebook’s native resolution. At each resolution, we run the test once with settings on low and with DirectX 11 tessellation and Nvidia PhysX deactivated, then again with all settings maxed and both features activated.
As always, 30 frames per second is considered playable. However, given the demanding nature of “Metro: Last Light,” a score between 17 and 27 frames per second at native resolution and the highest settings is considered good, though still unplayable.
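The grid of resolutions and settings used for the game benchmarks above can be sketched as follows. This is only an illustration: the function names are hypothetical, and the rule for skipping resolutions a panel cannot display is our assumption about what “if applicable” means.

```python
from itertools import product

STANDARD_RESOLUTIONS = [(1366, 768), (1920, 1080)]
SETTINGS = ["low", "max"]
PLAYABLE_FPS = 30  # threshold stated in the text

def build_run_matrix(native):
    """List the (resolution, setting) runs for a panel with this native resolution."""
    # Assumption: skip any standard resolution larger than the panel itself
    resolutions = [r for r in STANDARD_RESOLUTIONS
                   if r[0] <= native[0] and r[1] <= native[1]]
    # Add the native resolution only when it is not already covered
    if native not in resolutions:
        resolutions.append(native)
    return list(product(resolutions, SETTINGS))

def is_playable(fps):
    """True when the average frame rate meets the 30 fps threshold."""
    return fps >= PLAYABLE_FPS

runs = build_run_matrix((2560, 1440))  # hypothetical high-resolution panel
```

For a 2560 x 1440 panel this yields six runs; a 1366 x 768 panel would need only two.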
Laptop Battery Test
This test, developed in the Laptop labs, replicates continuous web surfing over Wi-Fi until the battery is completely drained. Starting with a full battery, a notebook runs a script that visits 60 popular web sites in a loop, pausing for 30 seconds on each, then closing and reopening the notebook’s native browser with the next page. The test is run with the screen at 100 nits, and the notebook’s settings are tweaked to prevent it from entering standby mode or going into hibernation.
Laptop File Transfer Test
This benchmarking test was developed inside the Laptop labs. During this test, a 4.97GB folder of mixed media files, including photos, documents, videos, and music files of varying sizes, is copied from one folder on the notebook’s hard drive to another. We record the speed at which the notebook copies the folder, in MBps.
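The transfer rate is simply the folder size divided by the elapsed time. A minimal sketch, assuming 1GB is counted as 1,024MB (the text does not say which convention is used) and using a made-up copy time:

```python
def transfer_rate_mbps(size_gb, seconds, mb_per_gb=1024):
    """Return the copy speed in megabytes per second.

    mb_per_gb=1024 is an assumption; pass 1000 for decimal units.
    """
    if seconds <= 0:
        raise ValueError("elapsed time must be positive")
    return size_gb * mb_per_gb / seconds

# Hypothetical run: the 4.97GB test folder copied in 60 seconds
rate = transfer_rate_mbps(4.97, 60)
```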
Using a stopwatch, we record the amount of time the notebook takes to boot from being completely shut down to the moment we have control of the desktop and all of the system tray items are loaded.
To test the system’s external temperature, we stream a Hulu video at full screen for 15 minutes, and then use a Raytek MiniTemp laser temperature gauge to measure the temperature (in Fahrenheit) of the touchpad, the space between the G and H keys, and the underside of the notebook. We also measure any other hot spots on the notebook.
In the case of gaming laptops, we play a game–such as “Bioshock Infinite”–for 15 minutes, and then retake the temperatures in those same areas.
We consider anything above 95 degrees Fahrenheit to be uncomfortable and anything above 100 degrees too hot.
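Those two thresholds translate into a simple classification. In the sketch below, the function name, the “acceptable” label and the sample readings are ours, added purely for illustration:

```python
def rate_surface_temp(deg_f):
    """Classify a surface temperature given in degrees Fahrenheit."""
    if deg_f > 100:
        return "too hot"
    if deg_f > 95:
        return "uncomfortable"
    return "acceptable"  # our label; the text only names the two upper bands

# Hypothetical readings from the three standard measurement spots
readings = {"touchpad": 84.5, "G/H keys": 96.0, "underside": 103.0}
verdicts = {spot: rate_surface_temp(temp) for spot, temp in readings.items()}
```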
Display Brightness and Quality
To measure the brightness of a notebook’s display, we enable the notebook’s high-contrast white background and then use a Spyder4 colorimeter and the Dispcal app to measure the brightness of each of the four corners of the screen, as well as the center. We then average the five readings to determine the display’s overall brightness in nits.
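The averaging step is a plain arithmetic mean of the five readings. A minimal sketch with invented sample values (the function name is hypothetical):

```python
from statistics import mean

def average_brightness(readings_nits):
    """Average the four corner readings and the center reading, in nits."""
    if len(readings_nits) != 5:
        raise ValueError("expected four corner readings plus the center")
    return mean(readings_nits)

# Hypothetical readings: four corners, then the center
sample = [250, 240, 260, 245, 255]
overall = average_brightness(sample)
```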
We’re using the Spyder4 colorimeter for more than just measuring brightness; we’re also using it to see how good a screen is at rendering colors. Using the Dispcal app, we measure a screen’s RGB color gamut and Delta-E.
RGB color gamut is measured on a scale of 1 to 100 percent; the closer a screen is to 100 percent, the more colors it can display in the RGB color space. Displays are capable of exceeding the RGB color space–which is not a bad thing–but at the very least, a good display will be able to render 100 percent.
Delta E (dE) measures how accurately the screen displays different colors, with 0 being a perfect match and higher numbers reflecting lower accuracy. While one could get dE numbers for many individual colors, our test measures the average. Generally, a dE of 1.0 is regarded as the smallest difference a human eye can see.
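To make the idea of an average dE concrete, here is a minimal sketch that uses the CIE76 definition, in which Delta E is the straight-line distance between two colors in CIELAB space. The text does not say which Delta E formula the measurement software applies, so treat CIE76 as an assumption; the color patches are invented.

```python
from math import dist
from statistics import mean

def delta_e_cie76(lab_reference, lab_measured):
    """CIE76 Delta E: Euclidean distance between two (L*, a*, b*) colors."""
    return dist(lab_reference, lab_measured)

# Hypothetical (reference, measured) pairs in CIELAB coordinates
patches = [
    ((50.0, 0.0, 0.0), (50.0, 3.0, 4.0)),
    ((60.0, 10.0, 10.0), (61.0, 10.0, 10.0)),
]
average_de = mean(delta_e_cie76(ref, meas) for ref, meas in patches)
```

A result of 0 would mean every measured patch matched its reference exactly.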
Laptop Spreadsheet Macro Test
Like the Battery Test and Transfer Test, the Spreadsheet Macro Test was designed inside the Laptop labs to stress the CPU. During this test, 20,000 names are matched to their corresponding addresses in OpenOffice. We time how long it takes the notebook to complete this task; the shorter, the better.
We measure two metrics on keyboards: key travel and actuation. Key travel measures the difference in the height of the key from its resting state to when it is fully depressed. Thinner notebooks will have less travel (perhaps 1 millimeter) while gaming notebooks will have greater travel. Key actuation measures the amount of force, in grams, required for a key to depress. Generally, we prefer key travel of 1.5 to 2mm or greater, and an actuation of at least 50 grams.
Laptop Audio Test
Developed within the Laptop labs, this test uses a decibel meter to measure the maximum loudness of notebook speakers. We play a steady tone file on loop and record the loudness in decibels with the meter placed 23 inches away (the typical distance we’ve determined between a user and a laptop screen). We conduct this test within a studio lined with foam paneling to minimize the effect of reflected sound waves on the decibel reading.
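Generally speaking, decibel outputs above 80 dB are considered satisfactory. Because the decibel scale is logarithmic, a difference of as little as 3 dB equates to a doubling or halving of sound power, while a difference of about 10 dB is commonly perceived as a doubling or halving of loudness. (E.g., 90 dB is commonly perceived to be twice as loud as 80 dB by the human ear.)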
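Decibels are logarithmic, so a difference in dB corresponds to a ratio of sound power, not a fixed step. A minimal sketch of that standard conversion (the function name is ours, for illustration only):

```python
def power_ratio(db_difference):
    """Convert a difference in decibels to a ratio of sound power."""
    return 10 ** (db_difference / 10)

ratio_3db = power_ratio(3)    # close to 2: roughly double the sound power
ratio_10db = power_ratio(10)  # ten times the sound power
```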
After we complete our lab testing, a product is turned over to a writer who spends a significant amount of time using the device, software, or service. The writer and Laptop’s editors determine a rating based on design, ease of use, features, performance, and overall value. We also take into consideration the target audience of a product and what it is trying to accomplish, and how it stacks up compared to the competition. Each product is rated on a scale of 1 to 5 stars, with half-star ratings possible.
The ratings should be interpreted as follows:
The Editor’s Choice award recognizes products that are the very best in their categories at the time they are reviewed, and only products that have received a rating of 4 stars or above are eligible. Laptop’s editors carefully consider each product’s individual merits and its value relative to the competitive landscape before deciding whether to bestow this award.