NVIDIA GeForce GTX TITAN X review and testing. Review of the NVIDIA TITAN X video adapter: the big Pascal, and the card's thermal behavior

Nvidia Geforce GTX Titan X

The most powerful single-processor accelerator

  • Part 2 - Practical introduction

Due to the late arrival of a test sample of the new accelerator (and of the software for it), and because our author Alexei Berillo was busy working at GTC, the parts of this review devoted to the architecture of the new Nvidia product and to the analysis of the synthetic tests will be published later (in about a week). For now we present a material that introduces readers to the features of the video card, as well as the results of gaming tests.

Device(s)



Nvidia Geforce GTX Titan X 12288 MB 384-bit GDDR5 PCI-E
(the values of the tested sample match the nominal reference values)

GPU: Geforce GTX Titan X (GM200)
Interface: PCI Express x16
GPU frequency (ROPs), MHz: 1000-1075
Memory frequency (physical (effective)), MHz: 1750 (7000)
Memory bus width, bits: 384
Number of compute units in the GPU / unit frequency, MHz: 24 / 1000-1075
Number of operations (ALUs) per unit: 128
Total number of operations (ALUs): 3072
Texture units (BLF / TLF / ANIS): 192
ROP units: 96
Dimensions, mm: 270 × 100 × 35
Number of slots occupied in the system unit: 2
PCB color: black
Power consumption (peak 3D / 2D mode / "sleep" mode), W: 257 / 98 / 14
Noise level (2D mode / 2D video playback / maximum 3D mode), dBA: 20 / 21 / 29.5
Output jacks: 1 × DVI (Dual-Link / HDMI), 1 × HDMI 2.0, 3 × DisplayPort 1.2
Multi-GPU support: SLI
Maximum number of receivers / monitors for simultaneous output: 4
Additional power: one 8-pin and one 6-pin connector
Maximum 2D resolution: DP / HDMI / Dual-Link DVI / Single-Link DVI
Maximum 3D resolution (DP / HDMI / Dual-Link DVI / Single-Link DVI): 3840 × 2400 / 3840 × 2400 / 2560 × 1600 / 1920 × 1200

Local memory configuration

The card has 12288 MB of GDDR5 SDRAM located in 24 4 Gbit chips (12 on each side of the PCB).
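As a quick sanity check of the figures in the specification table above, here is a small Python snippet (purely illustrative) that recomputes the memory capacity from the chip configuration and the total ALU count from the per-block figure:

```python
# Sanity-check the spec-table figures for the reference GTX Titan X.

chips = 24             # GDDR5 chips on the PCB (12 per side)
chip_density_gbit = 4  # each chip is 4 Gbit

total_mbytes = chips * chip_density_gbit * 1024 // 8   # Gbit -> MB
print(total_mbytes)    # 12288 MB, i.e. 12 GB

compute_blocks = 24    # compute units in the GM200
alus_per_block = 128
print(compute_blocks * alus_per_block)                 # 3072 ALUs in total
```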

For our synthetic DirectX 11 benchmarks we used samples from the Microsoft and AMD SDKs, as well as an Nvidia demo program. From the DirectX SDK (February 2010) we took HDRToneMappingCS11.exe and NBodyGravityCS11.exe. We also took applications from both GPU manufacturers: the DetailTessellation11 and PNTriangles11 samples come from the ATI Radeon SDK (they are also included in the DirectX SDK), and in addition we used Nvidia's Realistic Water Terrain demo, also known as Island11.

Synthetic tests were carried out on the following video cards:

  • Geforce GTX Titan X with standard parameters (abbreviated GTX Titan X)
  • Geforce GTX Titan Z with standard parameters (abbreviated GTX Titan Z)
  • Geforce GTX 980 with standard parameters (abbreviated GTX 980)
  • Radeon R9 295X2 with standard parameters (abbreviated R9 295X2)
  • Radeon R9 290X with standard parameters (abbreviated R9 290X)

These solutions were chosen to analyze the performance of the new Geforce GTX Titan X for the following reasons. The Geforce GTX 980 is based on a GPU of the same Maxwell architecture, but of a lower class - the GM204 - and it will be very interesting to evaluate what the more complex GM200 chip gains over it. The dual-GPU Geforce GTX Titan Z is taken simply as a reference point, being the most productive Nvidia video card, based on a pair of GK110 chips of the previous Kepler architecture.

We also chose two video cards from rival AMD for our comparison. They are based on the same Hawaii GPU but differ in the number of GPUs on board and, accordingly, in positioning and price. The Geforce GTX Titan X has no direct price rival, so we took the most powerful dual-GPU card, the Radeon R9 295X2, even though such a comparison is not very interesting from a technical point of view. As the single-GPU competitor we took the fastest single-chip card, the Radeon R9 290X, although it was released quite long ago and is based on a GPU of clearly lower complexity. But there is simply no other choice among AMD's solutions.

Direct3D 10: PS 4.0 Pixel Shader Tests (Texturing, Loops)

We have abandoned the outdated DirectX 9 tests, as ultra-powerful solutions like the Geforce GTX Titan X show poor results in them, being always limited by memory bandwidth, fill rate or texturing. Not to mention the fact that dual-GPU video cards do not always work correctly in such applications, and we have two of them.

The second version of RightMark3D includes two already familiar PS 3.0 tests for Direct3D 9, which were rewritten for DirectX 10, as well as two more new tests. The first pair adds the ability to enable self-shadowing and shader supersampling, which additionally increases the load on video chips.

These tests measure the performance of executing pixel shaders with loops with a large number of texture samples (in the heaviest mode, up to several hundred samples per pixel) and a relatively low ALU load. In other words, they measure the texture sampling rate and branching efficiency in a pixel shader.

The first pixel shader test is Fur. At the lowest settings it uses 15 to 30 texture samples from the height map and two samples from the main texture. The "High" effect-detail mode increases the number of samples to 40-80, enabling "shader" supersampling raises it to 60-120 samples, and the "High" mode together with SSAA is the heaviest case, with 160 to 320 samples from the height map.

Let's first check the modes without supersampling enabled, they are relatively simple, and the ratio of the results in the “Low” and “High” modes should be approximately the same.

Performance in this test depends on the number and efficiency of the TMUs, as well as on how efficiently complex programs are executed. In the variant without supersampling, the effective fill rate and memory bandwidth also affect performance. At the "High" detail level the results are up to one and a half times lower than at "Low".

In the tasks of procedural rendering of fur with a large number of texture fetches, with the release of video chips based on GCN architecture, AMD has long seized the leadership. It is the Radeon boards that are the best in these comparisons to this day, which indicates that they are more efficient in performing these programs. This conclusion is confirmed by today's comparison - the Nvidia video card we are considering lost even to the outdated single-chip Radeon R9 290X, not to mention the closest price competitor from AMD.

In the first Direct3D 10 test, the new Geforce GTX Titan X turned out to be somewhat faster than its younger sibling on a chip of the same architecture, the GTX 980, but the latter's lag is small at 9-12%. This result can be explained by the noticeably lower texturing speed of the GTX 980, and it lags in other parameters as well, although the matter is clearly not ALU performance. The dual-GPU Titan Z is faster, but not as fast as the Radeon R9 295X2.

Let's look at the result of the same test, but with “shader” supersampling enabled, which quadruples the work: in such a situation, something should change, and the memory bandwidth with fill rate will have a smaller effect:

In difficult conditions the new Geforce GTX Titan X pulls noticeably further ahead of the younger model of the same generation, the GTX 980, being faster by a decent 33-39%, which is much closer to the theoretical difference between them. And the gap to the competitors in the form of the Radeon R9 295X2 and R9 290X has shrunk - the new Nvidia product has almost caught up with the single-chip Radeon. The dual-GPU card, however, remains far ahead, since AMD chips handle such per-pixel calculations very well.

The next DX10 test measures the performance of complex pixel shaders with loops with a large number of texture samples and is called Steep Parallax Mapping. At low settings, it uses 10 to 50 texture samples from the heightmap and three samples from the main textures. When you turn on the heavy mode with self-shadowing, the number of samples doubles, and supersampling increases this number four times. The most difficult test mode with supersampling and self-shadowing selects from 80 to 400 texture values, that is, eight times more than the simple mode. We first check the simple options without supersampling:

The second Direct3D 10 pixel shader test is more interesting from a practical point of view, since varieties of parallax mapping are widely used in games, and heavy variants like steep parallax mapping have long been employed in many projects, for example in games from the Crysis and Lost Planet series, among many others. Besides supersampling, our test also lets us enable self-shadowing, which doubles the load on the video chip - this mode is called "High".
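RightMark3D's own shader code is not reproduced here, but the core idea of steep parallax mapping is a ray march through a height (depth) map in the pixel shader, with one texture sample per step. The Python sketch below only illustrates that loop on the CPU; the function and parameter names are ours, not RightMark's:

```python
def steep_parallax_offset(depthmap, u, v, view_dir, num_steps=32, scale=0.05):
    """March a ray through the depth field to find displaced texture coords.

    depthmap(u, v) is assumed to return a depth in [0, 1] (0 = highest point);
    view_dir is the tangent-space view vector (x, y, z) with z > 0.
    """
    layer_depth = 1.0 / num_steps
    # UV shift per layer, proportional to how oblique the view ray is.
    du = -view_dir[0] / view_dir[2] * scale / num_steps
    dv = -view_dir[1] / view_dir[2] * scale / num_steps

    cur_depth = 0.0
    while cur_depth < 1.0 and depthmap(u, v) > cur_depth:
        u += du
        v += dv
        cur_depth += layer_depth   # each iteration costs one texture sample
    return u, v                    # sample the color texture at these coords
```

Self-shadowing in the "High" mode repeats a similar march toward the light source, which is why it roughly doubles the number of samples.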

The diagram is generally similar to the previous one, also without supersampling enabled, and this time the new Geforce GTX Titan X turned out to be a little closer to the GTX Titan Z, losing not so much to a dual-GPU board based on a pair of Kepler GPUs. Under different conditions, the new product is 14-19% ahead of the previous top-end model of the current generation from Nvidia, and even if we compare it with AMD video cards, something has changed - in this case, the new GTX Titan X is slightly inferior to the Radeon R9 290X. The two-chip R9 295X2, however, is far ahead of everyone. Let's see what will change the inclusion of supersampling:

When supersampling and self-shadowing are enabled, the task becomes more difficult; enabling two options at once increases the load on the cards by almost eight times, causing a serious drop in performance. The difference between the speed indicators of the tested video cards has changed slightly, although the inclusion of supersampling has less effect than in the previous case.

AMD Radeon graphics solutions perform more efficiently in this D3D10 pixel shader test than the competing Geforce cards, but the new GM200 chip changes the situation for the better: the Geforce GTX Titan X on the Maxwell architecture outperforms the Radeon R9 290X under all conditions (which is, however, based on a noticeably less complex GPU). The dual-GPU solution based on a pair of Hawaii chips remains the leader, but compared with other Nvidia solutions the new product does well: it showed speed almost on par with the dual-GPU Geforce GTX Titan Z and outperformed the Geforce GTX 980 by 28-33%.

Direct3D 10: PS 4.0 Pixel Shader Benchmarks (Compute)

The next couple of pixel shader tests contain the minimum number of texture fetches to reduce the impact of TMU performance. They use a large number of arithmetic operations, and they measure exactly the mathematical performance of video chips, the speed of execution of arithmetic instructions in a pixel shader.

The first math test is Mineral. This is a complex procedural texturing test that uses only two texture data samples and 65 instructions like sin and cos.

The results of purely mathematical tests usually correspond to the difference in frequencies and the number of compute units, but only approximately, since the results are also influenced by how efficiently those units are used in specific tasks, by driver optimization, by the latest frequency and power management systems, and even by memory bandwidth limits. In the case of the Mineral test, the new Geforce GTX Titan X is only 10% faster than the GTX 980 based on the GM204 chip of the same generation, and the dual-GPU GTX Titan Z was not particularly fast in this test either - something is clearly preventing the Nvidia boards from revealing their potential here.

Comparing the Geforce GTX Titan X with the competing AMD boards would not be so disappointing if the GPUs in the R9 290X and the Titan X were similar in complexity. But the GM200 is much larger than Hawaii, which makes its modest win look unimpressive. The move from Kepler to Maxwell brought Nvidia's new chips closer to the competing AMD solutions in such tests, but even the cheaper dual-GPU Radeon R9 295X2 is noticeably faster.

Let us consider the second shader computation test, called Fire. It is heavier on the ALUs: there is only one texture fetch, and the number of instructions like sin and cos is doubled, to 130. Let's see what has changed with the increased load:

In the second math test from RightMark we see a different alignment of the cards relative to each other. The new Geforce GTX Titan X is now ahead of the GTX 980 (on a chip of the same graphics architecture) by a more substantial 20%, and the dual-GPU Geforce is very close to the new product - Maxwell copes with computational tasks much better than Kepler.

The Radeon R9 290X is left behind, but as we already wrote, the Hawaii GPU is noticeably simpler than the GM200, and this difference is logical. But although the dual-GPU Radeon R9 295X2 continues to be the leader in the tests of mathematical calculations, in general, the new Nvidia video chip performed well in such tasks, although it did not achieve the theoretical difference with the GM204.

Direct3D 10: geometry shader benchmarks

The RightMark3D 2.0 package contains two geometry shader speed tests. The first is called "Galaxy"; the technique is similar to the "point sprites" of earlier Direct3D versions. It animates a particle system on the GPU, with a geometry shader creating four vertices from each point to form a particle. Similar algorithms should be widely used in future DirectX 10 games.

Changing the balancing in the geometry shader tests does not affect the final rendering result, the final image is always exactly the same, only the scene processing methods change. The "GS load" parameter determines in which of the shaders the calculations are performed - in vertex or geometric. The number of calculations is always the same.

Let's consider the first variant of the Galaxy test, with calculations in a vertex shader, for three levels of geometric complexity:

The ratio of speeds with different geometric complexity of scenes is approximately the same for all solutions, the performance corresponds to the number of points, with each step the FPS drop is close to twofold. This task for powerful modern video cards is very simple, and its performance is limited by the geometry processing speed, and sometimes by the memory bandwidth and / or fill rate.

The difference between the results of video cards based on Nvidia and AMD chips is usually in favor of the solutions of the Californian company, and it is due to differences in the geometric pipelines of the chips of these companies. In this case, the top-end Nvidia video chips have a lot of geometry processing units, so the gain is obvious. In geometry tests, Geforce cards are always more competitive than Radeon cards.

The new Geforce GTX Titan X lags slightly behind the previous generation dual-GPU GTX Titan Z, but outperforms the GTX 980 by 12-25%. The Radeon video cards show noticeably different results: the R9 295X2 is based on a pair of GPUs and only it can compete with the newcomer in this test, while the Radeon R9 290X ends up an outsider. Let's see how the situation changes when part of the calculations is transferred to the geometry shader:

When the load changed in this test, the numbers changed slightly for AMD boards and for Nvidia solutions. And that doesn't really change anything. In this test of geometry shaders, video cards react poorly to changes in the GS load parameter, which is responsible for transferring part of the calculations to the geometry shader, so the conclusions remain the same.

Unfortunately, "Hyperlight" is the second geometry shader test that demonstrates the use of several techniques at once: instancing, stream output, buffer load, which uses dynamic geometry creation by drawing in two buffers, as well as a new Direct3D 10 feature - stream output, on all modern graphics cards from AMD just don't work. At some point, another update of Catalyst drivers led to the fact that this test stopped running on boards of this company, and this has not been fixed for several years.

Direct3D 10: speed of fetching textures from vertex shaders

The Vertex Texture Fetch tests measure the speed of a large number of texture fetches from the vertex shader. The tests are similar, in fact, so the ratio between the results of the cards in the tests "Earth" and "Waves" should be approximately the same. Both tests use displacement mapping based on texture fetch data, the only significant difference is that the Waves test uses conditional transitions, while the Earth test does not.
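To make the workload clearer, here is a minimal Python sketch of displacement mapping driven by a vertex texture fetch: every vertex samples the height map and is pushed along its normal. The names and the nearest-neighbour fetch are our own simplification, not the RightMark code (the real tests use bilinear filtering and many samples per vertex):

```python
import numpy as np

def displace_vertices(positions, normals, uvs, heightmap, scale=1.0):
    """positions, normals: (N, 3) arrays; uvs: (N, 2) in [0, 1];
    heightmap: 2D array sampled per vertex (the 'vertex texture fetch')."""
    h, w = heightmap.shape
    px = np.clip((uvs[:, 0] * (w - 1)).astype(int), 0, w - 1)
    py = np.clip((uvs[:, 1] * (h - 1)).astype(int), 0, h - 1)
    heights = heightmap[py, px]          # one fetch per vertex, for brevity
    return positions + normals * (heights * scale)[:, None]
```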

Consider the first test "Earth", first in the "Effect detail Low" mode:

Our previous research showed that the results of this test can be influenced both by the fill rate and by memory bandwidth, which is clearly visible in the results of the Nvidia boards, especially in the simple modes. The new Nvidia card demonstrates clearly lower speed in this test than it should: all Geforce cards ended up at roughly the same level, which clearly does not correspond to theory. In all modes they are obviously running into something, most likely memory bandwidth. However, the Radeon R9 295X2 is not twice as fast as the R9 290X either.

By the way, the single-chip board from AMD this time turned out to be stronger than all the boards from Nvidia in the easy mode and approximately at their level in the heavy mode. Well, the dual-GPU Radeon R9 295X2 again became the leader of our comparison. Let's look at the performance in the same test with an increased number of texture fetches:

The picture has changed slightly: in the heavy modes the single-chip AMD solution falls significantly further behind the Geforce cards. The new Geforce GTX Titan X was up to 14% faster than the Geforce GTX 980 and outperformed the single-GPU Radeon in all but the lightest modes, apparently running into the same bottleneck. Compared with the dual-GPU solution from AMD, the Titan X held its own in the heavy modes, showing similar performance, but lagged behind in the light ones.

Let's consider the results of the second test of texture samples from vertex shaders. The Waves test has a smaller number of samples, but it uses conditional jumps. The number of bilinear texture samples in this case is up to 14 ("Effect detail Low") or up to 24 ("Effect detail High") for each vertex. The complexity of the geometry changes in the same way as in the previous test.

The results of the second vertex texturing test, "Waves", are not at all similar to what we saw in the previous diagrams. The speed of all Geforce cards seriously deteriorated here, and the new Nvidia Geforce GTX Titan X is only slightly faster than the GTX 980, lagging behind the dual-GPU Titan Z, while both Radeon boards are ahead of the Nvidia solutions in all modes. Let's consider the second variant of the same task:

With the more complex version of the texture sampling task, the speed of all solutions dropped, but the Nvidia video cards suffered more, including the model under review. The conclusions hardly change: the new Geforce GTX Titan X is 10-30% faster than the GTX 980 while lagging behind both the dual-GPU Titan Z and both Radeon boards. The Radeon R9 295X2 is far ahead in these tests, and from a theoretical point of view this is simply inexplicable, other than by insufficient optimization on Nvidia's side.

3DMark Vantage: Feature Benchmarks

The synthetic benchmarks from the 3DMark Vantage suite will show us what we missed earlier. Feature tests from this test suite have DirectX 10 support, are still relevant and interesting because they differ from ours. When analyzing the results of the latest video card model Geforce GTX Titan X in this package, we will draw some new and useful conclusions that eluded us in the tests from the RightMark family of packages.

Feature Test 1: Texture Fill

The first test measures the performance of the texture fetch units. It fills a rectangle with values read from a small texture using multiple texture coordinates that change every frame.

The efficiency of AMD and Nvidia video cards in the texture test by Futuremark is quite high and the final figures for different models are close to the corresponding theoretical parameters. So, the difference in speed between GTX Titan X and GTX 980 turned out to be 38% in favor of a solution based on GM200, which is close to theory, because the new product has one and a half times more TMUs, but they operate at a lower frequency. Naturally, the lag behind the dual-GPU GTX Titan Z remains, since the two GPUs have a high texturing speed.
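The ~38% gap can be checked against peak texturing rates. The GTX 980 reference clocks used below (about 1126 MHz base) are our assumption, taken from Nvidia's public specifications rather than from this article:

```python
# Peak bilinear texturing rate ~= TMUs * clock (Gtex/s), a rough upper bound.
titan_x = 192 * 1.000   # 192 TMUs at ~1000 MHz base -> ~192 Gtex/s
gtx_980 = 128 * 1.126   # 128 TMUs at ~1126 MHz base (assumed) -> ~144 Gtex/s
print(titan_x / gtx_980)  # ~1.33, i.e. ~33% in theory vs. ~38% measured
```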

As for comparing the texturing speed of the new top-end Nvidia card with similarly priced solutions from the competitor: the new product is inferior to the dual-GPU rival that is its conditional neighbor in the price niche, but outperforms the Radeon R9 290X, although not by much. AMD graphics cards still handle texturing a little better.

Feature Test 2: Color Fill

The second task is to test the fill rate. It uses a very simple pixel shader with no performance cap. The interpolated color value is written to the offscreen buffer (render target) using alpha blending. A 16-bit FP16 off-screen buffer is used, which is most often used in games that use HDR rendering, so this test is quite timely.

The numbers of the second 3DMark Vantage subtest show the performance of the ROP units without taking video memory bandwidth into account (the so-called "effective fill rate"). The Geforce GTX Titan X board is noticeably ahead of both Nvidia boards, the GTX 980 and even the GTX Titan Z, outperforming the single-chip GM204-based board by as much as 45% - the number of ROPs and their efficiency in the top GPU of the Maxwell architecture are excellent!

And if we compare the scene fill rate with the new Geforce GTX Titan X video card with AMD video cards, then the Nvidia board we are considering in this test shows the best scene fill rate even in comparison with the most powerful dual-GPU Radeon R9 295X2, not to mention the considerably lagging Radeon R9 290X. A large number of ROPs and optimizations for the compression efficiency of the framebuffer data did their job.

Feature Test 3: Parallax Occlusion Mapping

This is one of the most interesting feature tests, since a similar technique is already used in games. It draws one quadrilateral (more precisely, two triangles) using the Parallax Occlusion Mapping technique, which simulates complex geometry. Rather resource-intensive ray tracing and a high-resolution depth map are used, and the surface is shaded with the heavy Strauss lighting model. This is a test of a very complex, GPU-heavy pixel shader containing numerous texture fetches during the ray march, dynamic branches and complex Strauss lighting calculations.

This test from the 3DMark Vantage package differs from the previous ones in that the results in it depend not only on the speed of mathematical calculations, the efficiency of branching or the speed of texture fetching, but on several parameters simultaneously. To achieve high speed in this task, the correct balance of the GPU is important, as well as the efficiency of the execution of complex shaders.

Both mathematical and texturing performance matter here, and in this synthetic test from 3DMark Vantage the new Geforce GTX Titan X is more than a third faster than the model based on a GPU of the same Maxwell architecture. Even the dual-GPU Kepler in the form of the GTX Titan Z is less than 10% ahead of the new product.

The single-chip top-end Nvidia board in this test showed the result clearly better than the one-chip Radeon R9 290X, but both are very seriously inferior to the two-GPU model Radeon R9 295X2. AMD graphics processors work somewhat more efficiently in this task than Nvidia chips, while the R9 295X2 has two of them.

Feature Test 4: GPU Cloth

The fourth test is interesting because it calculates physical interactions (cloth simulation) on the video chip. Vertex simulation is used, with combined work of vertex and geometry shaders over several passes. Stream out is used to transfer vertices from one simulation pass to the next. Thus, the test exercises the execution speed of vertex and geometry shaders and the stream out rate.

Rendering speed in this test also depends on several parameters at once, and the main factors of influence should be geometry processing performance and geometry shader execution efficiency. That is, the strengths of Nvidia chips should be manifested, but alas - we saw a very strange result (we rechecked), the new Nvidia video card showed not very high speed, to put it mildly. Geforce GTX Titan X in this subtest showed the worst result of all solutions, lagging almost 20% behind even the GTX 980!

Well, the comparison with Radeon boards in this test is just as unsightly for a new product. Despite the theoretically smaller number of geometric execution units and the geometric performance lag of AMD chips compared to competing solutions, both Radeon cards perform very efficiently in this test and outperform all three Geforce cards presented in comparison. Again, it looks like a lack of optimization in Nvidia drivers for a specific task.

Feature Test 5: GPU Particles

Physical simulation test of effects based on particle systems calculated using a video chip. Vertex simulation is also used, each vertex represents a single particle. Stream out is used for the same purpose as in the previous test. Several hundred thousand particles are calculated, all are animated separately, and their collisions with the height map are also calculated.

Similar to one of the tests in our RightMark3D 2.0, particles are rendered using a geometry shader, which creates four vertices from each point that form a particle. But the test most of all loads shader units with vertex calculations, stream out is also tested.
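As a rough illustration of what this feature test computes, here is a tiny CPU-side sketch of a particle update with height-map collisions. It is only a model of the workload; Futuremark's actual shaders and parameters are not reproduced here:

```python
import numpy as np

def step_particles(pos, vel, heightmap, dt=0.016, gravity=-9.8, bounce=0.5):
    """pos, vel: (N, 3) arrays; heightmap(x, z) -> terrain height at that point."""
    vel[:, 1] += gravity * dt
    pos += vel * dt
    ground = heightmap(pos[:, 0], pos[:, 2])
    below = pos[:, 1] < ground                 # particles that hit the terrain
    pos[below, 1] = ground[below]
    vel[below, 1] *= -bounce                   # simple reflected bounce
    return pos, vel
```

On the GPU each particle is then expanded into a screen-facing quad by the geometry shader, as described above.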

In the second "geometric" test from 3DMark Vantage, the situation has changed significantly, this time all Geforce already show more or less normal results, although the dual-GPU Radeon still remains in the lead. The new GTX Titan X model is 24% faster than its sister GTX 980 and is about the same lagging behind the two-GPU Titan Z on the previous generation GPU.

This time the comparison of the new Nvidia card with competing AMD video cards looks more positive: it lands between the two boards of the rival company, closer to the Radeon R9 295X2 with its two GPUs. The newcomer is significantly ahead of the Radeon R9 290X, which clearly shows how different two seemingly similar tests can be: cloth simulation and particle-system simulation.

Feature Test 6: Perlin Noise

The latest feature test of the Vantage package is a mathematically intensive test of the video chip, it calculates several octaves of the Perlin noise algorithm in a pixel shader. Each color channel uses its own noise function for more load on the video chip. Perlin noise is a standard algorithm often used in procedural texturing and uses a lot of math.
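The essence of the test is summing several octaves of a noise function per color channel. The sketch below shows such an octave sum; for brevity a simple hash-based stand-in is used instead of classic Perlin gradient noise, so treat it as an illustration of the structure, not of 3DMark's exact shader:

```python
import numpy as np

def hash_noise(x, y, seed=0):
    """Cheap hash-based stand-in for a noise primitive, values roughly in [-1, 1]."""
    h = np.sin(x * 127.1 + y * 311.7 + seed * 74.7) * 43758.5453
    return 2.0 * (h - np.floor(h)) - 1.0

def fractal_noise(x, y, octaves=6, lacunarity=2.0, gain=0.5, seed=0):
    """Sum several octaves: each octave doubles frequency and halves amplitude."""
    total, amp, freq = 0.0, 1.0, 1.0
    for _ in range(octaves):
        total += amp * hash_noise(x * freq, y * freq, seed)
        amp *= gain
        freq *= lacunarity
    return total
```

Each octave adds another round of transcendental math per pixel, which is why the test stresses the ALUs almost exclusively.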

In this case, the performance of solutions does not quite correspond to theory, although it is close to what we saw in similar tests. In the mathematical test from the Futuremark suite, which shows the peak performance of video chips in extreme tasks, we see a different distribution of results compared to similar tests from our test suite.

We have known for a long time that GPUs from AMD with GCN architecture still cope with such tasks better than competing solutions, especially in cases where intensive "math" is performed. But the new top model from Nvidia is based on a large GM200 chip, and therefore the Geforce GTX Titan X in this test showed a result noticeably higher than the Radeon R9 290X.

If we compare the new product with the best model of the Geforce GTX 900 family, then in this test the difference between them was almost 40% - in favor of the video card we are considering today, of course. This is also close to the theoretical difference. Not a bad result for the Titan X, only the dual-GPU Radeon R9 295X2 is ahead, and far ahead.

Direct3D 11: Compute Shaders

We used examples from the SDKs and demos from Microsoft, Nvidia, and AMD to test Nvidia's recently released top-of-the-line solution for tasks using DirectX 11 features such as tessellation and compute shaders.

First, we'll look at benchmarks using Compute shaders. Their appearance is one of the most important innovations in the latest versions of the DX API, they are already used in modern games to perform various tasks: post-processing, simulations, etc. The first test shows an example of HDR rendering with tone mapping from the DirectX SDK, with post-processing using pixel and compute shaders.
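To give an idea of the kind of work the HDRToneMappingCS11 sample does, here is a minimal Python sketch of HDR tone mapping: a reduction over the image to obtain the average log luminance (the part a compute shader is typically used for) followed by a simple Reinhard-style curve. The SDK sample's exact operator and parameters may differ:

```python
import numpy as np

def tone_map(hdr, key=0.18, eps=1e-4):
    """hdr: (H, W, 3) linear HDR image; returns an LDR image in [0, 1]."""
    # Luminance per pixel (Rec. 709 weights).
    lum = hdr @ np.array([0.2126, 0.7152, 0.0722])
    # Log-average luminance; on the GPU this is the parallel-reduction pass.
    log_avg = np.exp(np.mean(np.log(lum + eps)))
    scaled = key / log_avg * lum
    mapped = scaled / (1.0 + scaled)             # Reinhard curve
    return np.clip(hdr * (mapped / (lum + eps))[..., None], 0.0, 1.0)
```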

The calculation speed in the computational and pixel shaders is approximately the same for all AMD and Nvidia boards, the differences were observed only in video cards based on GPUs of previous architectures. Judging by our previous tests, the results in a task often depend not so much on the mathematical power and computational efficiency as on other factors, such as memory bandwidth.

In this case, the new top-end video card is faster than the single-chip versions Geforce GTX 980 and Radeon R9 290X, but lags behind the two-chip R9 295X2, which is understandable, because it has the power of a pair of R9 290X. If we compare the new product with the Geforce GTX 980, then the board of the Californian company being considered today is 34-36% faster - exactly according to theory.

The second computational shader test, also taken from the Microsoft DirectX SDK, shows the computational N-body gravity problem - a simulation of a dynamic particle system that is acted upon by physical forces such as gravity.
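For reference, the computation in this test is the classic all-pairs gravitational update. A minimal NumPy sketch of one step (CPU-side, O(N²), with a softening term) looks like this; the SDK sample performs a similar per-particle computation in a compute shader:

```python
import numpy as np

def nbody_step(pos, vel, mass, dt=0.01, g=1.0, soft=1e-2):
    """pos, vel: (N, 3); mass: (N,). One simple integration step."""
    diff = pos[None, :, :] - pos[:, None, :]          # (N, N, 3) pairwise offsets
    dist2 = np.sum(diff ** 2, axis=-1) + soft ** 2    # softened squared distances
    inv_d3 = dist2 ** -1.5
    np.fill_diagonal(inv_d3, 0.0)                     # no self-interaction
    acc = g * np.sum(diff * (mass[None, :, None] * inv_d3[..., None]), axis=1)
    vel += acc * dt
    pos += vel * dt
    return pos, vel
```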

In this test, the emphasis is most often observed on the speed of execution of complex mathematical calculations, geometry processing and the efficiency of code execution with branches. And in this DX11 test, the balance of forces between the solutions of two different companies turned out to be completely different - obviously in favor of Geforce video cards.

However, the results of a couple of Nvidia solutions on different chips are also strange - Geforce GTX Titan X and GTX 980 are almost equal, they are separated by only 5% of the performance difference. Dual-GPU rendering does not work in this task, so the rivals (single-GPU and dual-GPU Radeon models) are roughly equal in speed. Well, the GTX Titan X is three times ahead of them. It seems that this task is much more efficiently calculated on GPUs of the Maxwell architecture, which we noted earlier.

Direct3D 11: Tessellation Performance

Compute shaders are very important, but another major innovation in Direct3D 11 is hardware tessellation. We examined it in great detail in our theoretical article about the Nvidia GF100. Tessellation has long been used in DX11 games such as STALKER: Call of Pripyat, DiRT 2, Aliens vs Predator, Metro Last Light, Civilization V, Crysis 3, Battlefield 3 and others. Some of them use tessellation for character models, while others use tessellation to simulate a realistic water surface or landscape.

There are several different schemes for subdividing graphics primitives (tessellation schemes): for example, Phong tessellation, PN triangles and Catmull-Clark subdivision. The PN Triangles scheme is used in STALKER: Call of Pripyat, and Phong tessellation in Metro 2033. These methods are relatively quick and easy to introduce into the game development process and existing engines, which is why they have become popular.
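To illustrate why such schemes are cheap to adopt, here is a small Python sketch of Phong tessellation for a single new point: the linearly interpolated point is projected onto the tangent planes of the three corner vertices and the projections are blended. It follows the published Boubekeur-Alexa formulation as we understand it; this is an illustration, not engine code from the games mentioned:

```python
import numpy as np

def phong_tessellate(p, n, u, v, alpha=0.75):
    """p: (3, 3) triangle vertices; n: (3, 3) unit normals; (u, v) barycentric,
    w = 1 - u - v. alpha blends between flat (0) and fully curved (1)."""
    w = 1.0 - u - v
    bary = np.array([u, v, w])
    flat = bary @ p                                   # plain linear interpolation
    # Project the flat point onto each corner's tangent plane.
    proj = np.array([flat - np.dot(flat - p[i], n[i]) * n[i] for i in range(3)])
    curved = (bary ** 2) @ proj \
        + u * v * (proj[0] + proj[1]) \
        + v * w * (proj[1] + proj[2]) \
        + w * u * (proj[2] + proj[0])
    return (1.0 - alpha) * flat + alpha * curved
```

Because it only needs the existing positions and normals, it can be bolted onto an engine without new art assets, which is exactly why games adopted it.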

The first tessellation test is the Detail Tessellation example from the ATI Radeon SDK. It implements not only tessellation but also two different per-pixel processing techniques: simple normal mapping and parallax occlusion mapping. Let's compare the DX11 solutions from AMD and Nvidia under different conditions:

In the simple bumpmapping test, the speed of the boards is not very important, since this task has long become too easy, and the performance in it is limited by memory bandwidth or fill rate. Today's hero of the review is 23% ahead of the previous top-end model Geforce GTX 980 based on the GM204 chip and is slightly inferior to the competitor in the form of the Radeon R9 290X. The dual-chip version is even slightly faster.

In the second subtest with more complex pixel-by-pixel calculations, the new product has already become 34% faster than the Geforce GTX 980, which is closer to the theoretical difference between them. But Titan X this time is already a little faster than a single-chip conditional competitor based on a single Hawaii. Since the two chips in the Radeon R9 295X2 work perfectly, this task is performed even faster on it. Although the efficiency of performing mathematical calculations in pixel shaders is higher for chips of the GCN architecture, the release of solutions of the Maxwell architecture has improved the position of Nvidia solutions.

In the subtest using a light degree of tessellation, the recently announced Nvidia board is again only a quarter faster than the Geforce GTX 980 - perhaps the speed is limited by the memory bandwidth, since texturing has almost no effect in this test. If we compare the new product with AMD boards in this subtest, then the Nvidia board is again inferior to both Radeon cards, since in this tessellation test the partitioning of triangles is quite moderate and the geometric performance does not limit the overall rendering speed.

The second tessellation performance test is another example for 3D developers from the ATI Radeon SDK - PN Triangles. Both examples are also included in the DirectX SDK, so we can be sure that game developers build their code on their basis. We tested this example with different tessellation factors to see how much changing them affects overall performance.

This test uses more complex geometry, so the comparison of the geometric power of the various solutions yields different conclusions. The modern solutions presented here cope quite well with light and medium geometric loads, showing high speed. But while under light loads the Hawaii GPUs in the Radeon R9 290X and R9 295X2 (one and two of them, respectively) do fine, under heavy loads the Nvidia boards pull far ahead. In the most difficult modes the Geforce GTX Titan X presented today is noticeably faster even than the dual-GPU Radeon.

As for the comparison of Nvidia boards based on GM200 and GM204 chips with each other, the Geforce GTX Titan X model under consideration today increases its advantage with an increase in geometric load, since in light mode everything depends on memory bandwidth. As a result, the new product outperforms the Geforce GTX 980 board, depending on the mode complexity, by up to 31%.

Let's take a look at the results of another test, the Nvidia Realistic Water Terrain demo program, also known as Island. This demo uses tessellation and displacement mapping to render realistic looking ocean and terrain surfaces.

The Island test is not a purely synthetic measure of geometric GPU performance, since it also contains complex pixel and compute shaders, and such a load is closer to real games, in which all GPU units are used rather than only the geometry units, as in the previous geometry tests. Although the load on the geometry processing units remains the main one, memory bandwidth, for example, can also have an effect.

We test all video cards at four different tessellation factors - here the setting is called Dynamic Tessellation LOD. At the first subdivision factor the speed is not limited by the performance of the geometry units, and the Radeon cards show a fairly high result, especially the dual-GPU R9 295X2, which even surpasses the newly announced Geforce GTX Titan X; but already at the next levels of geometric load the performance of the Radeon cards decreases and the Nvidia solutions take the lead.

The advantage of the new Nvidia board based on the GM200 video chip over its competitors in such tests is already quite decent, and even multiple. If we compare Geforce GTX Titan X with GTX 980, then the difference between their performance reaches 37-42%, which is perfectly explained by theory and exactly corresponds to it. Maxwell GPUs are noticeably more efficient in mixed load mode, quickly switching from graphics to computing tasks and vice versa, and the Titan X is much faster than even the dual-GPU Radeon R9 295X2 in this test.

After analyzing the results of the synthetic tests of the new Nvidia Geforce GTX Titan X based on the new top-end GM200 GPU, and having considered the results of the other video card models from both discrete GPU manufacturers, we can conclude that the card under review should become the fastest single-GPU card on the market, competing even with the strongest dual-GPU video card from AMD. Overall, it is a worthy successor to the Geforce GTX Titan Black, the previous most powerful single-chip model.

The new Nvidia video card shows quite strong results in synthetics - in many tests, though not in all. Radeon and Geforce traditionally have different strengths. In a large number of tests the two GPUs of the Radeon R9 295X2 turned out to be faster, thanks also to the higher total memory bandwidth and texturing speed and very efficient execution of compute tasks. But in other cases the top GPU of the Maxwell architecture wins out, especially in the geometry tests and the tessellation examples.

However, in real gaming applications things will look a little different compared to the synthetics: the Geforce GTX Titan X should show noticeably higher speed there than the single-chip Geforce GTX 980, let alone the Radeon R9 290X. And it is difficult to compare the new product with the dual-GPU Radeon R9 295X2: systems based on two or more GPUs have their own unpleasant peculiarities, although they do provide an increase in the average frame rate with proper optimization.

But the architectural features and functionality are clearly in favor of the premium Nvidia solution. The Geforce GTX Titan X consumes much less power than the Radeon R9 295X2, and in terms of energy efficiency the new Nvidia model is very strong - a distinctive feature of the Maxwell architecture. Do not forget the greater functionality of the new Nvidia product either: support for Feature Level 12.1 in DirectX 12, hardware-accelerated VXGI, the new MFAA anti-aliasing method and other technologies. We already talked about the market point of view in the first part: in the elite segment, not that much depends on the price. The main thing is that the solution be as functional and fast as possible in gaming applications - quite simply, the best at everything.

Just in order to assess the speed of the new item in games, in the next part of our material we will determine the performance of the Geforce GTX Titan X in our set of gaming projects and compare it with the indicators of competitors, including assessing the reasonableness of the retail price of the new item from the point of view of enthusiasts, and we will also find out how much faster it is than the Geforce GTX 980 already in games.

The Asus ProArt PA249Q monitor for the work computer was provided by Asustek. The Cougar 700K desktop keyboard was provided by Cougar.


The GeForce GTX Titan X video accelerator is currently (April 2015) the most technologically advanced in the world, with unparalleled performance. The Titan X graphics card is designed for professional and experienced gamers as well as PC enthusiasts. The board is built on NVIDIA's new Maxwell architecture, delivering twice the performance and remarkable power efficiency compared with the previous generation of Kepler GPUs.

The GeForce GTX Titan X graphics card is equipped with a GM200 GPU, which includes absolutely all 3072 CUDA computing cores, which is the maximum value for the GeForce GTX 900 lineup.

The GM200 GPU carries a number of impressive gaming technologies, some inherited from previous generations of accelerators and some developed by NVIDIA engineers from scratch. Along with the well-known technologies for 3D displays (3D Vision), adaptive synchronization (G-Sync) and the MSAA and TXAA anti-aliasing algorithms, the GeForce GTX 900 family now offers multi-frame anti-aliasing (MFAA), which promises a performance gain of up to 30%; the DSR super-resolution anti-aliasing method; and Voxel Global Illumination (VXGI), which accelerates dynamic lighting effects for immersive, cinematic gameplay.

This accelerator, like other cards in the lineup, has received the updated automatic overclocking technology NVIDIA GPU Boost 2.0, which monitors the operation of the video card, even more efficiently managing the GPU temperature, increasing the processor clock frequency and voltage, which allows you to achieve maximum GPU performance.

The product incorporates NVIDIA Adaptive Vertical Sync technology. This technology is enabled at high frame rates to eliminate tearing, and disabled at low frame rates to minimize frame jitter.

The developer guarantees the full operation of the video card with the new Microsoft DirectX 12 API, which can significantly reduce the load on the central processor and accelerate the rendering of images.

Overall, the new accelerator is the perfect solution for gaming in ultra-high definition UHD 4K at maximum quality settings. It also provides sufficient performance in the increasingly popular virtual reality systems.

Advantages

  • Maximum performance: the ultimate enthusiast solution lets you play all modern PC games at 4K resolution and the highest picture quality, with significant headroom for future games.
  • SLI support: the grouping feature allows dual, triple and quad card configurations (when using an SLI-compatible motherboard) to further improve gaming performance.
  • Connecting additional displays: allows the simultaneous use of Dual-Link DVI, HDMI and DisplayPort for multi-monitor configurations with up to 4 displays.
  • Good overclocking: thanks to the mature 28 nm GPU process and the high energy efficiency of the Maxwell architecture, the GeForce GTX Titan X has excellent GPU overclocking capabilities; professional overclockers are able to overclock the GPU of this accelerator by a factor of two.
  • Good video performance: fully accelerated decoding of all major video formats, both DVD/Blu-ray and Internet video, picture-in-picture support, CUDA/OpenCL/DirectX acceleration for video encoders and editors, and hardware HEVC decoding.
  • 3D Vision Stereo ready: the card has more than enough performance to output full stereo in games when using the NVIDIA 3D Vision kit (a compatible monitor is required).
  • PhysX acceleration support: the GPU is powerful enough to simultaneously render 3D graphics and additional special effects in PhysX-enabled games.
  • Low power consumption: thanks to the new GPU architecture, this video accelerator is extremely energy efficient; as a result, a more modest power supply (from 600 W) is sufficient for it than for the top-end solution of the previous generation, the GeForce GTX Titan Z.
  • Ready for virtual reality: the card has VR Direct technology, designed specifically for virtual reality devices. It allows the use of multiple video cards in an SLI configuration, includes Asynchronous Warp, which reduces image latency and quickly adjusts the picture to head movement, and Auto Stereo, which increases the compatibility of games with VR devices such as the Oculus Rift.

Disadvantages

  • High price: a cost of over 1000 USD significantly limits the circle of buyers.
  • High system requirements: to get the most out of the card, an "expensive" PC configuration is desirable, including a modern motherboard with PCI Express 3.0 support, a top-performing CPU, DDR4 memory and a PCI-e SSD for running games.

Key takeaway: the GeForce GTX TITAN X is the fastest single-GPU gaming graphics card of today. The performance of the new NVIDIA flagship is enough to let you freely enjoy modern 3D entertainment at Full HD and WQHD at the highest graphics settings. True, the GeForce GTX 980 can do this too. "Wait, what about 4K?" the reader will ask. Yes, even though I called the article "First for Ultra HD", in modern games at maximum graphics quality settings the GeForce GTX TITAN X demonstrates only a relatively playable FPS level. However, this is the best result among single-chip video cards. Therefore, for me personally, the GeForce GTX TITAN X is the first video card really capable of satisfying a gamer who wants to conquer virtual worlds at such a high resolution, even if in some cases you have to dig into the settings. And a pair of such "titans" can rein in any emerging next-gen title, provided the drivers and optimization do not let them down. However, that is a topic for a separate article.

It is the 12 GB of video memory that gives the GeForce GTX TITAN X a large margin of safety. Of course, someone will rightly note that such a volume is excessive. However, the same Assassin's Creed Unity at 4K resolution, and not even at the very maximum graphics quality settings, already consumes 5-6 GB of video memory, that is, almost half. That is why (and we have seen this clearly) even super-expensive bundles of several 3D accelerators can have a bottleneck in the form of 4 GB of GDDR5. So for gaming in Ultra HD you need to have some headroom right now.

As always, NVIDIA's reference design has shown itself well. The video card has decent overclocking potential. The vapor-chamber cooler handles the 250-watt chip quite efficiently. It is a little noisy, but quite bearable.

Of course, if NVIDIA had released this video card in September last year (though not for ~1000 dollars - author's note), the wow effect would, in my opinion, have been stronger. But should we be surprised at the launch scheme NVIDIA has refined over the years? Price is the strongest factor limiting the purchase of the GeForce GTX TITAN X, and in our country, which is going through another economic crisis, all the more so.

In the end, I will just note that the "greens" have set the performance bar high enough that the future AMD flagship (Radeon R9 390X?) will have to at least reach it in order to restore the status quo, or offer something close in performance but much more affordable. Agree, it will be very interesting to follow this.

NVIDIA GeForce GTX TITAN X graphics card wins Editors' Choice award.

The previous version of the elite video card, the NVIDIA GeForce GTX TITAN X 12 GB, was released in March 2015 and was based on the GM200 GPU of the Maxwell 2.0 architecture. At the time, the new product stood out for its colossal (for gaming video cards) amount of video memory, very high performance and its cost ($999). Nevertheless, the dashing prowess of the GeForce GTX TITAN X faded after three months, when the public was presented with the GeForce GTX 980 Ti, just as fast in games, at a much more acceptable price ($649).

It seems that NVIDIA has decided to repeat this sequence of announcements in its line of top graphics solutions, which can be expressed as "GeForce GTX 980 -> GeForce TITAN X -> GeForce GTX 980 Ti", only now the video cards are based on the GP104/102 cores of the Pascal architecture and are manufactured on a 16 nm process. We have already become acquainted with the first video card, the NVIDIA GeForce GTX 1080, as well as with its non-reference versions. Now it is time to explore the newest and most powerful NVIDIA TITAN X graphics card.

The new product costs $200 more than its predecessor - $1200 - and, of course, it is still positioned as a professional video card for research and deep learning. But, as you probably understand, we are primarily interested in its performance in gaming applications and graphics benchmarks, since all gamers are eagerly awaiting the announcement of the GeForce GTX 1080 Ti, signs of which have already deprived the company's most ardent fans of sleep. Nevertheless, today we will also test the NVIDIA TITAN X in selected computational benchmarks to verify its credentials as a professional graphics card.

1. Review of the NVIDIA TITAN X 12 GB video card

technical characteristics of the video card and the recommended cost

The technical characteristics and cost of the NVIDIA TITAN X video card are shown in the table in comparison with the reference NVIDIA GeForce GTX 1080 and the old version of the GeForce GTX TITAN X.




packaging and equipment

NVIDIA has reserved the release of TITAN X strictly for itself, so the packaging of the video card is standard: a compact box that opens upwards and a video card inserted into its center in an antistatic bag.



There are no accessories in the package, although there is one additional compartment inside the box. Recall that the recommended price of the NVIDIA TITAN X is 1200 US dollars.

PCB design and features

The design of the new NVIDIA TITAN X is bolder, even more aggressive, than that of the GeForce GTX TITAN X. The cooling system shroud on the front side of the video card received additional facets that glitter in the light, and the back of the PCB is covered by a ribbed metal plate.




Coupled with a chrome-plated fan rotor and the same inscription on the front side, the video card looks really stylish and attractive. Note that the upper end of NVIDIA TITAN X retains the glowing GEFORCE GTX symbols, although they are no longer in the name of the video card.




The reference video card is standard 268 mm long, 102 mm high and 37 mm thick.

The video outputs are placed on a panel perforated with triangular holes: one DVI-D, three DisplayPort 1.4 and one HDMI 2.0b.




In this regard, the new product has no changes in comparison with the GeForce GTX 1080.

To create a variety of SLI-configurations, the video card has two connectors. Supports 2-way, 3-way and 4-way SLI options for combining video cards using both new rigid connecting bridges and old flexible ones.




If the reference GeForce GTX 1080 has only one eight-pin connector for additional power, then the TITAN X also received a six-pin connector, which is not surprising: the declared power consumption of the video card is 250 watts, as on the previous GeForce GTX TITAN X model. The recommended power supply for a system with one such video card must be at least 600 watts.

The NVIDIA TITAN X reference PCB is much more complex than the GeForce GTX 1080, which makes sense given the increased power requirements, increased video memory and wider bus.




The GPU power system is five-phase using Dr. MOS power cells and tantalum-polymer capacitors. Two more phases of power supply are reserved for video memory.



GPU power management is handled by uPI Semiconductor's uP9511P controller.



Monitoring functions are provided by the INA3221 controller manufactured by Texas Instruments.



Manufactured to 16 nm norms, the GP102 GPU die has an area of 471 mm², was produced in the 21st week of 2016 (late May) and belongs to revision A1.


Apart from the architectural improvements of the Pascal GPU line, the new GP102 contains 16.7% more universal shader processors than the GM200 GPU of the NVIDIA GeForce GTX TITAN X - 3584 in total - and its advantage over the GP104 of the GeForce GTX 1080 is an impressive 40%. The ratio is the same for the texture units, of which the new TITAN X has 224. The GP102's figures are rounded out by 96 raster operation units (ROPs).
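The percentages quoted here are easy to verify from the unit counts (3584 shader processors in the GP102 versus 3072 in the GM200 and 2560 in the GTX 1080's GP104):

```python
gp102_sp, gm200_sp, gp104_sp = 3584, 3072, 2560     # shader processors
print(round((gp102_sp / gm200_sp - 1) * 100, 1))    # 16.7 -> +16.7% vs GM200
print(round((gp102_sp / gp104_sp - 1) * 100, 1))    # 40.0 -> +40% vs GeForce GTX 1080
```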

GPU frequencies have also increased. Whereas the GeForce GTX TITAN X had a base GPU frequency in 3D mode of 1000 MHz with a boost up to 1076 MHz, the new TITAN X has a base frequency of 1418 MHz (+41.8%) and a declared boost frequency of 1531 MHz. In fact, according to monitoring data, the GPU frequency briefly rose to 1848 MHz and averaged 1823 MHz. This is a very significant increase over its predecessor. We add that when switching to 2D mode, the GPU frequency drops to 139 MHz with a simultaneous decrease in voltage from 1.050 V to 0.781 V.

NVIDIA TITAN X is equipped with 12 GB of GDDR5X memory, composed of twelve Micron chips (labeled 6KA77 D9TXS), soldered only on the front side of the PCB.



Compared to the previous GeForce GTX TITAN X on the GM200, the memory frequency of the new TITAN X on the GP102 is 10008 MHz, which is 42.7% higher. Thus, with the 384-bit memory bus unchanged, the memory bandwidth of TITAN X reaches an impressive 480.4 GB / s, which is only slightly less than the current record holder in this area - AMD Radeon R9 Fury X with its high-speed HBM and 512 GB / s. In 2D mode, the memory frequency is reduced to 810 effective megahertz.
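The bandwidth figure follows directly from the bus width and the effective data rate; a quick check of the arithmetic:

```python
bus_bits = 384
effective_mhz = 10008                        # GDDR5X effective data rate
print(bus_bits / 8 * effective_mhz / 1000)   # ~480.4 GB/s

# vs. the GM200-based TITAN X (7012 MHz effective, ~7000 in the table above)
print(round((10008 / 7012 - 1) * 100, 1))    # ~42.7% higher
```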

We will sum up the hardware part of the review with data from the GPU-Z utility.


We also publish the video card's BIOS, read and saved using the same utility.

cooling system - efficiency and noise level

The NVIDIA TITAN X cooling system is identical to the NVIDIA GeForce GTX 1080 Founders Edition cooler.



It is based on a nickel-plated aluminum heatsink with a copper vapor chamber at the base, which is responsible for cooling the GPU.



The heatsink is small in area, and the fin spacing does not exceed two millimeters.



Thus, it is not difficult to assume that the efficiency of cooling the GPU with this radiator will be seriously dependent on the fan speed (which, in fact, was confirmed later).

A metal plate with thermal pads is responsible for cooling the memory chips and the power circuitry.



As the load for testing the card's thermal behavior, we used nineteen cycles of the Fire Strike Ultra stress test from the 3DMark suite.



To monitor temperatures and all other parameters, we used MSI Afterburner version 4.3.0 Beta 14 and newer, as well as the GPU-Z utility version 1.12.0. The tests were carried out in a closed system case, the configuration of which you can see in the next section of the article, at a room temperature of 23.5-23.9 degrees Celsius.

First of all, we checked the cooling efficiency of NVIDIA TITAN X and its temperature regime with fully automatic fan speed control.



Automatic mode (1500 ~ 3640 rpm)


As you can see from the monitoring graph, the GPU temperature of the NVIDIA TITAN X very quickly reached 88-89 degrees Celsius and then, thanks to a relatively sharp increase in fan speed from 1500 to 3500 rpm, stabilized at around 86 degrees Celsius. Later in the test the fan speed rose further, to 3640 rpm. It is unlikely that any of us expected different temperatures from a reference video card with a 250-watt thermal envelope; they practically do not differ from those of the GeForce GTX TITAN X.

At maximum fan speed, the NVIDIA TITAN X GPU temperature drops by 12-13 degrees Celsius compared to the automatic mode.



Maximum speed (~ 4830 rpm)


In both fan modes, the NVIDIA TITAN X is a very noisy graphics card. By the way, NVIDIA does not deprive the owners of this video card model of the warranty when replacing the reference cooler with alternative options.

overclocking potential

When checking the overclocking potential of the NVIDIA TITAN X, we raised the power limit to the maximum possible 120%, raised the temperature limit to 90 degrees Celsius, and manually fixed the fan speed at 88% of maximum, or 4260 rpm. After several hours of testing, we found that without loss of stability or the appearance of image defects, the base frequency of the graphics processor could be raised by 225 MHz (+15.9%), and the effective video memory frequency by 1240 MHz (+12.4%).
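The percentage gains quoted follow from the stock and overclocked frequencies; a quick check of the arithmetic:

```python
base_gpu, oc_gpu = 1418, 1418 + 225        # MHz
base_mem, oc_mem = 10008, 10008 + 1240     # MHz effective
print(oc_gpu, round((oc_gpu / base_gpu - 1) * 100, 1))   # 1643 MHz, +15.9%
print(oc_mem, round((oc_mem / base_mem - 1) * 100, 1))   # 11248 MHz, +12.4%
```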



As a result, the frequencies of the overclocked NVIDIA TITAN X in 3D mode were 1643-1756 / 11248 MHz.


Due to the significant variation in GPU frequencies during the test of the temperature regime of the overclocked video card, the test from the 3DMark package again reported the instability of TITAN X.



Despite this, all 19 cycles of the test, as well as all the games in the test suite, were completed successfully, and according to the monitoring data the core frequency of the overclocked video card reached as high as 1987 MHz.



88% power (~ 4260 rpm)


Considering how the reference NVIDIA TITAN X overclocks, we can assume that original (non-reference) GeForce GTX 1080 Ti cards will overclock even better. Time will tell.

2. Test configuration, tools and testing methodology

The video cards were tested on a system with the following configuration:

motherboard: ASUS X99-A II (Intel X99 Express, LGA2011-v3, BIOS 1201 dated 10/11/2016);
central processor: Intel Core i7-6900K (14 nm, Broadwell-E, R0, 3.2 GHz, 1.1 V, 8 x 256 KB L2, 20 MB L3);
CPU cooling system: Phanteks PH-TC14PE (2 × Corsair AF140, ~900 rpm);
thermal interface: ARCTIC MX-4 (8.5 W/(m·K));
RAM: DDR4 4 × 4 GB Corsair Vengeance LPX 2800 MHz (CMK16GX4M4A2800C16) (XMP 2800 MHz / 16-18-18-36_2T / 1.2 V or 3000 MHz / 16-18-18-36_2T / 1.35 V);
video cards:

NVIDIA TITAN X 12 GB 1418-1531 (1848) / 10008 MHz and overclocked to 1643-1756 (1987) / 11248 MHz;
Gigabyte GeForce GTX 1080 G1 Gaming 8 GB 1607-1746 (1898) / 10008 MHz and overclocked to 1791-1930 (2050) / 11312 MHz;
NVIDIA GeForce GTX 980 Ti 6 GB 1000-1076 (1189) / 7012 MHz and overclocked to 1250-1326 (1437) / 8112 MHz;

disk for system and games: Intel SSD 730 480GB (SATA-III, BIOS vL2010400);
benchmark disk: Western Digital VelociRaptor (SATA-II, 300 GB, 10,000 rpm, 16 MB, NCQ);
archive disk: Samsung Ecogreen F4 HD204UI (SATA-II, 2 TB, 5400 rpm, 32 MB, NCQ);
sound card: Auzen X-Fi HomeTheater HD;
case: Thermaltake Core X71 (four be quiet! Silent Wings 2 (BL063) at 900 rpm);
control and monitoring panel: Zalman ZM-MFC3;
PSU: Corsair AX1500i Digital ATX (1500 W, 80 Plus Titanium), 140 mm fan;
monitor: 27-inch Samsung S27A850D (DVI, 2560 x 1440, 60 Hz).

Naturally, the previous TITAN X cards did not stay in our lab, so we will compare the new product with two other video cards, which are by no means slow. The first of them is the original Gigabyte GeForce GTX 1080 G1 Gaming, which we tested at the frequencies of the reference NVIDIA GeForce GTX 1080, as well as overclocked to 1791-1930 / 11312 MHz.





Note that the peak frequency of the graphics processor of this video card during overclocking reached 2050 MHz.

The second video card for testing is the reference NVIDIA GeForce GTX 980 Ti, the performance of which we tested both at nominal frequencies and when overclocked to 1250-1326 (1437) / 8112 MHz.





Since at its release the GeForce GTX 980 Ti matched the previous GeForce GTX TITAN X in games, this comparison can be considered a comparison of two different TITAN X generations. Let us add that the power and temperature limits of all video cards were raised to the maximum possible, and the GeForce drivers were set to prefer maximum performance.

To reduce the dependence of the video cards' performance on the platform, the 14 nm eight-core processor was overclocked to 4.0 GHz using a multiplier of 40, a 100 MHz base clock, and Load-Line Calibration set to level three, with the voltage in the motherboard BIOS raised to 1.2095 V.



At the same time, 16 gigabytes of RAM ran at 3.2 GHz with 16-16-16-28 CR1 timings at a voltage of 1.35 V.

Testing, which began on October 20, 2016, was conducted under the Microsoft Windows 10 Professional operating system with all updates as of the specified date and with the following drivers installed:

motherboard chipset Intel Chipset Drivers - 10.1.1.38 WHQL dated 10/12/2016;
Intel Management Engine Interface (MEI) - 11.6.0.1025 WHQL dated 10/14/2016;
video card drivers for NVIDIA GPUs - GeForce 375.57 WHQL dated 10/20/2016.

Since the video cards in today's testing are very fast, we decided to drop tests at 1920 x 1080 pixels and used only a resolution of 2560 x 1440 pixels. Unfortunately, our monitor does not support higher resolutions; however, given the results in the latest titles, there is no reason to regret that. Two graphics quality modes were used for the tests: Quality + AF16x (default texture quality in the drivers with 16x anisotropic filtering) and Quality + AF16x + MSAA 4x (8x) (16x anisotropic filtering plus 4x or 8x full-screen anti-aliasing), the latter in cases where the average frame rate remained high enough for comfortable gaming. In some games, due to the specifics of their engines, other anti-aliasing algorithms were used; this is indicated in the methodology and on the charts. Anisotropic filtering and full-screen anti-aliasing were enabled directly in the game settings; if a game lacked these settings, they were changed in the GeForce driver control panel, where V-Sync was also forcibly disabled. Apart from the above, no changes were made to the driver settings.

The video cards were tested in one graphics benchmark, one VR test and fifteen games, updated to their latest versions as of the date of publication of this article. Compared with our previous video card tests, the old and undemanding Thief and Sniper Elite III have been excluded from the test set, while the new Total War: WARHAMMER and Gears of War 4 with DirectX 12 API support have been added (there are now five such games in the set). Another new game with DirectX 12 support will appear in the list in upcoming video card articles. So the list of test applications now looks as follows (games, and the test results in them, are arranged in the order of their official release):

3DMark (DirectX 9/11) - version 2.1.2973, tested in the Fire Strike, Fire Strike Extreme, Fire Strike Ultra and Time Spy scenes (the charts show the graphics score);
SteamVR - a "virtual reality" readiness test; the number of frames tested during the run was taken as the result;
Crysis 3 (DirectX 11) - version 1.3.0.0, all graphics quality settings at maximum, blur at medium, lens flares on, modes with FXAA and with MSAA 4x, double sequential pass of a scripted scene from the beginning of the Swamp mission lasting 105 seconds;
Metro: Last Light (DirectX 11) - version 1.0.0.15, the built-in game benchmark was used, graphics quality settings and tessellation at the Very High level, Advanced PhysX technology in two test modes, tests with SSAA and without anti-aliasing, double sequential run of the D6 scene;
Battlefield 4 (DirectX 11) - version 1.2.0.1, all graphics quality settings at Ultra, double sequential run of the scripted scene from the beginning of the TASHGAR mission lasting 110 seconds;
Grand Theft Auto V (DirectX 11) - build 877, quality settings at Very High, ignoring suggested limits enabled, V-Sync disabled, FXAA enabled, NVIDIA TXAA disabled, MSAA for reflections disabled, NVIDIA soft shadows;
DiRT Rally (DirectX 11) - version 1.22, the built-in test on the Okutama track was used, all graphics quality settings at the maximum level, Advanced Blending - On; tests with MSAA 8x and without anti-aliasing;
Batman: Arkham Knight (DirectX 11) - version 1.6.2.0, quality settings at High, Texture Resolution at Normal, Anti-Aliasing on, V-Sync disabled, tests in two modes - with and without the last two NVIDIA GameWorks options activated, double sequential run of the test built into the game;
Tom Clancy's Rainbow Six: Siege (DirectX 11) - version 4.3, texture quality settings at the Very High level, Texture Filtering - Anisotropic 16X and other maximum quality settings, tests with MSAA 4x and without anti-aliasing, double sequential run of the test built into the game;
Rise of the Tomb Raider (DirectX 12) - version 1.0 build 753.2_64, all parameters at the Very High level, Dynamic Foliage - High, Ambient Occlusion - HBAO+, tessellation and other quality-enhancing techniques activated, two cycles of the built-in benchmark (Geothermal Valley scene) without anti-aliasing and with SSAA 4.0 activated;
Far Cry Primal (DirectX 11) - version 1.3.3, maximum quality level, high-resolution textures, volumetric fog and shadows at maximum, built-in performance test without anti-aliasing and with SMAA activated;
Tom Clancy's The Division (DirectX 11) - version 1.4, maximum quality level, all picture-enhancement parameters activated, Temporal AA - Supersampling, test modes without anti-aliasing and with SMAA 1X Ultra activated, built-in performance test, but with the results recorded by FRAPS;
Hitman (DirectX 12) - version 1.5.3, built-in test with graphics quality settings at the Ultra level, SSAO enabled, shadow quality Ultra, memory protection disabled;
Deus Ex: Mankind Divided (DirectX 12) - version 1.10 build 592.1, all quality settings manually set to the maximum level, tessellation and depth of field activated, at least two consecutive runs of the benchmark built into the game;
Total War: WARHAMMER (DirectX 12) - version 1.4.0 build 11973.949822, all graphics quality settings at the maximum level, reflections enabled, unlimited video memory and SSAO enabled, double sequential run of the benchmark built into the game;
Gears of War 4 (DirectX 12) - version 9.3.2.2, quality settings at the Ultra level, V-Sync disabled, all effects enabled; since the game does not support anti-aliasing, resolution scaling at 150% (up to 3840 x 2160) was used instead, double sequential run of the benchmark built into the game.

Where games allowed recording the minimum frame rate, it is also shown on the charts. Each test was run twice, and the better of the two results was taken as final, but only if the difference between them did not exceed 1%. If the runs differed by more than 1%, the test was repeated at least once more to obtain a reliable result.
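For clarity, here is one possible reading of that run-selection rule as a small Python sketch; run_benchmark() is a hypothetical stand-in for a single benchmark pass and is not part of our toolchain.

```python
# One possible reading of the run-selection rule described above.
def stable_result(run_benchmark, tolerance=0.01, max_runs=5):
    results = [run_benchmark(), run_benchmark()]      # every test is run twice
    while True:
        best, second = sorted(results, reverse=True)[:2]
        # FPS values are assumed positive, so dividing by 'best' is safe.
        if (best - second) / best <= tolerance or len(results) >= max_runs:
            return best                               # best run, repeatability confirmed
        results.append(run_benchmark())               # deviation above 1 %: run again

print(stable_result(lambda: 83.4))                    # trivially stable dummy run -> 83.4
```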

3. Results of performance tests

On the charts, the results of the video cards without overclocking are highlighted in green, and with overclocking in dark turquoise. Since all the results follow a common pattern, we will not comment on each chart separately but will instead analyze the summary charts in the next section of the article.

3DMark




SteamVR




Crysis 3




Metro: Last Light







Battlefield 4




Grand Theft Auto V




DiRT Rally




Batman: Arkham Knight




Tom Clancy's Rainbow Six: Siege




Rise of the Tomb Raider




Far Cry Primal




Tom Clancy's The Division




Hitman




Deus Ex: Mankind Divided




Total War: WARHAMMER

Since we are testing Total War: WARHAMMER for the first time, we will give the settings at which this game will be tested today and in our subsequent articles about video cards.



And then the results.




Gears of War 4

We will also present the settings of the new game Gears of War 4, which was included in the test set for the first time.








The results are as follows.



Let us supplement the charts with a summary table of test results showing the average and minimum frame rates for each video card.



Pivot charts and analysis of results are next.

4. Summary charts and analysis of results

On the first pair of summary charts, we compare the performance of the new NVIDIA TITAN X 12 GB at nominal frequencies with the reference NVIDIA GeForce GTX 980 Ti 6 GB, also at nominal frequencies. The results of the latter are taken as the baseline, and the average FPS of the NVIDIA TITAN X is plotted as a percentage of it. The advantage of the new graphics card is, without a doubt, impressive.
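As an illustration of how these relative figures are obtained, here is a minimal Python sketch; the FPS numbers in it are hypothetical placeholders, not our measured results.

```python
# GeForce GTX 980 Ti average FPS is the 100 % baseline; the TITAN X result
# is expressed relative to it for every scene and mode.
results = {
    "Game A, no AA":   {"GTX 980 Ti": 60.0, "TITAN X": 95.0},
    "Game B, MSAA 4x": {"GTX 980 Ti": 45.0, "TITAN X": 72.0},
}
for scene, fps in results.items():
    relative = fps["TITAN X"] / fps["GTX 980 Ti"] * 100.0
    print(f"{scene}: TITAN X = {relative:.1f} % of GTX 980 Ti")
```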



In our test conditions and settings, the NVIDIA TITAN X is at least 48% faster than the NVIDIA GeForce GTX 980 Ti, and its maximum advantage reaches a staggering 85%! Considering that in games the GeForce GTX 980 Ti was effectively equal to the former GeForce GTX TITAN X, we can say that the NVIDIA TITAN X is just as much faster than its predecessor. The progress of a fully-fledged Pascal GPU is incredible; it is a pity that for now all this is very expensive, although the GeForce GTX 1080 Ti already looming on the horizon should be noticeably more affordable (the only question is what exactly will be cut in it). Overall, averaged across all games at 2560 x 1440 pixels, the NVIDIA TITAN X is 64.7% faster than the NVIDIA GeForce GTX 980 Ti in modes without anti-aliasing and 70.4% faster when various anti-aliasing algorithms are enabled.

Now let us estimate how far the NVIDIA TITAN X at nominal frequencies is ahead of the Gigabyte GeForce GTX 1080 G1 Gaming with its clocks brought down to the level of the reference GeForce GTX 1080.



And again, a very decent performance gain! At a minimum, the new product is 19% faster than the GeForce GTX 1080, and in Rise of the Tomb Raider its advantage reaches an impressive 45.5%. On average across all games, the NVIDIA TITAN X is 27.0% faster in modes without anti-aliasing and 32.7% faster with anti-aliasing enabled.

Now let's dream that when the GeForce GTX 1080 Ti is released, NVIDIA will not cut the top-end Pascal in terms of the number of blocks and the number of shader processors, and at the same time its partners will release original versions with higher frequencies. How much more will the flagship's performance grow? The answer is in the following pivot chart.



Overclocking the NVIDIA TITAN X by 15.9% on the core and 12.4% on the video memory speeds up the already blisteringly fast video card by 12.9% in modes without anti-aliasing and by 13.4% with AA enabled. Going back to the first summary chart, it is easy to assume that original GeForce GTX 1080 Ti cards may turn out to be twice as fast as the reference GeForce GTX 980 Ti or GeForce GTX TITAN X. Of course, such a comparison is not entirely fair, since everyone knows that original GeForce GTX 980 Ti cards often overclock to 1.45-1.50 GHz on the core, so the advantage of a potential GeForce GTX 1080 Ti will not be quite that high. Nevertheless, even a 60-70% performance increase over the flagship of the previous generation cannot fail to impress. Where else do we see such gains in CPUs or RAM? There is nothing like it, even in the top segment. And NVIDIA already has such capabilities!
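The rough arithmetic behind that "twice as fast" estimate, using only the averages quoted above rather than any new data, fits in a few lines.

```python
# TITAN X advantage over the GTX 980 Ti multiplied by the overclocking gain
# measured in this article.
gain_no_aa   = 1.647 * 1.129     # 64.7 % advantage x 12.9 % OC gain
gain_with_aa = 1.704 * 1.134     # 70.4 % advantage x 13.4 % OC gain
print(round(gain_no_aa, 2), round(gain_with_aa, 2))   # -> 1.86 1.93
# Higher-clocked partner ("original") GTX 1080 Ti cards would push this towards 2x,
# while an overclocked GTX 980 Ti as the baseline would narrow the gap again.
```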

5. Computing on GPU

First, we test the performance of the new NVIDIA TITAN X in the CompuBench CL benchmark, version 1.5.8. The first two tests are face detection based on the Viola-Jones algorithm and TV-L1 Optical Flow motion vector computation.
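For readers who want a feel for what these tests compute, here is a minimal CPU-side sketch of the same two classic algorithms using OpenCV; it assumes the opencv-python and opencv-contrib-python packages, uses synthetic noise frames instead of real input, and is of course not the benchmark's own OpenCL code.

```python
import cv2
import numpy as np

# Synthetic grayscale frames stand in for real input images.
prev_frame = np.random.randint(0, 256, (240, 320), dtype=np.uint8)
next_frame = np.random.randint(0, 256, (240, 320), dtype=np.uint8)

# Viola-Jones face detection with the stock Haar cascade shipped with OpenCV.
cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
faces = cascade.detectMultiScale(prev_frame, scaleFactor=1.1, minNeighbors=5)

# Dense TV-L1 optical flow between the two frames (needs opencv-contrib).
tvl1 = cv2.optflow.createOptFlow_DualTVL1()
flow = tvl1.calc(prev_frame, next_frame, None)
print(len(faces), flow.shape)    # flow is (H, W, 2): per-pixel motion vectors
```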



Once again, the performance of the NVIDIA TITAN X is impressive. In nominal operating mode, the new product outperforms the reference GeForce GTX 980 Ti by 66.6% in the Face Detection test and by 90.4% in the TV-L1 Optical Flow benchmark. The advantage over the GeForce GTX 1080 is also quite noticeable, and overclocking the new "Titan" speeds it up by a further 8.1-12.1%. The other two test video cards show roughly the same gains when their frequencies are raised, however.

Next come Ocean Surface Simulation, a test that renders water surface waves using the fast discrete Fourier transform, and Particle Simulation, a physics simulation of particles.
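The general idea behind FFT-based ocean simulation can be sketched in a few lines of Python with NumPy; this only illustrates the technique and is not the test's actual OpenCL implementation.

```python
import numpy as np

# Synthesise a periodic water-surface height profile from a toy wave spectrum
# via an inverse FFT, the core idea of FFT-based ocean simulation.
N = 256                                        # samples along one surface line
k = np.fft.fftfreq(N)                          # spatial frequencies
amplitude = np.where(k != 0, 1.0 / (1.0 + (k * N) ** 2), 0.0)   # toy spectrum
phase = np.random.uniform(0.0, 2.0 * np.pi, N)                  # random phases
# Taking the real part stands in for enforcing Hermitian symmetry of the spectrum.
heights = np.fft.ifft(amplitude * np.exp(1j * phase)).real
print(heights[:5])                             # first few surface heights
```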



A distinctive feature of this pair of tests is how close the results of the GeForce GTX 980 Ti and the GeForce GTX 1080 are; the Maxwell core, it seems, is not going to give up easily. But both of these video cards yield to the new TITAN X, losing by 42.6 to 54.4%.

The results in the Video Composition test are packed much closer together.



The overclocked Gigabyte GeForce GTX 1080 G1 Gaming even manages to catch up with the nominal NVIDIA TITAN X, although the latter demonstrates a twenty percent advantage over the GeForce GTX 980 Ti.

But in the simulation of mining the Bitcoin cryptocurrency, we again see the colossal advantage of NVIDIA TITAN X.
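At its core, such a mining benchmark exercises repeated double SHA-256 hashing of a block header while searching for a suitable nonce; a toy Python sketch of that loop (with a placeholder header prefix and a deliberately easy target) looks like this.

```python
import hashlib
import struct

# Placeholder 76-byte header prefix and a toy target, far easier than the real network.
header_prefix = b"\x00" * 76
target = 2 ** 240
for nonce in range(1_000_000):
    header = header_prefix + struct.pack("<I", nonce)          # 80-byte header
    digest = hashlib.sha256(hashlib.sha256(header).digest()).digest()
    if int.from_bytes(digest[::-1], "big") < target:            # hash read as little-endian
        print("nonce found:", nonce)
        break
```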



The new product is almost twice as fast as the GeForce GTX 980 Ti and 30.4% faster than the Gigabyte GeForce GTX 1080 G1 Gaming running at the frequencies of the reference NVIDIA GeForce GTX 1080. At this rate of performance growth, video cards based on AMD GPUs will have very little chance of keeping up with NVIDIA.

Next in line is the GPGPU test from the AIDA64 Extreme utility, version 5.75.3981 Beta. From the results obtained, we built charts of single- and double-precision floating-point performance.



If the NVIDIA GeForce GTX TITAN X previously outpaced the first GeForce GTX TITAN by 62% in these tests, the new TITAN X on the Pascal core outperforms its predecessor by 97.5%! For the remaining AIDA64 GPGPU results, refer to the discussion thread for this article in our forum.

Finally, let us test the most demanding scene of the latest LuxMark 3.1, Hotel Lobby.



Note that in this test the old GeForce GTX 980 Ti keeps pace with the Gigabyte GeForce GTX 1080 G1 Gaming, but the TITAN X immediately pulls ahead of it by 58.5%. Phenomenal performance! Still, it is a pity that NVIDIA is delaying the release of the GeForce GTX 1080 Ti for now, and it is especially a pity that no one is pressuring it to hurry.

6. Power consumption

Power consumption was measured using the Corsair AX1500i power supply via the Corsair Link interface and the program of the same name, version 4.3.0.154. The consumption of the entire system was measured, excluding the monitor. The measurement was carried out in 2D mode during normal work in Microsoft Word and web browsing, as well as in 3D mode. In the latter case, the load was created by four consecutive cycles of the intro scene of the Swamp level from Crysis 3 at 2560 x 1440 pixels at maximum graphics quality settings with MSAA 4X. CPU power-saving technologies were disabled.

Let's compare the power consumption of the systems with the graphics cards tested today in the diagram.



Despite the colossal increase in performance across the board, NVIDIA managed to keep the thermal package of the new Pascal-based TITAN X within the same limits as the previous TITAN X - 250 watts - so the power consumption of systems with these video cards differs insignificantly. In nominal operating mode, the configuration with the NVIDIA TITAN X draws 41 watts more than with the NVIDIA GeForce GTX 980 Ti, and when both video cards are overclocked this difference shrinks to 23 watts. At the same time, the system with the Gigabyte GeForce GTX 1080 G1 Gaming is more economical than either version of the TITAN X, and at the frequencies of the reference GeForce GTX 1080 it almost fits within a 400-watt limit - and that is with a decently overclocked eight-core processor in the configuration. The new product is also more economical in 2D mode.

Conclusion

Since NVIDIA video cards, represented by the GeForce GTX 1080 and GTX 1070, currently hold sole performance leadership in the upper price segment, the release of an even more powerful TITAN X can well be regarded as a demonstration of the company's technological superiority over its only competitor. Moreover, this demonstration was entirely successful: within the same thermal package, the advantage of the new product over NVIDIA's previous-generation flagship in gaming tests sometimes reaches 85%, and on average is about 70%! The performance gain in computing looks no less impressive, which, as we know, is paramount for TITAN-series graphics cards.

The performance difference from the GeForce GTX 1080 is somewhat more modest at 27-33%, but the gain from overclocking is higher on the TITAN X (about 13% versus 10% on the GeForce GTX 1080), which means that when a GeForce GTX 1080 Ti based on the same GP102 appears, we can count on even higher frequencies and, consequently, higher performance. The downside of the TITAN X announcement is the two-hundred-dollar increase in the recommended price; however, in our opinion, a 20% increase in price will not cause serious problems for potential buyers of such video cards. Meanwhile, more modest gamers are eagerly awaiting the GeForce GTX 1080 Ti, as well as its "red" competitor.

In addition, we note that, despite its stunning performance in games, NVIDIA itself positions the TITAN X first and foremost as an effective tool for training neural networks and solving deep-learning problems. These algorithms are actively used today in a wide variety of areas: speech, image and video recognition, weather forecasting, more accurate medical diagnosis, high-precision maps, robotics, self-driving cars, and so on. So it can be said that the capabilities of the new NVIDIA TITAN X graphics card are virtually limitless and will satisfy any user.

We thank NVIDIA, and Irina Shekhovtsova personally, for the video card provided for testing.