Palit GeForce RTX 4090 GameRock OC review: say hello to my little friend!

An architecture for a new age.

The arrival of GeForce RTX 4090 marks a new era in PC gaming, where traditional brute-force performance aligns with forward-looking architecture to deliver breath-taking raytraced graphics and smooth gameplay.

It has taken three generations of RTX graphics cards for us to feel assured in stating Nvidia’s ambitions have been fulfilled, but here we are, comfortable with the progress that has been made. It wasn’t always this way, mind you, and a lot has changed over the past four years.

Back in 2018, RTX 20 Series could have been deemed a risky move. Reserving crucial die space for RT and Tensor cores was unexpected, and real-world ramifications for gamers were murky at best. Hey, want to tank performance in favour of better reflections in a few select games? The argument for raytracing was far from clear, yet Nvidia’s leadership in pure rasterisation was such that it could take that gamble without relinquishing top spot.

Fast forward to 2020’s GeForce RTX 30 Series and the tide began to turn. Raytraced graphics showed signs of living up to that initial promise, the performance hit on upgraded architecture proved less significant, and with maturing DLSS, AI-enhanced optimisations paved the way for higher framerates. Solid foundations were laid, and it is testament to Nvidia RTX that both AMD and Intel now view raytracing and upscaling technology as a cornerstone of current and future GPUs.

A strategy well executed is reflected in Nvidia’s confident launch plans for 2022’s RTX 40 Series. By all accounts, new GPUs might have arrived sooner were it not for excess stock of last-gen parts, and such is the potential of the Ada Lovelace architecture that Nvidia was more than happy to lift the lid on performance, specifications, and pricing weeks ahead of launch. A manufacturer in confident mood.

AD102 Building Block

Those of you craving benchmark numbers can skip right ahead – they’re astonishing in many ways – but for the geeks among us, let’s begin with all that’s happening beneath RTX 40 Series’ hood.

Nvidia AD102 Block Diagram

The full might of third-generation RTX architecture, codenamed Ada Lovelace, visualised as the AD102 GPU. It’s quite something, isn’t it?

What you’re looking at is one of the most complex consumer GPUs to date. Nvidia’s 608mm2 die, fabricated on a TSMC 4N process, packs a scarcely believable 76.3 billion transistors. Putting that figure into perspective, the best GeForce chip of the previous 8nm generation, GA102, measures 628mm2 yet accommodates a puny 28.3 billion transistors. Nvidia has effectively jumped two nodes in one fell swoop, as Samsung’s 8nm process is more akin to 10nm from other foundry manufacturers.

We’ve gone from flyweight to heavyweight in the space of a generation, and a 170 per cent increase in transistor count naturally bodes well for specs. A full-fat die is home to 12 graphics processing clusters (GPCs), each sporting a dozen streaming multiprocessors (SMs), six texture processing clusters (TPCs) and 16 render output units (ROPs). Getting into the nitty-gritty of the block diagram, each SM carries 128 CUDA cores, four Tensor cores, one RT core and four texture units.

All told, Ada presents a staggering 18,432 CUDA cores in truest form, representing a greater than 70 per cent hike over last-gen champion, RTX 3090 Ti. Plenty of promise, yet to the frustration of performance purists, Nvidia chooses not to unleash the full might of Ada in this first wave. The initial trio of GPUs shapes up as follows:

| GeForce | RTX 4090 | RTX 4080 16GB | RTX 4080 12GB | RTX 3090 Ti | RTX 3080 Ti | RTX 3080 12GB |
|---|---|---|---|---|---|---|
| Launch date | Oct 2022 | Nov 2022 | Nov 2022 | Mar 2022 | Jun 2021 | Jan 2022 |
| Codename | AD102 | AD103 | AD104 | GA102 | GA102 | GA102 |
| Architecture | Ada Lovelace | Ada Lovelace | Ada Lovelace | Ampere | Ampere | Ampere |
| Process (nm) | 4 | 4 | 4 | 8 | 8 | 8 |
| Transistors (bn) | 76.3 | 45.9 | 35.8 | 28.3 | 28.3 | 28.3 |
| Die size (mm2) | 608.5 | 378.6 | 294.5 | 628.4 | 628.4 | 628.4 |
| SMs | 128 of 144 | 76 of 80 | 60 of 60 | 84 of 84 | 80 of 84 | 70 of 84 |
| CUDA cores | 16,384 | 9,728 | 7,680 | 10,752 | 10,240 | 8,960 |
| Boost clock (MHz) | 2,520 | 2,505 | 2,610 | 1,860 | 1,665 | 1,710 |
| Peak FP32 TFLOPS | 82.6 | 48.7 | 40.1 | 40.0 | 34.1 | 30.6 |
| RT cores | 128 | 76 | 60 | 84 | 80 | 70 |
| RT TFLOPS | 191.0 | 112.7 | 92.7 | 78.1 | 66.6 | 59.9 |
| Tensor cores | 512 | 304 | 240 | 336 | 320 | 280 |
| ROPs | 176 | 112 | 80 | 112 | 112 | 96 |
| Texture units | 512 | 304 | 240 | 336 | 320 | 280 |
| Memory size (GB) | 24 | 16 | 12 | 24 | 12 | 12 |
| Memory type | GDDR6X | GDDR6X | GDDR6X | GDDR6X | GDDR6X | GDDR6X |
| Memory bus (bits) | 384 | 256 | 192 | 384 | 384 | 384 |
| Memory clock (Gbps) | 21 | 22.4 | 21 | 21 | 19 | 19 |
| Bandwidth (GB/s) | 1,008 | 717 | 504 | 1,008 | 912 | 912 |
| L1 cache (MB) | 16 | 9.5 | 7.5 | 10.5 | 10 | 8.8 |
| L2 cache (MB) | 72 | 64 | 48 | 6 | 6 | 6 |
| Power (watts) | 450 | 320 | 285 | 450 | 350 | 350 |
| Launch MSRP ($) | 1,599 | 1,199 | 899 | 1,999 | 1,199 | 799 |

Leaving scope for a fabled RTX 4090 Ti, inaugural RTX 4090 disables a single GPC, enabling 128 of 144 possible SMs. Resulting figures of 16,384 CUDA cores, 128 RT cores and 512 Tensor cores remain mighty by comparison, and frequency headroom on the 4nm process is hugely impressive, with Nvidia specifying a 2.5GHz boost clock for the flagship product. It won’t have escaped your attention that peak teraflops have more than doubled, from 40 on RTX 3090 Ti to 82.6 on RTX 4090.
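
For those who like to sanity-check the headline figures, the core counts and teraflops quoted above follow directly from the SM arithmetic. A rough back-of-the-envelope calculation, using only the specifications in the table, looks like this:

    # Back-of-the-envelope check of the RTX 4090 figures quoted above.
    cuda_per_sm = 128           # CUDA cores per streaming multiprocessor
    enabled_sms = 128           # RTX 4090 enables 128 of AD102's 144 SMs
    boost_clock_ghz = 2.52      # official boost clock

    cuda_cores = cuda_per_sm * enabled_sms     # 16,384 on RTX 4090
    full_die_cores = cuda_per_sm * 144         # 18,432 on a full-fat AD102
    # Each CUDA core can retire one fused multiply-add (two FLOPs) per clock.
    fp32_tflops = cuda_cores * 2 * boost_clock_ghz / 1000
    print(cuda_cores, full_die_cores, round(fp32_tflops, 1))   # 16384 18432 82.6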

The thought of a $1,999 RTX 4090 Ti waiting in the wings will rankle die-hard gamers wanting the absolute best on day one, however if you happen to be a PCMR loyalist, take pleasure in the fact that Xbox Series X produces just 12.1 teraflops. PlayStation 5, you ask? Pffft, a mere 10.3 teraflops.

Front-end RTX 40 Series specifications are eye-opening, but the back end is noticeably less revolutionary, where a familiar 24GB of GDDR6X memory operates at 21Gbps, providing 1,008GB/s of bandwidth. The needle hasn’t moved – Nvidia has opted against Micron’s quicker 24Gbps chips – however the load on memory has softened with a significant bump in on-chip cache.

RTX 4090 carries 16MB of L1 and 72MB of L2. We’ve previously seen AMD attach as much as 128MB Infinity Cache on Radeon graphics cards, and though Nvidia doesn’t detail data rates or clock cycles, a greater than 5x increase in cache between generations reduces the need to routinely spool out to memory, raising performance and reducing latency.
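
Putting a number on that claim, a quick comparison of combined on-chip cache, using the figures from the specification table, bears out the 5x statement:

    # Combined L1 + L2 cache, RTX 4090 vs RTX 3090 Ti (MB), per the table above.
    ada_cache = 16 + 72          # RTX 4090: 16MB L1 + 72MB L2
    ampere_cache = 10.5 + 6      # RTX 3090 Ti: 10.5MB L1 + 6MB L2
    print(ada_cache / ampere_cache)   # ~5.3x overall; L2 alone grows 12x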

Heard rumours of RTX 40 Series requiring a nuclear reactor to function? Such reports are wide of the mark. RTX 4090 maintains the same 450W TGP as RTX 3090 Ti, while second- and third-tier Ada Lovelace solutions scale down as low as 285W. 450W is still significant, and those who’ve been around the GPU block will remember that 250W was considered the upper limit not so long ago. Unlike RTX 3090 Ti, however, RTX 4090’s sheer grunt is such that performance-per-watt is much improved; we’re promised double the performance in the same power budget as last generation.

Nvidia Ada Lovelace die

Looking across to other Ada Lovelace GPUs, both RTX 4080 16GB and the controversial RTX 4080 12GB are set to follow in November. The dearer of the two once again falls short of full implementation, with 76 of 80 possible SMs. Controversy arises from the fact that the two models utilise different dies: AD103 for the 16GB card and AD104 for the 12GB.

Casual gamers going solely by branding could be fooled into thinking both models carry broadly similar characteristics, yet differences run deep, with AD104 reducing core count from 9,728 to 7,680. RT and Tensor cores are shaved proportionally, while memory bus width is cut to 192 bits.

Common sentiment is that RTX 4080 12GB ought to have been named RTX 4070, so why risk the kerfuffle? Nvidia’s official stance is the third-tier GPU is so fast it is deserving of 80-class branding. That argument is backed by peak teraflops being on par with RTX 3090 Ti, but a more logical reason is one of price. It is, after all, much easier for an RTX 4080 to command an $899 price point than an RTX 4070, and in the interests of keeping shareholders happy, a 294.5mm2 GPU priced at $899 looks extremely handsome compared to its 628.4mm2, $799 predecessor.

Nvidia’s been in the game long enough to know that GPUs need to be massaged to hit various price points, and with Ada Lovelace there’s plenty of wiggle room. Don’t be surprised to see full-fat RTX 4090 Ti and RTX 4080 Ti further down the line, and when RTX 4070 does eventually materialise, it could well take the shape of cut-down AD104.

Putting on our conjecturing hat, disabling an entire AD104 GPC would theoretically leave RTX 4070 with 6,144 CUDA cores and 32 teraflops performance. A potential $699 replacement for long-in-the-tooth RTX 3080 10GB?
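
To show our working on that guess, and it is only a guess, the sums below assume one of AD104’s five GPCs is disabled and clocks stay in the same ballpark as RTX 4080 12GB:

    # Hypothetical cut-down AD104 (the RTX 4070 conjecture above).
    ad104_sms = 60
    sms_per_gpc = 12
    cuda_per_sm = 128
    boost_clock_ghz = 2.61       # assumed similar to RTX 4080 12GB

    remaining_sms = ad104_sms - sms_per_gpc        # 48 SMs with one GPC disabled
    cuda_cores = remaining_sms * cuda_per_sm       # 6,144 CUDA cores
    fp32_tflops = cuda_cores * 2 * boost_clock_ghz / 1000
    print(cuda_cores, round(fp32_tflops, 1))       # 6144, ~32 TFLOPS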

Pricing will ultimately continue to grate PC users who feel squeezed out of high-performance gaming, and though there’s little chance of mining-fuelled demand this time around, market conditions and rampant inflation aren’t conducive to high-end GPU bargains. While we’re hopeful Intel Arc and AMD RDNA3 will restore competition in the mid-range – RTX 4080 12GB is the one to aim for – we don’t expect any serious challengers to RTX 4090.

Outside of core specifications, it is worth knowing display outputs remain tied to HDMI 2.1 and DisplayPort 1.4a – DisplayPort 2.0 hasn’t made the cut – and PCIe Gen 4 continues as the preferred interface. There’s no urgency to switch to Gen 5, says Nvidia, as even RTX 4090 can’t saturate the older standard. Finally, NVLink is conspicuous by its absence; SLI is nowhere to be seen on any RTX 40 Series product announced thus far, signalling multi-GPU setups are well and truly dead.

Ada Optimisations

While a shift to a smaller node affords more transistor firepower, such a move typically precludes sweeping changes to architecture. Optimisations and resourcefulness are the order of the day, and the huge computational demands of raytracing are such that raw horsepower derived from a near-3x increase in transistor budget isn’t enough; something else is needed, and Ada Lovelace brings a few neat tricks to the table.

Nvidia often refers to raytracing as a relatively new technology, stressing that good ol’ fashioned rasterisation has been through wave after wave of optimisation, and such refinement is actively being engineered for RT and Tensor cores. There’s plenty of opportunity where low-hanging fruit is yet to be picked.

Shader Execution Reordering

Shaders have run efficiently for years by executing one instruction in parallel across multiple threads, a model you may know as SIMT.

Raytracing, however, throws a spanner in those smooth works, as while pixels in a rasterised triangle lend themselves to running concurrently, keeping all lanes occupied, secondary rays are divergent by nature and the scattergun approach of hitting different areas of a scene leads to massive inefficiency through idle lanes.

Ada Lovelace - Shader Execution Reordering

Ada’s fix, dubbed Shader Execution Reordering (SER), is a new stage in the raytracing pipeline tasked with scanning individual rays on the fly and grouping them together. The result, going by Nvidia’s internal numbers, is a 2x improvement in raytracing performance in scenes with high levels of divergence.

Nvidia portentously claims SER is “as big an innovation as out-of-order execution was for CPUs.” A bold statement, and there is a proviso in that Shader Execution Reordering is an extension of Microsoft’s DXR APIs, meaning it is up to developers to implement and optimise SER in games.
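
The underlying idea is simple enough to illustrate away from the GPU. The sketch below is purely conceptual, it is not the DXR or NVAPI interface and the material keys are invented, but it shows the principle: sorting divergent ray hits by the shader they are about to invoke so that neighbouring lanes end up doing similar work.

    # Conceptual illustration of shader execution reordering (not Nvidia's API).
    # Divergent ray hits are grouped by the shader/material they will invoke,
    # so threads executing side by side run the same code path.
    from itertools import groupby

    ray_hits = [
        {"ray": 0, "material": "glass"},
        {"ray": 1, "material": "foliage"},
        {"ray": 2, "material": "glass"},
        {"ray": 3, "material": "metal"},
        {"ray": 4, "material": "foliage"},
    ]

    # Before reordering, adjacent lanes hit different materials (poor coherence).
    # After sorting on a shading key, identical shaders execute back to back.
    reordered = sorted(ray_hits, key=lambda hit: hit["material"])
    for material, group in groupby(reordered, key=lambda hit: hit["material"]):
        rays = [hit["ray"] for hit in group]
        print(f"{material}: rays {rays} shaded together")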

There’s no harm in having the tools, mind, and Nvidia is quickly discovering that what works for rasterisation can evidently be made to work for RT.

Upgraded RT Cores

In the rasterisation world, geometry bottlenecks are alleviated through mesh shaders. In a similar vein, displaced micro-meshes aim to echo such improvements in raytracing.

With Ampere, the Bounding Volume Hierarchy (BVH) was forced to contain every single triangle in the scene, ready for the RT core to sample. Ada, in contrast, can evaluate meshes within the RT core, identifying a base triangle prior to tessellation in an effort to drastically reduce storage requirements.

A smaller, compressed BVH has the potential to enable greater detail in raytraced scenes with less impact on memory. Having to insert only the base triangles, BVH build times are improved by an order of magnitude and data sizes shrink significantly, helping reduce CPU overhead.

The sheer complexity of raytracing is such that eliminating unnecessary shader work has never been more important. To that end, an Opacity Micromap Engine has also been added to Ada’s RT core to reduce the amount of information going back and forth to shaders.

Ada Lovelace - Opacity Micromap Engine

In the common leaf example, developers place the texture of foliage within a rectangle and use opaque polygons to determine the leaf’s position. A way to construct entire trees efficiently, yet with Ampere the RT core lacked this basic ability, with all rays passed back to the shader to determine which areas are opaque, transparent, or unknown. Ada’s Opacity Micromap Engine can identify all the opaque and transparent polygons without invoking any shader code, resulting in 2x faster alpha traversal performance in certain applications.

These two new hardware units make the third-generation RT core more capable than ever before – TFLOPS per RT core has risen by ~65 per cent between generations – yet all this isn’t enough to back up Nvidia’s claims of Ada Lovelace delivering up to 4x the performance of the previous generation. For that, Team Green continues to rely on AI.

DLSS 3

Since 2019, Deep Learning Super Sampling has played a vital role in GeForce GPU development. Nvidia’s commitment to the tech is best expressed by Bryan Catanzaro, VP of applied deep learning research, who states with no uncertainty that “the era of brute-force graphics rendering is over.”

Third-generation DLSS, deemed a “total revolution in neural graphics,” expands upon DLSS Super Resolution’s AI-trained upscaling by employing optical flow estimation to generate entire frames. Through a combination of DLSS Super Resolution and DLSS Frame Generation, Nvidia reckons DLSS 3 can now reconstruct seven-eighths of a game’s total displayed pixels, to dramatically increase performance and smoothness.
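
That seven-eighths figure sounds like marketing maths, but it follows from the two techniques stacking. Assuming Super Resolution is set to Performance mode, the sums work out as follows:

    # How DLSS 3 reaches 'seven-eighths of displayed pixels generated by AI'.
    # With Super Resolution in Performance mode, each rendered frame is traced
    # at quarter resolution; Frame Generation then synthesises every other frame.
    rendered_fraction_per_frame = 1 / 4   # 1080p render upscaled to 4K
    rendered_frame_share = 1 / 2          # every second displayed frame is generated
    rendered_pixels = rendered_fraction_per_frame * rendered_frame_share   # 1/8
    print(1 - rendered_pixels)            # 0.875, i.e. seven-eighths AI-generated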

Generating so much on-screen content without invoking the shader pipeline would have been unthinkable just a few years ago. It is a remarkable change of direction, but those magic extra frames aren’t conjured from thin air. DLSS 3 takes four inputs – two sequential in-game frames, an optical flow field and engine data such as motion vectors and depth buffers – to create and insert synthesised frames between working frames.

In order to capture the required information, Ada’s Optical Flow Accelerator is capable of up to 300 TeraOPS (TOPS) of optical-flow work, and that 2x speed increase over Ampere is viewed as vital in generating accurate frames without artifacts.

Ada Lovelace - Optical Flow Accelerator

The real-world benefit of AI-generated frames is most keenly felt in games that are CPU bound, where DLSS Super Resolution can typically do little to help. Nvidia’s preferred example is Microsoft Flight Simulator, whose vast draw distances inevitably force a CPU bottleneck. Internal data suggests DLSS 3 Frame Generation can boost Flight Sim performance by as much as 2x.

Do note, also, that Frame Generation and Super Resolution can be implemented independently by developers. In an ideal world, gamers will have the choice of turning the former on/off, while being able to adjust the latter via a choice of quality settings.

More demanding AI workloads naturally warrant faster Tensor Cores, and Ada obliges by borrowing the FP8 Transformer Engine from the HPC-optimised Hopper architecture. Peak FP16 Tensor teraflops performance is already doubled from 320 on Ampere to 661 on Ada, but with added support for FP8, RTX 4090 can deliver a theoretical 1.3 petaflops of Tensor processing.
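
The petaflops figure is a straight consequence of FP8 halving the data width and thereby doubling per-clock throughput; working from the FP16 number quoted above:

    # FP8 doubles Tensor throughput relative to FP16 on Ada.
    fp16_tensor_tflops = 661              # RTX 4090 peak FP16 Tensor TFLOPS quoted above
    fp8_tensor_tflops = fp16_tensor_tflops * 2
    print(fp8_tensor_tflops / 1000)       # ~1.3 petaflops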

Plenty of bombast, yet won’t such processing result in an unwanted hike in latency? Such concerns are genuine; Nvidia has taken the decision to make Reflex a mandatory requirement for DLSS 3 implementation.

Designed to bypass the traditional render queue, Reflex synchronises CPU and GPU workloads for optimal responsiveness and up to a 2x reduction in latency. Ada optimisations and in particular Reflex are key in keeping DLSS 3 latency down to DLSS 2 levels, but as with so much that is DLSS related, success is predicated on the assumption developers will jump through the relevant hoops. In this case, Reflex markers must be added to code, allowing the game engine to feed back the data required to coordinate both CPU and GPU.

Given the often-sketchy state of PC game development, gamers are right to be cautious when the onus is placed in the hands of devs, and there is another caveat in that DLSS tech is becoming increasingly fragmented between generations.

DLSS 3 now represents a superset of three core technologies: Frame Generation (exclusive to RTX 40 Series), Super Resolution (RTX 20/30/40 Series), and Reflex (any GeForce GPU since the 900 Series). Nvidia has no immediate plans to backport Frame Generation to slower Ampere cards.

8th Generation NVENC

Last but not least, Ada Lovelace is wise not to overlook the soaring popularity of game streaming, both during and after the pandemic.

Building upon Ampere’s support for AV1 decoding, Ada adds hardware AV1 encoding, claimed to be around 40 per cent more efficient than H.264. This, says Nvidia, allows streamers to bump their stream resolution to 1440p while maintaining the same bitrate.

AV1 support also bodes well for professional apps – DaVinci Resolve is among the first to announce compatibility – and Nvidia extends this potential on high-end RTX 40 Series GPUs by ensuring all three launch models include dual 8th Gen NVENC encoders (enabling 8K60 capture and 2x quicker exports) as well as a 5th Gen NVDEC decoder as standard.

Souped-up Founders Edition

Much has been made of Nvidia’s relationship with add-in board manufacturers. With the past few generations of GeForce hardware placing an emphasis on in-house Founders Edition graphics cards, murmurings of displeased partners erupted this past month when long-time partner EVGA announced it would not be producing RTX 40 Series graphics cards.

The ramifications of the seemingly acrimonious dispute remain unclear, yet one thing’s for certain, Nvidia is not going to take the spotlight away from its own products. Partners were barely mentioned in the company’s press briefings, and a decision to launch Founders Edition reviews a day prior to AIB cards has left manufacturers playing second fiddle.

Nvidia GeForce RTX 4090 Founders Edition

Whichever card you choose, it is going to be big. Much has been made of chonky RTX 4090 designs, yet while we’re enjoying the memes, actual Founders Edition dimensions suggest not a great deal has changed between generations.

The already-large RTX 3090 paved the way for triple-slot Founders Edition boards but fell just shy of full, three-slot thickness (2.7-slot, to be precise). RTX 4090 goes the whole hog with a 3.0-slot form factor, making it fractionally thicker, though length, you might be surprised to hear, is shortened from 313mm to 304mm. That reduction, alongside a subtle curve on the exterior, makes the card appear plumper than dimensions suggest.

Putting our neck on the line here, we’ve nothing against larger graphics cards. In fact, we’ve often suggested that waning support for SLI should allow AIBs to produce chunkier cards in favour of lower temps and quieter acoustics. It was for those reasons we took a liking to Asus’s 4.3-slot Noctua Edition coolers.

Nvidia’s revamped Founders Edition cooler, applicable to both RTX 4090 and RTX 4080 16GB (RTX 4080 12GB will only be available in partner designs), takes advantage of a broader waistline by incorporating a new vapour chamber, improved heatpipe configuration, larger fans touting a 20 per cent increase in airflow, and particular emphasis on memory cooling.

Nvidia GeForce RTX 4090 PCB

GDDR6X ran notoriously hot on last-gen cards, and Nvidia has tackled that issue in two ways. Firstly, the Micron chips are now built on a smaller, more efficient node. Secondly, increased density allows all the memory to reside on one side of the PCB for more effective cooling. How effective? Nvidia claims a 10°C reduction in GDDR6X temperature when gaming.

Nvidia’s petite PCB – shaped to optimise airflow – carries two additional layers to enhance efficiency, and the number of GPU power phases rises from 16 to 20. The most notable cleanup takes place in the upper-right corner, where multiple PCIe power connectors are replaced by a single, forward-looking 12VHPWR connector offering compatibility with ATX 3.0 power supplies.

In no hurry to give up your existing PSU? Adapters are included with all RTX 4090 cards, converting three/four eight-pin PCIe connectors to the 12VHPWR standard. The end result is a messy arrangement of cables, and anyone planning an ultra-sleek build will want to factor in an ATX 3.0 PSU. That said, we’ve had zero issues running RTX 4090 cards on our trusty be quiet! Straight Power 11 Platinum 1,000W supply, and note that only three PCIe connectors are mandatory; the fourth, taking power from 450W to 600W, is only warranted if you’re inclined to overclock the shiny new GPU.

Palit GameRock OC

Impressive though the Founders Edition may be, custom-cooled variants are at hand for those seeking something even bigger and bolder. There are literally dozens to choose from, and Palit piles on the glitz with GameRock OC.

As a general rule, partners tend to balloon out the reference design. With Nvidia now using a full three slots, AIB cards are spilling liberally into the fourth or even fifth bay. Officially designated a 3.5-slot design, Palit’s behemoth measures 329.4mm x 137.5mm x 71.5mm and tips the scales at 1,984g.

It is, without doubt, a monster, and huge cards have quickly become commonplace in 2022. Palit’s own RTX 3090 Ti was no shrinking violet, and dimensions aren’t too dissimilar to Sapphire’s hulking Radeon RX 6950 XT.

Palit GeForce RTX 4090 GameRock OC - Size Comparison

With heatsinks of these proportions, AIB partners ought to have no trouble taming even a 450W GPU. Palit employs a vapour chamber at the heart of its beast, alongside a trio of ‘gale hunter’ fans. All three turn off at low load, which is good, but spin up a little too abruptly from zero to 30 per cent when temperature exceeds 50°C. The fans do remain suitably quiet when gaming, however the transition from off to on could be smoother.

The solitary 12VHPWR power connector can be used with the pictured four-way adapter on existing PSUs, and it is interesting to note Palit’s power-supply recommendations are more robust than most. Whereas Nvidia suggests 850W for systems featuring the Founders Edition, Palit recommends a startling 1,200W of available juice. This is erring on the extreme side of caution; the card barely tickles our 1,000W supply.

A redesigned full-metal backplate gives necessary rigidity, and Palit’s bundle includes an addressable RGB cable for motherboard synchronisation, as well as a helpful support bracket. The latter attaches to a choice of mounting holes toward the end of the card and can be adjusted in height using a selection of poles. It’s a simple-but-effective addition whose rubber foot prevents any scratches or damage to the chassis.

Minor factory overclocking is represented in a boost clock of 2,610MHz – up from 2,520MHz on the FE – but the real reason to consider the Palit is huge amounts of vivid RGB lighting. The backlit logos across the top tell only half the story, as the ‘starlight black crystal design’ across the front is where the appeal really lies.

Palit GeForce RTX 4090 GameRock OC - Bling!

Heart set on a build that will outshine your Christmas tree this holiday season? The relatively bland Founders Edition just won’t cut it. Palit’s crystallised fascia, on the other hand, brings a rig to life. The illumination effect actually works quite well in a traditional horizontal layout – the amount of lighting reflects nicely in our Fractal Design Define R6 chassis – but the card is guaranteed to turn heads in a vertical orientation.

True, some effects aren’t perfect – there aren’t enough LEDs for transitions to appear seamless – yet top marks to Palit for attempting to stand out from the crowd. The question mark that remains is one of price. With Nvidia’s Founders Edition sitting at £1,679 here on UK shores, we feel as though air-cooled partner cards need to keep as close to the £1,700 mark as possible. That may be difficult given the plummeting value of the pound, yet in our discussions with Palit, we’re told custom-cooled cards are expected to range from approximately £1,675 for entry-level models, rising to £1,730 for the likes of GameRock OC.

We’ve rambled on long enough, how about some benchmarks?

Performance

An entirely new architecture needs to be tested from the ground-up. All of our comparison GPUs are tested from scratch on our enduring Ryzen 9 5950X test platform. The Asus ROG Crosshair VIII Formula motherboard has been updated to the latest 4201 BIOS, Windows 11 is up to version 22H2, and we’ve used the most recent Nvidia and AMD drivers at the time of writing.

We’d normally take this opportunity to point out relevant comparisons to watch for. There aren’t any; RTX 4090 stands alone.

3DMark - Time Spy
3DMark - Time Spy Extreme

Yessir, Ada’s pretty quick. A 55 per cent uplift over RTX 3090 Ti is not to be scoffed at – the score aligns with a 52 per cent increase in CUDA cores – yet something’s amiss. Given the architectural enhancements, we’d expect RTX 4090 to sail past 30k in the standard 3DMark Time Spy test.

Further investigation reveals a surprising truth; RTX 4090 is so fast a Ryzen 9 5950X simply can’t keep up. Our trusty test platform has served us well for years, but it would appear the time has come to transition to something quicker.

3DMark - Time Spy Extreme Stress Test

We don’t expect any partner card to fall short of the 97 per cent pass requirement in 3DMark’s Stress Test. The last card to fail was, indeed, Nvidia’s GeForce RTX 3080 Ti Founders Edition.

3DMark - DirectX Raytracing

Faster RT Cores… and more of them. 3DMark’s DirectX Raytracing test reveals over a 2x leap in generational performance.

3DMark - Port Royal
3DMark - Mesh Shader

You know a graphics card is quick when results force you to change the axis’ maximum bounds. Vroom.

Assassin’s Creed Valhalla

Assassin's Creed Valhalla - FHD
Assassin's Creed Valhalla - QHD
Assassin's Creed Valhalla - UHD

When it comes to actual games, Palit’s RTX 4090 GameRock OC was found to be hitting ~2.8GHz during real-world use inside our fully built test platform. Such GPU power deserves a high-resolution, high-refresh panel to go with. QHD at over 200Hz would be a good fit, but increasingly, it seems 4K120 is the industry-wide target.

Cyberpunk 2077

Cyberpunk 2077 - FHD
Cyberpunk 2077 - QHD
Cyberpunk 2077 - UHD

Cyberpunk 2077 is the modern-day equivalent of Crysis, a destroyer of feeble GPUs. We now have in excess of 80 frames per second at QHD with raytracing set to ultra. 4K60, however, remains outside of 4090’s grasp. DLSS, as you might have guessed, will have something to say about that a little later in our testing.

Far Cry 6

Far Cry 6 - FHD
Far Cry 6 - QHD
Far Cry 6 - UHD

There’s a bottleneck or framerate cap in Far Cry 6. RTX 4090 comfortably tops the chart at 4K, but even older GPUs can deliver 4K60 in this raytraced title.

Final Fantasy XIV: Endwalker

Final Fantasy XIV: Endwalker - FHD
Final Fantasy XIV: Endwalker - QHD
Final Fantasy XIV: Endwalker - UHD

Final Fantasy XIV: Endwalker is usually a good sign of a GPU’s rasterisation capabilities. RTX 4090 is found to be 42 per cent quicker than RTX 3090 Ti. Fast, but might you have expected more from 76.3 billion transistors?

Forza Horizon 5

Forza Horizon 5 - FHD
Forza Horizon 5 - QHD
Forza Horizon 5 - UHD

Playground Games’ beautiful racer provides fascinating insight. You might be wondering why no minimum FPS is charted at FHD or QHD. The reason is that the reported GPU minimum is higher than the in-game average; once again, our Ryzen 9 5950X test platform is unable to keep up with the fleeting GeForce. That bottleneck is alleviated at 4K, where RTX 4090 maintains a 50 per cent advantage over RTX 3090 Ti.

Marvel’s Guardians of the Galaxy

Marvel's Guardians of the Galaxy - FHD
Marvel's Guardians of the Galaxy - QHD
Marvel's Guardians of the Galaxy - UHD

Add raytracing to the mix and the generational performance gap grows considerably. From 59 to 99 frames per second at 4K is nothing short of astonishing and you will want a high-res monitor for RTX 4090 to stretch its legs. The value of such firepower is diminished at 1080p.

Tom Clancy’s Rainbow Six Extraction

Tom Clancy's Rainbow Six Extraction - FHD
Tom Clancy's Rainbow Six Extraction - QHD
Tom Clancy's Rainbow Six Extraction - UHD

On more than one occasion, Nvidia has reiterated its desire to push competitive gamers to look beyond 1080p. The ubiquitous full-HD resolution remains a mainstay in the pro-gaming space, where contestants crave ultra-high refresh rates. Has the time come for 1440p to become the default? For those who can afford the very best GPU, perhaps so.

If, somehow, you’re still questioning performance, take another look at the Rainbow Six graphs and note that RTX 4090’s minimum frame rate is higher than every other card’s average. Bonkers.

Efficiency, Temps and Noise

System Power Consumption

Power consumption on Ada Lovelace has been subject to intense speculation in recent months. While 450W is at the top end of the GPU scale, RTX 4090’s configuration is such that real-world power draw is often lower than you’d think.

Our entire test system was found drawing less than 550W in most titles; only in the most demanding raytraced games were we able to get that figure closer to the worst-case-scenario of 595W. A marked improvement over RTX 3090 Ti, which always ran close to the limit.

Energy Efficiency at UHD

A better way to look at efficiency is to divide the average 4K UHD framerate across all tested titles by peak system-wide power consumption. 450W GPUs don’t sound all that appealing in this era of frightening energy costs, yet no other card in our line-up comes close to matching RTX 4090 in performance-per-watt.

It pays to know exactly what your costs could be in this day and age. Taking our Ryzen 9 5950X/GeForce RTX 4090 system as an example, we calculate an hour of gaming each day would cost £74 in annual electricity. Game for four hours a day and you’d be looking at a bill of £246. Figures are based on the October UK Energy Price cap rate of 34 pence per kWh.
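
For readers wanting to run the numbers against their own usage and tariff, the calculation is straightforward. The sketch below reproduces our one-hour-a-day figure using the worst-case system draw and the October price cap quoted above:

    # Annual electricity cost for gaming, based on measured system draw.
    system_draw_watts = 595        # worst-case whole-system consumption observed
    price_per_kwh = 0.34           # October 2022 UK price cap (GBP)
    hours_per_day = 1

    annual_kwh = system_draw_watts / 1000 * hours_per_day * 365
    print(round(annual_kwh * price_per_kwh))   # ~£74 per year at one hour a day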

GPU Temperature

It doesn’t matter if you’re taming a CPU, GPU or simple SSD, bigger cooling apparatus is a sure-fire method of delivering lower temps. Given the frequencies RTX 4090 is hitting, sub-70°C is suitably chilled.

System Noise

Palit’s GameRock OC keeps noise levels down to a reasonable level, too. We imagine the most fine-tuned AIB coolers will do even better; who’ll be first to get under-load levels below 35dB, we wonder.

Overclocking

There’s good news and bad news on the overclocking front. Starting with the good, RTX 4090, despite lofty shipping frequencies, does have further headroom. Those inclined to up the ante with manual tinkering will get close to speeds of 3GHz.

For the sake of our card’s health, we chose to use Afterburner’s automated OC scanner, which helpfully still works on Ada Lovelace. With memory bumped to an effective 23Gbps, in-game core clock routinely skipped past 2.9GHz.

Overclocking - Cyberpunk 2077 - UHD
Overclocking - Marvel's Guardians of the Galaxy - UHD

What’s the bad news? Well, overclocking is a risky pastime on any GPU, let alone one costing in excess of £1,600. Minor performance improvements are offset by a hike in power consumption, and if you really want to up framerate, meaningful gains are available through other means. Speaking of which.

DLSS 3

Having spent the past few decades evaluating GPUs in familiar fashion, it is eye-opening to witness a sea change in how graphics performance is both delivered and appraised.

RTX 4090 may boast thunderous rasterisation and lightning RT Cores, yet DLSS 3 is the storm that’s brewing. Jogging the memory after all those benchmarks, remember developers now have the option to enable individual controls for Super Resolution and/or Frame Generation. The former, as you’re no doubt aware, upscales from a lower render resolution based on the chosen quality setting; at a 4K target, Ultra Performance works its magic on a 1280×720 render, Performance on 1920×1080, Balanced on 2227×1253, and Quality on 2560×1440.
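
Expressed another way, each Super Resolution mode trades render resolution for output resolution. At a 4K target, and using the render resolutions quoted above, the scaling works out like this:

    # DLSS Super Resolution render resolutions at a 4K (3840x2160) output target.
    output_pixels = 3840 * 2160
    modes = {
        "Ultra Performance": (1280, 720),
        "Performance":       (1920, 1080),
        "Balanced":          (2227, 1253),
        "Quality":           (2560, 1440),
    }
    for mode, (w, h) in modes.items():
        share = w * h / output_pixels
        print(f"{mode}: renders {share:.0%} of output pixels")   # 11% to 44%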

On top of that, Frame Generation inserts a synthesised frame between two rendered, resulting in multiple configuration options. Want the absolute maximum framerate? Switch Super Resolution to Ultra Performance and Frame Generation On, leading to a 720p upscale from which whole frames are also synthesised.

Want to avoid any upscaling but willing to live with frames generated from full-resolution renders? Then turn Super Resolution off and Frame Generation On. Note that enabling the latter automatically invokes Reflex; the latency-reducing tech is mandatory, once again reaffirming the fact that additional processing risks performance in other areas.

Cyberpunk 2077 - DLSS 3 Settings

Reviewers have been granted access to pre-release versions of select titles incorporating DLSS 3 tech. Initial impressions are that developers are still getting to grips with implementation. Certain games require settings to be enabled in a particular sequence for DLSS 3 to properly enable, while others simply crash when alt-tabbing back to desktop. There are other limitations, too. DLSS 3 is not currently compatible with V-Sync (FreeSync and G-Sync are fine) and Frame Generation only works with DX12.

Point is, while Nvidia reckons uptake is strong – some 35 DLSS 3 games have already been announced – these are early days, and some implementations will work better than others. Cyberpunk 2077, formerly a bug-ridden mess, has evolved dramatically over the months into a much-improved game and one that best showcases what raytracing and DLSS can do.

DLSS 3 - Cyberpunk 2077 - UHD

Performance is benchmarked at a 4K UHD resolution with raytracing set to Ultra and DLSS in eight unique configurations (we weren’t kidding when we said gamers will have plenty of options). Putting everything in the hands of rasterisation evidently isn’t enough; 42 frames per second is disappointing.

Turn on Frame Generation, whereby frames are synthesised from the native 4K render, and framerate jumps by 71 per cent, equivalent to Super Resolution at the highest quality setting. Beyond that, performance scales as image quality degrades. Maximum Super Resolution (1440p upscaling) with Frame Generation bumps FPS up to 112, and dropping Super Resolution quality down to Performance (1080p upscaling) sends framerate to 146. This is where Nvidia’s claims of up to a 4x increase in performance become apparent.
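
Taken together, the chart figures map neatly onto Nvidia’s scaling claims. A quick calculation against the native result, using only the numbers quoted in this section, shows how close a fully dressed DLSS 3 configuration gets to the headline 4x:

    # Scaling relative to native 4K rasterisation in Cyberpunk 2077 (RT Ultra).
    native_fps = 42
    results = {
        "Frame Generation only":             round(42 * 1.71),   # ~72fps, +71% over native
        "Quality SR + Frame Generation":     112,
        "Performance SR + Frame Generation": 146,
    }
    for config, fps in results.items():
        print(f"{config}: {fps}fps, {fps / native_fps:.1f}x native")   # up to ~3.5x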

DLSS 3 works, the numbers attest to that, but there are so many other questions raised here. Performance purists will naturally lament the fact that not all frames are properly rasterised, and though Nvidia reckons developers are on board, you have to wonder how game artists feel about their creations being artificially amalgamated.

A lot of the answers ultimately revolve around how well DLSS maintains image quality. The pros and cons of Super Resolution are well documented, but Frame Generation is an entirely new beast, and even at this early stage, signs are good.

The screengrabs above, showing a synthesised frame between two renders, suggest Nvidia’s optical flow accelerator is doing a fine job of reconstructing a frame, though visual artifacts do remain. HUD elements, in particular, tend to confuse DLSS 3 (look for the 150M objective marker as an example), but on the whole it is a surprisingly accurate portrayal.

It is also worth pointing out that we’re having to pore through individual frames just to find artifacts. You may discover the odd missing branch or blurred sign, but inserted frames go by so quickly that errors are practically imperceptible during actual gameplay. On the contrary, what you do notice is how much smoother the experience is with DLSS 3 enabled.

For gamers with high-refresh panels the tech makes good sense, of course, but convincing the e-sports crowd will be easier said than done. That’s an arena where accurate frames matter most.

There’s also latency to take into consideration. In the above tests, average latency was recorded as 85ms with DLSS turned off completely. Latency drops to 31ms with Super Resolution enabled to Quality mode, but then rises to 45ms with both Super Resolution and Frame Generation working in tandem.

Plenty of food for thought. DLSS 3 examination will continue in earnest with the release of compatible retail games. For now, consider us intrigued and a little bit optimistic.

Conclusion

More than just a new GPU architecture, Ada Lovelace represents an admission that the best graphics cards need to evolve in order to realise major advancements in 3D visuals.

Rasterisation alone isn’t getting the job done, and as is the case with most major processor technologies of today, specialised cores will become the weapon of choice to help developers unlock true next-gen progress. Nvidia has bet big on RT and Tensor Cores, and though only time will tell whether or not DLSS is the silver bullet it is made out to be, RTX, at the very least, opens up an important conversation.

The most complete manifestation of this new outlook is represented in GeForce RTX 4090, a humdinger of a GPU that excels in practically every performance metric. A substantial hike in rasterisation alongside accelerated RT Cores delivers benchmark-busting results, and when you add the potential of DLSS 3, raytraced framerates truly begin to soar. We should expect no less with 76 billion transistors at work, yet it is satisfying to see everything come together on a TSMC 4N process.

We often witness halo products deliver diminishing returns at exorbitant premiums, yet for the first time in a while, that doesn’t apply to top-end RTX 4090. While undoubtedly costly at over £1,600, it heralds a level of performance that still manages to feel like good value for those who game at this extreme end of the market.

The more pressing concern, we feel, is how developers go about utilising the new tech. Opportunity is there to push the boundaries of PC graphics, yet our time with pre-release DLSS 3 titles suggests there’s some way to go in terms of optimisation and stability. It is also worth pointing out heightened system requirements. RTX 4090 is so quick it demands suitably lavish accompanying hardware; factor a high-res screen and a state-of-the-art CPU into your plans.

PC enthusiasts seeking the next big increase in gaming performance need look no further. GeForce RTX 4090 is a startling upgrade for anyone willing to foot the bill.

Palit GeForce RTX 4090 GameRock OC - Shiny!

Palit GeForce RTX 4090 GameRock OC

Verdict: Ferociously quick, GeForce RTX 4090 is the only choice for ultra-high-end gaming rigs.


Club386 Editor's Choice

Pros

Performance off the charts
DLSS 3.0 a game changer
Runs cool and quiet
Single power connector
24GB memory

Cons

450W now the norm
Large footprint

