The arrival of RTX 40 Series has proven to be a trying time for Nvidia’s loyal army of add-in-board partners. All the top brands were forced to watch the high-profile RTX 4090 launch from the sidelines, with in-house Founders Edition stealing the limelight, and the first model to launch as a partner exclusive, RTX 4080 12GB, was cancelled unceremoniously.
Not ideal for board manufacturers wanting to plan ahead, and just to confound matters, partners were then instructed to rebrand RTX 4080 12GB to RTX 4070 Ti and wipe $100 off the asking price. It’s no wonder EVGA decided to bail, and last-minute adjustments inevitably have a knock-on effect; given that AIBs were targeting the x80 Series segment, we now have an array of x70 Series cards that are absolutely huge.
Take, for example, the MSI GeForce RTX 4070 Ti Suprim X. Demonstrating the very best of MSI build quality, it’s an absolute beast measuring 338mm long, 142mm tall and 73mm wide. That’s almost as gargantuan as the flagship RTX 4090 variant, and you absolutely will need to set aside four expansion slots to house this monster.
RTX 4070 Ti really needn’t be this large – the underlying 295mm² GPU is terrifically efficient and naturally lends itself to smaller form factors – yet going big has benefits and Suprim X does it better than most. Construction is exemplary throughout, with a brushed metal front cover and full-length metal backplate sandwiching a custom PCB that is reinforced with a die-cast, anti-bending plate.
A triple-fan tank, in other words, albeit one that looks particularly sleek. MSI’s angular accents, subtle lighting and grey/silver colour scheme make for one of the most visually appealing designs on the market, and that refined aesthetic carries across to usage. All three Tri Frozr 3S fans turn off at low load, and when gaming inside our real-world test platform, fan speed rarely exceeds 1,000RPM, resulting in a graphics card that’s barely audible at any time.
What’s the harm in a little overkill? Well, pricing, as you might have guessed. A £799 starting price for RTX 4070 Ti has already upset those expecting 2023 GPU prices to plummet in one fell swoop, and board partners playing with tight margins are struggling to keep close to Nvidia’s target. MSI has three RTX 4070 Ti models winging their way to UK stores: the entry-level Ventus 3X at £849, the mid-pack Gaming X Trio at £889, and the all-singing, all-dancing Suprim X at £949.
A near-20 per cent premium puts the heavyweight, 2,015g Suprim X dangerously close to Radeon RX 7900 XTX territory, but that’s the price you pay for such lavish design. Given that RTX 4070 Ti is relatively straightforward to cool, Ventus 3X will suffice for most users, and you’ll need to love Suprim X looks to warrant the extra outlay.
Helping incentivise the upgrade, MSI’s bundle includes a token mouse mat, metal GPU support stand and three-way adapter for the card’s single 16-pin 12VHPWR connector. Requiring an extra cable may seem a disadvantage – most RTX 4070 Tis will make do with a two-way configuration – but this is a nod toward Suprim X’s overclocking credentials. MSI allows for board power to be raised from the default 285W to 365W, the highest we’ve seen on any partner card to date.
Out-the-box frequencies are naturally heightened, from 2,610MHz to 2,775MHz, or 2,790MHz if you choose to enable the ‘Extreme Performance’ mode through MSI’s companion app. A physical dual BIOS switch is also present to toggle between ‘Silent’ and ‘Gaming’ modes, but do be aware that core frequency doesn’t differ between the two; both operate at the standard 2,775MHz, with Gaming mode simply upping fan speed. The 12GB of GDDR6X memory, meanwhile, is dialled-in to a default 21Gbps.
Big, over-engineered yet undeniably beautiful, Suprim X looks lovely inside our test platform. Keep scrolling for the benchmarks, but first, a recap on GPU positioning and architecture.
|GeForce||RTX 4090||RTX 4080||RTX 4070 Ti||RTX 3090 Ti||RTX 3080 Ti||RTX 3080|
|Launch date||Oct 2022||Nov 2022||Jan 2023||Mar 2022||Jun 2021||Jan 2022|
|Architecture||Ada Lovelace||Ada Lovelace||Ada Lovelace||Ampere||Ampere||Ampere|
|Die size (mm²)||608.5||378.6||294.5||628.4||628.4||628.4|
|SMs||128 of 144||76 of 80||60 of 60||84 of 84||80 of 84||70 of 84|
|Boost clock (MHz)||2,520||2,505||2,610||1,860||1,665||1,710|
|Peak FP32 TFLOPS||82.6||48.7||40.1||40.0||34.1||30.6|
|Memory size (GB)||24||16||12||24||12||12|
|Memory bus (bits)||384||256||192||384||384||384|
|Memory clock (Gbps)||21||22.4||21||21||19||19|
|L1 cache (MB)||16||9.5||7.5||10.5||10||8.8|
|L2 cache (MB)||72||64||48||6||6||6|
|Launch MSRP ($)||1,599||1,199||799||1,999||1,199||799|
As we all know, GPUs are designed to scale and massage multiple price points. Top-end AD102 is a monster, and set the stage magnificently, yet any architecture worth its salt needs to extend effectively down the stack.
AD104, pictured below in block diagram form and productised as GeForce RTX 4070 Ti, brings sweeping cuts to just about every facet. A full quota of 60 SMs translates to 7,680 CUDA cores operating at a heightened 2,610MHz. Extra frequency never hurts, yet the magnitude of the chop is reflected in the fact that RTX 4070 Ti features fewer than half the cores of flagship RTX 4090, and over 20 per cent fewer than RTX 4080.
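The core counts and peak throughput figures in the table can be reproduced with a little arithmetic. The sketch below assumes 128 FP32 CUDA cores per SM and two floating-point operations per core per clock (one fused multiply-add); neither figure is stated in this review, and the helper names are our own.

```python
# Assumptions (not taken from the review): 128 FP32 cores per SM,
# and 2 FLOPs per core per clock (one fused multiply-add).
CORES_PER_SM = 128
FLOPS_PER_CORE_PER_CLOCK = 2


def cuda_cores(sms: int) -> int:
    """CUDA core count for a given number of enabled SMs."""
    return sms * CORES_PER_SM


def peak_fp32_tflops(sms: int, boost_mhz: float) -> float:
    """Theoretical peak FP32 throughput in TFLOPS at the rated boost clock."""
    flops = cuda_cores(sms) * FLOPS_PER_CORE_PER_CLOCK * boost_mhz * 1e6
    return flops / 1e12


# (enabled SMs, boost clock in MHz), as listed in the spec table
cards = {
    "RTX 4090": (128, 2520),
    "RTX 4080": (76, 2505),
    "RTX 4070 Ti": (60, 2610),
}

for name, (sms, mhz) in cards.items():
    print(f"{name}: {cuda_cores(sms):,} cores, {peak_fp32_tflops(sms, mhz):.1f} TFLOPS")

# Core deficit of RTX 4070 Ti relative to its bigger siblings
deficit_vs_4090 = 1 - cuda_cores(60) / cuda_cores(128)
deficit_vs_4080 = 1 - cuda_cores(60) / cuda_cores(76)
```

Under those assumptions the numbers line up with the table, and the deficits work out to roughly 53 per cent against RTX 4090 and 21 per cent against RTX 4080.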
RT and Tensor cores drop accordingly, and the less-of-everything approach is equally savage at the back end, where memory capacity is reduced to 12GB of GDDR6X and bus width is narrowed to 192 bits. At the default 21Gbps, we’re looking at 504GB/s of bandwidth, representing an almost-30 per cent reduction over RTX 4080. Such brutal cuts are mitigated somewhat by the presence of more onboard cache – there’s less need to routinely spool out to memory – yet a card with only 12GB of RAM can’t truly be deemed 4K ready in 2023.
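For anyone wanting to check the bandwidth figures, peak memory bandwidth is simply bus width multiplied by the per-pin data rate, divided by eight to convert bits to bytes. A minimal sketch using the values from the spec table (the function name is our own):

```python
def bandwidth_gb_s(bus_bits: int, gbps_per_pin: float) -> float:
    """Peak memory bandwidth in GB/s: bus width (bits) x data rate (Gbps) / 8."""
    return bus_bits * gbps_per_pin / 8


rtx_4070_ti = bandwidth_gb_s(192, 21)    # 192-bit bus at 21Gbps
rtx_4080 = bandwidth_gb_s(256, 22.4)     # 256-bit bus at 22.4Gbps

# Fractional shortfall of RTX 4070 Ti relative to RTX 4080
reduction = 1 - rtx_4070_ti / rtx_4080

print(f"RTX 4070 Ti: {rtx_4070_ti:.0f}GB/s")
print(f"RTX 4080:    {rtx_4080:.1f}GB/s")
print(f"Reduction:   {reduction:.1%}")
```

That gives 504GB/s against 716.8GB/s, a shortfall of just under 30 per cent.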
QHD is instead the primary focus, with Nvidia specifically marketing RTX 4070 Ti as a means of achieving over 120 frames per second at an increasingly common 1440p resolution. Such positioning decreases the overall cost of ownership and also has the benefit of lower running costs. While RTX 4090 maintained the same lofty 450W TGP as RTX 3090 Ti, RTX 4070 Ti sips a mere 285W. That’s nearly 20 per cent lower than the last-generation x80 equivalent.
AD104 GPUs that don’t quite make the cut may one day be repurposed for the mid-range – a pared-down, non-Ti RTX 4070 will surely see the light of day – but the question for today’s review is whether or not full-fat RTX 4070 Ti does enough to justify a reduced $799 fee amid renewed competition. Nvidia will argue that RTX 4070 Ti offers RTX 3090 Ti-like performance for less than half the price, with the added bonus of latest-gen ray tracing supremacy and DLSS enhancements. That much is true, but it won’t have escaped the attention of PC gamers that previous-generation RTX 3070 Ti launched at $599.
Pricing will inevitably lead to fierce debate in tech circles – onlookers are already concerned that a $799 RTX 4070 Ti will pave the way for a $599 RTX 4060 – and there’s no denying the fact that rampant price inflation of recent years has had a lasting effect on the GPU market whether your allegiance is Green or Red. PC gaming fans have every right to be concerned as previously attainable product lines move out of reach, and much like the smartphone market, shocking premium price tags have quickly transitioned to the norm.
Nvidia’s army of partners will be hoping for a positive reaction as, for the first time on a 40 Series card, there’s no Founders Edition to compete with. RTX 4070 Ti will only be available from add-in-board partners, many of whom have at least half-a-dozen unique SKUs to choose from.
While a shift to a smaller node affords more transistor firepower, such a move typically precludes sweeping changes to architecture. Optimisations and resourcefulness are the order of the day, and the huge computational demands of raytracing are such that raw horsepower derived from a 3x increase in transistor budget isn’t enough; something else is needed, and Ada Lovelace brings a few neat tricks to the table.
Nvidia often refers to raytracing as a relatively new technology, stressing that good ol’ fashioned rasterisation has been through wave after wave of optimisation, and such refinement is actively being engineered for RT and Tensor cores. There’s plenty of opportunity where low-hanging fruit is yet to be picked.
Shader Execution Reordering
Shaders have been running efficiently for years, whereby one instruction is executed in parallel across multiple threads. You may know it as SIMT.
Raytracing, however, throws a spanner in those smooth works, as while pixels in a rasterised triangle lend themselves to running concurrently, keeping all lanes occupied, secondary rays are divergent by nature and the scattergun approach of hitting different areas of a scene leads to massive inefficiency through idle lanes.
Ada’s fix, dubbed Shader Execution Reordering (SER), is a new stage in the raytracing pipeline tasked with scanning individual rays on the fly and grouping them together. The result, going by Nvidia’s internal numbers, is a 2x improvement in raytracing performance in scenes with high levels of divergence.
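To get a feel for why regrouping rays helps, consider a toy model in which a group of 32 lanes runs in lockstep and must make one pass for every distinct hit shader present in the group. This is purely illustrative and not a description of Nvidia's scheduler; the lane count, the eight hypothetical shaders and the cost model are all assumptions of ours.

```python
import random

WARP_SIZE = 32  # lanes executing in lockstep; assumed for illustration


def warp_passes(shader_ids):
    """Total passes needed when each group of WARP_SIZE rays must run one
    pass per distinct shader it contains (non-matching lanes sit idle)."""
    passes = 0
    for start in range(0, len(shader_ids), WARP_SIZE):
        warp = shader_ids[start:start + WARP_SIZE]
        passes += len(set(warp))
    return passes


def utilisation(shader_ids):
    """Fraction of lane slots doing useful work across all passes."""
    return len(shader_ids) / (warp_passes(shader_ids) * WARP_SIZE)


rng = random.Random(0)
# Divergent secondary rays, each landing on one of 8 hypothetical shaders
rays = [rng.randrange(8) for _ in range(1024)]

unsorted_passes = warp_passes(rays)
reordered_passes = warp_passes(sorted(rays))  # group rays by shader first

print(f"Arrival order: {unsorted_passes} passes, {utilisation(rays):.0%} lane utilisation")
print(f"Reordered:     {reordered_passes} passes, {utilisation(sorted(rays)):.0%} lane utilisation")
```

In this model the reordered workload needs far fewer passes because most groups contain a single shader, which is the intuition behind the idle-lane problem described above. Real-world gains depend on how divergent the scene is and on the cost of the reordering itself.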
Nvidia portentously claims SER is “as big an innovation as out-of-order execution was for CPUs.” A bold statement and there is a proviso in that Shader Execution Reordering is an extension of Microsoft’s DXR APIs, meaning it is up to developers to implement and optimise SER in games.
There’s no harm in having the tools, mind, and Nvidia is quickly discovering that what works for rasterisation can evidently be made to work for RT.
Upgraded RT Cores
In the rasterisation world, geometry bottlenecks are alleviated through mesh shaders. In a similar vein, displaced micro-meshes aim to echo such improvements in raytracing.
With Ampere, the Bounding Volume Hierarchy (BVH) was forced to contain every single triangle in the scene, ready for the RT core to sample. Ada, in contrast, can evaluate meshes within the RT core, identifying a base triangle prior to tessellation in an effort to drastically reduce storage requirements.
A smaller, compressed BVH has the potential to enable greater detail in raytraced scenes with less impact on memory. Having to insert only the base triangles, BVH build times are improved by an order of magnitude and data sizes shrink significantly, helping reduce CPU overhead.
The sheer complexity of raytracing is such that eliminating unnecessary shader work has never been more important. To that end, an Opacity Micromap Engine has also been added to Ada’s RT core to reduce the amount of information going back and forth to shaders.
In the common leaf example, developers place the texture of foliage within a rectangle and use opaque polygons to determine the leaf’s position. A way to construct entire trees efficiently, yet with Ampere the RT core lacked this basic ability, with all rays passed back to the shader to determine which areas are opaque, transparent, or unknown. Ada’s Opacity Micromap Engine can identify all the opaque and transparent polygons without invoking any shader code, resulting in 2x faster alpha traversal performance in certain applications.
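The saving is easier to picture with a small model. In the sketch below, each micro-triangle of the leaf carries one of three states, and only the unknown ones fall back to a shader-side alpha test. The enum, function names and stand-in alpha test are hypothetical, chosen to mirror the description above rather than any real API.

```python
from enum import Enum


class Opacity(Enum):
    OPAQUE = 0
    TRANSPARENT = 1
    UNKNOWN = 2


def resolve_hits(micro_states, any_hit_shader):
    """Return (hit flags, shader calls). Opaque and transparent states are
    resolved straight from the map; only UNKNOWN entries invoke the shader."""
    shader_calls = 0
    hits = []
    for index, state in enumerate(micro_states):
        if state is Opacity.OPAQUE:
            hits.append(True)       # ray definitely hits the leaf
        elif state is Opacity.TRANSPARENT:
            hits.append(False)      # ray passes straight through
        else:
            shader_calls += 1       # fall back to the alpha test
            hits.append(any_hit_shader(index))
    return hits, shader_calls


# A hypothetical leaf: mostly solid, some empty space, a few edge cases
leaf_map = [Opacity.OPAQUE] * 10 + [Opacity.TRANSPARENT] * 4 + [Opacity.UNKNOWN] * 2


def alpha_test(index):
    """Stand-in for a texture lookup in the any-hit shader."""
    return index % 2 == 0


hits, calls = resolve_hits(leaf_map, alpha_test)
print(f"Shader invoked {calls} times instead of {len(leaf_map)}")
```

Without the map, all 16 samples would be passed back to the shader; with it, only the two ambiguous ones are.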
These two new hardware units make the third-generation RT core more capable than ever before – TFLOPS per RT core has risen by ~65 per cent between generations – yet all this isn’t enough to back up Nvidia’s claims of Ada Lovelace delivering up to 4x the performance of the previous generation. For that, Team Green continues to rely on AI.
Since 2019, Deep Learning Super Sampling has played a vital role in GeForce GPU development. Nvidia’s commitment to the tech is best expressed by Bryan Catanzaro, VP of applied deep learning research, who states with no uncertainty that “the era of brute-force graphics rendering is over.”
Third-generation DLSS, deemed a “total revolution in neural graphics,” expands upon DLSS Super Resolution’s AI-trained upscaling by employing optical flow estimation to generate entire frames. Through a combination of DLSS Super Resolution and DLSS Frame Generation, Nvidia reckons DLSS 3 can now reconstruct seven-eighths of a game’s total displayed pixels, to dramatically increase performance and smoothness.
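The seven-eighths figure falls out of two multipliers. Assuming Super Resolution renders at half the output resolution on each axis and Frame Generation inserts one synthesised frame for every rendered frame, only one-eighth of displayed pixels pass through the traditional pipeline. A quick check, with a function name of our own:

```python
from fractions import Fraction


def reconstructed_fraction(axis_scale: Fraction, generated_per_rendered: int) -> Fraction:
    """Share of displayed pixels produced by DLSS rather than rendered.

    axis_scale: render resolution as a fraction of output, per axis.
    generated_per_rendered: synthesised frames inserted per rendered frame.
    """
    rendered_pixels_per_frame = axis_scale ** 2
    rendered_frames = Fraction(1, 1 + generated_per_rendered)
    return 1 - rendered_pixels_per_frame * rendered_frames


# Assumed: half resolution per axis, one generated frame per rendered frame
share = reconstructed_fraction(Fraction(1, 2), 1)
print(share)
```

A quarter of the pixels in half of the frames leaves one-eighth rendered natively, hence seven-eighths reconstructed.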
Generating so much on-screen content without invoking the shader pipeline would have been unthinkable just a few years ago. It is a remarkable change of direction, but those magic extra frames aren’t conjured from thin air. DLSS 3 takes four inputs – two sequential in-game frames, an optical flow field and engine data such as motion vectors and depth buffers – to create and insert synthesised frames between working frames.
In order to capture the required information, Ada’s Optical Flow Accelerator is capable of up to 300 TeraOPS (TOPS) of optical-flow work, and that 2x speed increase over Ampere is viewed as vital in generating accurate frames without artifacts.
The real-world benefit of AI-generated frames is most keenly felt in games that are CPU bound, where DLSS Super Resolution can typically do little to help. Nvidia’s preferred example is Microsoft Flight Simulator, whose vast draw distances inevitably force a CPU bottleneck. Internal data suggests DLSS 3 Frame Generation can boost Flight Sim performance by as much as 2x.
Do note, also, that Frame Generation and Super Resolution can be implemented independently by developers. In an ideal world, gamers will have the choice of turning the former on/off, while being able to adjust the latter via a choice of quality settings.
More demanding AI workloads naturally warrant faster Tensor Cores, and Ada obliges by inheriting the FP8 Transformer Engine from HPC-optimised Hopper. Peak FP16 Tensor teraflops performance is already doubled from 320 on Ampere to 661 on Ada, but with added support for FP8, RTX 4090 can deliver a theoretical 1.3 petaflops of Tensor processing.
Plenty of bombast, yet won’t such processing result in an unwanted hike in latency? Such concerns are genuine; Nvidia has taken the decision to make Reflex a mandatory requirement for DLSS 3 implementation.
Designed to bypass the traditional render queue, Reflex synchronises CPU and GPU workloads for optimal responsiveness and up to a 2x reduction in latency. Ada optimisations and in particular Reflex are key in keeping DLSS 3 latency down to DLSS 2 levels, but as with so much that is DLSS related, success is predicated on the assumption developers will jump through the relevant hoops. In this case, Reflex markers must be added to code, allowing the game engine to feed back the data required to coordinate both CPU and GPU.
Given the often-sketchy state of PC game development, gamers are right to be cautious when the onus is placed in the hands of devs, and there is another caveat in that DLSS tech is becoming increasingly fragmented between generations.
DLSS 3 now represents a superset of three core technologies: Frame Generation (exclusive to RTX 40 Series), Super Resolution (RTX 20/30/40 Series), and Reflex (any GeForce GPU since the 900 Series). Nvidia has no immediate plans to backport Frame Generation to slower Ampere cards.
8th Generation NVENC
Last but not least, Ada Lovelace is wise not to overlook the soaring popularity of game streaming, both during and after the pandemic.
Building upon Ampere’s support for AV1 decoding, Ada adds hardware AV1 encoding, improving efficiency over H.264 to the tune of 40 per cent. This, says Nvidia, allows streamers to bump their stream resolution to 1440p while maintaining the same bitrate.
AV1 support also bodes well for professional apps – DaVinci Resolve is among the first to announce compatibility – and Nvidia extends this potential on high-end RTX 40 Series GPUs by ensuring both launch models include dual 8th Gen NVENC encoders (enabling 8K60 capture and 2x quicker exports) as well as a 5th Gen NVDEC decoder as standard.
Club386 Test Platform
Processor: AMD Ryzen 9 5950X
Motherboard: Asus ROG Crosshair VII Formula
CPU Cooler: Corsair iCue H115i Elite Capellix
Memory: 64GB G.Skill Trident Z Neo DDR4-3200
Storage: 2TB Corsair MP600
Power Supply: be quiet! Straight Power 11 Platinum 1,000W
Chassis: Fractal Design Define 7 Clear Tempered Glass
Monitor: Philips Momentum 436M6VBPAB
Operating System: Microsoft Windows 11 Pro
Club386 may earn an affiliate commission when you purchase products through links on our site.
Our trusty test platforms have been working overtime these past few months, and though the PCIe slot is starting to look worse for wear, the twin AM4 rigs haven’t skipped a beat.
The biggest RTX 4070 Ti isn’t necessarily the quickest. Using out-the-box settings, Suprim X settled at around 2,800MHz inside our test PC, putting it just shy of Palit’s GamingPro. As expected, Nvidia’s most affordable RTX 40 Series GPU to date benchmarks at a level akin to last-gen flagship, RTX 3090 Ti.
RTX 40 Series has proven to be efficient and every current-gen card surpasses the 97 per cent pass mark in 3DMark’s Stress Test with consummate ease.
The majesty of RTX 4090 becomes all the more astonishing when lesser siblings are brought into view. RTX 4070 Ti features 53 per cent fewer RT cores than the flagship, bringing performance back down to the earthly level of previous-gen champion, RTX 3090 Ti.
We still haven’t been able to explain the low scores for RDNA 3 in the Mesh Shader test, but as expected Nvidia maintains a healthy lead with respect to ray tracing.
Assassin’s Creed Valhalla
Make no mistake, Radeon RX 7900 XT is a thorn in RTX 4070 Ti’s side. The dearer AMD GPU is quicker to the tune of 11 per cent at a ubiquitous 1080p resolution, and with an extra 8GB of onboard RAM, that gap grows at 1440p and 2160p.
Do you measure performance with an emphasis on pure rasterisation or forward-looking ray tracing smarts? If it’s the latter, RTX 4070 Ti is a solid option for high-quality QHD gameplay.
Far Cry 6
RX 7900 XT at £899 or RTX 4070 Ti at £799? That’s the choice AMD and Nvidia are presenting, yet actual pricing for souped-up partner cards throws a spanner in the works. Remember it is £949 for an RTX 4070 Ti Suprim X.
Final Fantasy XIV: Endwalker
Gaming at QHD? It’s been hard to separate RX 7900 XT and RTX 4070 Ti thus far; both are excellent options for maximum image quality at a 1440p resolution.
Forza Horizon 5
QHD120 is Nvidia’s target and such framerates are often attainable. Note also that while RTX 4090 is a freakish headline act, you can get smooth, high-quality 4K visuals for much less.
Marvel’s Guardians of the Galaxy
Ray tracing is now a staple of cutting-edge game development, and there’s simply no escaping the fact that GeForce GPUs handle the heightened demands better than anyone else. RTX 4070 Ti is 22 per cent quicker than RX 7900 XT in Marvel’s Guardians of the Galaxy at a QHD resolution with ray-traced reflections set to ultra.
Tom Clancy’s Rainbow Six Extraction
At £799, GeForce RTX 4070 Ti is intended to undercut rival Radeon RX 7900 XT. At £949, MSI’s Suprim X gets closer to top-end Radeon RX 7900 XTX that’s comfortably quicker in the rasterisation stakes.
Power, Temps and Noise
All-new RTX 4070 Ti and previous-gen RTX 3090 Ti may behave similarly in most games, yet the juxtaposition of these two GPUs is clear in other areas. Whereas the latter raises system-wide power consumption to over 600 watts, the Ada Lovelace card delivers that same level of punch in a far more elegant way.
Better news for the energy bill.
Every partner card ought to be able to deliver sub-70°C temperatures with ease. Cooling performance is now realistically a secondary concern; feel free to choose the card that’s priced right and aesthetically pleasing.
We mentioned MSI’s fans barely tickling 1,000RPM and here’s the proof. RTX 4070 Ti Suprim X is top of the chart and one of the quietest cards we’ve ever tested.
Seeing as MSI provides a generous 365W to play with, it makes sense to dabble with a little bit of overclocking. Using the automated scanner built into the firm’s Afterburner utility, pushing boost clock to 2,802MHz is effortless. Our logs reveal the card actually hits 2,925MHz during gaming and we suspect more adventurous users will hit the 3GHz mark. Memory, meanwhile, had no trouble scaling to an effective 23Gbps.
Gaining up to an extra five per cent in performance is easy enough, but don’t expect RTX 4070 Ti to get close to GPUs further up the stack using conventional methods such as increased frequency.
Having spent the past few decades evaluating GPUs in familiar fashion, it is eye-opening to witness a sea change in how graphics performance is both delivered and appraised.
RTX 40 Series may boast thunderous rasterisation and lightning RT Cores, yet DLSS 3 is the storm that’s brewing. Rejogging the memory banks after all those benchmarks, remember developers now have the option to enable individual controls for Super Resolution and/or Frame Generation. The former, as you’re no doubt aware, upscales image quality based on setting; Ultra Performance works its magic on a 1280×720 render, Performance on 1920×1080, Balanced works at 2227×1253, and Quality upscales from 2560×1440.
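Those render resolutions correspond to a 3840×2160 output and can be derived from per-axis scale factors. The factors below (two-thirds, 0.58, one-half and one-third) are the commonly cited DLSS values rather than anything stated in this review, so treat them as assumptions; the helper name is ours.

```python
# Per-axis scale factors; assumed values, not taken from the review
SCALE = {
    "Quality": 2 / 3,
    "Balanced": 0.58,
    "Performance": 1 / 2,
    "Ultra Performance": 1 / 3,
}


def render_resolution(out_w: int, out_h: int, mode: str) -> tuple:
    """Internal render resolution for a given output size and DLSS mode."""
    scale = SCALE[mode]
    return round(out_w * scale), round(out_h * scale)


for mode in SCALE:
    width, height = render_resolution(3840, 2160, mode)
    print(f"{mode}: {width}x{height}")
```

At other output resolutions the same factors apply, so a 2560×1440 display in Quality mode would, on these assumptions, render internally at roughly 1707×960.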
On top of that, Frame Generation inserts a synthesised frame between two rendered, resulting in multiple configuration options. Want the absolute maximum framerate? Switch Super Resolution to Ultra Performance and Frame Generation On, leading to a 720p upscale from which whole frames are also synthesised.
Want to avoid any upscaling but willing to live with frames generated from full-resolution renders? Then turn Super Resolution off and Frame Generation On. Note that enabling the latter automatically invokes Reflex; the latency-reducing tech is mandatory, once again reaffirming the fact that additional processing risks performance in other areas.
Reviewers have been granted access to pre-release versions of select titles incorporating DLSS 3 tech. Initial impressions are that developers are still getting to grips with implementation. Certain games require settings to be enabled in a particular sequence for DLSS 3 to properly enable, while others simply crash when alt-tabbing back to desktop. There are other limitations, too. DLSS 3 is not currently compatible with V-Sync (FreeSync and G-Sync are fine) and Frame Generation only works with DX12.
Point is, while Nvidia reckons uptake is strong – some 35 DLSS 3 games have already been announced – these are early days, and some implementations will work better than others. Microsoft Flight Simulator is one of the first to officially support DLSS 3 as part of Sim Update XI (SU11, pictured above), and the results are dramatic to say the least.
RTX 4070 Ti is benchmarked at a 4K UHD resolution with Ultra rendering quality. With DLSS disabled entirely, the GPU falls short of our preferred 60fps target in the Sydney Famous Landing Challenge.
Enabling Super Resolution (DLSS 2) in quality, balanced or performance mode lifts performance by almost 40 per cent, but it is Frame Generation (DLSS 3) that continues to astonish. Turning it on independently boosts performance by 53 per cent, climbing to 94 per cent when Super Resolution is added to the mix at maximum quality.
Flight Simulator is an excellent example of where Frame Generation holds merit. This is a tech unlikely to be widely used by competitive gamers who rely on maximum accuracy, but in the right scenario, the framerate gains are simply too large to ignore. With the right monitor, the difference between 4K60 and 4K120 can be profound, and when done well there’s little detriment to image quality.
That isn’t to say it’s perfect – we did notice occasional flickering of elements in the cockpit – but you really have to look for artifacts to know they’re there. Implementation is impressive, even at this early stage, and Nvidia does well to keep latency in check through Reflex. With DLSS disabled entirely, in-game Flight Simulator latency is recorded as 67.1ms. This rises to 75.4ms with Frame Generation enabled, and then drops back down to 66.9ms with Frame Generation and Super Resolution at maximum quality.
It is too soon to deem DLSS 3 a must-have feature, yet make no mistake, it is an important feather in the 40 Series’ cap and one that rivals will be rushing to replicate. Rival AMD Radeon RX 7000 Series promises Fluid Motion Frames in an effort to double 4K framerate, but when exactly the tech will be available, and how well it performs, remains unknown.
Component shortages and a recent mining boom have had a lasting effect on graphics card prices. Those hoping for swift reductions in 2023 look set to be disappointed, and that’s a shame as the current crop of GPUs are excellent in many regards.
RTX 4070 Ti delivers the performance punch of a last-generation flagship in an efficient 285W package that makes light work of max-quality QHD gameplay at high framerates. Improved ray tracing and best-ever DLSS make it a tantalising proposition, and it is only a £799 fee that serves as an obvious stumbling block.
And do be aware the advertised MSRP is merely a starting point. The best cards will naturally attract a higher price tag, and when it comes to build quality, presentation and acoustics, Suprim X sets a high bar. MSI’s implementation has spared no expense, but the end result, inevitably, is a £949 card that’s at odds with RTX 4070 Ti’s performance potential.
MSI GeForce RTX 4070 Ti Suprim X
Verdict: Luxurious design comes with a hefty premium, but RTX 4070 Ti doesn’t get much better than this.
Pros
Delivers on 2K120 promise
RT and DLSS 3 perks
Barely audible at all times
Relatively low power draw
Dual BIOS and 365W limit
Cons
Only 12GB memory
£949 price tag