NVIDIA shows performance and efficiency gains in AI inference in MLPerf tests

NVIDIA announced on Wednesday (the 5th) the results of a new round of MLPerf tests. The company says it has taken AI inference to higher levels of performance and efficiency, from the cloud to the edge.

According to the company, the increased performance came from NVIDIA H100 Tensor Core GPUs running in DGX H100 systems. Thanks to software optimizations, these GPUs have delivered performance gains of up to 54% since their debut last September.

The H100, with its Transformer Engine, stood out on BERT, a language model that paved the way for the use of generative AI. BERT helps computers understand the meaning of ambiguous language in a text by using the surrounding content to establish context.

Generative AI, popularized recently by tools such as ChatGPT, enables the quick creation of text, images, 3D models and more. It has been adopted by companies, startups and cloud service providers to enable new business models and accelerate existing ones.

Debut of the L4 GPUs

Announced in late March, NVIDIA L4 Tensor Core GPUs made their MLPerf debut and recorded more than three times the speed of the previous-generation NVIDIA T4. They ran all the workloads in the benchmark and posted results considered “impressive” on the BERT model.

L4 GPUs also deliver up to 10X faster image decoding, up to 3.2X faster video processing, and 4X faster graphics and real-time rendering performance.

“With the advancement of generative AI, software available from NVIDIA is helping to perform and optimize workloads. It is very important that we contribute with these technologies, especially for large companies in the sector.”

Marcio Aguiar

Head of NVIDIA’s Enterprise Division for Latin America

Software and Network System Testing

Also in this round, NVIDIA’s full-stack AI platform was evaluated in MLPerf’s network division, a benchmark in which data is streamed to a remote inference server.

On BERT, remote NVIDIA DGX A100 systems reached up to 96% of their maximum local performance, held back in part because they had to wait for the CPUs to complete some tasks. On ResNet-50 for computer vision, which is handled by the GPUs alone, they reached 100%.
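
The percentages above compare throughput measured over the network with throughput measured locally on the same system. A minimal sketch of that arithmetic in Python follows; the function name and the throughput figures are hypothetical and chosen only for illustration, not taken from the MLPerf results.

```python
def fraction_of_local(network_qps: float, local_qps: float) -> float:
    """Return network-division throughput as a fraction of local throughput."""
    if local_qps <= 0:
        raise ValueError("local_qps must be positive")
    return network_qps / local_qps


# Hypothetical queries-per-second figures, for illustration only.
local_qps = 10_000.0
network_qps = 9_600.0

print(f"{fraction_of_local(network_qps, local_qps):.0%} of local performance")
```

A result of 100% would mean that sending data to a remote server cost no throughput at all, as reported for ResNet-50.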

Orin triples performance

Separately, the NVIDIA Jetson AGX Orin system-on-module achieved gains of up to 63% in energy efficiency and 81% in performance compared with its results from a year ago. It is aimed at inference where AI is needed in confined spaces at low power levels.

For applications that require smaller modules with lower power consumption, the Jetson Orin NX 16G made its debut in the benchmarks and showed up to 3.2 times the performance of the previous-generation Jetson Xavier NX.

AI ecosystem

NVIDIA also said the results show broad support for its platform across the AI ecosystem and the deep learning sector. In addition, ten partner companies submitted results in this round using the NVIDIA platform. The list includes Microsoft Azure, ASUS, Dell Technologies, GIGABYTE, H3C, Lenovo, Nettrix, Supermicro and xFusion.

The company also noted that all the NVIDIA AI Enterprise software used for the tests is available in the MLPerf repository, so that anyone can reproduce the results.

What is your assessment of the results obtained by NVIDIA in the MLPerf tests? Interact with us!
