NVIDIA Launches Ampere A100 GPU With 80GB HBM2E Memory

Twice as much VRAM as the current Ampere A100, which has 40 GB of HBM2 memory.

Last May, NVIDIA formally introduced its Ampere A100, a GPU comprising 54 billion transistors and equipped with 40 GB of HBM2 memory. At SC20, NVIDIA presented a new variant of this GPU armed with twice the memory: 80 GB, this time of the HBM2E type. The per-pin data rate thus rises from 2.4 Gbit/s to 3.2 Gbit/s, which increases the memory bandwidth to 2 TB/s, against 1.6 TB/s for the first iteration.
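Those bandwidth figures follow directly from the bus width and the per-pin rate. A quick sanity check (using the 5,120-bit bus width quoted in the table below):

```python
# Back-of-the-envelope check of the quoted memory bandwidth.
# Peak bandwidth (GB/s) = bus width in bits x per-pin rate in Gbit/s / 8 bits per byte.

def hbm_bandwidth_gbps(bus_width_bits: int, pin_rate_gbit_s: float) -> float:
    """Peak memory bandwidth in GB/s for a given bus width and per-pin data rate."""
    return bus_width_bits * pin_rate_gbit_s / 8

# 80 GB A100: 5,120-bit bus, HBM2E at 3.2 Gbit/s per pin
print(hbm_bandwidth_gbps(5120, 3.2))  # 2048.0 GB/s, i.e. ~2 TB/s
# 40 GB A100: 5,120-bit bus, HBM2 at 2.4 Gbit/s per pin
print(hbm_bandwidth_gbps(5120, 2.4))  # 1536.0 GB/s, i.e. ~1.6 TB/s
```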


Still intended for AI and intensive-computing platforms, this better-endowed A100 should be able to process an even larger amount of data. Otherwise, the specifications are identical to the current A100: the same 6,912 FP32 CUDA cores and a GPU boost clock of 1.41 GHz, for a single-precision computing power of 19.5 TFLOPS. Note that the two versions will coexist in the coming months: the 80 GB A100 does not replace the 40 GB model.
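The 19.5 TFLOPS figure can likewise be reproduced from the core count and clock, using the usual convention that each CUDA core performs one fused multiply-add (2 FLOPs) per cycle:

```python
# Rough check of the quoted single-precision throughput:
# FP32 TFLOPS = cores x boost clock (GHz) x 2 FLOPs per FMA / 1000.

def fp32_tflops(cuda_cores: int, boost_ghz: float) -> float:
    """Peak FP32 throughput in TFLOPS, assuming one FMA per core per cycle."""
    return cuda_cores * boost_ghz * 2 / 1000

print(round(fp32_tflops(6912, 1.41), 1))  # 19.5 TFLOPS, matching the A100 figure
```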


GPUs fabricated by TSMC in 7 nm

AnandTech has drawn up the table below to facilitate comparison between the two versions. For its HBM2E supply, NVIDIA apparently sources from Samsung (SK Hynix's HBM2E reaches 3.6 Gbit/s). And while Samsung is also responsible for fabricating the consumer RTX 3000 Ampere GPUs in 8 nm, production of these A100 GPUs falls to TSMC, in 7 nm.

| GPU | A100 (80 GB) | A100 (40 GB) | V100 |
| --- | --- | --- | --- |
| FP32 CUDA cores | 6,912 | 6,912 | 5,120 |
| Boost clock | 1.41 GHz | 1.41 GHz | 1.53 GHz |
| Memory speed | 3.2 Gbit/s HBM2E | 2.4 Gbit/s HBM2 | 1.75 Gbit/s HBM2 |
| Memory bus width | 5,120-bit | 5,120-bit | 4,096-bit |
| Memory bandwidth | 2.0 TB/s | 1.6 TB/s | 900 GB/s |
| VRAM | 80 GB | 40 GB | 16 GB / 32 GB |
| Single precision | 19.5 TFLOPS | 19.5 TFLOPS | 15.7 TFLOPS |
| Double precision | 9.7 TFLOPS | 9.7 TFLOPS | 7.8 TFLOPS |
| INT8 Tensor | 624 TOPS | 624 TOPS | N/A |
| FP16 Tensor | 312 TFLOPS | 312 TFLOPS | 125 TFLOPS |
| TF32 Tensor | 156 TFLOPS | 156 TFLOPS | N/A |
| Interconnect | NVLink 3, 12 links (600 GB/s) | NVLink 3, 12 links (600 GB/s) | NVLink 2, 6 links (300 GB/s) |
| GPU die | GA100 (826 mm²) | GA100 (826 mm²) | GV100 (815 mm²) |
| Transistor count | 54.2 billion | 54.2 billion | 21.1 billion |
| TDP | 400 W | 400 W | 300 W / 350 W |
| Process | TSMC 7N | TSMC 7N | TSMC 12 nm FFN |
| Interface | SXM4 | SXM4 | SXM2/SXM3 |
| Architecture | Ampere | Ampere | Volta |

For the moment, NVIDIA offers HGX and DGX configurations with 4 or 8 of these 80 GB A100 GPUs. The PCIe version remains, for now, exclusive to the A100 unveiled in May.


In addition, NVIDIA offers a DGX Station A100, the replacement for the Volta-based DGX Station, equipped with 4 A100 GPUs. In this workstation, they work alongside a 64-core AMD EPYC CPU.


Source: NVIDIA
