Twice as much VRAM as the current Ampere A100, which has 40 GB of HBM2 memory.
Last May, NVIDIA formalized its Ampere A100, a GPU comprising 54 billion transistors and equipped with 40 GB of HBM2 memory. At SC20, NVIDIA presented a new variant of this GPU: it carries twice the memory, 80 GB, this time of the HBM2E type. The memory speed thus rises from 2.4 Gbit/s to 3.2 Gbit/s, which pushes the memory bandwidth to 2 TB/s, against 1.6 TB/s for the first iteration.
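The 2 TB/s figure follows directly from the per-pin data rate and the A100's 5,120-bit HBM bus. A quick sanity check of the arithmetic (the formula is the standard one for peak memory bandwidth; the function name is ours):

```python
# Peak memory bandwidth = per-pin data rate (Gbit/s) x bus width (bits) / 8 bits per byte.
# Figures from the article: 3.2 Gbit/s on a 5,120-bit bus for the 80 GB A100,
# 2.4 Gbit/s on the same bus for the 40 GB model.

def peak_bandwidth_gbps(pin_rate_gbit_s: float, bus_width_bits: int) -> float:
    """Return peak memory bandwidth in GB/s."""
    return pin_rate_gbit_s * bus_width_bits / 8

print(peak_bandwidth_gbps(3.2, 5120))  # 2048.0 GB/s, i.e. ~2 TB/s
print(peak_bandwidth_gbps(2.4, 5120))  # 1536.0 GB/s, i.e. ~1.6 TB/s
```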
Still intended for AI-intensive computing platforms, this A100, better endowed with VRAM, should be able to process an even larger amount of data. Otherwise, the specifications are identical to those of the current A100: we find the same 6,912 FP32 CUDA cores and a GPU boost frequency of 1.41 GHz, for a single-precision computing power of 19.5 TFLOPS. Note that the two solutions will coexist in the coming months: the new A100 therefore does not replace the old one.
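The 19.5 TFLOPS figure can likewise be recomputed from the core count and clock, using the usual rule of two FP32 operations per core per cycle (one fused multiply-add); the helper below is just an illustration of that arithmetic:

```python
# Single-precision peak = CUDA cores x 2 FLOPs per cycle (fused multiply-add) x boost clock.
# Article figures: 6,912 FP32 CUDA cores at 1.41 GHz.

def peak_fp32_tflops(cuda_cores: int, boost_ghz: float) -> float:
    """Return peak FP32 throughput in TFLOPS (clock in GHz, so GFLOPS / 1000)."""
    return cuda_cores * 2 * boost_ghz / 1000

print(round(peak_fp32_tflops(6912, 1.41), 1))  # 19.5 TFLOPS
```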
GPUs fabricated in 7 nm by TSMC
The site AnandTech has drawn up the table below to facilitate comparison between the two versions. For its HBM2E memory, NVIDIA apparently sources from Samsung (SK Hynix's HBM2E offers a memory speed of 3.6 Gbit/s), the company also responsible for manufacturing the consumer RTX 3000 Ampere cards in 8 nm; production of these A100 GPUs, however, remains with TSMC, in 7 nm.
|GPU||A100 (80 GB)||A100 (40 GB)||V100|
|CUDA FP32 cores||6,912||6,912||5,120|
|Boost frequency||1.41 GHz||1.41 GHz||1.53 GHz|
|Memory speed||3.2 Gbit/s HBM2E||2.4 Gbit/s HBM2||1.75 Gbit/s HBM2|
|Memory bus width||5,120-bit||5,120-bit||4,096-bit|
|Memory bandwidth||2.0 TB/s||1.6 TB/s||900 GB/s|
|VRAM||80 GB||40 GB||16 GB/32 GB|
|Single precision||19.5 TFLOPS||19.5 TFLOPS||15.7 TFLOPS|
|Double precision||9.7 TFLOPS||9.7 TFLOPS||7.8 TFLOPS|
|INT8 Tensor||624 TOPS||624 TOPS||N/A|
|FP16 Tensor||312 TFLOPS||312 TFLOPS||125 TFLOPS|
|TF32 Tensor||156 TFLOPS||156 TFLOPS||N/A|
|Interconnect||NVLink 3 – 12 links (600 GB/s)||NVLink 3 – 12 links (600 GB/s)||NVLink 2 – 6 links (300 GB/s)|
|GPU die||GA100 (826 mm²)||GA100 (826 mm²)||GV100 (815 mm²)|
|Transistor count||54.2 billion||54.2 billion||21.1 billion|
|TDP||400 W||400 W||300 W/350 W|
|Process||TSMC 7N||TSMC 7N||TSMC 12 nm FFN|
For the moment, NVIDIA offers HGX and DGX configurations with 4 or 8 of these 80 GB A100 GPUs; the PCIe version remains, for now, the prerogative of the A100 unveiled in May.
In addition, NVIDIA offers a DGX Station A100, the replacement for the Volta-based DGX Station, equipped with four A100 GPUs. In this workstation, they work alongside a 64-core AMD EPYC CPU.
Source: NVIDIA