Processors with a large number of cores (manycores): Complete file

2023-11-07 23:00:00

Setting aside graphics processors (GPUs), which form an architectural class of their own, processors with a large number of cores differ from multi-core processors not only in core count but also in several other characteristics: the type and performance of the cores, the hierarchical organization into clusters (or nodes) of cores, the memory model (shared or distributed memory), and the software issues that arise because they are almost always used as coprocessors. These manycores are used in three main classes of applications:

high-performance mobile and embedded applications for which energy constraints are fundamental;

applications for which high performance is the most important criterion, such as scientific computing;

specialized circuits for training and inference in deep neural networks. This is the class of applications in which the greatest number of new circuits is appearing.

These characteristics are presented along with their variants. Six examples of manycore processors are then described in detail.

The first two are architectures intended for high-performance mobile and embedded applications and dissipate from a few watts to about thirty watts:

processors implementing Adapteva’s Epiphany architecture, with two versions in use, at 16 and 64 cores, and a 1024-core version that was a failure;

Kalray’s MPPA architecture, with a study of the MPPA2 and MPPA3 versions.

The following two examples are intended for high-performance computing and dissipate two to three hundred watts:

Intel’s Xeon Phi processors and coprocessors with the Knights Corner and Knights Landing models. Their production was abandoned in 2018;

the SW26010 manycore used in the Chinese supercomputer TaihuLight, which ranked first in the TOP500 list of supercomputers from June 2016 to November 2017.

The last two examples target the acceleration of training and inference in neural networks:

the Boqueria AI accelerator from Untether AI, made up of a 2D grid of memory blocks, each of which is itself a 2D grid of blocks combining an elementary processor and an SRAM;

Cerebras’ WSE-2 circuit, which interconnects a 2D grid of cores at wafer scale. Each core contains a compute part and an SRAM. With 850,000 cores and 15 kW of dissipated power, it was the largest circuit in 2022.

Manycore processors are not simply multicores with a greater number of cores, and the number of cores is very far from growing exponentially. While multicores use the shared memory model with a hierarchy of caches, manycores use the distributed memory model, with memory blocks placed close to the compute units. For neural networks, the use of reduced data formats (16-bit and 8-bit floats) makes it possible to fit more cores within a given power budget and chip area than circuits using double-precision floats.
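The effect of reduced data formats on local memory can be illustrated with a minimal sketch. It only counts how many operands of each width fit in one core's local SRAM, and says nothing about the area or power of the arithmetic units. The 48 KB budget and the function name `operands_per_core` are assumptions chosen for illustration, not figures taken from any of the circuits above. Python's `struct` module has no 8-bit float, so a 1-byte integer stands in for the 8-bit format.

```python
import struct

# Hypothetical per-core local memory budget, chosen only for illustration.
LOCAL_SRAM_BYTES = 48 * 1024

# struct format codes: 'd' = 64-bit float, 'e' = 16-bit float, 'b' = 8-bit integer.
# The 8-bit integer is a stand-in for an 8-bit float, which struct does not provide.
FORMATS = {"float64": "d", "float16": "e", "8-bit": "b"}


def operands_per_core(sram_bytes=LOCAL_SRAM_BYTES):
    """Return how many operands of each format fit in one core's local memory."""
    return {
        name: sram_bytes // struct.calcsize("<" + code)
        for name, code in FORMATS.items()
    }


if __name__ == "__main__":
    for name, count in operands_per_core().items():
        print(f"{name}: {count} operands")
```

Under these assumptions, moving from 64-bit to 16-bit operands lets the same local memory hold four times as many values, and moving to 8-bit operands eight times as many.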
