AI startup Groq launches language chip LPU! Its speed rolls over NVIDIA GPUs, and LLM response time all but disappears

2024-02-24 03:12:01

American AI chip startup Groq has drawn wide attention recently. The company has developed a language processing unit (LPU) chip whose large language model (LLM) inference performance is reportedly up to 18 times faster than Microsoft's Azure cloud service running on NVIDIA GPUs, potentially making it a competitor to NVIDIA's GPUs.

AI chip company Groq has recently attracted widespread attention on social media, claiming to have achieved "the fastest large language model speed in the world." In a demonstration video that went viral, its chatbot generated answers almost instantly, a response speed that makes the current versions of ChatGPT, Gemini, Grok, and a host of other AI chatbots look sluggish.

In recent benchmarks conducted by ArtificialAnalysis.ai, Groq far outperformed eight competitors on several key performance indicators, including throughput and total response time. Groq generates about 247 tokens per second, compared with about 18 tokens per second for Microsoft Azure; running ChatGPT on Groq would therefore speed up generation by roughly 13 times.
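
As a quick sanity check, the "roughly 13 times" figure is simply the ratio of the two reported throughputs. A back-of-envelope calculation using the numbers above:

```python
# Back-of-envelope check of the speedup implied by the benchmark figures above.
groq_tokens_per_sec = 247   # Groq LPU throughput reported by ArtificialAnalysis.ai
azure_tokens_per_sec = 18   # Microsoft Azure (GPU-backed) throughput from the same test

speedup = groq_tokens_per_sec / azure_tokens_per_sec
print(f"Implied speedup: {speedup:.1f}x")  # ~13.7x, i.e. the "13 times" cited
```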

Source: ArtificialAnalysis.ai

According to a Cryptopolitan report, Groq achieves these results because it has developed a new AI chip, the Language Processing Unit (LPU), and uses this self-developed chip to run the language interfaces of open-source AI models. The LPU is designed to overcome the limits of older technologies such as CPUs and GPUs.

Traditional processing architectures often cannot keep up with the enormous computing requirements of large language models (LLMs), but Groq implements LLM operations with its new tensor streaming processor (TSP) architecture. With fast inference and lower power consumption, the TSP and LPU are expected to change the way data is processed.

The LPU is designed for deterministic AI operations and breaks away from the GPU's traditional single instruction, multiple threads (SIMT) model. This shift can improve performance and reduce energy consumption, making the LPU a more environmentally friendly choice going forward.

In a tweet, senior risk architect k_zer0s noted that Groq's LPU is about 20 times faster than a GPU. Because inference runs use far less data than model training, the LPU is more energy-efficient: compared with NVIDIA GPUs on inference tasks, it reads less data from external memory and consumes less power.

Free trial now open

Groq's official website currently offers a free trial of three AI chatbots running different LLMs: Meta's Llama 2 and Mistral AI's Mixtral-8x7B and Mistral 7B. Users can experience the fast, LPU-powered response speeds for free.
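
For readers who want to reproduce a throughput number like the ones cited above, the sketch below shows the usual method: time a completion request and divide the tokens generated by the elapsed time. This is a minimal illustration, not Groq's own benchmark; the endpoint URL, API key, and model name are placeholder assumptions standing in for whatever OpenAI-compatible service you actually test.

```python
import time
import requests

# Hypothetical OpenAI-compatible chat endpoint; substitute the real base URL
# and API key of the service you want to benchmark.
API_URL = "https://api.example.com/v1/chat/completions"
API_KEY = "YOUR_API_KEY"

payload = {
    "model": "mixtral-8x7b",  # illustrative name for one of the LLMs Groq hosts
    "messages": [{"role": "user", "content": "Explain what an LPU is in one paragraph."}],
    "max_tokens": 256,
}

start = time.monotonic()
resp = requests.post(
    API_URL,
    json=payload,
    headers={"Authorization": f"Bearer {API_KEY}"},
    timeout=60,
)
resp.raise_for_status()
elapsed = time.monotonic() - start

# OpenAI-compatible responses typically report how many tokens the completion used.
completion_tokens = resp.json()["usage"]["completion_tokens"]
print(f"{completion_tokens} tokens in {elapsed:.2f}s "
      f"-> {completion_tokens / elapsed:.1f} tokens/sec")
```

Note that timing a single non-streaming request folds network latency and prompt processing into the result, so it understates peak generation speed; benchmarks like ArtificialAnalysis.ai's typically stream tokens and measure throughput and time-to-first-token separately.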

📍Related reports📍

NVIDIA's earnings report is out: revenue up 265% year-on-year, beating expectations, and the stock rose 9% after hours! AI concept coins are climbing (TAO, RNDR, AGIX..)

SoftBank's Masayoshi Son launches "Izanagi" project to take on NVIDIA! New AI chip venture plans to raise US$100 billion

"Stock goddess" Cathie Wood's favorite is not NVIDIA or Tesla: Ark Invest's largest holding is this crypto company

