Due to U.S. export restrictions, it is difficult for China to obtain key components such as HBM. Huawei aims to reduce the Chinese technology market's reliance on HBM: according to market reports, it has developed a new software tool called "Unified Cache Manager" (UCM) that can accelerate the training and inference of large language models (LLMs) without using HBM.
Huawei unveiled the UCM tool at the 2025 Financial AI Reasoning Application Implementation and Development Forum on Tuesday (the 12th). The software allocates AI data across HBM, standard DRAM, and SSDs according to the latency characteristics of each memory type and the latency requirements of different AI applications.
Zhou Yuefeng, vice president of Huawei's data storage product line, pointed out that AI inference currently faces three major problems: "cannot push" (the input is too long and exceeds the processing range), "slow push" (the response is too slow), and "expensive push" (the computation cost is too high).
Through a hierarchically tiered key-value (KV) cache architecture, UCM extracts the maximum speed from whatever memory resources the system has available.
According to reports, the UCM released here is an inference-acceleration suite centered on the KV cache. It integrates multiple cache-acceleration algorithms to manage the KV-cache data generated during inference in a hierarchical manner, expanding the inference context window, delivering a high-throughput, low-latency inference experience, and reducing the inference cost per token.
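Hierarchical KV-cache management of the kind described can be pictured as a small LRU hierarchy that demotes cold cache blocks from HBM toward DRAM and SSD, and promotes them back on access. This is a minimal sketch under assumed tier capacities and a plain recency policy; it does not reflect UCM's actual internals:

```python
from collections import OrderedDict

# Hypothetical tier capacities (in cache blocks). A real system would size
# these from device latency and bandwidth, not fixed constants.
CAPACITY = {"HBM": 4, "DRAM": 8, "SSD": 1_000_000}
ORDER = ["HBM", "DRAM", "SSD"]  # fastest to slowest

class TieredKVCache:
    def __init__(self):
        self.tiers = {name: OrderedDict() for name in ORDER}

    def put(self, block_id, data, tier="HBM"):
        self.tiers[tier][block_id] = data
        self.tiers[tier].move_to_end(block_id)  # mark as most recent
        self._evict(tier)

    def _evict(self, tier):
        # On overflow, demote least-recently-used blocks down the hierarchy.
        idx = ORDER.index(tier)
        while len(self.tiers[tier]) > CAPACITY[tier]:
            victim, data = self.tiers[tier].popitem(last=False)
            if idx + 1 < len(ORDER):
                self.put(victim, data, ORDER[idx + 1])

    def get(self, block_id):
        # A hit on a slower tier promotes the block back into HBM.
        for name in ORDER:
            if block_id in self.tiers[name]:
                data = self.tiers[name].pop(block_id)
                self.put(block_id, data)
                return data
        return None

cache = TieredKVCache()
for i in range(6):
    cache.put(i, f"kv-block-{i}")
print(sorted(cache.tiers["HBM"]))   # → [2, 3, 4, 5]
print(sorted(cache.tiers["DRAM"]))  # → [0, 1]  (demoted out of HBM)
```

The point of the sketch is the placement decision: hot blocks stay in the fastest, scarcest tier, while cold blocks spill to cheaper memory instead of forcing recomputation or rejection of long contexts.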
UCM underwent an extended period of practical testing before its release. Zhou Yuefeng noted that UCM has been piloted at China UnionPay in scenarios including customer-voice analysis, marketing planning, and office assistance. According to Huawei's figures, using UCM in combination with traditional caching and storage technologies in AI inference tasks can reduce latency by up to 90% and increase throughput by up to 22 times.
Huawei plans to officially open-source UCM in September, launching it in the Magic Engine community, then gradually contributing it to the industry's mainstream inference-engine communities and sharing it with all Share Everything (shared-architecture) storage vendors and ecosystem partners.
Since almost all HBM chips are produced by SK Hynix, Samsung, and Micron, the United States has worked to block China from purchasing or domestically producing HBM; the emergence of UCM is expected to help China escape these constraints and accelerate the independence of its chip technology.
(First image source: Flickr/Web Summit CC BY 2.0)
Further reading:
How does UCM optimize AI models' access to computational resources?
Huawei unveils revolutionary AI Technology “UCM,” Elevating AI Reasoning Speed by 90% and Reducing HBM Dependence
Huawei has just unveiled a groundbreaking advancement in Artificial Intelligence (AI) – the "UCM" technology. This innovation promises to significantly impact machine learning, deep learning, and the broader AI landscape. This article delves into UCM, exploring its capabilities, benefits, and implications for the future of AI.
The Core of UCM: A Paradigm Shift in AI Processing
UCM, or "Unified Cache Manager," represents a fundamental shift in how AI inference workloads are processed. This technology optimizes the way AI models access and utilize memory and computational resources.
Key Features and Capabilities
Enhanced Reasoning Speed: Huawei reports that UCM can cut inference latency by up to 90%. This acceleration allows faster processing of long contexts and improved responsiveness in AI applications.
Reduced HBM Dependence: One of UCM's most important advantages is its reduced dependence on High Bandwidth Memory (HBM). This is a crucial development, as HBM is a costly and often bottlenecked component in AI hardware. This translates to potentially lower hardware costs and improved energy efficiency.
Optimized Resource Utilization: UCM intelligently manages and distributes computational resources, leading to superior performance across various workloads.
Scalability and Adaptability: UCM is designed to be scalable, allowing it to adapt to a wide range of AI applications, from edge computing to large-scale data centers.
Improved AI performance across a variety of workloads.
The Benefits of UCM: Real-World Implications
The introduction of UCM has far-reaching implications, offering numerous benefits across different industries:
1. Accelerated AI Applications
Faster Image Recognition: UCM’s speed improvements will dramatically enhance image recognition tasks, enabling more rapid and accurate analysis.
Improved Natural Language Processing (NLP): This technology will significantly boost the performance of NLP models, resulting in more sophisticated language understanding and generation.
Enhanced Real-Time Decision Making: UCM’s increased processing speed makes real-time decision-making in areas like autonomous vehicles and robotics significantly more effective.
2. Cost Optimization
Cost Reduction in HBM: By reducing reliance on expensive HBM, UCM offers potential cost savings for businesses deploying AI solutions.
Improved Energy Efficiency: Faster processing times often translate to lower energy consumption, making AI operations more sustainable and cost-effective.
3. Broader Accessibility
Democratization of AI: The potential for lower hardware costs and enhanced performance opens up AI to a wider range of developers and businesses.
Edge AI Advancement: UCM’s efficiencies can make powerful AI applications more viable at the edge, enabling smarter devices and infrastructure.
UCM and the Future of AI: What to Expect
Huawei’s UCM technology paves the way for an exciting future in artificial intelligence.
Potential advancements include:
Advanced AI Chip Designs: Building on the concept of UCM, we can anticipate more efficient and specialized AI chip designs.
Improved AI model Development: UCM will facilitate the development of more complex and sophisticated AI models capable of tackling more challenging tasks.
New AI Applications: The technology’s efficiency and adaptability will spur innovation in AI applications across several sectors.
Market Impact and Competitive Landscape
The release of UCM will undoubtedly make Huawei a major player in the AI technology industry. The competition is likely to respond by developing similar advancements in AI acceleration and AI chip design.
Practical Tips for Developers and Enterprises
Here’s advice for those looking to leverage UCM:
Stay Updated: Remain current on the latest announcements and documentation from Huawei concerning UCM.
Pilot Programs: Explore opportunities to participate in pilot programs and gain hands-on experience with UCM.
Compatibility: Make sure your hardware and software frameworks are compatible with UCM.
Optimization: Analyze your AI workloads to determine where UCM can provide the greatest performance gains.
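For the workload-analysis step, one crude but useful signal is how often incoming prompts repeat an earlier prefix, since repeated prefixes are exactly what a KV-cache reuse layer can serve from cheaper memory tiers instead of recomputing. The request log and the `reuse_ratio` helper below are made up for illustration:

```python
# Estimate what fraction of prompt tokens repeat a previously seen prefix.
# A high ratio suggests a KV-cache tiering/reuse layer has room to help.

def shared_prefix_tokens(a, b):
    """Length of the common prefix of two token sequences."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n

def reuse_ratio(requests):
    seen, reused, total = [], 0, 0
    for prompt in requests:
        best = max((shared_prefix_tokens(prompt, p) for p in seen), default=0)
        reused += best
        total += len(prompt)
        seen.append(prompt)
    return reused / total if total else 0.0

# Token IDs standing in for a shared system prompt plus per-user suffixes.
log = [
    [1, 2, 3, 4, 10, 11],
    [1, 2, 3, 4, 20, 21, 22],
    [1, 2, 3, 4, 10, 30],
]
print(f"{reuse_ratio(log):.0%} of prompt tokens were reusable")  # → 47%
```

Workloads dominated by a shared system prompt or repeated conversation history score high on this measure; fully unique prompts score near zero and would see less benefit from cache reuse.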
By understanding and leveraging UCM, developers and businesses can position themselves at the forefront of AI innovation.


