On February 27, 2026, PointFive launched DeepWaste™ AI, a new module designed to optimize the cost and performance of artificial intelligence workloads across major cloud providers. The tool aims to address the increasing complexity of production AI, moving beyond simple volume metrics to analyze the interconnected decisions that govern AI execution.
As AI deployments scale, PointFive argues, traditional cloud optimization tools fall short. The company identifies key drivers of AI cost – model selection, token consumption, routing logic, caching behavior, GPU utilization, retry patterns, and data orchestration – as interconnected elements that require a holistic approach. A suboptimal routing choice, for example, can directly increase token usage and associated expenses.
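The routing-to-token-cost interaction described above can be made concrete with a small sketch. Everything here is invented for illustration: the model names, per-token prices, request counts, and the 80/20 split of simple versus complex requests are assumptions, not PointFive's data.

```python
# Hypothetical illustration of how a routing choice drives token spend.
# Model names and per-1K-token prices below are invented for the example.
PRICE_PER_1K_TOKENS = {
    "large-model": 0.03,   # capable but expensive
    "small-model": 0.002,  # cheaper, adequate for simple tasks
}

def request_cost(model: str, tokens: int) -> float:
    """Cost of one request at the hypothetical prices above."""
    return PRICE_PER_1K_TOKENS[model] * tokens / 1000

# A router that sends every request, however simple, to the large model:
naive = sum(request_cost("large-model", 800) for _ in range(10_000))

# A router that downgrades the ~80% of requests that are simple:
routed = (sum(request_cost("small-model", 800) for _ in range(8_000))
          + sum(request_cost("large-model", 800) for _ in range(2_000)))

print(f"naive routing:  ${naive:,.2f}")
print(f"smart routing:  ${routed:,.2f}")
```

Under these assumed prices the naive router spends several times more for the same workload, which is the kind of interconnection between routing logic and token consumption the company describes.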
DeepWaste AI connects natively to AWS (Bedrock, SageMaker, and AI managed services), Azure (Azure OpenAI, Azure ML, Cognitive Services), GCP (Vertex AI and AI services), OpenAI, and Anthropic direct APIs. This multi-cloud support is intended to address the reality that organizations frequently operate across different cloud environments, often combining provider-managed services with direct API access. The tool normalizes data from these diverse sources to provide a consistent view of AI service performance.
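Normalizing usage data across providers, as described above, might look something like the following sketch. The payload field names (`modelId`, `inputTokenCount`, `prompt_tokens`, and so on) and the unified schema are assumptions for illustration only, not PointFive's actual data model.

```python
# Hypothetical sketch: normalizing per-request usage records from
# differently shaped provider payloads into one consistent schema.
# All field names below are assumed for illustration.
from dataclasses import dataclass

@dataclass
class UsageRecord:
    provider: str
    model: str
    input_tokens: int
    output_tokens: int
    cost_usd: float

def from_bedrock(raw: dict) -> UsageRecord:
    # AWS Bedrock-style payload (field names are an assumption)
    return UsageRecord("aws-bedrock", raw["modelId"],
                       raw["inputTokenCount"], raw["outputTokenCount"],
                       raw["costUsd"])

def from_openai(raw: dict) -> UsageRecord:
    # OpenAI-style payload (field names are an assumption)
    u = raw["usage"]
    return UsageRecord("openai", raw["model"],
                       u["prompt_tokens"], u["completion_tokens"],
                       raw["cost_usd"])

records = [
    from_bedrock({"modelId": "model-a", "inputTokenCount": 500,
                  "outputTokenCount": 200, "costUsd": 0.012}),
    from_openai({"model": "model-b", "cost_usd": 0.020,
                 "usage": {"prompt_tokens": 900, "completion_tokens": 300}}),
]
total = sum(r.cost_usd for r in records)
print(f"total normalized spend: ${total:.3f}")
```

Once records share a schema, spend and performance can be compared across Bedrock, Azure OpenAI, Vertex AI, and direct APIs in one view, which is the consistency the article attributes to the tool.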
The platform extends beyond inference to include GPU and data platform optimization. For GPUs, DeepWaste AI identifies underutilized or idle resources, instance-type mismatches, and configuration issues. It also integrates with Snowflake and Databricks, aiming to provide end-to-end visibility from data ingestion through inference by linking data platform orchestration to execution costs.
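Flagging idle GPU resources from utilization metrics, one of the checks mentioned above, can be sketched in a few lines. The 10% threshold, node names, and utilization samples are all invented for the example; the actual detection logic is PointFive's and is not public.

```python
# Hypothetical sketch: flagging underutilized GPU instances from
# utilization samples. Threshold and fleet data are invented.
IDLE_THRESHOLD = 0.10   # mean utilization below 10% counts as idle

def mean_utilization(samples: list[float]) -> float:
    return sum(samples) / len(samples)

fleet = {
    "gpu-node-1": [0.92, 0.88, 0.95],  # busy training node
    "gpu-node-2": [0.02, 0.00, 0.05],  # forgotten dev instance
}

idle = [name for name, samples in fleet.items()
        if mean_utilization(samples) < IDLE_THRESHOLD]
print(idle)  # candidates for rightsizing or shutdown
```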
A key feature of DeepWaste AI is its agentless architecture: the tool connects directly to cloud APIs and metrics without requiring agents, instrumentation, or code changes. PointFive emphasizes that this approach minimizes data access requirements and preserves privacy, operating by default on metadata, billing signals, performance metrics, and resource configuration data. Optional deeper analysis at the inference level is available, with customers retaining control over the scope of data access.
DeepWaste AI categorizes inefficiencies across four layers: Model & Routing Intelligence, Token & Prompt Economics, Caching & Reuse Optimization, and Infrastructure & Operational Leakage. The Model & Routing layer focuses on issues like model-task mismatch and inefficient routing. Token & Prompt Economics addresses prompt bloat and overprovisioning. Caching & Reuse Optimization identifies duplicate inference and underused caching mechanisms. Finally, Infrastructure & Operational Leakage detects idle GPUs, instance mismatches, and retry-driven cost inflation.
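The four-layer taxonomy can be sketched as a simple mapping from detected findings to layers. The finding identifiers below are invented examples drawn from the issues named in the article; the layer names come from the announcement itself.

```python
# Hypothetical sketch: mapping example findings to the four layers
# named in the announcement. Finding identifiers are invented.
LAYERS = {
    "model_task_mismatch":    "Model & Routing Intelligence",
    "inefficient_routing":    "Model & Routing Intelligence",
    "prompt_bloat":           "Token & Prompt Economics",
    "token_overprovisioning": "Token & Prompt Economics",
    "duplicate_inference":    "Caching & Reuse Optimization",
    "underused_cache":        "Caching & Reuse Optimization",
    "idle_gpu":               "Infrastructure & Operational Leakage",
    "retry_inflation":        "Infrastructure & Operational Leakage",
}

def layer_of(finding: str) -> str:
    """Return the layer a finding belongs to, or a fallback bucket."""
    return LAYERS.get(finding, "Uncategorized")

print(layer_of("prompt_bloat"))
```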
The tool provides quantified savings estimates and implementation guidance, prioritizing recommendations by financial impact and mapping them to engineering and FinOps workflows. According to PointFive, this shifts the focus from reactive monitoring to continuous optimization across models, infrastructure, and data platforms.
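Prioritizing recommendations by financial impact, as described above, amounts to ranking findings by their estimated savings. The savings figures and finding names below are invented for illustration; how PointFive actually quantifies impact is not disclosed in the announcement.

```python
# Hypothetical sketch: ranking recommendations by estimated monthly
# savings so the highest-impact fixes surface first. Figures invented.
recs = [
    {"finding": "idle_gpu",            "est_savings_usd": 4200.0},
    {"finding": "prompt_bloat",        "est_savings_usd": 800.0},
    {"finding": "duplicate_inference", "est_savings_usd": 1500.0},
]

ranked = sorted(recs, key=lambda r: r["est_savings_usd"], reverse=True)
for r in ranked:
    print(f"{r['finding']:<20} ${r['est_savings_usd']:>8,.0f}/mo")
```

Each ranked item would then carry implementation guidance and route to the relevant engineering or FinOps workflow, per the article's description.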
“AI workloads introduce a new category of operational complexity,” said Alon Arvatz, CEO of PointFive. “DeepWaste AI gives organizations the intelligence required to scale AI efficiently, across models, infrastructure, and data platforms, without sacrificing control.”
DeepWaste AI is currently available to PointFive customers.