Observability for AI Agents in 2025 | Transform 2025

Agentic AI Revolution: Transforming Observability and Software Development

The autonomous software revolution is not a distant dream but a rapidly approaching reality. At Transform 2025, Ashan Willy, CEO of New Relic, and Sam Witteveen, CEO and co-founder of Red Dragon AI, discussed how to instrument agentic systems for measurable return on investment (ROI) and how to build the infrastructure roadmap needed to fully leverage agentic artificial intelligence (AI).

The Rise of Agentic Observability

New Relic offers observability solutions that capture and correlate application, log, and infrastructure telemetry in real time. This goes beyond simple monitoring: it gives teams the context and insight they need to understand, troubleshoot, and optimize complex systems, even when unexpected issues arise. The task has become more complex with the integration of generative and agentic AI. The company now monitors a range of AI models, including Nvidia NIM, DeepSeek, and ChatGPT, with AI monitoring usage up approximately 30% quarter-over-quarter.

“The diversity in models is staggering,” Willy stated. “Enterprises initially focused on GPT but are now exploring a wide range of models. We’ve seen about a 92% increase in the variance of models being used, with enterprises increasingly adopting multiple models. The key question now is: How do you measure the effectiveness of these models?”

Observability in the Age of Agentic Systems

How observability should evolve is an open question. Use cases differ dramatically across industries, and requirements vary with each company’s size and objectives. For example, a financial firm might concentrate on maximizing Earnings Before Interest, Taxes, Depreciation, and Amortization (EBITDA) margins, while a product company might prioritize speed to market and quality control.

Since New Relic’s inception in 2008, observability has centered on application monitoring for Software as a Service (SaaS), mobile, and cloud infrastructure. Today, the proliferation of AI and agentic AI is bringing the focus of observability back to applications, as agents, micro-agents, and nano-agents actively generate AI-written code.

The Role of AI in Enhancing Observability

The increasing number of services and microservices, particularly in digitally native organizations, places a significant cognitive burden on the people managing observability tasks. AI offers a solution to this challenge.

“The future involves a cooperative mode where you have enough information to work effectively,” Willy explained. “The promise of agents in observability is to automate routine workloads, making insights accessible to a broader audience.”

The Power of a Unified Agentic Observability Platform

A single platform designed for observability capitalizes on the agentic world. Agents not only automate workflows but also deeply integrate into an organization’s ecosystem, encompassing tools like Harness, GitHub, and ServiceNow. Agentic AI enables developers to receive immediate alerts about code errors anywhere in the system and resolve them without leaving their coding environment.

For example, if code deployed in GitHub encounters an issue, an agent-powered observability platform can detect it, determine the appropriate solution, and either alert the engineer or fully automate the resolution process.

“Our agent analyzes every piece of information available on our platform,” Willy said. “This includes application performance, the health of the underlying Azure or Amazon Web Services (AWS) infrastructure – anything relevant to that code deployment. We refer to these as agentic skills, leveraging our own Application Programming Interfaces (APIs) rather than relying on third parties.”

Within GitHub, developers receive notifications about code performance, error handling, and even the necessity for software rollbacks, with automated rollbacks available upon developer approval. New Relic recently announced a collaboration with Copilot coding agent to pinpoint the exact lines of code causing issues. Copilot then corrects the problem and prepares a new version for deployment.
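The detect, decide, then alert-or-roll-back flow described above can be sketched as a small decision function. This is only an illustration, not New Relic’s implementation: the `DeploymentHealth` fields, the `decide` function, and the 1% tolerance are all hypothetical names and values chosen for the example.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    NONE = "none"
    ALERT_ENGINEER = "alert_engineer"
    AUTO_ROLLBACK = "auto_rollback"


@dataclass
class DeploymentHealth:
    error_rate_before: float  # fraction of failed requests before the deploy
    error_rate_after: float   # fraction of failed requests after the deploy
    rollback_approved: bool   # whether a developer has approved automation


def decide(health: DeploymentHealth, tolerance: float = 0.01) -> Action:
    """Pick a response to a deployment (hypothetical policy)."""
    regression = health.error_rate_after - health.error_rate_before
    if regression <= tolerance:
        return Action.NONE  # no meaningful regression detected
    if health.rollback_approved:
        return Action.AUTO_ROLLBACK
    return Action.ALERT_ENGINEER
```

Keeping the policy in one pure function makes it easy to test and to audit which signals led to an automated action.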

The Future Landscape of Agentic AI

Organizations adopting agentic AI will find that observability is crucial to its functionality, according to Willy.

“As you integrate agentic components, you need to understand what each agent is doing,” he explained. “This reasoning extends to the infrastructure, helping you understand what’s happening in your production environment. Observability provides this insight, and we are at the forefront of this evolution.”

Did You Know?

According to a recent survey, companies using AI-powered observability tools have reported a 40% reduction in incident resolution time.

Key Takeaways of Agentic AI

Agentic AI is transforming how organizations approach software development and system management. By automating routine tasks and providing real-time insights, these systems are improving efficiency, reducing errors, and enabling developers to focus on innovation.

Feature | Traditional Monitoring | Agentic Observability
Automation | Manual processes | Automated workflows
Insights | Delayed, reactive | Real-time, proactive
Integration | Limited | Seamless across tools
Error Resolution | Time-consuming | Faster, AI-assisted

Frequently Asked Questions About Agentic AI

  1. What exactly is agentic AI and how does it differ from traditional AI?
  2. In what ways does agentic AI improve observability compared to traditional monitoring systems?
  3. What are the main advantages of using a single platform for agentic observability?
  4. How can AI observability help in managing complex systems and microservices?
  5. What does the future hold for agentic AI in software development and what advancements can we expect?
  6. Why is observability critical for the effective functioning of agentic AI systems?

How do you see agentic AI impacting your organization’s software development processes? What challenges do you anticipate in adopting these new technologies?

Share your thoughts and experiences in the comments below!

The Rise of AI Agents and the Observability Imperative

AI agents are rapidly evolving, becoming more sophisticated and more deeply integrated into many aspects of our lives. From customer service chatbots to autonomous vehicles, the complexity and critical nature of these agents call for a robust approach to observability. In 2025, the ability to truly “see” inside your AI agents, understanding their behavior, identifying anomalies, and optimizing performance, is no longer optional; it’s essential. This article explores the key aspects of AI agent observability and the strategies and tools that support it.

What is AI Agent Observability?

Observability, in the context of AI agents, is the ability to understand the internal states of an AI system from its external outputs. It relies on collecting and analyzing data from several sources, including:

  • Metrics: Quantifiable data points (e.g., latency, throughput, resource utilization).
  • Logs: Detailed records of events, actions, and errors.
  • Traces: Tracking the flow of requests through the AI agent’s different components and services.

The goal is to provide a clear view of an AI agent’s health, performance, and behavior, enabling rapid detection and resolution of issues. Key Performance Indicators (KPIs) are critical in this analysis, providing actionable insights.
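To make the idea of a trace concrete, here is a minimal sketch that records timed, nested spans for one agent request using only the Python standard library. It is not the API of any tracing product; `span`, `SPANS`, `handle_request`, and the two step names are hypothetical, and the retrieval and model calls are stand-ins.

```python
import time
import uuid
from contextlib import contextmanager

SPANS = []  # finished spans, in completion order


@contextmanager
def span(name, trace_id, parent_id=None):
    """Record how long a named step takes within one trace."""
    span_id = uuid.uuid4().hex[:8]
    start = time.perf_counter()
    try:
        yield span_id
    finally:
        SPANS.append({
            "trace_id": trace_id,
            "span_id": span_id,
            "parent_id": parent_id,
            "name": name,
            "duration_s": time.perf_counter() - start,
        })


def handle_request(question):
    trace_id = uuid.uuid4().hex  # shared by every span of this request
    with span("handle_request", trace_id) as root:
        with span("retrieve_context", trace_id, parent_id=root):
            context = f"docs about {question}"  # stand-in for a retrieval call
        with span("generate_answer", trace_id, parent_id=root):
            answer = f"answer using {context}"  # stand-in for a model call
    return answer
```

Because every span carries the same `trace_id` and a `parent_id`, the steps of a single request can be reassembled into a tree and the slowest step identified.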

Key Pillars of AI Agent Observability in 2025

Effective AI agent observability relies on a combination of strategies:

1. Extensive Monitoring

Real-time monitoring of AI agent performance is paramount. This involves setting up dashboards and alerts to track critical metrics. Tools to consider include Prometheus and Grafana for metric collection and visualization. Consider metrics such as:

  • Response Time: How quickly the agent responds to user requests.
  • Accuracy: The agent’s ability to correctly interpret and respond.
  • Failure Rate: The percentage of requests that the agent fails to process correctly.
  • Resource Utilization: CPU, memory, and network usage.
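As a rough illustration, the first three metrics above can be derived from per-request records as in the sketch below. The `RequestRecord` fields and the `summarize` function are assumptions made for this example, and how `correct` is determined (human review, automated evaluation, and so on) is left open. Resource utilization usually comes from the host or container runtime and is not covered here.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class RequestRecord:
    latency_s: float   # time taken to respond
    succeeded: bool    # False if the agent failed to process the request
    correct: bool      # outcome of a (hypothetical) correctness check


def summarize(records):
    """Aggregate per-request records into dashboard-style metrics."""
    if not records:
        raise ValueError("no records to summarize")
    completed = [r for r in records if r.succeeded]
    return {
        "avg_response_time_s": mean(r.latency_s for r in records),
        "failure_rate": 1 - len(completed) / len(records),
        # Accuracy is measured over completed requests only.
        "accuracy": (sum(r.correct for r in completed) / len(completed)
                     if completed else 0.0),
    }
```

In practice, averages hide tail latency, so percentiles are often tracked alongside the mean.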

2. Detailed Logging and Tracing

Logging and tracing are essential for in-depth troubleshooting. Logs provide context around events, while tracing helps follow the path of a request through the system. Key considerations:

  • Structured Logging: Use formats like JSON for easy parsing and analysis.
  • Centralized Logging: Implement a system like the ELK Stack (Elasticsearch, Logstash, Kibana) to aggregate logs from multiple sources.
  • Distributed Tracing: Tools like Jaeger or Zipkin are vital for understanding the flow of requests and identifying bottlenecks in distributed AI agent architectures. An error-tracking tool such as Sentry can complement them by capturing exceptions and grouping recurring errors.
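A minimal sketch of structured logging with Python’s standard `logging` module follows. The `JsonFormatter` class, the `build_logger` helper, and the extra field names (`trace_id`, `agent`, `tool`) are illustrative choices, not part of any particular logging stack; a timestamp is omitted for brevity and would normally be included.

```python
import json
import logging


class JsonFormatter(logging.Formatter):
    """Render each log record as a single JSON object."""

    def format(self, record):
        payload = {
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        # Fields passed via `extra=` become attributes on the record.
        for key in ("trace_id", "agent", "tool"):
            if hasattr(record, key):
                payload[key] = getattr(record, key)
        return json.dumps(payload)


def build_logger(name="agent"):
    """Return a logger that writes one JSON object per line to stderr."""
    logger = logging.getLogger(name)
    handler = logging.StreamHandler()
    handler.setFormatter(JsonFormatter())
    logger.handlers = [handler]
    logger.setLevel(logging.INFO)
    logger.propagate = False
    return logger
```

A call such as `build_logger().info("tool call finished", extra={"trace_id": "abc123", "tool": "search"})` then emits one JSON line that a centralized log system can parse without custom rules, and the shared `trace_id` lets logs be joined with traces.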

3. Advanced Anomaly Detection

Anomaly detection for AI agents goes beyond simple threshold-based alerting. In 2025, teams increasingly rely on machine learning models to proactively identify unusual patterns and behaviors that may indicate problems. Use tools that offer:

  • Baseline Learning: Anomaly detection that learns the normal behavior of your AI agents.
  • Time-Series Analysis: Evaluates how metrics change over time, so that trends and seasonal patterns are not mistaken for anomalies.
  • Customizable Alerts: Set up alerts to notify you of any unexpected change in your system.
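Baseline learning can be illustrated with a deliberately simple statistical stand-in: a rolling z-score that compares each new value with the mean and standard deviation of the preceding window. This is a sketch of the idea rather than what any specific tool does; `find_anomalies`, the window size, and the threshold are hypothetical, and production systems typically use richer models.

```python
from statistics import mean, stdev


def find_anomalies(values, window=20, threshold=3.0):
    """Return indices whose value deviates strongly from the recent baseline."""
    anomalies = []
    for i in range(window, len(values)):
        baseline = values[i - window:i]  # the "learned" normal behavior
        mu = mean(baseline)
        sigma = stdev(baseline)
        if sigma == 0:
            # Flat baseline: any change at all is unusual.
            if values[i] != mu:
                anomalies.append(i)
            continue
        if abs(values[i] - mu) / sigma > threshold:
            anomalies.append(i)
    return anomalies
```

One known limitation of this approach is that a large spike inflates the baseline’s standard deviation for the next `window` points, which can mask follow-up anomalies.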

4. Data Visualization and Dashboards

Effective data presentation is key to understanding. Create dashboards that summarize critical information at a glance and display key metrics and logs in an easy-to-understand format. Grafana and Kibana are common choices.

Tools and Technologies for AI Agent Observability in 2025

Several tools are available to help you build a robust AI agent observability system. Here are some of the most important ones to consider in 2025:

Area | Tool | Purpose
Metrics Collection | Prometheus | Collect and store time-series data (metrics).
Logging | ELK Stack (Elasticsearch, Logstash, Kibana) | Aggregate, process, and visualize logs.
Tracing | Jaeger or Zipkin | Trace requests through distributed systems.
Dashboarding | Grafana, Kibana | Visualize data from various sources.

Consider how these apply to specific AI use cases. For example, chatbots may require different monitoring methods than autonomous vehicles.

Practical Tips and Best Practices for AI Agent Observability

  • Instrument Your Code: Add logging and metric collection to all code components from the beginning.
  • Define SLOs and SLAs: Establish Service Level Objectives (SLOs) and Service Level Agreements (SLAs) and monitor against them.
  • Automate Alerting: Set up alerts for key metrics to proactively identify issues.
  • Embrace Infrastructure as Code (IaC): Use tools like Terraform or Ansible to manage and deploy your observability infrastructure alongside your AI agents.
  • Regularly Review and Improve: Observability is not a one-time setup. Continuously refine your monitoring, logging, and alerting based on feedback and evolving needs.
  • Focus on User Experience: Track how the agent behaves from the user’s point of view, including user satisfaction metrics, not just system-level health.
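Monitoring against an SLO often comes down to tracking an error budget, which the short sketch below computes for an availability-style target. The function name `error_budget` and the returned keys are hypothetical, and the figures in the usage note are made-up inputs, not measurements.

```python
def error_budget(slo_target, total_requests, failed_requests):
    """Summarize how much of an availability SLO's error budget is used."""
    if not 0 < slo_target < 1:
        raise ValueError("slo_target must be between 0 and 1")
    if total_requests <= 0:
        raise ValueError("total_requests must be positive")
    # Number of failures the SLO tolerates over this many requests.
    allowed_failures = (1 - slo_target) * total_requests
    return {
        "availability": 1 - failed_requests / total_requests,
        "allowed_failures": allowed_failures,
        "budget_consumed": failed_requests / allowed_failures,
    }
```

For instance, with a 99.9% target over 1,000,000 requests, roughly 1,000 failures are allowed, so 250 failures would consume about a quarter of the budget. Alerting on how fast the budget is being consumed tends to be less noisy than alerting on individual errors.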

Future Trends in AI Agent Observability

The landscape of AI agent observability will continue to evolve quickly. Expect to see:

  • AI-driven Observability: Use AI models to automatically detect anomalies, predict performance issues, and suggest optimizations.
  • Explainable AI (XAI) Integration: Provide greater transparency into how AI agents make decisions, linking observability data to explainable insights.
  • Edge Observability: As AI agents move to the edge, robust monitoring and management of distributed deployments become crucial.
  • Privacy-preserving techniques: Anonymization of data to preserve user privacy while still ensuring operational insights.

Conclusion

Establishing mature observability practices is essential for the successful deployment and operation of AI agents in 2025. By embracing comprehensive monitoring, detailed logging, and proactive anomaly detection, organizations can unlock the full potential of their AI investments. As tools and technologies continue to evolve, monitoring and analytics will become more robust, leading to more reliable and efficient AI agent systems.
