Splunk Agentic Ops Hackathon · 2026

AI Agent Observability Platform

YOUR AGENTS ARE BEING WATCHED

AgentWatch wraps any LangGraph agent with OpenTelemetry, streams every decision into Splunk in real time, detects loops and anomalies automatically, and explains them in plain English using Foundation-Sec.

🚀 Try Live Demo · Agent Ops Dashboard → · ⭐ GitHub
Events Indexed: 2,299
Anomalies Caught: 342
Avg Trust Score: 58.1%
Tokens Tracked: 280K
Detection Latency: <1s
// The Problem

AGENTS ARE INVISIBLE

AI agents fail silently. Without observability, a loop runs forever, tokens spiral, costs explode — and you find out from the invoice.

✕ Without AgentWatch
Agent loops undetected — burns 1,000+ API calls in silence
Token usage spikes with no warning — surprise billing at month end
Latency degrades gradually — invisible until users complain
Errors swallowed — no trace of why a reasoning step failed
Debugging means reading logs manually, hoping to find the signal
✓ With AgentWatch
Loop detected in <1s — agent stopped before it costs you
Every token tracked per step — full visibility, no surprises
Latency drift flagged — before it becomes a user problem
Every error indexed in Splunk — full context, stack, and timing
Foundation-Sec explains root cause and fix in plain English
// Live Dashboards

THREE VIEWS, ONE SYSTEM

Three live interfaces running on Railway. No login. No setup. Click and watch your agent's brain in real time.

// Architecture

FIVE STEPS FROM EVENT TO INSIGHT

Four Splunk AI tools, five pipeline steps: instrument, stream, detect, explain, query.

01

Instrument the Agent

OpenTelemetry hooks wrap every LangGraph node — capturing LLM calls, tool invocations, token counts, step latency, and errors with full structured context.

OpenTelemetry + LangGraph
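
A minimal sketch of what wrapping a node could look like. It uses only the standard library and an in-memory list as a stand-in for an OpenTelemetry exporter. The names `instrument_node`, `EVENTS`, and the event field names are hypothetical and not taken from the AgentWatch code; a real integration would presumably open an OpenTelemetry span around the call instead of building a dict.

```python
import time
from functools import wraps
from typing import Any, Callable, Dict, List

# Hypothetical in-memory sink. A real setup would export these records
# through an OpenTelemetry exporter rather than append to a list.
EVENTS: List[Dict[str, Any]] = []

State = Dict[str, Any]


def instrument_node(name: str) -> Callable[[Callable[[State], State]], Callable[[State], State]]:
    """Wrap a node function (state in, state out) and record one event per call."""

    def decorator(fn: Callable[[State], State]) -> Callable[[State], State]:
        @wraps(fn)
        def wrapper(state: State) -> State:
            start = time.perf_counter()
            event: Dict[str, Any] = {"event_type": "step", "node": name, "error": None}
            try:
                return fn(state)
            except Exception as exc:
                # Record the failure, then re-raise so the agent still sees it.
                event["error"] = f"{type(exc).__name__}: {exc}"
                raise
            finally:
                event["latency_ms"] = round((time.perf_counter() - start) * 1000, 3)
                EVENTS.append(event)

        return wrapper

    return decorator


@instrument_node("research")
def research(state: State) -> State:
    # Stub node body, for illustration only.
    return {**state, "notes": "stub result"}


result = research({"question": "example"})
```

Because the wrapper keeps the node's signature unchanged, the decorated function can be registered with a graph the same way as the undecorated one.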
02

Stream to Splunk HEC

Events are indexed in real time via the HTTP Event Collector (HEC) with sourcetype agentwatch:otel. Every reasoning step becomes a searchable, structured log. 2,299+ real events confirmed.

Splunk MCP Server · HEC
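
As a rough illustration of the streaming step, the sketch below builds (but does not send) an HEC request for one event. The `/services/collector/event` path and the `Authorization: Splunk <token>` header follow Splunk's documented HEC conventions. The host, the placeholder token, the event fields, and the helper name `build_hec_request` are assumptions made for this example.

```python
import json
import time
import urllib.request
from typing import Any, Dict, Optional


def build_hec_request(
    event: Dict[str, Any],
    hec_url: str,
    token: str,
    now: Optional[float] = None,
) -> urllib.request.Request:
    """Build a POST request carrying one event in HEC's JSON envelope."""
    payload = {
        "time": time.time() if now is None else now,
        "sourcetype": "agentwatch:otel",
        "event": event,
    }
    return urllib.request.Request(
        url=f"{hec_url.rstrip('/')}/services/collector/event",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Splunk {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Placeholder host and token; substitute real values before sending.
req = build_hec_request(
    {"event_type": "tool_call", "tool": "search_tool"},
    "https://splunk.example.com:8088",
    "00000000-0000-0000-0000-000000000000",
    now=0.0,
)
```

Passing `req` to `urllib.request.urlopen` would send it; that call is left out so the sketch runs without a Splunk instance.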
03

Detect Anomalies

Splunk's native anomalydetection command runs on the tool-call frequency time series. It caught a 139-call spike with 99.25% confidence, with zero manual thresholds.

Splunk AI Toolkit
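
The exact search AgentWatch runs is not shown here, so the following is only a guess at its shape: bucket tool calls per second with `timechart`, then pipe the counts into `anomalydetection`. The index and sourcetype names come from this page. The `event_type` field, the 1-second span, and the helper name are assumptions.

```python
def tool_call_anomaly_spl(
    index: str = "agentwatch",
    sourcetype: str = "agentwatch:otel",
    span: str = "1s",
) -> str:
    """Compose an SPL string that counts tool calls per time bucket
    and passes the series to Splunk's anomalydetection command."""
    return (
        f'index={index} sourcetype="{sourcetype}" event_type="tool_call" '  # event_type is a hypothetical field
        f"| timechart span={span} count "
        f"| anomalydetection count"
    )


spl = tool_call_anomaly_spl()
print(spl)
```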
04

Explain in Plain English

Foundation-Sec-1.1-8B reasons over the anomaly context — what happened, root cause, recommended fix, and severity score. One click from alert to actionable engineering guidance.

Foundation-Sec-1.1-8B
05

Query with Natural Language

"Show me all loops in the last hour" → auto-generated SPL → results in seconds. No SPL expertise required. The AI Assistant translates intent into precise queries.

Splunk AI Assistant
agent_runner.py — loop mode
$ python agent_runner.py --mode loop
 
# OTel hooks active — events streaming to Splunk HEC
step_start research trust=100%
llm_call research trust= 96%
tool_call search_tool trust= 88%
tool_call search_tool trust= 62%
tool_call search_tool trust= 44%
tool_call search_tool trust= 18%
tool_call search_tool trust= 5%
 
⚠ ANOMALY DETECTED [confidence: 99.25%]
search_tool called 23x in 4.1 seconds
Splunk anomalydetection fired on time-series spike
 
# Foundation-Sec-1.1-8B root cause analysis:
"Agent stuck at query refinement loop.
No exit condition when search returns empty.
Fix: add empty-result guard at step 3."
 
✓ Events indexed: 2,299
✓ Anomalies: 342 · Trust: 58.1%
✓ Tokens tracked: 279,993
// Capabilities

EVERY FAILURE MODE, CAUGHT

From infinite loops to silent confidence collapse — AgentWatch surfaces what your logs can't.

🔁

Loop Detection

Splunk's anomalydetection command monitors tool-call frequency time-series. Catches a 139-call spike at 99.25% confidence — before the API bill arrives.
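
For intuition only, here is a local sliding-window counter that flags a tool called too often in a short interval. This is a swapped-in technique with a fixed threshold, not Splunk's anomalydetection command (which needs no manual threshold). The class name, the limit of 20 calls, and the 5-second window are assumptions chosen so the 23-calls-in-4.1-seconds case from the demo trips it.

```python
from collections import deque
from typing import Deque, Dict


class LoopDetector:
    """Flag a tool once it is called more than max_calls times within window_s seconds."""

    def __init__(self, max_calls: int = 20, window_s: float = 5.0) -> None:
        self.max_calls = max_calls
        self.window_s = window_s
        self._calls: Dict[str, Deque[float]] = {}

    def record(self, tool: str, ts: float) -> bool:
        """Record one call at timestamp ts; return True if the tool looks stuck in a loop."""
        calls = self._calls.setdefault(tool, deque())
        calls.append(ts)
        # Drop timestamps that have fallen out of the window.
        while calls and ts - calls[0] > self.window_s:
            calls.popleft()
        return len(calls) > self.max_calls


detector = LoopDetector()
# 23 evenly spaced calls over 4.1 seconds, mirroring the demo run.
fired = [detector.record("search_tool", i * 4.1 / 22) for i in range(23)]
```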

📈

Token Spike Alerts

Per-step token tracking catches unbounded context growth before it hits rate limits. Every LLM call tagged with token count, model, and step ID.
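
One simple way per-step token counts could be turned into an alert is to compare each step against the median of the steps before it. The factor of 3, the warm-up of 3 steps, and the sample counts below are illustrative assumptions, not AgentWatch's actual rule or data.

```python
from statistics import median
from typing import List


def token_spikes(token_counts: List[int], factor: float = 3.0, warmup: int = 3) -> List[int]:
    """Return indices of steps whose token count exceeds factor x the median of all earlier steps."""
    flagged: List[int] = []
    for i in range(warmup, len(token_counts)):
        baseline = median(token_counts[:i])
        if baseline > 0 and token_counts[i] > factor * baseline:
            flagged.append(i)
    return flagged


# Made-up per-step token counts with one obvious jump at index 4.
counts = [820, 847, 790, 860, 3900, 880]
spikes = token_spikes(counts)
```

Using the median rather than the mean keeps a single earlier spike from inflating the baseline.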

🐌

Latency Drift

Step latency tracked per event in OTel. Progressive slowdowns surface in the trace timeline before they become P99 incidents.
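
A hedged sketch of how progressive slowdown might be flagged from per-step latencies: compare the mean of the most recent samples with the mean of the earliest ones. The window size, the 1.5x ratio, and the sample values are assumptions for illustration.

```python
from statistics import mean
from typing import List


def latency_drift(latencies_ms: List[float], window: int = 5, ratio: float = 1.5) -> bool:
    """True when the mean of the last `window` samples exceeds ratio x the mean of the first `window`."""
    if len(latencies_ms) < 2 * window:
        return False  # not enough data to compare two windows
    baseline = mean(latencies_ms[:window])
    recent = mean(latencies_ms[-window:])
    return baseline > 0 and recent > ratio * baseline


steady = [690, 700, 710, 695, 705, 700, 690, 710, 700, 705]
drifting = [690, 700, 710, 695, 705, 900, 1000, 1100, 1200, 1300]
steady_flag = latency_drift(steady)
drifting_flag = latency_drift(drifting)
```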

🧠

Live Brain Graph

Three.js force-directed visualization of your agent's reasoning, live. Green = healthy. Yellow = warning. Red = anomaly. Click any node to inspect tokens, latency, and trust.

🔢

Trust Scoring

Every node scores 0–100% based on call patterns, token usage, and error rates. Composite score across the agent surface reveals risk at a glance.
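
The scoring formula itself is not given on this page. The sketch below shows one plausible shape, a weighted penalty over the three inputs named above, with a composite taken as the mean across nodes. The weights, the `max_repeats` cap, and the meaning of `token_ratio` (tokens used divided by tokens expected) are all hypothetical.

```python
from statistics import mean


def clamp01(x: float) -> float:
    """Clamp a value into the range [0, 1]."""
    return max(0.0, min(1.0, x))


def trust_score(
    repeated_calls: int,
    token_ratio: float,
    error_rate: float,
    max_repeats: int = 20,
) -> float:
    """Combine three penalties into a 0-100 score (hypothetical weights: 0.5 / 0.25 / 0.25)."""
    call_penalty = clamp01(repeated_calls / max_repeats)
    token_penalty = clamp01(token_ratio - 1.0)  # 1.0 means on budget, 2.0 or more is fully penalized
    error_penalty = clamp01(error_rate)
    penalty = 0.5 * call_penalty + 0.25 * token_penalty + 0.25 * error_penalty
    return round(100 * (1 - penalty), 1)


node_scores = [trust_score(0, 1.0, 0.0), trust_score(10, 1.0, 0.0)]
composite = mean(node_scores)
```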

💀

Silent Failure Capture

Unhandled exceptions, tool errors, and confidence collapses — all indexed in Splunk with full stack trace context and structured fields for instant SPL queries.
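
To make "full stack trace context and structured fields" concrete, here is a small sketch of a context manager that turns an exception into a structured record before re-raising it. The helper name, the field names, and the list used as a sink are assumptions. In AgentWatch the record would go to Splunk rather than a list.

```python
import traceback
from contextlib import contextmanager
from typing import Any, Dict, Iterator, List


@contextmanager
def capture_failures(step: str, sink: List[Dict[str, Any]]) -> Iterator[None]:
    """Record any exception raised inside the block as a structured event, then re-raise it."""
    try:
        yield
    except Exception as exc:
        sink.append(
            {
                "event_type": "error",
                "step": step,
                "error_type": type(exc).__name__,
                "message": str(exc),
                "stack": traceback.format_exc(),
            }
        )
        raise


events: List[Dict[str, Any]] = []
try:
    with capture_failures("research", events):
        raise ValueError("search returned no results")
except ValueError:
    pass  # the caller still sees the error; it is just no longer silent
```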

🔍

Natural Language SPL

"Which tools have the lowest trust scores?" → SPL generated → results in seconds. The Splunk AI Assistant bridges intent and query, so you never have to learn SPL syntax.

🔬

Foundation-Sec Explainer

Foundation-Sec-1.1-8B reads the anomaly context and returns: what happened, why it happened, severity, and the specific engineering fix. Not generic advice — specific to your run.

🕸

Multi-Agent Topology

3D network view across all agents: 48 nodes, 47 edges, and 7 anomaly paths highlighted in red. See how agent hubs connect across your system at a glance.

⚠ Anomaly · search_tool
Loop detected — called 23× in 4.1s
trust_score: 0.05 · anomalydetection confidence: 99.25%
✓ Healthy · calculator_tool
28.5 × 1.43 = 40.755B — result returned
trust_score: 0.92 · 1 call · 12ms
🧠 LLM Call · research
847 tokens · gpt-4o-mini · 690ms
trust_score: 0.85
✓ Step · synthesis
duration: 700ms · step complete
trust_score: 0.90 · 0 anomalies
// Real Data — Not Simulated

LIVE RESULTS FROM SPLUNK

Every number verified from the live agentwatch index. Real events. Real anomalies. Real cost.

Events Indexed: 2,299
Anomalies Caught: 342
Avg Trust Score: 58.1%
Tokens Tracked: 280K
Est. Cost: <$0.01
🧠
TRY IT YOURSELF
Click "Loop" on the live dashboard and watch AgentWatch catch the anomaly in under one second. No setup. No login. Runs on Railway.
🚀 Launch Live Brain · 📊 Agent Ops → · 🕸 Topology Map →
Built with Splunk AI · Agentic Ops Hackathon 2026
Splunk MCP Server
Splunk AI Toolkit
Foundation-Sec-1.1-8B
Splunk AI Assistant
OpenTelemetry
LangGraph
Three.js
FastAPI
Socket.IO
Railway

START WATCHING

Your agents are running. Is anyone watching?

🚀 Open Live Demo · ⭐ Star on GitHub