Investigate incidents

NeuBird’s core capability is AI-driven incident investigation. When you ask a question, NeuBird orchestrates an agentic loop — exploring schemas, writing queries, analyzing results, and building toward a root cause.

For each question, the agent follows this workflow:

  1. Schema exploration — Discovers relevant tables using list_schemas, list_tables, and search_tables_regex
  2. Table analysis — Examines columns with describe_table and data cardinality with list_dimensions
  3. Data sampling — Previews data shape with get_sample_rows
  4. SQL execution — Runs targeted queries with exec_sql to find evidence
  5. Analysis & synthesis — Correlates findings across tables and presents conclusions
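The five steps above can be sketched against a local database. This is a minimal illustration only, not NeuBird's implementation: the functions below reuse the tool names from the list, but their signatures, the SQLite backend, the `http_requests` table, and the toy rows are all assumptions made for the example.

```python
import sqlite3

# Hypothetical stand-ins for the tools named above, backed by SQLite.
# The real tools' signatures and return shapes are not documented here.

def list_tables(conn):
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    return [name for (name,) in rows]

def describe_table(conn, table):
    # PRAGMA table_info yields (cid, name, type, notnull, dflt_value, pk).
    return [row[1] for row in conn.execute(f"PRAGMA table_info({table})")]

def get_sample_rows(conn, table, limit=2):
    return conn.execute(f"SELECT * FROM {table} LIMIT {int(limit)}").fetchall()

def exec_sql(conn, sql, params=()):
    return conn.execute(sql, params).fetchall()

def investigate(conn, service):
    # 1. Schema exploration: keep tables whose name suggests HTTP data.
    tables = [t for t in list_tables(conn) if "http" in t]
    # 2. Table analysis: learn the columns before writing SQL.
    columns = {t: describe_table(conn, t) for t in tables}
    # 3. Data sampling: preview a few rows per table.
    samples = {t: get_sample_rows(conn, t) for t in tables}
    # 4. SQL execution: one targeted query to find evidence.
    evidence = exec_sql(
        conn,
        "SELECT hour, MAX(latency_ms) AS worst FROM http_requests "
        "WHERE service = ? GROUP BY hour ORDER BY worst DESC LIMIT 1",
        (service,),
    )
    # 5. Analysis & synthesis: condense the evidence into a finding.
    worst_hour, worst_latency = evidence[0] if evidence else (None, None)
    return {
        "tables": tables,
        "columns": columns,
        "samples": samples,
        "worst_hour": worst_hour,
        "worst_latency_ms": worst_latency,
    }

def build_demo_db():
    # Toy data, invented purely for this example.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE http_requests (service TEXT, hour INTEGER, latency_ms REAL)"
    )
    conn.execute("CREATE TABLE deploys (service TEXT, hour INTEGER)")
    conn.executemany(
        "INSERT INTO http_requests VALUES (?, ?, ?)",
        [("api-gateway", 1, 120.0), ("api-gateway", 2, 950.0),
         ("checkout-service", 2, 80.0)],
    )
    return conn
```

Calling `investigate(build_demo_db(), "api-gateway")` walks the same order as the list: discover, describe, sample, query, then summarize.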

Each step is shown in real time with a dual progress bar:

  • Top bar (cyan) — Investigation progress with elapsed time
  • Bottom bar (yellow) — AI confidence level, updated with every tool call

Every tool call includes a confidence score (0-100) that reflects how close the AI is to answering your question. The investigation progresses through phases:

  • Exploring — Finding relevant tables and understanding the schema
  • Investigating — Running queries and gathering evidence
  • Analyzing — Synthesizing findings into conclusions

The agent may loop multiple times — retrying with simpler SQL if a query fails, or widening the search if initial results are empty.
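The retry behavior can be pictured as an ordered list of attempts, where a failed query falls back to simpler SQL and an empty result falls back to a wider search. The sketch below is an assumption about the general shape of such a loop, not NeuBird's actual logic; `run_with_fallbacks`, the `errors` table, and the toy rows are hypothetical.

```python
import sqlite3

def run_with_fallbacks(conn, attempts):
    """Try each (label, sql) in order and return the first non-empty result.

    A query that raises sqlite3.Error, or that returns no rows, is logged
    and the next (simpler or wider) attempt is tried.
    """
    log = []
    for label, sql in attempts:
        try:
            rows = conn.execute(sql).fetchall()
        except sqlite3.Error as exc:
            log.append((label, "failed", str(exc)))
            continue
        if not rows:
            log.append((label, "empty", ""))
            continue
        log.append((label, "ok", ""))
        return rows, log
    return [], log

def build_errors_db():
    # Toy data, invented purely for this example.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE errors (service TEXT, hour INTEGER, n INTEGER)")
    conn.execute("INSERT INTO errors VALUES ('api-gateway', 2, 42)")
    return conn

ATTEMPTS = [
    # References a column that does not exist, so the query fails.
    ("detailed", "SELECT service, p99 FROM errors WHERE hour = 3"),
    # Simpler SQL, but the narrow time window matches nothing.
    ("simpler", "SELECT service, n FROM errors WHERE hour = 3"),
    # Same simple SQL over a wider window.
    ("wider", "SELECT service, n FROM errors WHERE hour BETWEEN 1 AND 5"),
]
```

With this data, `run_with_fallbacks(build_errors_db(), ATTEMPTS)` records one failure, one empty result, and then returns the rows from the widened query.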

NeuBird works best with questions that include context, such as a service name, a time window, or a symptom:

> Why did api-gateway latency spike between 2am and 3am today?
> What changed in production in the last 6 hours that could cause 5xx errors?
> Show me the top error-producing services this week
> Are there any anomalies in the checkout-service metrics?

If your question is broad (“any issues?”), NeuBird will ask you to narrow it — suggesting a time window or specific service.

NeuBird maintains full conversation context within a session. After initial findings, you can drill deeper:

> What are the top errors for api-gateway?
... [NeuBird investigates and reports] ...
> Can you check if there was a deployment around that time?
... [NeuBird uses the same context to investigate] ...
> What services does api-gateway depend on?
... [NeuBird checks topology/service map tables] ...

When NeuBird declares a root cause, it provides:

  1. The specific change or event — A commit, deploy, config change, or infrastructure failure
  2. When it took effect — A timestamp that explains why the incident started when it did
  3. The mechanism — How that change produced the specific observed behavior

If the evidence is insufficient, NeuBird will explicitly say “root cause is inconclusive” and rank hypotheses by evidence strength.
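One way to picture this contract is as a record with the three required parts plus a fallback that ranks hypotheses. The sketch below is illustrative only: the `Hypothesis` fields, the numeric `evidence_strength` scale, and the `threshold` value are assumptions, since the source does not say how NeuBird scores evidence.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class Hypothesis:
    event: str                     # the specific change or event
    took_effect_at: Optional[str]  # when it took effect, if known
    mechanism: Optional[str]       # how it produced the observed behavior
    evidence_strength: float       # hypothetical score between 0.0 and 1.0

def conclude(hypotheses: List[Hypothesis], threshold: float = 0.8) -> dict:
    # Rank strongest evidence first; the ranking is reported either way.
    ranked = sorted(hypotheses, key=lambda h: h.evidence_strength, reverse=True)
    best = ranked[0] if ranked else None
    # A root cause needs all three parts, not just a suspicious event.
    complete = bool(best and best.took_effect_at and best.mechanism)
    if complete and best.evidence_strength >= threshold:
        return {"status": "root cause", "root_cause": best, "ranked": ranked}
    return {
        "status": "root cause is inconclusive",
        "root_cause": None,
        "ranked": ranked,
    }
```

The design point is that a strong-looking event without a timestamp or a mechanism still yields "root cause is inconclusive" rather than a partial answer.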

During investigation, you’ll see tool calls appear in the output:

[list_tables] Searching for HTTP-related tables
[describe_table] Examining metrics.http_requests
[exec_sql] Checking p99 latency for api-gateway in the last 4 hours

Each tool call includes:

  • Reason — Why the tool is being called (shown in the UI)
  • Confidence — Current confidence score
  • Result preview — A truncated preview of the tool output
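The three fields above map naturally onto a small record type. The following is a sketch under assumptions: the `ToolCall` class, its field names, and the 40-character preview limit are hypothetical, and only the `[tool] reason` header format is taken from the sample output shown earlier.

```python
from dataclasses import dataclass

@dataclass
class ToolCall:
    tool: str
    reason: str       # why the tool is being called
    confidence: int   # current confidence score, 0-100
    result: str       # full tool output

    def __post_init__(self):
        if not 0 <= self.confidence <= 100:
            raise ValueError("confidence must be between 0 and 100")

    def header(self) -> str:
        # Same "[tool] reason" shape as the sample output above.
        return f"[{self.tool}] {self.reason}"

    def preview(self, max_chars: int = 40) -> str:
        # Truncate long output; the limit is an assumed value.
        if len(self.result) <= max_chars:
            return self.result
        return self.result[:max_chars] + "..."
```

For example, `ToolCall("describe_table", "Examining metrics.http_requests", 35, "...").header()` produces a line in the same format as the second sample line.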

Press Ctrl+C to cancel a running investigation. The agent will stop after the current tool call completes. Press Ctrl+C twice quickly to force-quit.
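The two-stage cancel can be sketched with a SIGINT handler that sets a flag on the first press and exits on a quick second press. This is an illustration of the general pattern, not NeuBird's code: `CancelController`, the one-second `force_window`, and exit status 130 are all assumptions.

```python
import signal
import time

class CancelController:
    def __init__(self, force_window=1.0, clock=time.monotonic):
        self.force_window = force_window  # assumed meaning of "quickly"
        self.clock = clock                # injectable for testing
        self.cancel_requested = False
        self._last_press = None

    def on_interrupt(self, signum=None, frame=None):
        now = self.clock()
        recent = (
            self._last_press is not None
            and now - self._last_press <= self.force_window
        )
        if recent:
            # Second press inside the window: force-quit immediately.
            raise SystemExit(130)
        # First press: ask the loop to stop once the current call finishes.
        self._last_press = now
        self.cancel_requested = True

def install(controller):
    # Route Ctrl+C (SIGINT) to the controller; call from the main thread.
    signal.signal(signal.SIGINT, controller.on_interrupt)

def run_investigation(tool_calls, controller):
    completed = []
    for call in tool_calls:
        # The flag is checked only between calls, so a call that is
        # already running is allowed to complete.
        if controller.cancel_requested:
            break
        completed.append(call())
    return completed
```

Checking the flag between tool calls, rather than interrupting mid-call, is what lets the current call finish cleanly before the investigation stops.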

For complex investigations, NeuBird may reach the response token limit. When this happens, you’ll see a continue prompt — press Enter to let the agent continue its analysis.