How Sage Analyzes Your Database

Understanding Sage's analysis methodology and reasoning engine.

Overview

Sage is not a chatbot. It is a diagnostic engine that uses AI to interpret your database's real-time state, historical patterns, and environment context. Every response is grounded in actual data from your PostgreSQL instance.

Data Sources

When you ask Sage a question, it assembles a context package from:

  • pg_stat_activity - currently running queries, locks, wait events
  • pg_stat_statements - historical query performance statistics
  • pg_stat_user_tables - table-level statistics, dead tuples, vacuum status
  • pg_stat_replication - replication lag and standby status
  • pg_settings - current PostgreSQL configuration
  • pg_locks - active lock chains and blocking sessions
  • OS metrics - CPU, memory, disk, network (via agent or cloud API)
  • Integration data - Prometheus, Datadog, CloudWatch metrics if configured
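
As a rough illustration of how such a context package could be assembled, the sketch below maps each PostgreSQL source to a catalog query and collects the results under one dictionary. This is not Sage's actual implementation: `CONTEXT_QUERIES`, `assemble_context`, and the `run_query` callable are hypothetical names, and the chosen columns and limits are assumptions. The views and columns themselves are standard PostgreSQL (the `pg_stat_statements` column names assume PostgreSQL 13 or later).

```python
from typing import Any, Callable, Dict, List

Row = Dict[str, Any]

# Hypothetical mapping of context-package sections to catalog queries.
# The views and columns are standard PostgreSQL; the selection is illustrative.
CONTEXT_QUERIES: Dict[str, str] = {
    "activity": (
        "SELECT pid, state, wait_event_type, wait_event, query "
        "FROM pg_stat_activity WHERE state <> 'idle'"
    ),
    "statements": (
        # Requires the pg_stat_statements extension (PostgreSQL 13+ column names).
        "SELECT query, calls, mean_exec_time "
        "FROM pg_stat_statements ORDER BY total_exec_time DESC LIMIT 20"
    ),
    "tables": (
        "SELECT relname, n_live_tup, n_dead_tup, last_autovacuum "
        "FROM pg_stat_user_tables ORDER BY n_dead_tup DESC LIMIT 20"
    ),
    "replication": (
        "SELECT application_name, state, replay_lag FROM pg_stat_replication"
    ),
    "settings": "SELECT name, setting, unit FROM pg_settings",
    "locks": (
        "SELECT pid, locktype, mode, granted FROM pg_locks WHERE NOT granted"
    ),
}


def assemble_context(run_query: Callable[[str], List[Row]]) -> Dict[str, List[Row]]:
    """Run each query and collect the rows under its section name.

    A failing source (for example, a missing extension) yields an empty
    list so the remaining sections are still collected.
    """
    context: Dict[str, List[Row]] = {}
    for section, sql in CONTEXT_QUERIES.items():
        try:
            context[section] = run_query(sql)
        except Exception:
            context[section] = []
    return context
```

In practice `run_query` would wrap a database driver and return rows as dictionaries. OS metrics and integration data would come from separate collectors and are left out of this sketch.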

The 5 Whys Methodology

Sage traces issues to their root cause using the 5 Whys technique:

  • Why 1: What is the immediate symptom? (e.g., high CPU)
  • Why 2: What is causing the symptom? (e.g., autovacuum running aggressively)
  • Why 3: What triggered the cause? (e.g., 12M dead tuples on the orders table)
  • Why 4: Why did the trigger occur? (e.g., batch job without intermediate commits)
  • Why 5: What is the underlying problem? (e.g., an idle-in-transaction session preventing vacuum from removing dead tuples)
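
One way to picture the chain above is as an ordered list of question and finding pairs, where the first finding is the symptom and the last is the root cause. The following is a minimal sketch under that assumption; the `Why` and `WhyChain` classes are hypothetical and only mirror the example in this section, not Sage's internal data model.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Why:
    question: str
    finding: str


@dataclass
class WhyChain:
    steps: List[Why] = field(default_factory=list)

    def add(self, question: str, finding: str) -> "WhyChain":
        """Append one step and return the chain so calls can be chained."""
        self.steps.append(Why(question, finding))
        return self

    @property
    def symptom(self) -> str:
        # The first step records the immediate symptom.
        return self.steps[0].finding

    @property
    def root_cause(self) -> str:
        # The last step records the underlying problem.
        return self.steps[-1].finding

    def render(self) -> str:
        return "\n".join(
            f"Why {i}: {step.question} -> {step.finding}"
            for i, step in enumerate(self.steps, start=1)
        )


# The example chain from this section.
chain = (
    WhyChain()
    .add("What is the immediate symptom?", "high CPU")
    .add("What is causing the symptom?", "autovacuum running aggressively")
    .add("What triggered the cause?", "12M dead tuples on the orders table")
    .add("Why did the trigger occur?", "batch job without intermediate commits")
    .add("What is the underlying problem?", "idle-in-transaction session blocking vacuum")
)
```

Calling `chain.render()` returns one line per step, which makes it easy to check that each answer follows from the one before it.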

Response Format

Every Sage response includes:

  • Current state - what is happening right now based on live data
  • Root cause - the full chain from symptom to underlying cause
  • Fix - specific SQL commands or configuration changes to resolve the issue
  • Prevention - long-term changes to prevent recurrence
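
The four parts listed above could be modeled as a simple record. The sketch below is illustrative only: the `SageResponse` class, its field names, and the sample values (including the pid and the timeout) are assumptions, not Sage's real output schema. The SQL in the sample uses standard PostgreSQL commands and settings.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SageResponse:
    current_state: str
    root_cause: List[str]  # chain ordered from symptom to underlying cause
    fix: List[str]         # SQL commands or configuration changes
    prevention: List[str]  # long-term changes

    def render(self) -> str:
        sections = [
            ("Current state", [self.current_state]),
            ("Root cause", [" <- ".join(self.root_cause)]),
            ("Fix", self.fix),
            ("Prevention", self.prevention),
        ]
        lines: List[str] = []
        for title, items in sections:
            lines.append(f"{title}:")
            lines.extend(f"  {item}" for item in items)
        return "\n".join(lines)


# Sample values are illustrative; the pid 12345 is a placeholder.
response = SageResponse(
    current_state="CPU is high while autovacuum processes the orders table",
    root_cause=[
        "high CPU",
        "autovacuum running aggressively",
        "12M dead tuples on the orders table",
        "batch job without intermediate commits",
        "idle-in-transaction session blocking vacuum",
    ],
    fix=[
        "SELECT pg_terminate_backend(12345);",
        "VACUUM (VERBOSE) orders;",
    ],
    prevention=[
        "ALTER SYSTEM SET idle_in_transaction_session_timeout = '5min';",
        "Commit the batch job in smaller chunks",
    ],
)
```

Keeping the fix and prevention entries as separate lists reflects the distinction this section draws between resolving the current issue and preventing recurrence.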

Knowledge Base

Sage draws on a RAG (Retrieval-Augmented Generation) knowledge base containing the official PostgreSQL documentation, best-practice guides, and community knowledge. Retrieved passages are supplied alongside your database's data so that recommendations follow established PostgreSQL conventions.
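
To make the retrieval step concrete, here is a toy sketch of the general RAG pattern: score stored passages against the question, take the best matches, and place them in the prompt. It substitutes plain word overlap for the embedding-based search a production system would typically use, and everything in it (`KNOWLEDGE_BASE`, `retrieve`, `build_prompt`, the sample passages) is hypothetical rather than a description of Sage's knowledge base.

```python
import re
from typing import List, Set, Tuple

# Hypothetical in-memory passages; a real knowledge base would hold
# indexed documentation rather than three hard-coded sentences.
KNOWLEDGE_BASE: List[Tuple[str, str]] = [
    ("autovacuum", "Autovacuum removes dead tuples and is triggered by table update and delete activity."),
    ("locks", "Blocked sessions wait on locks held by other transactions."),
    ("replication", "Replication lag measures how far a standby is behind the primary."),
]


def tokenize(text: str) -> Set[str]:
    """Lowercase the text and split it into a set of words."""
    return set(re.findall(r"[a-z_]+", text.lower()))


def retrieve(question: str, k: int = 2) -> List[Tuple[str, str]]:
    """Return up to k passages ranked by word overlap with the question."""
    q_tokens = tokenize(question)
    scored = [
        (len(q_tokens & tokenize(body)), title, body)
        for title, body in KNOWLEDGE_BASE
    ]
    # Drop passages with no shared words, then sort best-first.
    scored = [item for item in scored if item[0] > 0]
    scored.sort(key=lambda item: item[0], reverse=True)
    return [(title, body) for _, title, body in scored[:k]]


def build_prompt(question: str) -> str:
    """Prepend the retrieved passages to the question."""
    passages = retrieve(question)
    context = "\n".join(f"[{title}] {body}" for title, body in passages)
    return f"Context:\n{context}\n\nQuestion: {question}"
```

For a question about dead tuples, `retrieve` ranks the autovacuum passage first because it shares the most words with the question.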