How Sage Analyzes Your Database

Understanding Sage's analysis methodology and reasoning engine.

Overview

Sage is not a chatbot. It is a diagnostic engine that uses AI to interpret your database's real-time state, historical patterns, and environment context. Every response is grounded in actual data from your PostgreSQL instance.

Data Sources

When you ask Sage a question, it assembles a context package from:

- pg_stat_activity - current sessions, running queries, and wait events (including lock waits)
- pg_stat_statements - historical query performance statistics (requires the pg_stat_statements extension)
- pg_stat_user_tables - table-level statistics, dead tuples, vacuum status
- pg_stat_replication - replication lag and standby status
- pg_settings - current PostgreSQL configuration
- pg_locks - held and awaited locks, used to reconstruct blocking chains
- OS metrics - CPU, memory, disk, network (via agent or cloud API)
- Integration data - Prometheus, Datadog, CloudWatch metrics if configured
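
As a rough illustration of what assembling a context package could look like, the sketch below maps several of the PostgreSQL views listed above to read-only queries and collects the results through a caller-supplied function. The names `CONTEXT_QUERIES`, `build_context`, and `run_query` are hypothetical and do not describe Sage's internals; the queries use only standard PostgreSQL view columns, and the selection of columns is an assumption.

```python
from typing import Callable, Dict, List

Row = Dict[str, object]

# Hypothetical mapping from data source to a read-only query.
# The columns exist in standard PostgreSQL; which ones to collect is illustrative.
CONTEXT_QUERIES: Dict[str, str] = {
    "pg_stat_activity": (
        "SELECT pid, state, wait_event_type, wait_event, query "
        "FROM pg_stat_activity WHERE state <> 'idle'"
    ),
    # Only available when the pg_stat_statements extension is installed.
    "pg_stat_statements": (
        "SELECT query, calls FROM pg_stat_statements "
        "ORDER BY calls DESC LIMIT 20"
    ),
    "pg_stat_user_tables": (
        "SELECT relname, n_live_tup, n_dead_tup, last_autovacuum "
        "FROM pg_stat_user_tables ORDER BY n_dead_tup DESC LIMIT 20"
    ),
    "pg_stat_replication": (
        "SELECT application_name, state, replay_lag FROM pg_stat_replication"
    ),
    "pg_settings": "SELECT name, setting, unit FROM pg_settings",
    "pg_locks": (
        "SELECT pid, locktype, mode, granted FROM pg_locks WHERE NOT granted"
    ),
}


def build_context(run_query: Callable[[str], List[Row]]) -> Dict[str, List[Row]]:
    """Run every query and key the resulting rows by source name.

    A source that fails (missing extension, insufficient privileges)
    contributes an empty list instead of aborting the whole package.
    """
    context: Dict[str, List[Row]] = {}
    for source, sql in CONTEXT_QUERIES.items():
        try:
            context[source] = run_query(sql)
        except Exception:
            context[source] = []
    return context
```

In practice `run_query` would wrap a database driver cursor; it is left abstract here so the sketch stays independent of any particular driver.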

The 5 Whys Methodology

Sage traces issues to their root cause using the 5 Whys technique:

- Why 1: What is the immediate symptom? (e.g., high CPU)
- Why 2: What is causing the symptom? (e.g., autovacuum running aggressively)
- Why 3: What triggered the cause? (e.g., 12M dead tuples on the orders table)
- Why 4: Why did the trigger occur? (e.g., batch job without intermediate commits)
- Why 5: What is the underlying problem? (e.g., idle-in-transaction blocking vacuum)
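
The chain above can be modeled as an ordered list of question and finding pairs, where the last finding is treated as the root cause. The following is a minimal sketch under that assumption; `Why`, `render_chain`, and `root_cause` are hypothetical names, and the example values are taken from the list above.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Why:
    question: str  # the question asked at this depth
    finding: str   # the answer supported by the collected data


def render_chain(chain: List[Why]) -> str:
    """Format the chain as numbered lines, one per level."""
    return "\n".join(
        f"Why {depth}: {why.question} -> {why.finding}"
        for depth, why in enumerate(chain, start=1)
    )


def root_cause(chain: List[Why]) -> str:
    """Return the deepest finding, which the technique treats as the root cause."""
    if not chain:
        raise ValueError("the chain must contain at least one step")
    return chain[-1].finding


EXAMPLE_CHAIN = [
    Why("What is the immediate symptom?", "high CPU"),
    Why("What is causing the symptom?", "autovacuum running aggressively"),
    Why("What triggered the cause?", "12M dead tuples on the orders table"),
    Why("Why did the trigger occur?", "batch job without intermediate commits"),
    Why("What is the underlying problem?", "idle-in-transaction blocking vacuum"),
]
```

Keeping each level explicit makes it easy to show the whole path from symptom to cause rather than only the final answer.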

Response Format

Every Sage response includes:

- Current state - what is happening right now based on live data
- Root cause - the full chain from symptom to underlying cause
- Fix - specific SQL commands or configuration changes to resolve the issue
- Prevention - long-term changes to prevent recurrence
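
To make the four-part structure concrete, here is a small sketch of a response container with a plain-text renderer. `DiagnosticResponse` and its field names are hypothetical, and the example content is an invented illustration rather than real Sage output; the SQL statements shown are valid PostgreSQL but are not a recommendation for any particular system.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class DiagnosticResponse:
    current_state: str     # what is happening right now
    root_cause: List[str]  # chain from symptom to underlying cause
    fix: List[str]         # SQL commands or configuration changes
    prevention: List[str]  # long-term changes

    def render(self) -> str:
        """Render the four sections in a fixed order, indenting each item."""
        sections = [
            ("Current state", [self.current_state]),
            ("Root cause", self.root_cause),
            ("Fix", self.fix),
            ("Prevention", self.prevention),
        ]
        out: List[str] = []
        for title, items in sections:
            out.append(title)
            out.extend(f"  {item}" for item in items)
        return "\n".join(out)


# Illustrative content only, not captured from a real system.
EXAMPLE_RESPONSE = DiagnosticResponse(
    current_state="High CPU while autovacuum processes the orders table",
    root_cause=[
        "autovacuum running aggressively",
        "12M dead tuples on the orders table",
        "idle-in-transaction session blocking vacuum",
    ],
    fix=[
        "SELECT pg_terminate_backend(pid) FROM pg_stat_activity "
        "WHERE state = 'idle in transaction' "
        "AND xact_start < now() - interval '1 hour';",
    ],
    prevention=[
        "ALTER SYSTEM SET idle_in_transaction_session_timeout = '5min';",
    ],
)
```

Calling `EXAMPLE_RESPONSE.render()` produces the four headings in the order listed above, each followed by its indented items.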

Knowledge Base

Sage draws on a RAG (Retrieval-Augmented Generation) knowledge base containing the official PostgreSQL documentation, best-practice guides, and community knowledge. Relevant passages are retrieved for each question and supplied to the model alongside the live data, which helps keep recommendations consistent with established PostgreSQL conventions.
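
The retrieval step of a RAG pipeline can be sketched as ranking stored snippets against the question and passing the best matches to the model. The example below is a deliberately simplified toy that scores by keyword overlap; production systems typically use embedding similarity instead, and nothing here describes how Sage actually retrieves. The `tokenize` and `retrieve` functions and the placeholder snippets in `EXAMPLE_SNIPPETS` are all assumptions made for illustration.

```python
import re
from typing import List, Set


def tokenize(text: str) -> Set[str]:
    """Lowercase the text and split it into a set of word tokens."""
    return set(re.findall(r"[a-z0-9_]+", text.lower()))


def retrieve(question: str, documents: List[str], top_k: int = 2) -> List[str]:
    """Return up to top_k documents sharing the most tokens with the question.

    Documents with no shared tokens are dropped. Ties keep their input
    order because Python's sort is stable.
    """
    question_tokens = tokenize(question)
    scored = []
    for doc in documents:
        overlap = len(question_tokens & tokenize(doc))
        if overlap:
            scored.append((overlap, doc))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in scored[:top_k]]


# Placeholder snippets paraphrasing well-known PostgreSQL behavior.
EXAMPLE_SNIPPETS = [
    "VACUUM reclaims storage occupied by dead tuples.",
    "Autovacuum automates the execution of VACUUM and ANALYZE commands.",
    "Streaming replication sends WAL records to standby servers.",
]
```

For a question such as "why are dead tuples not removed by vacuum", the first snippet shares the most tokens and is ranked first, while the replication snippet shares none and is excluded.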