Scenario 15: Debugging Support

Supports developers who are debugging issues: code navigation, call path tracing, breakpoint suggestions, and issue-specific analysis strategies, all powered by the code property graph (CPG).

Quick Start

# Select Debugging Scenario
/select 15

How It Works

Two-Level Intent Detection

S15 uses a two-level intent detection system. The base level selects which CPG finder to execute; the handler level selects a template-based handler for structured reports.

Level 1 (base): detect_debug_intent() in debugging/intent.py classifies queries into 8 intents using keyword_match_count_morphological() against patterns loaded from the domain plugin via get_debug_patterns_from_plugin():

| Intent | Description |
|---|---|
| logging | Error reporting and logging function calls |
| assertion | Assertion macros (Assert, static_assert) |
| trace | Tracing and instrumentation functions |
| explain | Query plan analysis functions |
| stack_trace | Backtrace and stack dump functions |
| debug_output | Debug-level output macros |
| breakpoint | Execution breakpoint locations |
| generic | Fallback: generic debug search |

extract_error_level() also detects specific error levels mentioned in the query (ERROR, WARNING, INFO, LOG, DEBUG, TRACE, FATAL, CRITICAL), loaded from get_error_levels_from_plugin().
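
A minimal sketch of the Level 1 idea: count keyword hits per intent, pick the best-scoring intent, and fall back to generic when nothing matches. The keyword stems, the word-start regex used in place of real morphological matching, and the names count_matches, detect_intent, and extract_level are all illustrative assumptions, not the contents of debugging/intent.py.

```python
import re

# Hypothetical keyword stems per intent; the real lists come from the domain plugin.
PATTERNS = {
    "logging": ["log", "error report"],
    "assertion": ["assert"],
    "trace": ["trac", "instrument"],
    "explain": ["explain", "query plan"],
    "stack_trace": ["backtrace", "stack dump"],
    "debug_output": ["debug output", "debug print"],
    "breakpoint": ["breakpoint"],
}

# Error levels as listed in the text above.
ERROR_LEVELS = ["ERROR", "WARNING", "INFO", "LOG", "DEBUG", "TRACE", "FATAL", "CRITICAL"]


def count_matches(query, stems):
    """Count stems that occur at a word start (crude stand-in for morphological matching)."""
    text = query.lower()
    return sum(1 for stem in stems if re.search(r"\b" + re.escape(stem), text))


def detect_intent(query):
    """Return the intent with the most keyword hits, or 'generic' if nothing matched."""
    scores = {intent: count_matches(query, stems) for intent, stems in PATTERNS.items()}
    best = max(scores, key=scores.get)  # ties resolve to the first intent in dict order
    return best if scores[best] > 0 else "generic"


def extract_level(query):
    """Return the first upper-case error level mentioned as a whole word, else None."""
    for level in ERROR_LEVELS:
        if re.search(r"\b" + level + r"\b", query):
            return level
    return None
```

In this sketch, level matching is case-sensitive so that the word "logging" is not mistaken for the LOG level; whether the real extract_error_level() behaves the same way is not stated here.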

Level 2 (handler): DebuggingIntentDetector in debugging_handlers/intent_detector.py classifies queries into 5 handler-specific intents, sorted by priority:

| Intent | Priority | Description |
|---|---|---|
| breakpoint_suggestion | 10 | Breakpoint location suggestions |
| call_stack | 20 | Call stack and execution path analysis |
| variable_trace | 30 | Variable flow tracing through code |
| debugging_strategy | 40 | Issue-specific debugging approaches |
| general_debugging | - | Fallback (confidence=0.5) |

The detector also extracts:

  • target_function: function name from the query (7 regex patterns, including Russian forms such as для palloc, "for palloc", and функция palloc, "function palloc")
  • issue_type: problem classification (crash, hang, memory_leak, performance, logic_error)
  • topic: subsystem/category from domain plugin semantic mappings
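
The handler-level detection can be pictured roughly as follows. This is a sketch under stated assumptions: a lower priority number is checked first, the keyword lists and the 0.9 match confidence are invented for illustration, and only four target-function patterns are shown rather than the seven used by DebuggingIntentDetector. The 0.5 fallback confidence and the two Russian pattern forms come from the description above.

```python
import re
from dataclasses import dataclass
from typing import Optional

# (priority, intent, keywords). Priorities follow the list above; keywords are hypothetical.
INTENT_RULES = [
    (20, "call_stack", ["call stack", "стек вызовов"]),
    (10, "breakpoint_suggestion", ["breakpoint", "останов"]),
    (30, "variable_trace", ["variable", "переменн"]),
    (40, "debugging_strategy", ["how to debug", "strategy", "стратеги"]),
]

# Illustrative target-function patterns (English and Russian).
TARGET_PATTERNS = [
    r"\bfor\s+(?:debugging\s+)?([A-Za-z_]\w*)",
    r"\bfunction\s+([A-Za-z_]\w*)",
    r"для\s+([A-Za-z_]\w*)",
    r"функци[яию]\s+([A-Za-z_]\w*)",
]


@dataclass
class Detection:
    intent: str
    confidence: float
    target_function: Optional[str]


def extract_target(query):
    """Return the first identifier captured by any target pattern, else None."""
    for pattern in TARGET_PATTERNS:
        match = re.search(pattern, query)
        if match:
            return match.group(1)
    return None


def detect(query):
    """Check rules in ascending priority order; the first keyword hit wins."""
    text = query.lower()
    for _priority, intent, keywords in sorted(INTENT_RULES, key=lambda rule: rule[0]):
        if any(keyword in text for keyword in keywords):
            return Detection(intent, 0.9, extract_target(query))  # 0.9 is an assumed value
    return Detection("general_debugging", 0.5, extract_target(query))
```

The first English pattern is deliberately naive and would capture any word that follows "for", so a real detector needs stricter patterns or a stopword filter.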

Two-Phase Architecture

S15 uses a two-phase approach:

Phase 1: Handler-based (no LLM). integrate_handlers() runs DebuggingIntentDetector, checks confidence against cfg.thresholds.confidence_low, then tries 2 registered handlers. If a handler matches and produces results, DebugReportFormatter formats a structured report without calling the LLM.

Phase 2: LLM fallback. If no handler matched or confidence is too low, the full pipeline runs: detect_debug_intent() selects one of 8 finder functions → CPG queries collect debugging constructs → LLM generates analysis. If LLM fails, generate_fallback_answer() produces a structured response using domain plugin sections.

Query -> DebuggingIntentDetector -> integrate_handlers()
  |                                      |
  |  Phase 1: Handler matched?  Yes -> DebugReportFormatter (no LLM)
  |                             No  -> Phase 2: Full pipeline
  |
  Phase 2: detect_debug_intent() -> 8 finder functions -> CPG queries
           -> LLM with debug context -> Fallback if LLM unavailable
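
The control flow in the diagram can be sketched as a single dispatch function. Everything here is a simplified assumption: the function name answer, the callables passed in, the threshold value standing in for cfg.thresholds.confidence_low, and the use of >= for the comparison are illustrative, not the actual integrate_handlers() implementation.

```python
CONFIDENCE_LOW = 0.6  # assumed value; the real threshold is cfg.thresholds.confidence_low


def answer(query, detect, handlers, llm_pipeline, fallback):
    """Try template handlers first, then the LLM pipeline, then the structured fallback.

    detect(query)       -> (intent, confidence)
    handlers            -> list of (intent, handler) pairs; handler(query) returns a report or None
    llm_pipeline(query) -> answer text, may raise if the LLM is unavailable
    fallback(query)     -> answer text built without the LLM
    """
    intent, confidence = detect(query)

    # Phase 1: handler-based, no LLM call.
    if confidence >= CONFIDENCE_LOW:
        for handler_intent, handler in handlers:
            if handler_intent == intent:
                report = handler(query)
                if report:
                    return ("handler", report)

    # Phase 2: full pipeline, with a fallback if the LLM fails.
    try:
        return ("llm", llm_pipeline(query))
    except Exception:
        return ("fallback", fallback(query))
```

Returning a (source, text) pair makes it easy to see in logs or tests which phase produced the answer.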

Registered Handlers

2 handlers are registered in HandlerRegistry("debugging"):

| Handler | Priority | Intent |
|---|---|---|
| BreakpointHandler | 10 | breakpoint_suggestion |
| CallStackHandler | 20 | call_stack |

Both inherit from DebuggingHandler (extends BaseHandler), which provides 4 analysis methods described below.

CPG Finder Functions

Phase 2 dispatches to one of 8 finder functions based on the detected base intent. Each finder builds SQL queries against nodes_call / nodes_method tables:

| Finder | Intent | Description |
|---|---|---|
| find_logging_calls(cpg, query, error_level) | logging | Error reporting and log calls, filterable by error level |
| find_assertions(cpg, query) | assertion | Assertion macro locations |
| find_trace_points(cpg, query) | trace | Tracing instrumentation points |
| find_explain_code(cpg, query) | explain | Query plan analysis functions |
| find_stack_trace_functions(cpg, query) | stack_trace | Backtrace and stack dump functions |
| find_debug_output(cpg, query) | debug_output | Debug-level output calls |
| find_breakpoint_functions(cpg, query) | breakpoint | Breakpoint-worthy functions with subsystem matching |
| generic_debug_search(cpg, query) | generic | Fallback: extracts identifiers from the query and searches the CPG |

find_breakpoint_functions uses morphological keyword matching against subsystem patterns from domain.get_breakpoint_subsystem_patterns() to select context-specific SQL queries via build_breakpoint_query(subsystem, like_prefix).

generic_debug_search extracts potential function names via regex, filters stopwords, and searches nodes_method for matches, falling back to common debug patterns.
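
To make the generic fallback concrete, here is a rough sketch that extracts identifier-like tokens, drops stopwords, and runs a LIKE search over a nodes_method table. The sqlite3 backend, the name and filename columns, the stopword list, and the COMMON_DEBUG_TERMS fallback are assumptions chosen to keep the example self-contained; the actual CPG storage and schema may differ.

```python
import re
import sqlite3

# Hypothetical stopword list and fallback terms.
STOPWORDS = {"a", "the", "in", "for", "to", "how", "what", "does", "find", "show", "debug", "all"}
COMMON_DEBUG_TERMS = ["debug", "log", "assert"]


def extract_identifiers(query):
    """Return identifier-like tokens longer than two characters that are not stopwords."""
    tokens = re.findall(r"[A-Za-z_][A-Za-z0-9_]*", query)
    return [t for t in tokens if len(t) > 2 and t.lower() not in STOPWORDS]


def search_methods(conn, query, limit=10):
    """Search nodes_method by extracted names; fall back to common debug terms."""

    def run(terms):
        # One parameterized LIKE clause per term, so query text is never interpolated.
        clause = " OR ".join("name LIKE ?" for _ in terms)
        sql = (
            "SELECT name, filename FROM nodes_method "
            f"WHERE {clause} ORDER BY name LIMIT ?"
        )
        return conn.execute(sql, [f"%{t}%" for t in terms] + [limit]).fetchall()

    names = extract_identifiers(query)
    rows = run(names) if names else []
    return rows or run(COMMON_DEBUG_TERMS)
```

Note that `_` is a single-character wildcard in SQL LIKE, so a name such as heap_insert matches slightly more loosely than an exact substring search would.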

Domain-Agnostic Pattern Loading

All debugging patterns are loaded from the active domain plugin — no hardcoded domain-specific symbols. get_debug_patterns_from_plugin() in debugging/patterns.py provides 7 categories with functions and keywords:

| Plugin Method | Returns |
|---|---|
| domain.get_all_debug_functions() | Debug functions by category |
| domain.get_breakpoint_functions(topic) | Breakpoint-worthy functions |
| domain.get_error_levels() | Error level strings |
| domain.get_debug_query_patterns(query_type) | Category-specific query patterns |
| domain.get_breakpoint_subsystem_patterns() | Subsystem patterns for breakpoint selection |
| domain.get_debugging_keywords() | Domain-specific debugging keywords |
| domain.get_debugging_section_keywords() | Section keywords for fallback response |

Generic defaults (printf, assert, trace, etc.) are used when no domain plugin is available.
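
The plugin-or-defaults behavior might look like the sketch below. The method name get_all_debug_functions() is taken from the list above; the DomainPlugin protocol, the load_debug_functions helper, and the grouping of printf, assert, and trace into categories are assumptions made for illustration.

```python
from typing import Dict, List, Optional, Protocol


class DomainPlugin(Protocol):
    """Minimal view of a domain plugin; only the one method used in this sketch."""

    def get_all_debug_functions(self) -> Dict[str, List[str]]: ...


# Generic defaults named in the text; the category assignment is an assumption.
GENERIC_DEFAULTS: Dict[str, List[str]] = {
    "logging": ["printf"],
    "assertion": ["assert"],
    "trace": ["trace"],
}


def load_debug_functions(domain: Optional[DomainPlugin]) -> Dict[str, List[str]]:
    """Return plugin-provided debug functions, or copies of the generic defaults."""
    if domain is None:
        return {category: list(names) for category, names in GENERIC_DEFAULTS.items()}
    return domain.get_all_debug_functions()
```

Returning copies of the defaults keeps callers from mutating the shared table by accident.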

DebuggingHandler Analysis Methods

DebuggingHandler(BaseHandler) in debugging_handlers/handlers/base.py provides 4 analysis methods used by the registered handlers:

Breakpoint Suggestion

_suggest_breakpoint_locations(target_function, issue_type, topic, limit=15) — suggests breakpoint locations with priority levels:

  • If target_function is provided: adds the function itself (high priority), its callers (medium), and its callees (medium) from CPG
  • Loads domain-specific breakpoint functions via _get_breakpoint_functions_from_plugin(topic)
  • Each suggestion has: function, reason, priority (high/medium/low), type (entry_point/caller/callee/checkpoint/topic_function)
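
The steps above can be sketched as follows. The suggestion fields and the high/medium priorities come from the description; the FakeCPG stand-in, the return shape of get_callers() and get_callees(), the reason strings, and the "low" priority for topic functions are assumptions.

```python
from dataclasses import dataclass


@dataclass
class Suggestion:
    function: str
    reason: str
    priority: str  # high / medium / low
    type: str      # entry_point / caller / callee / checkpoint / topic_function


class FakeCPG:
    """Tiny in-memory call graph used only for this sketch."""

    def __init__(self, edges):
        self.edges = edges  # list of (caller, callee) pairs

    def get_callers(self, name):
        return [caller for caller, callee in self.edges if callee == name]

    def get_callees(self, name):
        return [callee for caller, callee in self.edges if caller == name]


def suggest_breakpoints(cpg, target_function=None, topic_functions=(), limit=15):
    """Build an ordered suggestion list: target first, then callers, callees, topic functions."""
    out = []
    if target_function:
        out.append(Suggestion(target_function, "Target function entry", "high", "entry_point"))
        for caller in cpg.get_callers(target_function):
            out.append(Suggestion(caller, f"Calls {target_function}", "medium", "caller"))
        for callee in cpg.get_callees(target_function):
            out.append(Suggestion(callee, f"Called by {target_function}", "medium", "callee"))

    # Add domain-specific functions that are not already suggested.
    seen = {s.function for s in out}
    for fn in topic_functions:
        if fn not in seen:
            out.append(Suggestion(fn, "Domain breakpoint function", "low", "topic_function"))
            seen.add(fn)
    return out[:limit]
```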

Call Stack Analysis

_analyze_call_stack(start_function, depth=5) — analyzes call chains from a starting function:

  • Gets callers (who calls this function) via cpg.get_callers()
  • Gets callees (what this function calls) via cpg.get_callees()
  • Returns callers_chain and callees_chain lists with function names and file paths
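
One way to read the depth parameter is as a bound on a breadth-first walk in each direction. That interpretation, the FakeCPG class, the (name, filename) tuples returned by its methods, and the extra level field are assumptions of this sketch; only the callers_chain and callees_chain result keys come from the description above.

```python
class FakeCPG:
    """In-memory call graph stand-in; methods return (function, filename) pairs."""

    def __init__(self, edges, files):
        self.edges = edges  # list of (caller, callee) pairs
        self.files = files  # function name -> file path

    def get_callers(self, name):
        return [(a, self.files.get(a, "")) for a, b in self.edges if b == name]

    def get_callees(self, name):
        return [(b, self.files.get(b, "")) for a, b in self.edges if a == name]


def walk(start, neighbors, depth):
    """Breadth-first walk up to `depth` levels, skipping functions already visited."""
    chain, seen, frontier = [], {start}, [start]
    for level in range(1, depth + 1):
        next_frontier = []
        for fn in frontier:
            for name, filename in neighbors(fn):
                if name not in seen:
                    seen.add(name)
                    chain.append({"function": name, "filename": filename, "level": level})
                    next_frontier.append(name)
        if not next_frontier:
            break
        frontier = next_frontier
    return chain


def analyze_call_stack(cpg, start_function, depth=5):
    """Collect caller and callee chains around start_function."""
    return {
        "callers_chain": walk(start_function, cpg.get_callers, depth),
        "callees_chain": walk(start_function, cpg.get_callees, depth),
    }
```

The visited set keeps recursive or mutually recursive functions from producing an endless chain.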

Variable Flow Tracing

_trace_variable_flow(variable_name, function_name) — traces variable flow through function scopes:

  • Looks up the function in CPG via cpg.find_method_by_name()
  • Gets callees to identify where variable values may propagate
  • Returns flow points with type (function_scope, potential_propagation), filename, and line number

Debugging Strategy Generation

_generate_debugging_strategy(issue_type, target_function) — generates issue-specific debugging strategies:

| Issue Type | Recommended Steps | Tools |
|---|---|---|
| crash | Check null pointers, memory allocation, error paths | gdb, valgrind, AddressSanitizer |
| hang | Check infinite loops, lock patterns, deadlock conditions | gdb, strace, ThreadSanitizer |
| memory_leak | Find unmatched allocations, check early returns, verify error path cleanup | valgrind, LeakSanitizer, massif |
| General | Step through execution, examine variables, trace call stack | gdb, lldb |
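
The strategy selection amounts to a lookup with a general default, roughly as sketched here. The steps and tools are copied from the list above; the dictionary layout, the generate_strategy name, and the assumption that any unlisted issue type (for example performance or logic_error) falls through to the general entry are illustrative.

```python
STRATEGIES = {
    "crash": {
        "steps": ["Check null pointers", "Check memory allocation", "Check error paths"],
        "tools": ["gdb", "valgrind", "AddressSanitizer"],
    },
    "hang": {
        "steps": ["Check infinite loops", "Check lock patterns", "Check deadlock conditions"],
        "tools": ["gdb", "strace", "ThreadSanitizer"],
    },
    "memory_leak": {
        "steps": ["Find unmatched allocations", "Check early returns", "Verify error path cleanup"],
        "tools": ["valgrind", "LeakSanitizer", "massif"],
    },
}

GENERAL = {
    "steps": ["Step through execution", "Examine variables", "Trace call stack"],
    "tools": ["gdb", "lldb"],
}


def generate_strategy(issue_type=None, target_function=None):
    """Pick the issue-specific plan, defaulting to the general one."""
    plan = STRATEGIES.get(issue_type, GENERAL)
    return {
        "issue_type": issue_type if issue_type in STRATEGIES else "general",
        "target_function": target_function,
        "steps": list(plan["steps"]),
        "tools": list(plan["tools"]),
    }
```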

Issue Type Detection

DebuggingIntentDetector._extract_issue_type() classifies the problem type from the query using morphological matching with EN+RU keywords:

| Issue Type | EN Keywords | RU Keywords |
|---|---|---|
| crash | crash, segfault, core dump | краш, падение, дамп |
| hang | hang, freeze, stuck, deadlock | зависание, заморозка, взаимоблокировка |
| memory_leak | memory leak, leak, memory growth | утечка памяти, утечка, рост памяти |
| performance | slow, performance, bottleneck | медленный, производительность, узкое место |
| logic_error | wrong result, incorrect, bug, error | неправильный, некорректный, ошибка, баг |

The detected issue type influences breakpoint suggestions (which functions to prioritize) and debugging strategy (which tools and steps to recommend).
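
A simplified sketch of the classification, using the EN and RU keywords listed above. The real detector uses morphological matching; this version only matches each keyword at a word start, so inflected Russian forms will often be missed. The scoring rule (most keyword hits wins, earlier entries win ties) and the detect_issue_type name are assumptions.

```python
import re

ISSUE_KEYWORDS = {
    "crash": ["crash", "segfault", "core dump", "краш", "падение", "дамп"],
    "hang": ["hang", "freeze", "stuck", "deadlock",
             "зависание", "заморозка", "взаимоблокировка"],
    "memory_leak": ["memory leak", "leak", "memory growth",
                    "утечка памяти", "утечка", "рост памяти"],
    "performance": ["slow", "performance", "bottleneck",
                    "медленный", "производительность", "узкое место"],
    "logic_error": ["wrong result", "incorrect", "bug", "error",
                    "неправильный", "некорректный", "ошибка", "баг"],
}


def detect_issue_type(query):
    """Return the issue type with the most keyword hits, or None if nothing matches."""
    text = query.lower()
    best, best_score = None, 0
    for issue, keywords in ISSUE_KEYWORDS.items():
        # \b anchors the keyword at a word start, so "bug" does not match inside "debug".
        score = sum(1 for kw in keywords if re.search(r"\b" + re.escape(kw), text))
        if score > best_score:
            best, best_score = issue, score
    return best
```

Anchoring at the word start matters here because nearly every query to this scenario contains the word "debug", which would otherwise count as a "bug" hit for logic_error.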

Report Formatters

DebugReportFormatter(DebuggingFormatter) in debugging_handlers/formatters/debug_report.py provides 2 report formats:

  • format_breakpoint_report(report_data, language) — markdown report with target function, issue type, priority breakdown (high/medium/low counts), suggestions table (priority badge, function, type badge, reason), and 3 recommendations
  • format_call_stack_report(report_data, language) — markdown report with target function, summary (caller/callee counts, analysis depth), callers chain list, callees chain list, and 3 recommendations

DebuggingFormatter base class provides badge formatting: format_priority_badge(priority, language) and format_breakpoint_type_badge(bp_type, language) for entry_point, caller, callee, checkpoint types.
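
As an illustration of the breakpoint report layout, the sketch below renders a priority breakdown and a suggestions table in markdown. The report_data keys, the badge text, the Russian labels, and the heading wording are assumptions; the recommendations section and the type badge are omitted for brevity.

```python
from collections import Counter

# Hypothetical badge labels per language.
PRIORITY_LABELS = {
    "en": {"high": "HIGH", "medium": "MEDIUM", "low": "LOW"},
    "ru": {"high": "ВЫСОКИЙ", "medium": "СРЕДНИЙ", "low": "НИЗКИЙ"},
}


def format_priority_badge(priority, language="en"):
    """Render a bracketed priority badge, falling back to English labels."""
    labels = PRIORITY_LABELS.get(language, PRIORITY_LABELS["en"])
    return f"[{labels.get(priority, priority.upper())}]"


def format_breakpoint_report(report_data, language="en"):
    """Build a markdown report from a dict with target_function, issue_type, suggestions."""
    suggestions = report_data["suggestions"]
    counts = Counter(s["priority"] for s in suggestions)
    lines = [
        f"## Breakpoints for `{report_data['target_function']}`",
        "",
        f"Issue type: {report_data.get('issue_type') or 'unspecified'}",
        "",
        f"High: {counts['high']}, Medium: {counts['medium']}, Low: {counts['low']}",
        "",
        "| Priority | Function | Type | Reason |",
        "|---|---|---|---|",
    ]
    for s in suggestions:
        badge = format_priority_badge(s["priority"], language)
        lines.append(f"| {badge} | `{s['function']}` | {s['type']} | {s['reason']} |")
    return "\n".join(lines)
```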

CLI Usage

# Find breakpoint locations for a function
python -m src.cli query "Suggest breakpoints for debugging heap_insert"

# Analyze call stack
python -m src.cli query "Show call stack for ExecProcNode"

# Find error handling functions
python -m src.cli query "Find error handlers in the executor module"

# Trace variable flow
python -m src.cli query "Trace variable flow through palloc"

# Debugging strategy for a crash
python -m src.cli query "How to debug a crash in heap_insert"

# Find logging calls
python -m src.cli query "Find all logging calls at ERROR level"

# Find assertion macros
python -m src.cli query "Find assertion macros in the storage subsystem"

Example Questions

  • “Suggest breakpoints for debugging [function]”
  • “Show call stack for [function]”
  • “Find error handlers in [module]”
  • “What functions does [function] call?”
  • “Trace variable flow through [function]”
  • “How to debug a crash in [function]”
  • “Find all logging calls at ERROR level”
  • “Find assertion macros in [subsystem]”
  • “Debug memory leak in [module]”
  • “Show debugging strategy for deadlock issues”

Related Scenarios

  • Performance (S06): performance profiling and bottleneck analysis
  • Security (S02): security vulnerability detection
  • Refactoring (S05): code smell detection and dead code
  • Onboarding (S01): codebase exploration and understanding

S15 vs S06 vs S02 vs S05: S15 focuses on interactive debugging support — breakpoint suggestions, call stack analysis, error path tracing, and issue-specific debugging strategies (crash, hang, memory leak). S06 focuses on performance profiling — hotspot detection, bottleneck analysis, and optimization recommendations. S02 focuses on security — vulnerability scanning, taint analysis, and compliance checks. S05 focuses on code quality — smell detection, dead code, and refactoring plans. S15 shares CPG queries and call graph analysis with these scenarios but applies them specifically to debugging workflows.