Automated security incident investigation using call graph analysis, taint flow tracing, and attack path visualization.
Overview
The incident response handler analyzes security vulnerabilities by tracing attack paths from entry points to vulnerable functions. It is domain-agnostic — all language-specific data (function names, patterns, entry points) is loaded dynamically from the active domain plugin via DomainRegistry.
The analysis pipeline:
- Query Analysis: extract vulnerable function, match keywords to analysis categories
- Vulnerability Discovery: find critical functions via CPG database queries
- Taint Flow Analysis: trace data flow from sources to sinks via DataFlowTracer
- Attack Path Tracing: find entry points and shortest call chains via CallGraphAnalyzer
- LLM-Assisted Response: generate incident report with remediation recommendations
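The taint flow step can be pictured as a path search over a data-flow graph. The following is a minimal sketch under stated assumptions, not the DataFlowTracer implementation: the graph representation, the helper name `find_taint_paths`, and the sample node names are all invented for illustration.

```python
from typing import Dict, List, Set


def find_taint_paths(
    flows: Dict[str, List[str]],
    sources: Set[str],
    sinks: Set[str],
    max_depth: int = 10,
) -> List[List[str]]:
    """Enumerate simple source-to-sink paths in a data-flow graph (hypothetical helper)."""
    paths: List[List[str]] = []

    def walk(node: str, path: List[str]) -> None:
        if node in sinks:
            paths.append(path)
            return  # stop at the first sink reached on this path
        if len(path) >= max_depth:
            return  # bound the search on deep graphs
        for nxt in flows.get(node, []):
            if nxt not in path:  # skip nodes already on the path to avoid cycles
                walk(nxt, path + [nxt])

    for src in sorted(sources):
        walk(src, [src])
    return paths


# Illustrative data only: these node names are assumptions, not real CPG output.
flows = {
    "recv": ["buf"],
    "buf": ["parse_command", "log_line"],
    "parse_command": ["strcpy"],
}
taint_paths = find_taint_paths(flows, sources={"recv"}, sinks={"strcpy"})
print(taint_paths)
```

The depth limit and the cycle check keep the enumeration finite; a real tracer would likely also account for sanitizers along the path.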
Key components:
| Component | Role |
|---|---|
| CallGraphAnalyzer | Entry point detection (fan_in=0), attack path tracing, PageRank |
| DataFlowTracer | Taint flow analysis (source → sink paths) |
| DomainRegistry | Domain-specific function mappings, taint sources/sinks, entry points |
| LLMInterface | Natural language incident report generation |
| render_attack_path_mermaid() | Mermaid diagram visualization of attack paths |
Quick Start
# Select Incident Response Scenario
/select 14
CLI
python -m src.cli audit --db data/projects/myproject.duckdb --language en
MCP (AI Assistant)
codegraph_query(query="Trace impact of CVE-2023-XXXX in parse_input", scenario_id="scenario_14")
Incident Investigation
CVE Impact Analysis
> Trace the impact of CVE-2023-XXXX in validate_input
╭─────────────── CVE Impact Analysis ─────────────────────────╮
│ │
│ CVE: CVE-2023-XXXX │
│ Vulnerability: Buffer overflow in validate_input() │
│ Severity: CRITICAL (CVSS 9.8) │
│ │
│ Affected Function: │
│ validate_input() │
│ Location: src/server/parser/input.c:234 │
│ │
│ Attack Surface: │
│ Entry Points: 3 │
│ - handle_request() [Network accessible] │
│ - api_process_message() [API endpoint] │
│ - do_internal_call() [Internal API] │
│ │
│ Exploitation Path: │
│ Client → read_message() → handle_request() │
│ → validate_input() [VULNERABLE] │
│ │
│ Blast Radius: │
│ Direct callers: 5 │
│ Transitive impact: 156 functions │
│ │
╰──────────────────────────────────────────────────────────────╯
Trace Exploitation Paths
> Find all paths from network input to vulnerable function
╭─────────────── Exploitation Paths ──────────────────────────╮
│ │
│ Paths from Network to Vulnerable Code: │
│ │
│ Path 1 (Direct, 3 hops): │
│ handle_request() → parse_command() │
│ → validate_input() ⚠️ │
│ │
│ Path 2 (API route, 4 hops): │
│ api_process_message() → route_handler() │
│ → parse_command() │
│ → validate_input() ⚠️ │
│ │
│ Path 3 (Extended protocol, 5 hops): │
│ on_connection() → dispatch_command() │
│ → route_handler() │
│ → parse_command() │
│ → validate_input() ⚠️ │
│ │
│ Total exploitable paths: 3 │
│ │
╰──────────────────────────────────────────────────────────────╯
Attack Path Tracing
CodeGraph automatically discovers entry points and traces shortest call chains from each entry point to the vulnerability. Attack paths are ranked by risk amplification and visualized as Mermaid diagrams.
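Finding the shortest call chain from an entry point to a vulnerable function is a standard breadth-first search over the call graph. The sketch below illustrates that idea only; it is not CallGraphAnalyzer's code. The helper name `shortest_call_chain` and the adjacency-list format are assumptions, and the function names are borrowed from the sample output earlier on this page.

```python
from collections import deque
from typing import Dict, List, Optional


def shortest_call_chain(
    calls: Dict[str, List[str]], entry: str, target: str
) -> Optional[List[str]]:
    """Return the shortest caller-to-callee chain from entry to target, or None."""
    queue = deque([[entry]])
    seen = {entry}
    while queue:
        path = queue.popleft()
        if path[-1] == target:
            return path  # BFS reaches the target via a shortest chain first
        for callee in calls.get(path[-1], []):
            if callee not in seen:
                seen.add(callee)
                queue.append(path + [callee])
    return None  # target is not reachable from this entry point


# Hypothetical call graph: caller -> list of callees.
calls = {
    "handle_request": ["parse_command"],
    "api_process_message": ["route_handler"],
    "on_connection": ["dispatch_command"],
    "dispatch_command": ["route_handler"],
    "route_handler": ["parse_command"],
    "parse_command": ["validate_input"],
}

for entry in ("handle_request", "api_process_message", "on_connection"):
    print(" -> ".join(shortest_call_chain(calls, entry, "validate_input")))
```

Running the search once per entry point yields one chain per entry point, which is the shape of the path listings shown in this section.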
Entry Point Discovery
Entry points are detected using a combination of heuristics:
- Methods with fan_in = 0 (no callers — likely API roots or main functions)
- Methods matching patterns: main, handle_*, on_*, api_*, route_*, test_*, do_*, cmd_*, serve_*, dispatch_*
- Domain plugin entry point functions (via DomainRegistry.get_entry_point_functions())
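The three heuristics above can be combined in a few lines. This is a hedged sketch rather than the actual detection code: it assumes a function qualifies as an entry point when any one heuristic matches, takes fan-in counts as a plain dictionary, and stands in for the domain plugin with an ordinary list. The names `find_entry_points` and `custom_entry` are hypothetical.

```python
from fnmatch import fnmatchcase
from typing import Dict, Iterable, List

# Name patterns listed in the text above.
ENTRY_PATTERNS = [
    "main", "handle_*", "on_*", "api_*", "route_*",
    "test_*", "do_*", "cmd_*", "serve_*", "dispatch_*",
]


def find_entry_points(
    fan_in: Dict[str, int], plugin_entry_points: Iterable[str] = ()
) -> List[str]:
    """Return functions matching at least one entry point heuristic (assumed: any match counts)."""
    plugin = set(plugin_entry_points)
    found = []
    for name, caller_count in fan_in.items():
        no_callers = caller_count == 0
        name_match = any(fnmatchcase(name, pattern) for pattern in ENTRY_PATTERNS)
        if no_callers or name_match or name in plugin:
            found.append(name)
    return sorted(found)


# Invented fan-in data for illustration.
fan_in = {
    "main": 0,
    "handle_request": 2,
    "parse_command": 3,
    "validate_input": 5,
    "orphan_helper": 0,
    "custom_entry": 1,
}
entry_points = find_entry_points(fan_in, plugin_entry_points=["custom_entry"])
print(entry_points)
```

Note that the fan_in = 0 rule also picks up uncalled helpers such as `orphan_helper` here, which is one reason the heuristics are described as indicating likely, not certain, entry points.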
Risk Amplification
Each attack path receives a risk score:
- Shorter paths = higher risk (direct access to vulnerability)
- Higher fan_in entry points = slightly higher risk (more exposed)
- Formula: (1 / max(chain_length, 1)) * (1 + fan_in * 0.1)
The blast radius report includes attack_paths, entry_points_count, and most_exposed_entry_point.
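The formula above translates directly into code. In the sketch below, the function name `risk_score`, the tuple layout, and the fan_in values are assumptions chosen for illustration; only the formula itself comes from the text.

```python
def risk_score(chain_length: int, fan_in: int) -> float:
    """Risk formula from the text: shorter chains and busier entry points score higher."""
    return (1 / max(chain_length, 1)) * (1 + fan_in * 0.1)


# (entry point, chain length, fan_in): the fan_in values are invented for this example.
candidates = [
    ("on_connection", 5, 0),
    ("handle_request", 3, 2),
    ("api_process_message", 4, 1),
]

# Rank attack paths from highest to lowest risk.
ranked = sorted(
    ((name, risk_score(length, fan_in)) for name, length, fan_in in candidates),
    key=lambda item: item[1],
    reverse=True,
)
for name, score in ranked:
    print(f"{name}: {score:.3f}")
```

The `max(chain_length, 1)` guard avoids division by zero for a degenerate chain, and the 0.1 weight means each additional caller of the entry point raises the score by 10% of the base value, so chain length dominates the ranking.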
Attack Path Diagram
> Trace attack paths to strcpy vulnerability
╭─────────────── Attack Paths ─────────────────────────────╮
│ │
│ Entry Point to Vulnerability Call Chains │
│ │
│ | Entry Point | Vulnerability | Length | Risk |│
│ |-------------------|---------------|--------|----------|│
│ | handle_request | strcpy | 3 | 1.000 |│
│ | api_process | strcpy | 4 | 0.500 |│
│ | do_internal_call | strcpy | 5 | 0.200 |│
│ │
│ Path 1: │
│ │
│ ```mermaid │
│ graph TD │
│ N0["handle_request (api.c:45)"]:::entry │
│ N1["parse_command"] │
│ N2["strcpy (parser.c:120)"]:::vuln │
│ N0 --> N1 --> N2 │
│ classDef entry fill:#69f,stroke:#333 │
│ classDef vuln fill:#f33,stroke:#333 │
│ ``` │
│ │
│ Most Exposed Entry Point: handle_request │
│ │
╰───────────────────────────────────────────────────────────╯
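A diagram in the format shown above can be produced from a call chain with simple string formatting. The following sketch is not the source of render_attack_path_mermaid(); the helper name `path_to_mermaid` and its parameters are hypothetical, and it assumes the first node is the entry point and the last node is the vulnerable function.

```python
from typing import Dict, List, Optional


def path_to_mermaid(path: List[str], locations: Optional[Dict[str, str]] = None) -> str:
    """Render one call chain as Mermaid flowchart text (hypothetical helper)."""
    locations = locations or {}
    lines = ["graph TD"]
    last = len(path) - 1
    for i, name in enumerate(path):
        # Append "file:line" to the label when a location is known.
        label = f"{name} ({locations[name]})" if name in locations else name
        # Assumption: first node is the entry point, last node is the vulnerability.
        style = ":::entry" if i == 0 else ":::vuln" if i == last else ""
        lines.append(f'    N{i}["{label}"]{style}')
    if len(path) > 1:
        lines.append("    " + " --> ".join(f"N{i}" for i in range(len(path))))
    lines.append("    classDef entry fill:#69f,stroke:#333")
    lines.append("    classDef vuln fill:#f33,stroke:#333")
    return "\n".join(lines)


diagram = path_to_mermaid(
    ["handle_request", "parse_command", "strcpy"],
    locations={"handle_request": "api.c:45", "strcpy": "parser.c:120"},
)
print(diagram)
```

Labels are wrapped in double quotes so that parentheses and colons in the location suffix are treated as text; a label that itself contains a double quote would need escaping, which this sketch does not handle.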
Remediation Analysis
Find Similar Vulnerabilities
The handler uses LLM-assisted analysis combined with call graph context to identify potentially similar vulnerability patterns. The LLM receives taint flow data, call chain context, and vulnerability metadata to reason about similar code locations. This is not AST-based clone detection — results are advisory and require manual verification.
> Find similar patterns that might have same vulnerability
╭─────────────── Pattern Analysis ────────────────────────────╮
│ │
│ LLM-Assisted Similarity Analysis │
│ │
│ Pattern: Unbounded string copy from network input │
│ │
│ Potentially Similar Locations (review recommended): │
│ │
│ 🔴 HIGH RISK (same pattern): │
│ src/server/parser/grammar.c:567 │
│ src/server/core/main.c:890 │
│ src/server/commands/copy.c:234 │
│ │
│ 🟡 MEDIUM RISK (similar pattern): │
│ src/server/replication/sender.c:456 │
│ src/server/auth/handler.c:789 │
│ │
│ ⚠ Results are LLM-generated and require manual review │
│ │
╰──────────────────────────────────────────────────────────────╯
Patch Impact Assessment
> Assess impact of proposed patch
╭─────────────── Patch Impact ────────────────────────────────╮
│ │
│ Proposed Fix: Add bounds checking to validate_input() │
│ │
│ Patch Location: │
│ File: src/server/parser/input.c │
│ Lines: 234-240 │
│ │
│ Functional Impact: │
│ - Input length now limited to MAX_INPUT_SIZE │
│ - Error raised for oversized inputs │
│ │
│ Blast Radius (from call graph): │
│ - Direct callers: 5 │
│ - Transitive callers: 156 │
│ - Entry points affected: 3 │
│ │
│ Risk Level: LOW │
│ - Defensive check only │
│ - No behavioral change for valid inputs │
│ │
╰──────────────────────────────────────────────────────────────╯
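The blast radius figures in a report like this one can be derived by walking the call graph backwards from the patched function. The sketch below shows one way to do that under stated assumptions: the helper name `blast_radius` is hypothetical, transitive callers are assumed to include direct callers, and an affected entry point is assumed to be a transitive caller that has no callers of its own.

```python
from collections import deque
from typing import Dict, List, Set


def blast_radius(calls: Dict[str, List[str]], target: str) -> Dict[str, int]:
    """Count direct callers, transitive callers, and caller-less roots of target."""
    # Invert the caller -> callees map into callee -> callers.
    callers: Dict[str, Set[str]] = {}
    for caller, callees in calls.items():
        for callee in callees:
            callers.setdefault(callee, set()).add(caller)

    direct = callers.get(target, set()) - {target}  # ignore self-recursion
    seen: Set[str] = set()
    queue = deque(direct)
    while queue:
        fn = queue.popleft()
        if fn in seen or fn == target:
            continue
        seen.add(fn)
        queue.extend(callers.get(fn, ()))

    roots = {fn for fn in seen if not callers.get(fn)}
    return {
        "direct_callers": len(direct),
        "transitive_callers": len(seen),
        "entry_points_affected": len(roots),
    }


# Hypothetical call graph: caller -> list of callees.
calls = {
    "handle_request": ["parse_command"],
    "api_process_message": ["route_handler"],
    "on_connection": ["dispatch_command"],
    "dispatch_command": ["route_handler"],
    "route_handler": ["parse_command"],
    "parse_command": ["validate_input"],
}
radius = blast_radius(calls, "validate_input")
print(radius)
```

On this small graph the result is 1 direct caller, 6 transitive callers, and 3 affected entry points; the much larger counts in the sample report reflect a full codebase.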
Forensic Analysis
Trace Affected Data
The forensic analysis section is LLM-generated based on vulnerability context, taint flow data, and call graph information. The LLM reasons about potential data exposure, privilege levels, and indicators of compromise. These results are advisory — they represent informed analysis, not computed facts from the codebase.
> What data could have been accessed through this vulnerability?
╭─────────────── Data Access Analysis ────────────────────────╮
│ │
│ LLM-Assisted Forensic Analysis │
│ │
│ Potential Data Exposure (based on taint flow context): │
│ │
│ Memory Regions Accessible: │
│ ├── Input buffer (direct access) │
│ ├── Connection state (adjacent memory) │
│ ├── Authentication tokens (same context) │
│ └── Other session data (stack frames) │
│ │
│ Privilege Level: │
│ - Runs as: server backend process │
│ - Can access: All data files │
│ - Cannot access: OS-level without escalation │
│ │
│ Indicators of Compromise: │
│ - Requests with unusual binary content │
│ - Unexpected memory allocation errors │
│ - Crash logs with validate_input in stack trace │
│ │
│ ⚠ This analysis is LLM-generated and advisory in nature │
│ │
╰──────────────────────────────────────────────────────────────╯
Incident Timeline
Generate Incident Report
> Generate incident response report
╭─────────────── Incident Report ─────────────────────────────╮
│ │
│ SECURITY INCIDENT REPORT │
│ Generated: 2024-12-20 14:30 UTC │
│ │
│ Vulnerability Summary: │
│ Type: Buffer Overflow (CWE-120) │
│ Severity: CRITICAL │
│ Affected: validate_input() in input.c │
│ │
│ Analysis Results: │
│ Critical vulnerabilities: 3 │
│ Taint flow paths: 12 │
│ Attack paths traced: 3 │
│ Most exposed entry point: handle_request │
│ │
│ Remediation: │
│ 1. Apply bounds checking to validate_input() │
│ 2. Review similar patterns (5 locations) │
│ 3. Update security documentation │
│ │
╰──────────────────────────────────────────────────────────────╯
Example Questions
- “Trace the impact of [CVE] in [function]”
- “Find exploitation paths to [function]”
- “Trace attack paths to [vulnerable function]”
- “Find entry points that reach [function]”
- “Find similar vulnerability patterns”
- “What data could be accessed through [vulnerability]?”
- “Assess impact of patch to [function]”
- “Generate incident report”
Related Scenarios
- Security Audit - Proactive security analysis
- Entry Points - Attack surface mapping
- Code Review - Review security patches