Scenario 02: Security Audit

Security engineer performing vulnerability assessment, taint analysis, and code hardening checks using CPG-powered analysis.

Quick Start

# Select Security Audit Scenario
/select 02

How It Works

Intent Classification

When you ask a security question, the SecurityIntentDetector classifies it into one of 12 specific intents using keyword matching (supports both English and Russian queries):

| Intent                | Keywords (EN)                   | Keywords (RU)                         | Handler                  |
|-----------------------|---------------------------------|---------------------------------------|--------------------------|
| hardcoded_credentials | hardcoded, credentials, secret  | захардкоженный, жестко закодированный | VulnerabilityScanHandler |
| entry_points          | entry point, attack surface     | точка входа, поверхность атаки        | EntryPointsHandler       |
| taint_analysis        | taint, data flow, trace         | поток данных, распространение         | TaintAnalysisHandler     |
| info_disclosure       | information disclosure, leak    | раскрытие информации, утечка          | InfoDisclosureHandler    |
| null_check            | null, dereference               | нулевой указатель, разыменование      | NullCheckHandler         |
| memory_lifetime       | use after free, double free     | использование после освобождения      | MemoryLifetimeHandler    |
| race_condition        | race condition, TOCTOU          | гонка, состояние гонки                | RaceConditionHandler     |
| weak_crypto           | weak random, PRNG, entropy      | слабое случайное число                | WeakCryptoHandler        |
| denial_of_service     | DoS, resource exhaustion        | отказ в обслуживании                  | DoSHandler               |
| vulnerability_scan    | vulnerability, CWE, injection   | уязвимость, инъекция                  | VulnerabilityScanHandler |
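
The keyword matching described above can be pictured with a minimal sketch. `INTENT_KEYWORDS` and `classify_intent` are hypothetical names chosen for illustration, not the actual SecurityIntentDetector API; only a few rows of the table are reproduced, and the tie-breaking rule (first match in table order) is an assumption.

```python
# Illustrative sketch only: not the real SecurityIntentDetector.
from typing import Dict, List, Optional

# A few rows copied from the intent table (English and Russian keywords).
INTENT_KEYWORDS: Dict[str, List[str]] = {
    "hardcoded_credentials": ["hardcoded", "credentials", "secret", "захардкоженный"],
    "entry_points": ["entry point", "attack surface", "точка входа"],
    "taint_analysis": ["taint", "data flow", "trace", "поток данных"],
    "race_condition": ["race condition", "toctou", "гонка"],
    "vulnerability_scan": ["vulnerability", "cwe", "injection", "уязвимость"],
}


def classify_intent(query: str) -> Optional[str]:
    """Return the first intent whose keyword occurs in the query, else None."""
    text = query.lower()
    for intent, keywords in INTENT_KEYWORDS.items():
        # Assumption: plain case-insensitive substring matching.
        if any(keyword in text for keyword in keywords):
            return intent
    return None
```

With these rows, `classify_intent("Find SQL injection vulnerabilities")` returns `"vulnerability_scan"` and an unrelated query returns `None`. Substring matching is the simplest choice; a real detector may well tokenize or weight keywords instead.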

Handler Architecture

The security audit scenario uses a handler-based architecture with 11 parallel handlers, each registered with a priority:

User Query
    |
    v
SecurityIntentDetector → classify intent (12 types)
    |
    v
security_registry → route to handler by priority
    |
    +-- [4]  EntryPointsHandler      → attack surface discovery
    +-- [5]  TaintSourcesHandler     → source/sink reference
    +-- [10] VulnerabilityScanHandler → pattern-based scanning
    +-- [20] TaintAnalysisHandler    → deep dataflow tracing
    +-- [25] NullCheckHandler        → null pointer dereferences
    +-- [26] InfoDisclosureHandler   → information leaks
    +-- [27] MemoryLifetimeHandler   → use-after-free, double-free
    +-- [28] RaceConditionHandler    → concurrency issues
    +-- [29] WeakCryptoHandler       → weak PRNG/entropy
    +-- [30] DoSHandler              → resource exhaustion
    +-- [--] AutofixHandler          → fix generation (separate)

Lower priority number = higher precedence. Each handler has a paired formatter for structured output.
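
As a rough illustration of priority-based routing, the sketch below keeps registrations sorted by priority number and returns the first handler that accepts an intent. `Registration`, `HandlerRegistry`, and the `can_handle` predicates are assumptions made for this example; the real security_registry may be organized differently.

```python
# Illustrative sketch only: class and method names are hypothetical.
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass(order=True)
class Registration:
    priority: int  # the only field used for ordering
    name: str = field(compare=False)
    can_handle: Callable[[str], bool] = field(compare=False)


class HandlerRegistry:
    def __init__(self) -> None:
        self._registrations: List[Registration] = []

    def register(self, priority: int, name: str,
                 can_handle: Callable[[str], bool]) -> None:
        self._registrations.append(Registration(priority, name, can_handle))
        self._registrations.sort()  # lowest priority number first

    def route(self, intent: str) -> Optional[str]:
        """Return the highest-precedence handler that accepts the intent."""
        for registration in self._registrations:
            if registration.can_handle(intent):
                return registration.name
        return None


registry = HandlerRegistry()
registry.register(20, "TaintAnalysisHandler", lambda i: i == "taint_analysis")
registry.register(10, "VulnerabilityScanHandler",
                  lambda i: i in {"vulnerability_scan", "hardcoded_credentials"})
registry.register(4, "EntryPointsHandler", lambda i: i == "entry_points")
```

Here `registry.route("hardcoded_credentials")` returns `"VulnerabilityScanHandler"`. If two handlers accept the same intent, the one with the lower number wins because it is checked first.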

Security Analysis Classes

| Class                | Module                       | Purpose                                                  |
|----------------------|------------------------------|----------------------------------------------------------|
| SecurityScanner      | src/security/                | CPG pattern execution, returns SecurityFinding list      |
| HardeningScanner     | src/security/hardening/      | D3FEND compliance checks (11 techniques)                 |
| TaintPropagator      | src/analysis/dataflow/taint/ | Field-sensitive taint analysis with symbolic execution   |
| TaintVerifiedScanner | src/security/                | High-confidence taint findings with sanitization tracking |
| AutofixEngine        | src/analysis/autofix/        | Three-strategy fix generation (SSR → template → LLM)     |
| SARIFExporter        | src/security/                | SARIF 2.1.0 export with codeFlows and OWASP tags         |

Vulnerability Scanning

SQL Injection Detection

> Find SQL injection vulnerabilities

╭─────────────── Security Findings ─────────────────────────╮
│                                                           │
│  CRITICAL: SQL Injection                                  │
│                                                           │
│     Location: src/pl/plpgsql/src/pl_exec.c:4567           │
│     Pattern: Dynamic query with string concatenation      │
│     Code:                                                 │
│       sprintf(query, "SELECT * FROM %s", table_name);     │
│                                                           │
│     Risk: User-controlled table_name can inject SQL       │
│     Fix: Use quote_identifier() for table names           │
│     CWE: CWE-89                                           │
│                                                           │
│  Total findings: 7 critical, 12 high, 34 medium           │
│                                                           │
╰───────────────────────────────────────────────────────────╯

Buffer Overflow Detection

> Find buffer overflow risks in string functions
> Show functions using sprintf without bounds

Taint Flow Analysis

Identify Entry Points

> Find all external entry points

╭─────────────── Entry Points ──────────────────────────────╮
│                                                           │
│  Network Entry Points:                                    │
│    - pq_getmsgstring()     - Read string from client      │
│    - pq_getmsgint()        - Read int from client         │
│    - ProcessClientRead()   - Raw socket read              │
│                                                           │
│  SQL Entry Points:                                        │
│    - exec_simple_query()   - Direct SQL execution         │
│    - exec_parse_message()  - Prepared statement           │
│    - exec_bind_message()   - Parameter binding            │
│                                                           │
│  File Entry Points:                                       │
│    - pg_read_file()        - Read arbitrary file          │
│    - pg_ls_dir()           - List directory               │
│                                                           │
│  Total: 47 entry points identified                        │
│                                                           │
╰───────────────────────────────────────────────────────────╯

Trace Data Flow

The TaintPropagator traces data from untrusted sources to dangerous sinks with field-sensitive tracking and symbolic execution for path feasibility:

> Trace data flow from PQgetvalue to SQL execution

╭─────────────── Taint Flow ────────────────────────────────╮
│                                                           │
│  SOURCE: PQgetvalue() [Client Input]                      │
│      ↓                                                    │
│  pq_getmsgstring()                                        │
│      ↓                                                    │
│  exec_simple_query()                                      │
│      ↓                                                    │
│  pg_parse_query()                                         │
│      ↓                                                    │
│  SINK: SPI_execute() [SQL Execution]                      │
│                                                           │
│  Risk Level: HIGH                                         │
│  Recommendation: Add input validation at entry point      │
│                                                           │
╰───────────────────────────────────────────────────────────╯
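
The source-to-sink idea can be sketched as a plain graph search. This is a deliberate simplification: it performs breadth-first reachability over a hand-written edge map and skips sanitizer nodes, whereas the TaintPropagator is described as field-sensitive and backed by symbolic execution. `FLOW_EDGES` and `find_taint_path` are hypothetical names; the edges simply mirror the example output above.

```python
# Illustrative sketch only: reachability search, not the real TaintPropagator.
from collections import deque
from typing import Dict, FrozenSet, List, Optional, Set

# Hypothetical data-flow edges, copied from the example taint flow.
FLOW_EDGES: Dict[str, List[str]] = {
    "PQgetvalue": ["pq_getmsgstring"],
    "pq_getmsgstring": ["exec_simple_query"],
    "exec_simple_query": ["pg_parse_query"],
    "pg_parse_query": ["SPI_execute"],
}


def find_taint_path(edges: Dict[str, List[str]], source: str, sink: str,
                    sanitizers: FrozenSet[str] = frozenset()) -> Optional[List[str]]:
    """Return a shortest source-to-sink path that avoids sanitizers, or None."""
    queue = deque([[source]])
    visited: Set[str] = {source}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == sink:
            return path
        for successor in edges.get(node, []):
            # A sanitizer is assumed to cut the taint, so do not walk through it.
            if successor in visited or successor in sanitizers:
                continue
            visited.add(successor)
            queue.append(path + [successor])
    return None
```

Calling `find_taint_path(FLOW_EDGES, "PQgetvalue", "SPI_execute")` reproduces the five-step chain shown in the box; marking any intermediate function as a sanitizer makes the search return `None`.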

D3FEND Source Code Hardening

11 D3FEND Techniques

The HardeningScanner analyzes code for 11 MITRE D3FEND Source Code Hardening techniques:

| ID      | Technique                     | Description                       | CWE     |
|---------|-------------------------------|-----------------------------------|---------|
| D3-VI   | Variable Initialization       | Uninitialized variables           | CWE-457 |
| D3-CS   | Credential Scrubbing          | Hardcoded credentials             | CWE-798 |
| D3-IRV  | Integer Range Validation      | Integer overflow risks            | CWE-190 |
| D3-PV   | Pointer Validation            | Pointer dereference without check | CWE-476 |
| D3-RN   | Reference Nullification       | Use-after-free risks              | CWE-416 |
| D3-TL   | Trusted Library               | Unsafe function usage             | CWE-676 |
| D3-VTV  | Variable Type Validation      | Type confusion vulnerabilities    | CWE-843 |
| D3-MBSV | Memory Block Start Validation | Out-of-bounds access              | CWE-787 |
| D3-NPC  | Null Pointer Checking         | Missing NULL checks               | CWE-476 |
| D3-DLV  | Domain Logic Validation       | Business logic flaws              | CWE-20  |
| D3-OLV  | Operational Logic Validation  | Operational invariant violations  | CWE-670 |

Full Hardening Audit

> Run D3FEND hardening compliance check

╭─────────────── D3FEND Compliance Report ─────────────────────╮
│                                                               │
│  Overall Compliance Score: 72.5%                              │
│                                                               │
│  Findings by Technique:                                       │
│                                                               │
│  D3-VI (Variable Initialization): 23 issues                   │
│  D3-TL (Trusted Library): 12 issues                           │
│  D3-NPC (Null Pointer Checking): 8 issues                     │
│  D3-RN (Reference Nullification): 6 issues                    │
│                                                               │
│  Category Scores:                                             │
│    Initialization: 65%                                        │
│    Memory Safety: 78%                                         │
│    Pointer Safety: 82%                                        │
│    Library Safety: 58%                                        │
│                                                               │
╰───────────────────────────────────────────────────────────────╯

Check Unsafe Functions (D3-TL)

> Check for unsafe function usage (D3-TL Trusted Library)

╭─────────────── D3-TL: Trusted Library ───────────────────────╮
│                                                               │
│  CRITICAL - strcpy (buffer overflow risk):                    │
│    src/backend/utils/adt/varlena.c:234                        │
│    src/backend/libpq/pqformat.c:567                           │
│                                                               │
│  CRITICAL - sprintf (buffer overflow risk):                   │
│    src/backend/libpq/auth.c:567                               │
│                                                               │
│  Remediation:                                                 │
│    - strcpy → strncpy/strlcpy                                 │
│    - sprintf → snprintf                                       │
│                                                               │
╰───────────────────────────────────────────────────────────────╯

Check Null Pointer Safety (D3-NPC)

> Find null pointer vulnerabilities (D3-NPC)

╭─────────────── D3-NPC: Null Pointer Checking ────────────────╮
│                                                               │
│  Missing NULL Checks After Allocation: 23                     │
│                                                               │
│  malloc without check:                                        │
│    src/backend/utils/mmgr/aset.c:345                          │
│      char *buf = malloc(size);                                │
│      use(buf);  // ← No NULL check!                           │
│                                                               │
│  Example Fix:                                                 │
│    char *buf = malloc(size);                                  │
│    if (buf == NULL) {                                         │
│        ereport(ERROR, (errcode(ERRCODE_OUT_OF_MEMORY)));      │
│    }                                                          │
│                                                               │
╰───────────────────────────────────────────────────────────────╯

Language-Specific Security Patterns

Supported Languages (11)

CodeGraph supports security analysis for 11 programming languages with CWE mappings, plus framework-specific hypothesis providers:

| Language              | Patterns                                                     | Key CWEs                             |
|-----------------------|--------------------------------------------------------------|--------------------------------------|
| C/C++                 | Buffer overflow, format string, UAF, command injection       | CWE-120, CWE-134, CWE-416, CWE-78    |
| Python/Django         | SQL injection, XSS, CSRF, deserialization                    | CWE-89, CWE-79, CWE-352, CWE-502     |
| JavaScript/TypeScript | XSS, prototype pollution, eval injection, SSRF               | CWE-79, CWE-1321, CWE-94, CWE-918    |
| Go                    | Race conditions, SQL injection, path traversal, insecure TLS | CWE-362, CWE-89, CWE-22, CWE-295     |
| Ruby/Rails            | Eval injection, YAML deserialization, mass assignment        | CWE-94, CWE-502, CWE-915             |
| C#/.NET               | SQL injection, XSS, insecure deserialization, XXE            | CWE-89, CWE-79, CWE-502, CWE-611     |
| Kotlin/Android        | WebView XSS, intent redirection, insecure storage            | CWE-79, CWE-927, CWE-312             |
| Swift/iOS             | Keychain misuse, URL scheme hijacking, TLS bypass            | CWE-312, CWE-939, CWE-295            |
| Java                  | SQL injection, deserialization, XXE, LDAP injection          | CWE-89, CWE-502, CWE-611, CWE-90     |
| PHP                   | SQL injection, XSS, command injection, path traversal        | CWE-89, CWE-79, CWE-78, CWE-22       |
| 1C (OneC)             | SQL injection, command injection, path traversal             | CWE-89, CWE-78, CWE-22               |

Framework-specific providers (via Hypothesis System): Django, Express.js, Next.js, Spring, Gin, PostgreSQL.

CWE Coverage (69 CWEs)

The security pattern database covers 69 unique CWE identifiers across all languages:

Injection Vulnerabilities:

  • CWE-78: OS Command Injection
  • CWE-89: SQL Injection
  • CWE-94: Code Injection
  • CWE-134: Format String

Web Vulnerabilities:

  • CWE-79: Cross-Site Scripting (XSS)
  • CWE-352: Cross-Site Request Forgery (CSRF)
  • CWE-502: Insecure Deserialization
  • CWE-611: XML External Entity (XXE)
  • CWE-918: Server-Side Request Forgery (SSRF)
  • CWE-1321: Prototype Pollution

Memory Safety (C/C++):

  • CWE-120: Buffer Overflow
  • CWE-416: Use After Free
  • CWE-476: NULL Pointer Dereference
  • CWE-190: Integer Overflow
  • CWE-787: Out-of-bounds Write

Authentication/Authorization:

  • CWE-798: Hardcoded Credentials
  • CWE-284: Improper Access Control
  • CWE-862: Missing Authorization

Full list includes 69 CWEs spanning injection, memory safety, web, crypto, concurrency, and access control categories.

CAPEC Attack Pattern Coverage

CodeGraph’s security patterns are defined through CWE identifiers. Each CWE maps to corresponding CAPEC attack patterns via the standard MITRE CWE→CAPEC mapping:

| CWE     | CAPEC     | Attack Pattern                     |
|---------|-----------|------------------------------------|
| CWE-89  | CAPEC-66  | SQL Injection                      |
| CWE-79  | CAPEC-86  | XSS Through HTTP Headers           |
| CWE-78  | CAPEC-88  | OS Command Injection               |
| CWE-120 | CAPEC-100 | Buffer Overflow                    |
| CWE-22  | CAPEC-126 | Path Traversal                     |
| CWE-502 | CAPEC-586 | Object Injection (Deserialization) |
| CWE-918 | CAPEC-664 | Server-Side Request Forgery        |

The Hypothesis System generates security hypotheses for these attack scenarios with framework-specific providers, enabling targeted vulnerability detection beyond static pattern matching.
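
The CWE→CAPEC lookup in the table can be expressed as a small dictionary. The sketch below is only an illustration of how findings might be annotated with attack patterns; `CWE_TO_CAPEC` and `attack_patterns` are hypothetical names, and the entries are limited to the rows listed in this section.

```python
# Illustrative sketch only: a lookup over the rows shown in the CAPEC table.
from typing import Dict, Iterable, List, Tuple

CWE_TO_CAPEC: Dict[str, Tuple[str, str]] = {
    "CWE-89": ("CAPEC-66", "SQL Injection"),
    "CWE-79": ("CAPEC-86", "XSS Through HTTP Headers"),
    "CWE-78": ("CAPEC-88", "OS Command Injection"),
    "CWE-120": ("CAPEC-100", "Buffer Overflow"),
    "CWE-22": ("CAPEC-126", "Path Traversal"),
    "CWE-502": ("CAPEC-586", "Object Injection (Deserialization)"),
    "CWE-918": ("CAPEC-664", "Server-Side Request Forgery"),
}


def attack_patterns(cwe_ids: Iterable[str]) -> List[Tuple[str, str, str]]:
    """Return (CWE, CAPEC, pattern name) for every CWE present in the table."""
    return [(cwe, *CWE_TO_CAPEC[cwe]) for cwe in cwe_ids if cwe in CWE_TO_CAPEC]
```

CWEs without a row in the table are skipped rather than guessed, so `attack_patterns(["CWE-89", "CWE-999"])` yields a single SQL Injection entry.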

Language Examples

Python/Django:

# Switch to Django project
/project switch my_django_app

# Run security queries
> Find SQL injection vulnerabilities in views
> Check for XSS in templates
> Find endpoints without CSRF protection

JavaScript/TypeScript:

> Find prototype pollution vulnerabilities
> Check for XSS via innerHTML
> Find eval() usage with user input

Go:

> Find race conditions in goroutines
> Check for SQL injection in database queries
> Find insecure TLS configurations

OWASP Top 10 Compliance

Security audit results are automatically enriched with OWASP Top 10 2021 mapping via owasp_mapping.py (69 CWEs mapped to A01–A10):

  • Each finding receives an OWASP-Axx tag (e.g., OWASP-A03 for Injection)
  • A compliance table summarizes pass/fail status per OWASP category
  • SARIF export includes OWASP tags on reportingDescriptor rules

Example compliance output:

| OWASP Category                | Status | Findings |
|-------------------------------|--------|----------|
| A01 Broken Access Control     | PASS   | 0        |
| A02 Cryptographic Failures    | FAIL   | 2        |
| A03 Injection                 | FAIL   | 5        |
| A04 Insecure Design           | PASS   | 0        |
| A05 Security Misconfiguration | WARN   | 1        |
| ...                           | ...    | ...      |
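
A compliance table of this shape could be derived by tallying findings per OWASP category. The sketch below is a guess at the general idea, not the contents of owasp_mapping.py: `CWE_TO_OWASP` holds only a handful of CWE assignments from the OWASP Top 10 2021 lists, `compliance_table` is a hypothetical helper, and the WARN status is left out because the rule that produces it is not described here.

```python
# Illustrative sketch only: partial mapping, PASS/FAIL statuses only.
from collections import Counter
from typing import Dict, Iterable, List, Tuple

# Partial CWE -> OWASP Top 10 2021 mapping (the real file is said to cover 69 CWEs).
CWE_TO_OWASP: Dict[str, str] = {
    "CWE-22": "A01", "CWE-284": "A01", "CWE-862": "A01",
    "CWE-78": "A03", "CWE-79": "A03", "CWE-89": "A03", "CWE-94": "A03",
    "CWE-611": "A05",
    "CWE-798": "A07",
    "CWE-502": "A08",
    "CWE-918": "A10",
}

CATEGORIES: List[str] = [f"A{number:02d}" for number in range(1, 11)]


def compliance_table(finding_cwes: Iterable[str]) -> List[Tuple[str, str, int]]:
    """Return (category, status, finding count) rows for A01 through A10."""
    counts = Counter(CWE_TO_OWASP[cwe] for cwe in finding_cwes if cwe in CWE_TO_OWASP)
    # Assumption: any finding in a category marks it FAIL.
    return [(cat, "FAIL" if counts[cat] else "PASS", counts[cat]) for cat in CATEGORIES]
```

For findings tagged CWE-89, CWE-79, and CWE-798, the A03 row becomes `("A03", "FAIL", 2)`, the A07 row `("A07", "FAIL", 1)`, and every other category stays at PASS with zero findings.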

Taint Path Visualization

Taint analysis results include Mermaid diagrams showing exploitation paths:

graph LR
    src["user_input()"]:::source --> proc["process_data()"]
    proc --> sink["exec_sql()"]:::sink
    classDef source fill:#f90,stroke:#333
    classDef sink fill:#f33,stroke:#333
    classDef sanitizer fill:#3c3,stroke:#333

  • Orange nodes: taint sources (user input entry points)
  • Red nodes: taint sinks (dangerous operations)
  • Green nodes: sanitizers (validation/escaping functions)

The MCP tool codegraph_taint_analysis returns a structured taint_paths array with source, sink, and intermediate nodes. SARIFExporter includes codeFlows per SARIF 2.1.0 §3.36.
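
To show how a taint path could become a diagram like the one above, here is a minimal sketch that renders an ordered list of function names as Mermaid text. `taint_path_to_mermaid` is a hypothetical helper and the input shape (a flat list of names) is an assumption; the actual taint_paths structure may carry more detail per node.

```python
# Illustrative sketch only: renders a list of function names as a Mermaid graph.
from typing import Iterable, List

CLASS_DEFS: List[str] = [
    "    classDef source fill:#f90,stroke:#333",
    "    classDef sink fill:#f33,stroke:#333",
    "    classDef sanitizer fill:#3c3,stroke:#333",
]


def taint_path_to_mermaid(path: List[str], sanitizers: Iterable[str] = ()) -> str:
    """First node is styled as source, last as sink, known sanitizers in between."""
    sanitizer_set = set(sanitizers)
    last = len(path) - 1
    lines = ["graph LR"]
    for index, name in enumerate(path):
        if index == 0:
            style = ":::source"
        elif index == last:
            style = ":::sink"
        elif name in sanitizer_set:
            style = ":::sanitizer"
        else:
            style = ""
        lines.append(f'    n{index}["{name}()"]{style}')
    for index in range(last):
        lines.append(f"    n{index} --> n{index + 1}")
    return "\n".join(lines + CLASS_DEFS)
```

Passing `["user_input", "process_data", "exec_sql"]` produces a three-node graph with the same color classes as the example diagram; node identifiers are generated (`n0`, `n1`, ...) so that function names with unusual characters do not break the syntax.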

Autofix Integration

The AutofixEngine generates concrete code patches for detected vulnerabilities using three strategies (SSR → template → LLM). See the dedicated Autofix Guide for:

  • Three-strategy pipeline (SSR, template, LLM)
  • Supported vulnerability types and fix templates
  • Confidence scoring and approval flow
  • Configuration and domain extension

Quick usage:

# CLI: audit with autofix suggestions
python -m src.cli audit --db data/projects/postgres.duckdb --autofix

# MCP: generate fix for a specific method
codegraph_autofix(method_name="process_query", cwe="CWE-89")
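
The SSR → template → LLM ordering reads like a fallback chain: try the first strategy, and move on only if it produces nothing. The sketch below illustrates that control flow under this assumption. The three strategy functions are toy stand-ins invented for the example, not AutofixEngine code, and the LLM step is a placeholder that never returns a patch.

```python
# Illustrative sketch only: a fallback chain with toy stand-in strategies.
from typing import Callable, List, Optional, Tuple

Strategy = Callable[[str, str], Optional[str]]  # (code, cwe) -> patched code or None


def ssr_fix(code: str, cwe: str) -> Optional[str]:
    # Stand-in for a rewrite rule: knows exactly one pattern.
    if cwe == "CWE-120" and code.startswith("sprintf(buf,"):
        return code.replace("sprintf(buf,", "snprintf(buf, sizeof(buf),", 1)
    return None


def template_fix(code: str, cwe: str) -> Optional[str]:
    # Stand-in for a fix template: prepend a guard for null dereferences.
    if cwe == "CWE-476":
        return "if (ptr == NULL) return;\n" + code
    return None


def llm_fix(code: str, cwe: str) -> Optional[str]:
    # Placeholder: a real implementation would call a language model here.
    return None


PIPELINE: List[Tuple[str, Strategy]] = [
    ("ssr", ssr_fix), ("template", template_fix), ("llm", llm_fix),
]


def generate_fix(code: str, cwe: str,
                 strategies: List[Tuple[str, Strategy]]) -> Optional[Tuple[str, str]]:
    """Return (strategy name, patch) from the first strategy that yields a patch."""
    for name, strategy in strategies:
        patch = strategy(code, cwe)
        if patch is not None:
            return name, patch
    return None
```

Ordering the cheaper, more deterministic strategies first means the expensive step only runs when the others have nothing to offer; whether the real engine also compares confidence scores across strategies is covered in the Autofix Guide, not here.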

CLI & MCP Usage

# Full security report (auto-detect language)
python -m src.cli.security_audit full \
  --path /path/to/project \
  --output ./security_reports

# Specify language explicitly
python -m src.cli.security_audit full \
  --path /path/to/project \
  --language python \
  --output ./security_reports

# Available languages: auto, c, cpp, python, javascript, typescript,
#                      go, csharp, kotlin, java, php, onec

# Query-based security analysis
python -m src.cli query "Find SQL injection vulnerabilities"

# Security incident tracing
python -m src.cli security incident "compromised function" --db PATH

Output formats:

  • security_report.md: human-readable report
  • security_report.json: machine-readable for CI/CD
  • security_report.sarif: GitHub Security Alerts format (SARIF 2.1.0)
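
For CI/CD use, a SARIF report can be inspected with nothing more than the standard library. The sketch below counts results per severity level using the standard SARIF 2.1.0 layout (`runs[].results[].level`); the helper names and the inline sample document are made up for illustration, and the gating rule (block on any `error`) is an assumption, not a CodeGraph default.

```python
# Illustrative sketch only: counts SARIF results by level for a CI gate.
import json
from collections import Counter

# Made-up sample in SARIF 2.1.0 shape; a real run would read security_report.sarif.
SAMPLE_SARIF = json.dumps({
    "version": "2.1.0",
    "runs": [{
        "tool": {"driver": {"name": "CodeGraph"}},
        "results": [
            {"ruleId": "CWE-89", "level": "error",
             "message": {"text": "SQL injection"}},
            {"ruleId": "CWE-476",
             "message": {"text": "Missing NULL check"}},
        ],
    }],
})


def count_sarif_levels(sarif_text: str) -> Counter:
    """Count results per level; a missing level is treated as 'warning'."""
    document = json.loads(sarif_text)
    counts: Counter = Counter()
    for run in document.get("runs", []):
        for result in run.get("results", []):
            counts[result.get("level", "warning")] += 1
    return counts


def has_blocking_findings(counts: Counter) -> bool:
    # Assumption: only 'error' results should fail the build.
    return counts["error"] > 0
```

A pipeline step could read the report file, pass its text to `count_sarif_levels`, and exit non-zero when `has_blocking_findings` is true.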

MCP tools:

  • codegraph_taint_analysis(method_name, source_category, sink_category): trace data flows
  • codegraph_autofix(method_name, cwe=""): generate fix suggestions

Example Questions

Vulnerability Scanning:

  • “Find SQL injection vulnerabilities”
  • “Find buffer overflow risks in string functions”
  • “Check for hardcoded credentials”
  • “Find information disclosure risks”

Taint Analysis:

  • “Trace data flow from user input to SQL execution”
  • “Find all external entry points”
  • “Show taint paths for process_query”

D3FEND Hardening:

  • “Run D3FEND hardening compliance check”
  • “Check for unsafe function usage (D3-TL)”
  • “Find null pointer vulnerabilities”

Memory & Concurrency:

  • “Find use-after-free vulnerabilities”
  • “Find race conditions in goroutines”
  • “Check for weak random number generation”
  • “Find resource exhaustion risks”