Security engineer performing vulnerability assessment, taint analysis, and code hardening checks using CPG-powered analysis.
Table of Contents
- Quick Start
- How It Works
- Intent Classification
- Handler Architecture
- Security Analysis Classes
- Vulnerability Scanning
- SQL Injection Detection
- Buffer Overflow Detection
- Taint Flow Analysis
- Identify Entry Points
- Trace Data Flow
- D3FEND Source Code Hardening
- 11 D3FEND Techniques
- Full Hardening Audit
- Check Unsafe Functions (D3-TL)
- Check Null Pointer Safety (D3-NPC)
- Language-Specific Security Patterns
- Supported Languages (11)
- CWE Coverage (69 CWEs)
- CAPEC Attack Pattern Coverage
- Language Examples
- OWASP Top 10 Compliance
- Taint Path Visualization
- Autofix Integration
- CLI & MCP Usage
- Example Questions
- Related Scenarios
Quick Start
# Select Security Audit Scenario
/select 02
How It Works
Intent Classification
When you ask a security question, the SecurityIntentDetector classifies it into one of 12 specific intents by keyword matching; both English and Russian queries are supported. The table below lists ten of them:
| Intent | Keywords (EN) | Keywords (RU) | Handler |
|---|---|---|---|
| hardcoded_credentials | hardcoded, credentials, secret | захардкоженный, жестко закодированный | VulnerabilityScanHandler |
| entry_points | entry point, attack surface | точка входа, поверхность атаки | EntryPointsHandler |
| taint_analysis | taint, data flow, trace | поток данных, распространение | TaintAnalysisHandler |
| info_disclosure | information disclosure, leak | раскрытие информации, утечка | InfoDisclosureHandler |
| null_check | null, dereference | нулевой указатель, разыменование | NullCheckHandler |
| memory_lifetime | use after free, double free | использование после освобождения | MemoryLifetimeHandler |
| race_condition | race condition, TOCTOU | гонка, состояние гонки | RaceConditionHandler |
| weak_crypto | weak random, PRNG, entropy | слабое случайное число | WeakCryptoHandler |
| denial_of_service | DoS, resource exhaustion | отказ в обслуживании | DoSHandler |
| vulnerability_scan | vulnerability, CWE, injection | уязвимость, инъекция | VulnerabilityScanHandler |
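The keyword matching described above can be sketched roughly as follows. This is a minimal illustration, not the actual SecurityIntentDetector: the `INTENT_KEYWORDS` table and the `classify_intent` function are hypothetical names, only a subset of intents and keywords is included, and the first-match-wins rule is an assumption.

```python
from typing import Optional

# Hypothetical keyword table: a small subset of the intents and keywords
# from the table above (English and Russian), all lowercase.
INTENT_KEYWORDS = {
    "hardcoded_credentials": ["hardcoded", "credentials", "secret", "захардкоженный"],
    "entry_points": ["entry point", "attack surface", "точка входа"],
    "taint_analysis": ["taint", "data flow", "trace", "поток данных"],
    "race_condition": ["race condition", "toctou", "гонка"],
    "vulnerability_scan": ["vulnerability", "cwe", "injection", "уязвимость", "инъекция"],
}


def classify_intent(query: str) -> Optional[str]:
    """Return the first intent with a keyword contained in the query, else None."""
    text = query.lower()
    # Dicts preserve insertion order, so earlier intents take precedence
    # (an assumption made for this sketch).
    for intent, keywords in INTENT_KEYWORDS.items():
        if any(keyword in text for keyword in keywords):
            return intent
    return None
```

Plain substring matching keeps the sketch short; a production detector would likely need word boundaries or stemming to avoid accidental matches.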
Handler Architecture
The security audit scenario uses a handler-based architecture with 11 parallel handlers, each registered with a priority:
User Query
|
v
SecurityIntentDetector → classify intent (12 types)
|
v
security_registry → route to handler by priority
|
+-- [4] EntryPointsHandler → attack surface discovery
+-- [5] TaintSourcesHandler → source/sink reference
+-- [10] VulnerabilityScanHandler → pattern-based scanning
+-- [20] TaintAnalysisHandler → deep dataflow tracing
+-- [25] NullCheckHandler → null pointer dereferences
+-- [26] InfoDisclosureHandler → information leaks
+-- [27] MemoryLifetimeHandler → use-after-free, double-free
+-- [28] RaceConditionHandler → concurrency issues
+-- [29] WeakCryptoHandler → weak PRNG/entropy
+-- [30] DoSHandler → resource exhaustion
+-- [--] AutofixHandler → fix generation (separate)
Lower priority number = higher precedence. Each handler has a paired formatter for structured output.
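The priority rule can be illustrated with a small registry sketch. `HandlerRegistry`, `RegisteredHandler`, and the `can_handle` predicate are hypothetical names introduced here; the real security_registry may be structured differently. Only the handler names and priority numbers come from the diagram above.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass(order=True)
class RegisteredHandler:
    # Only the priority takes part in ordering comparisons.
    priority: int
    name: str = field(compare=False)
    can_handle: Callable[[str], bool] = field(compare=False)


class HandlerRegistry:
    """Hypothetical registry: the lowest priority number wins."""

    def __init__(self) -> None:
        self._handlers: List[RegisteredHandler] = []

    def register(self, priority: int, name: str,
                 can_handle: Callable[[str], bool]) -> None:
        self._handlers.append(RegisteredHandler(priority, name, can_handle))
        self._handlers.sort()  # keep handlers ordered by ascending priority

    def route(self, intent: str) -> Optional[str]:
        """Return the name of the first handler that accepts the intent."""
        for handler in self._handlers:
            if handler.can_handle(intent):
                return handler.name
        return None


registry = HandlerRegistry()
registry.register(20, "TaintAnalysisHandler", lambda i: i == "taint_analysis")
registry.register(10, "VulnerabilityScanHandler",
                  lambda i: i in {"vulnerability_scan", "hardcoded_credentials"})
registry.register(4, "EntryPointsHandler", lambda i: i == "entry_points")
```

Because the list is kept sorted on registration, routing is a single ordered scan, and a handler with a lower number always gets the first chance to claim an intent.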
Security Analysis Classes
| Class | Module | Purpose |
|---|---|---|
| SecurityScanner | src/security/ | CPG pattern execution, returns SecurityFinding list |
| HardeningScanner | src/security/hardening/ | D3FEND compliance checks (11 techniques) |
| TaintPropagator | src/analysis/dataflow/taint/ | Field-sensitive taint analysis with symbolic execution |
| TaintVerifiedScanner | src/security/ | High-confidence taint findings with sanitization tracking |
| AutofixEngine | src/analysis/autofix/ | Three-strategy fix generation (SSR → template → LLM) |
| SARIFExporter | src/security/ | SARIF 2.1.0 export with codeFlows and OWASP tags |
Vulnerability Scanning
SQL Injection Detection
> Find SQL injection vulnerabilities
╭─────────────── Security Findings ─────────────────────────╮
│ │
│ CRITICAL: SQL Injection │
│ │
│ Location: src/pl/plpgsql/src/pl_exec.c:4567 │
│ Pattern: Dynamic query with string concatenation │
│ Code: │
│ snprintf(query, "SELECT * FROM %s", table_name); │
│ │
│ Risk: User-controlled table_name can inject SQL │
│ Fix: Use quote_identifier() for table names │
│ CWE: CWE-89 │
│ │
│ Total findings: 7 critical, 12 high, 34 medium │
│ │
╰───────────────────────────────────────────────────────────╯
Buffer Overflow Detection
> Find buffer overflow risks in string functions
> Show functions using sprintf without bounds
Taint Flow Analysis
Identify Entry Points
> Find all external entry points
╭─────────────── Entry Points ──────────────────────────────╮
│ │
│ Network Entry Points: │
│ - pq_getmsgstring() - Read string from client │
│ - pq_getmsgint() - Read int from client │
│ - ProcessClientRead() - Raw socket read │
│ │
│ SQL Entry Points: │
│ - exec_simple_query() - Direct SQL execution │
│ - exec_parse_message() - Prepared statement │
│ - exec_bind_message() - Parameter binding │
│ │
│ File Entry Points: │
│ - pg_read_file() - Read arbitrary file │
│ - pg_ls_dir() - List directory │
│ │
│ Total: 47 entry points identified │
│ │
╰───────────────────────────────────────────────────────────╯
Trace Data Flow
The TaintPropagator traces data from untrusted sources to dangerous sinks with field-sensitive tracking and symbolic execution for path feasibility:
> Trace data flow from PQgetvalue to SQL execution
╭─────────────── Taint Flow ────────────────────────────────╮
│ │
│ SOURCE: PQgetvalue() [Client Input] │
│ ↓ │
│ pq_getmsgstring() │
│ ↓ │
│ exec_simple_query() │
│ ↓ │
│ pg_parse_query() │
│ ↓ │
│ SINK: SPI_execute() [SQL Execution] │
│ │
│ Risk Level: HIGH │
│ Recommendation: Add input validation at entry point │
│ │
╰───────────────────────────────────────────────────────────╯
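To make the source-to-sink idea concrete, here is a deliberately simplified sketch: a breadth-first search over a call graph. It is not the TaintPropagator algorithm (there is no field sensitivity and no symbolic execution), and `CALL_GRAPH` and `find_taint_path` are hypothetical names whose edges simply mirror the example output above.

```python
from collections import deque
from typing import Dict, FrozenSet, List, Optional

# Illustrative edges only, copied from the example taint flow above.
CALL_GRAPH: Dict[str, List[str]] = {
    "PQgetvalue": ["pq_getmsgstring"],
    "pq_getmsgstring": ["exec_simple_query"],
    "exec_simple_query": ["pg_parse_query"],
    "pg_parse_query": ["SPI_execute"],
}


def find_taint_path(
    graph: Dict[str, List[str]],
    source: str,
    sink: str,
    sanitizers: FrozenSet[str] = frozenset(),
) -> Optional[List[str]]:
    """Return a shortest source-to-sink path that avoids sanitizers, or None."""
    queue = deque([[source]])
    visited = {source}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == sink:
            return path
        for successor in graph.get(node, []):
            # Data passing through a sanitizer is treated as clean.
            if successor in visited or successor in sanitizers:
                continue
            visited.add(successor)
            queue.append(path + [successor])
    return None
```

Calling `find_taint_path(CALL_GRAPH, "PQgetvalue", "SPI_execute")` yields the five-node chain shown in the box, while marking any intermediate function as a sanitizer makes the path disappear.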
D3FEND Source Code Hardening
11 D3FEND Techniques
The HardeningScanner analyzes code for 11 MITRE D3FEND Source Code Hardening techniques:
| ID | Technique | Description | CWE |
|---|---|---|---|
| D3-VI | Variable Initialization | Uninitialized variables | CWE-457 |
| D3-CS | Credential Scrubbing | Hardcoded credentials | CWE-798 |
| D3-IRV | Integer Range Validation | Integer overflow risks | CWE-190 |
| D3-PV | Pointer Validation | Pointer dereference without check | CWE-476 |
| D3-RN | Reference Nullification | Use-after-free risks | CWE-416 |
| D3-TL | Trusted Library | Unsafe function usage | CWE-676 |
| D3-VTV | Variable Type Validation | Type confusion vulnerabilities | CWE-843 |
| D3-MBSV | Memory Block Start Validation | Out-of-bounds access | CWE-787 |
| D3-NPC | Null Pointer Checking | Missing NULL checks | CWE-476 |
| D3-DLV | Domain Logic Validation | Business logic flaws | CWE-20 |
| D3-OLV | Operational Logic Validation | Operational invariant violations | CWE-670 |
Full Hardening Audit
> Run D3FEND hardening compliance check
╭─────────────── D3FEND Compliance Report ─────────────────────╮
│ │
│ Overall Compliance Score: 72.5% │
│ │
│ Findings by Technique: │
│ │
│ D3-VI (Variable Initialization): 23 issues │
│ D3-TL (Trusted Library): 12 issues │
│ D3-NPC (Null Pointer Checking): 8 issues │
│ D3-RN (Reference Nullification): 6 issues │
│ │
│ Category Scores: │
│ Initialization: 65% │
│ Memory Safety: 78% │
│ Pointer Safety: 82% │
│ Library Safety: 58% │
│ │
╰───────────────────────────────────────────────────────────────╯
Check Unsafe Functions (D3-TL)
> Check for unsafe function usage (D3-TL Trusted Library)
╭─────────────── D3-TL: Trusted Library ───────────────────────╮
│ │
│ CRITICAL - strcpy (buffer overflow risk): │
│ src/backend/utils/adt/varlena.c:234 │
│ src/backend/libpq/pqformat.c:567 │
│ │
│ CRITICAL - sprintf (format string risk): │
│ src/backend/libpq/auth.c:567 │
│ │
│ Remediation: │
│ - strcpy → strncpy/strlcpy │
│ - sprintf → snprintf │
│ │
╰───────────────────────────────────────────────────────────────╯
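As a rough approximation of what a D3-TL check looks for, the sketch below scans C source text for the two functions named in the report and attaches the remediation listed above. It is a plain regular-expression scan, not the CPG-based HardeningScanner, so it will also match calls inside comments and string literals; `UNSAFE_FUNCTIONS` and `scan_unsafe_calls` are hypothetical names.

```python
import re
from typing import List, Tuple

# Unsafe function -> suggested replacement, taken from the remediation list above.
UNSAFE_FUNCTIONS = {
    "strcpy": "strncpy/strlcpy",
    "sprintf": "snprintf",
}

# \b prevents matches inside longer identifiers such as vsprintf.
_CALL_RE = re.compile(r"\b(" + "|".join(UNSAFE_FUNCTIONS) + r")\s*\(")


def scan_unsafe_calls(source: str) -> List[Tuple[int, str, str]]:
    """Return (line number, function, suggested replacement) for each call found."""
    findings = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        for match in _CALL_RE.finditer(line):
            name = match.group(1)
            findings.append((lineno, name, UNSAFE_FUNCTIONS[name]))
    return findings
```

A graph-based scanner can additionally tell whether the destination buffer size is known at the call site, which a textual scan cannot.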
Check Null Pointer Safety (D3-NPC)
> Find null pointer vulnerabilities (D3-NPC)
╭─────────────── D3-NPC: Null Pointer Checking ────────────────╮
│ │
│ Missing NULL Checks After Allocation: 23 │
│ │
│ malloc without check: │
│ src/backend/utils/mmgr/aset.c:345 │
│ char *buf = malloc(size); │
│ use(buf); // ← No NULL check! │
│ │
│ Example Fix: │
│ char *buf = malloc(size); │
│ if (buf == NULL) { │
│ ereport(ERROR, (errcode(ERRCODE_OUT_OF_MEMORY))); │
│ } │
│ │
╰───────────────────────────────────────────────────────────────╯
Language-Specific Security Patterns
Supported Languages (11)
CodeGraph supports security analysis for 11 programming languages with CWE mappings, plus framework-specific hypothesis providers:
| Language | Patterns | Key CWEs |
|---|---|---|
| C/C++ | Buffer overflow, format string, UAF, command injection | CWE-120, CWE-134, CWE-416, CWE-78 |
| Python/Django | SQL injection, XSS, CSRF, deserialization | CWE-89, CWE-79, CWE-352, CWE-502 |
| JavaScript/TypeScript | XSS, prototype pollution, eval injection, SSRF | CWE-79, CWE-1321, CWE-94, CWE-918 |
| Go | Race conditions, SQL injection, path traversal, insecure TLS | CWE-362, CWE-89, CWE-22, CWE-295 |
| Ruby/Rails | Eval injection, YAML deserialization, mass assignment | CWE-94, CWE-502, CWE-915 |
| C#/.NET | SQL injection, XSS, insecure deserialization, XXE | CWE-89, CWE-79, CWE-502, CWE-611 |
| Kotlin/Android | WebView XSS, intent redirection, insecure storage | CWE-79, CWE-927, CWE-312 |
| Swift/iOS | Keychain misuse, URL scheme hijacking, TLS bypass | CWE-312, CWE-939, CWE-295 |
| Java | SQL injection, deserialization, XXE, LDAP injection | CWE-89, CWE-502, CWE-611, CWE-90 |
| PHP | SQL injection, XSS, command injection, path traversal | CWE-89, CWE-79, CWE-78, CWE-22 |
| 1C (OneC) | SQL injection, command injection, path traversal | CWE-89, CWE-78, CWE-22 |
Framework-specific providers (via Hypothesis System): Django, Express.js, Next.js, Spring, Gin, PostgreSQL.
CWE Coverage (69 CWEs)
The security pattern database covers 69 unique CWE identifiers across all languages:
Injection Vulnerabilities:
- CWE-78: OS Command Injection
- CWE-89: SQL Injection
- CWE-94: Code Injection
- CWE-134: Format String

Web Vulnerabilities:
- CWE-79: Cross-Site Scripting (XSS)
- CWE-352: Cross-Site Request Forgery (CSRF)
- CWE-502: Insecure Deserialization
- CWE-611: XML External Entity (XXE)
- CWE-918: Server-Side Request Forgery (SSRF)
- CWE-1321: Prototype Pollution

Memory Safety (C/C++):
- CWE-120: Buffer Overflow
- CWE-416: Use After Free
- CWE-476: NULL Pointer Dereference
- CWE-190: Integer Overflow
- CWE-787: Out-of-bounds Write

Authentication/Authorization:
- CWE-798: Hardcoded Credentials
- CWE-284: Improper Access Control
- CWE-862: Missing Authorization
The full list of 69 CWEs spans the injection, memory safety, web, crypto, concurrency, and access control categories.
CAPEC Attack Pattern Coverage
CodeGraph’s security patterns are defined through CWE identifiers. Each CWE maps to corresponding CAPEC attack patterns via the standard MITRE CWE→CAPEC mapping:
| CWE | CAPEC | Attack Pattern |
|---|---|---|
| CWE-89 | CAPEC-66 | SQL Injection |
| CWE-79 | CAPEC-86 | XSS Through HTTP Headers |
| CWE-78 | CAPEC-88 | OS Command Injection |
| CWE-120 | CAPEC-100 | Buffer Overflow |
| CWE-22 | CAPEC-126 | Path Traversal |
| CWE-502 | CAPEC-586 | Object Injection (Deserialization) |
| CWE-918 | CAPEC-664 | Server-Side Request Forgery |
The Hypothesis System generates security hypotheses for these attack scenarios with framework-specific providers, enabling targeted vulnerability detection beyond static pattern matching.
Language Examples
Python/Django:
# Switch to Django project
/project switch my_django_app
# Run security queries
> Find SQL injection vulnerabilities in views
> Check for XSS in templates
> Find endpoints without CSRF protection
JavaScript/TypeScript:
> Find prototype pollution vulnerabilities
> Check for XSS via innerHTML
> Find eval() usage with user input
Go:
> Find race conditions in goroutines
> Check for SQL injection in database queries
> Find insecure TLS configurations
OWASP Top 10 Compliance
Security audit results are automatically enriched with OWASP Top 10 2021 mapping via owasp_mapping.py (69 CWEs mapped to A01–A10):
- Each finding receives an OWASP-Axx tag (e.g., OWASP-A03 for Injection)
- A compliance table summarizes pass/fail status per OWASP category
- SARIF export includes OWASP tags on reportingDescriptor rules
Example compliance output:
| OWASP Category | Status | Findings |
|-------------------------------|--------|----------|
| A01 Broken Access Control | PASS | 0 |
| A02 Cryptographic Failures | FAIL | 2 |
| A03 Injection | FAIL | 5 |
| A04 Insecure Design | PASS | 0 |
| A05 Security Misconfiguration | WARN | 1 |
| ... | ... | ... |
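The enrichment step can be pictured with the following sketch. It is not the contents of owasp_mapping.py: `CWE_TO_OWASP` holds only a handful of the 69 mapped CWEs, `owasp_tag` and `compliance_summary` are hypothetical names, and the PASS/FAIL rule (any finding fails the category) is an assumption. The WARN status shown in the example table is not modeled because its criteria are not described here.

```python
from collections import Counter
from typing import Dict, Iterable

# Small subset of the OWASP Top 10 2021 CWE mapping, for illustration only.
CWE_TO_OWASP = {
    "CWE-22": "A01",   # Broken Access Control
    "CWE-352": "A01",
    "CWE-79": "A03",   # Injection
    "CWE-89": "A03",
    "CWE-798": "A07",  # Identification and Authentication Failures
    "CWE-502": "A08",  # Software and Data Integrity Failures
    "CWE-918": "A10",  # Server-Side Request Forgery
}


def owasp_tag(cwe: str) -> str:
    """Return the OWASP-Axx tag for a CWE, or a placeholder if it is unmapped."""
    category = CWE_TO_OWASP.get(cwe)
    return f"OWASP-{category}" if category else "OWASP-unmapped"


def compliance_summary(finding_cwes: Iterable[str]) -> Dict[str, Dict[str, object]]:
    """Count findings per OWASP category and derive a PASS/FAIL status."""
    counts = Counter(CWE_TO_OWASP[c] for c in finding_cwes if c in CWE_TO_OWASP)
    categories = sorted(set(CWE_TO_OWASP.values()))
    return {
        category: {
            "status": "FAIL" if counts[category] else "PASS",
            "findings": counts[category],
        }
        for category in categories
    }
```

Keeping the mapping as plain data means the same table can drive both the per-finding tag and the per-category summary.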
Taint Path Visualization
Taint analysis results include Mermaid diagrams showing exploitation paths:
graph LR
src["user_input()"]:::source --> proc["process_data()"]
proc --> sink["exec_sql()"]:::sink
classDef source fill:#f90,stroke:#333
classDef sink fill:#f33,stroke:#333
classDef sanitizer fill:#3c3,stroke:#333
- Orange nodes: taint sources (user input entry points)
- Red nodes: taint sinks (dangerous operations)
- Green nodes: sanitizers (validation/escaping functions)
The MCP tool codegraph_taint_analysis returns a structured taint_paths array with source, sink, and intermediate nodes. The SARIFExporter includes the paths as codeFlows per SARIF 2.1.0 §3.36.
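A diagram like the one above can be produced from an ordered list of node names. The sketch below is one way to do it, assuming a taint path is simply a list of function names from source to sink; `taint_path_to_mermaid` is a hypothetical helper, not part of the documented tool output.

```python
from typing import Iterable, List


def taint_path_to_mermaid(path: List[str], sanitizers: Iterable[str] = ()) -> str:
    """Render a source-to-sink path as Mermaid 'graph LR' text."""
    sanitizer_set = set(sanitizers)
    last = len(path) - 1
    lines = ["graph LR"]
    for index, name in enumerate(path):
        if index == 0:
            css = ":::source"      # first node is the taint source
        elif index == last:
            css = ":::sink"        # last node is the dangerous sink
        elif name in sanitizer_set:
            css = ":::sanitizer"
        else:
            css = ""
        lines.append(f'    n{index}["{name}"]{css}')
    for index in range(last):
        lines.append(f"    n{index} --> n{index + 1}")
    # Same colors as the legend: orange source, red sink, green sanitizer.
    lines.append("    classDef source fill:#f90,stroke:#333")
    lines.append("    classDef sink fill:#f33,stroke:#333")
    lines.append("    classDef sanitizer fill:#3c3,stroke:#333")
    return "\n".join(lines)
```

Generated node IDs (n0, n1, ...) avoid problems with function names that contain characters Mermaid does not accept in identifiers.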
Autofix Integration
The AutofixEngine generates concrete code patches for detected vulnerabilities using three strategies (SSR → template → LLM). See the dedicated Autofix Guide for:
- Three-strategy pipeline (SSR, template, LLM)
- Supported vulnerability types and fix templates
- Confidence scoring and approval flow
- Configuration and domain extension
Quick usage:
# CLI: audit with autofix suggestions
python -m src.cli audit --db data/projects/postgres.duckdb --autofix
# MCP: generate fix for a specific method
codegraph_autofix(method_name="process_query", cwe="CWE-89")
CLI & MCP Usage
# Full security report (auto-detect language)
python -m src.cli.security_audit full \
--path /path/to/project \
--output ./security_reports
# Specify language explicitly
python -m src.cli.security_audit full \
--path /path/to/project \
--language python \
--output ./security_reports
# Available languages: auto, c, cpp, python, javascript, typescript,
# go, csharp, kotlin, java, php, onec
# Query-based security analysis
python -m src.cli query "Find SQL injection vulnerabilities"
# Security incident tracing
python -m src.cli security incident "compromised function" --db PATH
Output formats:
- security_report.md — Human-readable report
- security_report.json — Machine-readable for CI/CD
- security_report.sarif — GitHub Security Alerts format (SARIF 2.1.0)
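For CI/CD use, the SARIF report can be summarized with the standard library alone. The sketch below relies only on the generic SARIF 2.1.0 layout (`runs[].results[].level`); the file location is an assumption based on the `--output` directory used above, and `load_sarif` and `summarize_sarif` are hypothetical helper names.

```python
import json
from collections import Counter
from typing import Any, Dict


def load_sarif(path: str) -> Dict[str, Any]:
    """Read a SARIF file, e.g. ./security_reports/security_report.sarif (assumed path)."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


def summarize_sarif(sarif: Dict[str, Any]) -> Counter:
    """Count results per severity level across all runs."""
    counts: Counter = Counter()
    for run in sarif.get("runs", []):
        for result in run.get("results", []):
            # A result without an explicit level is counted as "warning" here.
            counts[result.get("level", "warning")] += 1
    return counts


def has_blocking_findings(sarif: Dict[str, Any]) -> bool:
    """Example gate: treat any error-level result as blocking."""
    return summarize_sarif(sarif)["error"] > 0
```

A pipeline step could call `has_blocking_findings(load_sarif(...))` and exit non-zero when it returns True; the choice of "error" as the blocking threshold is only an example policy.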
MCP tools:
- codegraph_taint_analysis(method_name, source_category, sink_category) — trace data flows
- codegraph_autofix(method_name, cwe="") — generate fix suggestions
Example Questions
Vulnerability Scanning:
- “Find SQL injection vulnerabilities”
- “Find buffer overflow risks in string functions”
- “Check for hardcoded credentials”
- “Find information disclosure risks”

Taint Analysis:
- “Trace data flow from user input to SQL execution”
- “Find all external entry points”
- “Show taint paths for process_query”

D3FEND Hardening:
- “Run D3FEND hardening compliance check”
- “Check for unsafe function usage (D3-TL)”
- “Find null pointer vulnerabilities”

Memory & Concurrency:
- “Find use-after-free vulnerabilities”
- “Find race conditions in goroutines”
- “Check for weak random number generation”
- “Find resource exhaustion risks”
Related Scenarios
- Autofix — Automated fix generation for detected vulnerabilities
- Compliance (S08) — Regulatory compliance checks
- Incident Response (S14) — Post-breach investigation
- Entry Points (S16) — Attack surface mapping