STRIDE Threat Model

Overview

CodeGraph includes an automated STRIDE threat model generator that builds threat models directly from the Code Property Graph (CPG). Instead of manually drawing data flow diagrams and enumerating threats in spreadsheets, CodeGraph extracts DFD components, data flows, and trust boundaries from the parsed codebase, then classifies threats using the STRIDE methodology with 43 CWE-to-STRIDE mappings.

The feature produces structured output in multiple formats (JSON, Markdown, GOST, SARIF 2.1.0, Mermaid DFD) and supports incremental updates so that the threat model stays current as code evolves.

Key capabilities:

Automatic DFD extraction from CPG entry points, call graph, taint sources/sinks
Five trust boundary types detected via heuristic pattern matching
43 CWEs mapped to 6 STRIDE categories (MITRE Top 25 + OWASP Top 10)
18 standard mitigations plus 14 CWE-specific recommendations
GOST R 56939-2024 process 5.7 compliance (artifacts 5.7.3.1 through 5.7.3.4)
Incremental delta computation between model versions
Bilingual export (English and Russian)

Source files: src/security/threat_model/

File	Purpose
`models.py`	Data models: `ThreatModel`, `STRIDEThreat`, `DataComponent`, `DataFlow`, `TrustBoundary`, `ThreatModelDelta`
`dfd_builder.py`	Extracts DFD from CPG tables (`nodes_method`, `edges_call`)
`trust_boundary.py`	Detects 5 trust boundary types from entry point categories
`stride_classifier.py`	Maps CWE findings and CPG patterns to STRIDE categories
`mitigation.py`	Recommends countermeasures per STRIDE category and CWE
`exporter.py`	Exports to JSON, Markdown (EN/RU), GOST, SARIF 2.1.0, Mermaid DFD
`incremental.py`	Computes delta between old and new models

STRIDE Methodology

STRIDE is a threat classification framework developed by Microsoft. Each letter represents a category of threat:

Category	Enum Value	Description	Security Property Violated
Spoofing	`spoofing`	Impersonating a user, system, or component	Authentication
Tampering	`tampering`	Unauthorized modification of data or code	Integrity
Repudiation	`repudiation`	Denying having performed an action when no evidence proves otherwise	Non-repudiation
Information Disclosure	`info_disclosure`	Exposing data to unauthorized parties	Confidentiality
Denial of Service	`dos`	Degrading or denying availability	Availability
Elevation of Privilege	`elevation`	Gaining unauthorized access or capabilities	Authorization

CodeGraph maps 43 CWEs to these 6 categories. Some CWEs map to multiple categories (e.g., CWE-79 maps to both Tampering and Information Disclosure). The full mapping is defined in CWE_TO_STRIDE inside stride_classifier.py.

Architecture

The threat model pipeline consists of six stages executed in sequence:

graph LR
    CPG[(CPG DuckDB)] --> DFD[DFD Builder]
    Domain[Domain Plugin] --> DFD
    DFD --> TB[Trust Boundary Detector]
    TB --> SC[STRIDE Classifier]
    Hyp[Hypothesis Results] --> SC
    SC --> MR[Mitigation Recommender]
    MR --> EXP[Exporter]
    EXP --> JSON[JSON]
    EXP --> MD[Markdown]
    EXP --> GOST[GOST Report]
    EXP --> SARIF[SARIF 2.1.0]
    EXP --> MER[Mermaid DFD]
    PrevModel[Previous Model] --> INC[Incremental Updater]
    EXP --> INC
    INC --> Delta[ThreatModelDelta]

Stage 1: DFD Builder (`dfd_builder.py`)

The DFDBuilder class extracts a Data Flow Diagram from the CPG. It queries three categories of components:

Processes: Modules containing entry points (from nodes_method WHERE is_entry_point = true), grouped at file level.
Data Stores: Inferred from taint sink categories (database, file, cache, log, network_send) via the domain plugin or built-in fallback.
External Entities: Inferred from taint source categories (network, user_input, file_system, environment, ipc) with UNTRUSTED trust level.

Data flows are extracted from the call graph (edges_call) between entry-point modules, plus synthetic flows from external entities to processes with matching categories.

Each component receives a TrustLevel:

UNTRUSTED (0) – network-facing, socket, user input, protocol handlers
PARTIALLY_TRUSTED (1) – auth, connection handlers
TRUSTED (2) – internal modules

Stage 2: Trust Boundary Detector (`trust_boundary.py`)

The TrustBoundaryDetector identifies five types of trust boundaries:

Boundary Type	Indicators	Entry Categories
`network`	recv, accept, listen, http_handler, grpc_handler, serve	network, socket, protocol, http_handler
`auth`	authenticate, authorize, check_permission, verify_token, login	auth, connection
`ffi`	cgo_call, ctypes, cffi, jni, ffi_call	extension
`process`	exec, subprocess, popen, system, spawn	exec
`file_system`	open, read_file, write_file, fopen, readdir	file_access

Detection is heuristic: it matches component names and entry point categories against indicator lists. Custom indicators can be provided via configuration.

After detection, the detector marks data flows that cross trust boundaries. A flow crosses a boundary when the source and target have different trust levels, or when one is inside and the other outside a boundary.
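The heuristic above can be sketched roughly as follows. This is a minimal illustration, not the code in trust_boundary.py: the `Component` dataclass, the `boundary_types` and `crosses_boundary` helpers, and the choice of case-insensitive substring matching are assumptions made for the example, and only two of the five indicator lists are shown. The inside/outside membership check is omitted; only the trust-level comparison is modeled.

```python
from __future__ import annotations

from dataclasses import dataclass
from enum import IntEnum


class TrustLevel(IntEnum):
    UNTRUSTED = 0
    PARTIALLY_TRUSTED = 1
    TRUSTED = 2


# Two of the five indicator lists from the table above.
INDICATORS = {
    "network": ["recv", "accept", "listen", "http_handler", "grpc_handler", "serve"],
    "auth": ["authenticate", "authorize", "check_permission", "verify_token", "login"],
}


@dataclass
class Component:  # hypothetical stand-in for a DFD component
    name: str
    trust_level: TrustLevel


def boundary_types(component_name: str) -> list[str]:
    """Return boundary types whose indicators occur in the component name.

    Substring matching is an assumption of this sketch.
    """
    lowered = component_name.lower()
    return [
        btype
        for btype, words in INDICATORS.items()
        if any(word in lowered for word in words)
    ]


def crosses_boundary(source: Component, target: Component) -> bool:
    """Model only the trust-level rule: differing levels mean a crossing."""
    return source.trust_level != target.trust_level


client = Component("http_handler_main", TrustLevel.UNTRUSTED)
core = Component("storage_core", TrustLevel.TRUSTED)
print(boundary_types(client.name), crosses_boundary(client, core))
```

Substring matching is simple but can over-match (for example, "serve" also matches "server"), which is one reason custom indicator lists may be worth configuring.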

Stage 3: STRIDE Classifier (`stride_classifier.py`)

The STRIDEClassifier generates threats from four sources:

Hypothesis findings – CWE-based security findings are mapped to STRIDE categories via the CWE_TO_STRIDE dictionary.
CPG pattern findings – Rows from cpg_pattern_findings are classified by their CWE IDs, or by pattern name heuristics when CWEs are absent.
Boundary crossing inference – Unencrypted flows across trust boundaries produce Information Disclosure threats; user input across boundaries produces Tampering threats; credential flows without auth boundaries produce Spoofing threats.
Unprotected entry points – Network-facing entry points not inside an auth boundary produce Spoofing threats (CWE-306).
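The boundary crossing rules in the third item can be pictured as a small rule table. The sketch below is illustrative only: the `Flow` dataclass and its field names are hypothetical, and it assumes all three rules apply only to flows that cross a boundary, which the description above does not state explicitly for the credential rule.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Flow:  # hypothetical record; field names are illustrative
    crosses_boundary: bool
    encrypted: bool = False
    carries_user_input: bool = False
    carries_credentials: bool = False
    behind_auth_boundary: bool = False


def infer_categories(flow: Flow) -> list[str]:
    """Apply the three boundary-crossing rules to a single flow."""
    if not flow.crosses_boundary:
        return []
    categories = []
    if not flow.encrypted:
        # Unencrypted flow across a boundary
        categories.append("info_disclosure")
    if flow.carries_user_input:
        # User input across a boundary
        categories.append("tampering")
    if flow.carries_credentials and not flow.behind_auth_boundary:
        # Credentials without an auth boundary
        categories.append("spoofing")
    return categories


print(infer_categories(Flow(crosses_boundary=True, carries_user_input=True)))
# ['info_disclosure', 'tampering']
```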

The classifier deduplicates threats by (category, affected_component, cwe_ids) and ranks them by risk score (severity × likelihood).
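A minimal sketch of deduplication and ranking, assuming a reduced `Threat` record with a precomputed `risk_score`. The class and the `dedupe_and_rank` helper are hypothetical, and keeping the first occurrence of each duplicate is an assumption; the real classifier may merge duplicates differently.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Threat:  # hypothetical, reduced stand-in for STRIDEThreat
    category: str
    affected_component: str
    cwe_ids: tuple[str, ...]
    risk_score: int


def dedupe_and_rank(threats: list[Threat]) -> list[Threat]:
    """Drop duplicates by (category, component, CWEs), then sort by risk."""
    unique: dict[tuple, Threat] = {}
    for threat in threats:
        key = (threat.category, threat.affected_component, threat.cwe_ids)
        unique.setdefault(key, threat)  # assumption: first occurrence wins
    return sorted(unique.values(), key=lambda t: t.risk_score, reverse=True)


threats = [
    Threat("spoofing", "auth/login.c", ("CWE-287",), 14),
    Threat("spoofing", "auth/login.c", ("CWE-287",), 14),  # duplicate
    Threat("tampering", "db/query.c", ("CWE-89",), 30),
]
for t in dedupe_and_rank(threats):
    print(t.risk_score, t.category, t.affected_component)
```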

Stage 4: Mitigation Recommender (`mitigation.py`)

The MitigationRecommender provides two layers of recommendations:

Category-based: 18 standard mitigations across 6 STRIDE categories (3 per category), each with an ID (e.g., M-S-1, M-T-2), title, and description.
CWE-specific: 14 CWE-specific mitigations for the most common vulnerabilities (CWE-89, CWE-79, CWE-78, CWE-120, CWE-287, CWE-200, CWE-352, CWE-502, CWE-319, CWE-400, CWE-862, CWE-787, CWE-416, CWE-476).

Mitigation output is prioritized by severity: critical/high threats get all mitigations, medium gets 4, low gets 2.
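The severity-based cut-off can be illustrated as below. The `MITIGATION_LIMITS` table and `select_mitigations` helper are hypothetical names, the sketch assumes candidates arrive already ordered by priority, and the `CWE-89-a`/`CWE-89-b` identifiers are placeholders rather than real mitigation IDs.

```python
from __future__ import annotations

# None means "no limit"; the numbers follow the rule stated above.
MITIGATION_LIMITS = {"critical": None, "high": None, "medium": 4, "low": 2}


def select_mitigations(severity: str, candidates: list[str]) -> list[str]:
    """Return all candidates for critical/high, the first 4 or 2 otherwise."""
    limit = MITIGATION_LIMITS[severity]
    return list(candidates) if limit is None else candidates[:limit]


# Three category-level IDs in the documented format plus two placeholders.
candidates = ["M-T-1", "M-T-2", "M-T-3", "CWE-89-a", "CWE-89-b"]
print(select_mitigations("high", candidates))    # all five
print(select_mitigations("medium", candidates))  # first four
print(select_mitigations("low", candidates))     # first two
```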

Stage 5: Exporter (`exporter.py`)

The ThreatModelExporter supports five output formats:

Format	Method	Description
JSON	`to_json()` / `to_json_string()`	Full model as JSON dict or formatted string
Markdown	`to_markdown(language="en"|"ru")`	Report with summary tables, DFD, threat list, mitigations
GOST	`to_gost(language="ru")`	GOST R 56939-2024 artifacts 5.7.3.1-5.7.3.4
SARIF 2.1.0	`to_sarif()`	OASIS SARIF format for IDE/CI integration
Mermaid DFD	`to_mermaid_dfd()`	Mermaid diagram with trust boundary subgraphs

The Mermaid DFD renderer uses distinct shapes and arrow styles:

Processes: (name) (rounded)
Data Stores: [(name)] (cylinder)
External Entities: [/name\] (trapezoid)
Encrypted flows: solid arrows with “encrypted” label
Boundary-crossing flows: dashed arrows
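To make the shape conventions concrete, here is a small sketch that emits Mermaid node and edge syntax. The `render_node` and `render_flow` helpers are hypothetical and not part of exporter.py. How the real renderer draws a flow that is both encrypted and boundary-crossing is not stated; this sketch assumes a dashed arrow that keeps the label.

```python
# Mermaid node syntax per component type (rounded, cylinder, trapezoid).
SHAPES = {
    "process": "{id}({label})",
    "data_store": "{id}[({label})]",
    "external_entity": "{id}[/{label}\\]",
}


def render_node(node_id: str, label: str, kind: str) -> str:
    """Render one DFD component as a Mermaid node definition."""
    return SHAPES[kind].format(id=node_id, label=label)


def render_flow(src: str, dst: str, encrypted: bool = False,
                crosses_boundary: bool = False) -> str:
    """Render one data flow: dashed if it crosses a boundary, labeled if encrypted."""
    arrow = "-.->" if crosses_boundary else "-->"
    label = "|encrypted|" if encrypted else ""
    return f"{src} {arrow}{label} {dst}"


lines = [
    "graph LR",
    "    " + render_node("ext1", "network", "external_entity"),
    "    " + render_node("p1", "server", "process"),
    "    " + render_node("ds1", "database", "data_store"),
    "    " + render_flow("ext1", "p1", crosses_boundary=True),
    "    " + render_flow("p1", "ds1", encrypted=True),
]
print("\n".join(lines))
```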

Stage 6: Incremental Updater (`incremental.py`)

The IncrementalThreatModelUpdater compares a previous and newly generated model, producing a ThreatModelDelta with:

added_threats – threats present only in the new model
removed_threats – threats present only in the old model
modified_threats – threats with the same ID but changed severity, status, category, CWEs, or affected component
added_components / removed_components – DFD component changes

The delta is stored in the merged model’s metadata.delta_summary for audit trail purposes.
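The threat comparison can be sketched with plain dictionaries keyed by threat ID. This is an assumption-laden illustration: `compute_delta` is a hypothetical helper, threats are modeled as dicts, and the returned lists of IDs may not match the actual shape of ThreatModelDelta. Component changes are left out for brevity.

```python
from __future__ import annotations

# Fields whose change marks a threat as modified, per the list above.
COMPARED_FIELDS = ("severity", "status", "category", "cwe_ids", "affected_component")


def compute_delta(old: dict[str, dict], new: dict[str, dict]) -> dict[str, list[str]]:
    """Compare two {threat_id: fields} mappings and report the differences."""
    added = sorted(new.keys() - old.keys())
    removed = sorted(old.keys() - new.keys())
    modified = sorted(
        tid
        for tid in old.keys() & new.keys()
        if any(old[tid].get(f) != new[tid].get(f) for f in COMPARED_FIELDS)
    )
    return {
        "added_threats": added,
        "removed_threats": removed,
        "modified_threats": modified,
    }


old = {"T1": {"severity": "high"}, "T2": {"severity": "low"}}
new = {"T1": {"severity": "critical"}, "T3": {"severity": "medium"}}
print(compute_delta(old, new))
```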

CLI Usage

All threat model commands are under the threat-model subcommand group.

Generate a full threat model

python -m src.cli threat-model generate \
    --db data/projects/myproject.duckdb \
    --format json \
    --output threat_model.json

Options:

Flag	Default	Description
`--db`	Active project DB	Path to DuckDB CPG database
`--format`	`json`	Output format: `json`, `markdown`, `gost`, `sarif`, `mermaid`
`--output`	stdout	Output file path
`--language`	`en`	Language for Markdown/GOST: `en` or `ru`
`--include-mitigations`	`true`	Include mitigation recommendations
`--hypothesis-results`	none	Path to JSON file with hypothesis findings

Incremental update

python -m src.cli threat-model update \
    --db data/projects/myproject.duckdb \
    --previous threat_model_v1.json \
    --output threat_model_v2.json \
    --changed-files src/auth/login.c src/net/handler.c

Produces the updated model plus a delta summary showing added, removed, and modified threats.

Export DFD only

python -m src.cli threat-model dfd \
    --db data/projects/myproject.duckdb \
    --format mermaid \
    --output dfd.mmd

Generates just the Data Flow Diagram in Mermaid format, without running STRIDE classification.

List threats

python -m src.cli threat-model list \
    --db data/projects/myproject.duckdb \
    --severity high,critical \
    --category spoofing,elevation \
    --format json

Filters and lists threats from the current model, useful for CI pipelines.

API Endpoints

Seven REST endpoints are available under /api/v1/security/threat-model/.

POST /generate

Generate a full threat model for the active project.

curl -X POST http://localhost:8000/api/v1/security/threat-model/generate \
  -H "Content-Type: application/json" \
  -H "X-Project-Id: myproject" \
  -d '{
    "include_mitigations": true,
    "hypothesis_results_path": null
  }'

Response: full ThreatModel JSON object.

GET /export

Export the current threat model in a specified format.

curl "http://localhost:8000/api/v1/security/threat-model/export?format=markdown&language=en"

GET /dfd

Return the Data Flow Diagram for the active project.

curl "http://localhost:8000/api/v1/security/threat-model/dfd?format=mermaid"

Returns Mermaid source text or JSON component/flow structure depending on format.

GET /threats

List threats with optional filtering.

curl "http://localhost:8000/api/v1/security/threat-model/threats?severity=critical,high&category=spoofing"

Query parameters: severity, category, status, cwe_id. Returns a filtered list.
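For scripted use, the same request can be built from Python with the standard library. The sketch below assumes the local base URL used in the curl examples and assumes the response body is JSON; `threats_url` and `fetch_threats` are hypothetical helper names, not part of CodeGraph.

```python
from __future__ import annotations

import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Base URL taken from the curl examples; adjust for your deployment.
BASE_URL = "http://localhost:8000/api/v1/security/threat-model"


def threats_url(**filters: list[str]) -> str:
    """Build the /threats URL, joining multi-value filters with commas."""
    params = {name: ",".join(values) for name, values in filters.items() if values}
    # safe="," keeps the commas readable instead of encoding them as %2C.
    return f"{BASE_URL}/threats?{urlencode(params, safe=',')}"


def fetch_threats(**filters: list[str]):
    """Fetch and decode the filtered threats (assumes a JSON response)."""
    with urlopen(threats_url(**filters)) as response:
        return json.load(response)


print(threats_url(severity=["critical", "high"], category=["spoofing"]))
```

Calling `fetch_threats(severity=["critical", "high"])` requires a running API server, so only the URL construction is exercised here.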

POST /update

Incremental update from a previous model.

curl -X POST http://localhost:8000/api/v1/security/threat-model/update \
  -H "Content-Type: application/json" \
  -d '{
    "previous_model": { ... },
    "changed_files": ["src/auth/login.c"]
  }'

Response includes model (updated) and delta (changes).

GET /mitigations

Get mitigation recommendations for all threats or a specific threat ID.

curl "http://localhost:8000/api/v1/security/threat-model/mitigations?threat_id=TM-spoofing-CWE-287-hyp-1"

GET /stride-mapping

Return the CWE-to-STRIDE mapping table.

curl "http://localhost:8000/api/v1/security/threat-model/stride-mapping"

Returns { "CWE-89": ["tampering"], "CWE-287": ["spoofing"], ... } with all 43 entries.

MCP Tools

Two MCP tools are exposed for IDE and agent integration.

codegraph_security_threat_model_generate

Generates a threat model for the active project.

{
  "tool": "codegraph_security_threat_model_generate",
  "arguments": {
    "format": "json",
    "language": "en",
    "include_mitigations": true
  }
}

Returns the complete threat model in the requested format.

codegraph_security_threat_model_dfd_generate

Returns the Data Flow Diagram.

{
  "tool": "codegraph_security_threat_model_dfd_generate",
  "arguments": {
    "format": "mermaid"
  }
}

Returns Mermaid DFD source or JSON structure with components, flows, and trust boundaries.

GOST R 56939-2024 Compliance

The threat model feature implements process 5.7 (Threat Modeling) from GOST R 56939-2024. The to_gost() exporter produces four artifacts required by the standard:

Artifact	GOST Section	Content
Threat model table	5.7.3.1	STRIDE-classified threats with severity, CWE, component, status
Mitigation list	5.7.3.2	Prioritized countermeasures per threat
Attack surface description	5.7.3.3	Entry points, trust boundaries, boundary-crossing flows
Research targets	5.7.3.4	High-risk components ranked by critical/high threat count

The ThreatModel.compliance_score property calculates the ratio of mitigated threats to total threats, per GOST 5.7 requirements.
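As a rough illustration of that ratio, the sketch below computes it from a list of threat statuses. The status string `"mitigated"` and the choice to return 0.0 for an empty model are assumptions of this example; the actual property may use different status values or edge-case handling.

```python
from __future__ import annotations


def compliance_score(statuses: list[str]) -> float:
    """Ratio of mitigated threats to total threats (0.0 for an empty model)."""
    if not statuses:
        return 0.0  # assumption: no threats yields a score of zero
    mitigated = sum(1 for status in statuses if status == "mitigated")
    return mitigated / len(statuses)


print(compliance_score(["mitigated", "open", "mitigated", "open"]))  # 0.5
```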

Per GOST R 56939-2024 section 5.7.2.4, the threat model must be updated when the codebase changes. The incremental updater (incremental.py) fulfills this requirement by computing deltas between model versions and recording the change context (changed files, previous version) in the model metadata.

Configuration

Threat model settings are in config.yaml under the threat_model: section, backed by ThreatModelConfig in unified_config.py.

threat_model:
  enabled: true

  # Minimum severity to include in reports
  min_severity: low  # low | medium | high | critical

  # Include mitigations by default
  include_mitigations: true

  # Default export format
  default_format: json  # json | markdown | gost | sarif | mermaid

  # Default language for bilingual exports
  default_language: en  # en | ru

  # Trust boundary detection customization
  trust_boundary_detection:
    network_indicators: null   # Override default network indicators
    auth_indicators: null      # Override default auth indicators
    ffi_indicators: null       # Override default FFI indicators

  # Incremental update settings
  incremental:
    auto_update: false        # Auto-update on CPG re-parse
    store_history: true       # Store previous versions for delta
    max_history_versions: 10  # Max stored versions

  # GOST compliance
  gost:
    enabled: true             # Enable GOST artifact generation
    include_research_targets: true

Access configuration programmatically:

from src.config import get_unified_config

cfg = get_unified_config()
cfg.threat_model.enabled
cfg.threat_model.min_severity
cfg.threat_model.default_format
cfg.threat_model.trust_boundary_detection

CWE-to-STRIDE Mapping Reference

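The 43 CWEs are distributed across STRIDE categories as follows. Each CWE is listed once, under a single category; CWEs that also map to a second category are noted in the cross-category mappings at the end.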

Spoofing (7 CWEs): CWE-287, CWE-290, CWE-294, CWE-295, CWE-306, CWE-384, CWE-613

Tampering (10 CWEs): CWE-20, CWE-79, CWE-89, CWE-94, CWE-78, CWE-352, CWE-434, CWE-502, CWE-611, CWE-917

Repudiation (3 CWEs): CWE-778, CWE-223, CWE-532

Information Disclosure (8 CWEs): CWE-200, CWE-209, CWE-312, CWE-319, CWE-522, CWE-538, CWE-601, CWE-732

Denial of Service (5 CWEs): CWE-400, CWE-770, CWE-776, CWE-835, CWE-674

Elevation of Privilege (10 CWEs): CWE-250, CWE-269, CWE-276, CWE-863, CWE-862, CWE-120, CWE-416, CWE-476, CWE-787, CWE-190

Cross-category mappings: CWE-79 (Tampering + Info Disclosure), CWE-94 (Tampering + Elevation), CWE-78 (Tampering + Elevation), CWE-502 (Tampering + Elevation), CWE-611 (Tampering + Info Disclosure), CWE-532 (Repudiation + Info Disclosure), CWE-601 (Info Disclosure + Spoofing), CWE-476 (DoS + Elevation), CWE-190 (Elevation + DoS).
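The lists above can be combined into a single lookup table, which also makes it easy to check the totals. This sketch reconstructs the mapping purely from this reference section; the names `LISTED`, `SECOND_CATEGORY`, and `build_mapping` are hypothetical, and placing the listed category first is an assumption about ordering, not a documented property of CWE_TO_STRIDE.

```python
from __future__ import annotations

# Category under which each CWE is listed in this section.
LISTED = {
    "spoofing": [287, 290, 294, 295, 306, 384, 613],
    "tampering": [20, 79, 89, 94, 78, 352, 434, 502, 611, 917],
    "repudiation": [778, 223, 532],
    "info_disclosure": [200, 209, 312, 319, 522, 538, 601, 732],
    "dos": [400, 770, 776, 835, 674],
    "elevation": [250, 269, 276, 863, 862, 120, 416, 476, 787, 190],
}

# Second category for the nine cross-category CWEs.
SECOND_CATEGORY = {
    79: "info_disclosure", 94: "elevation", 78: "elevation",
    502: "elevation", 611: "info_disclosure", 532: "info_disclosure",
    601: "spoofing", 476: "dos", 190: "dos",
}


def build_mapping() -> dict[str, list[str]]:
    """Combine both tables into {"CWE-<n>": [categories...]}."""
    mapping: dict[str, list[str]] = {}
    for category, cwes in LISTED.items():
        for cwe in cwes:
            mapping[f"CWE-{cwe}"] = [category]
    for cwe, category in SECOND_CATEGORY.items():
        mapping[f"CWE-{cwe}"].append(category)
    return mapping


mapping = build_mapping()
print(len(mapping))        # 43
print(mapping["CWE-79"])   # ['tampering', 'info_disclosure']
```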

Risk Scoring

Each STRIDEThreat has a computed risk_score property:

risk_score = severity_value * likelihood_value

Severity values: critical=10, high=7, medium=4, low=1. Likelihood values: high=3, medium=2, low=1.

Maximum risk score: 30 (critical severity, high likelihood). Threats are sorted by risk score descending in classifier output.
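The full range of possible scores follows directly from the two value tables. The sketch below reproduces the formula as a standalone function; `risk_score` here is a free function for illustration, whereas the document describes it as a property on STRIDEThreat.

```python
# Value tables as stated above.
SEVERITY_VALUES = {"critical": 10, "high": 7, "medium": 4, "low": 1}
LIKELIHOOD_VALUES = {"high": 3, "medium": 2, "low": 1}


def risk_score(severity: str, likelihood: str) -> int:
    """risk_score = severity_value * likelihood_value"""
    return SEVERITY_VALUES[severity] * LIKELIHOOD_VALUES[likelihood]


# Print the score for every severity/likelihood combination.
for severity in SEVERITY_VALUES:
    row = {lik: risk_score(severity, lik) for lik in LIKELIHOOD_VALUES}
    print(f"{severity:<8} {row}")

print(risk_score("critical", "high"))  # 30
```

Under these values a high-severity, low-likelihood threat (7) ranks below a medium-severity, medium-likelihood one (8), so likelihood can reorder threats across severity levels.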

Integration with Hypothesis System

The STRIDE classifier consumes findings from the Security Hypothesis System (V2). When hypothesis results are provided (via CLI --hypothesis-results flag or API request), the classifier maps each finding’s CWE ID to STRIDE categories and creates threats with full provenance (hypothesis finding ID in the evidence field).

This integration gives the threat model access to the hypothesis system’s 58 CWE patterns and 27 CAPEC attack patterns, extending coverage beyond what static CPG pattern matching alone provides.

Examples

Generating a GOST-compliant report

python -m src.cli threat-model generate \
    --db data/projects/postgres.duckdb \
    --format gost \
    --language ru \
    --output threat_model_gost.md

Mermaid DFD in a CI pipeline

python -m src.cli threat-model dfd \
    --db data/projects/myapp.duckdb \
    --format mermaid \
    --output docs/dfd.mmd

# Convert to image (requires mmdc / mermaid-cli)
mmdc -i docs/dfd.mmd -o docs/dfd.svg

Incremental update after a code change

# 1. Re-parse changed files
./gocpg parse --input=src/ --output=myapp.duckdb --lang=c --incremental

# 2. Update threat model
python -m src.cli threat-model update \
    --db myapp.duckdb \
    --previous threat_model_v1.json \
    --output threat_model_v2.json \
    --changed-files src/auth/handler.c src/net/server.c

# 3. Check delta
python -c "
import json
with open('threat_model_v2.json') as f:
    model = json.load(f)
delta = model['metadata']['delta_summary']
print(f'Added: {delta[\"added_threats\"]}, Removed: {delta[\"removed_threats\"]}, Modified: {delta[\"modified_threats\"]}')
"