STRIDE Threat Model

Overview

CodeGraph includes an automated STRIDE threat model generator that builds threat models directly from the Code Property Graph (CPG). Instead of manually drawing data flow diagrams and enumerating threats in spreadsheets, CodeGraph extracts DFD components, data flows, and trust boundaries from the parsed codebase, then classifies threats using the STRIDE methodology with 43 CWE-to-STRIDE mappings.

The feature produces structured output in multiple formats (JSON, Markdown, GOST, SARIF 2.1.0, Mermaid DFD) and supports incremental updates so that the threat model stays current as code evolves.

Key capabilities:

  • Automatic DFD extraction from CPG entry points, call graph, taint sources/sinks
  • Five trust boundary types detected via heuristic pattern matching
  • 43 CWEs mapped to 6 STRIDE categories (MITRE Top 25 + OWASP Top 10)
  • 18 standard mitigations plus 14 CWE-specific recommendations
  • GOST R 56939-2024 process 5.7 compliance (artifacts 5.7.3.1 through 5.7.3.4)
  • Incremental delta computation between model versions
  • Bilingual export (English and Russian)

Source files: src/security/threat_model/

File                  Purpose
models.py             Data models: ThreatModel, STRIDEThreat, DataComponent, DataFlow, TrustBoundary, ThreatModelDelta
dfd_builder.py        Extracts DFD from CPG tables (nodes_method, edges_call)
trust_boundary.py     Detects 5 trust boundary types from entry point categories
stride_classifier.py  Maps CWE findings and CPG patterns to STRIDE categories
mitigation.py         Recommends countermeasures per STRIDE category and CWE
exporter.py           Exports to JSON, Markdown (EN/RU), GOST, SARIF 2.1.0, Mermaid DFD
incremental.py        Computes delta between old and new models

STRIDE Methodology

STRIDE is a threat classification framework developed by Microsoft. Each letter represents a category of threat:

Category                Enum Value       Description                                  Security Property Violated
Spoofing                spoofing         Impersonating a user, system, or component   Authentication
Tampering               tampering        Unauthorized modification of data or code    Integrity
Repudiation             repudiation      Denying an action without proof otherwise    Non-repudiation
Information Disclosure  info_disclosure  Exposing data to unauthorized parties        Confidentiality
Denial of Service       dos              Degrading or denying availability            Availability
Elevation of Privilege  elevation        Gaining unauthorized access or capabilities  Authorization

CodeGraph maps 43 CWEs to these 6 categories. Some CWEs map to multiple categories (e.g., CWE-79 maps to both Tampering and Information Disclosure). The full mapping is defined in CWE_TO_STRIDE inside stride_classifier.py.
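A minimal sketch of how a lookup against such a mapping might behave. The excerpt contains only the three mappings stated in this document; the names CWE_TO_STRIDE_EXCERPT and stride_categories_for are hypothetical and are not taken from stride_classifier.py.

```python
# Hypothetical excerpt: only mappings stated in this document.
# The real CWE_TO_STRIDE dictionary is described as holding 43 entries.
CWE_TO_STRIDE_EXCERPT = {
    "CWE-89": ["tampering"],
    "CWE-287": ["spoofing"],
    "CWE-79": ["tampering", "info_disclosure"],
}


def stride_categories_for(cwe_ids):
    """Return the STRIDE categories for a list of CWE IDs, without duplicates.

    Unknown CWE IDs contribute nothing; first-seen order is preserved.
    """
    seen = []
    for cwe in cwe_ids:
        for category in CWE_TO_STRIDE_EXCERPT.get(cwe, []):
            if category not in seen:
                seen.append(category)
    return seen


print(stride_categories_for(["CWE-79", "CWE-89"]))
```

A finding that carries several CWEs can therefore yield one threat per distinct category rather than one per CWE.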


Architecture

The threat model pipeline consists of six stages executed in sequence:

graph LR
    CPG[(CPG DuckDB)] --> DFD[DFD Builder]
    Domain[Domain Plugin] --> DFD
    DFD --> TB[Trust Boundary Detector]
    TB --> SC[STRIDE Classifier]
    Hyp[Hypothesis Results] --> SC
    SC --> MR[Mitigation Recommender]
    MR --> EXP[Exporter]
    EXP --> JSON[JSON]
    EXP --> MD[Markdown]
    EXP --> GOST[GOST Report]
    EXP --> SARIF[SARIF 2.1.0]
    EXP --> MER[Mermaid DFD]
    PrevModel[Previous Model] --> INC[Incremental Updater]
    EXP --> INC
    INC --> Delta[ThreatModelDelta]

Stage 1: DFD Builder (dfd_builder.py)

The DFDBuilder class extracts a Data Flow Diagram from the CPG. It queries three categories of components:

  • Processes: Modules containing entry points (from nodes_method WHERE is_entry_point = true), grouped by file.
  • Data Stores: Inferred from taint sink categories (database, file, cache, log, network_send) via the domain plugin or built-in fallback.
  • External Entities: Inferred from taint source categories (network, user_input, file_system, environment, ipc) with UNTRUSTED trust level.

Data flows are extracted from the call graph (edges_call) between entry-point modules, plus synthetic flows from external entities to processes with matching categories.

Each component receives a TrustLevel:

  • UNTRUSTED (0) – network-facing, socket, user input, protocol handlers
  • PARTIALLY_TRUSTED (1) – auth, connection handlers
  • TRUSTED (2) – internal modules
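The three levels can be pictured as an ordered enum. The sketch below is an illustration only: the category sets and the infer_trust_level helper are assumptions derived from the descriptions above, not the actual code in dfd_builder.py.

```python
from enum import IntEnum


class TrustLevel(IntEnum):
    """Numeric values follow the levels listed in this document."""
    UNTRUSTED = 0
    PARTIALLY_TRUSTED = 1
    TRUSTED = 2


# Assumed category names, chosen to mirror the descriptions in the text.
UNTRUSTED_CATEGORIES = {"network", "socket", "user_input", "protocol"}
PARTIALLY_TRUSTED_CATEGORIES = {"auth", "connection"}


def infer_trust_level(entry_category: str) -> TrustLevel:
    """Map an entry point category to a trust level (hypothetical helper)."""
    if entry_category in UNTRUSTED_CATEGORIES:
        return TrustLevel.UNTRUSTED
    if entry_category in PARTIALLY_TRUSTED_CATEGORIES:
        return TrustLevel.PARTIALLY_TRUSTED
    # Anything not recognised is treated as an internal module.
    return TrustLevel.TRUSTED
```

Using an IntEnum keeps the levels comparable, so a check such as "is the source less trusted than the target" is a plain integer comparison.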

Stage 2: Trust Boundary Detector (trust_boundary.py)

The TrustBoundaryDetector identifies five types of trust boundaries:

Boundary Type  Indicators                                                      Entry Categories
network        recv, accept, listen, http_handler, grpc_handler, serve         network, socket, protocol, http_handler
auth           authenticate, authorize, check_permission, verify_token, login  auth, connection
ffi            cgo_call, ctypes, cffi, jni, ffi_call                           extension
process        exec, subprocess, popen, system, spawn                          exec
file_system    open, read_file, write_file, fopen, readdir                     file_access

Detection is heuristic: it matches component names and entry point categories against indicator lists. Custom indicators can be provided via configuration.
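To make the heuristic concrete, here is a simplified sketch that matches a component name against the indicator lists from the table above. Plain substring matching and the detect_boundary_types helper are assumptions for illustration; the real detector also considers entry point categories and may match differently.

```python
# Indicator lists copied from the table in this document.
DEFAULT_INDICATORS = {
    "network": ["recv", "accept", "listen", "http_handler", "grpc_handler", "serve"],
    "auth": ["authenticate", "authorize", "check_permission", "verify_token", "login"],
    "ffi": ["cgo_call", "ctypes", "cffi", "jni", "ffi_call"],
    "process": ["exec", "subprocess", "popen", "system", "spawn"],
    "file_system": ["open", "read_file", "write_file", "fopen", "readdir"],
}


def detect_boundary_types(component_name, overrides=None):
    """Return the sorted boundary types whose indicators occur in the name.

    `overrides` replaces the indicator list of the given boundary types,
    mirroring the idea of custom indicators supplied via configuration.
    """
    indicators = {**DEFAULT_INDICATORS, **(overrides or {})}
    name = component_name.lower()
    return sorted(
        boundary_type
        for boundary_type, words in indicators.items()
        if any(word in name for word in words)
    )


print(detect_boundary_types("http_handler_login"))
print(detect_boundary_types("quic_stream", overrides={"network": ["quic"]}))
```

Substring matching is deliberately crude: a name containing popen also contains open, so it would be reported under both process and file_system. This is one reason the results should be treated as heuristic.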

After detection, the detector marks data flows that cross trust boundaries. A flow crosses a boundary when the source and target have different trust levels, or when one is inside and the other outside a boundary.
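The crossing rule can be sketched as follows. The Flow dataclass, the mark_crossings function, and the shape of its inputs are hypothetical; only the rule itself (different trust levels, or one endpoint inside a boundary and the other outside) comes from the text.

```python
from dataclasses import dataclass
from typing import Dict, List, Set


@dataclass
class Flow:
    source: str
    target: str
    crosses_boundary: bool = False


def mark_crossings(
    flows: List[Flow],
    trust_levels: Dict[str, int],
    boundary_members: Dict[str, Set[str]],
) -> List[Flow]:
    """Set crosses_boundary on each flow, in place, and return the list."""
    for flow in flows:
        levels_differ = trust_levels[flow.source] != trust_levels[flow.target]
        # True when exactly one endpoint is a member of some boundary.
        straddles = any(
            (flow.source in members) != (flow.target in members)
            for members in boundary_members.values()
        )
        flow.crosses_boundary = levels_differ or straddles
    return flows


flows = mark_crossings(
    [Flow("client", "api"), Flow("api", "db"), Flow("api", "util")],
    trust_levels={"client": 0, "api": 2, "db": 2, "util": 2},
    boundary_members={"file_system": {"db"}},
)
print([(f.source, f.target, f.crosses_boundary) for f in flows])
```

In this example the first flow crosses because the trust levels differ, the second because only db sits inside the file_system boundary, and the third does not cross at all.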

Stage 3: STRIDE Classifier (stride_classifier.py)

The STRIDEClassifier generates threats from four sources:

  1. Hypothesis findings – CWE-based security findings are mapped to STRIDE categories via the CWE_TO_STRIDE dictionary.
  2. CPG pattern findings – Rows from cpg_pattern_findings are classified by their CWE IDs, or by pattern name heuristics when CWEs are absent.
  3. Boundary crossing inference – Unencrypted flows across trust boundaries produce Information Disclosure threats; user input across boundaries produces Tampering threats; credential flows without auth boundaries produce Spoofing threats.
  4. Unprotected entry points – Network-facing entry points not inside an auth boundary produce Spoofing threats (CWE-306).
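The boundary crossing inference in item 3 might look like the sketch below. The flow dictionary keys and the infer_boundary_threats name are hypothetical, and whether the credential rule also requires a boundary crossing is not stated in this document, so it is applied independently here.

```python
def infer_boundary_threats(flow: dict) -> list:
    """Return STRIDE category names inferred for a single data flow.

    Expected (hypothetical) keys, all optional and treated as False if absent:
    crosses_boundary, encrypted, carries_user_input,
    carries_credentials, behind_auth_boundary.
    """
    categories = []
    if flow.get("crosses_boundary"):
        if not flow.get("encrypted"):
            # Unencrypted flow across a trust boundary.
            categories.append("info_disclosure")
        if flow.get("carries_user_input"):
            # User input across a trust boundary.
            categories.append("tampering")
    if flow.get("carries_credentials") and not flow.get("behind_auth_boundary"):
        # Credential flow with no auth boundary protecting it.
        categories.append("spoofing")
    return categories


print(infer_boundary_threats(
    {"crosses_boundary": True, "encrypted": False, "carries_user_input": True}
))
```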

The classifier deduplicates threats by (category, affected_component, cwe_ids) and ranks them by risk score (severity × likelihood), highest first.
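A sketch of deduplication and ranking under stated assumptions: threats are plain dictionaries with a precomputed risk_score, and the first occurrence of a duplicate is the one kept. Neither detail is confirmed by this document.

```python
def dedupe_and_rank(threats):
    """Drop duplicate threats and sort the rest by risk score, highest first.

    Two threats are duplicates when they share category, affected component,
    and the same set of CWE IDs (order of CWE IDs is ignored).
    """
    unique = {}
    for threat in threats:
        key = (
            threat["category"],
            threat["affected_component"],
            tuple(sorted(threat["cwe_ids"])),
        )
        # Assumption: the first occurrence wins.
        unique.setdefault(key, threat)
    return sorted(unique.values(), key=lambda t: t["risk_score"], reverse=True)


threats = [
    {"category": "tampering", "affected_component": "api",
     "cwe_ids": ["CWE-89"], "risk_score": 14},
    {"category": "tampering", "affected_component": "api",
     "cwe_ids": ["CWE-89"], "risk_score": 14},
    {"category": "spoofing", "affected_component": "login",
     "cwe_ids": ["CWE-287"], "risk_score": 30},
]
for threat in dedupe_and_rank(threats):
    print(threat["category"], threat["risk_score"])
```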

Stage 4: Mitigation Recommender (mitigation.py)

The MitigationRecommender provides two layers of recommendations:

  • Category-based: 18 standard mitigations across 6 STRIDE categories (3 per category), each with an ID (e.g., M-S-1, M-T-2), title, and description.
  • CWE-specific: 14 CWE-specific mitigations for the most common vulnerabilities (CWE-89, CWE-79, CWE-78, CWE-120, CWE-287, CWE-200, CWE-352, CWE-502, CWE-319, CWE-400, CWE-862, CWE-787, CWE-416, CWE-476).

Mitigation output is prioritized by threat severity: critical and high threats receive all applicable mitigations, medium threats receive 4, and low threats receive 2.
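The severity-based cap can be illustrated with a small helper. The select_mitigations name and the assumption that candidates arrive already ordered by relevance are hypothetical; the limits are the ones stated above.

```python
# None means "no cap": critical and high threats keep every mitigation.
MAX_MITIGATIONS = {"critical": None, "high": None, "medium": 4, "low": 2}


def select_mitigations(severity, candidates):
    """Return the mitigations to report for a threat of the given severity.

    `candidates` is assumed to be ordered from most to least relevant.
    """
    limit = MAX_MITIGATIONS[severity]
    candidates = list(candidates)
    return candidates if limit is None else candidates[:limit]


candidates = ["m1", "m2", "m3", "m4", "m5"]  # placeholder mitigation IDs
print(select_mitigations("high", candidates))
print(select_mitigations("medium", candidates))
print(select_mitigations("low", candidates))
```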

Stage 5: Exporter (exporter.py)

The ThreatModelExporter supports five output formats:

Format       Method                           Description
JSON         to_json() / to_json_string()     Full model as JSON dict or formatted string
Markdown     to_markdown(language="en"|"ru")  Report with summary tables, DFD, threat list, mitigations
GOST         to_gost(language="ru")           GOST R 56939-2024 artifacts 5.7.3.1-5.7.3.4
SARIF 2.1.0  to_sarif()                       OASIS SARIF format for IDE/CI integration
Mermaid DFD  to_mermaid_dfd()                 Mermaid diagram with trust boundary subgraphs

The Mermaid DFD renderer uses distinct shapes for component types:

  • Processes: (name) (rounded)
  • Data Stores: [(name)] (cylinder)
  • External Entities: [/name\] (trapezoid)
  • Encrypted flows: solid arrows with “encrypted” label
  • Boundary-crossing flows: dashed arrows
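The shape and arrow conventions translate to Mermaid flowchart syntax roughly as sketched below. The render_node and render_flow helpers are hypothetical, and how the real renderer draws a flow that is both encrypted and boundary-crossing is not documented; this sketch assumes a dashed arrow that keeps the label.

```python
# Opening and closing delimiters for each component kind in Mermaid syntax.
SHAPES = {
    "process": ("(", ")"),              # rounded node
    "data_store": ("[(", ")]"),         # cylinder
    "external_entity": ("[/", "\\]"),   # trapezoid
}


def render_node(node_id, name, kind):
    """Return one Mermaid node definition line."""
    left, right = SHAPES[kind]
    return f"{node_id}{left}{name}{right}"


def render_flow(src, dst, encrypted=False, crosses_boundary=False):
    """Return one Mermaid edge line between two node IDs."""
    if crosses_boundary:
        # Dashed arrow; assumption: the label is kept when also encrypted.
        arrow = "-. encrypted .->" if encrypted else "-.->"
    else:
        arrow = "-->|encrypted|" if encrypted else "-->"
    return f"{src} {arrow} {dst}"


lines = [
    "graph LR",
    "    " + render_node("ext1", "client", "external_entity"),
    "    " + render_node("p1", "api", "process"),
    "    " + render_node("ds1", "database", "data_store"),
    "    " + render_flow("ext1", "p1", crosses_boundary=True),
    "    " + render_flow("p1", "ds1", encrypted=True),
]
print("\n".join(lines))
```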

Stage 6: Incremental Updater (incremental.py)

The IncrementalThreatModelUpdater compares a previous and newly generated model, producing a ThreatModelDelta with:

  • added_threats – threats present only in the new model
  • removed_threats – threats present only in the old model
  • modified_threats – threats with the same ID but changed severity, status, category, CWEs, or affected component
  • added_components / removed_components – DFD component changes

The delta is stored in the merged model’s metadata.delta_summary for audit trail purposes.
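A simplified view of the threat part of the delta computation, assuming threats are dictionaries keyed by an id field. The compute_delta function is a hypothetical stand-in for IncrementalThreatModelUpdater, and the compared field names are inferred from the list above.

```python
# Fields whose change marks a threat as modified, per the description above.
COMPARED_FIELDS = ("severity", "status", "category", "cwe_ids", "affected_component")


def compute_delta(old_threats, new_threats):
    """Compare two threat lists by ID and return added/removed/modified."""
    old = {t["id"]: t for t in old_threats}
    new = {t["id"]: t for t in new_threats}
    return {
        "added_threats": [new[i] for i in new if i not in old],
        "removed_threats": [old[i] for i in old if i not in new],
        "modified_threats": [
            new[i]
            for i in new
            if i in old
            and any(old[i].get(f) != new[i].get(f) for f in COMPARED_FIELDS)
        ],
    }


old_model = [
    {"id": "T1", "severity": "high", "status": "open"},
    {"id": "T2", "severity": "low", "status": "open"},
]
new_model = [
    {"id": "T1", "severity": "high", "status": "mitigated"},
    {"id": "T3", "severity": "medium", "status": "open"},
]
delta = compute_delta(old_model, new_model)
print({key: [t["id"] for t in value] for key, value in delta.items()})
```

Component changes (added_components, removed_components) would follow the same set-difference pattern on component identifiers.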


CLI Usage

All threat model commands are under the threat-model subcommand group.

Generate a full threat model

python -m src.cli threat-model generate \
    --db data/projects/myproject.duckdb \
    --format json \
    --output threat_model.json

Options:

Flag                   Default            Description
--db                   Active project DB  Path to DuckDB CPG database
--format               json               Output format: json, markdown, gost, sarif, mermaid
--output               stdout             Output file path
--language             en                 Language for Markdown/GOST: en or ru
--include-mitigations  true               Include mitigation recommendations
--hypothesis-results   none               Path to JSON file with hypothesis findings

Incremental update

python -m src.cli threat-model update \
    --db data/projects/myproject.duckdb \
    --previous threat_model_v1.json \
    --output threat_model_v2.json \
    --changed-files src/auth/login.c src/net/handler.c

Produces the updated model plus a delta summary showing added, removed, and modified threats.

Export DFD only

python -m src.cli threat-model dfd \
    --db data/projects/myproject.duckdb \
    --format mermaid \
    --output dfd.mmd

Generates just the Data Flow Diagram in Mermaid format, without running STRIDE classification.

List threats

python -m src.cli threat-model list \
    --db data/projects/myproject.duckdb \
    --severity high,critical \
    --category spoofing,elevation \
    --format json

Filters and lists threats from the current model, useful for CI pipelines.


API Endpoints

Seven REST endpoints are available under /api/v1/security/threat-model/.

POST /generate

Generate a full threat model for the active project.

curl -X POST http://localhost:8000/api/v1/security/threat-model/generate \
  -H "Content-Type: application/json" \
  -H "X-Project-Id: myproject" \
  -d '{
    "include_mitigations": true,
    "hypothesis_results_path": null
  }'

Response: full ThreatModel JSON object.

GET /export

Export the current threat model in a specified format.

curl "http://localhost:8000/api/v1/security/threat-model/export?format=markdown&language=en"

Query parameters: format (json|markdown|gost|sarif|mermaid), language (en|ru).

GET /dfd

Return the Data Flow Diagram for the active project.

curl "http://localhost:8000/api/v1/security/threat-model/dfd?format=mermaid"

Returns Mermaid source text or JSON component/flow structure depending on format.

GET /threats

List threats with optional filtering.

curl "http://localhost:8000/api/v1/security/threat-model/threats?severity=critical,high&category=spoofing"

Query parameters: severity, category, status, cwe_id. Returns a filtered list.
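For scripted access, the query string can be built with the standard library. The threats_url helper is hypothetical, and it assumes that every filter accepts comma-separated values in the way the severity example above does; only the parameter names and the base path come from this document.

```python
from urllib.parse import urlencode

BASE_URL = "http://localhost:8000/api/v1/security/threat-model"


def threats_url(severity=None, category=None, status=None, cwe_id=None):
    """Build a GET /threats URL from optional lists of filter values."""
    params = {}
    filters = (
        ("severity", severity),
        ("category", category),
        ("status", status),
        ("cwe_id", cwe_id),
    )
    for key, values in filters:
        if values:
            params[key] = ",".join(values)
    # safe="," keeps the commas readable instead of encoding them as %2C.
    query = urlencode(params, safe=",")
    return f"{BASE_URL}/threats" + (f"?{query}" if query else "")


print(threats_url(severity=["critical", "high"], category=["spoofing"]))
```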

POST /update

Incremental update from a previous model.

curl -X POST http://localhost:8000/api/v1/security/threat-model/update \
  -H "Content-Type: application/json" \
  -d '{
    "previous_model": { ... },
    "changed_files": ["src/auth/login.c"]
  }'

Response includes model (updated) and delta (changes).

GET /mitigations

Get mitigation recommendations for all threats or a specific threat ID.

curl "http://localhost:8000/api/v1/security/threat-model/mitigations?threat_id=TM-spoofing-CWE-287-hyp-1"

GET /stride-mapping

Return the CWE-to-STRIDE mapping table.

curl "http://localhost:8000/api/v1/security/threat-model/stride-mapping"

Returns { "CWE-89": ["tampering"], "CWE-287": ["spoofing"], ... } with all 43 entries.


MCP Tools

Two MCP tools are exposed for IDE and agent integration.

codegraph_threat_model_generate

Generates a threat model for the active project.

{
  "tool": "codegraph_threat_model_generate",
  "arguments": {
    "format": "json",
    "language": "en",
    "include_mitigations": true
  }
}

Returns the complete threat model in the requested format.

codegraph_threat_model_dfd

Returns the Data Flow Diagram.

{
  "tool": "codegraph_threat_model_dfd",
  "arguments": {
    "format": "mermaid"
  }
}

Returns Mermaid DFD source or JSON structure with components, flows, and trust boundaries.


GOST R 56939-2024 Compliance

The threat model feature implements process 5.7 (Threat Modeling) from GOST R 56939-2024. The to_gost() exporter produces four artifacts required by the standard:

Artifact                    GOST Section  Content
Threat model table          5.7.3.1       STRIDE-classified threats with severity, CWE, component, status
Mitigation list             5.7.3.2       Prioritized countermeasures per threat
Attack surface description  5.7.3.3       Entry points, trust boundaries, boundary-crossing flows
Research targets            5.7.3.4       High-risk components ranked by critical/high threat count

The ThreatModel.compliance_score property calculates the ratio of mitigated threats to total threats, per GOST 5.7 requirements.
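As a worked illustration of that ratio, under two assumptions that this document does not confirm: the status value for a mitigated threat is the string "mitigated", and an empty model scores 0.0.

```python
def compliance_score(threat_statuses):
    """Return mitigated threats divided by total threats, in [0.0, 1.0]."""
    if not threat_statuses:
        # Assumption: a model with no threats scores 0.0 rather than 1.0.
        return 0.0
    mitigated = sum(1 for status in threat_statuses if status == "mitigated")
    return mitigated / len(threat_statuses)


# Two of four threats mitigated.
print(compliance_score(["mitigated", "open", "mitigated", "open"]))
```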

Per GOST R 56939-2024 section 5.7.2.4, the threat model must be updated when the codebase changes. The incremental updater (incremental.py) fulfills this requirement by computing deltas between model versions and recording the change context (changed files, previous version) in the model metadata.


Configuration

Threat model settings are in config.yaml under the threat_model: section, backed by ThreatModelConfig in unified_config.py.

threat_model:
  enabled: true

  # Minimum severity to include in reports
  min_severity: low  # low | medium | high | critical

  # Include mitigations by default
  include_mitigations: true

  # Default export format
  default_format: json  # json | markdown | gost | sarif | mermaid

  # Default language for bilingual exports
  default_language: en  # en | ru

  # Trust boundary detection customization
  trust_boundary_detection:
    network_indicators: null   # Override default network indicators
    auth_indicators: null      # Override default auth indicators
    ffi_indicators: null       # Override default FFI indicators

  # Incremental update settings
  incremental:
    auto_update: false        # Auto-update on CPG re-parse
    store_history: true       # Store previous versions for delta
    max_history_versions: 10  # Max stored versions

  # GOST compliance
  gost:
    enabled: true             # Enable GOST artifact generation
    include_research_targets: true

Access configuration programmatically:

from src.config import get_unified_config

cfg = get_unified_config()
cfg.threat_model.enabled
cfg.threat_model.min_severity
cfg.threat_model.default_format
cfg.threat_model.trust_boundary_detection

CWE-to-STRIDE Mapping Reference

The 43 CWEs are listed below once each, under a single STRIDE category. CWEs that also map to a second category are listed under the cross-category mappings at the end of this section.

Spoofing (7 CWEs): CWE-287, CWE-290, CWE-294, CWE-295, CWE-306, CWE-384, CWE-613

Tampering (10 CWEs): CWE-20, CWE-79, CWE-89, CWE-94, CWE-78, CWE-352, CWE-434, CWE-502, CWE-611, CWE-917

Repudiation (3 CWEs): CWE-778, CWE-223, CWE-532

Information Disclosure (8 CWEs): CWE-200, CWE-209, CWE-312, CWE-319, CWE-522, CWE-538, CWE-601, CWE-732

Denial of Service (5 CWEs): CWE-400, CWE-770, CWE-776, CWE-835, CWE-674

Elevation of Privilege (10 CWEs): CWE-250, CWE-269, CWE-276, CWE-863, CWE-862, CWE-120, CWE-416, CWE-476, CWE-787, CWE-190

Cross-category mappings: CWE-79 (Tampering + Info Disclosure), CWE-94 (Tampering + Elevation), CWE-78 (Tampering + Elevation), CWE-502 (Tampering + Elevation), CWE-611 (Tampering + Info Disclosure), CWE-532 (Repudiation + Info Disclosure), CWE-601 (Info Disclosure + Spoofing), CWE-476 (DoS + Elevation), CWE-190 (Elevation + DoS).


Risk Scoring

Each STRIDEThreat has a computed risk_score property:

risk_score = severity_value * likelihood_value

Severity values: critical=10, high=7, medium=4, low=1. Likelihood values: high=3, medium=2, low=1.

Maximum risk score: 30 (critical severity, high likelihood). Threats are sorted by risk score descending in classifier output.
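The formula and value tables above can be checked with a short sketch. ThreatSketch is a hypothetical stand-in for STRIDEThreat that models only the two fields the formula needs.

```python
from dataclasses import dataclass

# Values as given in this section.
SEVERITY_VALUES = {"critical": 10, "high": 7, "medium": 4, "low": 1}
LIKELIHOOD_VALUES = {"high": 3, "medium": 2, "low": 1}


@dataclass
class ThreatSketch:
    severity: str
    likelihood: str

    @property
    def risk_score(self) -> int:
        """severity_value * likelihood_value, as defined above."""
        return SEVERITY_VALUES[self.severity] * LIKELIHOOD_VALUES[self.likelihood]


threats = [
    ThreatSketch("medium", "medium"),   # 4 * 2
    ThreatSketch("critical", "high"),   # 10 * 3, the maximum
    ThreatSketch("low", "low"),         # 1 * 1, the minimum
]
for threat in sorted(threats, key=lambda t: t.risk_score, reverse=True):
    print(threat.severity, threat.likelihood, threat.risk_score)
```

With these tables the score ranges from 1 to 30. A high-severity, low-likelihood threat (7) ranks below a medium-severity, medium-likelihood one (8).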


Integration with Hypothesis System

The STRIDE classifier consumes findings from the Security Hypothesis System (V2). When hypothesis results are provided (via CLI --hypothesis-results flag or API request), the classifier maps each finding’s CWE ID to STRIDE categories and creates threats with full provenance (hypothesis finding ID in the evidence field).

Through this integration the threat model draws on the hypothesis system’s 58 CWE patterns and 27 CAPEC attack patterns, extending coverage beyond what static CPG pattern matching alone provides.


Examples

Generating a GOST-compliant report

python -m src.cli threat-model generate \
    --db data/projects/postgres.duckdb \
    --format gost \
    --language ru \
    --output threat_model_gost.md

Mermaid DFD in a CI pipeline

python -m src.cli threat-model dfd \
    --db data/projects/myapp.duckdb \
    --format mermaid \
    --output docs/dfd.mmd

# Convert to image (requires mmdc / mermaid-cli)
mmdc -i docs/dfd.mmd -o docs/dfd.svg

Incremental update after a code change

# 1. Re-parse changed files
./gocpg parse --input=src/ --output=myapp.duckdb --lang=c --incremental

# 2. Update threat model
python -m src.cli threat-model update \
    --db myapp.duckdb \
    --previous threat_model_v1.json \
    --output threat_model_v2.json \
    --changed-files src/auth/handler.c src/net/server.c

# 3. Check delta
python -c "
import json
with open('threat_model_v2.json') as f:
    model = json.load(f)
delta = model['metadata']['delta_summary']
print(f'Added: {delta[\"added_threats\"]}, Removed: {delta[\"removed_threats\"]}, Modified: {delta[\"modified_threats\"]}')
"