Scenario 11: Architecture Analysis

For software architects who need to understand and document system architecture, detect violations, and analyze dependencies.

Quick Start

# Select Architecture Scenario
/select 11

How It Works

Intent Classification

The ArchitectureIntentDetector classifies each query into one of the 18 intents listed below, plus a fallback. Intents are checked in priority order (lower number = more specific = checked first). Each intent has English and Russian keywords with morphological matching, plus domain-specific keywords loaded from the active plugin.

Intent                   Priority  Description
dependency_cycle         5         Circular/cyclic dependencies
srp_violation            6         Single Responsibility Principle violations
wcc_analysis             7         Weakly connected components
shared_deps              8         Shared dependencies between modules
transitive_deps          9         Transitive dependencies from entry points
extraction_candidates    10        Candidates for extraction to library
di_points                11        Dependency injection points
hidden_deps              12        Hidden dependencies via global variables
layer_analysis           13        Architecture layer analysis
layer_violation          14        Layer boundary violations
stability_metrics        15        Dependency stability metrics
change_impact            16        Change impact analysis
coupling_analysis        17        Coupling/cohesion, fan-in/fan-out
module_deps              18        Specific module dependencies
interface_deps           19        Interface/exported function dependencies
modularity_check         20        Modularity and encapsulation check
full_analysis            21        Comprehensive end-to-end analysis
external_dependencies    50        External libraries and third-party deps

If no intent matches, the fallback general_architecture is used (confidence=0.5).

The detector also extracts:

  • target_component: subsystem/module name from the query
  • severity_filter: critical / high / medium / low / all (EN + RU morphological matching)
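
The priority-ordered matching described above can be sketched as follows. This is a minimal illustration and not the actual ArchitectureIntentDetector: the keyword lists, the detect_intent function, and the 0.9 confidence for a keyword hit are all assumptions, only four intents are shown, and plain substring matching stands in for morphological matching.

```python
from typing import NamedTuple

class Intent(NamedTuple):
    name: str
    priority: int              # lower = more specific = checked first
    keywords: tuple[str, ...]

# Hypothetical keyword lists; names and priorities come from the intent table.
INTENTS = [
    Intent("external_dependencies", 50, ("third-party", "external")),
    Intent("coupling_analysis", 17, ("coupling", "fan-in", "fan-out")),
    Intent("dependency_cycle", 5, ("circular", "cyclic", "cycle")),
    Intent("layer_violation", 14, ("layer violation",)),
]

def detect_intent(query: str) -> tuple[str, float]:
    """Return (intent_name, confidence) for the first intent whose keyword matches."""
    text = query.lower()
    for intent in sorted(INTENTS, key=lambda i: i.priority):
        if any(keyword in text for keyword in intent.keywords):
            return intent.name, 0.9   # assumed confidence for a keyword hit
    # No intent matched: use the documented fallback.
    return "general_architecture", 0.5
```

A query that mentions both "circular" and "coupling" resolves to dependency_cycle, because that intent has the lower priority number and is checked first.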

Two-Phase Architecture

S11 uses a two-phase approach for optimal performance:

Phase 1: Handler-based (no LLM). The integrate_handlers() function tries 17 registered template-based handlers. Each handler produces structured reports directly from CPG data using ArchitectureReportFormatter. If a handler matches the detected intent and finds results, the response is returned without calling the LLM.

Phase 2: LLM fallback. If no handler matched or the handler found 0 results, the full pipeline runs: DependencyAnalyzer detects violations, LayerValidator checks layering rules, CallGraphAnalyzer analyzes dependency paths, and ArchitectureReporter generates a prioritized report. The LLM then enriches the response.

Query -> ArchitectureIntentDetector -> integrate_handlers()
  |                                          |
  |  Phase 1: Handler matched?      Yes -> Structured report (no LLM)
  |                                 No  -> Phase 2: Full pipeline
  |
  Phase 2: DependencyAnalyzer -> LayerValidator
           -> CallGraphAnalyzer -> ArchitectureReporter -> LLM
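
The control flow in the diagram can be sketched as a small dispatcher. This is an illustrative sketch only: the answer function, the Handler type, and the way handlers signal "no results" (returning None) are assumptions, not the signature of integrate_handlers().

```python
from typing import Callable, Optional

# Hypothetical handler type: returns a report, or None when it finds nothing.
Handler = Callable[[str], Optional[str]]

def answer(query: str, intent: str,
           handlers: dict[str, Handler],
           llm_pipeline: Callable[[str], str]) -> tuple[str, str]:
    """Return (phase, response), trying the handler phase before the LLM phase."""
    handler = handlers.get(intent)
    if handler is not None:
        report = handler(query)
        if report:                       # handler matched and found results
            return "handler", report
    # No handler for this intent, or the handler found 0 results.
    return "llm", llm_pipeline(query)

# Stand-in handlers and pipeline for demonstration.
handlers = {
    "dependency_cycle": lambda q: "Detected Cycles: 3",
    "layer_violation": lambda q: None,   # matched, but 0 results
}
fallback = lambda q: f"LLM-enriched answer for: {q}"
```

With these stand-ins, a dependency_cycle query is answered in the handler phase, while layer_violation (no results) and any unregistered intent fall through to the LLM phase.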

The 17 handlers are registered in HandlerRegistry("architecture"):

Handler                        Priority  Intent
SRPViolationHandler            8         srp_violation
DependencyCycleHandler         10        dependency_cycle
LayerAnalysisHandler           11        layer_analysis
SharedDepsHandler              12        shared_deps
ModuleDepsHandler              13        module_deps
WCCAnalysisHandler             14        wcc_analysis
CouplingHandler                15        coupling_analysis
TransitiveDepsHandler          16        transitive_deps
InterfaceDepsHandler           17        interface_deps
ChangeImpactHandler            18        change_impact
StabilityMetricsHandler        19        stability_metrics
LayerViolationHandler          20        layer_violation
FullAnalysisHandler            21        full_analysis
HiddenDepsHandler              22        hidden_deps
DIPointsHandler                23        di_points
ExtractionCandidatesHandler    24        extraction_candidates
ExternalDepsHandler            30        external_dependencies

Three Architecture Agents

The LLM fallback pipeline uses three specialized agents from src/architecture/agents/:

DependencyAnalyzer(cpg_service): Agent 1. Detects dependency violations using architecture patterns and CallGraphAnalyzer:

  • Circular dependencies between modules
  • God modules (excessive fan-out)
  • Unstable dependencies (stable modules depending on unstable ones)
  • Feature envy (methods too interested in other modules)
  • Inappropriate intimacy (bidirectional tight coupling)
  • Methods: detect_all_violations(limit_per_pattern), calculate_dependency_metrics(), identify_architectural_chokepoints()

LayerValidator(cpg_service, layer_hierarchy=None): Agent 2. Validates architectural layering:

  • Lower layers calling higher layers (violation)
  • Cross-layer dependencies (skip-layer smell)
  • Default 4-tier hierarchy: system(0), storage/data(1), business/logic(2), presentation/interface(3)
  • Methods: validate_all_layers(limit), get_layering_violations()

ArchitectureReporter(): Agent 3. Generates reports:

  • Structured violation reports with severity breakdown (critical/high/medium/low)
  • Category breakdown (dependency/layering/coupling/cohesion)
  • Remediation recommendations with RemediationAction
  • Methods: generate_report(findings, dependency_analysis, layer_metrics), create_remediation_plan(findings)

Key data models (src/architecture/agents/models.py):

  • ViolationFinding: detected violation (severity, category, description, affected_files)
  • DependencyAnalysis / DependencyMetrics: analysis results and metrics
  • ArchitectureReport: complete report with findings and remediation
  • RemediationAction: recommended fix with priority
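
To make the shape of a finding and the severity breakdown concrete, here is a rough sketch. The field names on ViolationFinding follow the list above, but the field types, the use of a dataclass, and the severity_breakdown helper are assumptions. The real definitions live in src/architecture/agents/models.py and may differ.

```python
from collections import Counter
from dataclasses import dataclass, field

@dataclass
class ViolationFinding:
    # Field names from the documentation; types are assumed.
    severity: str        # critical / high / medium / low
    category: str        # dependency / layering / coupling / cohesion
    description: str
    affected_files: list[str] = field(default_factory=list)

def severity_breakdown(findings: list[ViolationFinding]) -> dict[str, int]:
    """Count findings per severity level (hypothetical helper)."""
    counts = Counter(f.severity for f in findings)
    return {level: counts.get(level, 0)
            for level in ("critical", "high", "medium", "low")}

findings = [
    ViolationFinding("critical", "dependency", "executor <-> planner cycle",
                     ["execMain.c", "planmain.c"]),
    ViolationFinding("high", "layering", "storage calls presentation", ["bufmgr.c"]),
    ViolationFinding("high", "dependency", "storage <-> catalog cycle",
                     ["bufmgr.c", "pg_class.c"]),
]
breakdown = severity_breakdown(findings)
```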

Dependency Analysis

Circular Dependencies

> Find circular dependencies in the codebase

╭─────────────── Dependency Cycles ─────────────────────────────╮
│                                                                │
│  Detected Cycles: 3                                            │
│                                                                │
│  Cycle 1 (critical):                                           │
│    executor -> planner -> executor                             │
│    Files: execMain.c, planmain.c                               │
│    Impact: 12 transitive dependents                            │
│                                                                │
│  Cycle 2 (high):                                               │
│    storage -> catalog -> storage                               │
│    Files: bufmgr.c, pg_class.c                                 │
│    Impact: 8 transitive dependents                             │
│                                                                │
│  Cycle 3 (medium):                                             │
│    utils -> parser -> utils                                    │
│    Files: stringinfo.c, scansup.c                              │
│    Impact: 3 transitive dependents                             │
│                                                                │
╰────────────────────────────────────────────────────────────────╯

The DependencyCycleHandler uses CallGraphAnalyzer to traverse the call graph and detect cycles, then ranks them by severity based on transitive impact.
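
Cycle detection over a module-level dependency graph can be sketched with a depth-first search that reports a cycle whenever it meets a node that is still on the current path. This is a generic textbook approach shown for illustration. The find_cycles function and the dictionary graph format are assumptions and do not describe how CallGraphAnalyzer is implemented.

```python
def find_cycles(graph: dict[str, list[str]]) -> list[list[str]]:
    """Return one cycle per back edge found by DFS, each as [a, b, ..., a]."""
    WHITE, GRAY, BLACK = 0, 1, 2          # unvisited / on current path / finished
    color = {node: WHITE for node in graph}
    path: list[str] = []
    cycles: list[list[str]] = []

    def visit(node: str) -> None:
        color[node] = GRAY
        path.append(node)
        for nxt in graph.get(node, []):
            state = color.get(nxt, WHITE)
            if state == GRAY:             # back edge: nxt is on the current path
                cycles.append(path[path.index(nxt):] + [nxt])
            elif state == WHITE:
                visit(nxt)
        path.pop()
        color[node] = BLACK

    for node in list(graph):
        if color[node] == WHITE:
            visit(node)
    return cycles

# Toy graph loosely based on Cycle 1 in the sample output.
modules = {
    "executor": ["planner"],
    "planner": ["executor", "utils"],
    "utils": [],
}
```

The recursive form is fine for module-level graphs. For very large graphs an iterative traversal avoids Python's recursion limit.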

God Modules and Unstable Dependencies

DependencyAnalyzer detects additional violation types:

  • God modules: modules with excessive fan-out (too many dependencies)
  • Unstable dependencies: a stable module depends on an unstable one (violates the Stable Dependencies Principle)
  • Feature envy: methods that access data from other modules more than their own
  • Inappropriate intimacy: bidirectional tight coupling between modules

Dependency Fast-Path

For simple dependency queries (“What does X depend on?”, “Who includes X?”), S11 uses a fast-path that bypasses the full agent pipeline:

  • detect_architecture_query_type() classifies the query into: who_imports_x, what_x_depends_on, or external_deps
  • Queries the nodes_import table directly for #include / import relationships
  • Extracts target_module from query using regex patterns (EN + RU)
  • Returns results without LLM involvement
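
The classification and target extraction steps above might look roughly like this. The regular expressions and the classify function are hypothetical and English-only. The actual patterns used by detect_architecture_query_type() are not shown in this document and also cover Russian.

```python
import re
from typing import Optional

# Hypothetical English-only patterns, checked in order.
PATTERNS = [
    ("who_imports_x",
     re.compile(r"who\s+(?:imports|includes)\s+(?P<target>[\w./]+)", re.I)),
    ("what_x_depends_on",
     re.compile(r"what\s+does\s+(?:the\s+)?(?P<target>[\w./]+)"
                r"(?:\s+module)?\s+depend\s+on", re.I)),
    ("external_deps",
     re.compile(r"external|third[- ]party", re.I)),
]

def classify(query: str) -> tuple[Optional[str], Optional[str]]:
    """Return (query_type, target_module), or (None, None) if no fast-path applies."""
    for query_type, pattern in PATTERNS:
        match = pattern.search(query)
        if match:
            # external_deps has no named group, so its target is None.
            return query_type, match.groupdict().get("target")
    return None, None
```

A (None, None) result would mean the query is not a simple dependency lookup and should go through the regular pipeline.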

Layer Analysis

Layer Hierarchy

LayerValidator enforces a configurable layer hierarchy. The default 4-tier model:

Layer                     Level  Examples
System/Infrastructure     0      OS calls, utilities
Storage/Data              1      Buffer management, file I/O
Business/Logic            2      Query processing, optimization
Presentation/Interface    3      UI, CLI, protocols

Rules: Higher layers can depend on lower layers. Lower layers CANNOT depend on higher layers. Skip-layer dependencies (e.g., presentation -> storage) are flagged as a smell.
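
These rules reduce to a comparison of layer levels. The sketch below uses the default levels from the table. The check_dependency function, the short layer keys, and the returned labels are assumptions made for illustration and are not part of the LayerValidator API.

```python
# Default 4-tier hierarchy; short keys are used here for brevity.
LAYER_LEVEL = {"system": 0, "storage": 1, "business": 2, "presentation": 3}

def check_dependency(src_layer: str, dst_layer: str) -> str:
    """Classify a dependency that points from src_layer to dst_layer."""
    src, dst = LAYER_LEVEL[src_layer], LAYER_LEVEL[dst_layer]
    if dst > src:
        return "violation"    # a lower layer depends on a higher layer
    if src - dst > 1:
        return "skip-layer"   # allowed direction, but bypasses an intermediate layer
    return "ok"               # same layer or the layer directly below
```

For example, storage -> presentation is a violation, presentation -> storage is a skip-layer smell, and business -> storage is allowed.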

Layer Violations

> Find architecture layer violations

╭─────────────── Layer Violations ──────────────────────────────╮
│                                                                │
│  Violations Found: 5                                           │
│                                                                │
│  1. storage -> presentation (critical)                         │
│     bufmgr.c calls format_output()                             │
│     Rule: Storage layer cannot call presentation               │
│                                                                │
│  2. data -> interface (high)                                   │
│     catalog.c references client_auth()                         │
│     Rule: Data layer cannot reference interface                │
│                                                                │
│  Remediation:                                                  │
│    - Introduce abstraction layer between violating modules     │
│    - Use dependency injection to invert the dependency         │
│                                                                │
╰────────────────────────────────────────────────────────────────╯

Coupling and Cohesion

Fan-In and Fan-Out

The CouplingHandler analyzes module coupling using fan-in/fan-out metrics:

  • Fan-out — number of modules a given module depends on (high = potential god module)
  • Fan-in — number of modules that depend on a given module (high = critical infrastructure)
  • Coupling score — combined metric indicating how tightly coupled a module is
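
Fan-in and fan-out can be computed directly from a list of module-level dependency edges. The following is a small sketch: fan_metrics and the edge-list input are hypothetical, and the combined coupling score is left out because its formula is not specified here.

```python
from collections import defaultdict

def fan_metrics(edges: list[tuple[str, str]]) -> dict[str, dict[str, int]]:
    """Compute fan-in and fan-out per module from (source, target) edges."""
    depends_on = defaultdict(set)     # module -> modules it depends on
    depended_by = defaultdict(set)    # module -> modules that depend on it
    for src, dst in edges:
        if src == dst:
            continue                  # ignore self-references
        depends_on[src].add(dst)
        depended_by[dst].add(src)
    modules = set(depends_on) | set(depended_by)
    return {
        m: {"fan_in": len(depended_by.get(m, ())),
            "fan_out": len(depends_on.get(m, ()))}
        for m in modules
    }

# Toy edge list: (module, module it depends on).
edges = [
    ("executor", "storage"), ("executor", "utils"), ("executor", "catalog"),
    ("planner", "utils"), ("catalog", "utils"), ("storage", "utils"),
]
metrics = fan_metrics(edges)
```

In this toy graph, utils has the highest fan-in (critical infrastructure) and executor the highest fan-out (a potential god module).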

SRP Violations

The SRPViolationHandler detects modules that violate the Single Responsibility Principle — modules with multiple unrelated responsibilities, indicated by low cohesion and high method count across distinct functional areas.

Specialized Analysis

Hidden Dependencies

The HiddenDepsHandler detects dependencies not visible through imports:

  • Global variable access across modules
  • Implicit dependencies through shared state
  • Side effects that create coupling without explicit imports

Extraction Candidates

The ExtractionCandidatesHandler identifies modules suitable for extraction into separate libraries:

  • High cohesion (self-contained)
  • Low coupling with the rest of the system
  • Clear interface boundaries

Change Impact and Stability Metrics

Change impact (ChangeImpactHandler): Analyzes the blast radius of modifying a specific module — how many other modules would be affected, using CallGraphAnalyzer transitive dependency analysis.
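
The blast radius of a change is the set of modules that depend on the changed module, directly or transitively. One way to sketch this is a breadth-first search over the reversed dependency graph. The impacted_modules function and the input format are assumptions for illustration and not the ChangeImpactHandler implementation.

```python
from collections import deque

def impacted_modules(deps: dict[str, set[str]], changed: str) -> set[str]:
    """Return every module that transitively depends on `changed`."""
    # Reverse the edges: module -> modules that depend on it.
    dependents: dict[str, set[str]] = {}
    for module, targets in deps.items():
        for target in targets:
            dependents.setdefault(target, set()).add(module)

    impacted: set[str] = set()
    queue = deque([changed])
    while queue:
        current = queue.popleft()
        for dependent in dependents.get(current, ()):
            if dependent != changed and dependent not in impacted:
                impacted.add(dependent)
                queue.append(dependent)
    return impacted

# Toy graph: module -> modules it depends on.
deps = {
    "executor": {"bufmgr"},
    "planner": {"executor"},
    "cli": {"planner"},
    "utils": set(),
}
```

Changing bufmgr in this toy graph affects executor, planner, and cli, while changing cli affects nothing else.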

Stability metrics (StabilityMetricsHandler): Calculates the Stable Dependencies Principle metrics:

  • Instability = fan-out / (fan-in + fan-out)
  • Stable modules (I close to 0) should not depend on unstable modules (I close to 1)
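
The instability formula and the Stable Dependencies Principle check can be illustrated with a few lines of code. This is a sketch under stated assumptions: instability_by_module and sdp_violations are hypothetical helpers, the edge list is assumed to contain each dependency once, and a dependency is flagged whenever its target is less stable (higher I) than its source.

```python
from collections import Counter

def instability_by_module(edges: list[tuple[str, str]]) -> dict[str, float]:
    """Instability I = fan-out / (fan-in + fan-out) for each module."""
    fan_out = Counter(src for src, _ in edges)
    fan_in = Counter(dst for _, dst in edges)
    modules = set(fan_out) | set(fan_in)
    # Every module appears in at least one edge, so the denominator is non-zero.
    return {m: fan_out[m] / (fan_in[m] + fan_out[m]) for m in modules}

def sdp_violations(edges: list[tuple[str, str]]) -> list[tuple[str, str]]:
    """Return edges where a module depends on a less stable module."""
    inst = instability_by_module(edges)
    return [(src, dst) for src, dst in edges if inst[dst] > inst[src]]

# Toy edge list: (module, module it depends on), each edge listed once.
edges = [
    ("a", "core"), ("b", "core"),    # core: fan-in 2
    ("core", "exp"),                 # core: fan-out 1 -> I = 1/3
    ("exp", "x"), ("exp", "y"),      # exp: fan-in 1, fan-out 2 -> I = 2/3
]
inst = instability_by_module(edges)
violations = sdp_violations(edges)
```

Here core (I = 1/3) depends on exp (I = 2/3), so the edge core -> exp is the only one flagged.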

CLI Usage

# Find circular dependencies
python -m src.cli query "Find circular dependencies in the codebase"

# Analyze module dependencies
python -m src.cli query "What does the executor module depend on?"

# Detect layer violations
python -m src.cli query "Find architecture layer violations"

# Coupling analysis
python -m src.cli query "Show coupling analysis with fan-in fan-out"

# Change impact
python -m src.cli query "What is the impact of changing the buffer manager?"

# Full architecture audit (via audit composite)
python -m src.cli audit --db path/to/project.duckdb

Example Questions

  • “Find circular dependencies in the codebase”
  • “What does the [module] module depend on?”
  • “Show architecture layer violations”
  • “Analyze coupling and cohesion of [subsystem]”
  • “Find god modules with excessive dependencies”
  • “What is the impact of changing [module]?”
  • “Show stability metrics for all modules”
  • “Find hidden dependencies via global variables”
  • “Which modules are candidates for extraction?”
  • “Detect SRP violations”
  • “Show shared dependencies between [module1] and [module2]”
  • “Find transitive dependencies from main entry point”

S11 and audit: S11 provides interactive architecture analysis on demand. The audit composite uses S11 for 4 of its 12 quality dimensions: dependency health, modularity assessment, architecture compliance, and coupling analysis. S05 (refactoring) and S12 (tech debt) focus on code improvements, while S11 focuses on structural analysis and violation detection.