Scenario 18: Code Optimization

For a developer or tech lead improving code quality through AI-powered analysis. In composite mode, S18 orchestrates the security, performance, architecture, refactoring, and tech debt scenarios.

Quick Start

# Select Code Optimization Scenario
/select 18

How It Works

Dual-Mode Architecture

S18 operates in two modes selected by the router in route_by_intent():

| Mode | Function | Trigger |
|---|---|---|
| Simple | optimization_workflow() | Default (single-category analysis) |
| Composite | optimization_composite_workflow() | composite_mode=True or composite query detected |

Simple mode analyzes code for optimization opportunities using OptimizationEngine, generates suggestions by category, and presents them for approval with undo support.

Composite mode orchestrates 5 sub-scenarios (S02, S05, S06, S11, S12) in parallel, merges findings, resolves conflicts, and calculates combined priority scores for unified optimization recommendations.

Query -> Intent Classification -> route_by_intent()
  |                                    |
  |  composite_mode?  No  -> optimization_workflow() [Simple]
  |                   Yes -> optimization_composite_workflow() [Composite]
  |
  Simple: OptimizationEngine -> Suggestions -> ApprovalWorkflow
  Composite: 5 sub-scenarios parallel -> Merge -> Resolve -> Priority -> Unified findings

If the composite orchestrator is not enabled in the config, S18 automatically falls back to the simple workflow.
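The routing decision and its fallback can be sketched as follows. This is a minimal illustration, not the actual route_by_intent() code: the helper name pick_optimization_workflow, the two stand-in workflow functions, and the assumption that a missing enabled flag counts as disabled are all hypothetical.

```python
from typing import Callable, Dict


def optimization_workflow(state: Dict) -> str:
    # Stand-in for the simple workflow (hypothetical body).
    return "simple"


def optimization_composite_workflow(state: Dict) -> str:
    # Stand-in for the composite workflow (hypothetical body).
    return "composite"


def pick_optimization_workflow(state: Dict, config: Dict) -> Callable[[Dict], str]:
    """Choose a workflow; fall back to simple if composite is disabled."""
    orchestrator = (
        config.get("composition", {})
        .get("orchestrators", {})
        .get("scenario_18", {})
    )
    # Assumption: a missing "enabled" key is treated as disabled.
    composite_enabled = bool(orchestrator.get("enabled", False))
    wants_composite = bool(state.get("composite_mode"))
    if wants_composite and composite_enabled:
        return optimization_composite_workflow
    return optimization_workflow
```

With this shape, a request for composite mode silently degrades to the simple workflow when the orchestrator is switched off, which matches the fallback described above.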

Composite Mode Detection

is_composite_optimization_query() in code_optimization_composite.py determines when to use composite mode:

  • Explicit keywords: “comprehensive”, “complete”, “full analysis”, “all issues”, “everything”, “комплексный”, “полный анализ”, “все проблемы”
  • Multiple categories: the query mentions at least two of security, performance, architecture, refactoring, debt
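The two detection rules above can be sketched like this. The function name looks_composite and the plain substring matching are assumptions made for illustration; the real is_composite_optimization_query() may tokenize or match differently.

```python
# Keywords and category terms are taken from the list above.
COMPOSITE_KEYWORDS = (
    "comprehensive", "complete", "full analysis", "all issues", "everything",
    "комплексный", "полный анализ", "все проблемы",
)
CATEGORY_TERMS = ("security", "performance", "architecture", "refactoring", "debt")


def looks_composite(query: str) -> bool:
    """Return True if the query matches either composite-mode rule."""
    text = query.lower()
    # Rule 1: an explicit "do everything" keyword appears in the query.
    if any(keyword in text for keyword in COMPOSITE_KEYWORDS):
        return True
    # Rule 2: at least two distinct categories are mentioned.
    mentioned = sum(1 for term in CATEGORY_TERMS if term in text)
    return mentioned >= 2
```

Note that substring matching is deliberately naive here: "complete" would also match "completely", so a production check would likely need word boundaries.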

Simple Workflow

optimization_workflow() in code_optimization.py implements simple mode:

  1. Intent detection: is_optimization_query() matches against 80 EN+RU keywords (optimize, performance, security, readability, оптимизировать, производительность, безопасность, читаемость, etc.)
  2. Parameter extraction: _extract_optimization_params() extracts file paths (via regex) and categories from the query
  3. Analysis: OptimizationEngine analyzes files/directories
  4. Approval: Suggestions submitted to ApprovalWorkflow for review
  5. Response: Formatted markdown with category summary and suggestion list

OptimizationEngine

OptimizationEngine() from src.code_optimization analyzes code:

  • engine.analyze_file(path, categories=...) — analyze a single file
  • engine.analyze_directory(path, categories=...) — analyze all files in a directory
  • Returns list of suggestions with id, title, category, severity, file_path, start_line
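As an illustration of how the returned fields might feed the category summary in the response, here is a small sketch. It assumes each suggestion is available as a plain dict with the fields listed above; the function summarize_by_category and the output layout are hypothetical, not the project's actual formatter.

```python
from collections import Counter
from typing import Dict, List


def summarize_by_category(suggestions: List[Dict]) -> str:
    """Build a category count summary followed by one line per suggestion."""
    counts = Counter(s["category"] for s in suggestions)
    # Summary section: one "category: count" line, sorted by category name.
    lines = [f"{category}: {counts[category]}" for category in sorted(counts)]
    # Detail section: severity, title, and location of each suggestion.
    for s in suggestions:
        lines.append(
            f"[{s['severity']}] {s['title']} ({s['file_path']}:{s['start_line']})"
        )
    return "\n".join(lines)
```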

Optimization Categories

OptimizationCategory enum defines 3 analysis categories:

| Category | Description |
|---|---|
| performance | Loop-invariant computations, inefficient patterns, missing caching |
| security | Injection risks, missing validation, hardcoded secrets |
| readability | Long functions, deep nesting, missing documentation |

Categories are extracted from the query: “performance” / “производительность”, “security” / “безопасность”, “readability” / “читаемость”. If none specified, all categories are analyzed.
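The extraction rule, including the "all categories" default, might look like the sketch below. The name extract_categories and the substring matching are assumptions; the source only says that _extract_optimization_params() extracts categories from the query.

```python
from typing import List

# EN and RU trigger words per category, as listed above.
CATEGORY_KEYWORDS = {
    "performance": ("performance", "производительность"),
    "security": ("security", "безопасность"),
    "readability": ("readability", "читаемость"),
}


def extract_categories(query: str) -> List[str]:
    """Return the categories named in the query, or all of them if none is."""
    text = query.lower()
    found = [
        category
        for category, keywords in CATEGORY_KEYWORDS.items()
        if any(keyword in text for keyword in keywords)
    ]
    # No category mentioned: analyze everything.
    return found or list(CATEGORY_KEYWORDS)
```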

Approval Workflow and Undo

  • ApprovalWorkflow(undo_stack=undo_stack, batch_mode=True) — manages suggestion approval queue
  • UndoStack() — tracks applied changes for undo support
  • Suggestions stored in state["optimization_suggestions"] as serialized dicts

Composite Orchestrator

optimization_composite_workflow() in code_optimization_composite.py orchestrates comprehensive analysis through a 4-step pipeline.

5 Sub-Scenarios

The orchestrator invokes 5 sub-scenarios defined in OPTIMIZATION_SUB_SCENARIOS:

| Sub-Scenario | Role |
|---|---|
| S02 (Security Audit) | Vulnerability scanning, taint analysis |
| S05 (Refactoring) | Dead code, duplicates, code smells |
| S06 (Performance) | Hotspot detection, bottleneck analysis |
| S11 (Architecture) | Circular dependencies, coupling issues |
| S12 (Tech Debt) | Debt items with ROI estimation |

Sub-scenarios are configurable in config.yaml under composition.orchestrators.scenario_18.sub_scenarios.

4-Step Pipeline

Step 1: ScenarioInvoker.invoke_parallel()
        ThreadPoolExecutor(max_workers=4), 60s timeout
        -> Dict[scenario_id -> SubScenarioResult]
            |
Step 2: ResultMerger.merge()
        Strategy: WEIGHTED, dedup threshold: 0.8
        -> MergeResult (unified_findings, duplicates_removed)
            |
Step 3: ConflictResolver.resolve_conflicts()
        Mode: PRIORITY, security boost 1.5x, compliance 1.3x
        -> (resolved_findings, List[ConflictResolution])
            |
Step 4: PriorityCalculator.calculate_batch()
        Algorithm: WEIGHTED_SUM
        -> priority_scores Dict[finding_id -> float]

Step 1 — Invoke: ScenarioInvoker(config=orchestrator_config) runs all 5 sub-scenarios using ThreadPoolExecutor with max_workers=4 and a 60-second timeout. With four workers, at most four sub-scenarios run concurrently and the fifth starts once a worker is free. If parallel_execution is false in the config, the sub-scenarios run sequentially instead. Each result is a SubScenarioResult with findings, execution time, and error status.
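A minimal sketch of this invocation pattern, using only the standard library, is shown below. It is not the ScenarioInvoker implementation: the invoke_parallel function here, the runner callables, the local SubScenarioResult stand-in, and the choice to apply the timeout to the whole batch are all assumptions.

```python
import time
from concurrent.futures import ThreadPoolExecutor, wait
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class SubScenarioResult:
    # Local stand-in mirroring the fields named in this document.
    scenario_id: str
    success: bool
    findings: List[dict] = field(default_factory=list)
    execution_time_ms: float = 0.0
    error: Optional[str] = None


def _run_one(scenario_id: str, runner: Callable[[], List[dict]]) -> SubScenarioResult:
    """Run one sub-scenario, capturing timing and any exception."""
    started = time.perf_counter()
    try:
        findings, ok, err = runner(), True, None
    except Exception as exc:
        findings, ok, err = [], False, str(exc)
    elapsed_ms = (time.perf_counter() - started) * 1000
    return SubScenarioResult(scenario_id, ok, findings, elapsed_ms, err)


def invoke_parallel(
    runners: Dict[str, Callable[[], List[dict]]],
    max_workers: int = 4,
    timeout_seconds: float = 60.0,
) -> Dict[str, SubScenarioResult]:
    pool = ThreadPoolExecutor(max_workers=max_workers)
    futures = {pool.submit(_run_one, sid, fn): sid for sid, fn in runners.items()}
    # Assumption: the timeout covers the whole batch, not each scenario.
    done, not_done = wait(futures, timeout=timeout_seconds)
    results = {futures[f]: f.result() for f in done}
    for f in not_done:
        f.cancel()  # only prevents queued work; running threads continue
        results[futures[f]] = SubScenarioResult(futures[f], False, error="timeout")
    pool.shutdown(wait=False)
    return results
```

A failing sub-scenario is recorded as an unsuccessful result rather than aborting the batch, so the merge step can still work with whatever the other scenarios returned.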

Step 2 — Merge: ResultMerger(config=merging_config) deduplicates findings from all sub-scenarios. Uses a deduplication threshold of 0.8 to identify similar findings. Tracks finding_sources mapping each finding to its source scenario(s).
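Threshold-based deduplication can be sketched as below. The similarity measure is an assumption: this example compares titles within the same file using difflib.SequenceMatcher, whereas the source does not say how ResultMerger computes similarity. The function merge_findings and the dict-based finding shape are likewise hypothetical.

```python
from difflib import SequenceMatcher
from typing import Dict, List, Tuple


def _similar(a: dict, b: dict, threshold: float) -> bool:
    """Assumed similarity rule: same file and title ratio >= threshold."""
    if a["file_path"] != b["file_path"]:
        return False
    ratio = SequenceMatcher(None, a["title"].lower(), b["title"].lower()).ratio()
    return ratio >= threshold


def merge_findings(
    results: Dict[str, List[dict]], threshold: float = 0.8
) -> Tuple[List[dict], Dict[str, List[str]], int]:
    unified: List[dict] = []
    finding_sources: Dict[str, List[str]] = {}  # finding id -> source scenarios
    duplicates_removed = 0
    for scenario_id, findings in results.items():
        for finding in findings:
            match = next((u for u in unified if _similar(u, finding, threshold)), None)
            if match is None:
                unified.append(finding)
                finding_sources[finding["id"]] = [scenario_id]
            else:
                # Keep the first copy and record the extra source scenario.
                duplicates_removed += 1
                if scenario_id not in finding_sources[match["id"]]:
                    finding_sources[match["id"]].append(scenario_id)
    return unified, finding_sources, duplicates_removed
```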

Step 3 — Resolve: ConflictResolver(config=conflict_config) resolves contradictory recommendations from different scenarios. Uses priority-based resolution with configurable boosts: security findings get 1.5x priority, compliance findings get 1.3x.

Step 4 — Priority: PriorityCalculator(config=priority_config) calculates a combined priority score (0.0–1.0) for each finding, then sorts by priority and generates a report with urgency breakdown.

Merge Strategies

MergeStrategy enum defines 4 strategies:

| Strategy | Description |
|---|---|
| UNION | Include all findings |
| INTERSECTION | Only findings from 2+ scenarios |
| WEIGHTED | Weight by scenario priority (default for S18) |
| CONSENSUS | Only findings with agreement from 2+ scenarios |

Conflict Resolution

ConflictResolutionMode enum defines 4 modes:

| Mode | Description |
|---|---|
| PRIORITY | Higher-priority scenario wins (default) |
| MANUAL | Flag for manual review |
| MERGE | Combine conflicting recommendations |
| FIRST_WINS | First scenario's finding wins |

Conflict resolution tracks each decision in conflict_log with: conflict_id, resolution_method, winning_finding_id, suppressed_finding_ids, reason.
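To make the PRIORITY mode and the log entry concrete, here is a hedged sketch. The ConflictResolution dataclass mirrors the field names listed above, but the base "priority" value on each finding, the way the 1.5x and 1.3x boosts are applied, and the function resolve_by_priority are assumptions for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Boost factors from the S18 configuration; other categories get 1.0.
CATEGORY_BOOST = {"security": 1.5, "compliance": 1.3}


@dataclass
class ConflictResolution:
    conflict_id: str
    resolution_method: str
    winning_finding_id: str
    suppressed_finding_ids: List[str]
    reason: str


def resolve_by_priority(
    conflict_id: str, candidates: List[dict]
) -> Tuple[dict, ConflictResolution]:
    """Pick the candidate with the highest boosted priority and log why."""

    def boosted(finding: dict) -> float:
        # Assumption: each finding carries a numeric base "priority".
        return finding["priority"] * CATEGORY_BOOST.get(finding["category"], 1.0)

    winner = max(candidates, key=boosted)
    suppressed = [f["id"] for f in candidates if f is not winner]
    record = ConflictResolution(
        conflict_id=conflict_id,
        resolution_method="priority",
        winning_finding_id=winner["id"],
        suppressed_finding_ids=suppressed,
        reason=f"highest boosted priority ({boosted(winner):.2f})",
    )
    return winner, record
```

In this sketch a security finding with base priority 0.5 (boosted to 0.75) beats a performance finding at 0.7, which is the kind of outcome the boosts are meant to produce.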

Priority Calculation

PriorityAlgorithm.WEIGHTED_SUM (default) calculates priority using 5 weighted factors:

| Factor | Weight | Description |
|---|---|---|
| severity | 0.30 | Finding severity (critical=1.0, high=0.75, medium=0.5, low=0.25, info=0.1) |
| impact | 0.25 | Impact score from the finding |
| roi | 0.25 | Return on investment score |
| confidence | 0.15 | Detection confidence |
| consensus | 0.05 | Agreement across sub-scenarios |

Other algorithms available: MULTIPLICATIVE, MAX.
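The weighted sum itself is simple arithmetic, sketched below with the weights and severity mapping from the table. The function name weighted_sum_priority and its signature are hypothetical, and the sketch assumes every factor is already normalized to the 0 to 1 range. Because the weights add up to 1.0, the result then also stays within 0.0 to 1.0.

```python
SEVERITY_SCORE = {"critical": 1.0, "high": 0.75, "medium": 0.5, "low": 0.25, "info": 0.1}
WEIGHTS = {"severity": 0.30, "impact": 0.25, "roi": 0.25, "confidence": 0.15, "consensus": 0.05}


def weighted_sum_priority(
    severity: str, impact: float, roi: float, confidence: float, consensus: float
) -> float:
    """Combine the five factors into one priority score."""
    factors = {
        "severity": SEVERITY_SCORE[severity],
        "impact": impact,
        "roi": roi,
        "confidence": confidence,
        "consensus": consensus,
    }
    return sum(WEIGHTS[name] * factors[name] for name in WEIGHTS)
```

For a high-severity finding with impact 0.8, ROI 0.6, confidence 0.9, and consensus 0.4, the score is 0.30·0.75 + 0.25·0.8 + 0.25·0.6 + 0.15·0.9 + 0.05·0.4 = 0.225 + 0.2 + 0.15 + 0.135 + 0.02 = 0.73.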

Finding Model

Finding dataclass in src/workflow/composition/state.py is the unified representation for findings from all sub-scenarios:

| Field | Type | Description |
|---|---|---|
| id | str | Unique finding identifier |
| source_scenario | str | Originating scenario (e.g., scenario_02) |
| category | FindingCategory | SECURITY, PERFORMANCE, ARCHITECTURE, TECH_DEBT, REFACTORING, COMPLIANCE, … |
| severity | FindingSeverity | CRITICAL, HIGH, MEDIUM, LOW, INFO |
| title | str | Finding title |
| description | str | Detailed description |
| file_path | str | File path |
| line_number | int | Line number |
| method_name | str | Method name (optional) |
| confidence | float | Detection confidence (0–1) |
| impact_score | float | Impact assessment |
| roi_score | float | ROI estimation |
| suggestion | str | Suggested fix (optional) |

Related types:

  • SubScenarioResult: tracks scenario_id, success, findings, execution_time_ms, error
  • ConflictResolution: tracks conflict_id, resolution_method, winning_finding_id, suppressed_finding_ids, reason
  • CompositeWorkflowState: extends MultiScenarioState with 20+ composition-specific fields
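For orientation, the Finding model could be declared roughly as follows. Field names and types come from the table above; the enum string values, the default values, and the use of Optional for the two optional fields are assumptions. FindingCategory lists only the members named in this document, and the real enum has more.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class FindingCategory(Enum):
    # Only the members named in this document; the real enum defines more.
    SECURITY = "security"
    PERFORMANCE = "performance"
    ARCHITECTURE = "architecture"
    TECH_DEBT = "tech_debt"
    REFACTORING = "refactoring"
    COMPLIANCE = "compliance"


class FindingSeverity(Enum):
    CRITICAL = "critical"
    HIGH = "high"
    MEDIUM = "medium"
    LOW = "low"
    INFO = "info"


@dataclass
class Finding:
    id: str
    source_scenario: str
    category: FindingCategory
    severity: FindingSeverity
    title: str
    description: str
    file_path: str
    line_number: int
    method_name: Optional[str] = None  # optional per the table
    confidence: float = 0.0            # default is an assumption
    impact_score: float = 0.0          # default is an assumption
    roi_score: float = 0.0             # default is an assumption
    suggestion: Optional[str] = None   # optional per the table
```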

Configuration

S18 composite mode is configured in config.yaml under the composition key:

composition:
  orchestrators:
    scenario_18:
      sub_scenarios:
        - scenario_02  # Security
        - scenario_05  # Refactoring
        - scenario_06  # Performance
        - scenario_11  # Architecture
        - scenario_12  # Tech Debt
      parallel_execution: true
      timeout_seconds: 60
      max_findings_per_scenario: 50
      enabled: true

  merging:
    strategy: weighted
    deduplication_threshold: 0.8
    max_findings: 100

  priority:
    algorithm: weighted_sum
    weights:
      severity: 0.30
      impact: 0.25
      roi: 0.25
      confidence: 0.15
      consensus: 0.05

  conflicts:
    resolution_mode: priority
    security_priority_boost: 1.5
    compliance_priority_boost: 1.3

CLI Usage

# Simple optimization analysis
python -m src.cli query "Optimize src/core/ for performance"

# Security-focused optimization
python -m src.cli query "Find security improvements in src/api/"

# Readability analysis
python -m src.cli query "Analyze src/utils/ for readability issues"

# Comprehensive composite analysis (triggers composite mode)
python -m src.cli query "Comprehensive code optimization for src/"

# Full analysis mentioning multiple categories (triggers composite mode)
python -m src.cli query "Analyze security, performance, and architecture of src/"

# Composite via API
# POST /api/v1/composition/query
# {"query": "Optimize src/", "orchestrator": "scenario_18"}
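The API call in the last comment could be prepared from Python as sketched below, using only the standard library. The endpoint path and JSON body come from the comment above; the base URL http://localhost:8000, the Content-Type header, and the helper name build_composite_request are assumptions. The request is only constructed here, since the response format is not described in this document.

```python
import json
import urllib.request


def build_composite_request(
    query: str, base_url: str = "http://localhost:8000"  # hypothetical host
) -> urllib.request.Request:
    """Build (but do not send) a POST request for a composite S18 query."""
    payload = {"query": query, "orchestrator": "scenario_18"}
    return urllib.request.Request(
        url=f"{base_url}/api/v1/composition/query",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_composite_request("Optimize src/")
# To send it: urllib.request.urlopen(request) against a running server.
```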

Example Questions

  • “Optimize [path] for performance”
  • “Find security improvements in [path]”
  • “Analyze [path] for readability issues”
  • “Comprehensive code optimization for [path]”
  • “Full analysis of security, performance, and architecture”
  • “Find all code quality issues in [path]”
  • “What are the most impactful optimizations for [module]?”
  • “Run complete analysis of [path] — security, performance, debt”
  • “Show optimization suggestions for [path]”
  • “Analyze [path] for all issues”

S18 vs S19: S18 (Code Optimization) and S19 (Standards Check) are both composite orchestrators sharing the same composition infrastructure (ScenarioInvoker, ResultMerger, ConflictResolver, PriorityCalculator). S18 orchestrates S02+S05+S06+S11+S12 in parallel (60s) for comprehensive code optimization. S19 orchestrates S08+S17+S18 sequentially (45s) for standards compliance checking. S18 focuses on finding and prioritizing code improvements; S19 focuses on verifying adherence to coding standards.