Architecture and usage guide for scenario composition and orchestration.
Table of Contents¶
- Overview
- Architecture
- S18 as Orchestrator
- S19 as Orchestrator
- Code Quality Audit
- User Story Validation
- Interface Documentation Sync
- Composition Components
- Configuration
- API Endpoints
- CLI Commands
- Best Practices
- Project Lifecycle Operations
Overview¶
Composite workflows allow scenarios to act as orchestrators, invoking multiple sub-scenarios for comprehensive analysis. This enables complex, multi-faceted code analysis without requiring users to run scenarios individually.
Key Benefits¶
- Comprehensive Analysis: Combine security, performance, and quality checks
- Deduplication: Automatically merge and deduplicate findings
- Conflict Resolution: Resolve conflicting recommendations
- Priority Scoring: Rank findings by combined priority metrics
- Single Entry Point: One command for complete analysis
Architecture¶
┌─────────────────────────────────────────────────────────────────────────────┐
│ COMPOSITE WORKFLOW ARCHITECTURE │
│ │
│ User Query: "Optimize src/" │
│ │ │
│ ▼ │
│ ┌────────────────────────────────────────────────────────────────────────┐ │
│ │ ORCHESTRATOR (S18 or S19) │ │
│ │ │ │
│ │ ┌─────────────────────────────────────────────────────────────────┐ │ │
│ │ │ SCENARIO INVOKER │ │ │
│ │ │ │ │ │
│ │ │ Parallel Execution: │ │ │
│ │ │ ┌─────┐ ┌─────┐ ┌─────┐ ┌─────┐ ┌─────┐ │ │ │
│ │ │ │ S02 │ │ S05 │ │ S06 │ │ S11 │ │ S12 │ │ │ │
│ │ │ │Sec. │ │Refac│ │Perf │ │Arch │ │Debt │ │ │ │
│ │ │ └──┬──┘ └──┬──┘ └──┬──┘ └──┬──┘ └──┬──┘ │ │ │
│ │ │ │ │ │ │ │ │ │ │
│ │ │ └────────┴───────┴────────┴────────┘ │ │ │
│ │ │ │ │ │ │
│ │ │ Sub-Scenario Results │ │ │
│ │ └──────────────────────┼───────────────────────────────────────────┘ │ │
│ │ │ │ │
│ │ ┌──────────────────────▼───────────────────────────────────────────┐ │ │
│ │ │ RESULT MERGER │ │ │
│ │ │ │ │ │
│ │ │ Strategy: union | intersection | weighted | consensus │ │ │
│ │ │ - Deduplication (similarity threshold: 0.8) │ │ │
│ │ │ - Source tracking │ │ │
│ │ │ - Metadata preservation │ │ │
│ │ └──────────────────────┼───────────────────────────────────────────┘ │ │
│ │ │ │ │
│ │ ┌──────────────────────▼───────────────────────────────────────────┐ │ │
│ │ │ CONFLICT RESOLVER │ │ │
│ │ │ │ │ │
│ │ │ Mode: priority_based | security_first | interactive │ │ │
│ │ │ - Detect conflicting recommendations │ │ │
│ │ │ - Apply resolution rules │ │ │
│ │ │ - Log resolution decisions │ │ │
│ │ └──────────────────────┼───────────────────────────────────────────┘ │ │
│ │ │ │ │
│ │ ┌──────────────────────▼───────────────────────────────────────────┐ │ │
│ │ │ PRIORITY CALCULATOR │ │ │
│ │ │ │ │ │
│ │ │ Algorithm: weighted_sum | risk_based | custom │ │ │
│ │ │ Weights: severity(0.3) + impact(0.25) + roi(0.2) │ │ │
│ │ │ + confidence(0.15) + consensus(0.1) │ │ │
│ │ └──────────────────────┼───────────────────────────────────────────┘ │ │
│ │ │ │ │
│ └─────────────────────────┼────────────────────────────────────────────────┘ │
│ │ │
│ ▼ │
│ UNIFIED FINDINGS │
│ (sorted by priority) │
│ │
└─────────────────────────────────────────────────────────────────────────────┘
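End to end, the stages in the diagram can be sketched as follows. This is an illustrative simplification with stub functions standing in for `ScenarioInvoker`, `ResultMerger`, `ConflictResolver`, and `PriorityCalculator` — it is not the real `src.workflow.composition` API, and all field names here are hypothetical:

```python
# Illustrative end-to-end sketch of the composite pipeline; every function
# below is a stand-in, not the real composition API.

def invoke(scenarios):
    # Pretend each sub-scenario returned one finding at the same location.
    return {s: [{"loc": "src/core/db.py:42", "msg": f"issue from {s}", "score": i}]
            for i, s in enumerate(scenarios)}

def merge(results):
    # Union strategy: flatten everything while keeping source tracking.
    return [dict(f, source=s) for s, fs in results.items() for f in fs]

def resolve(findings):
    # priority_based mode: one finding per location, highest score wins.
    best = {}
    for f in findings:
        if f["loc"] not in best or f["score"] > best[f["loc"]]["score"]:
            best[f["loc"]] = f
    return list(best.values())

def prioritize(findings):
    # Final ranking: sort unified findings by descending priority score.
    return sorted(findings, key=lambda f: f["score"], reverse=True)

report = prioritize(resolve(merge(invoke(["scenario_02", "scenario_06"]))))
```

Here the two findings collide at the same location, so the resolver keeps only the higher-scoring one before the final sort.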
S18 as Orchestrator¶
Scenario 18 (Code Optimization) orchestrates multiple analysis scenarios for comprehensive code improvement:
Sub-Scenarios¶
| Scenario | Name | Weight | Contribution |
|---|---|---|---|
| S02 | Security Audit | 1.5 | Security vulnerabilities |
| S05 | Refactoring | 1.0 | Code smells, dead code |
| S06 | Performance | 1.2 | Performance bottlenecks |
| S11 | Architecture | 1.1 | Architecture violations |
| S12 | Tech Debt | 0.9 | Technical debt items |
Execution Flow¶
S18 Optimization Query
│
├─→ S02: Security Analysis
├─→ S05: Refactoring Analysis
├─→ S06: Performance Analysis
├─→ S11: Architecture Analysis
└─→ S12: Tech Debt Analysis
│
▼
Merge Findings (weighted)
│
▼
Resolve Conflicts
│
▼
Calculate Priorities
│
▼
Unified Optimization Report
Example Usage¶
# CLI
python -m src.cli composition run "Optimize src/core/" -o s18
# MCP
/select 18
> Analyze src/core/ for comprehensive optimization
Output¶
╭─────────────── Composite Optimization Results ────────────────╮
│ │
│ Sub-scenarios invoked: S02, S05, S06, S11, S12 │
│ Findings before merge: 87 │
│ Findings after merge: 52 │
│ Duplicates removed: 35 │
│ Conflicts resolved: 3 │
│ Execution time: 4.2s │
│ │
│ Top Findings: │
│ │
│ 1. [Critical] SQL injection vulnerability │
│ Source: S02 (Security) │
│ Priority Score: 0.95 │
│ │
│ 2. [High] Performance bottleneck in loop │
│ Source: S06 (Performance) │
│ Priority Score: 0.87 │
│ │
│ 3. [High] Architecture layer violation │
│ Source: S11 (Architecture) │
│ Priority Score: 0.82 │
│ │
╰────────────────────────────────────────────────────────────────╯
S19 as Orchestrator¶
Scenario 19 (Standards Check) orchestrates compliance checking with document-driven rules:
Sub-Scenarios¶
| Scenario | Name | Role | Contribution |
|---|---|---|---|
| S08 | Compliance | Required | Standard compliance rules, pattern matching |
| S17 | File Editing | Optional | AST-based fix application |
| S18 | Code Optimization | Optional | Optimized fix recommendations |
S08 is always invoked as the core compliance engine. S17 and S18 are optionally integrated — S17 for applying fixes, S18 for optimization passes on the proposed changes.
Document Enrichment¶
Standards Document
│
▼
Rule Extraction
│
├─→ S08: Compliance Analysis (required)
│ │
│ ▼
│ Violations Found
│ │
│ ▼
├─→ S17: Fix Application (optional)
│ │
│ ▼
└─→ S18: Optimize Fixes (optional)
│
▼
Violations with References
Example Usage¶
# CLI
python -m src.cli composition run "Check against OWASP standards" -o s19
# With custom standards document
python -m src.cli composition run "Check against company_standards.yaml" -o s19
Code Quality Audit¶
The audit is the most comprehensive composite workflow. It runs 9 sub-scenarios across 12 code quality dimensions with multi-layered false positive filtering.
Sub-Scenarios¶
| Scenario | Name | Contribution |
|---|---|---|
| S02 | Security Audit | Vulnerabilities, taint analysis |
| S03 | Documentation | Comment and documentation quality |
| S05 | Refactoring | Code smells, dead code |
| S06 | Performance | Performance bottlenecks |
| S07 | Test Coverage | Testability and coverage gaps |
| S08 | Compliance | Coding standards, regulatory compliance |
| S11 | Architecture | Modularity, circular dependencies |
| S12 | Tech Debt | Accumulated debt, refactoring ROI |
| S16 | Entry Points | Attack surface, external inputs |
12 Quality Dimensions¶
| # | Dimension | Sources |
|---|---|---|
| 1 | Readability & coding standards | S05 |
| 2 | Module structure & component reuse | S11 |
| 3 | Redundant & hard-to-maintain code | S05, S12 |
| 4 | Scalability | S06, S11 |
| 5 | Performance with large data volumes | S06 |
| 6 | Architecture for scalability | S11 |
| 7 | Security vulnerabilities | S02, S08 |
| 8 | SQL injection, XSS & data leaks | S02, S16 |
| 9 | Maintainability | S12, S05 |
| 10 | Test coverage & testability | S07 |
| 11 | Complexity & dependencies | S11, S12 |
| 12 | Documentation & comments | S03 |
Execution Flow¶
Audit Query
│
▼
AuditRunner.run()
│
▼
AuditRunner._collect_metrics()
│
├─→ S02: Security
├─→ S03: Documentation
├─→ S05: Refactoring
├─→ S06: Performance
├─→ S07: Tests
├─→ S08: Compliance
├─→ S11: Architecture
├─→ S12: Tech Debt
└─→ S16: Entry Points
│
▼
False Positive Filtering (V25-V30)
│
▼
Dead Code Counting
│
▼
Merge + Deduplication
│
▼
Report across 12 Dimensions
Dead Code False Positive Filtering¶
A multi-layered filtering system (V25–V30) progressively reduces the false-positive rate:
| Version | Filter | Description |
|---|---|---|
| V25 | `is_test` | Exclude test methods |
| V26 | Class-aware reachability | Methods called via class instance |
| V26b | Inheritance-aware | Base classes with alive subclass methods |
| V27 | `is_nested` | Exclude nested functions |
| V28 | All-dead module exclusion | Modules where all methods are dead |
| V29 | Low-vitality exclusion | Modules with ≥5 methods and <40% alive |
| V30 | Alive-file companion | FILE-level functions in files with alive code |
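The layered idea can be sketched as below. Field names (`is_test`, `is_nested`, `module_method_count`) are hypothetical stand-ins — the real filters operate on CPG data, not plain dicts:

```python
# Hypothetical sketch of layered false-positive filtering for dead-code
# candidates; field names are illustrative, not the real CPG schema.

def filter_dead_code(candidates):
    """Apply V25 (is_test), V27 (is_nested), and a V29-style low-vitality
    module check, in order, to a list of candidate dicts."""
    # V25 / V27: drop test methods and nested functions outright.
    survivors = [c for c in candidates
                 if not c.get("is_test") and not c.get("is_nested")]

    # V29-style: drop candidates from modules with >=5 methods and <40% alive,
    # treating such clusters as likely false positives.
    by_module = {}
    for c in survivors:
        by_module.setdefault(c["module"], []).append(c)

    result = []
    for module, items in by_module.items():
        total = items[0]["module_method_count"]
        alive = total - len(items)          # methods not flagged as dead
        if total >= 5 and alive / total < 0.4:
            continue                        # low-vitality module: exclude
        result.extend(items)
    return result
```

Each layer only removes candidates, so the filters compose safely in any order that matches the V25→V30 progression.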
Example Usage¶
# CLI
python -m src.cli audit --db data/projects/codegraph.duckdb --language en
# With autofix suggestions
python -m src.cli audit --db data/projects/codegraph.duckdb --autofix
# JSON format for CI/CD
python -m src.cli audit --db data/projects/codegraph.duckdb --format json --output report.json
Audit Configuration¶
composition:
orchestrators:
audit:
sub_scenarios:
- scenario_02
- scenario_03
- scenario_05
- scenario_06
- scenario_07
- scenario_08
- scenario_11
- scenario_12
- scenario_16
parallel_execution: true
timeout_seconds: 600
max_findings_per_scenario: 50
deduplicate_paths: true
User Story Validation¶
Story validation checks that each completed user story from USER_STORIES.md is accessible through at least one interface.
Checked Interfaces¶
| Interface | CPG Path | Description |
|---|---|---|
| CLI | `src/cli/` | CLI commands |
| REST API | `src/api/routers/` | REST endpoints |
| MCP | `src/mcp/` | MCP server tools |
| ACP | `src/acp/` | Agent communication protocol |
| gRPC | `src/services/gocpg/grpc_transport.py` | Go CPG gRPC transport |
Evidence Types¶
| Symbol | Type | Confidence | Description |
|---|---|---|---|
| `+` | dedicated | ≥0.8 | Dedicated endpoint |
| `~` | passthrough | ≤0.5 | Generic gateway (e.g. `/chat`) |
| `*` | scenario_map | 0.7 | From scenario-to-interface mapping |
| `-` | not found | — | Not found |
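As a sketch (hypothetical structures, not the actual validator), evidence for a story can be reduced to the best-confidence interface hit:

```python
# Hypothetical sketch: pick the strongest evidence per user story.
# The confidence values mirror the evidence-type table above.
EVIDENCE_CONFIDENCE = {"dedicated": 0.8, "scenario_map": 0.7, "passthrough": 0.5}

def best_evidence(hits):
    """hits: list of (interface, evidence_type) pairs for one story.
    Returns the highest-confidence hit, or None when the story was not
    found on any interface."""
    scored = [(EVIDENCE_CONFIDENCE.get(ev, 0.0), iface, ev) for iface, ev in hits]
    if not scored:
        return None
    conf, iface, ev = max(scored)
    return {"interface": iface, "evidence": ev, "confidence": conf}
```

A story backed by a dedicated endpoint will always outrank one reachable only through a generic passthrough gateway.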
Example Usage¶
# Basic run
python -m src.cli dogfood validate-stories --db data/projects/codegraph.duckdb
# With Go CPG
python -m src.cli dogfood validate-stories \
--db data/projects/codegraph.duckdb \
--go-db data/projects/gocpg.duckdb \
--output report.md
Interface Documentation Sync¶
Interface documentation sync is a composite workflow that scans 6 interfaces for documentation drift. It runs a 5-phase pipeline to discover code entities, parse existing docs, detect mismatches, and generate a coverage report.
Checked Interfaces¶
| Interface | Code Path | Doc Path |
|---|---|---|
| REST API | `src/api/routers/` | `docs/api/{lang}/REST_API.md` |
| CLI | `src/cli/`, `src/api/cli.py` | `docs/guides/{lang}/CLI_GUIDE.md` |
| MCP | `src/mcp/` | `docs/api/{lang}/MCP_TOOLS.md` |
| ACP | `src/acp/` | `docs/api/{lang}/ACP_INTEGRATION.md` |
| gRPC | `src/services/gocpg/grpc_transport.py` | `docs/api/{lang}/GRPC_API.md` |
| WebSocket | `src/api/websocket/routes.py`, `src/api/routers/dashboard_ws.py` | `docs/api/{lang}/WEBSOCKET_API.md` |
5-Phase Pipeline¶
1. Discovery: scan code entities per interface
2. Doc Parsing: parse markdown for documented entities
3. Generation: generate missing doc stubs (optional)
4. Drift Detection: match code↔docs via multi-strategy matching
5. Report: Markdown/JSON output with coverage % per interface
Drift Categories¶
| Category | Description |
|---|---|
| `UNDOCUMENTED` | Code entity exists but has no documentation |
| `STALE` | Documented entity no longer exists in code |
| `OUTDATED` | Both exist but parameters/signatures differ |
| `COVERED` | Properly documented |
Matching Strategies¶
The drift detector uses a multi-strategy matching pipeline (Phase 4):
- Exact match — direct name comparison (confidence: 1.0)
- Route-aware match — strip route prefixes like `/api/v1` (confidence: 0.95)
- Case-normalized match — snake_case ↔ kebab-case equivalence (confidence: 0.95)
- Fuzzy match — Jaccard similarity on name tokens (confidence: similarity score)
Exclude patterns (regex) can filter entities from drift detection. Configuration:
composition:
orchestrators:
interface_docs_sync:
drift_detection:
route_prefix_strip: ["/api/v1", "/api/v2", "/api"]
exclude_patterns: ["^internal_", "^_"]
case_normalize: true
fuzzy_threshold: 0.6
min_coverage_warning: 0.8
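With those settings, the matching cascade can be sketched as below. This is illustrative code, not the actual drift detector — the function and variable names are assumptions, and only the prefixes, patterns, and threshold come from the configuration above:

```python
import re

# Values taken from the configuration above.
ROUTE_PREFIXES = ["/api/v1", "/api/v2", "/api"]
EXCLUDE = [re.compile(p) for p in ("^internal_", "^_")]

def normalize(name: str) -> str:
    # Route-aware + case-normalized: strip a route prefix, then unify
    # snake_case/kebab-case and casing.
    for prefix in ROUTE_PREFIXES:
        if name.startswith(prefix):
            name = name[len(prefix):]
            break
    return name.lower().replace("-", "_").strip("/")

def jaccard(a: str, b: str) -> float:
    # Jaccard similarity on name tokens, as used by the fuzzy strategy.
    ta, tb = set(a.split("_")), set(b.split("_"))
    return len(ta & tb) / len(ta | tb) if ta | tb else 0.0

def match(code_name: str, doc_names: list, fuzzy_threshold: float = 0.6):
    """Return (doc_name, confidence) or None; exact > normalized > fuzzy."""
    if any(p.search(code_name) for p in EXCLUDE):
        return None                      # excluded from drift detection
    if code_name in doc_names:
        return code_name, 1.0            # exact match
    norm = normalize(code_name)
    for d in doc_names:
        if normalize(d) == norm:
            return d, 0.95               # route-aware / case-normalized match
    best = max(doc_names, key=lambda d: jaccard(norm, normalize(d)), default=None)
    if best is not None:
        score = jaccard(norm, normalize(best))
        if score >= fuzzy_threshold:
            return best, score           # fuzzy match, confidence = similarity
    return None
```

Anything that falls through all four strategies is reported as `UNDOCUMENTED` (or `STALE` from the docs side).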
Example Usage¶
# Full report
python -m src.cli docs-sync --db data/projects/codegraph.duckdb
# CI mode (exit 1 if coverage below threshold)
python -m src.cli docs-sync --check --format json
# Filter interfaces
python -m src.cli docs-sync --interfaces rest_api,cli --language en
# REST API
curl -X POST /api/v1/documentation/sync -d '{"interfaces": ["rest_api", "cli"]}'
# MCP tool
codegraph_docs_sync(interfaces="rest_api,cli", output_format="json")
CI Integration¶
The GitHub Actions workflow .github/workflows/docs-sync.yml runs on PRs that modify interface code or docs. It posts a sticky PR comment with coverage metrics and uploads the report as an artifact.
Composition Components¶
ScenarioInvoker¶
Invokes sub-scenarios with parallel or sequential execution:
from src.workflow.composition import ScenarioInvoker
invoker = ScenarioInvoker()
# Parallel execution (default)
results = invoker.invoke_parallel(
scenarios=["scenario_02", "scenario_06", "scenario_11"],
state=state,
timeout=30.0,
)
# Sequential execution
results = invoker.invoke_sequential(
scenarios=["scenario_02", "scenario_06"],
state=state,
)
ResultMerger¶
Merges findings with configurable strategies:
from src.workflow.composition import ResultMerger, MergeStrategy
merger = ResultMerger()
# Merge with union strategy (include all)
result = merger.merge(
scenario_results={"s02": findings_s02, "s06": findings_s06},
strategy=MergeStrategy.UNION,
)
# Merge with weighted strategy
result = merger.merge(
scenario_results=all_findings,
strategy=MergeStrategy.WEIGHTED,
)
Merge Strategies¶
| Strategy | Description | Use Case |
|---|---|---|
| `union` | Include all findings | Comprehensive analysis |
| `intersection` | Only findings from 2+ scenarios | High-confidence only |
| `weighted` | Weight by scenario priority | Prioritized analysis |
| `consensus` | Findings with agreement | Conservative approach |
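The union and intersection strategies, combined with threshold-based deduplication, can be sketched as below. `difflib.SequenceMatcher` stands in for the merger's similarity measure (an assumption — the real metric may differ), and the dict shape of a finding is illustrative:

```python
from difflib import SequenceMatcher

def similar(a: dict, b: dict, threshold: float = 0.8) -> bool:
    """Two findings are duplicates if they hit the same location and their
    messages are at least `threshold` similar (cf. deduplication_threshold)."""
    return (a["loc"] == b["loc"]
            and SequenceMatcher(None, a["msg"], b["msg"]).ratio() >= threshold)

def merge(scenario_results: dict, strategy: str = "union") -> list:
    merged = []
    for scenario, findings in scenario_results.items():
        for f in findings:
            dup = next((m for m in merged if similar(m, f)), None)
            if dup:
                dup["sources"].append(scenario)   # dedupe, keep source tracking
            else:
                merged.append(dict(f, sources=[scenario]))
    if strategy == "intersection":
        # Keep only findings reported by two or more scenarios.
        merged = [m for m in merged if len(m["sources"]) >= 2]
    return merged
```

Because deduplication records every contributing scenario, the `intersection` strategy is just a post-filter on the union result.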
ConflictResolver¶
Resolves conflicting recommendations:
from src.workflow.composition import ConflictResolver
resolver = ConflictResolver()
# Resolve conflicts
resolved_findings, resolution_log = resolver.resolve_conflicts(
findings=unified_findings,
mode="priority_based",
)
Conflict Types¶
| Type | Description | Resolution |
|---|---|---|
| Same Location | Multiple findings for same code | Keep highest priority |
| Contradictory | Conflicting recommendations | Apply resolution rules |
| Overlapping | Partially overlapping findings | Merge or choose |
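A sketch of same-location resolution in `priority_based` mode, applying the `security_priority_boost` from the configuration. This is illustrative only — the real resolver also handles contradictory and overlapping findings, and the finding fields here are assumptions:

```python
def resolve_conflicts(findings, security_boost: float = 1.5):
    """Same-location conflicts: keep the highest effective priority,
    boosting security findings; returns (resolved, resolution_log)."""
    def effective(f):
        boost = security_boost if f.get("category") == "security" else 1.0
        return f["priority"] * boost

    best, log = {}, []
    for f in findings:
        cur = best.get(f["loc"])
        if cur is None:
            best[f["loc"]] = f
        else:
            # Decide the winner and log the decision, mirroring log_resolutions.
            winner, loser = (f, cur) if effective(f) > effective(cur) else (cur, f)
            best[f["loc"]] = winner
            log.append(f"{f['loc']}: kept {winner['source']}, dropped {loser['source']}")
    return list(best.values()), log
```

Note how the boost lets a lower-raw-priority security finding win over a higher-raw-priority performance finding at the same location.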
PriorityCalculator¶
Calculates combined priority scores:
from src.workflow.composition import PriorityCalculator
calculator = PriorityCalculator()
# Calculate and sort by priority
prioritized = calculator.sort_by_priority(findings)
# Get priority breakdown
breakdown = calculator.get_priority_breakdown(finding)
Priority Weights¶
| Factor | Weight | Description |
|---|---|---|
| `severity` | 0.30 | Finding severity level |
| `impact` | 0.25 | Potential impact score |
| `roi` | 0.20 | Return on investment |
| `confidence` | 0.15 | Detection confidence |
| `consensus` | 0.10 | Cross-scenario agreement |
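With the default weights, the `weighted_sum` algorithm reduces to a dot product. The sketch below assumes each factor is normalized to [0, 1] and that missing factors contribute 0 — both assumptions, not documented behavior:

```python
# Default weights from the priority configuration; they sum to 1.0.
WEIGHTS = {"severity": 0.30, "impact": 0.25, "roi": 0.20,
           "confidence": 0.15, "consensus": 0.10}

def priority_score(finding: dict) -> float:
    """weighted_sum: dot product of factor values and weights.
    Factors assumed normalized to [0, 1]; missing factors contribute 0."""
    return round(sum(w * finding.get(k, 0.0) for k, w in WEIGHTS.items()), 4)
```

A finding maxed out on every factor scores 1.0; one that is only maximally severe scores 0.3, which is why severity alone rarely tops the ranking.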
Configuration¶
Configure composition in config.yaml:
composition:
enabled: true
orchestrators:
scenario_18:
sub_scenarios:
- scenario_02 # Security
- scenario_05 # Refactoring
- scenario_06 # Performance
- scenario_11 # Architecture
- scenario_12 # Tech Debt
parallel_execution: true
timeout_seconds: 60
enabled: true
scenario_19:
sub_scenarios:
- scenario_08 # Compliance (required)
optional_sub_scenarios:
- scenario_17 # File Editing (optional)
- scenario_18 # Code Optimization (optional)
parallel_execution: false
timeout_seconds: 30
enabled: true
merging:
strategy: weighted # union, intersection, weighted, consensus
deduplication_threshold: 0.8
max_findings: 100
preserve_sources: true
priority:
algorithm: weighted_sum # weighted_sum, risk_based, custom
weights:
severity: 0.30
impact: 0.25
roi: 0.20
confidence: 0.15
consensus: 0.10
conflicts:
resolution_mode: priority_based # priority_based, security_first, interactive
security_priority_boost: 1.5
compliance_priority_boost: 1.3
log_resolutions: true
API Endpoints¶
POST /api/v1/composition/query¶
Execute a composite workflow:
POST /api/v1/composition/query
Content-Type: application/json
Authorization: Bearer <token>
{
"query": "Optimize src/core/",
"orchestrator": "scenario_18",
"context": {
"language": "en",
"file_paths": ["src/core/"]
},
"parallel": true,
"merge_strategy": "weighted"
}
Response:
{
"session_id": "sess_abc123",
"answer": "Found 52 optimization opportunities...",
"unified_findings": [],
"priority_scores": {},
"sub_scenario_results": {},
"conflicts_resolved": 3,
"execution_time_ms": 4234.5,
"metadata": {}
}
POST /api/v1/composition/apply¶
Apply a pending edit from a session:
POST /api/v1/composition/apply
Content-Type: application/json
Authorization: Bearer <token>
{
"session_id": "sess_abc123",
"finding_id": "find_xyz789",
"preview": true
}
GET /api/v1/composition/conflicts/{session_id}¶
Get conflict information for a session:
GET /api/v1/composition/conflicts/sess_abc123
Authorization: Bearer <token>
GET /api/v1/composition/session/{session_id}¶
Get full session state:
GET /api/v1/composition/session/sess_abc123
Authorization: Bearer <token>
DELETE /api/v1/composition/session/{session_id}¶
Delete a session and its state:
DELETE /api/v1/composition/session/sess_abc123
Authorization: Bearer <token>
GET /api/v1/composition/config¶
Get composition configuration:
GET /api/v1/composition/config
Authorization: Bearer <token>
GET /api/v1/composition/scenarios¶
List available scenarios for composition:
GET /api/v1/composition/scenarios
Authorization: Bearer <token>
CLI Commands¶
# Run composite workflow
python -m src.cli composition run "<query>" -o s18|s19
# Run with specific sub-scenarios
python -m src.cli composition run "Analyze src/" -o s18 -s scenario_02 -s scenario_06
# Run with merge strategy
python -m src.cli composition run "Optimize code" -o s18 --merge-strategy weighted
# Apply pending edit
python -m src.cli composition apply <finding_id> -s <session_id>
# Preview edit before applying
python -m src.cli composition apply <finding_id> -s <session_id> --preview
# View conflicts
python -m src.cli composition conflicts -s <session_id>
# View configuration
python -m src.cli composition config
# List available scenarios
python -m src.cli composition scenarios
Best Practices¶
1. Choose the Right Orchestrator¶
- Audit for comprehensive analysis: Full check across 12 quality dimensions (9 sub-scenarios, 600s timeout)
- S18 for optimization: When you need performance, security, and quality improvements (5 sub-scenarios)
- S19 for compliance: When you need standards verification with references (S08 required + optional S17/S18)
- Interface Docs Sync for documentation coverage: When you need to detect undocumented endpoints and stale docs (6 interfaces, 120s timeout)
2. Configure Merge Strategy¶
# For comprehensive analysis (include everything)
merging:
strategy: union
# For high-confidence findings only
merging:
strategy: intersection
# For prioritized analysis
merging:
strategy: weighted
3. Handle Conflicts¶
Enable conflict logging to understand resolution decisions:
conflicts:
log_resolutions: true
4. Optimize Performance¶
For large codebases, use parallel execution with timeout:
orchestrators:
scenario_18:
parallel_execution: true
timeout_seconds: 120
5. Customize Priority Weights¶
Adjust weights based on your team’s priorities:
priority:
weights:
severity: 0.40 # Prioritize severity
impact: 0.20
roi: 0.20
confidence: 0.10
consensus: 0.10
Project Lifecycle Operations¶
In addition to composite analysis workflows, CodeGraph provides project lifecycle management across all interfaces:
Available Operations¶
| Operation | CLI | MCP | REST API |
|---|---|---|---|
| List projects | `projects list` | `codegraph_project_list` | `GET /api/v1/projects` |
| Switch project | `projects activate <name>` | `codegraph_project_switch` | `POST /api/v1/projects/{id}/activate` |
| Rename project | `projects rename <old> <new>` | `codegraph_project_rename` | `PUT /api/v1/projects/{id}` |
| Delete project | `projects delete <name>` | `codegraph_project_delete` | `DELETE /api/v1/projects/{id}` |
Data Cleanup on Delete¶
Deletion can optionally remove associated data:
- DuckDB files — CLI `--delete-files`, API implicitly via import registry
- ChromaDB collections — CLI `--delete-collections`, API `?delete_collections=true`, MCP `delete_data=true`
- Confirmation prompt — CLI requires `--yes`/`-y` to skip interactive confirmation when deleting data
RBAC Permissions¶
| Operation | Required Role |
|---|---|
| List / Switch | GroupRole.VIEWER |
| Rename / Update | GroupRole.EDITOR |
| Delete | GroupRole.ADMIN |
| Delete with data | GroupRole.ADMIN |