Workflows Reference

Documentation for CodeGraph’s workflow system – LangGraph-based orchestration of 21 scenario workflows, composite pipelines, and the exec CI/CD tool.

Workflow Architecture

CodeGraph uses LangGraph for workflow orchestration. All queries flow through a single entry point – MultiScenarioCopilot – which classifies intent, runs pre-retrieval, routes to the appropriate scenario workflow, and returns structured results.

graph TD
    START([User Query]) --> classify_intent[classify_intent]
    classify_intent --> pre_retrieval[pre_retrieval]
    pre_retrieval --> route_by_intent[route_by_intent]
    route_by_intent --> S01[onboarding_workflow]
    route_by_intent --> S02[security_workflow]
    route_by_intent --> S03[documentation_workflow]
    route_by_intent --> S_N[... 18 more scenarios]
    route_by_intent --> S21[interface_docs_sync_workflow]
    S01 --> END_NODE([END])
    S02 --> END_NODE
    S03 --> END_NODE
    S_N --> END_NODE
    S21 --> END_NODE

The graph chain is: classify_intent -> pre_retrieval -> route_by_intent -> scenario node -> END.
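The chain can be simulated in plain Python to make the data flow concrete. Every node body below is a toy stand-in for illustration; the real implementations live in src/workflow/orchestration/:

```python
# Toy simulation of the main graph chain:
# classify_intent -> pre_retrieval -> route_by_intent -> scenario node -> END.

def classify_intent(state):
    # Stand-in for the bilingual keyword classifier.
    q = state["query"].lower()
    if "vulnerab" in q or "security" in q:
        state.update(intent="security_audit", scenario_id="scenario_2")
    else:
        state.update(intent="onboarding", scenario_id="scenario_1")
    return state

def pre_retrieval(state):
    # The real node runs HybridRetriever; here we just stub the results.
    state["pre_retrieval_results"] = []
    return state

def route_by_intent(state):
    # Maps intent to the scenario node name.
    return {"security_audit": "security_workflow"}.get(state["intent"], "onboarding_workflow")

SCENARIOS = {
    "security_workflow": lambda s: {**s, "answer": "security findings..."},
    "onboarding_workflow": lambda s: {**s, "answer": "onboarding overview..."},
}

def run_chain(query):
    state = {"query": query}
    for node in (classify_intent, pre_retrieval):
        state = node(state)
    return SCENARIOS[route_by_intent(state)](state)
```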

MultiScenarioCopilot

The main entry point for all workflow execution.

Location: src/workflow/orchestration/copilot.py

Re-exported from src/workflow/ for convenience.

from src.workflow import MultiScenarioCopilot

copilot = MultiScenarioCopilot()

# Auto-detect scenario from query
result = copilot.run("Find SQL injection vulnerabilities")

# Force a specific scenario
result = copilot.run(
    "Analyze this module",
    context={"scenario_id": "scenario_2"}
)

# Set language via context
result = copilot.run(
    "Find memory leaks",
    context={"language": "ru"}
)

Constructor and method signatures:

class MultiScenarioCopilot:
    def __init__(self):
        self.graph = build_multi_scenario_graph()

    def run(self, query: str, context: Optional[Dict] = None) -> Dict[str, Any]:
        ...

The graph is built internally via build_multi_scenario_graph() from src/workflow/orchestration/graph_builder.py.

Result structure:

{
    "query": "Find SQL injection vulnerabilities",
    "intent": "security_audit",
    "scenario_id": "scenario_2",
    "confidence": 0.92,
    "answer": "Found 3 potential SQL injection...",
    "evidence": ["Function exec_simple_query at line 142..."],
    "metadata": {...}
}
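A caller might consume this structure as follows. The helper name and confidence threshold are this example's choices, not part of the API:

```python
# Illustrative consumer of the result dict returned by copilot.run().
def summarize_result(result, min_confidence=0.5):
    if result.get("error"):
        return f"Error: {result['error']}"
    confidence = result.get("confidence", 0.0)
    if confidence < min_confidence:
        return f"Low-confidence match ({confidence:.2f}); answer may be off-topic."
    # Lead with the scenario id so users can see which workflow answered.
    lines = [f"[{result['scenario_id']}] {result['answer']}"]
    lines += [f"  evidence: {e}" for e in result.get("evidence", [])]
    return "\n".join(lines)
```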

Workflow State

MultiScenarioState

All workflows share MultiScenarioState, a TypedDict with 22 fields that flows through the LangGraph nodes.

Location: src/workflow/state.py

class MultiScenarioState(TypedDict):
    # Input
    query: str
    context: Optional[Dict[str, Any]]
    language: Optional[str]               # "en" or "ru"

    # Intent Classification
    intent: Optional[str]                 # e.g., "security_audit"
    scenario_id: Optional[str]            # e.g., "scenario_2"
    confidence: Optional[float]           # 0.0-1.0
    classification_method: Optional[str]  # "keyword" or "llm"

    # CPG Data
    cpg_results: Optional[List[Dict]]
    subsystems: Optional[List[str]]
    methods: Optional[List[Dict]]
    call_graph: Optional[Any]

    # Final Output
    answer: Optional[str]
    evidence: Optional[List[str]]
    metadata: Optional[Dict[str, Any]]
    retrieved_functions: Optional[List[str]]

    # Error Handling
    error: Optional[str]
    retry_count: int

    # Workflow Configuration
    enrichment_config: Optional[Dict[str, Any]]
    vector_store: Optional[Any]

    # Multi-tenant project scoping
    db_path: Optional[str]
    collection_prefix: Optional[str]

    # Pre-retrieval results (Phase E)
    pre_retrieval_results: Optional[List[Dict[str, Any]]]

Create initial state with the helper function. Note that language is NOT a parameter – pass it via context:

from src.workflow.state import create_initial_state

# Signature: create_initial_state(query, context=None)
state = create_initial_state(
    query="Find memory leaks",
    context={"subsystem": "executor", "language": "en"}
)
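Based on the field definitions above, the helper plausibly seeds the TypedDict defaults and lifts language out of context. The sketch below is a guess at that behavior (only a subset of fields shown), not the actual source:

```python
from typing import Any, Dict, Optional

# Illustrative sketch of a create_initial_state helper, inferred from the
# MultiScenarioState fields above -- not the real implementation.
def create_initial_state(query: str, context: Optional[Dict[str, Any]] = None) -> Dict[str, Any]:
    context = context or {}
    return {
        "query": query,
        "context": context,
        "language": context.get("language", "en"),  # language travels inside context
        "intent": None,
        "scenario_id": None,
        "confidence": None,
        "answer": None,
        "evidence": None,
        "error": None,
        "retry_count": 0,
        "pre_retrieval_results": None,
    }
```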

Specialized States

Some scenarios extend the base state with domain-specific fields:

State Class                 Additional Fields
SecurityWorkflowState       vulnerabilities, taint_paths, security_findings, risk_score
PerformanceWorkflowState    hotspots, complexity_metrics, bottlenecks, optimization_suggestions
ArchitectureWorkflowState   dependencies, layer_violations, circular_deps, subsystem_info

All specialized states also include the base fields: query, context, intent, answer, evidence, error.

Workflow Nodes

Intent Classification

The first node in the graph classifies the user query into a scenario using bilingual (EN/RU) keyword matching with optional LLM fallback.

Location: src/workflow/orchestration/intent_classifier.py

from src.workflow.orchestration.intent_classifier import classify_intent_node

# Called internally by the graph
state = classify_intent_node(state)
# Populates: state['intent'], state['scenario_id'], state['confidence'],
#            state['classification_method']

Classification methods:

  • "keyword" – fast bilingual keyword matching (default)
  • "llm" – LLM-based fallback when keyword confidence is low
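The keyword path can be sketched as follows. The keyword table, scoring, and threshold are invented for illustration; the real classifier is in src/workflow/orchestration/intent_classifier.py:

```python
# Minimal sketch of bilingual (EN/RU) keyword classification with an
# LLM-fallback signal when confidence is low.
KEYWORDS = {
    "security_audit": ["vulnerability", "injection", "уязвимость", "инъекция"],
    "performance": ["slow", "bottleneck", "медленно", "узкое место"],
}

def classify_by_keywords(query, threshold=0.25):
    q = query.lower()
    # Fraction of an intent's keywords present in the query.
    scores = {
        intent: sum(kw in q for kw in kws) / len(kws)
        for intent, kws in KEYWORDS.items()
    }
    intent, confidence = max(scores.items(), key=lambda kv: kv[1])
    if confidence < threshold:
        return None, confidence  # caller falls back to the LLM classifier
    return intent, confidence
```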

Pre-Retrieval

After intent classification and before routing, the pre-retrieval node runs HybridRetriever to gather initial context. This is Phase E of the retrieval pipeline.

Location: src/workflow/orchestration/pre_retrieval.py

The node maps each intent to a query_type that controls retrieval weighting:

Query Type   Intent Keys
semantic     onboarding, documentation, feature_development, debugging, test_coverage
structural   architecture_violations, cross_repo_impact, dependencies, mass_refactoring, tech_debt, refactoring, performance
security     security_audit, security_incident, entry_points, compliance
default      code_review, file_editing, code_optimization, standards_check

Results are stored in state["pre_retrieval_results"].

Configuration: config.yaml -> workflows.pre_retrieval.enable (enabled by default).
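The intent-to-query_type mapping from the table above can be expressed as a lookup with "default" as the fallback. The table contents come from this document; the function shape is illustrative:

```python
# intent -> query_type lookup; intents not listed fall back to "default".
QUERY_TYPE_BY_INTENT = {
    **dict.fromkeys(
        ["onboarding", "documentation", "feature_development", "debugging",
         "test_coverage"], "semantic"),
    **dict.fromkeys(
        ["architecture_violations", "cross_repo_impact", "dependencies",
         "mass_refactoring", "tech_debt", "refactoring", "performance"],
        "structural"),
    **dict.fromkeys(
        ["security_audit", "security_incident", "entry_points", "compliance"],
        "security"),
}

def resolve_query_type(intent):
    return QUERY_TYPE_BY_INTENT.get(intent, "default")
```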

Scenario Routing

After pre-retrieval, the router dispatches to the matched scenario workflow function.

Location: src/workflow/orchestration/router.py

from src.workflow.orchestration.router import route_by_intent

# Returns the scenario node name
next_node = route_by_intent(state)
# e.g., "security_workflow", "onboarding_workflow"

Scenario Execution

Each scenario workflow is a LangGraph subgraph that:

  1. Queries the CPG database via CPGQueryService
  2. Processes results through scenario-specific handlers inheriting from BaseHandler
  3. Formats the answer using localized formatters
  4. Returns results back to the main graph state

Scenario Workflows

Structure

Each scenario follows a standard directory layout under src/workflow/scenarios/:

src/workflow/scenarios/
├── _base/                              # Base handler class
│   └── handler.py                      # BaseHandler, HandlerResult
├── _intent/                            # Intent classification
├── onboarding/                         # S01: Onboarding
│   └── handlers/
├── security/                           # S02: Security
│   ├── handlers/
│   └── formatters/
├── documentation_handlers/             # S03: Documentation
├── feature_dev_handlers/               # S04: Feature development
├── refactoring_handlers/               # S05: Refactoring
├── performance_handlers/               # S06: Performance
├── coverage_handlers/                  # S07: Test coverage
├── compliance_handlers/                # S08: Compliance
├── code_review_handlers/               # S09: Code review
├── cross_repo_handlers/                # S10: Cross-repo
├── architecture_handlers/              # S11: Architecture
│   ├── handlers/
│   └── formatters/
├── tech_debt_handlers/                 # S12: Tech debt
├── debugging_handlers/                 # S15: Debugging
├── code_optimization.py                # S18: Code optimization
├── code_optimization_composite.py      # S18 composite variant
├── file_editing.py                     # S17: File editing
├── standards_check.py                  # S19: Standards check
├── standards_check_composite.py        # S19 composite variant
├── dependencies_analysis.py            # S20: Dependencies
├── interface_docs_sync_composite.py    # S21 + composite: Docs sync
├── audit_composite.py                  # Composite: Audit (AuditRunner)
└── story_validation_composite.py       # Composite: Story validation (StoryValidationRunner)

Available Scenarios

ID  Name                 Entry Point                    Purpose
01  onboarding           onboarding_workflow            Codebase onboarding and navigation
02  security             security_workflow              Vulnerability detection
03  documentation        documentation_workflow         Documentation generation
04  feature_dev          feature_dev_workflow           Feature development
05  refactoring          refactoring_workflow           Refactoring assistance
06  performance          performance_workflow           Performance and complexity
07  test_coverage        test_coverage_workflow         Test coverage analysis
08  compliance           compliance_workflow            Compliance checking
09  code_review          code_review_workflow           Code review automation
10  cross_repo           cross_repo_workflow            Cross-repo impact analysis
11  architecture         architecture_workflow          Architectural analysis
12  tech_debt            tech_debt_workflow             Tech debt quantification
13  mass_refactoring     mass_refactoring_workflow      Enterprise-scale refactoring
14  security_incident    security_incident_workflow     Incident response
15  debugging            debugging_workflow             Debugging support
16  entry_points         entry_points_workflow          Entry point analysis
17  file_editing         file_editing_workflow          AST-based file editing
18  code_optimization    optimization_workflow          Code optimization
19  standards_check      standards_check_workflow       Standards-guided optimization
20  dependencies         dependencies_workflow          Dependency analysis
21  interface_docs_sync  interface_docs_sync_workflow   Interface documentation sync

Composites (no numeric ID):

Name                 Entry Point               Purpose
audit                AuditRunner               12-dimension quality audit (9 sub-scenarios in parallel, 600s)
interface_docs_sync  InterfaceDocsSyncRunner   5-phase pipeline (7 interfaces, 120s)
story_validation     StoryValidationRunner     User story validation (5 interfaces)

Handler Base Class

All scenario handlers inherit from BaseHandler.

Location: src/workflow/scenarios/_base/handler.py

@dataclass
class HandlerResult(Generic[T]):
    data: Optional[T] = None
    cpg_results: List[Dict[str, Any]] = field(default_factory=list)
    retrieved_functions: List[str] = field(default_factory=list)
    answer: str = ""
    evidence: List[str] = field(default_factory=list)
    metadata: Dict[str, Any] = field(default_factory=dict)
    should_return: bool = True
    llm_context: Dict[str, Any] = field(default_factory=dict)

BaseHandler constructor and key attributes:

class BaseHandler:
    def __init__(self, cpg: Any, state: MultiScenarioState):
        self.cpg = cpg          # CPGQueryService instance
        self.state = state      # MultiScenarioState dict
        self.query = state["query"]
        self.context = state.get("context", {})
        self.language = state.get("language", "en")
        self.cfg = get_unified_config()

Public methods:

  • can_handle(query_info: Dict) -> bool – returns True by default, override for custom routing logic
  • handle(query_info: Dict) -> HandlerResult[T] – abstract, implement scenario-specific logic
  • apply_result(result: HandlerResult) -> MultiScenarioState – applies HandlerResult fields to state
  • log_info(message: str) / log_debug(message: str) / log_warning(message: str) – logging with class context

Usage example:

from src.workflow.scenarios._base.handler import BaseHandler, HandlerResult

class MyHandler(BaseHandler):
    async def handle(self, query_info: Dict) -> HandlerResult:
        results = self.cpg.get_methods_by_subsystem("executor")
        return HandlerResult(
            answer="Found methods...",
            evidence=["method_a at line 42"],
            metadata={"handler": "my_handler"}
        )

Warning: Do NOT use AnalysisHandler from src/workflow/handlers/analysis.py as a base class – its __init__ signature is incompatible with the scenario registry (self.cpg and self.state will not be set).

Composite Workflows

Composite workflows orchestrate multiple sub-scenarios with conflict resolution and timeout management.

S18 code_optimization (composite variant):

  • Sub-scenarios: 02 (security), 05 (refactoring), 06 (performance), 11 (architecture), 12 (tech_debt)
  • Mode: parallel
  • Timeout: 60s
  • Triggered by composite_mode=True via optimization_composite_workflow

S19 standards_check (composite variant):

  • Sub-scenarios: 08 (compliance), 17 (file_editing), 18 (code_optimization)
  • Mode: sequential
  • Timeout: 45s
  • Triggered by composite_mode=True via standards_check_composite_workflow

Audit (AuditRunner):

  • 9 sub-scenarios run in parallel
  • 12 code quality dimensions: security, complexity, duplication, dependencies, naming, error handling, testing, documentation, performance, portability, style, architecture
  • Timeout: 600s
  • FP-reduction pipeline (V25-V32)
python -m src.cli audit --db PATH [--language ru] [--format json]

Interface Docs Sync (InterfaceDocsSyncRunner):

  • 5-phase pipeline: Discovery -> Doc Parsing -> Generation -> Drift Detection -> Report
  • Scans 6 interfaces: REST API, CLI, MCP, ACP, gRPC, WebSocket
  • Timeout: 120s
  • DriftType: UNDOCUMENTED, STALE, OUTDATED, COVERED
python -m src.cli docs-sync --db PATH [--check] [--format json]

Also available as MCP tool codegraph_docs_sync and REST endpoint POST /api/v1/documentation/sync.
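Drift detection can be sketched as a set comparison between interfaces found in code and interfaces found in docs. The DriftType names come from this document, but the STALE vs OUTDATED semantics below (removed endpoint vs changed signature) are an assumption, not taken from the runner:

```python
from enum import Enum

class DriftType(Enum):
    UNDOCUMENTED = "undocumented"  # in code, missing from docs
    STALE = "stale"                # in docs, gone from code (assumed meaning)
    OUTDATED = "outdated"          # documented signature differs (assumed meaning)
    COVERED = "covered"            # code and docs agree

def classify_drift(code_endpoints, doc_endpoints):
    """Both arguments: {endpoint_name: signature} dicts."""
    drift = {}
    for name, signature in code_endpoints.items():
        if name not in doc_endpoints:
            drift[name] = DriftType.UNDOCUMENTED
        elif doc_endpoints[name] != signature:
            drift[name] = DriftType.OUTDATED
        else:
            drift[name] = DriftType.COVERED
    for name in doc_endpoints:
        if name not in code_endpoints:
            drift[name] = DriftType.STALE
    return drift
```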

Story Validation (StoryValidationRunner):

  • Validates Done user stories against 5 interfaces
  • Evidence types: dedicated (threshold 0.8), passthrough (threshold 0.5), scenario_map (threshold 0.7)
  • Tracks all matched function names via all_matched_names
  • Supports Go CPG via --go-db parameter
python -m src.cli.import_commands dogfood validate-stories

Conflict resolution across all composites uses priority mode: security findings get a 1.5x boost, compliance findings get 1.3x. Configuration: config.yaml -> composition.
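Priority-mode ranking can be sketched as a score boost before sorting. The boost factors are the ones stated above (security 1.5x, compliance 1.3x); the finding shape and function name are illustrative:

```python
# Priority-mode conflict resolution: boost scores by finding source,
# then rank so boosted security/compliance findings surface first.
BOOSTS = {"security": 1.5, "compliance": 1.3}

def rank_findings(findings):
    """findings: list of {"source": str, "score": float, ...} dicts."""
    def boosted(f):
        return f["score"] * BOOSTS.get(f["source"], 1.0)
    return sorted(findings, key=boosted, reverse=True)
```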

Exec Pipeline

Separate CI/CD tool – not a composite workflow. Designed for non-interactive PR security review in CI pipelines.

Location: src/cli/exec_command.py

python -m src.cli exec --prompt "Review security" --base-ref origin/main \
    --sarif-file out.sarif --comment-file comment.md --sandbox read-only

The exec pipeline:

  1. Gets changed files from --base-ref
  2. Scans changed methods via CPG
  3. Computes “New vs Fixed” delta via fingerprinting
  4. Generates SARIF 2.1.0 output via SARIFExporter (src/security/sarif_exporter.py)
  5. Generates PR comment markdown

Configuration: config.yaml -> reporting.
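The "New vs Fixed" delta in step 3 can be sketched as set arithmetic over fingerprints. Fingerprinting on (rule, file, snippet) is an assumption for illustration; the real exporter may normalize findings differently:

```python
import hashlib

def fingerprint(finding):
    # Assumed fingerprint: stable hash over rule id, file, and code snippet.
    raw = f'{finding["rule"]}|{finding["file"]}|{finding["snippet"]}'
    return hashlib.sha256(raw.encode()).hexdigest()

def delta(base_findings, head_findings):
    """New = in head but not base; Fixed = in base but not head."""
    base = {fingerprint(f) for f in base_findings}
    head = {fingerprint(f) for f in head_findings}
    new = [f for f in head_findings if fingerprint(f) not in base]
    fixed = [f for f in base_findings if fingerprint(f) not in head]
    return new, fixed
```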

Error Handling

Retry Logic

Workflows support automatic retry with query refinement:

# Built into the graph -- configurable via state['retry_count']
# Default: up to 2 retries with adaptive query refinement

The retry_count field in MultiScenarioState tracks the current retry attempt. When a scenario handler raises an exception, the graph increments retry_count and re-invokes the handler with a refined query.
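The retry loop described above can be sketched as follows. The refine() stand-in and the wrapper name are illustrative; only the retry limit (2) and the retry_count/error fields come from this document:

```python
# Illustrative retry loop: on handler failure, bump retry_count, refine the
# query, and re-invoke; after MAX_RETRIES, record the error and stop.
MAX_RETRIES = 2

def refine(query, attempt):
    # Stand-in for adaptive query refinement.
    return f"{query} (refined, attempt {attempt})"

def run_with_retry(handler, state):
    while True:
        try:
            return handler(state)
        except Exception as exc:
            state["retry_count"] = state.get("retry_count", 0) + 1
            if state["retry_count"] > MAX_RETRIES:
                state["error"] = str(exc)
                return state
            state["query"] = refine(state["query"], state["retry_count"])
```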

Fallback Strategies

When LLM-based generation fails, workflows fall back through a chain:

1. LLM-generated SQL query
2. Template-matched query from query examples
3. Direct CPG method call

If all fallbacks fail, the error field in MultiScenarioState is populated with a descriptive message and answer contains a user-friendly explanation.
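The chain can be sketched as trying strategies in order until one yields a result. The strategy functions here are placeholders standing in for the three steps above:

```python
# Fallback chain: try each strategy in order; a strategy signals failure by
# raising or returning None. If all fail, populate error and a friendly answer.
def generate_answer(query, strategies):
    errors = []
    for name, strategy in strategies:
        try:
            result = strategy(query)
            if result is not None:
                return {"answer": result, "metadata": {"strategy": name}}
        except Exception as exc:
            errors.append(f"{name}: {exc}")
    return {"answer": "Could not answer the query.", "error": "; ".join(errors)}
```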

Custom Workflows

Creating a Custom Workflow

from langgraph.graph import StateGraph
from src.workflow.state import MultiScenarioState

def create_custom_workflow():
    workflow = StateGraph(MultiScenarioState)

    workflow.add_node("analyze", my_analyze_node)
    workflow.add_node("process", my_process_node)
    workflow.add_node("interpret", my_interpret_node)

    workflow.add_edge("analyze", "process")
    workflow.add_edge("process", "interpret")

    workflow.set_entry_point("analyze")
    workflow.set_finish_point("interpret")

    return workflow.compile()

result = create_custom_workflow().invoke({"query": "..."})

Conditional Routing

def my_router(state: MultiScenarioState) -> str:
    if state["intent"] == "security_audit":
        return "security_node"
    elif state["intent"] == "performance":
        return "performance_node"
    else:
        return "general_node"

workflow.add_conditional_edges(
    "classify",
    my_router,
    {
        "security_node": "security",
        "performance_node": "performance",
        "general_node": "general",
    }
)

Streaming

MultiScenarioCopilot.run() is synchronous. Streaming is implemented at the API/WebSocket layer via thread pool executor.

  • ChatService.process_query_stream() in src/api/services/chat_service.py – AsyncGenerator producing SSE events
  • POST /api/v1/chat/stream in src/api/routers/chat.py – SSE endpoint
  • WebSocket support in src/api/websocket/handlers.py
from src.api.services.chat_service import ChatService

# Streaming is handled at the API layer
# MultiScenarioCopilot.run() runs in a thread pool executor
# Results are streamed as SSE events to the client
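The executor pattern can be sketched without the API stack. The event names, chunking, and function names below are invented for illustration; only the idea of running the synchronous call in a thread pool comes from this document:

```python
import asyncio

def blocking_run(query):
    # Stand-in for the synchronous MultiScenarioCopilot.run().
    return {"answer": "chunk one chunk two", "scenario_id": "scenario_2"}

async def stream_events(query):
    # Run the blocking call in the default thread pool executor, then
    # emit SSE-style event dicts as results become available.
    loop = asyncio.get_running_loop()
    yield {"event": "start", "data": query}
    result = await loop.run_in_executor(None, blocking_run, query)
    for word in result["answer"].split():
        yield {"event": "token", "data": word}
    yield {"event": "done", "data": result["scenario_id"]}

async def collect(query):
    return [e async for e in stream_events(query)]
```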