Workflows Reference¶
Documentation for CodeGraph’s workflow system.
Table of Contents¶
- Workflow Architecture
- MultiScenarioCopilot
- Workflow State
- MultiScenarioState
- Specialized States
- Workflow Nodes
- Intent Classification
- Scenario Routing
- Scenario Execution
- Scenario Workflows
- Structure
- Available Scenarios
- Handler Base Class
- Composite Workflows
- Error Handling
- Retry Logic
- Fallback Strategies
- Custom Workflows
- Creating a Custom Workflow
- Conditional Routing
- Streaming
- Next Steps
Workflow Architecture¶
CodeGraph uses LangGraph for workflow orchestration. All queries flow through a single entry point — MultiScenarioCopilot — which classifies intent and routes to the appropriate scenario workflow.
               ┌──────────────────┐
               │   Entry Point    │
               │   (User Query)   │
               └────────┬─────────┘
                        │
               ┌────────▼─────────┐
               │ Intent Classify  │
               │ (Keyword + LLM)  │
               └────────┬─────────┘
                        │
               ┌────────▼─────────┐
               │ Scenario Router  │
               │  (21 scenarios)  │
               └────────┬─────────┘
                        │
         ┌──────────────┼──────────────┐
         │              │              │
┌────────▼───┐   ┌──────▼─────┐   ┌────▼────────┐
│  Security  │   │ Onboarding │   │ Perf / ...  │
│  Workflow  │   │  Workflow  │   │  Workflows  │
└────────┬───┘   └──────┬─────┘   └────┬────────┘
         │              │              │
         └──────────────┼──────────────┘
                        │
               ┌────────▼─────────┐
               │      Output      │
               │    (Answer +     │
               │    Evidence)     │
               └──────────────────┘
MultiScenarioCopilot¶
The main entry point for all workflow execution.
Location: src/workflow/orchestration/copilot.py (re-exported from src/workflow/)
from src.workflow import MultiScenarioCopilot

copilot = MultiScenarioCopilot()

# Auto-detect scenario from query
result = copilot.run("Find SQL injection vulnerabilities")

# Force a specific scenario
result = copilot.run(
    "Analyze this module",
    context={"scenario_id": "scenario_2"}
)

# Result structure
{
    'query': 'Find SQL injection vulnerabilities',
    'intent': 'security_audit',
    'scenario_id': 'scenario_2',
    'confidence': 0.92,
    'answer': 'Found 3 potential SQL injection...',
    'evidence': ['Function exec_simple_query at line 142...'],
    'metadata': {...}
}
The copilot builds a LangGraph graph internally via build_multi_scenario_graph(), which wires together intent classification, scenario routing, and scenario-specific handler execution.
Workflow State¶
MultiScenarioState¶
All workflows share MultiScenarioState, a TypedDict that flows through the LangGraph nodes.
Location: src/workflow/state.py
from typing import Any, Dict, List, Optional, TypedDict

class MultiScenarioState(TypedDict):
    # Input
    query: str
    context: Optional[Dict[str, Any]]
    language: Optional[str]               # "en" or "ru"

    # Intent Classification
    intent: Optional[str]                 # e.g., "security_audit"
    scenario_id: Optional[str]            # e.g., "scenario_2"
    confidence: Optional[float]           # 0.0–1.0
    classification_method: Optional[str]  # "keyword" or "llm"

    # CPG Data
    cpg_results: Optional[List[Dict]]
    subsystems: Optional[List[str]]
    methods: Optional[List[Dict]]
    call_graph: Optional[Any]

    # Output
    answer: Optional[str]
    evidence: Optional[List[str]]
    metadata: Optional[Dict[str, Any]]
    retrieved_functions: Optional[List[str]]

    # Error Handling
    error: Optional[str]
    retry_count: int

    # Workflow Configuration
    enrichment_config: Optional[Dict[str, Any]]
    vector_store: Optional[Any]
Create initial state with the helper:
from src.workflow.state import create_initial_state

state = create_initial_state(
    query="Find memory leaks",
    language="en",
    context={"subsystem": "executor"}
)
Specialized States¶
Some scenarios extend the base state with additional fields:
| State Class | Additional Fields |
|---|---|
| SecurityWorkflowState | vulnerabilities, taint_paths, security_findings, risk_score |
| PerformanceWorkflowState | hotspots, complexity_metrics, bottlenecks |
| ArchitectureWorkflowState | dependency_graph, layer_violations, module_coupling |
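One way to picture a specialized state is as a TypedDict subclass that adds optional fields on top of the shared base. The sketch below is illustrative only: `BaseState` is a trimmed stand-in for MultiScenarioState, and the field types are assumptions inferred from the field names in the table, not the project's actual definitions.

```python
from typing import Any, Dict, List, Optional, TypedDict


class BaseState(TypedDict):
    """Trimmed stand-in for MultiScenarioState (only two fields shown)."""
    query: str
    answer: Optional[str]


class SecurityWorkflowState(BaseState, total=False):
    """Adds the security fields from the table; the types are assumed."""
    vulnerabilities: List[Dict[str, Any]]
    taint_paths: List[List[str]]
    security_findings: List[Dict[str, Any]]
    risk_score: float


# Base keys stay required; the added keys may be filled in later by nodes.
state: SecurityWorkflowState = {
    "query": "Find SQL injection vulnerabilities",
    "answer": None,
    "risk_score": 0.0,
}
```

Declaring the extra fields with `total=False` lets nodes that only know the base state keep working, since they never have to supply the security-specific keys.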
Workflow Nodes¶
Intent Classification¶
The first node classifies the user query into a scenario using bilingual (EN/RU) keyword matching, with an optional LLM fallback.
Location: src/workflow/scenarios/_intent/
from src.workflow.orchestration import classify_intent_node
# Called internally by the graph
state = classify_intent_node(state)
# Populates: state['intent'], state['scenario_id'], state['confidence']
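The keyword-first, LLM-second idea can be sketched without any project code. Everything below is hypothetical: the keyword table, the `classify` function, the "performance" intent name, and the confidence formula are illustrations, not the contents of src/workflow/scenarios/_intent/.

```python
from typing import Callable, Dict, Optional, Tuple

# Hypothetical keyword table: intent -> (scenario_id, EN and RU keyword stems).
KEYWORDS: Dict[str, Tuple[str, Tuple[str, ...]]] = {
    "security_audit": ("scenario_2", ("vulnerab", "injection", "уязвим")),
    "performance": ("scenario_3", ("slow", "bottleneck", "производительн")),
}


def classify(query: str,
             llm_fallback: Optional[Callable[[str], Dict]] = None) -> Dict:
    """Return classification fields, trying keywords before the LLM."""
    text = query.lower()
    best_intent, best_hits = None, 0
    for intent, (_, words) in KEYWORDS.items():
        hits = sum(1 for word in words if word in text)
        if hits > best_hits:
            best_intent, best_hits = intent, hits
    if best_intent is not None:
        return {
            "intent": best_intent,
            "scenario_id": KEYWORDS[best_intent][0],
            # Assumed scoring: more keyword hits give higher confidence.
            "confidence": min(1.0, 0.5 + 0.25 * best_hits),
            "classification_method": "keyword",
        }
    if llm_fallback is not None:
        # The fallback is expected to return intent, scenario_id, confidence.
        return {**llm_fallback(query), "classification_method": "llm"}
    return {"intent": None, "scenario_id": None, "confidence": 0.0,
            "classification_method": "keyword"}


result = classify("Find SQL injection vulnerabilities")
```

Matching on stems rather than whole words keeps the table short for both languages; the LLM callable is only invoked when no keyword matches.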
Scenario Routing¶
After classification, the router dispatches to the matched scenario workflow.
from src.workflow.orchestration import route_by_intent
# Returns the scenario node name
next_node = route_by_intent(state)
# e.g., "security_workflow", "onboarding_workflow"
Scenario Execution¶
Each scenario workflow is a LangGraph subgraph that:
1. Queries the CPG database via CPGQueryService
2. Processes results through scenario-specific handlers
3. Formats the answer using localized formatters
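The three steps above can be sketched as a plain function pipeline. This is a simplified illustration, not the real subgraph: `query_cpg` is a canned stand-in for CPGQueryService, and `handle`, `format_answer`, `run_scenario`, and the templates are hypothetical names.

```python
from typing import Dict, List


def query_cpg(subsystem: str) -> List[Dict]:
    """Step 1 stand-in: return canned rows instead of querying a CPG database."""
    return [{"name": "exec_simple_query", "line": 142, "subsystem": subsystem}]


def handle(rows: List[Dict]) -> List[str]:
    """Step 2: a scenario-specific handler turns raw rows into findings."""
    return [f"{row['name']} at line {row['line']}" for row in rows]


# Step 3: localized templates, keyed by the state's language code.
TEMPLATES = {
    "en": "Found {n} method(s): {items}",
    "ru": "Найдено методов: {n}: {items}",
}


def format_answer(findings: List[str], language: str = "en") -> str:
    """Render the findings, falling back to English for unknown languages."""
    template = TEMPLATES.get(language, TEMPLATES["en"])
    return template.format(n=len(findings), items="; ".join(findings))


def run_scenario(subsystem: str, language: str = "en") -> Dict:
    findings = handle(query_cpg(subsystem))
    return {"answer": format_answer(findings, language), "evidence": findings}


result = run_scenario("executor")
```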
Scenario Workflows¶
Structure¶
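Most scenarios live in a directory named src/workflow/scenarios/{name}_handlers/; a few use a bare directory name (security/, onboarding/) or a single module (for example, audit_composite.py):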
src/workflow/scenarios/
├── _base/                        # Base handler class
│   └── handler.py                # BaseHandler
├── _intent/                      # Intent classification
├── security/                     # Security scenario
│   ├── handlers/
│   ├── formatters/
│   └── __init__.py
├── onboarding/                   # Onboarding scenario
│   ├── handlers/
│   └── __init__.py
├── architecture_handlers/        # Architecture scenario
│   ├── handlers/
│   ├── formatters/
│   └── __init__.py
├── performance_handlers/         # Performance scenario
├── refactoring_handlers/         # Refactoring scenario
├── code_review_handlers/         # Code review scenario
├── compliance_handlers/          # Compliance scenario
├── documentation_handlers/       # Documentation scenario
├── tech_debt_handlers/           # Tech debt scenario
├── debugging_handlers/           # Debugging scenario
├── concurrency_handlers/         # Concurrency scenario
├── coverage_handlers/            # Test coverage scenario
├── cross_repo_handlers/          # Cross-repo scenario
├── feature_dev_handlers/         # Feature development scenario
├── audit_composite.py            # Audit (runs 9 sub-scenarios)
├── code_optimization.py          # Code optimization
├── file_editing.py               # File editing
├── pattern_search_handlers/      # Structural pattern search scenario
│   ├── handlers/
│   └── __init__.py
├── standards_check.py            # Standards check
└── dependencies_analysis.py      # Dependency analysis
Available Scenarios¶
| ID | Name | Entry Point | Purpose |
|---|---|---|---|
| 01 | onboarding | onboarding_workflow | Codebase onboarding and navigation |
| 02 | security | security_workflow | Vulnerability detection |
| 03 | performance | performance_workflow | Performance and complexity |
| 04 | architecture | architecture_workflow | Architectural analysis |
| 05 | refactoring | refactoring_workflow | Refactoring assistance |
| 06 | documentation | documentation_workflow | Documentation generation |
| 07 | compliance | compliance_workflow | Compliance checking |
| 08 | code_review | code_review_workflow | Code review automation |
| 09 | tech_debt | tech_debt_workflow | Tech debt quantification |
| 10 | cross_repo | cross_repo_workflow | Cross-repo impact analysis |
| 11 | debugging | debugging_workflow | Debugging support |
| 12 | concurrency | concurrency_workflow | Concurrency analysis |
| 13 | coverage | test_coverage_workflow | Test coverage analysis |
| 14 | feature_dev | feature_dev_workflow | Feature development |
| 15 | security_incident | security_incident_workflow | Incident response |
| 16 | large_scale_refactoring | large_scale_refactoring_workflow | Enterprise-scale refactoring |
| 17 | file_editing | file_editing_workflow | AST-based file editing |
| 18 | code_optimization | optimization_workflow | Code optimization |
| 19 | standards_check | standards_check_workflow | Standards-guided optimization |
| 20 | dependencies | dependencies_workflow | Dependency analysis |
| 21 | pattern_search | pattern_search_workflow | Structural pattern search with CPG constraints |
| — | audit | AuditRunner | Composite: 12-dimension quality audit |
Handler Base Class¶
All scenario handlers inherit from BaseHandler:
Location: src/workflow/scenarios/_base/handler.py
from src.workflow.scenarios._base.handler import BaseHandler
from src.workflow.scenarios._base.handler import HandlerResult

class MyHandler(BaseHandler):
    async def handle(self) -> HandlerResult:
        # self.cpg — CPGQueryService instance
        # self.state — MultiScenarioState dict
        # self.cfg — Unified config
        # self.query — Original query string
        # self.language — "en" or "ru"
        results = self.cpg.get_methods_by_subsystem("executor")
        return HandlerResult(
            answer="Found methods...",
            evidence=results,
            metadata={"handler": "my_handler"}
        )
Warning: Do NOT use AnalysisHandler from src/workflow/handlers/analysis.py as a base class; its constructor signature is incompatible with the scenario registry.
Composite Workflows¶
Three composite orchestrators run sub-scenarios in parallel or sequentially:
| Composite | Scenario IDs | Mode | Timeout |
|---|---|---|---|
| code_optimization (S18) | 02, 05, 06, 11, 12 | Parallel | 60s |
| standards_check (S19) | 08, 17, 18 | Sequential | 45s |
| audit | 02, 03, 05, 06, 07, 08, 11, 12, 16 | Parallel | 600s |
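Parallel mode with a timeout can be approximated with asyncio. The sketch below assumes a single deadline shared by all sub-scenarios and silently drops any that miss it; whether the real orchestrators apply the timeout per composite or per sub-scenario is not stated here, and `run_parallel`, `fast`, and `slow` are hypothetical names.

```python
import asyncio
from typing import Awaitable, Callable, Dict


async def run_parallel(
    scenarios: Dict[str, Callable[[str], Awaitable[str]]],
    query: str,
    timeout_s: float,
) -> Dict[str, str]:
    """Run every sub-scenario concurrently under one shared timeout."""
    tasks = {sid: asyncio.create_task(fn(query)) for sid, fn in scenarios.items()}
    done, pending = await asyncio.wait(tasks.values(), timeout=timeout_s)
    for task in pending:
        task.cancel()  # drop sub-scenarios that missed the deadline
    results: Dict[str, str] = {}
    for sid, task in tasks.items():
        # Keep only sub-scenarios that finished without raising.
        if task in done and task.exception() is None:
            results[sid] = task.result()
    return results


async def fast(query: str) -> str:
    return f"fast:{query}"


async def slow(query: str) -> str:
    await asyncio.sleep(5)  # longer than the timeout used below
    return "never returned"


results = asyncio.run(
    run_parallel({"02": fast, "05": slow}, "audit", timeout_s=0.2)
)
```

Collecting partial results instead of failing the whole run is a design choice of this sketch; a stricter orchestrator could raise when any sub-scenario times out.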
Audit runs 9 sub-scenarios in parallel, covering 12 code quality dimensions (security, complexity, duplication, dependencies, naming, error handling, testing, documentation, performance, portability, style, architecture). It is exposed through the CLI:
python -m src.cli audit --db PATH [--language ru] [--format json]
python -m src.cli audit --db PATH --autofix # Audit + autofix suggestions
The --autofix flag generates automated fix suggestions for security vulnerabilities found during the audit, using AutofixEngine on taint paths. Configuration in config.yaml → autofix section.
Exec provides non-interactive CI/CD execution with PR security review:
python -m src.cli exec --prompt "Review security" --base-ref origin/main \
--sarif-file out.sarif --comment-file comment.md --sandbox read-only
The exec pipeline collects the changed files, scans the changed methods, computes a “New vs Fixed” delta via fingerprinting, and generates SARIF 2.1.0 output (via SARIFExporter) together with PR comment markdown. Configuration in config.yaml → reporting.
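A fingerprint-based delta can be sketched as a set difference over stable hashes. The fields hashed below (`rule`, `file`, `method`) and the sample findings are assumptions chosen for illustration; the actual fingerprint inputs used by the exec pipeline are not documented here.

```python
import hashlib
from typing import Dict, List, Set


def fingerprint(finding: Dict) -> str:
    """Hash the identifying fields, ignoring line numbers that shift on edits."""
    key = "|".join([finding["rule"], finding["file"], finding["method"]])
    return hashlib.sha256(key.encode("utf-8")).hexdigest()


def delta(base: List[Dict], head: List[Dict]) -> Dict[str, Set[str]]:
    """Compare findings on the base ref against findings on the PR head."""
    base_fp = {fingerprint(f) for f in base}
    head_fp = {fingerprint(f) for f in head}
    return {
        "new": head_fp - base_fp,        # introduced by the change
        "fixed": base_fp - head_fp,      # no longer present
        "unchanged": base_fp & head_fp,  # pre-existing
    }


base = [
    {"rule": "sqli", "file": "a.py", "method": "run_query", "line": 10},
    {"rule": "xss", "file": "b.py", "method": "render", "line": 5},
]
head = [
    {"rule": "sqli", "file": "a.py", "method": "run_query", "line": 14},
    {"rule": "path_traversal", "file": "c.py", "method": "open_file", "line": 3},
]
report = delta(base, head)
```

Because the line number is left out of the hash, the SQL injection finding that merely moved from line 10 to line 14 counts as unchanged rather than as one fixed and one new finding.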
Conflict resolution uses priority mode with security (1.5x) and compliance (1.3x) boosts. Configuration in config.yaml → composition.
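As an illustration of how such boosts might decide a conflict, the sketch below multiplies each candidate's score by its scenario's boost and keeps the highest. Only the 1.5x and 1.3x factors come from the text; the scoring scheme, the default factor of 1.0, and the `resolve` function are assumptions.

```python
from typing import Dict, List

# Boost factors from the text; other scenarios are assumed to use 1.0.
BOOSTS = {"security": 1.5, "compliance": 1.3}


def boosted_score(candidate: Dict) -> float:
    return candidate["score"] * BOOSTS.get(candidate["scenario"], 1.0)


def resolve(candidates: List[Dict]) -> Dict:
    """Pick the recommendation with the highest boosted score."""
    return max(candidates, key=boosted_score)


candidates = [
    {"scenario": "refactoring", "score": 0.80, "advice": "Inline the helper"},
    {"scenario": "security", "score": 0.60, "advice": "Keep the validation helper"},
    {"scenario": "compliance", "score": 0.65, "advice": "Add an audit log entry"},
]
winner = resolve(candidates)
```

Here the security recommendation wins even though its raw score is the lowest: 0.60 × 1.5 is about 0.90, against 0.845 for compliance and 0.80 for refactoring.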
Error Handling¶
Retry Logic¶
Workflows support automatic retry with query refinement:
# Built into the graph — configurable via state['retry_count']
# Default: up to 2 retries with adaptive query refinement
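A retry loop with query refinement might look like the following. This is a sketch under assumptions: `run_with_retry`, `execute`, and `refine` are hypothetical names, and the real graph implements retries as nodes and edges rather than a Python loop.

```python
from typing import Callable, Dict, Optional

MAX_RETRIES = 2  # default stated above


def run_with_retry(query: str,
                   execute: Callable[[str], Optional[str]],
                   refine: Callable[[str, int], str]) -> Dict:
    """Call execute, refining the query after each empty result."""
    state = {"query": query, "answer": None, "error": None, "retry_count": 0}
    current = query
    while True:
        answer = execute(current)
        if answer is not None:
            state["answer"] = answer
            return state
        if state["retry_count"] >= MAX_RETRIES:
            state["error"] = "no result after retries"
            return state
        state["retry_count"] += 1
        current = refine(current, state["retry_count"])


def execute(q: str) -> Optional[str]:
    # Pretend the lookup only succeeds once a subsystem is named.
    return "2 leaks found" if "executor" in q else None


def refine(q: str, attempt: int) -> str:
    # Hypothetical refinement: narrow the query to one subsystem.
    return f"{q} in subsystem executor"


state = run_with_retry("Find memory leaks", execute, refine)
```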
Fallback Strategies¶
When LLM-based query generation fails, workflows fall back first to template-matched queries and then to direct CPG method calls:
# Automatic fallback chain:
# 1. LLM-generated SQL query
# 2. Template-matched query from query examples
# 3. Direct CPG method call
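The chain can be modeled as an ordered list of strategies tried until one yields rows. The three stand-in functions below are hypothetical, and treating an empty result as a reason to fall through is an assumption of this sketch.

```python
from typing import Callable, List, Optional, Tuple

Strategy = Tuple[str, Callable[[str], Optional[list]]]


def run_chain(query: str,
              strategies: List[Strategy]) -> Tuple[Optional[str], Optional[list]]:
    """Try each strategy in order; return the first non-empty result."""
    for name, strategy in strategies:
        try:
            rows = strategy(query)
        except Exception:
            continue  # a failing strategy hands over to the next one
        if rows:
            return name, rows
    return None, None


def llm_sql(query: str) -> list:
    raise RuntimeError("LLM unavailable")  # simulate a generation failure


def template_match(query: str) -> list:
    return []  # simulate no matching template


def direct_cpg_call(query: str) -> list:
    return [{"method": "exec_simple_query"}]


used, rows = run_chain(
    "Find SQL injection",
    [("llm_sql", llm_sql), ("template", template_match), ("direct", direct_cpg_call)],
)
```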
Custom Workflows¶
Creating a Custom Workflow¶
from langgraph.graph import StateGraph
from src.workflow.state import MultiScenarioState

# my_analyze_node, my_process_node, and my_interpret_node are user-defined
# callables that take the state and return a partial state update.

def create_custom_workflow():
    workflow = StateGraph(MultiScenarioState)
    workflow.add_node("analyze", my_analyze_node)
    workflow.add_node("process", my_process_node)
    workflow.add_node("interpret", my_interpret_node)
    workflow.add_edge("analyze", "process")
    workflow.add_edge("process", "interpret")
    workflow.set_entry_point("analyze")
    workflow.set_finish_point("interpret")
    return workflow.compile()

result = create_custom_workflow().invoke({"query": "..."})
Conditional Routing¶
def route_by_intent(state: MultiScenarioState) -> str:
    if state["intent"] == "find_vulnerabilities":
        return "security_node"
    elif state["intent"] == "find_performance":
        return "performance_node"
    else:
        return "general_node"

# `workflow` is the StateGraph being built; the mapping translates each
# value returned by route_by_intent into the name of a registered node.
workflow.add_conditional_edges(
    "analyze",
    route_by_intent,
    {
        "security_node": "security",
        "performance_node": "performance",
        "general_node": "general"
    }
)
Streaming¶
Progress streaming is supported through the LangGraph streaming interface:
copilot = MultiScenarioCopilot()
# Streaming is handled at the API/TUI layer
# See src/api/routers/ for WebSocket streaming
# See src/tui/ for terminal streaming
Next Steps¶
- API Reference - Complete API
- Agents Reference - Agent details
- Scenarios Guide - Scenario usage examples