Workflow Scenarios Guide¶
CodeGraph supports 21 specialized analysis scenarios.
Table of Contents¶
- Scenario Overview
- 1. Codebase Onboarding
- 2. Security Audit
- 3. Documentation Generation
- 4. Feature Development
- 5. Refactoring
- 6. Performance Analysis
- 7. Test Coverage
- 8. Compliance Verification
- 9. Code Review
- 10. Cross-Repository Analysis
- 11. Architecture Analysis
- 12. Tech Debt Assessment
- 13. Mass Refactoring
- 14. Security Incident Response
- 15. Debugging Support
- 16. Entry Points & Attack Surface
- 17. File Editing
- 18. Code Optimization
- 19. Standards Check
- 20. Dependency Analysis
- 21. Structural Pattern Search
- Combining Scenarios
- Next Steps
Scenario Overview¶
| # | Scenario | Use Case |
|---|---|---|
| 1 | Codebase Onboarding | Navigate the codebase for new developers |
| 2 | Security Audit | Comprehensive audit with taint analysis |
| 3 | Documentation Generation | Auto-generate technical documentation |
| 4 | Feature Development | Guidance for implementing new features |
| 5 | Refactoring | Refactoring recommendations with impact analysis |
| 6 | Performance Analysis | Identify performance bottlenecks |
| 7 | Test Coverage | Test coverage analysis and recommendations |
| 8 | Compliance Verification | OWASP, GDPR, ISO 27001 compliance checks |
| 9 | Code Review | Automated PR/MR review |
| 10 | Cross-Repository Analysis | Cross-module dependency analysis |
| 11 | Architecture Analysis | Detect architectural constraint violations |
| 12 | Tech Debt Assessment | Quantify technical debt |
| 13 | Mass Refactoring | Automated mass refactoring (API migrations) |
| 14 | Security Incident Response | Incident investigation with recommendations |
| 15 | Debugging Support | Data-flow-based debugging assistance |
| 16 | Entry Points & Attack Surface | API entry points and attack surface analysis |
| 17 | File Editing | AST-based precise code editing |
| 18 | Code Optimization | Comprehensive optimization (security, refactoring, architecture) |
| 19 | Standards Check | Code standards verification |
| 20 | Dependency Analysis | Dependency and import analysis |
| 21 | Structural Pattern Search | Find code patterns with CPG constraints |
1. Codebase Onboarding¶
Helps new developers navigate the codebase: find function, class, and struct definitions and explore the architecture.
Example Questions¶
Explain the project architecture
Find method 'heap_insert'
Where is AbortTransaction defined?
Show the definition of RelFileNode struct
Find all methods in file 'xact.c'
Usage¶
from src.workflow import MultiScenarioCopilot
copilot = MultiScenarioCopilot()
result = copilot.run("Explain the project architecture", scenario="onboarding")
print(result['answer'])
2. Security Audit¶
Comprehensive security audit: vulnerability detection, taint analysis, call graph analysis, CWE mapping.
Example Questions¶
Find SQL injection vulnerabilities
Show potential buffer overflows
Find unsanitized user input
Trace user input to query execution
What functions call LWLockAcquire?
Security Patterns Detected¶
- SQL Injection (CWE-89)
- Buffer Overflow (CWE-120)
- Command Injection (CWE-78)
- Format String (CWE-134)
- Integer Overflow (CWE-190)
Usage¶
from src.workflow import MultiScenarioCopilot
copilot = MultiScenarioCopilot()
result = copilot.run("Find SQL injection vulnerabilities")
for vuln in result['vulnerabilities']:
    print(f"CWE-{vuln['cwe']}: {vuln['description']}")
    print(f" File: {vuln['file']}:{vuln['line']}")
    print(f" Severity: {vuln['severity']}")
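When an audit returns many findings, it can help to order them by severity before reading individual entries. The sketch below assumes the result shape used above (cwe, description, file, line, severity keys) and assumes that severity is a string such as "critical", "high", "medium", or "low". The helper names and the sample data are invented for illustration, and the actual severity vocabulary may differ.

```python
from collections import defaultdict

# Assumed ordering of severity labels; adjust to what the audit actually returns.
SEVERITY_ORDER = ["critical", "high", "medium", "low"]


def group_by_severity(vulnerabilities):
    """Group vulnerability dicts by their lower-cased severity label."""
    groups = defaultdict(list)
    for vuln in vulnerabilities:
        groups[str(vuln.get("severity", "unknown")).lower()].append(vuln)
    return dict(groups)


def triage_order(vulnerabilities):
    """Return findings sorted most severe first; unknown labels go last."""
    def rank(vuln):
        label = str(vuln.get("severity", "unknown")).lower()
        if label in SEVERITY_ORDER:
            return SEVERITY_ORDER.index(label)
        return len(SEVERITY_ORDER)
    return sorted(vulnerabilities, key=rank)


# Hypothetical sample in the shape of result['vulnerabilities'].
sample = [
    {"cwe": 120, "description": "Unbounded copy", "file": "a.c", "line": 10, "severity": "medium"},
    {"cwe": 89, "description": "Query built from input", "file": "b.c", "line": 42, "severity": "high"},
]

for vuln in triage_order(sample):
    print(f"{vuln['severity']:>8}  CWE-{vuln['cwe']}  {vuln['file']}:{vuln['line']}")
```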
3. Documentation Generation¶
Auto-generate technical documentation from source code.
Example Questions¶
Generate API documentation
Document the transaction subsystem
Create a summary of the buffer manager
Usage¶
from src.workflow import MultiScenarioCopilot
copilot = MultiScenarioCopilot()
result = copilot.run("Document the transaction subsystem")
print(result['documentation'])
# Markdown-formatted documentation
4. Feature Development¶
Guidance for implementing new features: placement recommendations, pattern examples, dependency navigation.
Example Questions¶
Where should I add a new endpoint?
Where should I place new cache invalidation feature?
Find similar features to buffer management
Show pattern examples for executor subsystem
Features¶
- Optimal placement: Recommends the best file and nearby method for new code
- Pattern examples: Classifies existing methods (Initialization, Handler, Query, Validation, Cleanup patterns)
- Confidence scoring: Indicates how well the feature description matches the recommended subsystem
Usage¶
from src.workflow import MultiScenarioCopilot
copilot = MultiScenarioCopilot()
result = copilot.run("Where should I place new cache invalidation feature?")
print(result['answer'])
See Feature Development Scenario for detailed examples.
5. Refactoring¶
Refactoring recommendations with impact analysis: dead code, duplication, extract-method opportunities.
Example Questions¶
Find unreachable functions
Find duplicate code blocks
Plan refactoring of the buffer manager
Find functions to split
Detected Patterns¶
- Functions with no callers (dead code)
- Unreachable code blocks
- Copy-paste duplication
- Long functions for method extraction
Usage¶
from src.workflow import MultiScenarioCopilot
copilot = MultiScenarioCopilot()
result = copilot.run("Find unused functions")
for dead in result['dead_code']:
    print(f"Unused: {dead['name']} in {dead['file']}")
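The "functions with no callers" pattern can be illustrated on a plain call graph. The following is a minimal sketch, not CodeGraph's implementation: the function name find_uncalled, the edge-list input format, and the sample graph are all assumptions made for this example. Calls made through function pointers or callbacks are invisible to a check this simple, so its output should be read as a list of candidates.

```python
def find_uncalled(functions, call_edges, entry_points=()):
    """Return functions that never appear as a callee, excluding entry points."""
    called = {callee for _caller, callee in call_edges}
    keep = set(entry_points)
    return sorted(f for f in functions if f not in called and f not in keep)


# Hypothetical call graph: main -> parse -> helper; old_api is never called.
functions = ["main", "parse", "helper", "old_api"]
call_edges = [("main", "parse"), ("parse", "helper")]

print(find_uncalled(functions, call_edges, entry_points=["main"]))
```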
6. Performance Analysis¶
Identify performance bottlenecks: cyclomatic complexity, expensive loops, concurrency, memory allocation patterns.
Example Questions¶
Find N+1 database queries
Find functions with high cyclomatic complexity
Find O(n^2) patterns
Find race conditions
Show lock ordering issues
Metrics Analyzed¶
- Cyclomatic complexity
- Loop nesting depth
- Function length and call frequency
- Memory allocation patterns
- Thread safety (locks, race conditions)
Usage¶
from src.workflow import MultiScenarioCopilot
copilot = MultiScenarioCopilot()
result = copilot.run("Find functions with complexity > 20")
for func in result['complex_functions']:
    print(f"{func['name']}: complexity={func['complexity']}")
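For readers unfamiliar with the metric, cyclomatic complexity is commonly approximated as one plus the number of decision points in a function. The sketch below applies that approximation to Python source using the standard ast module. It is only an illustration of the metric: the set of node types counted is a choice made for this example, and CodeGraph may count differently.

```python
import ast

# Node types treated as decision points in this approximation.
_BRANCH_NODES = (ast.If, ast.For, ast.While, ast.ExceptHandler,
                 ast.IfExp, ast.comprehension)


def cyclomatic_complexity(source):
    """Approximate McCabe complexity: 1 + number of decision points."""
    complexity = 1
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, _BRANCH_NODES):
            complexity += 1
        elif isinstance(node, ast.BoolOp):
            # 'a and b and c' adds one short-circuit branch per extra operand.
            complexity += len(node.values) - 1
    return complexity


source = '''
def classify(n):
    if n < 0:
        return "negative"
    for d in range(2, n):
        if n % d == 0 and d > 1:
            return "composite"
    return "other"
'''

# Two ifs, one for loop, and one 'and' give 1 + 4 = 5.
print(cyclomatic_complexity(source))
```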
7. Test Coverage¶
Test coverage analysis and recommendations. Supports importing runtime coverage data from external tools.
Example Questions¶
Which functions are not covered by tests?
Find untested code
Show functions without tests
Importing Coverage Data¶
# Import pytest-cov JSON report
python -m src.cli coverage import --file coverage.json --format pytest-cov --db data/projects/postgres.duckdb
# Import lcov trace file
python -m src.cli coverage import --file coverage.lcov --format lcov
# Import Cobertura XML (Java/C#)
python -m src.cli coverage import --file coverage.xml --format cobertura --source-root /project
After importing, the “Find untested code” query automatically switches to hybrid mode, combining runtime coverage_percent values with heuristic test-caller analysis.
CPG-Based Test Recommendations¶
- Branch coverage: Counts IF/FOR/SWITCH control structures and estimates required test cases
- Parameter boundaries: Maps parameter types to boundary value suggestions (zero, null, empty, max)
- Error paths: Counts TRY blocks and multiple RETURN statements indicating error handling
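The first two recommendation types can be sketched in a few lines. This is an illustration under stated assumptions, not CodeGraph's logic: the type-to-boundary table, the helper names, and the "decision points plus one" estimate are all hypothetical choices made for this example.

```python
# Hypothetical type-to-boundary table for C-like parameter types.
BOUNDARY_VALUES = {
    "int": ["0", "-1", "INT_MIN", "INT_MAX"],
    "size_t": ["0", "SIZE_MAX"],
    "char *": ["NULL", '""'],
}


def suggest_boundaries(params):
    """Map (name, type) pairs to boundary values; unknown types get an empty list."""
    return {name: BOUNDARY_VALUES.get(ptype, []) for name, ptype in params}


def estimate_test_cases(control_structure_count):
    """Assumed heuristic: one test per independent path (decision points + 1)."""
    return control_structure_count + 1


# Hypothetical function signature with three parameters and three branches.
params = [("count", "int"), ("name", "char *"), ("ctx", "Context *")]
print(suggest_boundaries(params))
print(estimate_test_cases(3))
```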
8. Compliance Verification¶
Verify code compliance with standards: OWASP Top 10, GDPR, ISO 27001.
Example Questions¶
Check compliance with OWASP Top 10
Verify GDPR data handling requirements
Check ISO 27001 compliance
Usage¶
from src.workflow import MultiScenarioCopilot
copilot = MultiScenarioCopilot()
result = copilot.run("Check compliance with OWASP Top 10")
print(result['answer'])
9. Code Review¶
Automated code review for pull requests and merge requests (PR/MR).
Example Questions¶
Review this PR for issues
Find potential bugs in this change
Check for style violations
Analyze test coverage for changes
Usage¶
python demo_patch_review.py --patch changes.diff
Or programmatically:
from src.workflow import MultiScenarioCopilot
copilot = MultiScenarioCopilot()
result = copilot.run("Review changes in path/to/changes.diff")
for finding in result['findings']:
    print(f"{finding['severity']}: {finding['description']}")
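Review findings are often used to decide whether a change may be merged. The sketch below shows one possible gate, assuming findings are dicts with a string severity key as in the loop above. The helper names, the set of blocking labels, and the sample findings are hypothetical.

```python
from collections import Counter


def summarize_findings(findings):
    """Count findings per lower-cased severity label."""
    return Counter(str(f.get("severity", "unknown")).lower() for f in findings)


def should_block_merge(findings, blocking=("critical", "high")):
    """Return True if any finding carries one of the blocking severities."""
    blocking_labels = {label.lower() for label in blocking}
    return any(str(f.get("severity", "")).lower() in blocking_labels
               for f in findings)


# Hypothetical sample in the shape of result['findings'].
sample_findings = [
    {"severity": "high", "description": "Possible NULL dereference"},
    {"severity": "low", "description": "Inconsistent naming"},
]

print(dict(summarize_findings(sample_findings)))
print("block merge:", should_block_merge(sample_findings))
```

In a CI job, the boolean could be turned into a non-zero exit code so that the pipeline fails when a blocking finding is present.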
10. Cross-Repository Analysis¶
Cross-module dependency analysis and detection of code duplicated between repositories.
Example Questions¶
Find duplicate code between repo A and B
Show cross-module dependencies
Usage¶
from src.workflow import MultiScenarioCopilot
copilot = MultiScenarioCopilot()
result = copilot.run("Find duplicate code between repo A and B")
print(result['answer'])
11. Architecture Analysis¶
Detect architectural constraint violations, analyze subsystems, layers, and dependencies.
Example Questions¶
Find circular dependencies
Map the subsystem architecture
Show layer boundaries
Find architectural violations
Show subsystem diagram
Usage¶
from src.workflow import MultiScenarioCopilot
copilot = MultiScenarioCopilot()
result = copilot.run("Map the PostgreSQL architecture")
for subsystem in result['subsystems']:
    print(f"{subsystem['name']}: {subsystem['description']}")
12. Tech Debt Assessment¶
Quantify technical debt using indicators such as complexity, nesting depth, function length, and coupling.
Example Questions¶
Assess technical debt of this module
Find code with excessive coupling
Show modules needing refactoring
Identify maintenance hotspots
Debt Indicators¶
- High complexity
- Deep nesting
- Long functions
- High coupling
- Missing error handling
Usage¶
from src.workflow import MultiScenarioCopilot
copilot = MultiScenarioCopilot()
result = copilot.run("Find technical debt hotspots")
for debt in result['debt_items']:
    print(f"{debt['location']}: {debt['type']} (severity: {debt['severity']})")
13. Mass Refactoring¶
Automated mass refactoring: API migrations, renames.
Example Questions¶
Plan migration from v1 to v2 API
Rename function X to Y across all files
Usage¶
from src.workflow import MultiScenarioCopilot
copilot = MultiScenarioCopilot()
result = copilot.run("Plan migration from v1 to v2 API")
print(result['answer'])
14. Security Incident Response¶
Investigate security incidents: call-path tracing from entry points to vulnerabilities, CVE impact analysis, blast radius calculation, Mermaid attack path diagrams, taint flow analysis.
Example Questions¶
Trace the impact of CVE-XXXX
Find all code paths affected by this vulnerability
Show exploitation paths
Trace attack paths to vulnerable function
Find entry points that reach parse_query
Identify affected functions
Usage¶
from src.workflow import MultiScenarioCopilot
copilot = MultiScenarioCopilot()
result = copilot.run("Trace impact of vulnerability in parse_query")
# Attack paths from entry points to vulnerability
for path in result['metadata'].get('attack_paths', []):
    print(f"{path.entry_point} -> {path.vulnerability} (chain: {path.chain_length})")
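Call-path tracing from entry points to a vulnerable function can be pictured as a breadth-first search over the call graph. The sketch below is a minimal illustration of that idea, not the tool's algorithm: find_attack_paths, the adjacency-dict input, and every function name in the sample graph other than parse_query are assumptions made for this example.

```python
from collections import deque


def find_attack_paths(call_graph, entry_points, target):
    """Return the shortest call chain from each entry point that reaches target."""
    paths = {}
    for entry in entry_points:
        queue = deque([[entry]])
        visited = {entry}
        while queue:
            path = queue.popleft()
            node = path[-1]
            if node == target:
                paths[entry] = path
                break
            for callee in call_graph.get(node, []):
                if callee not in visited:
                    visited.add(callee)
                    queue.append(path + [callee])
    return paths


# Hypothetical call graph (caller -> callees).
call_graph = {
    "handle_request": ["authenticate", "run_query"],
    "run_query": ["parse_query"],
    "cli_main": ["print_help"],
}

paths = find_attack_paths(call_graph, ["handle_request", "cli_main"], "parse_query")
for entry, chain in paths.items():
    print(" -> ".join(chain))
```

Entry points that cannot reach the target are simply absent from the returned dict, which gives a rough view of which parts of the attack surface are affected.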
15. Debugging Support¶
Data-flow-based debugging assistance.
Example Questions¶
Find all elog(ERROR) locations
Show potential deadlocks
Find missing lock acquisitions
Usage¶
from src.workflow import MultiScenarioCopilot
copilot = MultiScenarioCopilot()
result = copilot.run("Find all elog(ERROR) locations")
print(result['answer'])
16. Entry Points & Attack Surface¶
API entry points and attack surface analysis: exported functions, hook functions.
Example Questions¶
Which functions accept user input?
Find all exported functions
Show main API entry points
Find hook functions
Usage¶
from src.workflow import MultiScenarioCopilot
copilot = MultiScenarioCopilot()
result = copilot.run("Find API entry points")
for entry in result['entry_points']:
    print(f"Entry: {entry['name']} ({entry['type']})")
17. File Editing¶
AST-based precise code editing.
Example Questions¶
Rename function X to Y across all files
18. Code Optimization¶
Comprehensive optimization: composite scenario that runs sub-scenarios S02, S05, S06, S11, S12 in parallel (60s timeout).
Example Questions¶
Optimize the authorization module
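Running sub-scenarios in parallel under a shared timeout can be sketched with the standard concurrent.futures module. This is an assumption about the general shape of such a runner, not CodeGraph's code: run_parallel and the stub callables are hypothetical, and the timeout is shortened from 60 seconds so the example finishes quickly.

```python
import time
from concurrent.futures import ThreadPoolExecutor, wait


def run_parallel(tasks, timeout_s):
    """Run named callables concurrently; return (results, errors, timed_out)."""
    executor = ThreadPoolExecutor(max_workers=max(1, len(tasks)))
    futures = {executor.submit(fn): name for name, fn in tasks.items()}
    done, not_done = wait(futures, timeout=timeout_s)
    results, errors = {}, {}
    for future in done:
        name = futures[future]
        if future.exception() is None:
            results[name] = future.result()
        else:
            errors[name] = future.exception()
    timed_out = sorted(futures[f] for f in not_done)
    # Running threads cannot be interrupted; this only drops queued work and
    # returns without waiting (cancel_futures requires Python 3.9+).
    executor.shutdown(wait=False, cancel_futures=True)
    return results, errors, timed_out


def slow_stub():
    """Stand-in for a sub-scenario that exceeds the deadline."""
    time.sleep(0.5)
    return "performance ok"


# Stub sub-scenarios keyed by the scenario IDs mentioned above.
tasks = {"S02": lambda: "security ok", "S05": lambda: "refactoring ok", "S06": slow_stub}
results, errors, timed_out = run_parallel(tasks, timeout_s=0.1)
print(sorted(results), timed_out)
```

Returning partial results together with the list of timed-out scenarios lets the caller report what completed instead of failing the whole composite run.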
19. Standards Check¶
Code standards verification: composite scenario that runs S08, S17, S18 sequentially (45s timeout).
Example Questions¶
Check code against project standards
20. Dependency Analysis¶
Dependency and import analysis: module dependency graph.
Example Questions¶
Show the module dependency tree
What modules depend on storage?
Find circular dependencies
Usage¶
from src.workflow import MultiScenarioCopilot
copilot = MultiScenarioCopilot()
result = copilot.run("Show dependencies of transaction module")
for dep in result['dependencies']:
    print(f"{dep['from']} -> {dep['to']}")
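Given dependency edges in the from/to shape printed above, circular dependencies can be found with a depth-first search that tracks the nodes on the current path. The sketch below is a generic illustration: find_cycle and the sample edges are hypothetical, and the recursive form is only suitable for graphs shallow enough to stay within Python's recursion limit.

```python
def find_cycle(edges):
    """Return one dependency cycle as a list of modules, or None if acyclic."""
    graph = {}
    for src, dst in edges:
        graph.setdefault(src, []).append(dst)
        graph.setdefault(dst, [])

    WHITE, GRAY, BLACK = 0, 1, 2  # unvisited, on current path, finished
    color = {node: WHITE for node in graph}
    stack = []

    def visit(node):
        color[node] = GRAY
        stack.append(node)
        for nxt in graph[node]:
            if color[nxt] == GRAY:
                # Back edge: the cycle is the path from nxt to node, closed on nxt.
                return stack[stack.index(nxt):] + [nxt]
            if color[nxt] == WHITE:
                cycle = visit(nxt)
                if cycle:
                    return cycle
        stack.pop()
        color[node] = BLACK
        return None

    for node in graph:
        if color[node] == WHITE:
            cycle = visit(node)
            if cycle:
                return cycle
    return None


# Hypothetical sample in the shape of result['dependencies'].
deps = [
    {"from": "executor", "to": "storage"},
    {"from": "storage", "to": "transaction"},
    {"from": "transaction", "to": "executor"},
]
edges = [(d["from"], d["to"]) for d in deps]
print(find_cycle(edges))
```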
21. Structural Pattern Search¶
Find code matching structural patterns with CPG-aware constraints (data flow, call graph, types, domain annotations).
Example Questions¶
Find unchecked return values
Find malloc without free
Show functions matching error-handling anti-patterns
Find SQL query construction without parameterization
Find all functions with cyclomatic complexity > 20
Pattern Types¶
- Syntactic: Tree-sitter CST patterns with metavariables ($VAR, $$ARGS, $_)
- CPG-constrained: Patterns with data flow, call graph, type, and domain constraints
- YAML rules: Pre-defined rules in configs/rules/ (190 rules across 14 languages)
Usage¶
CLI¶
# Ad-hoc pattern search
python -m src.cli patterns search "malloc($x)" --lang c
# Scan with all rules
python -m src.cli patterns scan
# Scan specific rule
python -m src.cli patterns scan --rule unchecked-return
# Generate rule from description
python -m src.cli patterns generate "find unchecked return values" --lang c
# Autofix (dry run)
python -m src.cli patterns fix --dry-run
Programmatic¶
from src.workflow import MultiScenarioCopilot
copilot = MultiScenarioCopilot()
result = copilot.run("Find unchecked return values", scenario="pattern_search")
for finding in result.get('findings', []):
    print(f"{finding['rule_id']}: {finding['file']}:{finding['line']}")
    print(f" {finding['message']}")
Combining Scenarios¶
Force a specific scenario through the context argument, or run the composite audit from the CLI to cover all dimensions:
from src.workflow import MultiScenarioCopilot
copilot = MultiScenarioCopilot()
# Force a specific scenario via context
result = copilot.run(
    "Analyze the executor module",
    context={"scenario_id": "scenario_2"}  # security
)
print(f"Answer: {result['answer']}")
# Or run the composite audit across all dimensions
# python -m src.cli audit --db PATH
Next Steps¶
- TUI User Guide - General usage
- API Reference - Programmatic access