CodeGraph analyzes its own codebase through the Code Property Graph after commits, creating a Plan-Act-Review feedback loop. When the review is run after a commit, Claude Code receives quality metrics, blast radius data, interface impact analysis, and a before/after comparison as context.
Table of Contents¶
- How It Works
- Usage Scenarios
- Scenario 1: Post-commit review with explicit analysis
- Scenario 2: Find and fix quality issues
- Scenario 3: Validate refactoring impact
- Scenario 4: On-demand analysis
- Pipeline Architecture
- Data flow
- Timeout budget
- CPG freshness and automatic update
- Automatic vs explicit freshness controls
- Method deduplication
- Delta report
- Interface impact detection
- Cross-module alerts
- Story coverage delta
- Hook Infrastructure
- CLI Commands
- dogfood status
- dogfood analyze
- dogfood report
- dogfood validate-claims
- dogfood trend
- dogfood validate-stories
- dogfood config-check
- dogfood maintain-db
- dogfood continue
- Configuration
- CommitReport
- Report Format
- Scaling to Other Projects
- Troubleshooting
How It Works¶
The dogfooding pipeline connects three runtime pieces:
- GoCPG builds and maintains a Code Property Graph (DuckDB) with pre-computed metrics for every method: cyclomatic complexity, fan-in/fan-out, TODO/FIXME flags, debug code, deprecated usage.
- Dogfood CLI and review runtime query the CPG for changed methods, compute quality metrics, blast radius, interface impact, cross-module alerts, and persist review traces in data/reviews/ when recovery is needed.
- Local service readiness checks surface whether supporting services are available. dogfood status reports CPG freshness, maintenance pressure, lock diagnostics, and OpenViking availability for the local development contour.
The result: recent changes can be evaluated with a traceable quality assessment without relying on local git hooks.
Usage Scenarios¶
Scenario 1: Post-commit review with explicit analysis¶
The primary local scenario. You work in Claude Code, make changes, commit, then inspect the diff explicitly:
You: "Commit these changes and analyze the result"
Claude: git add src/intent/classifier.py && git commit -m "refactor: extract pattern table"
Claude: python -m src.cli.import_commands dogfood analyze --base-ref HEAD~1
The review pipeline produces a report:
## Commit Analysis Report
**Summary:** 1 files, 45 methods, 2 high-CC, 3 TODO/FIXME, 128 affected callers
**CPG status:** fresh
**Impact of changes:**
- `_get_fallback_domain`: CC 29->8 (-21), FanOut 18->5 (-13)
**High complexity methods:**
- `classify` (CC: 17)
- `_score_domain` (CC: 16)
**Blast radius:** 128 callers affected
- `classify` called by: `run_intent_classifier`, `IntentBenchmark._evaluate_single` +126 more
**Interface changes detected:**
- **CLI**: `src/cli/intent_commands.py` (`add_intent_commands`, `_run_classify`)
**Cross-module alert** — related interfaces may need updates:
- Changed CLI → check MCP: `codegraph_intent`, `register_intent_tools`
Claude sees this context and can react: “The refactoring reduced _get_fallback_domain complexity from 29 to 8. Two methods still have CC>10: classify and _score_domain. CLI was changed — check if the MCP tool needs updating.”
What runs the analysis: explicit dogfood analyze, dogfood continue, or another runtime flow that calls the same review pipeline.
What does not run it automatically: plain terminal commits, git commit --amend, and older hook-based local flows that are no longer part of the recommended workflow.
Scenario 2: Find and fix quality issues¶
Use CPG queries to find code quality targets, then fix them with the pipeline providing feedback:
You: "Query the CPG for methods with CC > 15 and TODO/FIXME flags in src/workflow/"
Claude: [runs DuckDB query]
Found: _get_fallback_domain (CC=29, TODO), PolicyViolationsHandler.handle (CC=68, TODO)
You: "Refactor _get_fallback_domain to reduce complexity"
Claude: [extracts patterns to data table, replaces if/else chain with loop]
You: "Commit"
Claude: git commit -m "refactor: extract fallback patterns to class-level table"
Claude: python -m src.cli.import_commands dogfood analyze --base-ref HEAD~1
-> Review report shows: CC 29->8 (-21)
This is the full Plan-Act-Review loop:
1. Plan: CPG query identifies the problem
2. Act: Refactoring reduces complexity
3. Review: The review report confirms the improvement with concrete metrics
Scenario 3: Validate refactoring impact¶
Before making a large refactoring, check the blast radius:
You: "What's the blast radius if I change HierarchicalIntentClassifier.classify?"
Claude: [queries call_containment]
213 direct callers across production code and tests
You: "Proceed with the refactoring"
Claude: [makes changes, commits]
Claude: python -m src.cli.import_commands dogfood analyze --base-ref HEAD~1
-> Review report shows: 213 affected callers, CC unchanged, no regressions
The blast radius report helps gauge the risk of changes before they happen.
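The caller traversal behind a blast radius figure can be pictured as a bounded breadth-first walk over reverse call edges. The sketch below is illustrative only: `callers_of` is a hypothetical in-memory mapping standing in for the call_containment query, and `max_depth` mirrors the blast_radius_depth setting. It is not the project's actual implementation.

```python
from collections import deque
from typing import Dict, List, Set


def blast_radius(callers_of: Dict[str, List[str]], target: str, max_depth: int = 2) -> Set[str]:
    """Collect callers of `target` that are at most `max_depth` hops away."""
    affected: Set[str] = set()
    queue = deque([(target, 0)])
    while queue:
        method, depth = queue.popleft()
        if depth == max_depth:
            continue  # do not expand beyond the configured depth
        for caller in callers_of.get(method, []):
            if caller not in affected and caller != target:
                affected.add(caller)
                queue.append((caller, depth + 1))
    return affected


# Hypothetical call edges: key is a method, value lists its direct callers.
edges = {
    "classify": ["run_intent_classifier", "IntentBenchmark._evaluate_single"],
    "run_intent_classifier": ["main"],
    "main": ["entrypoint"],
}
print(sorted(blast_radius(edges, "classify", max_depth=2)))
```

With a depth of 2, `entrypoint` is left out because it sits three hops from `classify`.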
Scenario 4: On-demand analysis¶
Run analysis without committing:
# Analyze the last commit
python -m src.cli.import_commands dogfood analyze --base-ref HEAD~1
# Analyze changes between branches
python -m src.cli.import_commands dogfood analyze --base-ref origin/main
# Generate a full quality report
python -m src.cli.import_commands dogfood report --format markdown
# Validate numeric claims in documentation
python -m src.cli.import_commands dogfood validate-claims --path docs/
# Show quality trend across recent commits
python -m src.cli.import_commands dogfood trend --commits 20
Pipeline Architecture¶
Data flow¶
git commit
|
v
Explicit review run (`dogfood analyze --base-ref HEAD~1`)
|
+-- CPG freshness check
| Query cpg_git_state.commit_hash, compare to git rev-parse HEAD
|
+-- Pre-update metrics capture (for delta report)
| Query nodes_method for changed files BEFORE CPG update
| Store {full_name: {cc, fan_out, ...}} for later comparison
|
+-- CPG update if stale
| gocpg update --input=<source> --output=<db>
|
+-- Phase 1: Get changed files
| git diff --name-only HEAD~1 HEAD, filter code extensions
|
+-- Phase 2: Get changed methods from CPG
| Query nodes_method for changed files, deduplicate
|
+-- Phase 3: Quality summary
| Compute high-CC, high-fan_out, TODO, debug, deprecated counts
|
+-- Phase 4: Blast radius
| Query call_containment (or nodes_call fallback) for callers
|
+-- Phase 5: Interface impact detection
| Check if changed files belong to interface layers (CLI, REST API, MCP, ACP)
|
+-- Phase 6: Cross-module alerts
| Find related functions in OTHER interface layers by keyword matching
|
+-- Phase 7: Story coverage delta
| Flag layers that changed vs layers not covered
|
+-- Record quality snapshot (cpg_quality_history table)
|
+-- Output: {"additionalContext": "## Commit Analysis Report\n..."}
Injected back into Claude Code conversation
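Phase 1 of the flow above can be sketched as a git call followed by an extension filter. This is a minimal illustration, not the analyzer's code: the function names are hypothetical, and the extension set is an assumption limited to the extensions this page mentions.

```python
import subprocess
from pathlib import PurePosixPath
from typing import Iterable, List

# Assumed extension set; this page only names .py, .go and .c as examples.
CODE_EXTENSIONS = {".py", ".go", ".c"}


def filter_code_files(paths: Iterable[str]) -> List[str]:
    """Keep only paths whose suffix is a known code extension."""
    return [p for p in paths if PurePosixPath(p).suffix.lower() in CODE_EXTENSIONS]


def changed_code_files(base_ref: str = "HEAD~1", repo_path: str = ".") -> List[str]:
    """List code files changed between base_ref and HEAD."""
    result = subprocess.run(
        ["git", "diff", "--name-only", base_ref, "HEAD"],
        cwd=repo_path, capture_output=True, text=True, check=True,
    )
    lines = (line.strip() for line in result.stdout.splitlines())
    return filter_code_files(line for line in lines if line)
```

Keeping the filter separate from the git call makes the filtering rule easy to test without a repository.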
Timeout budget¶
The review runtime is budgeted to keep interactive runs bounded:
| Phase | Budget | Action |
|---|---|---|
| Freshness check | 2s | Compare cpg_git_state.commit_hash to git rev-parse HEAD |
| Pre-update metrics | ~1s | Query current metrics for changed files (for delta report) |
| CPG update if stale | 40s | Run gocpg update --input=<source> --output=<db> |
| Phases 1–7 | ~15s | Changed files, methods, quality, blast radius, interfaces, cross-module, story |
If any phase exceeds its budget, the runtime degrades gracefully: it produces whatever data it has or returns empty {}.
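The degradation rule can be sketched as a runner that keeps the results collected so far and stops at the first overrun or failure. This is an assumption-laden simplification: `run_phases` and the `Phase` tuple are hypothetical names, and the budget is checked only after a phase returns, whereas a real runtime would likely need to interrupt long-running work.

```python
import time
from typing import Any, Callable, Dict, List, Tuple

# (name, budget in seconds, work to run); a hypothetical shape for illustration.
Phase = Tuple[str, float, Callable[[], Any]]


def run_phases(phases: List[Phase], clock: Callable[[], float] = time.monotonic) -> Dict[str, Any]:
    """Run phases in order; stop at the first overrun or failure, keep earlier results."""
    results: Dict[str, Any] = {}
    for name, budget, work in phases:
        started = clock()
        try:
            value = work()
        except Exception:
            break  # a failed phase ends the run without raising to the caller
        if clock() - started > budget:
            break  # over budget: drop this phase, return what was already collected
        results[name] = value
    return results
```

If the very first phase fails or overruns, the result is an empty dict, which corresponds to the empty {} output described above.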
CPG freshness and automatic update¶
The runtime checks CPG freshness by comparing cpg_git_state.commit_hash to git rev-parse HEAD. If stale, it runs gocpg update:
gocpg update --input=<source_path> --output=<db_path>
Hook Infrastructure¶
The legacy hook/runtime support code is still relevant as implementation detail even though the recommended workflow is now explicit dogfood analyze / dogfood continue execution instead of background post-commit hooks.
Key helper modules in src/dogfooding/hooks/:
- _feedback.py maps pipeline results into ReviewFinding and ReviewFeedback records.
- _metrics.py exposes hook_metrics counters and timing helpers for local observability.
- _utils.py provides shared helpers such as timed_hook wrappers and runtime-safe formatting.
- _session_cache.py manages session_cache state used to correlate repeated local review runs.
These internals matter when you troubleshoot degraded hook behavior, inspect fallback paths, or compare the current explicit runtime against older hook-based experiments.
Running gocpg update triggers an incremental update of the CPG database. The CPGFreshnessChecker class in src/dogfooding/cpg_freshness.py manages this:
from src.dogfooding.cpg_freshness import CPGFreshnessChecker
checker = CPGFreshnessChecker(db_path, repo_path=".", gocpg_binary="gocpg/gocpg.exe")
checker.is_fresh() # True if CPG commit == HEAD
checker.commits_behind() # Number of commits CPG is behind
checker.ensure_fresh(timeout=40.0, source_path=".") # Update if stale
checker.status() # Full status dict
Freshness checks now include git-head fallback logic for environments where git rev-parse HEAD is unreliable in subprocesses. The checker resolves HEAD from .git/HEAD, refs, and packed-refs (including worktree indirection) before returning head_commit as unknown.
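A file-based HEAD lookup of this kind could look roughly like the sketch below. It is not the checker's actual code: the function names are hypothetical, and it relies only on the standard git repository layout (a `HEAD` file, loose refs, `packed-refs`, a `gitdir:` pointer file and a `commondir` file for worktrees). It ignores chained symbolic refs and the reftable backend.

```python
from pathlib import Path


def _git_dir(repo_path: Path) -> Path:
    """Return the git directory, following the `gitdir:` pointer file used by worktrees."""
    dot_git = repo_path / ".git"
    if dot_git.is_file():
        pointer = dot_git.read_text(encoding="utf-8").strip()
        if pointer.startswith("gitdir:"):
            target = Path(pointer[len("gitdir:"):].strip())
            return target if target.is_absolute() else (repo_path / target).resolve()
    return dot_git


def _common_dir(git_dir: Path) -> Path:
    """Worktree git dirs name the shared directory in a `commondir` file."""
    marker = git_dir / "commondir"
    if marker.is_file():
        rel = Path(marker.read_text(encoding="utf-8").strip())
        return rel if rel.is_absolute() else (git_dir / rel).resolve()
    return git_dir


def resolve_head(repo_path: str = ".") -> str:
    """Resolve HEAD to a commit hash without spawning git; 'unknown' on failure."""
    git_dir = _git_dir(Path(repo_path))
    head_file = git_dir / "HEAD"
    if not head_file.is_file():
        return "unknown"
    head = head_file.read_text(encoding="utf-8").strip()
    if not head.startswith("ref:"):
        return head or "unknown"  # a detached HEAD stores the hash directly
    ref = head[len("ref:"):].strip()
    for base in (git_dir, _common_dir(git_dir)):
        loose = base / ref
        if loose.is_file():
            return loose.read_text(encoding="utf-8").strip()
        packed = base / "packed-refs"
        if packed.is_file():
            for line in packed.read_text(encoding="utf-8").splitlines():
                if line.startswith(("#", "^")):
                    continue  # header comment or peeled-tag line
                sha, _, name = line.partition(" ")
                if name.strip() == ref:
                    return sha
    return "unknown"
```

Loose refs are checked before `packed-refs` because git treats a loose ref as the newer value when both exist.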
When update fails with DuckDB lock contention, detailed diagnostics include lock classification, lock-holder PIDs, optional auto-unlock attempt results, and actionable next_step / next_command guidance.
Freshness is reported in two forms:
- is_fresh_strict: exact commit match (cpg_commit == head_commit)
- is_fresh: effective freshness (strict match OR no CPG-relevant file changes between commits, e.g. docs-only commits)
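The difference between the two forms can be sketched as follows. The function name, its arguments, and the extension-based notion of a "CPG-relevant" file are assumptions made for illustration; only the two result keys come from the text above.

```python
from typing import Dict, Iterable

# Assumed definition of "CPG-relevant": files with these extensions (hypothetical set).
CPG_RELEVANT_EXTENSIONS = (".py", ".go", ".c")


def freshness(cpg_commit: str, head_commit: str, files_changed_between: Iterable[str]) -> Dict[str, bool]:
    """Compute strict and effective freshness for a CPG built at cpg_commit."""
    strict = cpg_commit == head_commit
    # Effective freshness tolerates commits that touch no CPG-relevant files.
    relevant_change = any(path.endswith(CPG_RELEVANT_EXTENSIONS) for path in files_changed_between)
    return {"is_fresh_strict": strict, "is_fresh": strict or not relevant_change}


print(freshness("abc123", "def456", ["docs/guide.md"]))
```

A docs-only commit therefore leaves is_fresh true while is_fresh_strict is false.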
Automatic vs explicit freshness controls¶
| Mechanism | Trigger | Typical use |
|---|---|---|
| dogfood status | Explicit/manual | Inspect freshness, lock diagnostics, maintenance due, and OpenViking readiness |
| dogfood analyze | Explicit/manual | Run bounded post-commit or branch-diff review |
| codegraph_watch check / codegraph_watch update | Explicit/manual | Deterministic freshness checks in headless and CI |
| CPGFreshnessChecker.ensure_fresh_with_details() | Explicit/manual | Programmatic control and machine-readable failure diagnostics |
If your workflow depends on guaranteed freshness before analysis, prefer explicit codegraph_watch update or dogfood status over any legacy background automation.
Method deduplication¶
GoCPG may store the same method with different filename formats (forward slash src/file.py vs backslash src\file.py). The analyzer deduplicates by normalizing full_name slashes and keeping the entry with the highest CC value:
# Before dedup: 2 entries for the same method
src\intent\classifier.py:Classifier.classify CC=17
src/intent/classifier.py:Classifier.classify CC=0 (from incremental update)
# After dedup: 1 entry, highest CC wins
src\intent\classifier.py:Classifier.classify CC=17
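The deduplication rule can be sketched in a few lines. The row shape (`full_name` and `cc` keys) is an assumption for illustration, and `dedupe_methods` is a hypothetical name rather than the analyzer's function.

```python
from typing import Dict, List


def dedupe_methods(rows: List[dict]) -> List[dict]:
    """Collapse rows that differ only in slash style, keeping the highest CC."""
    best: Dict[str, dict] = {}
    for row in rows:
        key = row["full_name"].replace("\\", "/")  # normalize to forward slashes
        current = best.get(key)
        if current is None or row["cc"] > current["cc"]:
            best[key] = row
    return list(best.values())


rows = [
    {"full_name": r"src\intent\classifier.py:Classifier.classify", "cc": 17},
    {"full_name": "src/intent/classifier.py:Classifier.classify", "cc": 0},
]
print(dedupe_methods(rows))
```

The surviving row keeps its original spelling of the path, which matches the example above where the backslash entry with CC=17 wins.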
Delta report¶
When the CPG is stale and needs an update, the runtime captures pre-update metrics before running gocpg update, then compares them against post-update metrics. This produces a delta showing the actual impact of the changes:
**Impact of changes:**
- `_get_fallback_domain`: CC 29->8 (-21), FanOut 18->5 (-13)
- `_build_quality_summary`: CC 5->3 (-2)
Methods with no metric changes are omitted.
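A minimal sketch of the comparison, assuming metrics are held as `{name: {"cc": ..., "fan_out": ...}}` dictionaries captured before and after the update. The function name and the label mapping are hypothetical; the output format follows the report lines shown above.

```python
from typing import Dict, List

# Metric key -> label used in the report line (assumed mapping).
LABELS = {"cc": "CC", "fan_out": "FanOut"}


def metric_deltas(before: Dict[str, Dict[str, int]], after: Dict[str, Dict[str, int]]) -> List[str]:
    """Format before->after changes; methods with no metric change are omitted."""
    lines: List[str] = []
    for name in sorted(before.keys() & after.keys()):
        parts = []
        for key, label in LABELS.items():
            old, new = before[name].get(key), after[name].get(key)
            if old is None or new is None or old == new:
                continue
            parts.append(f"{label} {old}->{new} ({new - old:+d})")
        if parts:
            lines.append(f"- `{name}`: " + ", ".join(parts))
    return lines


before = {"_get_fallback_domain": {"cc": 29, "fan_out": 18}, "unchanged": {"cc": 3, "fan_out": 1}}
after = {"_get_fallback_domain": {"cc": 8, "fan_out": 5}, "unchanged": {"cc": 3, "fan_out": 1}}
print("\n".join(metric_deltas(before, after)))
```

Only methods present in both snapshots are compared, so newly added or deleted methods do not appear in this sketch.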
Interface impact detection¶
The analyzer tracks 4 interface layers defined in INTERFACE_LAYERS:
| Layer | Path Patterns | Description |
|---|---|---|
| CLI | src/cli/ | CLI commands |
| REST API | src/api/routers/ | API endpoints |
| MCP | src/mcp/tools/, src/mcp/ | MCP tools |
| ACP | src/acp/server/, src/acp/ | ACP handlers |
When a changed file belongs to an interface layer, the report includes an “Interface changes detected” section listing affected layers and methods.
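Layer detection by path prefix can be sketched as below. The prefixes are copied from the table above, but the real shape of INTERFACE_LAYERS in the codebase is not shown on this page, so the dictionary layout and both function names are assumptions.

```python
from typing import Dict, List, Optional, Tuple

# Path prefixes from the table above; the dictionary layout is an assumption.
INTERFACE_LAYERS: Dict[str, Tuple[str, ...]] = {
    "CLI": ("src/cli/",),
    "REST API": ("src/api/routers/",),
    "MCP": ("src/mcp/tools/", "src/mcp/"),
    "ACP": ("src/acp/server/", "src/acp/"),
}


def layer_for(path: str) -> Optional[str]:
    """Return the interface layer a file belongs to, or None."""
    normalized = path.replace("\\", "/")
    for layer, prefixes in INTERFACE_LAYERS.items():
        if normalized.startswith(prefixes):
            return layer
    return None


def interface_impacts(changed_files: List[str]) -> Dict[str, List[str]]:
    """Group changed files by the interface layer they touch."""
    impacts: Dict[str, List[str]] = {}
    for path in changed_files:
        layer = layer_for(path)
        if layer is not None:
            impacts.setdefault(layer, []).append(path)
    return impacts


print(interface_impacts(["src/cli/intent_commands.py", "src/intent/classifier.py"]))
```

Files outside every listed prefix, such as `src/intent/classifier.py`, produce no interface entry.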
Cross-module alerts¶
When a file in one interface layer changes, the analyzer searches for related functions in OTHER layers by extracting keywords from the changed filename and querying the CPG. For example, if src/cli/reindex_commands.py changes, it looks for functions with “reindex” in their name in MCP, REST API, etc.
The report includes a “Cross-module alert” section suggesting which layers to check:
**Cross-module alert** — related interfaces may need updates:
- Changed CLI → check MCP: `codegraph_reindex`, `register_reindex_tools`
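The keyword matching could be approximated as follows. This is a sketch under stated assumptions: the suffix list stripped from the filename is hypothetical, and `functions_by_layer` is an in-memory stand-in for the CPG query that returns function names per layer.

```python
import re
from pathlib import PurePosixPath
from typing import Dict, List


def filename_keyword(path: str) -> str:
    """Derive a search keyword from a filename, e.g. reindex_commands.py -> reindex."""
    stem = PurePosixPath(path.replace("\\", "/")).stem
    # Assumed convention: drop a trailing role suffix such as "_commands".
    return re.sub(r"_(commands|tools|router|handlers)$", "", stem).lower()


def cross_module_alerts(
    changed_path: str, changed_layer: str, functions_by_layer: Dict[str, List[str]]
) -> Dict[str, List[str]]:
    """Find functions in other layers whose names contain the keyword."""
    keyword = filename_keyword(changed_path)
    if not keyword:
        return {}
    alerts: Dict[str, List[str]] = {}
    for layer, names in functions_by_layer.items():
        if layer == changed_layer:
            continue  # only other layers are interesting
        matches = [name for name in names if keyword in name.lower()]
        if matches:
            alerts[layer] = matches
    return alerts


functions = {
    "CLI": ["add_reindex_commands"],
    "MCP": ["codegraph_reindex", "register_reindex_tools", "codegraph_intent"],
    "REST API": ["list_projects"],
}
print(cross_module_alerts("src/cli/reindex_commands.py", "CLI", functions))
```

Substring matching is deliberately loose, so the alert is a prompt to check related code rather than proof that an update is required.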
Story coverage delta¶
When interface layers are changed, the analyzer flags which other layers may need updates to maintain feature parity. The report includes a “Story coverage check” section:
**Story coverage check** — verify other interfaces:
- CLI changed (`reindex`), check: MCP, REST API, ACP
Runtime Checks¶
dogfood status is the primary readiness probe for the current local workflow. It reports:
- CPG freshness and commit lag
- maintenance pressure for the DuckDB file
- lock diagnostics and recovery guidance
- persisted review trace state from data/reviews/
- OpenViking availability for the local development stack
CLI Commands¶
All commands are accessed via python -m src.cli.import_commands dogfood <subcommand>.
dogfood status¶
Check CPG freshness, review state, and local runtime readiness:
python -m src.cli.import_commands dogfood status [--db PATH]
dogfood analyze¶
Run commit analysis on demand:
python -m src.cli.import_commands dogfood analyze [--base-ref HEAD~1] [--db PATH]
dogfood report¶
Generate quality report (markdown or JSON):
python -m src.cli.import_commands dogfood report [--format markdown|json] [--db PATH]
dogfood validate-claims¶
Validate numeric claims in documentation against the CPG. Extracts numbers from markdown (e.g., “95 handlers”, “12 scenarios”) and verifies via SQL:
python -m src.cli.import_commands dogfood validate-claims [--path PATH] [--db PATH]
Claim rules are defined in config.yaml → dogfooding.claims_validation.rules[]. Each rule maps keywords (English + Russian) to a SQL query:
claims_validation:
  enabled: true
  timeout: 5.0
  rules:
    - keywords: ["handlers", "обработчиков"]
      sql: "SELECT COUNT(DISTINCT full_name) FROM nodes_method WHERE ..."
      description: "Scenario handler methods"
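How a rule turns into a check can be sketched as: extract every number that precedes one of the rule's keywords, then compare it with the value the rule's SQL returns. The function names and the `run_sql` callable are hypothetical; the real command runs the query against the CPG database.

```python
import re
from typing import Callable, List


def extract_claims(markdown: str, keywords: List[str]) -> List[int]:
    """Find numbers immediately followed by one of the keywords, e.g. '95 handlers'."""
    alternatives = "|".join(re.escape(k) for k in keywords)
    pattern = re.compile(r"(\d+)\s+(?:" + alternatives + r")\b", re.IGNORECASE)
    return [int(m.group(1)) for m in pattern.finditer(markdown)]


def validate_claims(markdown: str, rule: dict, run_sql: Callable[[str], int]) -> List[dict]:
    """Compare each claimed number with the value returned for the rule's SQL."""
    actual = run_sql(rule["sql"])
    return [
        {"claimed": claimed, "actual": actual, "ok": claimed == actual}
        for claimed in extract_claims(markdown, rule["keywords"])
    ]


rule = {"keywords": ["handlers", "обработчиков"], "sql": "SELECT ..."}
text = "We ship 95 handlers and 12 scenarios. Всего 90 обработчиков."
# run_sql is stubbed here; a real run would execute rule["sql"] against DuckDB.
print(validate_claims(text, rule, run_sql=lambda sql: 95))
```

In this stubbed run the first claim agrees with the query result and the second does not, while "12 scenarios" is ignored because no keyword of this rule matches it.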
dogfood trend¶
Show quality trend across recent commits from the cpg_quality_history table:
python -m src.cli.import_commands dogfood trend [--commits N] [--db PATH]
Output is an ASCII table with columns: Commit, Date, Methods, Avg CC, Dead, Hi-CC, TODO.
Quality snapshots are recorded automatically after each commit analysis via record_snapshot() in src/dogfooding/quality_history.py.
dogfood validate-stories¶
Validate user story interface coverage via CPG using StoryValidationRunner:
python -m src.cli.import_commands dogfood validate-stories [--stories 2,8,11] [--path FILE] [--output FILE] [--db PATH] [--go-db PATH]
dogfood config-check¶
Detect orphan configuration parameters by cross-referencing YAML config, schema, and code usage:
python -m src.cli.import_commands dogfood config-check [--format text|json|csv] [--level error|warning|info|all] [--fix-suggestions] [--config PATH] [--schema PATH] [--source DIR...]
| Parameter | Default | Description |
|---|---|---|
| --format | text | Output format: text, json, or csv |
| --level | all | Minimum severity level to show |
| --fix-suggestions | off | Show fix suggestions for each finding |
| --config | config.yaml | Path to YAML config file |
| --schema | src/config/unified_config.py | Path to schema file |
| --source | src/ | Source directories to scan (multiple allowed) |
Detects 6 orphan types: yaml_unused, yaml_missing, code_orphan, path_mismatch, orphaned_dataclass, unused_default. Uses ConfigOrphanAnalyzer from src/analysis/config_analyzer.py.
dogfood maintain-db¶
Perform routine CPG maintenance and cleanup:
python -m src.cli.import_commands dogfood maintain-db [--db PATH] [--force] [--json]
| Parameter | Default | Description |
|---|---|---|
| --db | auto-detected | DuckDB database path |
| --force | off | Continue even when the command detects a risky state |
| --json | off | Return machine-readable maintenance details |
Use this command when quality history tables, review traces, or stale maintenance markers need a controlled cleanup step.
dogfood continue¶
Resume an interrupted dogfooding workflow from the stored review state:
python -m src.cli.import_commands dogfood continue [--db PATH] [--review-dir PATH] [--json]
| Parameter | Default | Description |
|---|---|---|
| --db | auto-detected | DuckDB database path |
| --review-dir | data/reviews | Directory with persisted review state |
| --json | off | Return machine-readable status for automation |
This is the recovery path when a post-commit review was interrupted and you want to continue from the last saved checkpoint instead of starting over.
Configuration¶
In config.yaml:
dogfooding:
  enabled: true
  auto_update_cpg: true # Run gocpg update if CPG is stale
  cpg_update_timeout: 40 # Seconds for CPG update
  analysis_timeout: 16 # Seconds for quality + blast radius
  cc_threshold: 10 # Flag methods with CC above this
  fan_out_threshold: 30 # Flag methods with fan_out above this
  blast_radius_depth: 2 # Max depth for caller traversal
  max_files_per_commit: 15 # Max files to analyze per commit
  report_format: markdown # markdown or json
  record_quality_history: true # Record QualitySnapshot per commit
  quality_history_db_path: data/quality_history.duckdb # Optional separate DB for snapshots
  include_paths: # Limit dogfooding to selected source roots
    - src
    - tests
  exclude_paths: # Skip generated or third-party code
    - .venv
    - node_modules
  claims_validation:
    enabled: true
    timeout: 5.0 # Seconds per claim query
    rules: # Keyword→SQL mappings for validate-claims
      - keywords: ["handlers", "обработчиков"]
        sql: "SELECT COUNT(...) FROM nodes_method WHERE ..."
        description: "Scenario handler methods"
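The two thresholds feed the quality summary. A minimal sketch of how they might be applied is shown below; the function name, the row keys (`name`, `cc`, `fan_out`), and the result keys are assumptions, while the default values and the strict "above this" comparison follow the configuration comments.

```python
from typing import Dict, List


def quality_summary(
    methods: List[dict], cc_threshold: int = 10, fan_out_threshold: int = 30
) -> Dict[str, List[str]]:
    """Flag methods whose CC or fan_out is strictly above the configured threshold."""
    return {
        "high_cc": [m["name"] for m in methods if m.get("cc", 0) > cc_threshold],
        "high_fan_out": [m["name"] for m in methods if m.get("fan_out", 0) > fan_out_threshold],
    }


methods = [
    {"name": "classify", "cc": 17, "fan_out": 39},
    {"name": "_score_domain", "cc": 16, "fan_out": 12},
    {"name": "helper", "cc": 10, "fan_out": 30},
]
print(quality_summary(methods))
```

A method sitting exactly on a threshold, like `helper` here, is not flagged under this reading of "above".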
CommitReport¶
The CommitReport dataclass (src/dogfooding/commit_analyzer.py) holds the full analysis result:
| Field | Type | Description |
|---|---|---|
| changed_files | List[str] | Code files changed in the commit |
| changed_methods | List[dict] | Methods in changed files (deduplicated) |
| blast_radius | Dict | {"callers": {method: [callers]}, "total_affected": N} |
| quality_summary | Dict | High-CC, high-fan_out, TODO, debug, deprecated counts |
| interface_impacts | List[dict] | Interface layers affected (CLI, REST API, MCP, ACP) |
| cross_module_alerts | List[dict] | Related functions in other interface layers |
| story_coverage_delta | List[dict] | Story coverage gaps across layers |
| is_cpg_fresh | bool | Whether CPG was up-to-date |
| analysis_time_ms | int | Total analysis time in milliseconds |
| deltas | List[dict] | Before→after metric changes |
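For orientation, the field list can be mirrored as a dataclass. This is a sketch only: field names and types follow the table, but the class name is deliberately different from the real one, and the default values are assumptions because the page does not show them.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class CommitReportSketch:
    """Illustrative mirror of the CommitReport fields; defaults are assumed."""
    changed_files: List[str] = field(default_factory=list)
    changed_methods: List[dict] = field(default_factory=list)
    blast_radius: Dict = field(default_factory=dict)
    quality_summary: Dict = field(default_factory=dict)
    interface_impacts: List[dict] = field(default_factory=list)
    cross_module_alerts: List[dict] = field(default_factory=list)
    story_coverage_delta: List[dict] = field(default_factory=list)
    is_cpg_fresh: bool = False
    analysis_time_ms: int = 0
    deltas: List[dict] = field(default_factory=list)


report = CommitReportSketch(
    changed_files=["src/intent/classifier.py"],
    blast_radius={"callers": {"classify": ["run_intent_classifier"]}, "total_affected": 1},
    is_cpg_fresh=True,
)
print(report.blast_radius["total_affected"], report.is_cpg_fresh)
```

Using `default_factory` avoids sharing one mutable list or dict between report instances.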
Report Format¶
The review pipeline returns a markdown report as additionalContext in JSON:
{"additionalContext": "## Commit Analysis Report\n**Summary:** ..."}
Full report structure (sections are omitted when empty):
## Commit Analysis Report
**Summary:** 3 files, 45 methods, 2 high-CC, 1 TODO/FIXME, 128 affected callers
**CPG status:** fresh
**Impact of changes:**
- `_get_fallback_domain`: CC 29->8 (-21), FanOut 18->5 (-13)
**High complexity methods:**
- `classify` (CC: 17)
- `_score_domain` (CC: 16)
**High fan-out methods:**
- `classify` (fan_out: 39)
**Blast radius:** 128 callers affected
- `classify` called by: `run_intent_classifier`, `IntentBenchmark._evaluate_single` +126 more
- `_classify_domain` called by: `classify`, `get_morph` +4 more
**Interface changes detected:**
- **CLI**: `src/cli/intent_commands.py` (`add_intent_commands`, `_run_classify`)
**Cross-module alert** — related interfaces may need updates:
- Changed CLI → check MCP: `codegraph_intent`, `register_intent_tools`
**Story coverage check** — verify other interfaces:
- CLI changed (`intent`), check: MCP, REST API, ACP
*Analysis completed in 95ms*
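The "omitted when empty" behaviour and the JSON wrapping can be sketched as below. Both function names and the `sections` argument are hypothetical; only the heading text, the bold section labels, and the additionalContext key are taken from this page.

```python
import json
from typing import Dict, List


def render_report(summary: str, cpg_status: str, sections: Dict[str, List[str]]) -> str:
    """Build the markdown report, skipping sections that have no items."""
    lines = [
        "## Commit Analysis Report",
        f"**Summary:** {summary}",
        f"**CPG status:** {cpg_status}",
    ]
    for title, items in sections.items():
        if not items:
            continue  # empty sections are left out of the report
        lines.append(f"**{title}:**")
        lines.extend(f"- {item}" for item in items)
    return "\n".join(lines)


def as_additional_context(markdown: str) -> str:
    """Wrap the markdown in the JSON envelope shown above."""
    return json.dumps({"additionalContext": markdown})


markdown = render_report(
    "1 files, 45 methods",
    "fresh",
    {"High complexity methods": ["`classify` (CC: 17)"], "High fan-out methods": []},
)
print(as_additional_context(markdown))
```

A consumer can recover the markdown with `json.loads(...)["additionalContext"]`.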
Scaling to Other Projects¶
The dogfooding pipeline is project-agnostic. To set up for any project:
- Import the project to create a CPG database:
  python -m src.cli import /path/to/project --language python
- Register the project in config.yaml:
  projects:
    active: my_project
    registry:
      my_project:
        db_path: data/projects/my_project.duckdb
        source_path: /path/to/project
        language: python
        domain: python_generic
- Verify the local runtime:
  python -m src.cli.import_commands dogfood status --db data/projects/my_project.duckdb
  python -m src.cli.import_commands dogfood analyze --base-ref HEAD~1 --db data/projects/my_project.duckdb
The dogfood commands read the active project from config.yaml and resolve the correct database path automatically.
Troubleshooting¶
Analysis produces empty {} output:
- Check that the database file exists at the configured db_path
- Verify the active project in config.yaml has a valid db_path
- Run python -m src.cli.import_commands dogfood status to check freshness
- Ensure the commit changed code files (.py, .go, .c, etc.), not just docs or configs
CPG always shows stale:
- Ensure gocpg binary exists at gocpg/gocpg.exe (or the configured GOCPG_PATH)
- Run python -m src.cli.import_commands dogfood status and inspect recommended_next_action
- Try manual update: gocpg/gocpg.exe update --input=. --output=<db>
CC values are 0 after incremental update:
- Incremental gocpg update may skip MethodMetricsPass for some entries. New entries can have cyclomatic_complexity=0.
- The deduplication logic keeps the entry with the highest CC value, mitigating this.
- If persistent, re-import the project from scratch: python -m src.cli import /path/to/source
DuckDB lock error (“file is being used by another process”):
- Another gocpg.exe process is running (for example from gocpg watch or a concurrent refresh).
- The runtime uses read-only connections and handles lock errors gracefully, falling back to subprocess queries.
- codegraph_watch update / ensure_fresh_with_details() return lock diagnostics (failure_kind=db_lock, locker_pids, auto_unlock_*, next_command) to speed up recovery.
- If the locker PID is the current Python process, auto-unlock intentionally skips killing itself; run the suggested next_command after closing the locker.
Delta report not appearing:
- The delta report only appears when the CPG was stale before the update (pre-update metrics were captured).
- If the CPG is already fresh (e.g., gocpg watch updated it), there are no pre-update metrics to compare against.
Timeout exceeded:
- The 58s budget (60s Claude Code limit minus 2s margin) accommodates most commits. For very large projects, gocpg update may exceed the 40s phase budget.
- Reduce max_files_per_commit in config.
- Ensure GoCPG indexes are up to date: gocpg/gocpg.exe index --db=<db>
OpenViking is missing from status:
- Start the local stack and confirm the OpenViking service is listening on the configured port.
- Run python -m src.cli.import_commands dogfood status again and inspect the openviking_status section.