Dogfooding Guide: CPG-Powered Commit Analysis¶
CodeGraph analyzes its own codebase through the Code Property Graph after every commit, creating a Plan-Act-Review feedback loop. Claude Code receives quality metrics, blast radius data, and before/after impact comparison as context immediately after committing code.
Table of Contents¶
- How It Works
- Usage Scenarios
- Scenario 1: Automatic post-commit feedback
- Scenario 2: Find and fix quality issues
- Scenario 3: Validate refactoring impact
- Scenario 4: On-demand analysis
- Pipeline Architecture
- Data flow
- Timeout budget
- CPG freshness and force re-parse
- Method deduplication
- Delta report
- Setup
- CLI Commands
- Configuration
- Report Format
- Scaling to Other Projects
- Troubleshooting
How It Works¶
The dogfooding pipeline connects three systems:
- GoCPG builds and maintains a Code Property Graph (DuckDB) with pre-computed metrics for every method: cyclomatic complexity, fan-in/fan-out, TODO/FIXME flags, debug code, and deprecated usage.
- Git hooks trigger CPG updates after each commit, keeping the database synchronized with the codebase.
- Claude Code hooks fire after `git commit` commands, query the CPG for changed methods, compute quality metrics and blast radius, then inject the report back into the conversation as additional context.
The result: every commit produces an instant quality assessment without leaving the IDE or running separate tools.
Usage Scenarios¶
Scenario 1: Automatic post-commit feedback¶
The primary scenario. You work in Claude Code, make changes, commit:
You: "Commit these changes"
Claude: git add src/intent/classifier.py && git commit -m "refactor: extract pattern table"
The PostToolUse hook fires automatically and injects a report:
## Commit Analysis Report
**Summary:** 1 files, 45 methods, 2 high-CC, 3 TODO/FIXME, 128 affected callers
**CPG status:** fresh
**Impact of changes:**
- `_get_fallback_domain`: CC 29->8 (-21), FanOut 18->5 (-13)
**High complexity methods:**
- `classify` (CC: 17)
- `_score_domain` (CC: 16)
**Blast radius:** 128 callers affected
- `classify` called by: `run_intent_classifier`, `IntentBenchmark._evaluate_single` +126 more
Claude sees this context and can react: “The refactoring reduced _get_fallback_domain complexity from 29 to 8. Two methods still have CC>10: classify and _score_domain.”
What triggers the hook: Any git commit command executed via the Bash tool. The hook detects "git commit" in the command string. Non-commit Bash commands (e.g., git status, ls) are ignored.
What does NOT trigger it: Direct terminal commits outside Claude Code, git commit --amend, or commits via other tools.
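The trigger check is plain substring matching on the command the Bash tool ran. A minimal sketch of that logic, assuming the detection rules described above (the actual implementation lives in `.claude/hooks/commit_analysis.py` and may differ):

```python
def should_analyze(command: str) -> bool:
    """Decide whether a Bash command warrants commit analysis.

    Sketch: fire on any command containing "git commit", but skip
    amends, which the hook ignores per the rules above.
    """
    if "git commit" not in command:
        return False  # git status, ls, etc. are ignored
    if "--amend" in command:
        return False  # amends do not trigger analysis
    return True
```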
Scenario 2: Find and fix quality issues¶
Use CPG queries to find code quality targets, then fix them with the pipeline providing feedback:
You: "Query the CPG for methods with CC > 15 and TODO/FIXME flags in src/workflow/"
Claude: [runs DuckDB query]
Found: _get_fallback_domain (CC=29, TODO), PolicyViolationsHandler.handle (CC=68, TODO)
You: "Refactor _get_fallback_domain to reduce complexity"
Claude: [extracts patterns to data table, replaces if/else chain with loop]
You: "Commit"
Claude: git commit -m "refactor: extract fallback patterns to class-level table"
-> Hook fires, report shows: CC 29->8 (-21)
This is the full Plan-Act-Review loop: 1. Plan: CPG query identifies the problem 2. Act: Refactoring reduces complexity 3. Review: Hook confirms the improvement with concrete metrics
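The Plan step is an ordinary SQL query against the CPG. A sketch of how such a query might be assembled (the `nodes_method` table and the `cyclomatic_complexity`/`fan_out` columns appear elsewhere in this guide; the `has_todo` flag column name is an assumption):

```python
def build_quality_query(path_prefix: str, cc_threshold: int = 15) -> str:
    """Build a DuckDB query for high-complexity methods with TODO flags
    under a path prefix. Column name has_todo is hypothetical."""
    return (
        "SELECT full_name, cyclomatic_complexity, fan_out "
        "FROM nodes_method "
        f"WHERE cyclomatic_complexity > {cc_threshold} "
        f"AND has_todo AND full_name LIKE '{path_prefix}%' "
        "ORDER BY cyclomatic_complexity DESC"
    )
```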
Scenario 3: Validate refactoring impact¶
Before making a large refactoring, check the blast radius:
You: "What's the blast radius if I change HierarchicalIntentClassifier.classify?"
Claude: [queries call_containment]
213 direct callers across production code and tests
You: "Proceed with the refactoring"
Claude: [makes changes, commits]
-> Hook shows: 213 affected callers, CC unchanged, no regressions
The blast radius report helps gauge the risk of changes before they happen.
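Under the hood, blast radius is a depth-limited walk over the caller graph loaded from `call_containment`, bounded by the `blast_radius_depth` setting. A sketch of that traversal, assuming the caller data has already been loaded into a plain mapping (column names in the table vary and are not shown here):

```python
from collections import deque


def blast_radius(callers: dict, root: str, max_depth: int = 2) -> set:
    """Collect transitive callers of `root` up to `max_depth` hops.

    `callers` maps a method name to the set of its direct callers,
    as would be loaded from the call_containment table.
    """
    seen = set()
    frontier = deque([(root, 0)])
    while frontier:
        name, depth = frontier.popleft()
        if depth == max_depth:
            continue  # honor blast_radius_depth
        for caller in callers.get(name, ()):
            if caller not in seen:
                seen.add(caller)
                frontier.append((caller, depth + 1))
    return seen
```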
Scenario 4: On-demand analysis¶
Run analysis without committing:
# Analyze the last commit
python -m src.cli.import_commands dogfood analyze --base-ref HEAD~1
# Analyze changes between branches
python -m src.cli.import_commands dogfood analyze --base-ref origin/main
# Generate a full quality report
python -m src.cli.import_commands dogfood report --format markdown
Pipeline Architecture¶
Data flow¶
git commit (via Bash tool in Claude Code)
|
v
PostToolUse hook fires (.claude/hooks/commit_analysis.py, 60s timeout)
|
+-- Phase 1: CPG freshness check
| Query cpg_git_state.commit_hash, compare to git rev-parse HEAD
|
+-- Phase 1.5: Capture pre-update metrics (for delta report)
| Query nodes_method for changed files BEFORE CPG update
| Store {full_name: {cc, fan_out, ...}} for later comparison
|
+-- Phase 2: CPG update (--force for accurate metrics)
| gocpg update --force --input=<source> --output=<db>
| Full re-parse ensures MethodMetricsPass computes CC/fan_in/fan_out
|
+-- Phase 3: Quality + blast radius analysis
| Query nodes_method for changed files (post-update)
| Deduplicate methods (normalize slash variants, keep highest CC)
| Compute quality summary: high-CC, high-fan_out, TODO, debug, deprecated
| Query call_containment for callers of changed methods
|
+-- Phase 4: Delta computation
| Compare pre-update vs post-update metrics
| Generate before->after report: "CC 29->8 (-21)"
|
+-- Output: {"additionalContext": "## Commit Analysis Report\n..."}
Injected back into Claude Code conversation
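The flow above can be condensed into a small driver: each phase reads and enriches a shared context, and any failure degrades gracefully to partial output or an empty object, as the timeout budget section notes. A hypothetical sketch, not the hook's actual structure:

```python
def run_pipeline(phases) -> dict:
    """Run hook phases in order over a shared context dict.

    A failing phase stops the pipeline but keeps whatever data
    earlier phases produced; with no report, emit an empty object.
    """
    ctx = {}
    for phase in phases:
        try:
            phase(ctx)
        except Exception:
            break  # degrade gracefully: keep partial results
    report = ctx.get("report")
    return {"additionalContext": report} if report else {}
```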
Timeout budget¶
The hook has a 60-second total timeout with internal phases:
| Phase | Budget | Action |
|---|---|---|
| Freshness check | 2s | Compare cpg_git_state.commit_hash to git rev-parse HEAD |
| Pre-update metrics | ~1s | Query current metrics for changed files (for delta report) |
| CPG update (--force) | 40s | Run gocpg update --force for full re-parse with metrics |
| Quality analysis | 8s | Query nodes_method for CC, TODO, debug, deprecated flags |
| Blast radius | 8s | Query call_containment for direct callers |
If any phase exceeds its budget, the hook degrades gracefully: it produces whatever data it has or returns empty {}.
CPG freshness and force re-parse¶
The hook passes the `--force` flag when running `gocpg update`, which triggers a full re-parse instead of an incremental update. The reason:

- Incremental update (`gocpg update` without `--force`): creates new method entries but skips `MethodMetricsPass`. New entries have `cyclomatic_complexity=0`, `fan_in=0`, `fan_out=0`. Fast, but produces incomplete metrics.
- Force re-parse (`gocpg update --force`): runs the full parse pipeline including `MethodMetricsPass`, so all metrics are computed correctly. Slower, but accurate.
For the dogfooding use case, accuracy is more important than speed. The 40-second budget accommodates force re-parse for projects up to several hundred source files.
Method deduplication¶
GoCPG may store the same method with different filename formats (forward slash src/file.py vs backslash src\file.py). The analyzer deduplicates by normalizing full_name slashes and keeping the entry with the highest CC value:
# Before dedup: 2 entries for the same method
src\intent\classifier.py:Classifier.classify CC=17
src/intent/classifier.py:Classifier.classify CC=0 (from incremental update)
# After dedup: 1 entry, highest CC wins
src\intent\classifier.py:Classifier.classify CC=17
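The dedup rule can be sketched as follows (a minimal version, assuming rows are plain dicts keyed by the column names shown in this guide):

```python
def dedupe_methods(rows):
    """Normalize slashes in full_name and keep the entry with the
    highest cyclomatic_complexity for each normalized name."""
    best = {}
    for row in rows:
        key = row["full_name"].replace("\\", "/")
        current = best.get(key)
        if current is None or row["cyclomatic_complexity"] > current["cyclomatic_complexity"]:
            best[key] = row  # highest CC wins
    return list(best.values())
```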
Delta report¶
When the CPG is stale (needs update), the hook captures pre-update metrics before running gocpg update --force, then compares against post-update metrics. This produces a delta showing the actual impact of changes:
**Impact of changes:**
- `_get_fallback_domain`: CC 29->8 (-21), FanOut 18->5 (-13)
- `_build_quality_summary`: CC 5->3 (-2)
Methods with no metric changes are omitted. This helps developers immediately see whether their refactoring improved or degraded code quality.
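Formatting one delta line is a simple before/after comparison. A sketch, assuming the pre/post metric dicts use the `cc`/`fan_out` keys mentioned in the data flow above:

```python
def format_delta(name, pre, post):
    """Produce one '- `name`: CC 29->8 (-21)' report line,
    or None when no tracked metric changed (such methods are omitted)."""
    parts = []
    for label, key in (("CC", "cc"), ("FanOut", "fan_out")):
        before, after = pre.get(key), post.get(key)
        if before is not None and after is not None and before != after:
            parts.append(f"{label} {before}->{after} ({after - before:+d})")
    return f"- `{name}`: " + ", ".join(parts) if parts else None
```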
Setup¶
One-command setup¶
python -m src.cli.import_commands dogfood setup --repo . --db data/projects/codegraph.duckdb
This installs git hooks (via gocpg) and verifies the Claude Code hook configuration.
Manual setup¶
1. Install git hooks (background CPG update on commit):

   ```bash
   gocpg/gocpg.exe hooks install --repo=. --db=data/projects/codegraph.duckdb
   ```

2. Configure Claude Code hooks in `.claude/settings.json`:

   ```json
   {
     "hooks": {
       "PostToolUse": [{
         "matcher": "Bash",
         "hooks": [{
           "type": "command",
           "command": "python .claude/hooks/commit_analysis.py",
           "timeout": 60000
         }]
       }]
     }
   }
   ```
Note: The matcher field must be a string (regex pattern), not an object. "Bash" matches the Bash tool specifically.
Verify setup¶
python -m src.cli.import_commands dogfood status
Expected output shows CPG freshness, hook status, and database path.
CLI Commands¶
# Full setup (git hooks + Claude Code hooks)
python -m src.cli.import_commands dogfood setup [--repo PATH] [--db PATH] [--language LANG]
# Check CPG freshness and hook status
python -m src.cli.import_commands dogfood status [--db PATH]
# Run commit analysis on demand
python -m src.cli.import_commands dogfood analyze [--base-ref HEAD~1] [--db PATH]
# Generate quality report (markdown or JSON)
python -m src.cli.import_commands dogfood report [--format markdown|json] [--db PATH]
Configuration¶
In config.yaml:
dogfooding:
enabled: true
auto_update_cpg: true # Run gocpg update if CPG is stale
cpg_update_timeout: 40 # Seconds for CPG update
analysis_timeout: 16 # Seconds for quality + blast radius
cc_threshold: 10 # Flag methods with CC above this
fan_out_threshold: 30 # Flag methods with fan_out above this
blast_radius_depth: 2 # Max depth for caller traversal
max_files_per_commit: 15 # Max files to analyze per commit
report_format: markdown # markdown or json
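A hypothetical helper showing how the hook might merge this section over built-in defaults, so a partial `config.yaml` still yields a complete configuration (the default values mirror the example above; the merge behavior is an assumption):

```python
DEFAULTS = {
    "enabled": True,
    "auto_update_cpg": True,
    "cpg_update_timeout": 40,
    "analysis_timeout": 16,
    "cc_threshold": 10,
    "fan_out_threshold": 30,
    "blast_radius_depth": 2,
    "max_files_per_commit": 15,
    "report_format": "markdown",
}


def load_dogfooding(config: dict) -> dict:
    """Overlay the dogfooding: section from config.yaml on the defaults;
    any missing key falls back to its documented default."""
    return {**DEFAULTS, **config.get("dogfooding", {})}
```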
Report Format¶
The hook returns a markdown report as additionalContext in JSON:
{"additionalContext": "## Commit Analysis Report\n**Summary:** ..."}
Full report structure:
## Commit Analysis Report
**Summary:** 3 files, 45 methods, 2 high-CC, 1 TODO/FIXME, 128 affected callers
**CPG status:** fresh
**Impact of changes:**
- `_get_fallback_domain`: CC 29->8 (-21), FanOut 18->5 (-13)
**High complexity methods:**
- `classify` (CC: 17)
- `_score_domain` (CC: 16)
**High fan-out methods:**
- `classify` (fan_out: 39)
**Blast radius:** 128 callers affected
- `classify` called by: `run_intent_classifier`, `IntentBenchmark._evaluate_single` +126 more
- `_classify_domain` called by: `classify`, `get_morph` +4 more
*Analysis completed in 95ms*
Sections are omitted when empty (e.g., no high-CC methods = no “High complexity” section).
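Assembling the report with that omission rule can be sketched as a loop that skips empty sections (a simplified version; section titles follow the structure shown above):

```python
def build_report(summary, sections):
    """Assemble the markdown report; sections with no entries are
    skipped entirely, matching the omission rule above."""
    lines = ["## Commit Analysis Report", f"**Summary:** {summary}"]
    for title, items in sections.items():
        if not items:
            continue  # empty section -> no heading at all
        lines.append(f"**{title}:**")
        lines.extend(f"- {item}" for item in items)
    return "\n".join(lines)
```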
Scaling to Other Projects¶
The dogfooding pipeline is project-agnostic. To set up for any project:
1. Import the project to create a CPG database:

   ```bash
   python -m src.cli import /path/to/project --language python
   ```

2. Register the project in `config.yaml`:

   ```yaml
   projects:
     active: my_project
     registry:
       my_project:
         db_path: data/projects/my_project.duckdb
         source_path: /path/to/project
         language: python
         domain: python_generic
   ```

3. Run setup:

   ```bash
   python -m src.cli.import_commands dogfood setup --repo /path/to/project --db data/projects/my_project.duckdb
   ```
The hook reads the active project from config.yaml and resolves the correct database path automatically.
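That resolution step amounts to a registry lookup keyed by the active project. A sketch, with key names taken from the `config.yaml` example above:

```python
def resolve_db_path(config: dict) -> str:
    """Return the db_path of the active project from the registry,
    as the hook does when deciding which database to query."""
    projects = config["projects"]
    active = projects["active"]
    return projects["registry"][active]["db_path"]
```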
Troubleshooting¶
Hook produces empty {} output:
- Check that the database file exists at the configured db_path
- Verify the active project in config.yaml has a valid db_path
- Run python -m src.cli.import_commands dogfood status to check freshness
- Ensure the commit changed code files (.py, .go, .c, etc.), not just docs or configs
CPG always shows stale:
- Ensure gocpg binary exists at gocpg/gocpg.exe (or the configured GOCPG_PATH)
- Check that git hooks are installed: look for .git/hooks/post-commit
- Try manual update: gocpg/gocpg.exe update --force --input=. --output=<db>
CC values are 0 after incremental update:
- This happens when gocpg update runs without --force. The incremental update skips MethodMetricsPass.
- The hook uses --force by default. If you see CC=0, the force re-parse may have timed out. Check the 40s timeout budget.
DuckDB lock error (“file is being used by another process”):
- Another gocpg.exe process is running (e.g., from gocpg watch or a concurrent hook invocation).
- The hook uses read-only connections and handles lock errors gracefully, falling back to subprocess queries.
Delta report not appearing:
- The delta report only appears when the CPG was stale before the update (pre-update metrics were captured).
- If the CPG is already fresh (e.g., gocpg watch updated it), there are no pre-update metrics to compare against.
Timeout exceeded:
- The 60s budget accommodates most commits. For very large projects, gocpg update --force may exceed the 40s phase budget.
- Reduce max_files_per_commit in config or consider using incremental update (remove force=True in commit_analysis.py).
- Ensure GoCPG indexes are up to date: gocpg/gocpg.exe index --db=<db>