Integration guide for connecting CodeGraph’s CPG analysis pipeline with Claude Code hooks and Git workflows.
Table of Contents
- Overview
- Architecture
- Hook system
- Data flow
- Shared utilities
- Hooks Reference
- SessionStart: Project context
- UserPromptSubmit: Entity enrichment
- PreToolUse: Complexity gate
- PostToolUse: Commit analysis
- PostToolUse: CLI error monitor
- Stop: Post-analysis
- CLI Review Command
- Scope-Aware Filtering
- Metrics & Monitoring
- Graceful Degradation
- Git Integration
- GoCPG git hooks
- CPG freshness and automatic updates
- Configuration
- Claude Code settings
- Project configuration
- Setup
- Troubleshooting
Overview
CodeGraph integrates with Claude Code through six hook scripts, registered on five lifecycle events, that inject code property graph (CPG) analysis data into the conversation. Combined with GoCPG git hooks that keep the CPG database synchronized, this creates a continuous feedback loop:
- Session starts – Claude receives project context and CPG statistics
- User asks a question – mentioned code entities are resolved against the CPG
- Claude edits a file – high-complexity methods in that file trigger warnings
- Claude commits code – the commit is analyzed for quality metrics, blast radius, and before/after impact
- Claude finishes a response – files mentioned in the response are checked for quality issues
All hooks communicate via JSON on stdin/stdout and return {"additionalContext": "..."} to inject markdown context into the conversation, or {} to remain silent.
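A minimal sketch of this stdin/stdout contract is shown below. The helper names render_hook_output and parse_event are hypothetical; the project's own helpers are safe_json_output() and read_stdin_json() in _utils.py, whose implementations are not reproduced here.

```python
import json
from typing import Optional


def parse_event(raw: str) -> dict:
    """Parse the event JSON a hook receives on stdin; tolerate bad input."""
    try:
        event = json.loads(raw) if raw.strip() else {}
    except json.JSONDecodeError:
        return {}
    return event if isinstance(event, dict) else {}


def render_hook_output(context: Optional[str]) -> str:
    """Serialize hook output: a markdown payload, or {} to stay silent."""
    if context and context.strip():
        return json.dumps({"additionalContext": context})
    return "{}"
```

A hook script would print the result of render_hook_output() as its only stdout output, so that Claude Code receives exactly one JSON object.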
Architecture
Hook system
Claude Code hooks are configured in .claude/settings.json. Each hook fires at a specific lifecycle event and receives event-specific JSON on stdin.
Claude Code Lifecycle
|
+-- SessionStart ---------> session_context.py -----> Project name, language, CPG stats
|
+-- UserPromptSubmit -----> enrich_prompt.py -------> CPG entity lookups for mentioned code
|
+-- PreToolUse (Edit) ----> pre_tool_use.py --------> Complexity warnings before file edit
|
+-- PostToolUse (Bash) ---> commit_analysis.py -----> Commit quality + blast radius + delta
|
+-- PostToolUse (Bash) ---> cli_error_monitor.py ---> Detect CLI error patterns
|
+-- Stop -----------------> post_analysis.py -------> Quality check on files in response
Data flow
All hooks query the CPG database through one of two paths:
- GoCPG subprocess (gocpg.exe query --sql "...") – used by most hooks via _utils.run_gocpg_query(). No Python dependency on DuckDB is required.
- DuckDB direct (import duckdb; con.execute(...)) – used by commit_analysis.py for the full analysis pipeline (CommitAnalyzer, CPGFreshnessChecker). Falls back to the subprocess path if the DuckDB import fails.
Hook Script
|
+-- _utils.get_active_project() --> config.yaml --> db_path
|
+-- _utils.run_gocpg_query(sql, db_path) --> gocpg.exe query --> stdout
| OR
+-- duckdb.connect(db_path) --> direct SQL
|
+-- safe_json_output(markdown) --> {"additionalContext": "..."} --> Claude Code
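The subprocess path can be sketched roughly as follows. This is an illustration under assumptions, not the actual run_gocpg_query() implementation: the function names are hypothetical, and the --db flag is borrowed from the hooks install examples in this guide and may not be how query receives the database path.

```python
import subprocess
from typing import Optional

GOCPG_BINARY = "gocpg/gocpg.exe"  # binary path as given in this guide


def build_query_command(sql: str, db_path: str, binary: str = GOCPG_BINARY) -> list:
    # `query --sql` comes from this guide; `--db=` is an ASSUMPTION and
    # should be checked against the real gocpg command-line help.
    return [binary, "query", "--sql", sql, f"--db={db_path}"]


def run_query(sql: str, db_path: str, binary: str = GOCPG_BINARY,
              timeout_s: float = 5.0) -> Optional[str]:
    """Return the query's stdout, or None on any failure (fail-open)."""
    try:
        proc = subprocess.run(
            build_query_command(sql, db_path, binary),
            capture_output=True, text=True, timeout=timeout_s, check=False,
        )
    except (OSError, subprocess.TimeoutExpired):
        return None  # missing binary or hung process: skip CPG data
    return proc.stdout if proc.returncode == 0 else None
```

Returning None instead of raising lets the caller emit {} and stay silent when the binary or the database is unavailable.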
Shared utilities
All hooks import from .claude/hooks/_utils.py:
| Function | Purpose |
|---|---|
| get_active_project() | Reads config.yaml, returns active project dict (name, db_path, language, domain) |
| run_gocpg_query(sql, db_path) | Executes SQL via gocpg.exe query subprocess |
| run_cli_query(query) | Executes natural language query via python -m src.cli query |
| extract_entities(text) | Extracts CamelCase names, snake_case identifiers, and file paths from text |
| safe_json_output(context) | Writes {"additionalContext": context} or {} to stdout |
| read_stdin_json() | Parses JSON from stdin |
| get_active_project_auto(cwd) | Auto-detects project by CWD with fallback to get_active_project() |
| load_parse_scope(db_path) | Loads parse scope from cpg_parse_scope table in DuckDB |
| is_partial_scope(scope) | Checks if scope has exclusions or selective import mode |
| tests_in_scope(scope) | Checks if tests are included in the parse scope |
Constants: PROJECT_ROOT (two levels up from hooks dir), GOCPG_BINARY (gocpg/gocpg.exe), CONFIG_PATH (config.yaml).
Additional shared modules in .claude/hooks/:
| Module | Purpose |
|---|---|
| _feedback.py | ReviewFeedback and ReviewFinding dataclasses for structured markdown reports |
| _metrics.py | JSONL performance logging via timed_hook() context manager |
| _session_cache.py | File-based project context cache (avoids re-reading config.yaml on every hook) |
| _project_detector.py | Auto-detect project by CWD/git remote |
Hooks Reference
SessionStart: Project context
File: .claude/hooks/session_context.py
Timeout: 10s
Fires: Once when a Claude Code session begins
Reads the active project from config.yaml and runs gocpg stats to get CPG size. Outputs:
## Project Context
Active project: codegraph (python)
Domain: python_generic
### CPG Statistics
- Methods: 52341
- Calls: 111234
- Files: 1847
- Total nodes: 312000
- Total edges: 890000
This gives Claude immediate awareness of the project scope and available CPG data.
UserPromptSubmit: Entity enrichment
File: .claude/hooks/enrich_prompt.py
Timeout: 15s
Fires: On every user message
Extracts code entities from the user’s message (CamelCase class names, snake_case function names, file paths) and looks them up in the CPG. Returns location and complexity data for up to 3 entities.
## CodeGraph CPG Context
### CommitAnalyzer
- `analyze_commit` at `src/dogfooding/commit_analyzer.py:298` (complexity: 12)
- `_deduplicate_methods` at `src/dogfooding/commit_analyzer.py:115` (complexity: 3)
### cpg_freshness
- `is_fresh` at `src/dogfooding/cpg_freshness.py:64` (complexity: 2)
- `ensure_fresh` at `src/dogfooding/cpg_freshness.py:98` (complexity: 5)
This helps Claude locate code entities without manual file searching.
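The extraction step might look roughly like the sketch below. The length thresholds (CamelCase longer than 3 characters, snake_case longer than 5) and the limit of 3 entities come from this guide; the regular expressions, the file-extension list, and the name extract_entities_sketch are illustrative assumptions rather than the real extract_entities() code.

```python
import re

# Regexes are assumptions; only the length thresholds come from this guide.
PATH_RE = re.compile(r"[\w./-]+\.(?:py|go|ts)\b")
CAMEL_RE = re.compile(r"\b[A-Z][a-z0-9]+(?:[A-Z][a-z0-9]+)+\b")
SNAKE_RE = re.compile(r"\b_?[a-z][a-z0-9]*(?:_[a-z0-9]+)+\b")
MAX_ENTITIES = 3


def extract_entities_sketch(text: str) -> list:
    """Return up to MAX_ENTITIES candidate code entities found in text."""
    paths = PATH_RE.findall(text)
    # Remove paths first so their segments are not re-matched as identifiers.
    remainder = PATH_RE.sub(" ", text)
    camel = [m for m in CAMEL_RE.findall(remainder) if len(m) > 3]
    snake = [m for m in SNAKE_RE.findall(remainder) if len(m) > 5]
    ordered = list(dict.fromkeys(camel + snake + paths))  # dedupe, keep order
    return ordered[:MAX_ENTITIES]
```

Each returned name would then be looked up in the CPG, for example by method name or file path.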
Interface Exposure Detection (M1)
For each resolved entity, the hook additionally checks which interface layers call it:
### CommitAnalyzer
- `analyze_commit` at `src/dogfooding/commit_analyzer.py:298` (complexity: 12)
**Exposed via:** CLI, MCP, REST API
The lookup_interface_exposure() function searches for calls to the entity from the 4 interface layers (CLI, REST API, MCP, ACP) via the nodes_call table in the CPG. This helps Claude assess the potential impact of changes on external interfaces.
PreToolUse: Complexity gate
File: .claude/hooks/pre_tool_use.py
Timeout: 8s
Fires: Before any Edit or Write tool call
Checks the target file for methods with cyclomatic complexity (CC) > 15, high fan-out (> 30), TODO/FIXME markers, or deprecated code, and warns Claude before it modifies complex code:
## File Quality Warning
- `classify` (CC: 17, fan_out: 39) -- high complexity, review carefully
- `_score_domain` (CC: 16, has TODO/FIXME)
Non-source files (.md, .json, .yaml, etc.) are silently skipped.
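The gate's decision logic can be sketched as a pure function over per-method rows. The thresholds are the ones stated above; the row shape (name, cc, fan_out, has_todo, deprecated), the suffix list, and the function name are assumptions made for illustration.

```python
from pathlib import PurePosixPath

CC_THRESHOLD = 15        # from this guide
FAN_OUT_THRESHOLD = 30   # from this guide
SOURCE_SUFFIXES = {".py", ".go", ".ts"}  # assumed subset of source types


def quality_warnings(file_path: str, methods: list) -> list:
    """Build warning lines for risky methods in one source file."""
    if PurePosixPath(file_path).suffix not in SOURCE_SUFFIXES:
        return []  # non-source files are skipped silently
    warnings = []
    for m in methods:
        reasons = []
        if m.get("cc", 0) > CC_THRESHOLD:
            reasons.append(f"CC: {m['cc']}")
        if m.get("fan_out", 0) > FAN_OUT_THRESHOLD:
            reasons.append(f"fan_out: {m['fan_out']}")
        if m.get("has_todo"):
            reasons.append("has TODO/FIXME")
        if m.get("deprecated"):
            reasons.append("deprecated")
        if reasons:
            warnings.append(f"- `{m['name']}` ({', '.join(reasons)})")
    return warnings
```

An empty list maps to {} output, so files without risky methods add nothing to the conversation.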
Registration Completeness Check (CR2)
When editing files in scenarios/, services/, analysis/, security/ directories, the hook additionally checks whether all public functions in the file are registered in interface layers:
## Registration Warning
**Registration check:** `analyze_security`, `scan_endpoints` not called from any interface layer (CLI/API/MCP/ACP)
The check_registration_completeness() function extracts public functions (not starting with _ or test_) from the target file via CPG, then checks whether they are called from any interface layer. If some functions are registered and some are not, a warning is produced.
Files in the interface layers themselves (src/cli/, src/api/routers/, src/mcp/tools/, src/acp/) are skipped.
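One possible reading of the partial-registration rule is sketched below: a warning appears only when a file mixes registered and unregistered public functions. The function name and its inputs are hypothetical, and the real check_registration_completeness() obtains both sets from the CPG rather than from arguments.

```python
def registration_warning(functions: list, registered: set) -> str:
    """Warn only when a file mixes registered and unregistered public functions."""
    # Public means: not starting with "_" or "test_" (rule from this guide).
    public = [f for f in functions if not f.startswith(("_", "test_"))]
    missing = [f for f in public if f not in registered]
    if not missing or len(missing) == len(public):
        return ""  # all registered, or none: no partial-registration signal
    names = ", ".join(f"`{f}`" for f in missing)
    return (f"**Registration check:** {names} not called from any "
            "interface layer (CLI/API/MCP/ACP)")
```

Staying silent when no function is registered assumes that such files are internal helpers; that interpretation is not confirmed by the source.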
PostToolUse: Commit analysis
File: .claude/hooks/commit_analysis.py
Timeout: 60s
Fires: After any Bash tool call containing git commit
The most complex hook. Runs a multi-phase analysis pipeline:
- CPG freshness check (2s) – compares cpg_git_state.commit_hash to git rev-parse HEAD
- Pre-update metrics capture (~1s) – snapshots current metrics for changed files (enables delta report)
- CPG update (40s) – runs gocpg update; durable post-commit traces are handled by the detached GoCPG worker
- Quality + blast radius analysis (16s) – queries nodes_method for CC, fan-out, TODO/FIXME, deprecated; queries call_containment for callers of changed methods
- Delta computation – compares pre-update vs post-update metrics
Output includes before/after impact, quality warnings, and blast radius:
## Commit Analysis Report
**Summary:** 3 files, 45 methods, 2 high-CC, 1 TODO/FIXME, 128 affected callers
**CPG status:** fresh
**Impact of changes:**
- `_get_fallback_domain`: CC 29->8 (-21), FanOut 18->5 (-13)
**High complexity methods:**
- `classify` (CC: 17)
**Blast radius:** 128 callers affected
- `classify` called by: `run_intent_classifier`, `IntentBenchmark._evaluate_single` +126 more
The hook only fires for git commit commands. Non-commit Bash commands (ls, git status, etc.) produce empty {}.
Fallback path: If DuckDB import fails, falls back to subprocess-based analysis (simpler, without blast radius or delta report).
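The delta computation reduces to comparing two metric snapshots per method. The sketch below reproduces the "Impact of changes" line format from the report above; the snapshot shape (dicts with cc and fan_out keys) and the function name are assumptions.

```python
def format_delta(name: str, before: dict, after: dict) -> str:
    """Render one 'Impact of changes' line from pre/post metric snapshots."""
    parts = []
    for key, label in (("cc", "CC"), ("fan_out", "FanOut")):
        old, new = before.get(key), after.get(key)
        if old is None or new is None or old == new:
            continue  # metric missing or unchanged: nothing to report
        parts.append(f"{label} {old}->{new} ({new - old:+d})")
    return f"- `{name}`: {', '.join(parts)}" if parts else ""
```

Because the "before" snapshot must be taken prior to gocpg update, a CPG that was already fresh leaves nothing to compare.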
Extended Analysis Phases (Code Review v3)
Beyond the base pipeline, the hook performs extended analysis:
Interface Impact Detection (CR1). If changed files are in interface layers (CLI, REST API, MCP, ACP), the hook reports which layers are affected and how many methods changed:
**Interface Impact:**
- CLI: 3 methods affected in `src/cli/audit_commands.py`
- MCP: 1 method affected in `src/mcp/tools/search.py`
Cross-Module Dependency Alerts (CR3). The hook analyzes which adjacent interface layers may need updates when one layer changes.
Go CPG Blast Radius (L1). If a Go CPG database is configured (go_db_path), the hook queries call_containment in both databases for a combined Python + Go blast radius.
Story Coverage Delta (L4). If interface impact is detected, the hook checks for story coverage gaps – for example, a function removed from CLI but still present in API.
Stop: Post-analysis
File: .claude/hooks/post_analysis.py
Timeout: 10s
Fires: When Claude finishes a response (excluding error stops)
Scans Claude’s response text for source file paths (src/..., pkg/..., etc.) and checks each file for quality issues via CPG:
- TODO/FIXME markers
- Debug code
- Complexity spikes (methods with CC significantly above file average)
- Fan-out hotspots (fan_out > 30)
- Deprecated methods
## Post-Analysis
- `src/workflow/copilot.py`: complexity spikes: `classify` (CC:17), fan-out hotspots: `classify` (fan_out:39)
- `src/dogfooding/commit_analyzer.py`: 2 methods with TODO/FIXME markers
This surfaces quality issues in files that Claude discussed or modified during the conversation.
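As an illustration of the complexity-spike check, the sketch below flags methods whose CC exceeds a multiple of the file's mean CC. The guide does not say what "significantly above" means, so the factor of 2.0 is purely an assumption, as are the function name and input shape.

```python
from statistics import mean

SPIKE_FACTOR = 2.0  # ASSUMPTION: "significantly above" = 2x the file average


def complexity_spikes(methods: dict, factor: float = SPIKE_FACTOR) -> list:
    """Return names of methods whose CC exceeds factor * the file's mean CC."""
    if not methods:
        return []
    threshold = factor * mean(methods.values())
    return sorted(name for name, cc in methods.items() if cc > threshold)
```

A relative threshold of this kind highlights outliers even in files whose methods all stay below the absolute CC > 15 gate.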
Post-Analysis Test & Registration Checks (L2)
The Stop hook additionally checks:
- Missing tests: whether changed files contain public functions without corresponding tests
- Unregistered interfaces: whether new functions are not called from any interface layer
PostToolUse: CLI error monitor
File: .claude/hooks/cli_error_monitor.py
Timeout: 5s
Fires: After any Bash tool call
Monitors CLI command output for common error patterns and surfaces actionable diagnostics:
- Database not found / DatabaseNotConfiguredError
- DuckDB import errors
- Python ImportError / ModuleNotFoundError
- Unhandled tracebacks
- RuntimeWarning messages
- GoCPG binary not found
When an error pattern is detected, the hook returns a context message with the error type and suggested resolution.
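A pattern table of this kind could be sketched as below. The error categories follow the list above, but the regular expressions, the type labels, and the suggested resolutions are illustrative assumptions. The GoCPG case is omitted because its exact error text is not given in this guide.

```python
import re

# (pattern, error type, suggested resolution); all entries are illustrative.
ERROR_PATTERNS = [
    (re.compile(r"DatabaseNotConfiguredError|database not found", re.I),
     "database_not_found", "Check db_path for the active project in config.yaml."),
    (re.compile(r"No module named ['\"]duckdb['\"]"),
     "duckdb_import", "Install duckdb or rely on the gocpg subprocess path."),
    (re.compile(r"\b(?:ImportError|ModuleNotFoundError)\b"),
     "python_import", "Verify the Python environment and its dependencies."),
    (re.compile(r"Traceback \(most recent call last\)"),
     "traceback", "Inspect the traceback for the failing call."),
    (re.compile(r"\bRuntimeWarning\b"),
     "runtime_warning", "Review the warning; it may indicate a misused API."),
]


def diagnose(output: str):
    """Return (error_type, suggestion) for the first matching pattern, else None."""
    for pattern, error_type, suggestion in ERROR_PATTERNS:
        if pattern.search(output):
            return error_type, suggestion
    return None
```

Ordering matters here: the more specific DuckDB pattern is listed before the generic import pattern so that it wins when both match.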
CLI Review Command
In addition to hooks, CodeGraph provides a standalone CLI command for code review:
# Review changes against a base ref
python -m src.cli review --base-ref HEAD~3
# Review staged changes
python -m src.cli review --staged
# Review specific files
python -m src.cli review --files src/api/main.py src/auth.py
# Output formats
python -m src.cli review --format json --output-file report.json
python -m src.cli review --format sarif --output-file report.sarif
python -m src.cli review --base-ref HEAD~5 --sarif-file out.sarif
# Skip security analysis
python -m src.cli review --no-security
The CLI review command uses ReviewPipeline (src/review/pipeline.py) which orchestrates:
1. Project detection and CPG status check
2. Parse scope loading for scope-aware filtering
3. Quality analysis (dead code, high complexity, large methods)
4. Security analysis via SecurityPRReview (optional)
5. Scope-aware aggregation via ReviewAggregator
6. Output in markdown, JSON, or SARIF 2.1.0 format
Exit codes: 0 = clean or medium/low only, 1 = critical or high findings detected.
Scope-Aware Filtering
When the CPG is built with excluded directories (e.g., only backend without tests/frontend), the pipeline applies scope-aware filtering:
| Finding Type | Condition | Action |
|---|---|---|
| dead_code | Scope is partial | Demote to info, mark scope_limited |
| missing_test | include_tests=false | Suppress entirely |
| blast_radius | Scope is partial | Demote to info, mark scope_limited |
| complexity | – | Not affected (per-method metric) |
| security | – | Not affected (real vulnerability) |
A disclaimer block is added to the report when scope is limited.
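The rules in the table can be expressed as a small function. This is a sketch, not the ReviewAggregator code: the finding is modeled as a plain dict, and the type and severity keys are assumed names (only scope_limited and the finding types appear in this guide).

```python
from typing import Optional


def apply_scope_rules(finding: dict, partial_scope: bool,
                      include_tests: bool) -> Optional[dict]:
    """Apply the table's rules; return the adjusted finding or None if suppressed."""
    kind = finding["type"]
    if kind == "missing_test" and not include_tests:
        return None  # tests were not parsed, so their absence proves nothing
    if kind in {"dead_code", "blast_radius"} and partial_scope:
        # Callers may live outside the parsed scope: keep, but downgrade.
        return {**finding, "severity": "info", "scope_limited": True}
    return finding  # complexity and security findings pass through unchanged
```

Demoting rather than dropping dead_code and blast_radius findings keeps them visible while signalling that the CPG cannot confirm them.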
Metrics & Monitoring
All hooks log execution metrics to data/hook_metrics.jsonl:
{"timestamp": "2026-03-06T12:00:00Z", "hook": "commit_analysis", "duration_ms": 1234.5, "findings": 3, "project": "codegraph", "status": "ok"}
View statistics with the dogfood CLI:
python -m src.cli dogfood hooks-status
python -m src.cli dogfood hooks-status --last 50
python -m src.cli dogfood hooks-status --hook commit_analysis
The dashboard shows per-hook aggregates: runs, ok/error/warning counts, failure rate, average and max duration, total findings.
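A context manager that produces records in the JSONL shape shown above might look like this. It is a sketch only: the real timed_hook() in _metrics.py may have a different signature, and the name timed_hook_sketch and its parameters are hypothetical.

```python
import json
import time
from contextlib import contextmanager
from datetime import datetime, timezone


@contextmanager
def timed_hook_sketch(hook: str, project: str, metrics_file: str):
    """Time a hook body and append one JSONL record, even if the body raises."""
    record = {"hook": hook, "project": project, "findings": 0, "status": "ok"}
    start = time.perf_counter()
    try:
        yield record  # the body may update record["findings"]
    except Exception:
        record["status"] = "error"
        raise
    finally:
        record["duration_ms"] = round((time.perf_counter() - start) * 1000, 1)
        record["timestamp"] = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
        with open(metrics_file, "a", encoding="utf-8") as fh:
            fh.write(json.dumps(record) + "\n")
```

Writing the record in the finally clause means failed runs are still counted, which is what makes a per-hook failure rate computable.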
Graceful Degradation
All hooks follow a fail-open design:
- Missing gocpg binary: skip CPG queries, output empty context
- Corrupt/missing DB: skip analysis, warn in metrics
- Timeout: partial results returned within budget
- Import errors: fall back to subprocess path (commit_analysis)
- Any unhandled exception: caught at top level, empty JSON output
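The top-level catch can be sketched as a decorator around a hook's entry point. The decorator name fail_open and the two example hooks are hypothetical and only demonstrate the behavior described above.

```python
import functools
import json


def fail_open(hook_main):
    """Wrap a hook entry point so any exception yields empty JSON output."""
    @functools.wraps(hook_main)
    def wrapper(*args, **kwargs) -> str:
        try:
            context = hook_main(*args, **kwargs)
        except Exception:
            return "{}"  # never block the conversation on a hook failure
        return json.dumps({"additionalContext": context}) if context else "{}"
    return wrapper


@fail_open
def broken_hook(event: dict) -> str:
    raise RuntimeError("DuckDB file is locked")  # simulated failure


@fail_open
def working_hook(event: dict) -> str:
    return "## Project Context"
```

With this wrapper, a failing hook is indistinguishable from a silent one from Claude Code's point of view, so failures should also be recorded in the metrics log.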
Git Integration
GoCPG git hooks
GoCPG can install git hooks that trigger an incremental CPG update after every commit. With --review-trace, the post-commit hook also persists a full review trace:
gocpg/gocpg.exe hooks install --repo=. --db=data/projects/codegraph.duckdb --review-trace
This creates .git/hooks/post-commit which launches a detached worker. The worker runs gocpg update, verifies the DuckDB file can be reopened immediately after the update, and writes review artifacts to data/reviews/.
Trace artifacts per commit:
- <sha>.log – live worker log with heartbeats
- <sha>.status.json – current phase/progress snapshot
- <sha>.meta.json – final status including gocpg_update_ok and db_unlocked_after_update
- <sha>.json / <sha>.md – review output
Inspect the trace from the terminal:
python scripts/review_trace_status.py <commit-sha>
python scripts/review_trace_status.py <commit-sha> --watch
Check hook status:
gocpg/gocpg.exe hooks status --repo=.
Remove hooks:
gocpg/gocpg.exe hooks uninstall --repo=.
CPG freshness and automatic updates
The CPGFreshnessChecker (src/dogfooding/cpg_freshness.py) manages CPG synchronization:
- is_fresh() – compares cpg_git_state.commit_hash (stored in DuckDB) against git rev-parse HEAD
- commits_behind() – counts how many commits the CPG is behind HEAD
- ensure_fresh() – runs gocpg update if stale
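The first two checks can be approximated with plain git commands, as in the sketch below. These standalone functions are assumptions for illustration and not the CPGFreshnessChecker methods; reading the stored hash from cpg_git_state is left out.

```python
import subprocess
from typing import Optional


def is_fresh(stored_hash: Optional[str], head_hash: Optional[str]) -> bool:
    """The CPG is fresh only when both hashes are known and identical."""
    return bool(stored_hash) and stored_hash == head_hash


def git_head(repo: str = ".") -> Optional[str]:
    """Return `git rev-parse HEAD`, or None if git is unavailable or fails."""
    try:
        proc = subprocess.run(["git", "-C", repo, "rev-parse", "HEAD"],
                              capture_output=True, text=True, timeout=5)
    except (OSError, subprocess.TimeoutExpired):
        return None
    return proc.stdout.strip() if proc.returncode == 0 else None


def commits_behind(stored_hash: str, repo: str = ".") -> Optional[int]:
    """Count commits reachable from HEAD but not from the stored hash."""
    try:
        proc = subprocess.run(
            ["git", "-C", repo, "rev-list", "--count", f"{stored_hash}..HEAD"],
            capture_output=True, text=True, timeout=5)
    except (OSError, subprocess.TimeoutExpired):
        return None
    return int(proc.stdout.strip()) if proc.returncode == 0 else None
```

Treating an unknown hash as stale is the conservative choice: it triggers an update rather than reporting metrics from an unverified CPG.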
For durable post-commit feedback, the traced worker does not rely on src.cli review. It runs ReviewPipeline directly on the changed files, which avoids unrelated CLI import failures and keeps the review bounded to the actual commit scope.
Interaction between git hooks and Claude Code hooks:
Developer commits (terminal) Developer commits (Claude Code)
| |
v v
git post-commit hook PostToolUse hook fires
| |
v +-- CPG freshness check
Detached review worker | +-- Best-effort update via API/webhook
| | +-- Conversational diagnostics
+-- gocpg update |
+-- DuckDB unlock check |
+-- ReviewPipeline(files=changed)
+-- data/reviews/<sha>.* |
v v
Durable local trace Report injected into conversation
When both hooks are active, the git post-commit trace provides the durable audit trail while the Claude Code hook remains best-effort and conversational. The traced worker records whether the DB was unlocked immediately after update, so lock issues are visible in data/reviews/<sha>.meta.json.
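A small reader for the meta file can make lock issues easy to spot from a script. The sketch assumes gocpg_update_ok and db_unlocked_after_update are top-level boolean fields of <sha>.meta.json; the function name trace_health is hypothetical.

```python
import json
from pathlib import Path


def trace_health(reviews_dir: str, sha: str) -> dict:
    """Summarize a commit's <sha>.meta.json; field names are from this guide."""
    meta_path = Path(reviews_dir) / f"{sha}.meta.json"
    if not meta_path.exists():
        return {"found": False, "healthy": False}
    meta = json.loads(meta_path.read_text(encoding="utf-8"))
    # ASSUMPTION: both fields are top-level booleans.
    update_ok = bool(meta.get("gocpg_update_ok"))
    unlocked = bool(meta.get("db_unlocked_after_update"))
    return {"found": True, "healthy": update_ok and unlocked,
            "gocpg_update_ok": update_ok, "db_unlocked_after_update": unlocked}
```

For example, trace_health("data/reviews", "<commit-sha>") returns healthy: False when the update succeeded but the database was still locked afterwards.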
Configuration
Claude Code settings
The hook configuration lives in .claude/settings.json:
{
"hooks": {
"SessionStart": [{
"matcher": "",
"hooks": [{
"type": "command",
"command": "python .claude/hooks/session_context.py",
"timeout": 10000
}]
}],
"UserPromptSubmit": [{
"matcher": "",
"hooks": [{
"type": "command",
"command": "python .claude/hooks/enrich_prompt.py",
"timeout": 15000
}]
}],
"PreToolUse": [{
"matcher": "",
"hooks": [{
"type": "command",
"command": "python .claude/hooks/pre_tool_use.py",
"timeout": 8000
}]
}],
"PostToolUse": [{
"matcher": "Bash",
"hooks": [{
"type": "command",
"command": "python .claude/hooks/commit_analysis.py",
"timeout": 60000
}, {
"type": "command",
"command": "python .claude/hooks/cli_error_monitor.py",
"timeout": 5000
}]
}],
"Stop": [{
"matcher": "",
"hooks": [{
"type": "command",
"command": "python .claude/hooks/post_analysis.py",
"timeout": 10000
}]
}]
}
}
Matcher format: The matcher field is a regex string pattern, not an object. Use "" to match all events or "Bash" to match only the Bash tool. Object format ({"tools": ["Bash"]}) is no longer supported.
Timeout: In milliseconds. If a hook exceeds its timeout, Claude Code kills the process and continues without the hook’s output.
Note: The actual .claude/settings.json may use absolute paths (e.g. python D:/work/codegraph/.claude/hooks/session_context.py). The relative paths above are shown for portability.
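A quick pre-check for the matcher rule can be run against the parsed settings before opening Claude Code. The function below is a hypothetical helper, not part of CodeGraph or Claude Code; it only flags matcher values that are present and not strings.

```python
def invalid_matchers(settings: dict) -> list:
    """Return 'Event[index]' labels for entries whose matcher is not a string."""
    problems = []
    for event, entries in settings.get("hooks", {}).items():
        for index, entry in enumerate(entries):
            if "matcher" in entry and not isinstance(entry["matcher"], str):
                problems.append(f"{event}[{index}]")
    return problems
```

Typical usage is to load the file with json.load() and pass the resulting dict; an empty list means every matcher uses the string format.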
Project configuration
Hooks read the active project from config.yaml:
projects:
active: codegraph
registry:
codegraph:
db_path: data/projects/codegraph.duckdb
source_path: .
language: python
domain: python_generic
The db_path is resolved relative to the project root. All hooks use _utils.get_active_project() to look up this configuration.
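The lookup and path resolution can be sketched as follows, operating on an already parsed config.yaml so that no YAML library is needed for the example. The function name resolve_active_project is hypothetical, and the returned keys mirror the ones listed for get_active_project().

```python
from pathlib import Path


def resolve_active_project(config: dict, project_root: str) -> dict:
    """Return the active project's settings with db_path made absolute."""
    projects = config.get("projects", {})
    name = projects.get("active")
    entry = projects.get("registry", {}).get(name)
    if not name or not entry or "db_path" not in entry:
        return {}  # no usable active project: hooks stay silent
    db_path = Path(entry["db_path"])
    if not db_path.is_absolute():
        db_path = Path(project_root) / db_path  # relative to the project root
    return {"name": name, "db_path": str(db_path),
            "language": entry.get("language"), "domain": entry.get("domain")}
```

Returning an empty dict for a missing or incomplete entry gives callers a single condition to test before they query the CPG.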
Review pipeline
The review pipeline is configured in config.yaml:
review_pipeline:
# Project detection
auto_detect_project: true # Auto-detect project by CWD
detect_by_source_path: true # Match CWD against source_path in registry
freshness_check_on_session: true # Check CPG freshness on SessionStart
auto_reparse: prompt # Auto-reparse: "prompt", "always", "never"
max_reparse_timeout_seconds: 60 # Timeout for CPG re-parse
stale_threshold_commits: 0 # Commits behind threshold (0 = any staleness)
max_findings_in_summary: 10 # Max findings shown in summary
# Scope awareness
scope_aware_filtering: true
suppress_tests_outside_scope: true
demote_dead_code_partial_scope: true
show_scope_disclaimer: true
# Metrics and resilience
metrics_enabled: true
metrics_file: data/hook_metrics.jsonl
fail_open: true
Dogfooding
The dogfooding section configures commit analysis thresholds:
dogfooding:
enabled: true
auto_update_cpg: true
cc_threshold: 10 # CC warning threshold for commit analysis
fan_out_threshold: 30 # Fan-out warning threshold
max_files_per_commit: 15 # Max files analyzed per commit
Note: The cc_threshold in dogfooding (10) differs from CC_THRESHOLD in pre_tool_use.py (15). The commit analysis hook uses the config value; the pre-tool-use hook uses its hardcoded constant.
Setup
One-command setup
python -m src.cli.import_commands dogfood setup --repo . --db data/projects/codegraph.duckdb
This installs GoCPG git hooks and verifies the Claude Code hook configuration.
Manual setup
- Ensure the GoCPG binary exists at gocpg/gocpg.exe (or set the GOCPG_PATH environment variable)
- Ensure the CPG database exists – import the project if needed:
  python -m src.cli import . --language python
- Install GoCPG git hooks:
  gocpg/gocpg.exe hooks install --repo=. --db=data/projects/codegraph.duckdb
- Copy .claude/settings.json with hook configuration (see Claude Code settings above)
- Verify with /doctor in Claude Code – should show no settings errors
Verify
# Check dogfooding pipeline status
python -m src.cli.import_commands dogfood status
# Check GoCPG hooks
gocpg/gocpg.exe hooks status --repo=.
# Test commit_analysis.py manually
echo '{"tool":"Bash","tool_input":{"command":"git commit -m \"test\""},"tool_result":"ok"}' | python .claude/hooks/commit_analysis.py
Troubleshooting
Hook produces empty {} for all events:
- Check that config.yaml has an active project with valid db_path
- Verify the DuckDB file exists at the configured path
- Ensure gocpg/gocpg.exe exists and is executable
Entity enrichment finds nothing:
- The CPG may not contain the queried entity. Run gocpg query --sql "SELECT COUNT(*) FROM nodes_method" to verify CPG is populated
- Entity extraction only matches CamelCase names (> 3 chars), snake_case names (> 5 chars), and source file paths
Complexity gate never fires:
- Only triggers for Edit and Write tools, not Bash
- Only checks source files (.py, .go, .ts, etc.)
- Threshold is CC > 15 – lower values are not reported
Commit analysis shows stale metrics (CC=0):
- The hook uses the --force flag by default. If metrics show 0, the forced re-parse may have timed out (40s budget)
- Check if another gocpg.exe process is running and holding a DuckDB lock
- Try manual force update: gocpg/gocpg.exe update --force --input=. --output=<db>
Delta report not appearing:
- Delta report only appears when CPG was stale before the update
- If GoCPG git hooks or gocpg watch already updated the CPG, there are no pre-update metrics to compare
DuckDB lock error:
- Another gocpg.exe process is holding the file. The hook handles this gracefully by falling back to subprocess queries
- In production, this is rare since the hook fires after the Bash command completes
/doctor shows settings errors:
- Ensure all matcher fields are strings ("" or "Bash"), not objects
- Check JSON syntax in .claude/settings.json