Integration guide for connecting CodeGraph with GitHub for automated code review, security scanning, and incremental CPG updates.
Table of Contents
- Overview
- Prerequisites
- Webhook Configuration
- CI Pipeline (GitHub Actions)
- REST API Endpoints
- MCP Tools
- Incremental CPG Updates
- Docker Image Setup
- Troubleshooting
- Next Steps
Overview
CodeGraph integrates with GitHub for automated code review, security scanning, and technical debt tracking. Three integration paths:
- CI/CD Pipeline – CodeGraph as a step in GitHub Actions workflows
- Webhook-driven – push/PR events trigger incremental CPG updates and reviews automatically
- Standalone – deployed alongside GitHub, accessed via REST API / CLI / MCP
CodeGraph normalizes GitHub webhook payloads into the UnifiedPushEvent and UnifiedMREvent models, which are shared across all four supported platforms.
Prerequisites
- CodeGraph instance (Docker or standalone) accessible from GitHub (or self-hosted runners)
- GitHub repository with admin permissions (for webhook setup)
- Docker (for CI pipeline steps)
- LLM API key: YANDEX_API_KEY, GIGACHAT_AUTH_KEY, or OPENAI_API_KEY
Webhook Configuration
Endpoint: POST /api/v1/webhooks/github (returns 202 Accepted)
GitHub setup: Repository Settings > Webhooks > Add webhook:
1. Payload URL: https://<codegraph-host>/api/v1/webhooks/github
2. Content type: application/json
3. Secret: any string (for HMAC-SHA256 verification)
4. Events: choose “Let me select individual events”, then tick Pushes and Pull requests
Signature Verification
CodeGraph verifies the X-Hub-Signature-256 header (HMAC-SHA256, format: sha256=<hex>).
Error codes:
- 401 Unauthorized – missing X-Hub-Signature-256 header
- 403 Forbidden – invalid signature (HMAC mismatch)
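The check described above can be sketched in a few lines of Python. This is a minimal illustration of HMAC-SHA256 verification in general, not CodeGraph's actual implementation; the function name `verify_github_signature` and the idea of returning the status code directly are assumptions made for the example.

```python
import hashlib
import hmac
from typing import Optional


def verify_github_signature(secret: str, body: bytes, header: Optional[str]) -> int:
    """Return the HTTP status the check maps to: 202, 401, or 403 (hypothetical helper)."""
    if not header:
        return 401  # X-Hub-Signature-256 header is missing
    # GitHub signs the raw request body, so the bytes must not be re-serialized.
    digest = hmac.new(secret.encode("utf-8"), body, hashlib.sha256).hexdigest()
    expected = "sha256=" + digest
    # compare_digest runs in constant time, which avoids leaking where the mismatch is.
    if not hmac.compare_digest(expected.encode("utf-8"), header.encode("utf-8")):
        return 403  # HMAC mismatch
    return 202
```

Because the digest is computed over the raw bytes, any proxy that rewrites the body (for example by re-encoding the JSON) will cause a 403 even when the secret is correct.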
Supported Events
| Event Header (X-GitHub-Event) | Action |
|---|---|
| push | Incremental CPG update via CPGUpdateQueue |
| pull_request | Code review workflow (all actions: opened, synchronize, closed, merged, reopened) |
| ping | Connection test (returns {"status": "accepted", "detail": {"message": "pong"}}) |
Events of any other type are not processed; the response is {"status": "skipped"}.
Configuration
Section github in config.yaml:
github:
  webhook_secret: "your-github-webhook-secret"
  auto_update_on_push: true  # trigger incremental CPG update on push events
Event Type Detection
CodeGraph detects event type from the X-GitHub-Event header. If the header is missing (e.g., testing via curl), it falls back to payload structure detection:
- pull_request key present → pull_request event
- head_commit or (ref + commits) present → push event
- zen key present → ping event
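The fallback rules above can be expressed as a small function. The sketch below assumes the rules are checked in the order listed; the function name `detect_event_type` and the `None` return for unrecognized payloads are illustrative choices, not part of CodeGraph's API.

```python
from typing import Any, Mapping, Optional


def detect_event_type(headers: Mapping[str, str], payload: Mapping[str, Any]) -> Optional[str]:
    """Header first, then payload structure (hypothetical helper)."""
    # HTTP header names are case-insensitive, so compare in lowercase.
    for name, value in headers.items():
        if name.lower() == "x-github-event" and value:
            return value
    # Fallback: infer the event from the payload keys, in the documented order.
    if "pull_request" in payload:
        return "pull_request"
    if "head_commit" in payload or ("ref" in payload and "commits" in payload):
        return "push"
    if "zen" in payload:
        return "ping"
    return None  # a caller would answer {"status": "skipped"} here
```

Checking `pull_request` before the push keys matters only if a payload carries both, which is an assumption about ordering rather than documented behavior.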
CI Pipeline (GitHub Actions)
Basic Workflow
Create .github/workflows/codegraph-review.yaml:
name: CodeGraph Review
on:
  pull_request:
    types: [opened, synchronize, reopened]
jobs:
  cpg-update:
    name: Build CPG
    runs-on: ubuntu-latest
    container:
      image: ghcr.io/mkhlsavin/codegraph:latest
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0
      - name: Build CPG
        run: |
          gocpg parse --input=. --output=/tmp/cpg.duckdb \
            --lang=${{ vars.PROJECT_LANGUAGE || 'python' }}
  security-scan:
    name: Security Scan
    needs: cpg-update
    runs-on: ubuntu-latest
    container:
      image: ghcr.io/mkhlsavin/codegraph:latest
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0
      - name: Diff-based security scan
        run: |
          DIFF=$(git diff ${{ github.event.pull_request.base.sha }}...HEAD)
          curl -sf -X POST "${CODEGRAPH_URL}/api/v1/security/scan-diff" \
            -H "Authorization: Bearer ${CODEGRAPH_TOKEN}" \
            -H "Content-Type: application/json" \
            -d "{\"diff_content\": $(echo "$DIFF" | python3 -c 'import sys,json; print(json.dumps(sys.stdin.read()))'), \"output_format\": \"sarif\"}" \
            -o security-results.sarif
        env:
          CODEGRAPH_URL: ${{ vars.CODEGRAPH_URL }}
          CODEGRAPH_TOKEN: ${{ secrets.CODEGRAPH_API_TOKEN }}
      - name: Upload SARIF
        if: always()
        uses: github/codeql-action/upload-sarif@v3
        with:
          sarif_file: security-results.sarif
Required Secrets/Variables
| Name | Type | Description |
|---|---|---|
| CODEGRAPH_URL | Variable | CodeGraph API endpoint (e.g. https://codegraph.example.com) |
| CODEGRAPH_API_TOKEN | Secret | CodeGraph API authentication token |
| PROJECT_LANGUAGE | Variable | Source language (default: python) |
SARIF Integration
CodeGraph can output security findings in SARIF format, compatible with GitHub Code Scanning:
curl -sf -X POST "${CODEGRAPH_URL}/api/v1/security/scan-diff" \
  -H "Authorization: Bearer ${CODEGRAPH_TOKEN}" \
  -H "Content-Type: application/json" \
  -d '{"diff_content": "...", "output_format": "sarif"}' \
  -o security-results.sarif
Upload the SARIF file using github/codeql-action/upload-sarif@v3 to see findings in the GitHub Security tab.
REST API Endpoints
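POST /api/v1/webhooks/github – Webhook Receiver
Receives GitHub webhook events and returns 202 Accepted. Setup, signature verification, and supported events are described under Webhook Configuration.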
POST /api/v1/security/scan-diff – Security Scan
Scans a raw unified diff for security vulnerabilities.
Request (ScanDiffRequest):
{
  "diff_content": "unified diff string",
  "output_format": "json"
}
| Field | Type | Default | Description |
|---|---|---|---|
| diff_content | string | – | Raw unified diff (required) |
| output_format | string | "json" | "json" or "sarif" |
Response (ScanDiffResponse):
{
  "findings": [
    {
      "finding_id": "f-001",
      "pattern_id": "CWE-89",
      "pattern_name": "SQL Injection",
      "category": "security",
      "severity": "high",
      "method_name": "execute_query",
      "filename": "src/db.py",
      "line_number": 42,
      "description": "User input concatenated into SQL query",
      "cwe_ids": ["CWE-89"],
      "confidence": 0.92
    }
  ],
  "sarif": {},
  "new_count": 1,
  "fixed_count": 0,
  "total_count": 1,
  "critical_count": 0,
  "high_count": 1,
  "medium_count": 0,
  "low_count": 0
}
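As an illustration of how a client might use this endpoint, the sketch below builds the request with Python's standard library and turns the severity counters from the response into a pass/fail decision. The Bearer authorization scheme is taken from the workflow example in this guide; the helper names and the gating policy (fail on any critical finding, or on more high findings than allowed) are assumptions for the example, not CodeGraph behavior.

```python
import json
import urllib.request


def build_scan_request(base_url: str, token: str, diff: str,
                       output_format: str = "json") -> urllib.request.Request:
    """Prepare a POST to /api/v1/security/scan-diff (hypothetical helper)."""
    # json.dumps handles quoting and newlines in the diff, so no manual escaping is needed.
    body = json.dumps({"diff_content": diff, "output_format": output_format}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/api/v1/security/scan-diff",
        data=body,
        method="POST",
        headers={"Authorization": f"Bearer {token}", "Content-Type": "application/json"},
    )


def should_fail_build(response: dict, max_high: int = 0) -> bool:
    """Assumed policy: block on any critical finding or on too many high findings."""
    critical = response.get("critical_count", 0)
    high = response.get("high_count", 0)
    return critical > 0 or high > max_high
```

To send the request, pass the result to `urllib.request.urlopen` and decode the body with `json.load`; the decoded dictionary can then be handed to `should_fail_build`.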
POST /api/v1/review/summary – PR Summary
Generates a structured summary from any unified diff (platform-agnostic).
{
  "diff_content": "unified diff string",
  "title": "Optional PR title",
  "description": "Optional PR description"
}
Response: summary, changed_files, additions, deletions, changed_methods, risk_areas.
POST /api/v1/review/commit-message – Commit Message
Generates a conventional commit message from a diff.
{
  "diff_content": "unified diff string"
}
Response: message, type, scope, files_changed.
POST /api/v1/review/pr – GitHub PR Review
Reviews a GitHub PR directly by fetching the diff from the GitHub API, so the raw diff does not need to be passed.
{
  "owner": "org",
  "repo": "repo",
  "pr_number": 42,
  "task_description": "Optional review focus",
  "dod_items": ["No security issues", "Tests pass"]
}
Requires an X-GitHub-Token header containing a GitHub access token with read access to the repository.
Response: recommendation, score, findings[], dod_validation[], summary, processing_time_ms, request_id.
GET /api/v1/webhooks/status/{project_id} – Update Status
Returns the latest CPG update pipeline status for a project.
{
  "project_id": "my-project",
  "last_update": "2026-03-09T12:00:00+00:00",
  "status": "completed",
  "commit_sha": "abc12345",
  "duration_ms": 1234,
  "gocpg_status": "completed",
  "chromadb_synced": 3,
  "error": null
}
| Field | Type | Description |
|---|---|---|
| status | string | completed, failed, in_progress, or unknown |
| last_update | string | ISO 8601 timestamp of last completed update |
| duration_ms | int | Pipeline execution time in milliseconds |
| gocpg_status | string | GoCPG incremental update status |
| chromadb_synced | int | Number of ChromaDB collections updated |
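Since webhook processing is asynchronous, a client that needs to wait for an update would typically poll this endpoint. The following is a minimal sketch of such a loop; `make_fetcher` and `wait_for_update` are hypothetical helpers, the example treats completed and failed as terminal states, and it assumes the status endpoint can be read without extra headers (add an Authorization header if your deployment requires one).

```python
import json
import time
import urllib.request
from typing import Any, Callable, Dict

TERMINAL_STATES = {"completed", "failed"}


def make_fetcher(base_url: str, project_id: str) -> Callable[[], Dict[str, Any]]:
    """Return a callable that GETs the status document for one project."""
    url = f"{base_url.rstrip('/')}/api/v1/webhooks/status/{project_id}"

    def fetch() -> Dict[str, Any]:
        with urllib.request.urlopen(url, timeout=10) as resp:
            return json.load(resp)

    return fetch


def wait_for_update(fetch: Callable[[], Dict[str, Any]], timeout_s: float = 300.0,
                    interval_s: float = 5.0,
                    sleep: Callable[[float], None] = time.sleep) -> Dict[str, Any]:
    """Poll until the pipeline reaches a terminal state or the timeout expires."""
    deadline = time.monotonic() + timeout_s
    while True:
        status = fetch()
        if status.get("status") in TERMINAL_STATES:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"update still {status.get('status')!r} after {timeout_s}s")
        sleep(interval_s)
```

Passing the fetcher and the sleep function as parameters keeps the loop testable without a running CodeGraph instance.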
MCP Tools
When CodeGraph runs as an MCP server (python -m src.mcp), AI agents can use CPG-aware tools:
| Tool | Description |
|---|---|
| codegraph_project_context | Project overview: CPG stats, languages, hotspots, security findings |
| codegraph_file_context | Per-file methods, metrics, callers, callees, dead code, security |
| codegraph_diff_context | Blast radius analysis for changed files |
| codegraph_query | Execute raw SQL against CPG DuckDB |
| codegraph_find_callers | Find all callers of a function |
| codegraph_find_callees | Find all functions called by a function |
| codegraph_taint_analysis | Trace data flows from sources to sinks |
OpenCode Integration
Add to opencode.json to use CodeGraph tools in OpenCode:
{
  "mcp": {
    "codegraph": {
      "type": "local",
      "command": ["python", "-m", "src.mcp", "--transport", "stdio"]
    }
  }
}
Incremental CPG Updates
When a push event is received via webhook, CodeGraph runs the full incremental update pipeline:
- Resolve project – match by project_id, repository URL, or active project
- GoCPG Update – incremental CPG update via gRPC/subprocess (from_commit → to_commit)
- ChromaDB Sync – update vector store collections for changed files
- WebSocket Notification – broadcast cpg.update.complete event to connected clients
Pipeline behavior:
- Duplicate push events (same project + commit SHA) are deduplicated
- Rapid pushes to the same project are coalesced (merged changed files, latest commit)
- GoCPG failure triggers automatic retry after 60 seconds
- ChromaDB sync continues even if GoCPG update fails
- Webhook always returns 202 Accepted immediately (processing is asynchronous)
Monitor pipeline status via GET /api/v1/webhooks/status/{project_id}.
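The deduplication and coalescing rules listed above can be illustrated with a small in-memory queue. This is a sketch of the general technique only and says nothing about how CPGUpdateQueue is actually built; the class names `PendingUpdate` and `SketchUpdateQueue` and the `submit`/`take` methods are invented for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, Optional, Set, Tuple


@dataclass
class PendingUpdate:
    project_id: str
    from_commit: str
    to_commit: str
    changed_files: Set[str] = field(default_factory=set)


class SketchUpdateQueue:
    """Hypothetical queue: drops duplicate pushes and merges rapid ones per project."""

    def __init__(self) -> None:
        self._seen: Set[Tuple[str, str]] = set()
        self._pending: Dict[str, PendingUpdate] = {}

    def submit(self, project_id: str, from_commit: str, to_commit: str,
               changed_files: Iterable[str]) -> bool:
        key = (project_id, to_commit)
        if key in self._seen:
            return False  # same project + commit SHA: deduplicated
        self._seen.add(key)
        pending = self._pending.get(project_id)
        if pending is None:
            self._pending[project_id] = PendingUpdate(
                project_id, from_commit, to_commit, set(changed_files))
        else:
            # Coalesce: keep the oldest from_commit, advance to the latest commit,
            # and merge the changed files of both pushes.
            pending.to_commit = to_commit
            pending.changed_files |= set(changed_files)
        return True

    def take(self, project_id: str) -> Optional[PendingUpdate]:
        """Hand the merged update to a worker and clear the pending slot."""
        return self._pending.pop(project_id, None)
```

With this shape, two quick pushes a→b and b→c to the same project are processed as one update a→c covering the union of their changed files.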
Docker Image Setup
Images are published to GHCR via .github/workflows/publish-ghcr.yml and are tagged with version tags (v*) and latest.
docker pull ghcr.io/mkhlsavin/codegraph:latest
docker run --rm ghcr.io/mkhlsavin/codegraph:latest python -m src.cli health
The image includes the GoCPG binary and supports 11 languages (C, C++, C#, Go, Java, JavaScript, Kotlin, PHP, Python, Ruby, TypeScript).
Troubleshooting
Webhook Signature Error (401/403)
401 Unauthorized – X-Hub-Signature-256 header is missing:
- Verify the webhook secret is configured in GitHub
- Check Content-Type is application/json
403 Forbidden – signature is invalid:
- Verify the webhook secret in GitHub matches config.yaml > github.webhook_secret
- Ensure proxy servers do not modify the request body
Webhook Returns 400
- Check that the payload is valid JSON
- For push events, ensure the commits array is present
CI Pipeline Cannot Reach CodeGraph
- CODEGRAPH_URL must be reachable from GitHub Actions runners
- Verify: curl -sf ${CODEGRAPH_URL}/api/v1/health
- For self-hosted runners: check firewall rules and VPN connectivity
SARIF Upload Fails
- Verify the file is valid SARIF format (use output_format: "sarif" in scan-diff request)
- GitHub Code Scanning requires the repository to have GitHub Advanced Security enabled (for private repos)
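Before uploading, a quick structural check can rule out the most common cause of a rejected file. The sketch below tests only a few properties required by the SARIF 2.1.0 specification (a `version` of "2.1.0", a `runs` array, and a `tool.driver.name` in each run); it is not a full schema validation, and the function name `sarif_problems` is made up for the example.

```python
from typing import Any, List


def sarif_problems(doc: Any) -> List[str]:
    """Return a list of structural problems; an empty list means the basics look fine."""
    if not isinstance(doc, dict):
        return ["top level is not a JSON object"]
    problems: List[str] = []
    if doc.get("version") != "2.1.0":
        problems.append("'version' must be \"2.1.0\"")
    runs = doc.get("runs")
    if not isinstance(runs, list):
        problems.append("'runs' must be an array")
        return problems
    for i, run in enumerate(runs):
        # Walk tool -> driver -> name defensively; any level may be missing.
        tool = run.get("tool") if isinstance(run, dict) else None
        driver = tool.get("driver") if isinstance(tool, dict) else None
        name = driver.get("name") if isinstance(driver, dict) else None
        if not name:
            problems.append(f"runs[{i}].tool.driver.name is missing")
    return problems
```

A typical use is to load security-results.sarif with `json.load` and fail the CI step early if the returned list is not empty.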
Incremental Update Shows “unavailable”
- Verify GoCPG is installed and accessible: gocpg --version
- Check that the GOCPG_PATH environment variable points to the GoCPG binary
- Review logs: grep "GoCPG" /var/log/codegraph/*.log
Next Steps
- GitLab Integration – GitLab webhook and CI integration
- SourceCraft Integration – Yandex SourceCraft integration
- GitVerse Integration – SberTech GitVerse integration
- Configuration – Full configuration reference
- REST API docs – Complete API documentation