GitHub Integration Guide

Integration guide for connecting CodeGraph with GitHub for automated code review, security scanning, and incremental CPG updates.

Overview

CodeGraph integrates with GitHub for automated code review, security scanning, and technical debt tracking. Three integration paths:

  • CI/CD Pipeline – CodeGraph as a step in GitHub Actions workflows
  • Webhook-driven – push/PR events trigger incremental CPG updates and reviews automatically
  • Standalone – deployed alongside GitHub, accessed via REST API / CLI / MCP

CodeGraph normalizes GitHub webhook payloads into the UnifiedPushEvent and UnifiedMREvent models, which are shared across all 4 supported platforms.

Prerequisites

  • CodeGraph instance (Docker or standalone) accessible from GitHub (or self-hosted runners)
  • GitHub repository with admin permissions (for webhook setup)
  • Docker (for CI pipeline steps)
  • LLM API key: YANDEX_API_KEY, GIGACHAT_AUTH_KEY, or OPENAI_API_KEY

Webhook Configuration

Endpoint: POST /api/v1/webhooks/github (returns 202 Accepted)

GitHub setup: go to Repository Settings > Webhooks > Add webhook, then set:

  1. Payload URL: https://<codegraph-host>/api/v1/webhooks/github
  2. Content type: application/json
  3. Secret: any string (used for HMAC-SHA256 verification; it must match github.webhook_secret in config.yaml)
  4. Events: choose “Let me select individual events”, then select Pushes and Pull requests

Signature Verification

CodeGraph verifies the X-Hub-Signature-256 header (HMAC-SHA256, format: sha256=<hex>).

Error codes:

  • 401 Unauthorized – missing X-Hub-Signature-256 header
  • 403 Forbidden – invalid signature (HMAC mismatch)
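
The check described above can be sketched in a few lines of Python. This is a minimal illustration, not CodeGraph's actual implementation: the function name verify_github_signature is hypothetical, and mapping a successful check to 202 simply mirrors the endpoint's documented response.

```python
import hashlib
import hmac
from typing import Optional


def verify_github_signature(secret: str, body: bytes, header: Optional[str]) -> int:
    """Return the HTTP status a receiver would answer with (hypothetical helper)."""
    if not header:
        return 401  # X-Hub-Signature-256 header is missing
    digest = hmac.new(secret.encode("utf-8"), body, hashlib.sha256).hexdigest()
    expected = ("sha256=" + digest).encode("utf-8")
    # compare_digest takes constant time, which avoids leaking the digest via timing
    if not hmac.compare_digest(expected, header.encode("utf-8")):
        return 403  # HMAC mismatch
    return 202
```

The HMAC is computed over the raw request bytes, so the body must be verified before any JSON parsing or re-serialization.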

Supported Events

Event Header (X-GitHub-Event)  Action
push                           Incremental CPG update via CPGUpdateQueue
pull_request                   Code review workflow (all actions: opened, synchronize, closed, merged, reopened)
ping                           Connection test (returns {"status": "accepted", "detail": {"message": "pong"}})

Events of any other type are not processed; the endpoint responds with {"status": "skipped"}.

Configuration

Section github in config.yaml:

github:
  webhook_secret: "your-github-webhook-secret"
  auto_update_on_push: true    # trigger incremental CPG update on push events

Event Type Detection

CodeGraph detects event type from the X-GitHub-Event header. If the header is missing (e.g., testing via curl), it falls back to payload structure detection:

  • pull_request key present → pull_request event
  • head_commit or (ref + commits) present → push event
  • zen key present → ping event
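
The detection order above can be expressed as a small function. This is a sketch under the assumption that the header, when present, is trusted as-is; detect_event_type is a hypothetical name, not a CodeGraph API.

```python
from typing import Any, Mapping, Optional


def detect_event_type(headers: Mapping[str, str], payload: Mapping[str, Any]) -> Optional[str]:
    """Prefer the X-GitHub-Event header, then fall back to payload structure."""
    # HTTP header names are case-insensitive, so normalize before the lookup
    normalized = {name.lower(): value for name, value in headers.items()}
    event = normalized.get("x-github-event")
    if event:
        return event
    if "pull_request" in payload:
        return "pull_request"
    if "head_commit" in payload or ("ref" in payload and "commits" in payload):
        return "push"
    if "zen" in payload:
        return "ping"
    return None  # unknown structure; a receiver would report it as skipped
```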

CI Pipeline (GitHub Actions)

Basic Workflow

Create .github/workflows/codegraph-review.yaml:

name: CodeGraph Review
on:
  pull_request:
    types: [opened, synchronize, reopened]

jobs:
  cpg-update:
    name: Build CPG
    runs-on: ubuntu-latest
    container:
      image: ghcr.io/mkhlsavin/codegraph:latest
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0
      - name: Build CPG
        run: |
          gocpg parse --input=. --output=/tmp/cpg.duckdb \
            --lang=${{ vars.PROJECT_LANGUAGE || 'python' }}

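  security-scan:
    name: Security Scan
    needs: cpg-update
    runs-on: ubuntu-latest
    permissions:
      contents: read          # needed by actions/checkout
      security-events: write  # needed by upload-sarif to publish findings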
    container:
      image: ghcr.io/mkhlsavin/codegraph:latest
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0
      - name: Diff-based security scan
        run: |
          DIFF=$(git diff ${{ github.event.pull_request.base.sha }}...HEAD)
          curl -sf -X POST "${CODEGRAPH_URL}/api/v1/security/scan-diff" \
            -H "Authorization: Bearer ${CODEGRAPH_TOKEN}" \
            -H "Content-Type: application/json" \
            -d "{\"diff_content\": $(echo "$DIFF" | python3 -c 'import sys,json; print(json.dumps(sys.stdin.read()))'), \"output_format\": \"sarif\"}" \
            -o security-results.sarif
        env:
          CODEGRAPH_URL: ${{ vars.CODEGRAPH_URL }}
          CODEGRAPH_TOKEN: ${{ secrets.CODEGRAPH_API_TOKEN }}
      - name: Upload SARIF
        if: always()
        uses: github/codeql-action/upload-sarif@v3
        with:
          sarif_file: security-results.sarif

Required Secrets/Variables

Name                 Type      Description
CODEGRAPH_URL        Variable  CodeGraph API endpoint (e.g. https://codegraph.example.com)
CODEGRAPH_API_TOKEN  Secret    CodeGraph API authentication token
PROJECT_LANGUAGE     Variable  Source language (default: python)

SARIF Integration

CodeGraph can output security findings in SARIF format, compatible with GitHub Code Scanning:

curl -sf -X POST "${CODEGRAPH_URL}/api/v1/security/scan-diff" \
  -H "Authorization: Bearer ${CODEGRAPH_TOKEN}" \
  -H "Content-Type: application/json" \
  -d '{"diff_content": "...", "output_format": "sarif"}' \
  -o security-results.sarif

Upload the SARIF file using github/codeql-action/upload-sarif@v3 to see findings in the GitHub Security tab.

REST API Endpoints

POST /api/v1/webhooks/github – Webhook Receiver

See Webhook Configuration.

POST /api/v1/security/scan-diff – Security Scan

Scans a raw unified diff for security vulnerabilities.

Request (ScanDiffRequest):

{
  "diff_content": "unified diff string",
  "output_format": "json"
}

Field          Type    Default  Description
diff_content   string  -        Raw unified diff (required)
output_format  string  "json"   "json" or "sarif"
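
Building the request body with a JSON library avoids hand-escaping the diff inside a shell string. The sketch below uses only the Python standard library; build_scan_diff_request is a hypothetical helper, and the Bearer authorization scheme is taken from the curl examples in this guide.

```python
import json
import urllib.request


def build_scan_diff_request(base_url: str, token: str, diff: str,
                            output_format: str = "json") -> urllib.request.Request:
    """Prepare (but do not send) a POST to /api/v1/security/scan-diff."""
    if output_format not in ("json", "sarif"):
        raise ValueError("output_format must be 'json' or 'sarif'")
    # json.dumps escapes quotes, newlines, and non-ASCII characters in the diff
    body = json.dumps({"diff_content": diff, "output_format": output_format})
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/api/v1/security/scan-diff",
        data=body.encode("utf-8"),
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )
```

The prepared request can then be sent with urllib.request.urlopen(request) and the response body written to a file such as security-results.sarif.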

Response (ScanDiffResponse):

{
  "findings": [
    {
      "finding_id": "f-001",
      "pattern_id": "CWE-89",
      "pattern_name": "SQL Injection",
      "category": "security",
      "severity": "high",
      "method_name": "execute_query",
      "filename": "src/db.py",
      "line_number": 42,
      "description": "User input concatenated into SQL query",
      "cwe_ids": ["CWE-89"],
      "confidence": 0.92
    }
  ],
  "sarif": {},
  "new_count": 1,
  "fixed_count": 0,
  "total_count": 1,
  "critical_count": 0,
  "high_count": 1,
  "medium_count": 0,
  "low_count": 0
}
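
A CI step might use the severity counters to decide whether to fail the build. The following sketch assumes the response shape shown above; the helper names and the choice of blocking severities are illustrative, not part of CodeGraph.

```python
from typing import List

# Assumption: critical and high findings block the build; adjust as needed
BLOCKING_COUNTERS = ("critical_count", "high_count")


def should_fail_build(response: dict, blocking=BLOCKING_COUNTERS) -> bool:
    """True if any blocking severity counter in the response is non-zero."""
    return any(response.get(counter, 0) > 0 for counter in blocking)


def format_findings(response: dict) -> List[str]:
    """One log line per finding, in file:line [severity] name form."""
    return [
        f"{f['filename']}:{f['line_number']} [{f['severity']}] {f['pattern_name']}"
        for f in response.get("findings", [])
    ]
```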

POST /api/v1/review/summary – PR Summary

Generates a structured summary from any unified diff (platform-agnostic).

{
  "diff_content": "unified diff string",
  "title": "Optional PR title",
  "description": "Optional PR description"
}

Response: summary, changed_files, additions, deletions, changed_methods, risk_areas.

POST /api/v1/review/commit-message – Commit Message

Generates a conventional commit message from a diff.

{
  "diff_content": "unified diff string"
}

Response: message, type, scope, files_changed.

POST /api/v1/review/pr – GitHub PR Review

Reviews a GitHub PR directly by fetching the diff from GitHub API (no need to pass raw diff).

{
  "owner": "org",
  "repo": "repo",
  "pr_number": 42,
  "task_description": "Optional review focus",
  "dod_items": ["No security issues", "Tests pass"]
}

Requires X-GitHub-Token header with a GitHub access token that has read access to the repository.

Response: recommendation, score, findings[], dod_validation[], summary, processing_time_ms, request_id.

GET /api/v1/webhooks/status/{project_id} – Update Status

Returns the latest CPG update pipeline status for a project.

{
  "project_id": "my-project",
  "last_update": "2026-03-09T12:00:00+00:00",
  "status": "completed",
  "commit_sha": "abc12345",
  "duration_ms": 1234,
  "gocpg_status": "completed",
  "chromadb_synced": 3,
  "error": null
}

Field            Type    Description
status           string  completed, failed, in_progress, or unknown
last_update      string  ISO 8601 timestamp of last completed update
duration_ms      int     Pipeline execution time in milliseconds
gocpg_status     string  GoCPG incremental update status
chromadb_synced  int     Number of ChromaDB collections updated
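
Because updates run asynchronously, a client may want to poll this endpoint until the status settles. A minimal sketch follows; wait_for_update is a hypothetical helper, the timeout and interval defaults are arbitrary, and treating completed and failed as the terminal states is an assumption based on the status values listed above.

```python
import time
from typing import Callable

TERMINAL_STATES = {"completed", "failed"}  # assumed terminal; others keep polling


def wait_for_update(fetch_status: Callable[[], dict],
                    timeout_s: float = 300.0,
                    interval_s: float = 5.0,
                    sleep: Callable[[float], None] = time.sleep) -> dict:
    """Call fetch_status() until it reports a terminal state or time runs out."""
    deadline = time.monotonic() + timeout_s
    while True:
        status = fetch_status()
        if status.get("status") in TERMINAL_STATES:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError("CPG update did not finish before the timeout")
        sleep(interval_s)
```

Here fetch_status is any callable that performs GET /api/v1/webhooks/status/{project_id} and returns the decoded JSON, which keeps the polling logic independent of the HTTP client.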

MCP Tools

When CodeGraph runs as an MCP server (python -m src.mcp), AI agents can use CPG-aware tools:

Tool                       Description
codegraph_project_context  Project overview: CPG stats, languages, hotspots, security findings
codegraph_file_context     Per-file methods, metrics, callers, callees, dead code, security
codegraph_diff_context     Blast radius analysis for changed files
codegraph_query            Execute raw SQL against CPG DuckDB
codegraph_find_callers     Find all callers of a function
codegraph_find_callees     Find all functions called by a function
codegraph_taint_analysis   Trace data flows from sources to sinks

OpenCode Integration

Add to opencode.json to use CodeGraph tools in OpenCode:

{
  "mcp": {
    "codegraph": {
      "type": "local",
      "command": ["python", "-m", "src.mcp", "--transport", "stdio"]
    }
  }
}

Incremental CPG Updates

When a push event is received via webhook, CodeGraph runs the full incremental update pipeline:

  1. Resolve project – match by project_id, repository URL, or active project
  2. GoCPG Update – incremental CPG update via gRPC/subprocess (from_commit → to_commit)
  3. ChromaDB Sync – update vector store collections for changed files
  4. WebSocket Notification – broadcast cpg.update.complete event to connected clients

Pipeline behavior:

  • Duplicate push events (same project + commit SHA) are deduplicated
  • Rapid pushes to the same project are coalesced (merged changed files, latest commit)
  • GoCPG failure triggers automatic retry after 60 seconds
  • ChromaDB sync continues even if GoCPG update fails
  • Webhook always returns 202 Accepted immediately (processing is asynchronous)
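
To make the deduplication and coalescing rules concrete, here is one way such a queue could behave. This is an illustrative sketch only: CoalescingQueue and PendingUpdate are hypothetical names, and CodeGraph's own CPGUpdateQueue may be implemented differently.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, Optional, Set, Tuple


@dataclass
class PendingUpdate:
    project_id: str
    commit_sha: str
    changed_files: Set[str] = field(default_factory=set)


class CoalescingQueue:
    """Keeps at most one pending update per project (hypothetical sketch)."""

    def __init__(self) -> None:
        self._pending: Dict[str, PendingUpdate] = {}
        self._seen: Set[Tuple[str, str]] = set()

    def enqueue(self, project_id: str, commit_sha: str, files: Iterable[str]) -> bool:
        key = (project_id, commit_sha)
        if key in self._seen:
            return False  # duplicate push: same project + commit SHA
        self._seen.add(key)
        pending = self._pending.get(project_id)
        if pending is None:
            self._pending[project_id] = PendingUpdate(project_id, commit_sha, set(files))
        else:
            # Coalesce: merge changed files and keep the latest commit
            pending.changed_files |= set(files)
            pending.commit_sha = commit_sha
        return True

    def pop(self, project_id: str) -> Optional[PendingUpdate]:
        return self._pending.pop(project_id, None)
```

With this shape, two quick pushes to the same project produce a single update that covers both sets of changed files.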

Monitor pipeline status via GET /api/v1/webhooks/status/{project_id}.

Docker Image Setup

Images are published to GHCR by .github/workflows/publish-ghcr.yml and tagged with the version tag (v*) and latest.

docker pull ghcr.io/mkhlsavin/codegraph:latest
docker run --rm ghcr.io/mkhlsavin/codegraph:latest python -m src.cli health

The image includes the GoCPG binary and supports 11 languages (C, C++, C#, Go, Java, JavaScript, Kotlin, PHP, Python, Ruby, TypeScript).

Troubleshooting

Webhook Signature Error (401/403)

401 Unauthorized – X-Hub-Signature-256 header is missing:

  • Verify the webhook secret is configured in GitHub
  • Check Content-Type is application/json

403 Forbidden – signature is invalid:

  • Verify the webhook secret in GitHub matches config.yaml > github.webhook_secret
  • Ensure proxy servers do not modify the request body

Webhook Returns 400

  • Check that the payload is valid JSON
  • For push events, ensure the commits array is present
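
Both checks can be run locally against a captured payload before resending it. The helper below is a hypothetical sketch of those two checks; the server's actual validation may be stricter.

```python
import json
from typing import List


def check_push_payload(raw: bytes) -> List[str]:
    """Return a list of problems found in a push payload (empty if none)."""
    try:
        payload = json.loads(raw)
    except ValueError as exc:  # covers JSONDecodeError and invalid UTF-8
        return [f"payload is not valid JSON: {exc}"]
    if not isinstance(payload, dict):
        return ["payload must be a JSON object"]
    if not isinstance(payload.get("commits"), list):
        return ["push payload is missing the commits array"]
    return []
```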

CI Pipeline Cannot Reach CodeGraph

  • CODEGRAPH_URL must be reachable from GitHub Actions runners
  • Verify: curl -sf ${CODEGRAPH_URL}/api/v1/health
  • For self-hosted runners: check firewall rules and VPN connectivity

SARIF Upload Fails

  • Verify the file is valid SARIF format (use output_format: "sarif" in scan-diff request)
  • GitHub Code Scanning requires the repository to have GitHub Advanced Security enabled (for private repos)

Incremental Update Shows “unavailable”

  • Verify GoCPG is installed and accessible: gocpg --version
  • Check that GOCPG_PATH environment variable points to the GoCPG binary
  • Review logs: grep "GoCPG" /var/log/codegraph/*.log

Next Steps