GitLab Integration Guide

Integration guide for connecting CodeGraph with GitLab (self-managed or GitLab.com) for automated code review, security scanning, and incremental CPG updates.

Overview

CodeGraph integrates with GitLab for automated code review, security scanning, and technical debt tracking. Three integration paths:

  • CI/CD Pipeline – CodeGraph as a step in .gitlab-ci.yml
  • Webhook-driven – push/MR events trigger incremental CPG updates and reviews automatically
  • Standalone – deployed alongside GitLab, accessed via REST API / CLI / MCP

GitLab webhook payloads use a GitLab-specific format (object_attributes for merge request data, project for repository info). CodeGraph normalizes them into unified models shared across all 4 supported platforms.

Prerequisites

  • CodeGraph instance (Docker or standalone) accessible from GitLab CI runners
  • GitLab project with Maintainer or Owner permissions (for webhook setup)
  • Docker (for CI pipeline steps)
  • LLM API key: YANDEX_API_KEY, GIGACHAT_AUTH_KEY, or OPENAI_API_KEY

Webhook Configuration

Endpoint: POST /api/v1/webhooks/gitlab (returns 202 Accepted)

GitLab setup: go to Project Settings > Webhooks > Add new webhook, then:

  1. URL: https://<codegraph-host>/api/v1/webhooks/gitlab
  2. Secret token: any string (plain token verification, not HMAC)
  3. Trigger: check Push events and Merge request events
  4. SSL verification: enable if CodeGraph uses a valid certificate

Token Verification

GitLab uses plain secret token verification (not HMAC-SHA256). CodeGraph compares the X-Gitlab-Token header against the configured secret using constant-time comparison.

Error codes:

  • 401 Unauthorized – missing X-Gitlab-Token header
  • 403 Forbidden – token does not match configured secret

Note: Unlike GitHub, which computes an HMAC-SHA256 signature over the request body, GitLab sends the secret token directly in the X-Gitlab-Token header. The request body is therefore not signed, and there is no timestamp header to protect against replay attacks. For additional security, use IP allowlisting or a reverse proxy with mTLS.
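
The token check described above can be sketched in a few lines of Python. This is a minimal illustration, not CodeGraph's actual handler: the function name verify_gitlab_token and the plain header mapping are assumptions made for the example. Only the header name, the status codes, and the constant-time comparison come from this guide.

```python
import hmac
from typing import Mapping


def verify_gitlab_token(headers: Mapping[str, str], configured_secret: str) -> int:
    """Return the HTTP status the webhook would answer with: 202, 401, or 403.

    Hypothetical helper for illustration; not part of the CodeGraph API.
    """
    # HTTP header names are case-insensitive, so normalise before the lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    received = lowered.get("x-gitlab-token")
    if received is None:
        return 401  # X-Gitlab-Token header is missing

    # compare_digest runs in time independent of where the inputs differ,
    # which avoids leaking the secret through response timing.
    if not hmac.compare_digest(received.encode("utf-8"), configured_secret.encode("utf-8")):
        return 403  # token does not match the configured secret
    return 202
```

Because the token is compared byte for byte, a trailing newline in either the GitLab setting or config.yaml is enough to produce a 403.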

Supported Events

| Event header (X-Gitlab-Event) | Action |
|---|---|
| Push Hook / push | Incremental CPG update via CPGUpdateQueue |
| Merge Request Hook / merge_request | Code review workflow |

Events that match neither type are acknowledged with {"status": "skipped"} and not processed further.

Configuration

Section gitlab in config.yaml:

gitlab:
  webhook_secret: "your-gitlab-webhook-token"
  auto_update_on_push: true    # trigger incremental CPG update on push events
  api_url: "https://gitlab.com"   # or your self-managed instance URL

Event Type Detection

CodeGraph detects event type from the X-Gitlab-Event header. If the header is missing, it falls back to payload structure detection:

  • object_attributes or merge_request key present → merge_request event
  • commits or (before + after) present → push event
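
The detection order above (header first, payload structure as fallback) might look like the following sketch. The function name detect_event_type and the HEADER_TO_EVENT table are hypothetical. Accepting both the long header value ("Push Hook") and the short form ("push") is an assumption based on the Supported Events table.

```python
from typing import Any, Mapping, Optional

# Assumed mapping from X-Gitlab-Event values to normalized event names.
HEADER_TO_EVENT = {
    "Push Hook": "push",
    "push": "push",
    "Merge Request Hook": "merge_request",
    "merge_request": "merge_request",
}


def detect_event_type(event_header: Optional[str], payload: Mapping[str, Any]) -> Optional[str]:
    """Return "push", "merge_request", or None when the event should be skipped."""
    if event_header:
        # A present but unsupported header (e.g. "Note Hook") is skipped outright.
        return HEADER_TO_EVENT.get(event_header)

    # Fallback: infer the event type from the payload structure.
    if "object_attributes" in payload or "merge_request" in payload:
        return "merge_request"
    if "commits" in payload or ("before" in payload and "after" in payload):
        return "push"
    return None
```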

Payload Structure

Push Hook payload (key fields used by CodeGraph):

{
  "ref": "refs/heads/main",
  "before": "old_commit_sha",
  "after": "new_commit_sha",
  "project": {
    "git_http_url": "https://gitlab.com/org/repo.git",
    "path_with_namespace": "org/repo"
  },
  "user_name": "developer",
  "commits": [
    {
      "id": "new_commit_sha",
      "message": "feat: add feature",
      "author": {"name": "Developer"},
      "added": ["src/new.py"],
      "modified": ["src/existing.py"],
      "removed": []
    }
  ]
}
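
As an illustration of how these fields could feed an incremental update, the sketch below reduces a push payload to a branch name, a commit range, and the set of touched files. The helper summarize_push and its return shape are hypothetical. It also assumes the commits array is ordered oldest first, so that a file added in one commit and removed in a later one ends up as removed.

```python
from typing import Any, Dict, Mapping, Set


def summarize_push(payload: Mapping[str, Any]) -> Dict[str, Any]:
    """Collapse a GitLab push payload into the fields an incremental update needs."""
    ref = payload.get("ref", "")
    prefix = "refs/heads/"
    branch = ref[len(prefix):] if ref.startswith(prefix) else ref

    changed: Set[str] = set()
    removed: Set[str] = set()
    for commit in payload.get("commits", []):
        # Later commits override earlier ones for the same path.
        for path in commit.get("added", []) + commit.get("modified", []):
            changed.add(path)
            removed.discard(path)
        for path in commit.get("removed", []):
            removed.add(path)
            changed.discard(path)

    return {
        "repo": payload.get("project", {}).get("path_with_namespace", ""),
        "branch": branch,
        "from_commit": payload.get("before", ""),
        "to_commit": payload.get("after", ""),
        "changed_files": sorted(changed),
        "removed_files": sorted(removed),
    }
```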

Merge Request Hook payload (key fields):

{
  "object_attributes": {
    "iid": 15,
    "action": "open",
    "source_branch": "feature-x",
    "target_branch": "main",
    "title": "Add feature X"
  },
  "user": {"username": "developer"},
  "project": {
    "git_http_url": "https://gitlab.com/org/repo.git",
    "path_with_namespace": "org/repo"
  }
}

Action mapping: open → opened, update → updated, close → closed, merge → merged, reopen → reopened.
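
A sketch of how the merge request payload and the action mapping could be normalized into a single record. The MergeRequestEvent dataclass and its field names are hypothetical stand-ins for CodeGraph's unified models, which this guide does not show. Only the payload keys and the action mapping are taken from the text above.

```python
from dataclasses import dataclass
from typing import Any, Mapping

ACTION_MAP = {
    "open": "opened",
    "update": "updated",
    "close": "closed",
    "merge": "merged",
    "reopen": "reopened",
}


@dataclass(frozen=True)
class MergeRequestEvent:
    """Hypothetical platform-neutral merge request event."""
    repo: str
    clone_url: str
    number: int
    action: str
    source_branch: str
    target_branch: str
    title: str
    author: str


def normalize_merge_request(payload: Mapping[str, Any]) -> MergeRequestEvent:
    attrs = payload["object_attributes"]
    project = payload["project"]
    raw_action = attrs.get("action", "")
    return MergeRequestEvent(
        repo=project["path_with_namespace"],
        clone_url=project["git_http_url"],
        number=attrs["iid"],
        # Unknown actions pass through unchanged instead of raising.
        action=ACTION_MAP.get(raw_action, raw_action),
        source_branch=attrs["source_branch"],
        target_branch=attrs["target_branch"],
        title=attrs.get("title", ""),
        author=payload.get("user", {}).get("username", ""),
    )
```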

CI Pipeline (.gitlab-ci.yml)

Basic Pipeline

Create .gitlab-ci.yml:

stages:
  - analyze
  - review
  - report

variables:
  PROJECT_LANGUAGE: python

cpg-update:
  stage: analyze
  image: ghcr.io/mkhlsavin/codegraph:latest
  script:
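    # GitLab only uploads artifacts located inside the project directory
    # ($CI_PROJECT_DIR), so the CPG is written there rather than to /tmp.
    - gocpg parse --input=. --output=cpg.duckdb --lang=${PROJECT_LANGUAGE}
  artifacts:
    paths:
      - cpg.duckdb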
    expire_in: 1 hour
  only:
    - merge_requests
    - main

security-scan:
  stage: review
  image: ghcr.io/mkhlsavin/codegraph:latest
  needs: [cpg-update]
  script:
    - |
      # Build the JSON body with json.dump so the diff is escaped correctly,
      # then send it from a file to avoid shell argument-length limits.
      git diff ${CI_MERGE_REQUEST_DIFF_BASE_SHA}...HEAD \
        | python3 -c 'import sys,json; json.dump({"diff_content": sys.stdin.read(), "output_format": "sarif"}, sys.stdout)' \
        > scan-request.json
      curl -sf -X POST "${CODEGRAPH_URL}/api/v1/security/scan-diff" \
        -H "Authorization: Bearer ${CODEGRAPH_TOKEN}" \
        -H "Content-Type: application/json" \
        --data-binary @scan-request.json \
        -o gl-sast-report.json
  artifacts:
    reports:
      sast: gl-sast-report.json
  only:
    - merge_requests

mr-review:
  stage: review
  image: ghcr.io/mkhlsavin/codegraph:latest
  needs: [cpg-update]
  script:
    - |
      # Encode both the diff and the MR title with json.dump; a title containing
      # quotes or backslashes would otherwise produce invalid JSON.
      git diff ${CI_MERGE_REQUEST_DIFF_BASE_SHA}...HEAD \
        | python3 -c 'import os,sys,json; json.dump({"diff_content": sys.stdin.read(), "title": os.environ.get("CI_MERGE_REQUEST_TITLE", "")}, sys.stdout)' \
        > review-request.json
      curl -sf -X POST "${CODEGRAPH_URL}/api/v1/review/summary" \
        -H "Authorization: Bearer ${CODEGRAPH_TOKEN}" \
        -H "Content-Type: application/json" \
        --data-binary @review-request.json \
        -o review-results.json
      cat review-results.json
  only:
    - merge_requests

report:
  stage: report
  image: ghcr.io/mkhlsavin/codegraph:latest
  needs: [security-scan, mr-review]
  script:
    - echo "Security scan and review completed"
    - "[ -f gl-sast-report.json ] && cat gl-sast-report.json | python3 -c \"import sys,json; r=json.load(sys.stdin); print(f'Findings: {r.get(\\\"total_count\\\", 0)}')\""
  only:
    - merge_requests

Required CI/CD Variables

Configure in Settings > CI/CD > Variables:

| Name | Type | Description |
|---|---|---|
| CODEGRAPH_URL | Variable | CodeGraph API endpoint (e.g. https://codegraph.example.com) |
| CODEGRAPH_TOKEN | Secret (masked) | CodeGraph API authentication token |
| PROJECT_LANGUAGE | Variable | Source language (default: python) |

SAST Integration

CodeGraph’s SARIF output integrates with GitLab Security Dashboard. Use the artifacts:reports:sast key to upload findings:

artifacts:
  reports:
    sast: gl-sast-report.json

Findings appear in Security & Compliance > Vulnerability Report (GitLab Ultimate) or in the MR widget (all tiers for SAST).

REST API Endpoints

POST /api/v1/webhooks/gitlab – Webhook Receiver

See Webhook Configuration.

POST /api/v1/security/scan-diff – Security Scan

Scans a raw unified diff for security vulnerabilities.

Request (ScanDiffRequest):

{
  "diff_content": "unified diff string",
  "output_format": "sarif"
}

| Field | Type | Default | Description |
|---|---|---|---|
| diff_content | string | | Raw unified diff (required) |
| output_format | string | "json" | "json" or "sarif" |
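
For callers outside CI, the request can also be built with the Python standard library. The sketch below only constructs the request object. The helper name build_scan_diff_request is hypothetical, and the Bearer authorization header is assumed to match the one used in the pipeline examples.

```python
import json
import urllib.request


def build_scan_diff_request(
    base_url: str, token: str, diff_text: str, output_format: str = "json"
) -> urllib.request.Request:
    """Build (but do not send) a POST request for /api/v1/security/scan-diff."""
    if output_format not in ("json", "sarif"):
        raise ValueError(f"unsupported output_format: {output_format!r}")

    # json.dumps handles quotes, newlines, and backslashes inside the diff.
    body = json.dumps({"diff_content": diff_text, "output_format": output_format})
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/api/v1/security/scan-diff",
        data=body.encode("utf-8"),
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )
```

To send it, pass the request to urllib.request.urlopen and decode the response body with json.load.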

Response (ScanDiffResponse):

{
  "findings": [
    {
      "finding_id": "f-001",
      "pattern_id": "CWE-89",
      "pattern_name": "SQL Injection",
      "category": "security",
      "severity": "high",
      "method_name": "execute_query",
      "filename": "src/db.py",
      "line_number": 42,
      "description": "User input concatenated into SQL query",
      "cwe_ids": ["CWE-89"],
      "confidence": 0.92
    }
  ],
  "sarif": {},
  "new_count": 1,
  "fixed_count": 0,
  "total_count": 1,
  "critical_count": 0,
  "high_count": 1,
  "medium_count": 0,
  "low_count": 0
}
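
The severity counters in the response make it straightforward to gate a pipeline on the scan result. The following sketch is one possible policy, not something CodeGraph enforces: the function should_block_merge and the default threshold of critical plus high are assumptions.

```python
from typing import Any, Mapping, Sequence


def should_block_merge(
    response: Mapping[str, Any], fail_on: Sequence[str] = ("critical", "high")
) -> bool:
    """Return True when any severity level listed in fail_on has findings."""
    # Missing counters are treated as zero so a partial response does not raise.
    return any(int(response.get(f"{level}_count", 0)) > 0 for level in fail_on)
```

In a CI job, the result can be turned into an exit code, for example sys.exit(1) when the function returns True.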

POST /api/v1/review/summary – MR Summary

Generates a structured summary from any unified diff (platform-agnostic).

{
  "diff_content": "unified diff string",
  "title": "Optional MR title",
  "description": "Optional MR description"
}

Response: summary, changed_files, additions, deletions, changed_methods, risk_areas.

POST /api/v1/review/commit-message – Commit Message

Generates a conventional commit message from a diff.

{
  "diff_content": "unified diff string"
}

Response: message, type, scope, files_changed.

POST /api/v1/review/mr – GitLab MR Review

Reviews a GitLab MR directly by fetching the diff from the GitLab API, so the caller does not need to pass a raw diff.

{
  "project_id": "org/repo",
  "mr_iid": 15,
  "gitlab_url": "https://gitlab.com",
  "task_description": "Optional review focus",
  "dod_items": ["No security issues", "Tests pass"]
}

Response: recommendation, score, findings[], dod_validation[], summary, processing_time_ms, request_id.

GET /api/v1/webhooks/status/{project_id} – Update Status

Returns the latest CPG update pipeline status for a project.

{
  "project_id": "org/repo",
  "last_update": "2026-03-09T12:00:00+00:00",
  "status": "completed",
  "commit_sha": "ddd444",
  "duration_ms": 2345,
  "gocpg_status": "completed",
  "chromadb_synced": 3,
  "error": null
}

| Field | Type | Description |
|---|---|---|
| status | string | completed, failed, in_progress, or unknown |
| last_update | string | ISO 8601 timestamp of last completed update |
| duration_ms | int | Pipeline execution time in milliseconds |
| gocpg_status | string | GoCPG incremental update status |
| chromadb_synced | int | Number of ChromaDB collections updated |
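
Because webhook processing is asynchronous, a client that needs the updated CPG may want to poll this endpoint. The sketch below shows one way to do that. The function wait_for_update is hypothetical and takes the HTTP call as an injected fetch_status callable, so the polling logic stays independent of any HTTP library. Treating only completed and failed as terminal states (and continuing to poll on unknown) is an assumption.

```python
import time
from typing import Any, Callable, Mapping

TERMINAL_STATUSES = {"completed", "failed"}


def wait_for_update(
    fetch_status: Callable[[], Mapping[str, Any]],
    timeout_s: float = 120.0,
    interval_s: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> Mapping[str, Any]:
    """Poll fetch_status until the update finishes or timeout_s elapses."""
    deadline = clock() + timeout_s
    while True:
        status = fetch_status()
        if status.get("status") in TERMINAL_STATUSES:
            return status
        if clock() >= deadline:
            raise TimeoutError(
                f"CPG update still {status.get('status')!r} after {timeout_s}s"
            )
        sleep(interval_s)
```

Here fetch_status would typically wrap a GET to /api/v1/webhooks/status/{project_id} and return the decoded JSON body.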

MCP Tools

When CodeGraph runs as an MCP server (python -m src.mcp), AI agents can use CPG-aware tools:

| Tool | Description |
|---|---|
| codegraph_project_context | Project overview: CPG stats, languages, hotspots, security findings |
| codegraph_file_context | Per-file methods, metrics, callers, callees, dead code, security |
| codegraph_diff_context | Blast radius analysis for changed files |
| codegraph_query | Execute raw SQL against CPG DuckDB |
| codegraph_find_callers | Find all callers of a function |
| codegraph_find_callees | Find all functions called by a function |
| codegraph_taint_analysis | Trace data flows from sources to sinks |

Incremental CPG Updates

When a push event is received via webhook, CodeGraph runs the full incremental update pipeline:

  1. Resolve project – match by project_id, repository URL (git_http_url), or active project
  2. GoCPG Update – incremental CPG update via gRPC/subprocess (from_commit → to_commit)
  3. ChromaDB Sync – update vector store collections for changed files
  4. WebSocket Notification – broadcast cpg.update.complete event to connected clients

Pipeline behavior:

  • Duplicate push events (same project + commit SHA) are deduplicated
  • Rapid pushes to the same project are coalesced (merged changed files, latest commit)
  • GoCPG failure triggers automatic retry after 60 seconds
  • ChromaDB sync continues even if GoCPG update fails
  • Webhook always returns 202 Accepted immediately (processing is asynchronous)

Monitor pipeline status via GET /api/v1/webhooks/status/{project_id}.
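
The deduplication and coalescing rules for push events can be illustrated with a small in-memory queue. This is a simplified sketch under stated assumptions, not the real CPGUpdateQueue: the class UpdateQueueSketch, the PendingUpdate record, and their methods are hypothetical, and retries, persistence, and locking are left out.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, Optional, Set, Tuple


@dataclass
class PendingUpdate:
    project_id: str
    from_commit: str
    to_commit: str
    changed_files: Set[str] = field(default_factory=set)


class UpdateQueueSketch:
    """Hypothetical queue: one pending update per project, duplicates dropped."""

    def __init__(self) -> None:
        self._pending: Dict[str, PendingUpdate] = {}
        self._seen: Set[Tuple[str, str]] = set()

    def enqueue(self, project_id: str, before: str, after: str, files: Iterable[str]) -> bool:
        """Return False for a duplicate (same project + commit SHA), else True."""
        key = (project_id, after)
        if key in self._seen:
            return False
        self._seen.add(key)

        pending = self._pending.get(project_id)
        if pending is None:
            self._pending[project_id] = PendingUpdate(project_id, before, after, set(files))
        else:
            # Coalesce: keep the earliest base commit, move to the latest head,
            # and merge the changed files of both pushes.
            pending.to_commit = after
            pending.changed_files.update(files)
        return True

    def pop(self, project_id: str) -> Optional[PendingUpdate]:
        """Hand the coalesced update to a worker, or None if nothing is pending."""
        return self._pending.pop(project_id, None)
```

With this shape, two pushes a → b and b → c that arrive before a worker runs result in a single update from a to c covering the files of both pushes.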

Docker Image Setup

Images are published to GHCR via .github/workflows/publish-ghcr.yml and tagged with the version tag (v*) and latest.

docker pull ghcr.io/mkhlsavin/codegraph:latest
docker run --rm ghcr.io/mkhlsavin/codegraph:latest python -m src.cli health

The image includes the GoCPG binary and supports 11 languages (C, C++, C#, Go, Java, JavaScript, Kotlin, PHP, Python, Ruby, TypeScript).

For self-managed GitLab with a private container registry:

docker tag ghcr.io/mkhlsavin/codegraph:latest registry.company.com/codegraph:latest
docker push registry.company.com/codegraph:latest

Then reference registry.company.com/codegraph:latest in .gitlab-ci.yml.

Troubleshooting

Webhook Token Error (401/403)

401 Unauthorized – X-Gitlab-Token header is missing:

  • Verify the secret token is configured in GitLab webhook settings
  • Ensure the webhook URL is correct (/api/v1/webhooks/gitlab, not /api/v1/webhooks/github)

403 Forbidden – token does not match:

  • Verify the token in GitLab matches config.yaml > gitlab.webhook_secret
  • Tokens are compared using constant-time comparison; check for trailing whitespace or newlines

Webhook Returns 400

  • Check that the payload is valid JSON
  • Ensure Content-Type is application/json (GitLab sets this by default)

CI Pipeline Cannot Reach CodeGraph

  • CODEGRAPH_URL must be reachable from GitLab CI runners
  • Verify: curl -sf ${CODEGRAPH_URL}/api/v1/health
  • For self-managed GitLab: check network rules between runner and CodeGraph
  • For shared runners (GitLab.com): CodeGraph must be publicly accessible or use a tunnel

SAST Report Not Appearing

  • Verify the artifact path matches: artifacts:reports:sast: gl-sast-report.json
  • Use output_format: "sarif" in the scan-diff request
  • SAST reports require the MR to be open (not merged)

MR Review Returns Empty Results

  • Verify CI_MERGE_REQUEST_DIFF_BASE_SHA is available (requires merge_requests trigger)
  • Check that GIT_DEPTH: "0" is set in variables: for full git history (GitLab CI equivalent of fetch-depth: 0)
  • Test API token: curl -sf ${CODEGRAPH_URL}/api/v1/health -H "Authorization: Bearer ${CODEGRAPH_TOKEN}"

Self-Managed GitLab Certificate Issues

If CodeGraph cannot reach a self-managed GitLab instance with a self-signed certificate:

  • Mount the CA certificate into the CodeGraph container
  • Set SSL_CERT_FILE environment variable to the CA bundle path

Incremental Update Shows “unavailable”

  • Verify GoCPG is installed and accessible: gocpg --version
  • Check that GOCPG_PATH environment variable points to the GoCPG binary
  • Review logs: grep "GoCPG" /var/log/codegraph/*.log

Next Steps