Enterprise Security Brief¶
Document for CISOs, Security Executives, and Security Architects
Table of Contents¶
- Executive Summary
- Key Security Capabilities
- 1. Data Residency and Sovereignty
- Data Processing Principles
- Data Security Guarantees
- Compliance
- 2. Access Control and Authentication
- RBAC Role Hierarchy
- Permission Matrix (21 Permissions)
- Authentication Methods
- 3. Data Loss Prevention (DLP)
- DLP Scanning Architecture
- Detection Categories (25+ Patterns)
- DLP Actions
- 4. SIEM Integration
- Supported Formats
- Security Event Types
- Delivery Architecture
- 5. Secrets Management
- HashiCorp Vault Integration
- Managed Secrets
- Security
- 6. Advanced Analysis Capabilities
- Competitive Advantages
- Contact
Executive Summary¶
CodeGraph is a static code analysis platform built on the Code Property Graph (CPG) and designed for enterprise security requirements. The platform keeps source code under the organization's control, integrates with existing enterprise security infrastructure (identity providers, SIEM, secrets management), and supports regulatory compliance.
Key Security Capabilities¶
| Capability | Description |
|---|---|
| On-Premise Deployment | Code never leaves your infrastructure |
| GigaChat + Yandex AI Studio | Russian LLM providers (GigaChat; Yandex AI Studio serving Qwen3-235B and YandexGPT); only queries are transmitted, not code |
| RBAC | 4 roles, 21 permissions, granular access control |
| DLP | 25+ patterns for sensitive data detection |
| SIEM | Integration with Syslog, ArcSight (CEF), QRadar (LEEF) |
| Vault | HashiCorp Vault for secrets management |
| Structural Pattern Engine | 190 YAML rules with cross-language pattern matching |
| SARIF 2.1.0 Export | Standard interchange format for security findings |
| LLM Autofix | AI-generated remediation suggestions for detected vulnerabilities |
1. Data Residency and Sovereignty¶
Data Processing Principles¶
┌─────────────────────────────────────────────────────────────────┐
│ ORGANIZATION PERIMETER │
│ │
│ [Source Code] ──► [CPG Analysis] ──► [DuckDB Storage] │
│ │ │
│ Code remains │
│ inside perimeter │
│ │ │
│ [User] ──► [RAG Query] ──► [DLP Scanner] ──┐ │
│ │ │
└──────────────────────────────────────────────│──────────────────┘
│
Only NL queries ─────►│
(no source code) │
▼
[GigaChat / Yandex AI Studio API]
Data Security Guarantees¶
| Aspect | Implementation |
|---|---|
| Code Storage | Local only (DuckDB, file system) |
| Data Transmission | Source code is never sent to external systems |
| LLM Integration | GigaChat and Yandex AI Studio receive only the user's text queries |
| Air-Gapped Mode | Support for fully isolated environments with local LLMs |
Compliance¶
- 152-FZ — Processing of Russian citizens’ personal data within Russia
- GOST R 57580 — Information protection in financial organizations
- FSTEC Requirements — Deployment capability in certified infrastructure
- GDPR — Data minimization principles (code not transmitted)
- SOX — Complete audit trail for compliance
2. Access Control and Authentication¶
RBAC Role Hierarchy¶
┌─────────────┐
│ ADMIN │ ← Full access (admin:all)
│ (Level 4) │
└──────┬──────┘
│
┌──────▼──────┐
│ REVIEWER │ ← Code review + Analyst capabilities
│ (Level 3) │ (GitHub/GitLab integration)
└──────┬──────┘
│
┌──────▼──────┐
│ ANALYST │ ← Query execution + sessions
│ (Level 2) │ (API keys, export)
└──────┬──────┘
│
┌──────▼──────┐
│ VIEWER │ ← Read-only access
│ (Level 1) │ (scenarios, history, stats)
└─────────────┘
Permission Matrix (21 Permissions)¶
| Category | Permissions | VIEWER | ANALYST | REVIEWER | ADMIN |
|---|---|---|---|---|---|
| Scenarios | scenarios:read | ✓ | ✓ | ✓ | ✓ |
| | scenarios:execute | | ✓ | ✓ | ✓ |
| Queries | query:execute | | ✓ | ✓ | ✓ |
| | query:validate | | ✓ | ✓ | ✓ |
| Review | review:execute | | | ✓ | ✓ |
| | review:github | | | ✓ | ✓ |
| | review:gitlab | | | ✓ | ✓ |
| Sessions | sessions:read | ✓ | ✓ | ✓ | ✓ |
| | sessions:write | | ✓ | ✓ | ✓ |
| | sessions:delete | | ✓ | ✓ | ✓ |
| History | history:read | ✓ | ✓ | ✓ | ✓ |
| | history:export | | ✓ | ✓ | ✓ |
| Users | users:read | | | | ✓ |
| | users:write | | | | ✓ |
| | users:delete | | | | ✓ |
| API Keys | api_keys:read | | ✓ | ✓ | ✓ |
| | api_keys:write | | ✓ | ✓ | ✓ |
| | api_keys:delete | | | | ✓ |
| Metrics | stats:read | ✓ | ✓ | ✓ | ✓ |
| | metrics:read | | | | ✓ |
| Admin | admin:all | | | | ✓ |
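The hierarchy above can be read as a "minimum level" check. The following is a minimal sketch, assuming each role inherits everything granted to the levels below it, as the hierarchy diagram suggests. The names `Role`, `MIN_ROLE`, and `has_permission` are hypothetical and only a handful of the 21 permissions are listed.

```python
from enum import IntEnum


class Role(IntEnum):
    """Role levels as shown in the hierarchy diagram."""
    VIEWER = 1
    ANALYST = 2
    REVIEWER = 3
    ADMIN = 4


# Lowest role that holds each permission (illustrative subset, not the full matrix).
MIN_ROLE = {
    "scenarios:read": Role.VIEWER,
    "query:execute": Role.ANALYST,
    "review:github": Role.REVIEWER,
    "users:write": Role.ADMIN,
}


def has_permission(role: Role, permission: str) -> bool:
    """Return True if `role` is at or above the minimum level for `permission`."""
    required = MIN_ROLE.get(permission)
    if required is None:
        return False  # unknown permissions are denied by default (assumption)
    return role >= required
```

Denying unknown permissions by default is a design choice in this sketch; a real deployment may instead treat `admin:all` as a wildcard.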
Authentication Methods¶
| Method | Description | Status |
|---|---|---|
| JWT Bearer | Access tokens (30 min) + Refresh tokens (7 days) | ✅ Implemented |
| API Keys | SHA-256 hashing, expiration, revocation | ✅ Implemented |
| OAuth2/OIDC | GitHub, GitLab, Google, Keycloak | ✅ Integration-ready |
| LDAP/AD | Group sync, SSO | ✅ Integration-ready |
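To illustrate the API key row, here is a rough sketch of SHA-256 key hashing with expiration and revocation checks, using only the Python standard library. The function names and the shape of the stored record are assumptions; the brief does not describe the actual storage schema.

```python
import hashlib
import hmac
import secrets
from datetime import datetime, timedelta, timezone
from typing import Optional, Tuple


def generate_api_key() -> Tuple[str, str]:
    """Return (plaintext_key, sha256_hex_digest); only the digest would be stored."""
    key = secrets.token_urlsafe(32)
    digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
    return key, digest


def verify_api_key(
    presented: str,
    stored_digest: str,
    expires_at: datetime,
    revoked: bool,
    now: Optional[datetime] = None,
) -> bool:
    """Check revocation and expiry first, then compare digests in constant time."""
    if revoked:
        return False
    now = now or datetime.now(timezone.utc)
    if now >= expires_at:
        return False
    candidate = hashlib.sha256(presented.encode("utf-8")).hexdigest()
    return hmac.compare_digest(candidate, stored_digest)
```

Storing only the digest means a database leak does not expose usable keys, and `hmac.compare_digest` avoids timing differences during comparison.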
3. Data Loss Prevention (DLP)¶
DLP Scanning Architecture¶
┌──────────────────────────────────────────────────────────────┐
│ USER REQUEST │
└──────────────────────────┬───────────────────────────────────┘
│
▼
┌──────────────────────────────────────────────────────────────┐
│ PRE-REQUEST SCANNING │
│ ┌────────────────────────────────────────────────────────┐ │
│ │ Patterns: API keys, passwords, tokens, PII, paths │ │
│ └────────────────────────────────────────────────────────┘ │
│ │ │
│ ┌──────────────┼──────────────┐ │
│ ▼ ▼ ▼ │
│ [BLOCK] [MASK] [WARN/LOG] │
│ Reject Mask Send with │
│ request data logging │
└──────────────────────────────────────────────────────────────┘
│
▼
[GigaChat / Yandex AI Studio API]
│
▼
┌──────────────────────────────────────────────────────────────┐
│ POST-RESPONSE SCANNING │
│ ┌────────────────────────────────────────────────────────┐ │
│ │ Mask sensitive data in LLM response before display │ │
│ └────────────────────────────────────────────────────────┘ │
└──────────────────────────────────────────────────────────────┘
Detection Categories (25+ Patterns)¶
| Category | Severity | Example Patterns |
|---|---|---|
| Credentials | HIGH | AWS keys (AKIA...), GitHub tokens (ghp_...), passwords, JWT, RSA/EC private keys |
| Personal Data | MEDIUM | Email, phone numbers (RU/US), passport numbers, SSN, credit cards |
| Source Code | LOW | Database connection strings, internal Unix/Windows paths |
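A pattern-based detector along these lines could look roughly like the sketch below. The four regular expressions are illustrative stand-ins, not the product's actual rule set, and the names `Finding`, `PATTERNS`, and `scan` are hypothetical.

```python
import re
from typing import List, NamedTuple


class Finding(NamedTuple):
    category: str
    severity: str
    name: str
    start: int
    end: int


# Illustrative patterns only; the real 25+ rules are not listed in this brief.
PATTERNS = [
    ("Credentials", "HIGH", "aws_access_key", re.compile(r"\bAKIA[0-9A-Z]{16}\b")),
    ("Credentials", "HIGH", "github_token", re.compile(r"\bghp_[A-Za-z0-9]{36}\b")),
    ("Personal Data", "MEDIUM", "email", re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b")),
    ("Source Code", "LOW", "unix_path", re.compile(r"(?<![\w/])/(?:etc|home|var|opt)/[\w./-]+")),
]


def scan(text: str) -> List[Finding]:
    """Return one Finding per pattern match, with character offsets into `text`."""
    findings = []
    for category, severity, name, pattern in PATTERNS:
        for match in pattern.finditer(text):
            findings.append(Finding(category, severity, name, match.start(), match.end()))
    return findings
```

Keeping the match offsets makes it possible to mask the exact span later without re-running the patterns.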
DLP Actions¶
| Action | Priority | Description |
|---|---|---|
| BLOCK | 4 (max) | Complete request blocking |
| MASK | 3 | Replace sensitive data with [REDACTED] |
| WARN | 2 | Warning + SIEM event dispatch |
| LOG_ONLY | 1 | Audit logging only |
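When several rules match one request, the priorities above suggest that the highest-priority action wins. A minimal sketch of that resolution, plus the `[REDACTED]` masking step, is shown here; `resolve` and `apply_action` are hypothetical names, and the sketch assumes the matched spans do not overlap.

```python
from enum import IntEnum
from typing import Iterable, List, Optional, Tuple


class Action(IntEnum):
    """DLP actions with the priorities listed in the table."""
    LOG_ONLY = 1
    WARN = 2
    MASK = 3
    BLOCK = 4


def resolve(actions: Iterable[Action]) -> Optional[Action]:
    """Pick the highest-priority action among all matched rules (None if no match)."""
    return max(actions, default=None)


def apply_action(text: str, spans: List[Tuple[int, int]], action: Action) -> Optional[str]:
    """Return the text to forward to the LLM, or None when the request is blocked."""
    if action is Action.BLOCK:
        return None
    if action is Action.MASK:
        # Replace from the end so earlier offsets stay valid.
        for start, end in sorted(spans, reverse=True):
            text = text[:start] + "[REDACTED]" + text[end:]
    return text  # WARN and LOG_ONLY forward the text unchanged
```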
4. SIEM Integration¶
Supported Formats¶
| Format | Compatible Systems | Description |
|---|---|---|
| Syslog (RFC 5424) | Splunk, Graylog, rsyslog | Standard Unix format |
| CEF | ArcSight, Splunk | Common Event Format |
| LEEF | IBM QRadar | Log Event Extended Format |
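As an illustration of the CEF row, the sketch below builds a CEF line with the standard header layout (`CEF:0|Vendor|Product|Version|Event Class ID|Name|Severity|Extension`) and the usual escaping rules. The vendor, product, and version strings and the mapping from this document's severities to the CEF 0-10 scale are assumptions.

```python
from typing import Dict

# Assumed mapping from event severities to the CEF 0-10 scale.
CEF_SEVERITY = {"INFO": 3, "WARNING": 6, "ERROR": 8, "CRITICAL": 10}


def _escape_header(value: str) -> str:
    """Header fields escape backslash and pipe."""
    return value.replace("\\", "\\\\").replace("|", "\\|")


def _escape_extension(value: str) -> str:
    """Extension values escape backslash, equals sign, and newline."""
    return value.replace("\\", "\\\\").replace("=", "\\=").replace("\n", "\\n")


def to_cef(
    event_type: str,
    name: str,
    severity: str,
    extensions: Dict[str, str],
    vendor: str = "CodeGraph",   # assumed vendor string
    product: str = "CodeGraph",  # assumed product string
    version: str = "1.1",        # assumed version string
) -> str:
    """Format one security event as a single CEF line."""
    header_fields = ["CEF:0", vendor, product, version, event_type, name,
                     str(CEF_SEVERITY[severity])]
    header = "|".join(_escape_header(field) for field in header_fields)
    ext = " ".join(f"{key}={_escape_extension(str(val))}" for key, val in extensions.items())
    return f"{header}|{ext}"
```

For example, `to_cef("DLP_BLOCK", "Request blocked by DLP", "CRITICAL", {"suser": "alice"})` yields a line whose severity field is `10` under the assumed mapping.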
Security Event Types¶
EVENT_TYPE SEVERITY DESCRIPTION
────────────────────────────────────────────────────────────
LLM_REQUEST INFO Request to LLM provider
LLM_RESPONSE INFO Response from LLM
LLM_ERROR ERROR LLM interaction error
DLP_BLOCK CRITICAL Request blocked by DLP
DLP_MASK WARNING Data masked
DLP_WARN WARNING DLP warning
AUTH_SUCCESS INFO Successful authentication
AUTH_FAILURE WARNING Failed login attempt
VAULT_ACCESS INFO Vault secrets access
VAULT_ROTATE INFO Secrets rotation
RATE_LIMIT WARNING Rate limit exceeded
SECURITY_ALERT CRITICAL Critical security event
Delivery Architecture¶
- Buffering: Up to 10,000 events in queue
- Retry: Exponential backoff on failures
- Failover: Continued operation when SIEM is unavailable
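The three delivery properties can be sketched as a bounded queue plus a backoff schedule. This is one possible shape, not the shipped implementation: `EventBuffer` and `backoff_delay` are hypothetical names, and dropping the oldest event when the buffer is full is an assumption, since the brief does not state the overflow policy.

```python
from collections import deque
from typing import Callable, Deque


class EventBuffer:
    """Bounded in-memory queue that keeps events while the SIEM is unreachable."""

    def __init__(self, send: Callable[[dict], None], max_events: int = 10_000):
        self._send = send
        # deque(maxlen=...) silently drops the oldest event on overflow (assumed policy).
        self._queue: Deque[dict] = deque(maxlen=max_events)

    def __len__(self) -> int:
        return len(self._queue)

    def enqueue(self, event: dict) -> None:
        self._queue.append(event)

    def flush(self) -> int:
        """Deliver queued events in order; stop at the first failure. Returns the count sent."""
        delivered = 0
        while self._queue:
            try:
                self._send(self._queue[0])
            except Exception:
                break  # SIEM unavailable: keep the event queued for a later retry
            self._queue.popleft()
            delivered += 1
        return delivered


def backoff_delay(attempt: int, base: float = 1.0, cap: float = 60.0) -> float:
    """Exponential backoff in seconds: base * 2**attempt, capped at `cap`."""
    return min(cap, base * (2 ** attempt))
```

A caller would typically sleep for `backoff_delay(n)` after the n-th consecutive failed `flush()` and reset `n` after a success; adding random jitter to the delay is a common refinement.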
5. Secrets Management¶
HashiCorp Vault Integration¶
| Authentication Method | Use Case |
|---|---|
| Token | Development, testing |
| AppRole | CI/CD pipelines, services |
| Kubernetes | K8s clusters with ServiceAccount |
Managed Secrets¶
- LLM Providers: GigaChat, Yandex AI Studio (Qwen3-235B, YandexGPT), OpenAI
- Database: PostgreSQL, DuckDB credentials
- Integrations: GitHub/GitLab tokens, SIEM credentials
Security¶
- KV v2 Secret Engine with versioning
- TTL caching with automatic rotation
- Environment variable fallback when Vault is unavailable
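TTL caching with an environment variable fallback might be structured as in the sketch below. The `fetch` callable stands in for a Vault KV v2 read and is injected so that the example stays self-contained; `SecretCache`, its parameters, and the 300-second default TTL are all assumptions.

```python
import os
import time
from typing import Callable, Dict, Optional, Tuple


class SecretCache:
    """Cache secrets for `ttl_seconds`; fall back to an environment variable on failure."""

    def __init__(
        self,
        fetch: Callable[[str], str],  # hypothetical wrapper around a Vault KV v2 read
        ttl_seconds: float = 300.0,
        clock: Callable[[], float] = time.monotonic,
    ):
        self._fetch = fetch
        self._ttl = ttl_seconds
        self._clock = clock
        self._cache: Dict[str, Tuple[str, float]] = {}  # path -> (value, expiry time)

    def get(self, path: str, env_var: str) -> Optional[str]:
        now = self._clock()
        cached = self._cache.get(path)
        if cached is not None and now < cached[1]:
            return cached[0]  # still fresh
        try:
            value = self._fetch(path)
        except Exception:
            # Vault unavailable: use the environment variable instead (may be None).
            return os.environ.get(env_var)
        self._cache[path] = (value, now + self._ttl)
        return value
```

Re-reading after the TTL expires is what lets a rotated secret be picked up without a restart.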
6. Advanced Analysis Capabilities¶
Taint Visualization¶
CodeGraph produces Mermaid flowcharts for taint analysis paths, enabling visual inspection of data flow from source to sink. Output integrates directly into CI/CD comments and documentation.
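For a sense of what such output looks like, the sketch below renders a source-to-sink path as a Mermaid `flowchart`. The function name and the input shape (a list of location and description pairs) are hypothetical; CodeGraph's own output format may differ.

```python
from typing import List, Tuple


def taint_path_to_mermaid(steps: List[Tuple[str, str]]) -> str:
    """Render (location, description) steps, ordered source to sink, as a Mermaid flowchart."""
    lines = ["flowchart TD"]
    for i, (location, description) in enumerate(steps):
        # Double quotes inside a quoted label are written as the Mermaid entity #quot;
        label = f"{description}<br/>{location}".replace('"', "#quot;")
        lines.append(f'    n{i}["{label}"]')
    for i in range(len(steps) - 1):
        lines.append(f"    n{i} --> n{i + 1}")
    return "\n".join(lines)
```

Calling it with `[("app.py:10", "source: request.args"), ("db.py:42", "sink: cursor.execute")]` produces two nodes joined by a single `n0 --> n1` edge.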
SARIF Integration¶
Security findings export to SARIF 2.1.0 format with full codeFlows support, enabling integration with GitHub Code Scanning, Azure DevOps, and other SARIF-compatible tools.
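A minimal SARIF 2.1.0 document with one result and one `codeFlows` entry has the shape sketched below. The rule ID, the tool name, and the helper names are placeholders; only the SARIF property names are taken from the standard.

```python
import json
from typing import Dict, List, Tuple


def _location(uri: str, line: int) -> Dict:
    """Build a SARIF location object for a file and start line."""
    return {
        "physicalLocation": {
            "artifactLocation": {"uri": uri},
            "region": {"startLine": line},
        }
    }


def to_sarif(rule_id: str, message: str, path: List[Tuple[str, int]]) -> Dict:
    """Wrap one finding and its source-to-sink path in a SARIF 2.1.0 log."""
    sink_uri, sink_line = path[-1]  # report the result at the sink
    return {
        "$schema": "https://json.schemastore.org/sarif-2.1.0.json",
        "version": "2.1.0",
        "runs": [{
            "tool": {"driver": {"name": "CodeGraph", "rules": [{"id": rule_id}]}},
            "results": [{
                "ruleId": rule_id,
                "level": "error",
                "message": {"text": message},
                "locations": [_location(sink_uri, sink_line)],
                "codeFlows": [{
                    "threadFlows": [{
                        "locations": [{"location": _location(uri, line)} for uri, line in path]
                    }]
                }],
            }],
        }],
    }


if __name__ == "__main__":
    doc = to_sarif("CG-SQLI", "Tainted data reaches SQL sink", [("app.py", 10), ("db.py", 42)])
    print(json.dumps(doc, indent=2))
```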
OWASP Top 10 Mapping¶
All findings are automatically classified against the OWASP Top 10 categories (src/security/owasp_mapping.py), producing auditor-ready compliance reports.
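One common way to implement such a classification is a lookup from CWE identifiers to Top 10 categories. The sketch below assumes the 2021 edition of the OWASP Top 10 and a finding record with `id` and `cwe` fields; neither assumption is confirmed by this brief, and the table covers only a few CWEs.

```python
from collections import defaultdict
from typing import Dict, List

# Small illustrative subset of the CWE-to-category mapping (OWASP Top 10 2021 assumed).
CWE_TO_OWASP_2021 = {
    "CWE-22": "A01:2021 Broken Access Control",
    "CWE-327": "A02:2021 Cryptographic Failures",
    "CWE-79": "A03:2021 Injection",
    "CWE-89": "A03:2021 Injection",
    "CWE-798": "A07:2021 Identification and Authentication Failures",
    "CWE-502": "A08:2021 Software and Data Integrity Failures",
    "CWE-918": "A10:2021 Server-Side Request Forgery (SSRF)",
}


def group_by_owasp(findings: List[Dict[str, str]]) -> Dict[str, List[str]]:
    """Group finding IDs by OWASP category; CWEs without a mapping go to 'Unmapped'."""
    report = defaultdict(list)
    for finding in findings:
        category = CWE_TO_OWASP_2021.get(finding["cwe"], "Unmapped")
        report[category].append(finding["id"])
    return dict(report)
```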
Symbolic Execution¶
The z3-based symbolic execution engine (z3-solver) validates complex vulnerability conditions and path feasibility, reducing false positives for conditional vulnerabilities.
Clone Detection¶
Detects copy-pasted vulnerable code across the codebase (src/analysis/clone_detector.py), so that a fix applied in one location can be propagated to every clone.
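The idea can be illustrated with a simple token-normalization approach, which catches copies whose identifiers and numeric literals were renamed. This is a generic textbook technique shown as a sketch; it is not a description of what `clone_detector.py` actually does, and the keyword list and function names are assumptions.

```python
import hashlib
import re
from collections import defaultdict
from typing import Dict, List

_TOKEN = re.compile(r"[A-Za-z_]\w*|\d+|\S")
_KEYWORDS = frozenset({"def", "return", "if", "else", "for", "while", "in"})


def fingerprint(snippet: str) -> str:
    """Hash a snippet after replacing identifiers with ID and integer literals with NUM."""
    tokens = []
    for tok in _TOKEN.findall(snippet):
        if tok in _KEYWORDS:
            tokens.append(tok)
        elif tok[0].isalpha() or tok[0] == "_":
            tokens.append("ID")
        elif tok.isdigit():
            tokens.append("NUM")
        else:
            tokens.append(tok)  # punctuation and operators are kept as-is
    return hashlib.sha256(" ".join(tokens).encode("utf-8")).hexdigest()


def find_clones(snippets: Dict[str, str]) -> List[List[str]]:
    """Group locations whose normalized token streams are identical."""
    groups = defaultdict(list)
    for location, code in snippets.items():
        groups[fingerprint(code)].append(location)
    return [sorted(locations) for locations in groups.values() if len(locations) > 1]
```

Whitespace is ignored because the tokenizer skips it, so reformatted copies still collide; structurally edited copies would need a graph-based or similarity-based comparison instead.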
Competitive Advantages¶
| Feature | CodeGraph | GitHub Copilot | Sourcegraph | CodeScene |
|---|---|---|---|---|
| On-Premise Deployment | ✅ | ❌ | ✅ | ✅ |
| Integrated DLP | ✅ | ❌ | ❌ | ❌ |
| Multi-Format SIEM | ✅ | ❌ | ❌ | ❌ |
| Vault Integration | ✅ | ❌ | ❌ | ❌ |
| Russian LLM (GigaChat + Yandex AI Studio) | ✅ | ❌ | ❌ | ❌ |
| CPG-based Analysis | ✅ | ❌ | ✅ | ✅ |
| Taint-Verified Vulnerabilities | ✅ | ❌ | Partial | ❌ |
| Structural Pattern Engine | ✅ (190 rules) | ❌ | ❌ | ❌ |
Contact¶
For security and integration inquiries:
- Email: security@codegraph.ru
Version: 1.1 | February 2026