Enterprise Security Brief
Document for CISOs, Security Executives, and Security Architects
Executive Summary
CodeGraph is a static code analysis platform built on the Code Property Graph (CPG) and designed for enterprise security requirements. Source code and analysis data stay under the organization's control, the platform integrates with existing enterprise security infrastructure (SIEM, secrets management, identity providers), and it supports regulatory compliance work.
Key Security Capabilities
| Capability | Description |
| --- | --- |
| On-Premise Deployment | Code never leaves your infrastructure |
| GigaChat + Yandex AI Studio | Russian LLM providers (models: Qwen3-235B, YandexGPT); only queries are transmitted, not code |
| RBAC | 4 roles, 21 permissions, granular access control |
| DLP | 25+ patterns for sensitive data detection |
| SIEM | Integration with Syslog, ArcSight (CEF), QRadar (LEEF) |
| Vault | HashiCorp Vault for secrets management |
1. Data Residency and Sovereignty
Data Processing Principles
┌─────────────────────────────────────────────────────────────────┐
│                     ORGANIZATION PERIMETER                      │
│                                                                 │
│  [Source Code] ──► [CPG Analysis] ──► [DuckDB Storage]          │
│                          │                                      │
│                    Code remains                                 │
│                  inside perimeter                               │
│                          │                                      │
│  [User] ──► [RAG Query] ──► [DLP Scanner] ──┐                   │
│                                             │                   │
└─────────────────────────────────────────────│───────────────────┘
                                              │
                        Only NL queries ─────►│
                        (no source code)      │
                                              ▼
                              [GigaChat / Yandex AI Studio API]
Data Security Guarantees
| Aspect | Implementation |
| --- | --- |
| Code Storage | Local only (DuckDB, file system) |
| Data Transmission | Source code is never sent to external systems |
| LLM Integration | GigaChat/Yandex AI Studio receives only user text queries |
| Air-Gapped Mode | Support for fully isolated environments with local LLMs |
Compliance
- 152-FZ: processing of Russian citizens' personal data within Russia
- GOST R 57580: information protection in financial organizations
- FSTEC requirements: can be deployed in certified infrastructure
- GDPR: data minimization (source code is not transmitted)
- SOX: complete audit trail
2. Access Control and Authentication
RBAC Role Hierarchy
┌─────────────┐
│    ADMIN    │  ← Full access (admin:all)
│  (Level 4)  │
└──────┬──────┘
       │
┌──────▼──────┐
│  REVIEWER   │  ← Code review + Analyst capabilities
│  (Level 3)  │    (GitHub/GitLab integration)
└──────┬──────┘
       │
┌──────▼──────┐
│   ANALYST   │  ← Query execution + sessions
│  (Level 2)  │    (API keys, export)
└──────┬──────┘
       │
┌──────▼──────┐
│   VIEWER    │  ← Read-only access
│  (Level 1)  │    (scenarios, history, stats)
└─────────────┘
Permission Matrix (21 Permissions)
| Category | Permission | VIEWER | ANALYST | REVIEWER | ADMIN |
| --- | --- | --- | --- | --- | --- |
| Scenarios | scenarios:read | ✓ | ✓ | ✓ | ✓ |
| | scenarios:execute | | ✓ | ✓ | ✓ |
| Queries | query:execute | | ✓ | ✓ | ✓ |
| | query:validate | | ✓ | ✓ | ✓ |
| Review | review:execute | | | ✓ | ✓ |
| | review:github | | | ✓ | ✓ |
| | review:gitlab | | | ✓ | ✓ |
| Sessions | sessions:read | ✓ | ✓ | ✓ | ✓ |
| | sessions:write | | ✓ | ✓ | ✓ |
| | sessions:delete | | ✓ | ✓ | ✓ |
| History | history:read | ✓ | ✓ | ✓ | ✓ |
| | history:export | | ✓ | ✓ | ✓ |
| Users | users:read | | | | ✓ |
| | users:write | | | | ✓ |
| | users:delete | | | | ✓ |
| API Keys | api_keys:read | | ✓ | ✓ | ✓ |
| | api_keys:write | | ✓ | ✓ | ✓ |
| | api_keys:delete | | | | ✓ |
| Metrics | stats:read | ✓ | ✓ | ✓ | ✓ |
| | metrics:read | | | | ✓ |
| Admin | admin:all | | | | ✓ |
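The role hierarchy and permission matrix can be read as a set of grants per level, with each role inheriting everything below it. The following is a minimal sketch of that reading, not CodeGraph's implementation: the names `Role`, `GRANTS`, `permissions_for`, and `has_permission` are hypothetical, and only the permission strings and levels come from the matrix.

```python
from enum import IntEnum


class Role(IntEnum):
    VIEWER = 1
    ANALYST = 2
    REVIEWER = 3
    ADMIN = 4


# Permissions first granted at each level; higher roles inherit lower ones.
# api_keys:delete is ADMIN-only in the matrix, so it is listed under ADMIN.
GRANTS = {
    Role.VIEWER: frozenset({
        "scenarios:read", "sessions:read", "history:read", "stats:read",
    }),
    Role.ANALYST: frozenset({
        "scenarios:execute", "query:execute", "query:validate",
        "sessions:write", "sessions:delete", "history:export",
        "api_keys:read", "api_keys:write",
    }),
    Role.REVIEWER: frozenset({
        "review:execute", "review:github", "review:gitlab",
    }),
    Role.ADMIN: frozenset({
        "users:read", "users:write", "users:delete",
        "api_keys:delete", "metrics:read", "admin:all",
    }),
}


def permissions_for(role: Role) -> frozenset:
    """Union of the grants at this level and at every level below it."""
    return frozenset().union(*(GRANTS[r] for r in Role if r <= role))


def has_permission(role: Role, permission: str) -> bool:
    return permission in permissions_for(role)
```

For example, `has_permission(Role.REVIEWER, "history:export")` is true because REVIEWER inherits the ANALYST grants, while `has_permission(Role.REVIEWER, "api_keys:delete")` is false.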
Authentication Methods
| Method | Description | Status |
| --- | --- | --- |
| JWT Bearer | Access tokens (30 min) + Refresh tokens (7 days) | ✅ Implemented |
| API Keys | SHA-256 hashing, expiration, revocation | ✅ Implemented |
| OAuth2/OIDC | GitHub, GitLab, Google, Keycloak | ✅ Integration-ready |
| LDAP/AD | Group sync, SSO | ✅ Integration-ready |
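To illustrate the API key row, here is a hedged sketch of how SHA-256 hashing, expiration, and revocation usually fit together: only the digest is stored, and verification checks revocation and expiry before comparing digests. The `ApiKeyRecord`, `issue_key`, and `verify_key` names are assumptions made for this example and are not taken from the CodeGraph codebase.

```python
import hashlib
import hmac
import secrets
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class ApiKeyRecord:
    key_hash: str          # hex SHA-256 digest; the raw key is never stored
    expires_at: datetime
    revoked: bool = False


def hash_key(raw_key: str) -> str:
    return hashlib.sha256(raw_key.encode("utf-8")).hexdigest()


def issue_key(ttl: timedelta, now: datetime):
    """Return the raw key (shown to the user once) and the record to persist."""
    raw_key = secrets.token_urlsafe(32)
    return raw_key, ApiKeyRecord(hash_key(raw_key), now + ttl)


def verify_key(raw_key: str, record: ApiKeyRecord, now: datetime) -> bool:
    if record.revoked or now >= record.expires_at:
        return False
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(hash_key(raw_key), record.key_hash)
```

Because the key is 32 random bytes, a plain SHA-256 digest is sufficient here; a slow password hash is only needed for low-entropy secrets.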
3. Data Loss Prevention (DLP)
DLP Scanning Architecture
┌──────────────────────────────────────────────────────────────┐
│                         USER REQUEST                         │
└──────────────────────────────┬───────────────────────────────┘
                               │
                               ▼
┌──────────────────────────────────────────────────────────────┐
│                     PRE-REQUEST SCANNING                     │
│  ┌────────────────────────────────────────────────────────┐  │
│  │  Patterns: API keys, passwords, tokens, PII, paths     │  │
│  └────────────────────────────────────────────────────────┘  │
│                              │                               │
│               ┌──────────────┼──────────────┐                │
│               ▼              ▼              ▼                │
│            [BLOCK]        [MASK]       [WARN/LOG]            │
│            Reject          Mask         Send with            │
│            request         data          logging             │
└──────────────────────────────────────────────────────────────┘
                               │
                               ▼
               [GigaChat / Yandex AI Studio API]
                               │
                               ▼
┌──────────────────────────────────────────────────────────────┐
│                    POST-RESPONSE SCANNING                    │
│  ┌────────────────────────────────────────────────────────┐  │
│  │  Mask sensitive data in LLM response before display    │  │
│  └────────────────────────────────────────────────────────┘  │
└──────────────────────────────────────────────────────────────┘
Detection Categories (25+ Patterns)
| Category | Severity | Example Patterns |
| --- | --- | --- |
| Credentials | HIGH | AWS keys (AKIA...), GitHub tokens (ghp_...), passwords, JWT, RSA/EC private keys |
| Personal Data | MEDIUM | Email, phone numbers (RU/US), passport numbers, SSN, credit cards |
| Source Code | LOW | Database connection strings, internal Unix/Windows paths |
DLP Actions
| Action | Priority | Description |
| --- | --- | --- |
| BLOCK | 4 (max) | Complete request blocking |
| MASK | 3 | Replace sensitive data with [REDACTED] |
| WARN | 2 | Warning + SIEM event dispatch |
| LOG_ONLY | 1 | Audit logging only |
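The action priorities suggest that when several patterns match, the highest-priority action decides what happens to the request. The sketch below shows one way that could work. It is an illustration under assumptions: the four regular expressions are simplified stand-ins for the platform's 25+ patterns, and the rule-to-action mapping and the `scan` function are hypothetical.

```python
import re
from enum import IntEnum
from typing import List, Optional, Tuple


class Action(IntEnum):
    LOG_ONLY = 1
    WARN = 2
    MASK = 3
    BLOCK = 4


# (rule name, pattern, action). Patterns and their actions are assumed.
RULES = [
    ("aws_access_key", re.compile(r"\bAKIA[0-9A-Z]{16}\b"), Action.BLOCK),
    ("github_token", re.compile(r"\bghp_[A-Za-z0-9]{36}\b"), Action.BLOCK),
    ("email", re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b"), Action.MASK),
    ("unix_path", re.compile(r"(?<!\w)/(?:home|etc|var)/[\w./-]+"), Action.LOG_ONLY),
]


def scan(text: str) -> Tuple[Optional[Action], str, List[str]]:
    """Return (strongest action, text with MASK matches redacted, rules hit)."""
    strongest = None
    hits = []
    masked = text
    for name, pattern, action in RULES:
        if not pattern.search(text):
            continue
        hits.append(name)
        if strongest is None or action > strongest:
            strongest = action
        if action is Action.MASK:
            masked = pattern.sub("[REDACTED]", masked)
    return strongest, masked, hits
```

A caller would reject the request when the result is `Action.BLOCK`, forward the masked text for `Action.MASK`, and forward the original text with a log or SIEM event for the two lower priorities.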
4. SIEM Integration
| Format | Compatible Systems | Description |
| --- | --- | --- |
| Syslog (RFC 5424) | Splunk, Graylog, rsyslog | Standard Unix format |
| CEF | ArcSight, Splunk | Common Event Format |
| LEEF | IBM QRadar | Log Event Extended Format |
Security Event Types
EVENT_TYPE       SEVERITY   DESCRIPTION
────────────────────────────────────────────────────────────
LLM_REQUEST      INFO       Request to LLM provider
LLM_RESPONSE     INFO       Response from LLM
LLM_ERROR        ERROR      LLM interaction error
DLP_BLOCK        CRITICAL   Request blocked by DLP
DLP_MASK         WARNING    Data masked
DLP_WARN         WARNING    DLP warning
AUTH_SUCCESS     INFO       Successful authentication
AUTH_FAILURE     WARNING    Failed login attempt
VAULT_ACCESS     INFO       Vault secrets access
VAULT_ROTATE     INFO       Secrets rotation
RATE_LIMIT       WARNING    Rate limit exceeded
SECURITY_ALERT   CRITICAL   Critical security event
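As an illustration of how one of these events might be rendered for ArcSight, the sketch below formats an event as a CEF line (`CEF:Version|Vendor|Product|Device Version|Event Class ID|Name|Severity|Extension`). The vendor, product, and version strings, the numeric severity mapping, and the `to_cef` helper are assumptions for this example; the platform's actual field values are not documented here.

```python
# Mapping from the event severities above to CEF's 0-10 scale (assumed).
SEVERITY = {"INFO": 3, "WARNING": 6, "ERROR": 8, "CRITICAL": 10}


def _esc_header(value: str) -> str:
    # CEF header fields escape backslash and pipe.
    return value.replace("\\", "\\\\").replace("|", "\\|")


def _esc_ext(value: str) -> str:
    # CEF extension values escape backslash and equals; newlines become \n.
    return (value.replace("\\", "\\\\").replace("=", "\\=")
                 .replace("\r\n", "\\n").replace("\n", "\\n"))


def to_cef(event_type: str, severity: str, description: str, **ext: str) -> str:
    header = "|".join([
        "CEF:0",
        _esc_header("CodeGraph"),   # Device Vendor (assumed)
        _esc_header("CodeGraph"),   # Device Product (assumed)
        _esc_header("1.0"),         # Device Version (assumed)
        _esc_header(event_type),    # Device Event Class ID
        _esc_header(description),   # Name
        str(SEVERITY[severity]),
    ])
    extension = " ".join(f"{key}={_esc_ext(val)}" for key, val in ext.items())
    return f"{header}|{extension}"
```

Calling `to_cef("DLP_BLOCK", "CRITICAL", "Request blocked by DLP", suser="alice")` yields `CEF:0|CodeGraph|CodeGraph|1.0|DLP_BLOCK|Request blocked by DLP|10|suser=alice`.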
Delivery Architecture
- Buffering: Up to 10,000 events in queue
- Retry: Exponential backoff on failures
- Failover: Continued operation when SIEM is unavailable
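The three delivery properties above can be combined in a small bounded queue with retrying flushes. The following is a sketch under stated assumptions rather than the shipped implementation: the `EventBuffer` class, the drop-oldest policy when the queue is full, and the retry count and base delay are all hypothetical; only the 10,000-event limit and the use of exponential backoff come from the list.

```python
import time
from collections import deque
from typing import Callable, Deque


class EventBuffer:
    """Bounded in-memory queue; when full, the oldest event is dropped (assumed)."""

    def __init__(self, send: Callable[[str], None], max_events: int = 10_000,
                 max_retries: int = 5, base_delay: float = 0.5,
                 sleep: Callable[[float], None] = time.sleep) -> None:
        self._queue: Deque[str] = deque(maxlen=max_events)
        self._send = send
        self._max_retries = max_retries
        self._base_delay = base_delay
        self._sleep = sleep

    def enqueue(self, event: str) -> None:
        # Appending never raises, so request handling continues if SIEM is down.
        self._queue.append(event)

    def flush(self) -> int:
        """Send queued events in order; return how many were delivered."""
        delivered = 0
        while self._queue:
            event = self._queue[0]
            for attempt in range(self._max_retries):
                try:
                    self._send(event)
                    break
                except OSError:
                    if attempt + 1 < self._max_retries:
                        # Exponential backoff: base, 2*base, 4*base, ...
                        self._sleep(self._base_delay * (2 ** attempt))
            else:
                return delivered   # still failing; keep events for the next flush
            self._queue.popleft()
            delivered += 1
        return delivered
```

The `sleep` parameter is injectable so that the backoff schedule can be checked in tests without waiting; a production sender would more likely flush from a background worker.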
5. Secrets Management
HashiCorp Vault Integration
| Authentication Method | Use Case |
| --- | --- |
| Token | Development, testing |
| AppRole | CI/CD pipelines, services |
| Kubernetes | K8s clusters with ServiceAccount |
Managed Secrets
- LLM Providers: GigaChat, Yandex AI Studio (Qwen3-235B, YandexGPT), OpenAI
- Database: PostgreSQL, DuckDB credentials
- Integrations: GitHub/GitLab tokens, SIEM credentials
Security
- KV v2 Secret Engine with versioning
- TTL caching with automatic rotation
- Environment variable fallback when Vault is unavailable
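TTL caching and environment-variable fallback can be sketched independently of any Vault client. In the example below, `fetch` stands in for whatever call reads a secret from the KV v2 engine; the `SecretCache` class, the 300-second default TTL, and the secret and variable names are assumptions for illustration.

```python
import os
import time
from typing import Callable, Dict, Optional, Tuple


class SecretCache:
    def __init__(self, fetch: Callable[[str], str], ttl_seconds: float = 300.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._fetch = fetch        # stand-in for a Vault KV v2 read (assumed)
        self._ttl = ttl_seconds
        self._clock = clock
        self._cache: Dict[str, Tuple[str, float]] = {}

    def get(self, name: str, env_var: Optional[str] = None) -> str:
        cached = self._cache.get(name)
        if cached and self._clock() < cached[1]:
            return cached[0]
        try:
            # Re-reading after expiry is what picks up a rotated value.
            value = self._fetch(name)
        except Exception:
            # Secret store unreachable: fall back to the environment if configured.
            if env_var and env_var in os.environ:
                return os.environ[env_var]
            raise
        self._cache[name] = (value, self._clock() + self._ttl)
        return value
```

A shorter TTL narrows the window in which a rotated secret is still served from cache, at the cost of more reads against Vault.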
Competitive Advantages
| Feature | CodeGraph | GitHub Copilot | Sourcegraph | CodeScene |
| --- | --- | --- | --- | --- |
| On-Premise Deployment | ✅ | ❌ | ✅ | ✅ |
| Integrated DLP | ✅ | ❌ | ❌ | ❌ |
| Multi-Format SIEM | ✅ | ❌ | ❌ | ❌ |
| Vault Integration | ✅ | ❌ | ❌ | ❌ |
| Russian LLM (GigaChat + Yandex AI Studio) | ✅ | ❌ | ❌ | ❌ |
| CPG-based Analysis | ✅ | ❌ | ✅ | ✅ |
| Taint-Verified Vulnerabilities | ✅ | ❌ | Partial | ❌ |
For security and integration inquiries:
- Email: security@codegraph.ru
Version: 1.0 | December 2025