LLM Integration Security¶
Data Protection Guide for Working with LLM Providers
Table of Contents¶
- Overview
- Security Guarantees
- Architecture
- Request Processing Pipeline
- Data Protection Principles
- 1. Source Code Is Never Transmitted
- API Reference
- SecureLLMProvider
- Methods
- Context Parameters
- Configuration
- Full Configuration (config.yaml)
- Error Handling
- DLPBlockedException
- Error Structure
- Secrets Management
- HashiCorp Vault Integration
- Environment Variable Fallback
- Auditing and Logging
- Request Log Structure
- SIEM Events
- Streaming Generation
- Streaming Mode Specifics
- Metrics and Monitoring
- Prometheus Metrics
- Grafana Dashboard
- Best Practices
- For Developers
- For Operators
- For Compliance
- Related Documents
Overview¶
The SecureLLMProvider module provides comprehensive protection when working with external LLM providers (GigaChat, Yandex AI Studio, OpenAI, and others). It implements the "defence in depth" principle: several independent layers of data protection applied to every request and response.
Security Guarantees¶
| Guarantee | Implementation |
|---|---|
| No code transmission | Only NL queries sent to LLM |
| Pre-request DLP | Scanning before sending |
| Post-response DLP | Masking in responses |
| Full audit | Logging every request |
| SIEM integration | Real-time events |
| Secrets in Vault | API keys in HashiCorp Vault |
Architecture¶
Request Processing Pipeline¶
┌─────────────────────────────────────────────────────────────────────────┐
│ SecureLLMProvider │
│ │
│ ┌─────────────────────────────────────────────────────────────────┐ │
│ │ 1. PRE-REQUEST PHASE │ │
│ │ │ │
│ │ User Query ──► [DLP Scanner] ──┬──► BLOCK ──► DLPBlockedException │
│ │ │ │ │
│ │ ├──► MASK ──► Masked Query │ │
│ │ │ │ │
│ │ └──► PASS ──► Original Query │ │
│ │ │ │ │
│ │ ┌───────────────┘ │ │
│ │ ▼ │ │
│ │ [SIEM Event] │ │
│ └─────────────────────────────────────────────────────────────────┘ │
│ │ │
│ ▼ │
│ ┌─────────────────────────────────────────────────────────────────┐ │
│ │ 2. LLM PROVIDER CALL │ │
│ │ │ │
│ │ [VaultClient] ──► API Key ──► [GigaChat/Yandex AI/OpenAI] ──► Response │
│ │ │ │
│ │ • API key from Vault │ │
│ │ • TLS connection │ │
│ │ • Request ID tracking │ │
│ └─────────────────────────────────────────────────────────────────┘ │
│ │ │
│ ▼ │
│ ┌─────────────────────────────────────────────────────────────────┐ │
│ │ 3. POST-RESPONSE PHASE │ │
│ │ │ │
│ │ Response ──► [DLP Scanner] ──► Mask Sensitive Data ──► User │ │
│ │ │ │ │
│ │ ▼ │ │
│ │ [SIEM Event] │ │
│ └─────────────────────────────────────────────────────────────────┘ │
│ │ │
│ ▼ │
│ ┌─────────────────────────────────────────────────────────────────┐ │
│ │ 4. AUDIT LOGGING │ │
│ │ │ │
│ │ • Request ID • Token usage │ │
│ │ • User ID • Latency │ │
│ │ • Session ID • DLP matches │ │
│ │ • IP address • Error details │ │
│ │ │ │
│ │ ──► [PostgreSQL] + [SIEM] │ │
│ └─────────────────────────────────────────────────────────────────┘ │
└─────────────────────────────────────────────────────────────────────────┘
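For orientation, the diagram can be read as four sequential steps: pre-request DLP, the provider call, post-response DLP, and audit logging. The sketch below is a simplified, self-contained illustration of that flow only; DLPAction, DLPResult, process_request, and the callable parameters are stand-ins for the real classes in src.security, not their actual API.

from dataclasses import dataclass
from enum import Enum

class DLPAction(Enum):
    PASS = "PASS"
    MASK = "MASK"
    BLOCK = "BLOCK"

@dataclass
class DLPResult:
    action: DLPAction
    text: str

def process_request(query, scan, call_llm, audit):
    # 1. Pre-request phase: scan the outgoing query
    pre = scan(query)
    if pre.action is DLPAction.BLOCK:
        # The real module raises DLPBlockedException here
        raise RuntimeError("Request blocked by DLP policy")
    # 2. LLM provider call with the (possibly masked) query
    response = call_llm(pre.text)
    # 3. Post-response phase: mask sensitive data in the answer
    post = scan(response)
    # 4. Audit logging (PostgreSQL + SIEM in the real module)
    audit({"pre_action": pre.action.value, "post_action": post.action.value})
    return post.text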
Data Protection Principles¶
1. Source Code Is Never Transmitted¶
┌──────────────────────────────────────────────────────────────────┐
│ ORGANIZATION PERIMETER │
│ │
│ [Source Code] ──► [CPG Analysis] ──► [DuckDB] │
│ │ │
│ ▼ │
│ [User] ──► "Find buffer overflow in function X" │
│ │ │
│ ▼ │
│ [RAG Query Engine] │
│ │ │
│ ▼ │
│ NL Query (no code snippets) │
│ │ │
└───────────────────────────┼──────────────────────────────────────┘
│
▼
[GigaChat/Yandex API]
│
▼
NL Response (explanation)
What IS sent to the LLM:
- User text queries
- Metadata (function names, file names)
- Structured data from the CPG

What is NOT sent:
- Source code
- File contents
- Connection strings
- Secrets and credentials
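As a concrete illustration of this boundary, a request leaving the perimeter carries only the natural-language question and CPG metadata. The payload shape below is hypothetical (it is not the actual wire format) and only shows the kind of fields involved:

# Hypothetical payload shape: only the NL question and CPG metadata leave the
# perimeter, never file contents, connection strings, or secrets.
llm_request = {
    "query": "Find buffer overflow in function X",
    "metadata": {
        "function_name": "X",          # identifier only
        "file_name": "src/module.c",   # path only, no file contents
    },
}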
API Reference¶
SecureLLMProvider¶
from src.security.llm import SecureLLMProvider
from src.security.config import get_security_config
from src.llm.gigachat import GigaChatProvider

# Create secure provider (config here is your GigaChat provider configuration)
base_provider = GigaChatProvider(config)
secure_provider = SecureLLMProvider(
    wrapped_provider=base_provider,
    config=get_security_config()
)

# Usage (with security context)
response = secure_provider.generate(
    system_prompt="You are a security analyst.",
    user_prompt="Analyze this function for vulnerabilities.",
    _user_id="analyst@company.com",
    _session_id="sess-12345",
    _ip_address="10.0.0.50"
)
Methods¶
| Method | Description |
|---|---|
| `generate(system_prompt, user_prompt, **kwargs)` | Generation with full protection |
| `generate_simple(prompt, **kwargs)` | Simplified call |
| `generate_stream(system_prompt, user_prompt, **kwargs)` | Streaming generation |
| `is_available()` | Check provider availability |
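Assuming the signatures in the table, a typical combination of `is_available()` and `generate_simple()` could look like the following; passing the security context through `**kwargs` is an assumption, and the values are examples:

# Check availability before a simplified call (secure_provider as created above)
if secure_provider.is_available():
    summary = secure_provider.generate_simple(
        "Summarize the last scan results.",
        _user_id="analyst@company.com",   # context kwargs assumed to pass through
    )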
Context Parameters¶
| Parameter | Description |
|---|---|
| `_user_id` | User ID for audit |
| `_session_id` | Session ID |
| `_ip_address` | Client IP address |
Configuration¶
Full Configuration (config.yaml)¶
security:
  enabled: true

  # LLM interaction logging
  llm_logging:
    enabled: true
    log_prompts: true             # Log prompts
    redact_prompts: true          # Redact sensitive data
    max_prompt_length: 2000       # Max length in log
    log_responses: true           # Log responses
    max_response_length: 5000     # Max response length in log
    log_token_usage: true         # Log token usage
    log_latency: true             # Log latency
    log_to_database: true         # Save to PostgreSQL
    log_to_siem: true             # Send to SIEM

  # DLP configuration
  dlp:
    enabled: true
    pre_request:
      enabled: true
      default_action: WARN
    post_response:
      enabled: true
      default_action: MASK

  # SIEM integration
  siem:
    enabled: true
    syslog:
      enabled: true
      host: "siem.company.com"
      port: 514

  # HashiCorp Vault
  vault:
    enabled: true
    url: "https://vault.company.com:8200"
    auth_method: "approle"
    llm_secrets_path: "codegraph/llm"
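A minimal sketch of reading this configuration from code, assuming `get_security_config()` parses the `security` section of config.yaml and exposes the same nesting as attributes (the attribute path is an assumption, not a documented guarantee):

from src.security.config import get_security_config

config = get_security_config()
if config.dlp.enabled:
    # Assumed attribute access mirroring the YAML structure above
    print(config.dlp.pre_request.default_action)   # e.g. "WARN"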
Error Handling¶
DLPBlockedException¶
import logging

from src.security.dlp import DLPBlockedException

logger = logging.getLogger(__name__)

# Example handler (the function name is illustrative)
def ask_llm(secure_provider, user_input):
    try:
        return secure_provider.generate(
            system_prompt="...",
            user_prompt=user_input
        )
    except DLPBlockedException as e:
        # Request blocked by DLP
        logger.warning(f"DLP blocked: {e.message}")

        # Violation information
        for match in e.matches:
            print(f"Category: {match.category}")
            print(f"Pattern: {match.pattern_name}")
            print(f"Severity: {match.severity}")

        # Return error to client
        return {
            "error": "dlp_blocked",
            "message": "Request contains sensitive data",
            "categories": e.to_dict()["categories"]
        }
Error Structure¶
{
  "error": "dlp_blocked",
  "message": "Request blocked by DLP policy. Detected 2 violation(s) in categories: credentials, pii. Please remove sensitive data before retrying.",
  "categories": ["credentials", "pii"],
  "violation_count": 2
}
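A hedged sketch of surfacing this structure from an API layer, assuming `e.to_dict()` returns the fields shown above; the HTTP status code is only an example, not a prescribed choice:

from src.security.dlp import DLPBlockedException

def handle_dlp_block(e: DLPBlockedException):
    # Assumes e.to_dict() contains error, message, categories, violation_count
    return e.to_dict(), 422   # illustrative status code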
Secrets Management¶
HashiCorp Vault Integration¶
from src.security.vault import VaultClient
from src.security.config import get_security_config
config = get_security_config()
vault = VaultClient(config.vault)
# Get LLM credentials
credentials = vault.get_llm_credentials()
# Result:
# {
#     "gigachat_credentials": "...",
#     "yandex_ai_api_key": "...",
#     "yandex_ai_folder_id": "...",
#     "openai_api_key": "sk-...",
#     "anthropic_api_key": "sk-ant-..."
# }
Environment Variable Fallback¶
When Vault is unavailable, environment variables are used:
| Variable | Provider |
|---|---|
| `GIGACHAT_CREDENTIALS` | GigaChat |
| `YANDEX_AI_API_KEY` | Yandex AI Studio |
| `YANDEX_AI_FOLDER_ID` | Yandex Cloud Folder |
| `OPENAI_API_KEY` | OpenAI |
| `ANTHROPIC_API_KEY` | Anthropic |
| `AZURE_OPENAI_API_KEY` | Azure OpenAI |
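A minimal sketch of the fallback order described above: prefer Vault, fall back to the environment. The helper name and the broad exception handling are illustrative, not part of the real VaultClient API.

import os

def resolve_gigachat_credentials(vault_client=None):
    if vault_client is not None:
        try:
            # Vault is the primary source of LLM credentials
            return vault_client.get_llm_credentials().get("gigachat_credentials")
        except Exception:
            pass   # Vault unavailable: fall back to the environment
    return os.environ.get("GIGACHAT_CREDENTIALS")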
Auditing and Logging¶
Request Log Structure¶
{
  "timestamp": "2025-12-14T10:30:00.000Z",
  "request_id": "req-12345",
  "event": "llm_request",
  "provider": "GigaChatProvider",
  "model": "GigaChat-2-Pro",
  "user_id": "analyst@company.com",
  "session_id": "sess-67890",
  "ip_address": "10.0.0.50",
  "latency_ms": 1250.5,
  "tokens": {
    "prompt_tokens": 150,
    "completion_tokens": 450,
    "total_tokens": 600
  },
  "dlp": {
    "pre_request_matches": 0,
    "post_response_matches": 1,
    "action": "MASK"
  }
}
SIEM Events¶
| Event | When Sent |
|---|---|
| `LLM_REQUEST` | When a request is sent |
| `LLM_RESPONSE` | When a response is received |
| `LLM_ERROR` | On provider error |
| `DLP_BLOCK` | When DLP blocks a request |
| `DLP_MASK` | When sensitive data is masked |
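For reference, the `siem.syslog` settings shown in the configuration imply these events travel over syslog. The snippet below is only an illustrative way to emit such an event with the standard library; the real module has its own SIEM client and event schema.

import json
import logging
from logging.handlers import SysLogHandler

# Host and port taken from the siem.syslog section of config.yaml
siem_logger = logging.getLogger("siem")
siem_logger.addHandler(SysLogHandler(address=("siem.company.com", 514)))
siem_logger.warning(json.dumps({"event": "DLP_BLOCK", "request_id": "req-12345"}))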
Streaming Generation¶
Streaming Mode Specifics¶
# Streaming generation with protection
for chunk in secure_provider.generate_stream(
    system_prompt="You are a security analyst.",
    user_prompt="Explain this vulnerability.",
    _user_id="analyst@company.com"
):
    # Chunk has already passed pre-request DLP
    print(chunk, end="", flush=True)

# Post-response DLP runs after stream completion
# If sensitive data is detected, an event is sent to SIEM
Streaming Limitations (illustrated by the sketch below):
- Pre-request DLP works fully
- Post-response DLP logs matches but cannot modify already-sent chunks
- Recommended for interactive scenarios with low risk
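To make the second limitation concrete, the sketch below accumulates chunks while streaming and only scans the full text after the stream ends; `scan_response()` is a hypothetical stand-in for the post-response DLP scanner, not a real function in the module.

def scan_response(text):
    """Hypothetical stand-in for the post-response DLP scanner (returns matches)."""
    return []

chunks = []
for chunk in secure_provider.generate_stream(
    system_prompt="You are a security analyst.",
    user_prompt="Explain this vulnerability.",
    _user_id="analyst@company.com",
):
    chunks.append(chunk)              # the chunk has already reached the user
    print(chunk, end="", flush=True)

# Only after the stream ends can the full response be scanned;
# matches are logged and sent to SIEM, the delivered text is not changed.
matches = scan_response("".join(chunks))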
Metrics and Monitoring¶
Prometheus Metrics¶
# LLM request count
rate(llm_requests_total[5m])
# DLP block count
rate(dlp_blocks_total{phase="llm_request"}[5m])
# P95 LLM latency
histogram_quantile(0.95, rate(llm_latency_seconds_bucket[5m]))
# Token usage
sum(rate(llm_tokens_total[1h])) by (provider, model)
# LLM errors
rate(llm_errors_total[5m])
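A hedged sketch of how these metrics could be exported with the prometheus_client library; the metric names match the queries above, but the label sets and label values are assumptions.

from prometheus_client import Counter, Histogram

LLM_REQUESTS = Counter("llm_requests_total", "LLM requests", ["provider", "model"])
LLM_LATENCY = Histogram("llm_latency_seconds", "LLM call latency", ["provider"])
DLP_BLOCKS = Counter("dlp_blocks_total", "DLP blocks", ["phase"])

# Time a provider call and count the request (label values are examples)
with LLM_LATENCY.labels(provider="gigachat").time():
    LLM_REQUESTS.labels(provider="gigachat", model="GigaChat-2-Pro").inc()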
Grafana Dashboard¶
{
  "panels": [
    {
      "title": "LLM Requests per Minute",
      "query": "rate(llm_requests_total[1m])"
    },
    {
      "title": "DLP Block Rate",
      "query": "rate(dlp_blocks_total{phase='llm_request'}[5m]) / rate(llm_requests_total[5m])"
    },
    {
      "title": "P95 Latency",
      "query": "histogram_quantile(0.95, rate(llm_latency_seconds_bucket[5m]))"
    },
    {
      "title": "Token Usage by Model",
      "query": "sum(rate(llm_tokens_total[1h])) by (model)"
    }
  ]
}
Best Practices¶
For Developers¶
- Always pass context — `_user_id`, `_session_id`, `_ip_address`
- Handle DLPBlockedException — inform the user
- Use streaming carefully — understand the post-response DLP limitations
- Don't log full responses — use `max_response_length`
For Operators¶
- Monitor DLP block rate — high rate may indicate a problem
- Set up LLM_ERROR alerts — provider errors require attention
- Rotate API keys — use Vault with auto-rotation
- Audit tokens — track anomalous consumption
For Compliance¶
- Document data flow — what data goes where
- Retain logs — minimum 1 year for compliance
- Regular audit — verify DLP effectiveness
- Incident response — have a plan for responding to DLP_BLOCK events
Related Documents¶
- Enterprise Security Brief — Security overview
- DLP Security — DLP patterns and configuration
- SIEM Integration — SIEM integration
Version: 1.0 | December 2025