# Data Protection Guide for Working with LLM Providers
## Table of Contents

- Overview
- Security Guarantees
- Architecture
  - Request Processing Pipeline
- Data Protection Principles
  - 1. Source Code Is Never Transmitted
- API Reference
  - SecureLLMProvider
    - Methods
    - Context Parameters
  - ContentScanner
- Configuration
  - Full Configuration (config.yaml)
- Error Handling
  - DLPBlockedException
  - Error Structure
- Secrets Management
  - HashiCorp Vault Integration
  - Environment Variable Fallback
- Auditing and Logging
  - Request Log Structure
  - SIEM Events
- Streaming Generation
  - Streaming Mode Specifics
- Metrics and Monitoring
  - Prometheus Metrics
  - Grafana Dashboard
- Best Practices
  - For Developers
  - For Operators
  - For Compliance
- Related Documents
## Overview
The SecureLLMProvider module provides comprehensive protection when working with external LLM providers (GigaChat, Yandex AI Studio, OpenAI, and others). The module implements the “defence in depth” principle — multi-layered data protection.
## Security Guarantees
| Guarantee | Implementation |
|---|---|
| No code transmission | Only NL queries sent to LLM |
| Pre-request DLP | ContentScanner scanning before sending |
| Post-response DLP | Masking in responses |
| Full audit | Logging every request |
| SIEM integration | Real-time events |
| Secrets in Vault | API keys in HashiCorp Vault |
## Architecture

### Request Processing Pipeline

```
                           SecureLLMProvider

+------------------------------------------------------------------------+
| 1. PRE-REQUEST PHASE                                                   |
|                                                                        |
| User Query --> [ContentScanner] --+--> BLOCK --> DLPBlockedException   |
|                                   |                                    |
|                                   +--> MASK --> Masked Query           |
|                                   |                                    |
|                                   +--> WARN --> Original + Event       |
|                                   |                                    |
|                                   +--> PASS --> Original Query         |
|                                   |                                    |
|                                   v                                    |
|                             [SIEM Event]                               |
+------------------------------------------------------------------------+
                                   |
                                   v
+------------------------------------------------------------------------+
| 2. LLM PROVIDER CALL                                                   |
|                                                                        |
| [VaultClient] --> API Key --> [GigaChat/Yandex AI/OpenAI] --> Response |
|                                                                        |
|   * API key from Vault                                                 |
|   * TLS connection                                                     |
|   * Request ID tracking                                                |
+------------------------------------------------------------------------+
                                   |
                                   v
+------------------------------------------------------------------------+
| 3. POST-RESPONSE PHASE                                                 |
|                                                                        |
| Response --> [ContentScanner] --> Mask Sensitive Data --> User         |
|                                   |                                    |
|                                   v                                    |
|                             [SIEM Event]                               |
+------------------------------------------------------------------------+
                                   |
                                   v
+------------------------------------------------------------------------+
| 4. AUDIT LOGGING                                                       |
|                                                                        |
|   * Request ID     * Token usage                                       |
|   * User ID        * Latency                                           |
|   * Session ID     * DLP matches                                       |
|   * IP address     * Error details                                     |
|                                                                        |
|   --> [PostgreSQL] + [SIEM]                                            |
+------------------------------------------------------------------------+
```
## Data Protection Principles

### 1. Source Code Is Never Transmitted
+------------------------------------------------------------------+
| ORGANIZATION PERIMETER |
| |
| [Source Code] --> [CPG Analysis] --> [DuckDB] |
| | |
| v |
| [User] --> "Find buffer overflow in function X" |
| | |
| v |
| [RAG Query Engine] |
| | |
| v |
| NL Query (no code snippets) |
| | |
+---------------------------+----------------------------------------+
|
v
[GigaChat/Yandex API]
|
v
NL Response (explanation)
What IS sent to the LLM:

- User text queries
- Metadata (function names, file names)
- Structured data from the CPG

What is NOT sent:

- Source code
- File contents
- Connection strings
- Secrets and credentials
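To make the boundary concrete, here is a minimal illustrative sketch (the function name and signature are hypothetical, not the actual RAG Query Engine API) of assembling an NL query from CPG metadata only, so no source lines ever leave the perimeter:

```python
# Illustrative only: combine the user's question with CPG metadata
# (identifiers, file names) -- never function bodies or file contents.

def build_nl_query(question: str, function_name: str, file_name: str) -> str:
    """Build a natural-language query from metadata, without code snippets."""
    return f"{question} (function: {function_name}, file: {file_name})"

query = build_nl_query(
    "Find buffer overflow risks",
    function_name="parse_header",
    file_name="http/parser.c",
)
# The query carries identifiers and the question, but no source code.
```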
## API Reference

### SecureLLMProvider
```python
from src.security.llm import SecureLLMProvider
from src.security.config import get_security_config
from src.llm.gigachat_provider import GigaChatProvider

# Create secure provider
base_provider = GigaChatProvider(config)
secure_provider = SecureLLMProvider(
    wrapped_provider=base_provider,
    config=get_security_config()
)

# Usage (with security context)
response = secure_provider.generate(
    system_prompt="You are a security analyst.",
    user_prompt="Analyze this function for vulnerabilities.",
    _user_id="analyst@company.com",
    _session_id="sess-12345",
    _ip_address="10.0.0.50"
)
```
#### Methods

| Method | Description |
|---|---|
| `generate(system_prompt, user_prompt, **kwargs)` | Generation with full protection |
| `generate_simple(prompt, **kwargs)` | Simplified call |
| `generate_stream(system_prompt, user_prompt, **kwargs)` | Streaming generation |
| `is_available()` | Check provider availability |
#### Context Parameters

| Parameter | Description |
|---|---|
| `_user_id` | User ID for audit |
| `_session_id` | Session ID |
| `_ip_address` | Client IP address |
### ContentScanner

The `ContentScanner` class (`src/security/dlp/scanner.py`) performs DLP scanning in both the pre-request and post-response phases. It is used internally by `SecureLLMProvider` and can also be used directly:

```python
from src.security.dlp import ContentScanner

scanner = ContentScanner(config.dlp)
result = scanner.scan_request(text)   # Pre-request scan
result = scanner.scan_response(text)  # Post-response scan
```
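To illustrate what the MASK action does conceptually, here is a self-contained sketch using two sample regex patterns. This is not the real ContentScanner implementation (the actual patterns come from the project's DLP configuration); the pattern set and function name are illustrative only.

```python
import re

# Illustrative sample patterns, NOT the project's DLP pattern set.
SAMPLE_PATTERNS = {
    "credentials": re.compile(r"sk-[A-Za-z0-9]{8,}"),  # API-key-like token
    "pii": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),       # SSN-like number
}

def mask_sensitive(text: str) -> tuple[str, list[str]]:
    """Replace matches with [REDACTED:<category>] and report hit categories."""
    categories = []
    for category, pattern in SAMPLE_PATTERNS.items():
        if pattern.search(text):
            categories.append(category)
            text = pattern.sub(f"[REDACTED:{category}]", text)
    return text, categories

masked, cats = mask_sensitive("Use key sk-abcdef123456 for user 123-45-6789")
# masked -> "Use key [REDACTED:credentials] for user [REDACTED:pii]"
```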
## Configuration

### Full Configuration (config.yaml)

```yaml
security:
  enabled: true

  # LLM interaction logging
  llm_logging:
    enabled: true
    log_prompts: true          # Log prompts
    redact_prompts: true       # Redact sensitive data
    max_prompt_length: 2000    # Max length in log
    log_responses: true        # Log responses
    max_response_length: 5000  # Max response length in log
    log_token_usage: true      # Log token usage
    log_latency: true          # Log latency
    log_to_database: true      # Save to PostgreSQL
    log_to_siem: true          # Send to SIEM

  # DLP configuration
  dlp:
    enabled: true
    pre_request:
      enabled: true
      default_action: WARN
    post_response:
      enabled: true
      default_action: MASK

  # SIEM integration
  siem:
    enabled: true
    syslog:
      enabled: true
      host: "siem.company.com"
      port: 514

  # HashiCorp Vault
  vault:
    enabled: true
    url: "https://vault.company.com:8200"
    auth_method: "approle"
    llm_secrets_path: "codegraph/llm"
```
## Error Handling

### DLPBlockedException

```python
from src.security.dlp import DLPBlockedException

try:
    response = secure_provider.generate(
        system_prompt="...",
        user_prompt=user_input
    )
except DLPBlockedException as e:
    # Request blocked by DLP
    logger.warning(f"DLP blocked: {e.message}")

    # Violation information
    for match in e.matches:
        print(f"Category: {match.category}")
        print(f"Pattern: {match.pattern_name}")
        print(f"Severity: {match.severity}")

    # Return error to client
    return {
        "error": "dlp_blocked",
        "message": "Request contains sensitive data",
        "categories": e.to_dict()["categories"]
    }
```
### Error Structure

```json
{
  "error": "dlp_blocked",
  "message": "Request blocked by DLP policy. Detected 2 violation(s) in categories: credentials, pii. Please remove sensitive data before retrying.",
  "categories": ["credentials", "pii"],
  "violation_count": 2
}
```
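A client can turn this payload into a short user-facing message. The following sketch assumes only the fields shown in the structure above; the function name is illustrative:

```python
# Format the DLP error payload (error structure above) for end users.

def format_dlp_error(payload: dict) -> str:
    cats = ", ".join(payload.get("categories", []))
    n = payload.get("violation_count", 0)
    return f"Blocked by DLP: {n} violation(s) in categories: {cats}."

error = {
    "error": "dlp_blocked",
    "categories": ["credentials", "pii"],
    "violation_count": 2,
}
message = format_dlp_error(error)
# -> "Blocked by DLP: 2 violation(s) in categories: credentials, pii."
```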
## Secrets Management

### HashiCorp Vault Integration

```python
from src.security.vault import VaultClient
from src.security.config import get_security_config

config = get_security_config()
vault = VaultClient(config.vault)

# Get LLM credentials
credentials = vault.get_llm_credentials()
# Result (from env fallback):
# {
#     "gigachat_credentials": "...",
#     "gigachat_api_key": "...",
#     "openai_api_key": "sk-...",
#     "anthropic_api_key": "sk-ant-...",
#     "azure_openai_api_key": "...",
#     "azure_openai_endpoint": "https://..."
# }
```

Note: `get_llm_credentials()` retrieves credentials for GigaChat, OpenAI, Anthropic, and Azure OpenAI. Yandex credentials (`YANDEX_API_KEY`, `YANDEX_FOLDER_ID`) are managed separately through `config.yaml` and are not stored in Vault.
### Environment Variable Fallback

When Vault is unavailable, environment variables are used:

| Variable | Provider |
|---|---|
| `GIGACHAT_CREDENTIALS` | GigaChat |
| `GIGACHAT_API_KEY` | GigaChat (alternative) |
| `YANDEX_API_KEY` | Yandex AI Studio |
| `YANDEX_FOLDER_ID` | Yandex Cloud Folder |
| `OPENAI_API_KEY` | OpenAI |
| `ANTHROPIC_API_KEY` | Anthropic |
| `AZURE_OPENAI_API_KEY` | Azure OpenAI |
| `AZURE_OPENAI_ENDPOINT` | Azure OpenAI Endpoint |
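For GigaChat, the table lists a primary variable and an alternative. A minimal sketch of that fallback order (the helper name is assumed for illustration and is not part of the actual VaultClient internals):

```python
import os
from typing import Optional

# Illustrative fallback: prefer GIGACHAT_CREDENTIALS, fall back to the
# alternative GIGACHAT_API_KEY, return None when neither is set.

def gigachat_credentials_from_env() -> Optional[str]:
    return (
        os.environ.get("GIGACHAT_CREDENTIALS")
        or os.environ.get("GIGACHAT_API_KEY")
    )

os.environ.pop("GIGACHAT_CREDENTIALS", None)
os.environ["GIGACHAT_API_KEY"] = "demo-key"
key = gigachat_credentials_from_env()  # "demo-key"
```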
## Auditing and Logging

### Request Log Structure

```json
{
  "timestamp": "2026-02-26T10:30:00.000Z",
  "request_id": "req-12345",
  "event": "llm_request",
  "provider": "GigaChatProvider",
  "model": "GigaChat-2-Pro",
  "user_id": "analyst@company.com",
  "session_id": "sess-67890",
  "ip_address": "10.0.0.50",
  "latency_ms": 1250.5,
  "tokens": {
    "prompt_tokens": 150,
    "completion_tokens": 450,
    "total_tokens": 600
  },
  "dlp": {
    "pre_request_matches": 0,
    "post_response_matches": 1,
    "action": "MASK"
  }
}
```
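Log entries of this shape are easy to aggregate for reporting. A small sketch that assumes only the `latency_ms` and `tokens.total_tokens` fields from the structure above:

```python
# Aggregate token usage and average latency from request log entries.

def summarize(logs: list[dict]) -> dict:
    total_tokens = sum(e["tokens"]["total_tokens"] for e in logs)
    avg_latency = sum(e["latency_ms"] for e in logs) / len(logs)
    return {"total_tokens": total_tokens, "avg_latency_ms": avg_latency}

logs = [
    {"latency_ms": 1000.0, "tokens": {"total_tokens": 600}},
    {"latency_ms": 1500.0, "tokens": {"total_tokens": 400}},
]
summary = summarize(logs)
# -> {"total_tokens": 1000, "avg_latency_ms": 1250.0}
```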
### SIEM Events

| Event | When Sent |
|---|---|
| `LLM_REQUEST` | When sending a request to the LLM provider |
| `LLM_RESPONSE` | When receiving a response from the LLM provider |
| `LLM_ERROR` | On LLM provider error |
| `DLP_BLOCK` | When DLP blocks the request (action=BLOCK) |
| `DLP_MASK` | When DLP masks sensitive data (action=MASK) |
| `DLP_WARN` | When DLP detects a match but only logs a warning (action=WARN) |
| `DLP_LOG` | When DLP logs a match for audit without taking action (action=LOG) |
## Streaming Generation

### Streaming Mode Specifics

```python
# Streaming generation with protection
for chunk in secure_provider.generate_stream(
    system_prompt="You are a security analyst.",
    user_prompt="Explain this vulnerability.",
    _user_id="analyst@company.com"
):
    # Chunks have already passed pre-request DLP
    print(chunk, end="", flush=True)

# Post-response DLP runs after stream completion.
# If sensitive data is detected, an event is sent to SIEM.
```

Streaming limitations:

- Pre-request DLP works fully
- Post-response DLP logs matches but cannot modify already-sent chunks
- Recommended for interactive scenarios with low risk
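The second limitation can be sketched concretely: chunks are forwarded to the user as they arrive, so a post-stream scan can only audit the assembled text, never recall what was already delivered. Everything in this sketch (the fake stream, the consumer, the detector callback) is illustrative, not the real SecureLLMProvider API:

```python
# Conceptual model of streaming + post-response DLP: forward chunks
# immediately, then scan the full text once the stream is exhausted.

def fake_stream():
    yield "The password is "
    yield "hunter2"

def consume_and_audit(stream, detector):
    sent = []
    for chunk in stream:
        sent.append(chunk)            # already delivered; cannot be recalled
    full_text = "".join(sent)
    violations = detector(full_text)  # post-stream scan: log/alert only
    return full_text, violations

text, hits = consume_and_audit(
    fake_stream(),
    lambda t: ["credentials"] if "hunter2" in t else [],
)
# hits == ["credentials"], but the chunks already went out unmasked
```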
## Metrics and Monitoring

### Prometheus Metrics

All LLM-related metrics use the `rag_` prefix:

```promql
# LLM request rate
rate(rag_total_requests[5m])

# P95 LLM latency
histogram_quantile(0.95, rate(rag_llm_latency_seconds_bucket[5m]))

# Token usage by model
sum(rate(rag_llm_tokens_total[1h])) by (model, token_type)

# LLM errors by model and type
rate(rag_llm_errors_total[5m])
```

| Metric | Type | Labels | Description |
|---|---|---|---|
| `rag_total_requests` | Counter | — | Total requests processed |
| `rag_active_requests` | Gauge | — | In-flight requests |
| `rag_llm_latency_seconds` | Histogram | model, operation | LLM API call latency |
| `rag_llm_tokens_total` | Counter | model, token_type | Token usage |
| `rag_llm_errors_total` | Counter | model, error_type | LLM API errors |

Note: There is no dedicated DLP block metric. DLP events are tracked through SIEM events (`DLP_BLOCK`, `DLP_WARN`, etc.).
### Grafana Dashboard

```json
{
  "panels": [
    {
      "title": "LLM Requests per Minute",
      "query": "rate(rag_total_requests[1m])"
    },
    {
      "title": "P95 Latency",
      "query": "histogram_quantile(0.95, rate(rag_llm_latency_seconds_bucket[5m]))"
    },
    {
      "title": "Token Usage by Model",
      "query": "sum(rate(rag_llm_tokens_total[1h])) by (model)"
    },
    {
      "title": "LLM Errors",
      "query": "rate(rag_llm_errors_total[5m])"
    }
  ]
}
```
## Best Practices

### For Developers

- **Always pass context**: `_user_id`, `_session_id`, `_ip_address`
- **Handle `DLPBlockedException`**: inform the user why the request was rejected
- **Use streaming carefully**: understand the post-response DLP limitations
- **Don't log full responses**: use `max_response_length`

### For Operators

- **Monitor DLP events**: track `DLP_BLOCK`/`DLP_WARN` events in SIEM
- **Set up `LLM_ERROR` alerts**: provider errors require attention
- **Rotate API keys**: use Vault with auto-rotation
- **Audit token usage**: track anomalous consumption

### For Compliance

- **Document the data flow**: what data goes where
- **Retain logs**: a minimum of 1 year for compliance
- **Audit regularly**: verify DLP effectiveness
- **Plan incident response**: define the response to `DLP_BLOCK` events

Note: Structured prompt templates are managed via the global prompt registry (`config/prompts/`), enabling centralized auditing and versioning of all LLM prompts used across the platform.
## Related Documents
- Enterprise Security Brief — Security overview
- DLP Security — DLP patterns and configuration
- SIEM Integration — SIEM integration
Version: 1.2 | March 2026