Demo API

Public, unauthenticated endpoints for the landing page “Try it yourself” feature. They are rate-limited per IP and restricted to the onboarding scenario.

Endpoints

POST /api/v1/demo/chat

Send a natural-language query about the demo codebase (PostgreSQL 17.6).

Authentication: None (public endpoint)
Rate limit: 30 requests/minute per IP

Request

{
  "query": "Where is the main function?",
  "language": "en"
}

| Field | Type | Required | Default | Description |
|-------|------|----------|---------|-------------|
| query | string | Yes | | Natural-language question (1–500 chars) |
| language | string | No | "ru" | Response language ("en" or "ru") |

Response (200)

Successful query:

{
  "answer": "The main function is defined in src/backend/main/main.c at line 53...",
  "scenario_id": "onboarding",
  "processing_time_ms": 234.5
}

Rejected query (off-topic, wrong scenario, or blocked content):

{
  "answer": "### Request Blocked\n\nThis type of request is not supported in the demo version...",
  "scenario_id": "demo_rejection",
  "processing_time_ms": 12.3
}

Internal error (LLM unavailable, etc.):

{
  "answer": "Sorry, the analysis system is temporarily unavailable. Please try again later.",
  "scenario_id": "error",
  "processing_time_ms": 1502.7
}

| Field | Type | Description |
|-------|------|-------------|
| answer | string | LLM-generated response, rejection message, or error text |
| scenario_id | string | "onboarding" (success), "demo_rejection" (query rejected), or "error" (internal error) |
| processing_time_ms | float | Server-side processing time in milliseconds |

Error Responses

| Status | Cause | Body |
|--------|-------|------|
| 400 | Query too long (>500 chars) | {"detail": "Query too long. Maximum length is 500 characters."} |
| 422 | Pydantic validation failure (empty query, wrong types) | {"detail": [...]} |
| 429 | Rate limit exceeded | {"detail": "Too many requests"} |
| 503 | Demo mode disabled | {"detail": "Demo endpoint is currently disabled"} |

Note: Off-topic and wrong-scenario queries return HTTP 200 with scenario_id: "demo_rejection" and a friendly message in the answer field. They are deliberately not HTTP errors, so the landing page can display helpful guidance without triggering its error handlers. Internal errors likewise return HTTP 200, with scenario_id: "error".
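
Because an HTTP 200 can carry a success, a rejection, or an internal error, a client has to branch on scenario_id as well as on the status code. A minimal Python sketch of that decision follows; the function name classify_demo_response and the returned labels are illustrative assumptions, not part of the API.

```python
from typing import Any, Dict, Tuple


def classify_demo_response(status: int, body: Dict[str, Any]) -> Tuple[str, str]:
    """Map an HTTP status and decoded JSON body to a (kind, message) pair.

    The kind labels are hypothetical; only the status codes and
    scenario_id values come from the tables above.
    """
    if status == 200:
        scenario = body.get("scenario_id", "onboarding")
        if scenario == "demo_rejection":
            return "rejected", body["answer"]
        if scenario == "error":
            return "internal_error", body["answer"]
        return "answer", body["answer"]
    if status == 429:
        return "rate_limited", str(body.get("detail", ""))
    # 400, 422, 503 and anything else: surface the detail field as text.
    return "http_error", str(body.get("detail", ""))
```

For a 422 the detail field is a list, so the sketch simply stringifies it; a real client would probably format the individual validation errors.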

GET /api/v1/demo/status

Check demo endpoint availability and configuration.

Authentication: None

Response (200)

{
  "enabled": true,
  "rate_limit": "30/minute",
  "max_query_length": 500,
  "allowed_scenarios": ["onboarding"]
}
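
A client can use the status payload to avoid sending requests that would be refused anyway. The sketch below is one possible pre-check; can_submit is a hypothetical helper, and stripping whitespace before measuring length is a client-side choice rather than documented server behavior.

```python
from typing import Any, Dict


def can_submit(status: Dict[str, Any], query: str) -> bool:
    """Return True if the demo is enabled and the query length is in range.

    `status` is the decoded body of GET /api/v1/demo/status.
    """
    text = query.strip()  # assumption: ignore surrounding whitespace client-side
    return bool(status.get("enabled")) and 1 <= len(text) <= status["max_query_length"]
```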

Query Validation Pipeline

Incoming queries pass through a 3-stage validation pipeline before processing:

1. HARD REJECT  — explicit malicious content (regex patterns from domain plugin)
   └─ Returns 200 with scenario_id="demo_rejection", rejection_reason="blocked_content"

2. WRONG SCENARIO — legitimate but outside onboarding scope
   └─ Returns 200 with scenario_id="demo_rejection", rejection_reason="wrong_scenario"

3. DOMAIN RELEVANCE — keyword/pattern scoring against domain plugin
   └─ Score ≥ 0.5 (LOW threshold) → accepted, forwarded to onboarding handler
   └─ Score < 0.5 → rejected with scenario_id="demo_rejection", rejection_reason="off_topic"

ValidationResult

The internal ValidationResult dataclass (demo.py:122):

@dataclass
class ValidationResult:
    is_valid: bool
    confidence: float
    rejection_reason: Optional[str] = None  # "off_topic" | "wrong_scenario" | "blocked_content"
    detected_scenario: Optional[str] = None
    method: str = "keyword"

Relevance Thresholds

Configured in src/config/demo.yaml under relevance.thresholds:

| Threshold | Value | Trigger |
|-----------|-------|---------|
| high | 0.9 | 3+ keyword matches |
| medium | 0.75 | 2 keyword matches |
| low | 0.5 | 1 keyword match (rejection boundary) |
| minimal | 0.1 | 0 keyword matches |

Queries with score below low (0.5) are rejected as off-topic.

Domain Plugin Methods

Validation patterns are loaded from the active domain plugin (DomainPluginV3):

| Method | Returns | Purpose |
|--------|---------|---------|
| get_demo_keywords() | List[str] | Domain-specific keywords for relevance scoring |
| get_hard_reject_patterns() | List[str] | Regex patterns for hard rejection |
| get_wrong_scenario_patterns() | List[Tuple[str, str]] | (pattern, scenario_name) pairs for wrong-scenario detection |
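
To show how the three stages, the thresholds, and the plugin methods could fit together, here is a rough Python sketch. It is not the code in demo.py: StubPlugin, its patterns and keywords, substring keyword matching, the confidence of 1.0 for regex rejections, and method="regex" are all assumptions made for illustration. Only the stage order, the rejection reasons, and the match-count-to-score mapping come from the sections above.

```python
import re
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class ValidationResult:  # same fields as the dataclass shown earlier
    is_valid: bool
    confidence: float
    rejection_reason: Optional[str] = None
    detected_scenario: Optional[str] = None
    method: str = "keyword"


class StubPlugin:
    """Hypothetical stand-in for the active domain plugin."""

    def get_demo_keywords(self) -> List[str]:
        return ["mvcc", "vacuum", "wal"]

    def get_hard_reject_patterns(self) -> List[str]:
        return [r"ignore (all )?previous instructions"]

    def get_wrong_scenario_patterns(self) -> List[Tuple[str, str]]:
        return [(r"\bvulnerabilit(y|ies)\b", "security")]


def validate(query: str, plugin: StubPlugin) -> ValidationResult:
    text = query.lower()

    # Stage 1: hard reject on explicit malicious content.
    if any(re.search(p, text) for p in plugin.get_hard_reject_patterns()):
        return ValidationResult(False, 1.0, "blocked_content", method="regex")

    # Stage 2: legitimate request that belongs to another scenario.
    for pattern, scenario in plugin.get_wrong_scenario_patterns():
        if re.search(pattern, text):
            return ValidationResult(False, 1.0, "wrong_scenario",
                                    detected_scenario=scenario, method="regex")

    # Stage 3: keyword count mapped to a score per the thresholds table.
    matches = sum(1 for kw in plugin.get_demo_keywords() if kw in text)
    score = {0: 0.1, 1: 0.5, 2: 0.75}.get(matches, 0.9)
    if score >= 0.5:  # "low" threshold is the rejection boundary
        return ValidationResult(True, score)
    return ValidationResult(False, score, "off_topic")
```

With this stub, validate("How does MVCC work?", StubPlugin()) is accepted with a score at the low threshold, since exactly one keyword matches.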

Caching

The demo endpoint caches responses to reduce LLM calls:

| Cache | Size | TTL | Key |
|-------|------|-----|-----|
| Response cache | 100 entries (LRU) | 30 minutes | query.lower().strip() |

Note: The response cache is checked before validation. Repeated queries (even off-topic ones that were previously processed) return cached results without re-validation.
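
The behavior described above can be sketched with the standard library alone. This is an illustration under assumptions, not the cache used in demo.py: ResponseCache, answer_query, and the injectable clock are hypothetical names. The sketch shows the documented key normalization, LRU eviction, TTL expiry, and the cache lookup happening before any validation.

```python
import time
from collections import OrderedDict
from typing import Callable, Optional, Tuple


def cache_key(query: str) -> str:
    # Key normalization as documented: lowercase, then strip whitespace.
    return query.lower().strip()


class ResponseCache:
    """Small LRU cache with per-entry expiry (defaults: 100 entries, 30 minutes)."""

    def __init__(self, maxsize: int = 100, ttl_seconds: float = 1800.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._data: "OrderedDict[str, Tuple[float, str]]" = OrderedDict()
        self._maxsize = maxsize
        self._ttl = ttl_seconds
        self._clock = clock

    def get(self, query: str) -> Optional[str]:
        key = cache_key(query)
        entry = self._data.get(key)
        if entry is None:
            return None
        stored_at, answer = entry
        if self._clock() - stored_at >= self._ttl:
            del self._data[key]  # entry has expired
            return None
        self._data.move_to_end(key)  # mark as most recently used
        return answer

    def put(self, query: str, answer: str) -> None:
        key = cache_key(query)
        self._data[key] = (self._clock(), answer)
        self._data.move_to_end(key)
        if len(self._data) > self._maxsize:
            self._data.popitem(last=False)  # evict least recently used


def answer_query(query: str, cache: ResponseCache,
                 compute: Callable[[str], str]) -> str:
    cached = cache.get(query)
    if cached is not None:
        return cached  # served from cache, validation is skipped
    answer = compute(query)  # stands in for validation plus the handler
    cache.put(query, answer)
    return answer
```

One consequence of this ordering is that a cached rejection message is replayed for the full TTL, even if the plugin's keywords or patterns change in the meantime.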

Configuration

config.yaml

api:
  demo:
    enabled: true                    # Enable/disable demo endpoint
    rate_limit: 30/minute            # Rate limit per IP
    max_query_length: 500            # Max query length in characters
    allowed_scenarios:               # Allowed scenario IDs
      - onboarding

src/config/demo.yaml

Separate configuration for caching and relevance scoring:

cache:
  dynamic_response:
    maxsize: 100           # LRU cache size for responses
    ttl_seconds: 1800      # 30 minutes
  judge:
    maxsize: 1000          # LRU cache size (reserved for future LLM judge)
    ttl_seconds: 3600      # 1 hour

llm_judge:
  temperature: 0.1         # Reserved for future LLM judge implementation
  max_tokens: 50
  model: yandexgpt-lite

relevance:
  thresholds:
    high: 0.9
    medium: 0.75
    low: 0.5               # Rejection boundary
    minimal: 0.1

Environment Variables

| Variable | Default | Description |
|----------|---------|-------------|
| DEMO_ENABLED | true | Enable/disable demo endpoint |
| DEMO_RATE_LIMIT | 30/minute | Rate limit per IP |
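
As a rough sketch of how these variables might be read with their defaults: load_demo_settings is a hypothetical helper, the set of strings treated as true is an assumption, and how the values interact with config.yaml is not covered here.

```python
import os
from typing import Any, Dict, Mapping


def load_demo_settings(env: Mapping[str, str] = os.environ) -> Dict[str, Any]:
    """Read the demo variables, falling back to the documented defaults."""
    enabled_raw = env.get("DEMO_ENABLED", "true")
    return {
        # Assumption: these spellings count as "enabled"; anything else disables.
        "enabled": enabled_raw.strip().lower() in {"1", "true", "yes", "on"},
        "rate_limit": env.get("DEMO_RATE_LIMIT", "30/minute"),
    }
```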

Pydantic Models

class DemoRequest(BaseModel):
    query: str = Field(..., min_length=1, max_length=500, description="User query")
    language: str = Field(default="ru", description="Response language")

class DemoResponse(BaseModel):
    answer: str = Field(..., description="Response from the system")
    scenario_id: str = Field(default="onboarding", description="Scenario used")
    processing_time_ms: float = Field(..., description="Processing time in milliseconds")

Usage Examples

curl

# Valid query
curl -X POST http://localhost:8000/api/v1/demo/chat \
  -H "Content-Type: application/json" \
  -d '{"query": "How does MVCC work?", "language": "ru"}'

# Check status
curl http://localhost:8000/api/v1/demo/status
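
Python

The same chat request can be built with the standard library. This is a minimal sketch: build_demo_request is a hypothetical helper, and the base URL simply mirrors the curl example.

```python
import json
import urllib.request


def build_demo_request(query: str, language: str = "ru",
                       base_url: str = "http://localhost:8000") -> urllib.request.Request:
    """Build (but do not send) a POST request for the demo chat endpoint."""
    payload = json.dumps({"query": query, "language": language}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/api/v1/demo/chat",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Passing the result to urllib.request.urlopen sends it. Note that urlopen raises urllib.error.HTTPError for 4xx and 5xx responses, so the 400, 422, 429, and 503 cases arrive as exceptions rather than as return values.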

JavaScript (landing page)

const response = await fetch('/api/v1/demo/chat', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({
    query: document.getElementById('demo-input').value,
    language: 'ru'
  })
});

const data = await response.json();

if (response.ok) {
  if (data.scenario_id === 'demo_rejection') {
    // Query rejected — show friendly guidance
    showRejection(data.answer);
  } else if (data.scenario_id === 'error') {
    showError(data.answer);
  } else {
    showAnswer(data.answer);
  }
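} else if (response.status === 429) {
  showRateLimit();
} else {
  // 400, 422, 503: detail is a string, or a list of validation errors for 422
  showError(typeof data.detail === 'string' ? data.detail : 'Request failed');
}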

Security Considerations

  • No authentication: the endpoint is publicly accessible
  • Rate limiting: 30 requests/minute per IP limits abuse
  • Scenario restriction: only the onboarding scenario is allowed (no security analysis, file editing, etc.)
  • Query validation: a 3-stage pipeline blocks malicious and off-topic queries
  • Read-only: no write operations are possible through the demo endpoint
  • Domain keywords: loaded from the domain plugin, not hardcoded

Module: src/api/routers/demo.py
Config: src/config/demo.yaml, config.yaml (api.demo)
Last updated: March 2026