Yandex AI Studio Integration Guide


Integration guide for connecting CodeGraph to Yandex Cloud AI Studio via the OpenAI-compatible API.

Table of Contents

- Overview
- Quick Setup (3 Steps)
- Available Models
- Model URI Format
- Usage in Code
- Configuration Reference
- Privacy and Compliance
- Error Handling
- Retry Logic
- Best Practices
- Comparison with Other Providers
- Resources
- Next Steps

Overview

Yandex Cloud AI Studio is a comprehensive AI platform providing access to:

- YandexGPT family of models (YandexGPT, YandexGPT-Lite, YandexGPT-32k)
- Open-source models (Qwen3, DeepSeek, OpenAI OSS)
- Embedding models for semantic search

CodeGraph integrates with Yandex AI Studio using the OpenAI-compatible API, which allows using the standard openai Python library.

Key Features:

- OpenAI API compatibility (use the familiar openai library)
- Privacy-focused: logging disabled by default via x-data-logging-enabled: false
- Multiple model options, from 20B to 235B parameters
- Built-in retry with exponential backoff

Quick Setup (3 Steps)

Step 1: Create API Key

  1. Go to Yandex Cloud Console
  2. Navigate to AI Studio in the left menu
  3. Click Create new key in the top panel
  4. Select Create API key
  5. In the Scope field, select yc.ai.languageModels.execute
  6. Click Create and save both the ID and secret key
  7. Note your Folder ID from the folder selector in the top panel

Step 2: Set Environment Variables

# Windows PowerShell
$env:YANDEX_API_KEY = "AQVNxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"
$env:YANDEX_FOLDER_ID = "b1gxxxxxxxxxxxxxxxxx"

# Permanent (survives restart)
[System.Environment]::SetEnvironmentVariable('YANDEX_API_KEY', 'AQVNxxx...', 'User')
[System.Environment]::SetEnvironmentVariable('YANDEX_FOLDER_ID', 'b1gxxx...', 'User')
# Linux/Mac
export YANDEX_API_KEY="AQVNxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"
export YANDEX_FOLDER_ID="b1gxxxxxxxxxxxxxxxxx"

# Add to ~/.bashrc for permanent
echo 'export YANDEX_API_KEY="your_key"' >> ~/.bashrc
echo 'export YANDEX_FOLDER_ID="your_folder"' >> ~/.bashrc
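
Before moving on, it can help to confirm that both variables are actually visible to Python. The following sketch (the helper name is illustrative, not part of CodeGraph) reports any variable that is unset or empty:

```python
import os

def check_yandex_env() -> list[str]:
    """Return the names of required Yandex variables that are unset or empty."""
    required = ("YANDEX_API_KEY", "YANDEX_FOLDER_ID")
    return [name for name in required if not os.getenv(name)]

missing = check_yandex_env()
if missing:
    print(f"Missing environment variables: {', '.join(missing)}")
else:
    print("Yandex credentials are set")
```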

Step 3: Configure config.yaml

llm:
  provider: yandex

  yandex:
    api_key: ${YANDEX_API_KEY}
    folder_id: ${YANDEX_FOLDER_ID}
    model: "qwen3-235b-a22b-fp8/latest"
    temperature: 0.7
    max_tokens: 2000
    timeout: 180

Verify configuration:

python -c "
from src.llm.yandex_provider import YandexProvider
from src.llm.base_provider import LLMConfig

config = LLMConfig(provider_type='yandex')
provider = YandexProvider(config)
print(f'Provider: {provider}')
print('Connection successful!')
"

Available Models

Text Generation Models

| Model | URI | Context | Best For |
|---|---|---|---|
| Qwen3 235B | qwen3-235b-a22b-fp8/latest | 262K | Complex code analysis, security review (default) |
| gpt-oss-120b | gpt-oss-120b/latest | 131K | OpenAI OSS reasoning model, complex tasks |
| gpt-oss-20b | gpt-oss-20b/latest | 131K | OpenAI OSS model, balanced performance |
| Gemma 3 27B | gemma-3-27b-it/latest | 131K | Google's open model, instruction-tuned |
| YandexGPT Pro 5 | yandexgpt/latest | 32K | General purpose, excellent Russian support |
| YandexGPT Pro 5.1 | yandexgpt/rc | 32K | Latest features, improved reasoning |
| YandexGPT Lite 5 | yandexgpt-lite | 32K | Fast responses, simple queries |
| Alice AI LLM | aliceai-llm | 32K | Conversational AI, dialogue systems |
| Fine-tuned YandexGPT Lite | yandexgpt-lite/latest@<suffix> | 32K | Custom fine-tuned models |

Recommendations:

- Code Analysis: Use qwen3-235b-a22b-fp8/latest (262K context, best quality)
- Security Review: Use gpt-oss-120b/latest (reasoning capabilities)
- Fast Queries: Use yandexgpt-lite for speed
- Russian Text: Use yandexgpt/latest for native Russian support
- Large Files: Use models with 131K+ context (Qwen3, gpt-oss, Gemma)
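
These recommendations can be encoded as a simple lookup when selecting a model programmatically. This is an illustrative sketch (the task labels and helper name are hypothetical, not CodeGraph API); the model names come from the table above:

```python
# Hypothetical task labels mapped to the recommended model aliases.
RECOMMENDED_MODELS = {
    "code_analysis": "qwen3-235b-a22b-fp8/latest",
    "security_review": "gpt-oss-120b/latest",
    "fast_query": "yandexgpt-lite",
    "russian_text": "yandexgpt/latest",
}

def pick_model(task: str) -> str:
    """Fall back to the Qwen3 default for unknown task types."""
    return RECOMMENDED_MODELS.get(task, "qwen3-235b-a22b-fp8/latest")

print(pick_model("fast_query"))  # yandexgpt-lite
```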

Embedding Models

| Model | Dimensions | Use Case |
|---|---|---|
| text-search-doc/latest | 256 | Document embeddings (default) |
| text-search-query/latest | 256 | Query embeddings |
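
In a retrieval setup, documents are typically embedded with text-search-doc and queries with text-search-query, and candidates are ranked by cosine similarity. A minimal sketch of that ranking step, using short placeholder vectors (real Yandex embeddings have 256 dimensions):

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two equal-length embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

# Placeholder vectors standing in for query/document embeddings.
query_vec = [0.1, 0.3, 0.5, 0.1]
doc_vec = [0.2, 0.2, 0.6, 0.0]
print(f"similarity: {cosine_similarity(query_vec, doc_vec):.3f}")
```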

Model URI Format

Yandex requires a specific model URI format:

gpt://<folder_id>/<model_name>
emb://<folder_id>/<embedding_model>

Examples:

gpt://b1g123456789/qwen3-235b-a22b-fp8/latest
gpt://b1g123456789/yandexgpt/latest
emb://b1g123456789/text-search-doc/latest

The CodeGraph provider constructs this automatically from your folder_id and model settings.
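
If you need to build the URI yourself (for example, in a standalone script), the construction is a simple string join. A minimal sketch, with an illustrative helper name:

```python
def build_model_uri(folder_id: str, model: str, kind: str = "gpt") -> str:
    """Build a Yandex model URI; kind is 'gpt' for text models, 'emb' for embeddings."""
    return f"{kind}://{folder_id}/{model}"

print(build_model_uri("b1g123456789", "qwen3-235b-a22b-fp8/latest"))
# gpt://b1g123456789/qwen3-235b-a22b-fp8/latest
```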

Usage in Code

Basic Usage

import os
from openai import OpenAI

# Initialize client with Yandex endpoint
client = OpenAI(
    api_key=os.getenv("YANDEX_API_KEY"),
    base_url="https://llm.api.cloud.yandex.net/v1",
    default_headers={
        "x-data-logging-enabled": "false",  # Privacy
        "x-folder-id": os.getenv("YANDEX_FOLDER_ID"),
    },
)

# Construct model URI
folder_id = os.getenv("YANDEX_FOLDER_ID")
model_uri = f"gpt://{folder_id}/qwen3-235b-a22b-fp8/latest"

# Make request
response = client.chat.completions.create(
    model=model_uri,
    messages=[
        {"role": "system", "content": "You are a code analysis expert."},
        {"role": "user", "content": "Explain MVCC in PostgreSQL."},
    ],
    temperature=0.7,
    max_tokens=2000,
)

print(response.choices[0].message.content)

Streaming

# Stream response
stream = client.chat.completions.create(
    model=model_uri,
    messages=[
        {"role": "user", "content": "List security best practices for C code."},
    ],
    stream=True,
)

for chunk in stream:
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)

Embeddings

# Get embeddings
embedding_uri = f"emb://{folder_id}/text-search-doc/latest"

response = client.embeddings.create(
    model=embedding_uri,
    input="Find buffer overflow vulnerabilities",
    encoding_format="float",
)

embedding = response.data[0].embedding
print(f"Embedding dimension: {len(embedding)}")  # 256

With CodeGraph Workflow

import os

from src.llm.yandex_provider import YandexProvider
from src.llm.base_provider import LLMConfig

# Initialize provider (reads from config.yaml)
config = LLMConfig(
    provider_type='yandex',
    temperature=0.7,
    max_tokens=2000,
    extra_params={
        'api_key': os.getenv('YANDEX_API_KEY'),
        'folder_id': os.getenv('YANDEX_FOLDER_ID'),
        'model': 'qwen3-235b-a22b-fp8/latest',
    }
)
provider = YandexProvider(config)

# Generate response
response = provider.generate(
    system_prompt="You are a PostgreSQL security expert.",
    user_prompt="Find SQL injection vulnerabilities in the executor module."
)

print(response.content)
print(f"Tokens used: {response.metadata['tokens_used']}")
print(f"Latency: {response.metadata['latency_ms']:.0f}ms")

Using with workflow:

from src.workflow.orchestration.copilot import Copilot

# Copilot automatically uses configured provider
copilot = Copilot()
result = copilot.answer("Find memory allocation functions in PostgreSQL")
print(result['answer'])

Configuration Reference

Full config.yaml Example

llm:
  # Set Yandex as the active provider
  provider: yandex

  yandex:
    # Required: Authentication
    api_key: ${YANDEX_API_KEY}      # Service account API key
    folder_id: ${YANDEX_FOLDER_ID}  # Yandex Cloud folder ID

    # Model selection
    model: "qwen3-235b-a22b-fp8/latest"  # Best for code analysis

    # API endpoint (default, usually not changed)
    base_url: "https://llm.api.cloud.yandex.net/v1"

    # Timeouts
    timeout: 180  # seconds (increased for large prompts)

    # Generation parameters
    temperature: 0.7       # Creativity (0.0-1.0)
    max_tokens: 2000       # Max response length
    top_p: null            # Use Yandex default

    # Embedding model
    embedding_model: "text-search-doc/latest"

Environment Variables

| Variable | Required | Description |
|---|---|---|
| YANDEX_API_KEY | Yes | API key from the Yandex Cloud Console |
| YANDEX_FOLDER_ID | Yes | Folder ID where AI Studio is enabled |

Provider Parameters

| Parameter | Type | Default | Description |
|---|---|---|---|
| api_key | string | - | Yandex Cloud API key (required) |
| folder_id | string | - | Yandex Cloud folder ID (required) |
| model | string | qwen3-235b-a22b-fp8/latest | Text generation model |
| base_url | string | https://llm.api.cloud.yandex.net/v1 | API endpoint |
| timeout | int | 60 | Request timeout in seconds |
| temperature | float | 0.7 | Response randomness (0.0-1.0) |
| max_tokens | int | 2000 | Maximum response tokens |
| top_p | float | null | Nucleus sampling parameter |
| embedding_model | string | text-search-doc/latest | Embedding model |

Privacy and Compliance

CodeGraph automatically disables Yandex-side logging for privacy/GDPR compliance:

default_headers={
    "x-data-logging-enabled": "false",  # Disables logging on Yandex side
    "x-folder-id": folder_id,
}

This ensures:

- Your prompts are not logged by Yandex
- Response data is not stored
- Compliance with data protection regulations

Error Handling

Authentication Error (401)

Error: 401 Unauthorized

Solutions:

# Check if variables are set
echo $YANDEX_API_KEY
echo $YANDEX_FOLDER_ID

# Verify API key format (should start with AQVNw, AQVN_, or similar)
python -c "import os; key = os.getenv('YANDEX_API_KEY', ''); print(f'Key prefix: {key[:5]}...' if key else 'NOT SET')"

# Verify API key scope includes yc.ai.languageModels.execute

Rate Limit Exceeded (429)

Error: 429 Too Many Requests

Solutions:

# Built-in retry handles this automatically
# But you can increase timeout for complex requests
yandex:
  timeout: 300  # 5 minutes

Or implement custom retry:

import time

# YandexRateLimitError is the provider's rate-limit exception; import it from
# wherever your provider module defines it.
for attempt in range(3):
    try:
        response = provider.generate(system_prompt, user_prompt)
        break
    except YandexRateLimitError:
        time.sleep(2 ** attempt)  # Exponential backoff: 1s, 2s, 4s
else:
    raise RuntimeError("All retry attempts failed")

Connection Timeout

Error: Request timeout

Solutions:

# Increase timeout for large prompts/responses
yandex:
  timeout: 300  # seconds

Folder ID Error

Error: Folder not found or access denied

Solutions:

1. Verify the folder ID in the Yandex Cloud Console (top panel dropdown)
2. Ensure AI Studio is enabled in the folder
3. Check that the API key has access to the folder

Retry Logic

The Yandex provider includes built-in retry with exponential backoff:

# Automatic retry for:
# - APITimeoutError (timeout)
# - APIConnectionError (connection issues)

# Retry parameters:
# - max_retries: 3
# - initial_delay: 2.0 seconds
# - max_delay: 30.0 seconds
# - backoff_factor: 2.0

Example retry sequence:

1. First attempt fails -> wait 2s
2. Second attempt fails -> wait 4s
3. Third attempt fails -> raise exception
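
The delay schedule above follows directly from the retry parameters. A small sketch computing the waits between attempts (the helper name is illustrative; the last failed attempt raises instead of waiting):

```python
def backoff_delays(max_retries: int = 3, initial: float = 2.0,
                   factor: float = 2.0, max_delay: float = 30.0) -> list[float]:
    """Delays waited between attempts, capped at max_delay."""
    return [min(initial * factor ** i, max_delay) for i in range(max_retries - 1)]

print(backoff_delays())  # [2.0, 4.0]
```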

Best Practices

  1. Use Qwen3-235B for code analysis - Best quality for complex queries
  2. Use YandexGPT-Lite for simple queries - Faster response times
  3. Set appropriate timeout - 180s for security analysis, 60s for simple queries
  4. Monitor token usage - Response metadata includes token counts
  5. Use streaming for long responses - Better UX for interactive applications
  6. Store credentials in environment - Never commit API keys to git

Comparison with Other Providers

| Feature | Yandex AI Studio | GigaChat | OpenAI |
|---|---|---|---|
| API Compatibility | OpenAI-compatible | Custom SDK | Native |
| Best Model | Qwen3 235B | GigaChat-2-Pro | GPT-4 |
| Privacy Header | Yes (x-data-logging-enabled) | No | No |
| Russian Support | Excellent | Excellent | Good |
| Pricing | Pay-per-token | Pay-per-token | Pay-per-token |
| Max Context | 262K (Qwen3) | 32K | 128K (GPT-4) |
| Open Models | Qwen3, Gemma, gpt-oss | No | No |

Resources

Next Steps