Scenario 04: Feature Development

For developers who need to understand existing code before adding new features.

Quick Start

# Select Feature Development Scenario
/select 04

How It Works

Intent Classification

The FeatureDevIntentDetector classifies each query into one of four intents, plus a fallback. Each intent has English (EN) and Russian (RU) keywords, matched morphologically.

Intent              | Priority | Keywords (EN)                               | Keywords (RU)
integration_points  | 10       | where to add, hook point, implement feature | куда добавить, точка интеграции
similar_features    | 20       | similar to, pattern, existing feature       | похожая функция, паттерн, подобный
extension_points    | 30       | extension point, plugin, hook, extend       | точка расширения, плагин, хук
dependency_analysis | 40       | dependency, prerequisite, required          | зависимость, требование, необходимый

If no intent matches, the fallback general_feature_dev is used (confidence=0.5).
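The classification rule can be sketched as follows. This is a minimal illustration, not the actual FeatureDevIntentDetector: it uses plain substring matching instead of morphological matching, covers only the EN keywords, assumes that a lower priority number is checked first, and returns a placeholder confidence of 1.0 for matches because the real scoring is not described here. The names `IntentMatch` and `classify` are hypothetical.

```python
from dataclasses import dataclass

# EN keywords copied from the table above. Checking lower priority numbers
# first is an assumption about how the priority column is used.
INTENT_KEYWORDS = [
    ("integration_points", 10, ["where to add", "hook point", "implement feature"]),
    ("similar_features", 20, ["similar to", "pattern", "existing feature"]),
    ("extension_points", 30, ["extension point", "plugin", "hook", "extend"]),
    ("dependency_analysis", 40, ["dependency", "prerequisite", "required"]),
]


@dataclass
class IntentMatch:
    intent: str
    confidence: float


def classify(query: str) -> IntentMatch:
    """Return the first intent (by priority) whose keyword occurs in the query."""
    text = query.lower()
    for intent, _priority, keywords in sorted(INTENT_KEYWORDS, key=lambda row: row[1]):
        if any(keyword in text for keyword in keywords):
            # 1.0 is a placeholder; the real confidence formula is not documented.
            return IntentMatch(intent, 1.0)
    # Documented fallback: general_feature_dev with confidence 0.5.
    return IntentMatch("general_feature_dev", 0.5)
```

Under these assumptions, `classify("Find planner hook injection points")` resolves to `extension_points`, and a query with no keyword falls through to `general_feature_dev`.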

Two-Phase Architecture

S04 uses a two-phase approach that skips the LLM call whenever a handler can answer the query directly:

Phase 1: Handler-based (no LLM). The integrate_handlers() function tries template-based handlers that produce structured reports directly from CPG data. If a handler matches the detected intent and finds results, the response is returned without calling the LLM.

Phase 2: LLM fallback. If no handler matched or the handler found 0 results, the full pipeline runs: keyword extraction, CPG queries (4-level fallback chain), CallGraphAnalyzer for integration points, DependencyAnalyzer for betweenness centrality, then LLM generation.

Query -> FeatureDevIntentDetector -> integrate_handlers()
  |                                       |
  |  Phase 1: Handler matched?   Yes -> Structured report (no LLM)
  |                                No  -> Phase 2: Full pipeline
  |
  Phase 2: Domain plugin -> CPG queries -> CallGraphAnalyzer
           -> DependencyAnalyzer -> LLM -> Response
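The control flow in the diagram can be sketched as a small dispatcher. This is an illustration under assumptions, not the real integrate_handlers(): the `Handler` type, the `results` key, and the `answer` function are hypothetical stand-ins, and the Phase 2 pipeline is reduced to a single callable.

```python
from typing import Callable, Optional

# Hypothetical handler type: returns a report dict, or None when nothing matched.
Handler = Callable[[str], Optional[dict]]


def answer(
    query: str,
    intent: str,
    handlers: dict[str, Handler],
    llm_pipeline: Callable[[str], str],
) -> dict:
    """Try a template handler first; fall back to the full LLM pipeline."""
    handler = handlers.get(intent)
    if handler is not None:
        report = handler(query)
        # Phase 1 succeeds only if the handler matched AND found results.
        if report and report.get("results"):
            return {"phase": 1, "used_llm": False, "response": report}
    # Phase 2: no handler for this intent, or the handler found 0 results.
    return {"phase": 2, "used_llm": True, "response": llm_pipeline(query)}
```

The point of the split is that the expensive callable is only invoked on the fallback path, so a handler hit costs no LLM call at all.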

Three Handlers

Three handlers are registered in HandlerRegistry("feature_dev"), sorted by priority:

Handler                  | Priority | Intent             | Function
ExtensionPointHandler    | 5        | extension_points   | Finds extension points/hooks in CPG by category, with fallback CPG search and relevance ranking
IntegrationPointsHandler | 10       | integration_points | Finds hook points, entry points from domain plugin, calculates integration complexity
SimilarFeaturesHandler   | 20       | similar_features   | Finds similar patterns, analyzes dependencies, recommends placement, classifies pattern types

All handlers inherit FeatureDevHandler -> BaseHandler and produce HandlerResult with data, retrieved_functions, evidence, and metadata.
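To make the shape of these objects concrete, here is a rough sketch of a result container and a priority-sorted registry. The four HandlerResult field names come from the text above; their types, and the `RegisteredHandler` and `SketchRegistry` classes, are assumptions made for illustration and do not reproduce the real HandlerRegistry.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class HandlerResult:
    # Field names are from the documentation; the types are assumed.
    data: dict[str, Any] = field(default_factory=dict)
    retrieved_functions: list[str] = field(default_factory=list)
    evidence: list[str] = field(default_factory=list)
    metadata: dict[str, Any] = field(default_factory=dict)


@dataclass(order=True)
class RegisteredHandler:
    priority: int                       # the only field used for ordering
    name: str = field(compare=False)
    intent: str = field(compare=False)


class SketchRegistry:
    """Hypothetical stand-in for HandlerRegistry("feature_dev")."""

    def __init__(self, scenario: str) -> None:
        self.scenario = scenario
        self._handlers: list[RegisteredHandler] = []

    def register(self, handler: RegisteredHandler) -> None:
        self._handlers.append(handler)
        self._handlers.sort()           # keep lowest priority number first

    def names(self) -> list[str]:
        return [h.name for h in self._handlers]


registry = SketchRegistry("feature_dev")
registry.register(RegisteredHandler(20, "SimilarFeaturesHandler", "similar_features"))
registry.register(RegisteredHandler(5, "ExtensionPointHandler", "extension_points"))
registry.register(RegisteredHandler(10, "IntegrationPointsHandler", "integration_points"))
```

Regardless of registration order, `registry.names()` lists the handlers in the priority order shown in the table.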

Understanding Existing Code

> Find functions related to transaction handling

╭─────────────── Related Code ────────────────────────────────╮
│                                                              │
│  Transaction Functions:                                      │
│                                                              │
│  Entry Points:                                               │
│    StartTransaction()      src/backend/access/transam/xact.c │
│    CommitTransaction()     src/backend/access/transam/xact.c │
│    AbortTransaction()      src/backend/access/transam/xact.c │
│                                                              │
│  Support Functions:                                          │
│    AssignTransactionId()   src/backend/access/transam/xact.c │
│    GetCurrentTransactionId() src/backend/access/transam/xact.c │
│                                                              │
│  Related Modules:                                            │
│    src/backend/access/heap/  - Heap operations               │
│    src/backend/storage/lmgr/ - Lock manager                  │
│                                                              │
╰──────────────────────────────────────────────────────────────╯

Understanding Call Chains

> Show the call path from parser to executor

╭─────────────── Call Chain ──────────────────────────────────╮
│                                                              │
│  Parser -> Executor Flow:                                    │
│                                                              │
│  1. pg_parse_query()                                         │
│       |                                                      │
│  2. pg_analyze_and_rewrite()                                 │
│       |                                                      │
│  3. pg_plan_queries()                                        │
│       |                                                      │
│  4. PortalRun()                                              │
│       |                                                      │
│  5. ExecutorStart() -> ExecutorRun() -> ExecutorFinish()      │
│       |                                                      │
│  6. ExecProcNode() [dispatches to specific nodes]            │
│                                                              │
╰──────────────────────────────────────────────────────────────╯

CallGraphAnalyzer Integration

The main workflow uses CallGraphAnalyzer from src/analysis/ for deep integration point analysis:

  • find_all_callers(method, max_depth) — traverses the call graph upward to find all transitive callers
  • find_all_callees(method, max_depth) — traverses the call graph downward to find all transitive callees
  • analyze_impact(method) — computes impact_score, transitive_callers, transitive_callees, direct_callers

Methods with a high caller count (> min_callees threshold) are identified as popular integration points — good candidates for hooking new features. Impact analysis determines safety: methods with impact_score < 0.5 are marked as safe to extend.
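The upward traversal can be illustrated with a breadth-first search over a reverse call graph. This is a sketch only: the real CallGraphAnalyzer works on CPG data, whereas this version takes a plain dictionary, and the toy edges below are loosely based on the call chain shown earlier. The `is_safe_to_extend` helper simply restates the documented `impact_score < 0.5` rule; how impact_score itself is computed is not covered here.

```python
from collections import deque


def find_all_callers(
    callers_of: dict[str, set[str]], method: str, max_depth: int
) -> set[str]:
    """Collect transitive callers of `method`, at most `max_depth` hops up."""
    seen: set[str] = set()
    queue = deque([(method, 0)])
    while queue:
        current, depth = queue.popleft()
        if depth == max_depth:
            continue                    # do not expand beyond the depth limit
        for caller in callers_of.get(current, ()):
            if caller not in seen and caller != method:
                seen.add(caller)
                queue.append((caller, depth + 1))
    return seen


def is_safe_to_extend(impact_score: float) -> bool:
    """Documented rule: methods with impact_score < 0.5 are safe to extend."""
    return impact_score < 0.5


# Toy reverse call graph (callee -> direct callers), for illustration only.
TOY_CALLERS = {
    "ExecProcNode": {"ExecutorRun"},
    "ExecutorRun": {"PortalRun"},
    "PortalRun": {"exec_simple_query"},
}
```

A downward find_all_callees would be the same traversal over the forward graph (caller -> callees).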

Integration Points

Finding Extension Points

> Where can I add a new node type?

╭─────────────── Extension Points ────────────────────────────╮
│                                                              │
│  Adding New Executor Node Type:                              │
│                                                              │
│  1. Define node structure in:                                │
│     src/include/nodes/execnodes.h                            │
│                                                              │
│  2. Implement node operations:                               │
│     src/backend/executor/nodeXXX.c                           │
│     - ExecInitXXX()                                          │
│     - ExecXXX()                                              │
│     - ExecEndXXX()                                           │
│                                                              │
│  3. Register in dispatcher:                                  │
│     src/backend/executor/execProcnode.c                      │
│     - ExecInitNode()                                         │
│     - ExecProcNode()                                         │
│                                                              │
╰──────────────────────────────────────────────────────────────╯

The ExtensionPointHandler uses longest-match-first category detection across EN and RU keywords (e.g., “table access method” -> table_am, “custom scan” -> custom_scan). Categories include: aggregate, custom_scan, planner_hooks, executor_hooks, utility_hooks, table_am, index_am, fdw, join_algorithm, authentication.
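Longest-match-first detection matters when one keyword is a substring of another. The sketch below shows the idea with a tiny keyword table. Only the "table access method" and "custom scan" mappings come from the text above; the bare "access method" entry is a hypothetical keyword added to create an overlap, and `detect_category` is not the handler's real function name.

```python
from typing import Optional

# "table access method" and "custom scan" are documented examples.
# The "access method" -> index_am entry is hypothetical, added to show overlap.
CATEGORY_KEYWORDS = {
    "table_am": ["table access method"],
    "index_am": ["access method"],
    "custom_scan": ["custom scan"],
}


def detect_category(query: str) -> Optional[str]:
    """Return the category of the longest keyword that occurs in the query."""
    text = query.lower()
    pairs = [
        (keyword, category)
        for category, keywords in CATEGORY_KEYWORDS.items()
        for keyword in keywords
    ]
    # Try longer keywords first so a specific phrase beats its own substring.
    for keyword, category in sorted(pairs, key=lambda p: len(p[0]), reverse=True):
        if keyword in text:
            return category
    return None
```

Without the length ordering, a query mentioning "table access method" could be claimed by the shorter "access method" keyword, depending on iteration order.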

Hook Points and Entry Points

The IntegrationPointsHandler combines three data sources:

  1. Hook points: subsystem functions from the domain plugin, filtered by target module
  2. Entry points: domain.get_entry_points() for the active domain
  3. Similar patterns: methods matching the feature name via CPG ILIKE queries

Integration complexity is calculated from the number of hook points: low / medium / high.

Betweenness Centrality Analysis

The Phase 4A enhancement uses DependencyAnalyzer.identify_architectural_chokepoints() to find strategically important integration points:

  • Methods with betweenness_percentile > betweenness_percentile_high threshold are flagged as strategic integration points
  • These are central in the architecture — good for features that affect multiple subsystems
  • Metadata includes high_centrality_points, top_centrality_method, max_centrality_percentile
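The percentile filter can be sketched independently of how betweenness is computed. The example below assumes the betweenness scores are already available as a dictionary and that "percentile" means the share of methods with a strictly lower score; both are assumptions, as is the exact content of each metadata field. The function names are hypothetical, and the threshold is passed in explicitly because its configured value is not stated here.

```python
from bisect import bisect_left


def percentile_ranks(scores: dict[str, float]) -> dict[str, float]:
    """Percentile rank: percentage of methods with a strictly lower score."""
    ordered = sorted(scores.values())
    total = len(ordered)
    return {
        name: 100.0 * bisect_left(ordered, score) / total
        for name, score in scores.items()
    }


def strategic_points(
    scores: dict[str, float], betweenness_percentile_high: float
) -> dict:
    """Flag methods whose percentile exceeds the configured threshold."""
    ranks = percentile_ranks(scores)
    flagged = sorted(
        (name for name, pct in ranks.items() if pct > betweenness_percentile_high),
        key=lambda name: ranks[name],
        reverse=True,
    )
    top = max(ranks, key=lambda name: ranks[name]) if ranks else None
    return {
        "high_centrality_points": flagged,
        "top_centrality_method": top,
        "max_centrality_percentile": ranks[top] if top is not None else 0.0,
    }
```

With four methods scored 0.0, 0.1, 0.2 and 0.9, the ranks under this definition are 0, 25, 50 and 75, so a threshold of 60 flags only the last one.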

Similar Features and Patterns

Pattern Classifications

The SimilarFeaturesHandler classifies methods into 5 naming patterns:

Pattern Prefixes
Initialization init_, setup_, create_, new_, alloc_
Handler/callback handle_, process_, on_, exec_, run_
Query/lookup get_, find_, lookup_, fetch_, search_
Validation validate_, check_, verify_, is_, has_
Cleanup free_, destroy_, cleanup_, close_, end_
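The prefix table translates directly into a lookup. The sketch below is an illustration rather than the handler's code; it assumes case-insensitive prefix matching and returns None for names that match no prefix, neither of which is confirmed by the text. `classify_pattern` is a hypothetical name.

```python
from typing import Optional

# Prefixes copied from the table above.
PATTERN_PREFIXES = {
    "Initialization": ("init_", "setup_", "create_", "new_", "alloc_"),
    "Handler/callback": ("handle_", "process_", "on_", "exec_", "run_"),
    "Query/lookup": ("get_", "find_", "lookup_", "fetch_", "search_"),
    "Validation": ("validate_", "check_", "verify_", "is_", "has_"),
    "Cleanup": ("free_", "destroy_", "cleanup_", "close_", "end_"),
}


def classify_pattern(method_name: str) -> Optional[str]:
    """Map a method name to a naming pattern by its prefix, or None."""
    lowered = method_name.lower()       # case-insensitive matching is an assumption
    for pattern, prefixes in PATTERN_PREFIXES.items():
        if lowered.startswith(prefixes):  # str.startswith accepts a tuple
            return pattern
    return None
```

Note that purely prefix-based rules do not catch CamelCase names such as ExecInitNode, which contain no underscore-delimited prefix.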

Placement Recommendation

SimilarFeaturesHandler._recommend_placement() recommends where to place new code:

  1. Extracts keywords from the feature name
  2. Scores each subsystem by keyword overlap (with target module boost)
  3. Finds the most common file among the best subsystem’s functions
  4. Identifies the nearest method (by name similarity) as an anchor
  5. Calculates confidence: overlap / (keywords + 2.0)

> Where should I place new cache invalidation feature?

╭─────────────── Recommended Placement ─────────────────────╮
│                                                            │
│  Subsystem: cache manager                                  │
│  File: src/backend/utils/cache.c                           │
│  Near method: cache_lookup() (line 145)                    │
│  Confidence: 75%                                           │
│                                                            │
│  Related methods in this area:                             │
│    - cache_insert()                                        │
│    - cache_remove()                                        │
│    - cache_invalidate_all()                                │
│                                                            │
╰────────────────────────────────────────────────────────────╯
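The scoring and confidence steps can be sketched as below. This is a simplified reading of the steps listed for _recommend_placement(), not its implementation: it assumes "keywords" in the formula means the number of feature keywords, that overlap is a set intersection with subsystem terms, and that the target module boost is a fixed additive bonus (the value 1.0 is hypothetical). The subsystem vocabulary in the example is made up.

```python
from typing import Optional


def placement_confidence(feature_keywords: set[str], subsystem_terms: set[str]) -> float:
    """Documented formula: overlap / (keywords + 2.0)."""
    overlap = len(feature_keywords & subsystem_terms)
    return overlap / (len(feature_keywords) + 2.0)


def best_subsystem(
    feature_keywords: set[str],
    subsystems: dict[str, set[str]],
    target_module: Optional[str] = None,
    boost: float = 1.0,                 # hypothetical boost value
) -> str:
    """Pick the subsystem with the highest keyword overlap (plus target boost)."""
    def score(name: str) -> float:
        bonus = boost if name == target_module else 0.0
        return len(feature_keywords & subsystems[name]) + bonus

    return max(subsystems, key=score)


# Made-up subsystem vocabulary, for illustration only.
SUBSYSTEMS = {
    "cache manager": {"cache", "lookup", "invalidation"},
    "lock manager": {"lock", "wait"},
}
```

Because of the +2.0 in the denominator, confidence never reaches 1.0: two keywords that both match give 2 / (2 + 2.0) = 0.5, and the value only approaches 1.0 as the number of matching keywords grows.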

Pattern Examples

> Show pattern examples for executor subsystem

╭─────────────── Pattern Examples ──────────────────────────╮
│                                                            │
│  Examples from the target subsystem:                       │
│                                                            │
│  Method          | File          | Pattern        | CC    │
│  init_executor() | execMain.c:42 | Initialization | 3     │
│  exec_scan()     | execScan.c:88 | Handler        | 5     │
│  get_plan_node() | execUtils.c:15| Query/lookup   | 2     │
│  check_perms()   | execPerms.c:7 | Validation     | 4     │
│  end_executor()  | execMain.c:200| Cleanup        | 1     │
│                                                            │
╰────────────────────────────────────────────────────────────╯

Examples are sorted by cyclomatic complexity (ascending) — simpler patterns serve as better templates.

Domain-Agnostic Architecture

S04 is fully domain-agnostic — it uses get_active_domain() to load the domain plugin, which provides:

  • get_extension_points(category) — extension point functions for a category
  • get_extension_categories() — available categories with keywords
  • get_subsystem_functions() — functions grouped by subsystem
  • get_entry_points() — top-level entry point function names
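One way to picture the plugin contract is as a structural interface. The four method names below come from the list above; the return types, the `DomainPlugin` protocol itself, and the `ToyDomain` class with its data are assumptions made for illustration, not the project's actual definitions.

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class DomainPlugin(Protocol):
    # Method names are documented; the signatures' return types are assumed.
    def get_extension_points(self, category: str) -> list[str]: ...
    def get_extension_categories(self) -> dict[str, list[str]]: ...
    def get_subsystem_functions(self) -> dict[str, list[str]]: ...
    def get_entry_points(self) -> list[str]: ...


class ToyDomain:
    """Toy plugin with made-up data, loosely modelled on the examples above."""

    def get_extension_points(self, category: str) -> list[str]:
        points = {"executor_hooks": ["ExecInitNode", "ExecProcNode"]}
        return points.get(category, [])

    def get_extension_categories(self) -> dict[str, list[str]]:
        return {"table_am": ["table access method"], "custom_scan": ["custom scan"]}

    def get_subsystem_functions(self) -> dict[str, list[str]]:
        return {"transam": ["StartTransaction", "CommitTransaction", "AbortTransaction"]}

    def get_entry_points(self) -> list[str]:
        return ["StartTransaction", "CommitTransaction", "AbortTransaction"]
```

Keeping all domain knowledge behind an interface like this is what lets the workflow code stay free of project-specific names.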

When querying the CPG for matching functions, the workflow uses a 4-level fallback chain:

  1. Domain plugin — get functions from plugin, query CPG by exact name
  2. Category-based CPG search — derive English terms from category name, ILIKE search
  3. Keyword-based CPG search — extract keywords from query, ILIKE search
  4. Generic hooks fallback — search for *_hook, *Hook* patterns; then regex extraction of CamelCase/snake_case names from query
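The chain amounts to trying each level in order and stopping at the first one that returns anything. The sketch below shows that control flow, plus one possible regex for the final identifier extraction step. Both `first_non_empty` and `extract_identifiers` are hypothetical helpers, each level is a plain callable instead of a real CPG query, and the regular expression is an assumed approximation of "CamelCase/snake_case names".

```python
import re
from typing import Callable, Iterable

# Assumed pattern: snake_case with at least one underscore, or CamelCase
# with at least two capitalized segments.
IDENTIFIER_RE = re.compile(
    r"\b(?:[a-z]+(?:_[a-z0-9]+)+|[A-Z][a-z0-9]+(?:[A-Z][a-z0-9]+)+)\b"
)


def extract_identifiers(query: str) -> list[str]:
    """Pull likely function names out of free-text queries."""
    return IDENTIFIER_RE.findall(query)


def first_non_empty(
    levels: Iterable[tuple[str, Callable[[], list[str]]]]
) -> tuple[str, list[str]]:
    """Run fallback levels in order; return the first that yields results."""
    for name, fetch in levels:
        results = fetch()
        if results:
            return name, results
    return "none", []


def lookup(query: str) -> tuple[str, list[str]]:
    # The first three levels are stubbed as empty; real ones would query the CPG.
    return first_non_empty([
        ("domain_plugin", lambda: []),
        ("category_search", lambda: []),
        ("keyword_search", lambda: []),
        ("generic_fallback", lambda: extract_identifiers(query)),
    ])
```

Each level is only evaluated if every earlier level came back empty, so the cheaper exact-name lookup is always preferred over the broader searches.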

CLI Usage

# Ask about integration points
python -m src.cli query "Where to add a new join algorithm?"

# Find similar features
python -m src.cli query "Show similar features to heap_insert"

# Find extension points
python -m src.cli query "Find planner hook injection points"

# Analyze dependencies
python -m src.cli query "What modules does the buffer manager depend on?"

Example Questions

  • “How does [feature_name] work?”
  • “What functions are involved in [subsystem]?”
  • “Where should I place new [feature] in [module]?”
  • “Show similar implementations to [existing_feature]”
  • “What’s the data flow for [operation]?”
  • “Find integration points for [component]”
  • “Show pattern examples for [subsystem]”
  • “Recommend placement for new [feature_type]”
  • “Find extension points for [category]”
  • “What are the hook points in [module]?”

S04 vs S05 vs S13: S04 focuses on finding integration and extension points for adding new features to existing code. S05 focuses on refactoring existing code (detecting smells, suggesting improvements). S13 focuses on mass refactoring — coordinating large-scale changes across many files.