For developers who need to understand existing code before adding new features.
Table of Contents¶
- Quick Start
- How It Works
- Intent Classification
- Two-Phase Architecture
- Three Handlers
- Understanding Existing Code
- Finding Related Functions
- Understanding Call Chains
- CallGraphAnalyzer Integration
- Integration Points
- Finding Extension Points
- Hook Points and Entry Points
- Betweenness Centrality Analysis
- Similar Features and Patterns
- Pattern Classifications
- Placement Recommendation
- Pattern Examples
- Domain-Agnostic Architecture
- CLI Usage
- Example Questions
- Related Scenarios
Quick Start¶
# Select Feature Development Scenario
/select 04
How It Works¶
Intent Classification¶
The FeatureDevIntentDetector classifies queries into one of 4 intents (+ fallback). Each intent has EN and RU keywords with morphological matching.
| Intent | Priority | Keywords (EN) | Keywords (RU) |
|---|---|---|---|
| integration_points | 10 | where to add, hook point, implement feature | куда добавить, точка интеграции |
| similar_features | 20 | similar to, pattern, existing feature | похожая функция, паттерн, подобный |
| extension_points | 30 | extension point, plugin, hook, extend | точка расширения, плагин, хук |
| dependency_analysis | 40 | dependency, prerequisite, required | зависимость, требование, необходимый |
If no intent matches, the fallback general_feature_dev is used (confidence=0.5).
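The keyword matching described above could look roughly like the sketch below. The keywords and the fallback confidence come from this page; the `classify_intent` function, the plain substring check (a stand-in for the real morphological matching), the confidence of 1.0 for a keyword hit, and the assumption that lower priority numbers are tried first are all illustrative, not the actual `FeatureDevIntentDetector` implementation.

```python
from typing import NamedTuple


class Intent(NamedTuple):
    name: str
    priority: int
    keywords: tuple


# Keywords copied from the table above (EN and RU side by side).
INTENTS = [
    Intent("integration_points", 10, ("where to add", "hook point", "implement feature",
                                      "куда добавить", "точка интеграции")),
    Intent("similar_features", 20, ("similar to", "pattern", "existing feature",
                                    "похожая функция", "паттерн", "подобный")),
    Intent("extension_points", 30, ("extension point", "plugin", "hook", "extend",
                                    "точка расширения", "плагин", "хук")),
    Intent("dependency_analysis", 40, ("dependency", "prerequisite", "required",
                                       "зависимость", "требование", "необходимый")),
]


def classify_intent(query: str) -> tuple:
    """Return (intent_name, confidence) for a query."""
    text = query.lower()
    # Assumption: intents are tried in ascending priority order.
    for intent in sorted(INTENTS, key=lambda i: i.priority):
        # Plain substring check; the real detector uses morphological matching.
        if any(keyword in text for keyword in intent.keywords):
            return intent.name, 1.0  # assumed confidence for a keyword hit
    return "general_feature_dev", 0.5  # fallback stated in the text above
```

With this simplification, a query such as "Where to add a new join algorithm?" lands on `integration_points`, while a query with no known keyword falls through to `general_feature_dev`.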
Two-Phase Architecture¶
S04 uses a two-phase approach so that queries a handler can answer never incur an LLM call:
Phase 1: Handler-based (no LLM). The integrate_handlers() function tries template-based handlers that produce structured reports directly from CPG data. If a handler matches the detected intent and finds results, the response is returned without calling the LLM.
Phase 2: LLM fallback. If no handler matched or the handler found 0 results, the full pipeline runs: keyword extraction, CPG queries (4-level fallback chain), CallGraphAnalyzer for integration points, DependencyAnalyzer for betweenness centrality, then LLM generation.
Query -> FeatureDevIntentDetector -> integrate_handlers()
  |                                        |
  |          Phase 1: Handler matched? Yes -> Structured report (no LLM)
  |                                    No  -> Phase 2: Full pipeline
  |
Phase 2: Domain plugin -> CPG queries -> CallGraphAnalyzer
         -> DependencyAnalyzer -> LLM -> Response
Three Handlers¶
Three handlers are registered in HandlerRegistry("feature_dev"), sorted by priority:
| Handler | Priority | Intent | Function |
|---|---|---|---|
| ExtensionPointHandler | 5 | extension_points | Finds extension points/hooks in CPG by category, with fallback CPG search and relevance ranking |
| IntegrationPointsHandler | 10 | integration_points | Finds hook points, entry points from domain plugin, calculates integration complexity |
| SimilarFeaturesHandler | 20 | similar_features | Finds similar patterns, analyzes dependencies, recommends placement, classifies pattern types |
All handlers inherit FeatureDevHandler -> BaseHandler and produce HandlerResult with data, retrieved_functions, evidence, and metadata.
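The two-phase flow and the priority ordering of handlers can be sketched together as follows. This is a minimal illustration under stated assumptions: `Handler`, `answer`, and the `run` callable are hypothetical names, the `HandlerResult` field types are guessed from the field names above, and "found results" is approximated by a non-empty `retrieved_functions` list. The handler data is toy data.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class HandlerResult:
    # Field names follow the text above; the types are assumed.
    data: dict
    retrieved_functions: list = field(default_factory=list)
    evidence: list = field(default_factory=list)
    metadata: dict = field(default_factory=dict)


@dataclass
class Handler:
    name: str
    priority: int
    intent: str
    run: Callable[[str], Optional[HandlerResult]]  # hypothetical callable


def answer(query: str, intent: str, handlers: list, llm_pipeline: Callable[[str], str]) -> dict:
    # Phase 1: try handlers in ascending priority order (assumed ordering).
    for handler in sorted(handlers, key=lambda h: h.priority):
        if handler.intent != intent:
            continue
        result = handler.run(query)
        if result is not None and result.retrieved_functions:
            return {"source": handler.name, "llm_used": False, "report": result.data}
    # Phase 2: no handler matched, or the handler found 0 results.
    return {"source": "llm_pipeline", "llm_used": True, "report": llm_pipeline(query)}


# Toy handlers: one returns a result, the other finds nothing.
handlers = [
    Handler("SimilarFeaturesHandler", 20, "similar_features", lambda q: None),
    Handler("ExtensionPointHandler", 5, "extension_points",
            lambda q: HandlerResult({"category": "planner_hooks"}, ["planner_hook"])),
]
```

Calling `answer("Find planner hook injection points", "extension_points", handlers, some_llm)` returns the structured report without touching `some_llm`; an intent whose handler returns nothing falls through to the second phase.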
Understanding Existing Code¶
Finding Related Functions¶
> Find functions related to transaction handling
╭─────────────── Related Code ────────────────────────────────╮
│ │
│ Transaction Functions: │
│ │
│ Entry Points: │
│ StartTransaction() src/backend/access/transam/xact.c │
│ CommitTransaction() src/backend/access/transam/xact.c │
│ AbortTransaction() src/backend/access/transam/xact.c │
│ │
│ Support Functions: │
│ AssignTransactionId() src/backend/access/transam/xact.c │
│ GetCurrentTransactionId() src/backend/access/transam/xact.c │
│ │
│ Related Modules: │
│ src/backend/access/heap/ - Heap operations │
│ src/backend/storage/lmgr/ - Lock manager │
│ │
╰──────────────────────────────────────────────────────────────╯
Understanding Call Chains¶
> Show the call path from parser to executor
╭─────────────── Call Chain ──────────────────────────────────╮
│ │
│ Parser -> Executor Flow: │
│ │
│ 1. pg_parse_query() │
│ | │
│ 2. pg_analyze_and_rewrite() │
│ | │
│ 3. pg_plan_queries() │
│ | │
│ 4. PortalRun() │
│ | │
│ 5. ExecutorStart() -> ExecutorRun() -> ExecutorFinish() │
│ | │
│ 6. ExecProcNode() [dispatches to specific nodes] │
│ │
╰──────────────────────────────────────────────────────────────╯
CallGraphAnalyzer Integration¶
The main workflow uses CallGraphAnalyzer from src/analysis/ for deep integration point analysis:
- find_all_callers(method, max_depth): traverses the call graph upward to find all transitive callers
- find_all_callees(method, max_depth): traverses the call graph downward to find all transitive callees
- analyze_impact(method): computes impact_score, transitive_callers, transitive_callees, direct_callers
Methods with a high caller count (> min_callees threshold) are identified as popular integration points — good candidates for hooking new features. Impact analysis determines safety: methods with impact_score < 0.5 are marked as safe to extend.
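The upward traversal can be pictured as a depth-limited breadth-first search. The sketch below is not the real `CallGraphAnalyzer`: it assumes the call graph is available as a plain dictionary from callee to direct callers, and the standalone function signatures are illustrative. Only the `impact_score < 0.5` threshold is taken from the text; how the score itself is computed is not shown here.

```python
from collections import deque


def find_all_callers(callers_of: dict, method: str, max_depth: int) -> set:
    """Collect transitive callers of `method`, walking at most `max_depth` edges upward."""
    found = set()
    queue = deque([(method, 0)])
    while queue:
        current, depth = queue.popleft()
        if depth == max_depth:
            continue  # do not expand beyond the depth limit
        for caller in callers_of.get(current, ()):
            if caller != method and caller not in found:
                found.add(caller)
                queue.append((caller, depth + 1))
    return found


def is_safe_to_extend(impact_score: float) -> bool:
    # Threshold taken from the text above.
    return impact_score < 0.5


# Toy reverse call graph (callee -> direct callers), not real CPG output.
callers_of = {
    "ExecProcNode": {"ExecutorRun"},
    "ExecutorRun": {"PortalRun"},
}
```

With `max_depth=1` only the direct caller `ExecutorRun` is returned; a larger depth also reaches `PortalRun`. The downward traversal for callees would be the same search over a caller-to-callees mapping.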
Integration Points¶
Finding Extension Points¶
> Where can I add a new node type?
╭─────────────── Extension Points ────────────────────────────╮
│ │
│ Adding New Executor Node Type: │
│ │
│ 1. Define node structure in: │
│ src/include/nodes/execnodes.h │
│ │
│ 2. Implement node operations: │
│ src/backend/executor/nodeXXX.c │
│ - ExecInitXXX() │
│ - ExecXXX() │
│ - ExecEndXXX() │
│ │
│ 3. Register in dispatcher: │
│ src/backend/executor/execProcnode.c │
│ - ExecInitNode() │
│ - ExecProcNode() │
│ │
╰──────────────────────────────────────────────────────────────╯
The ExtensionPointHandler uses longest-match-first category detection across EN and RU keywords (e.g., “table access method” -> table_am, “custom scan” -> custom_scan). Categories include: aggregate, custom_scan, planner_hooks, executor_hooks, utility_hooks, table_am, index_am, fdw, join_algorithm, authentication.
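Longest-match-first detection matters when one keyword is a substring of another. The sketch below illustrates the idea; `detect_category` is a hypothetical function, and the keyword map contains only the two mappings quoted above plus one made-up entry ("access method") added to show why ordering by length is needed.

```python
# Keyword -> category. The first two entries come from the text above;
# "access method" -> index_am is a hypothetical entry for illustration only.
CATEGORY_KEYWORDS = {
    "table access method": "table_am",
    "custom scan": "custom_scan",
    "access method": "index_am",
}


def detect_category(query: str, keyword_map: dict):
    """Return the category of the longest keyword found in the query, or None."""
    text = query.lower()
    # Try longer keywords first so that a specific phrase beats a shorter one it contains.
    for keyword in sorted(keyword_map, key=len, reverse=True):
        if keyword in text:
            return keyword_map[keyword]
    return None
```

Without the length ordering, "How do I add a table access method?" could be claimed by the shorter "access method" keyword; with it, the query resolves to `table_am`.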
Hook Points and Entry Points¶
The IntegrationPointsHandler combines three data sources:
- Hook points: subsystem functions from domain plugin, filtered by target module
- Entry points: domain.get_entry_points() for the active domain
- Similar patterns: methods matching the feature name via CPG ILIKE queries
Integration complexity is calculated from the number of hook points: low / medium / high.
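As a rough illustration, the mapping from hook point count to a complexity label could be a simple threshold function. The cut-off values below (`low_max`, `medium_max`) are hypothetical; this page does not state the thresholds the handler actually uses.

```python
def integration_complexity(hook_point_count: int, low_max: int = 3, medium_max: int = 10) -> str:
    """Map a hook point count to low / medium / high."""
    # low_max and medium_max are hypothetical cut-offs, not the handler's real values.
    if hook_point_count <= low_max:
        return "low"
    if hook_point_count <= medium_max:
        return "medium"
    return "high"
```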
Betweenness Centrality Analysis¶
Phase 4A enhancement uses DependencyAnalyzer.identify_architectural_chokepoints() to find strategically important integration points:
- Methods with betweenness_percentile > betweenness_percentile_high threshold are flagged as strategic integration points
- These are central in the architecture and good for features that affect multiple subsystems
- Metadata includes high_centrality_points, top_centrality_method, max_centrality_percentile
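To make the idea concrete, the sketch below computes unnormalized betweenness centrality on a small directed call graph using Brandes' algorithm, then converts the scores to percentiles. It is a self-contained illustration, not the `DependencyAnalyzer` code: the graph is toy data, the percentile definition (share of methods with a score less than or equal to this one) is an assumption, and the threshold of 90 is a made-up stand-in for `betweenness_percentile_high`.

```python
from collections import deque


def betweenness(graph: dict) -> dict:
    """Brandes' algorithm for an unweighted directed graph (unnormalized scores)."""
    nodes = set(graph) | {w for targets in graph.values() for w in targets}
    score = {n: 0.0 for n in nodes}
    for source in nodes:
        order = []                        # nodes in BFS visiting order
        preds = {n: [] for n in nodes}    # predecessors on shortest paths
        paths = {n: 0 for n in nodes}     # number of shortest paths from source
        paths[source] = 1
        dist = {source: 0}
        queue = deque([source])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in graph.get(v, ()):
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    paths[w] += paths[v]
                    preds[w].append(v)
        # Accumulate dependencies from the farthest nodes back toward the source.
        delta = {n: 0.0 for n in nodes}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += paths[v] / paths[w] * (1.0 + delta[w])
            if w != source:
                score[w] += delta[w]
    return score


def percentile_ranks(score: dict) -> dict:
    # Assumed definition: percentage of methods whose score is <= this method's score.
    values = list(score.values())
    return {m: 100.0 * sum(v <= s for v in values) / len(values) for m, s in score.items()}


# Toy call graph (caller -> callees); "c" sits between every caller and callee.
graph = {"a": ["c"], "b": ["c"], "c": ["d", "e"]}
ranks = percentile_ranks(betweenness(graph))
strategic = sorted(m for m, p in ranks.items() if p > 90)  # 90 is a hypothetical threshold
```

In this toy graph only `c` lies on shortest paths between other methods, so it is the single method flagged as strategic.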
Similar Features and Patterns¶
Pattern Classifications¶
The SimilarFeaturesHandler classifies methods into 5 naming patterns:
| Pattern | Prefixes |
|---|---|
| Initialization | init_, setup_, create_, new_, alloc_ |
| Handler/callback | handle_, process_, on_, exec_, run_ |
| Query/lookup | get_, find_, lookup_, fetch_, search_ |
| Validation | validate_, check_, verify_, is_, has_ |
| Cleanup | free_, destroy_, cleanup_, close_, end_ |
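The prefix table above maps directly onto a lookup like the following sketch. The prefixes are copied from the table; the `classify_pattern` function and the case-insensitive comparison are assumptions, and the real handler may treat names that match no prefix differently.

```python
# Prefixes copied from the table above.
PATTERN_PREFIXES = {
    "Initialization": ("init_", "setup_", "create_", "new_", "alloc_"),
    "Handler/callback": ("handle_", "process_", "on_", "exec_", "run_"),
    "Query/lookup": ("get_", "find_", "lookup_", "fetch_", "search_"),
    "Validation": ("validate_", "check_", "verify_", "is_", "has_"),
    "Cleanup": ("free_", "destroy_", "cleanup_", "close_", "end_"),
}


def classify_pattern(method_name: str):
    """Return the naming pattern for a method, or None when no prefix matches."""
    name = method_name.lower()  # assumption: matching ignores case
    for pattern, prefixes in PATTERN_PREFIXES.items():
        if name.startswith(prefixes):  # str.startswith accepts a tuple of prefixes
            return pattern
    return None
```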
Placement Recommendation¶
SimilarFeaturesHandler._recommend_placement() recommends where to place new code:
- Extracts keywords from the feature name
- Scores each subsystem by keyword overlap (with target module boost)
- Finds the most common file among the best subsystem’s functions
- Identifies the nearest method (by name similarity) as an anchor
- Calculates confidence: overlap / (keywords + 2.0)
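The scoring and confidence steps above can be sketched as follows. Only the confidence formula is taken from this page. Everything else is an assumption made for illustration: the `recommend_subsystem` function, splitting the feature name on whitespace, deriving a subsystem vocabulary from the words in its function names, reading `keywords` as the number of extracted keywords, and the size of the target module boost. The file and anchor-method steps are omitted.

```python
def recommend_subsystem(feature_name: str, subsystems: dict, target_module=None, boost: float = 1.0):
    """Return (best_subsystem, confidence); subsystems maps a name to its function names."""
    keywords = set(feature_name.lower().split())
    best_name, best_overlap, best_score = None, 0, 0.0
    for name, functions in subsystems.items():
        # Words that appear in the subsystem's function names.
        vocabulary = {part for fn in functions for part in fn.lower().split("_")}
        overlap = len(keywords & vocabulary)
        # Hypothetical boost when the subsystem matches the requested module.
        bonus = boost if target_module and target_module in name else 0.0
        if overlap + bonus > best_score:
            best_name, best_overlap, best_score = name, overlap, overlap + bonus
    confidence = best_overlap / (len(keywords) + 2.0)  # formula from the list above
    return best_name, confidence


# Toy data, loosely based on the sample output in this section.
subsystems = {
    "cache manager": ["cache_lookup", "cache_insert", "cache_invalidate_all"],
    "lock manager": ["lock_acquire", "lock_release"],
}
```

For the feature name "cache invalidation" this toy data gives one overlapping keyword out of two, so the confidence is 1 / (2 + 2.0) = 0.25. The constant 2.0 in the denominator keeps the confidence below 1 even when every keyword overlaps.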
> Where should I place a new cache invalidation feature?
╭─────────────── Recommended Placement ─────────────────────╮
│ │
│ Subsystem: cache manager │
│ File: src/backend/utils/cache.c │
│ Near method: cache_lookup() (line 145) │
│ Confidence: 75% │
│ │
│ Related methods in this area: │
│ - cache_insert() │
│ - cache_remove() │
│ - cache_invalidate_all() │
│ │
╰────────────────────────────────────────────────────────────╯
Pattern Examples¶
> Show pattern examples for executor subsystem
╭─────────────── Pattern Examples ──────────────────────────╮
│ │
│ Examples from the target subsystem: │
│ │
│ Method | File | Pattern | CC │
│ init_executor() | execMain.c:42 | Initialization | 3 │
│ exec_scan() | execScan.c:88 | Handler | 5 │
│ get_plan_node() | execUtils.c:15| Query/lookup | 2 │
│ check_perms() | execPerms.c:7 | Validation | 4 │
│ end_executor() | execMain.c:200| Cleanup | 1 │
│ │
╰────────────────────────────────────────────────────────────╯
Examples are sorted by cyclomatic complexity (ascending) — simpler patterns serve as better templates.
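A minimal sketch of that ordering, using the rows from the sample output as toy data; the tuple layout is an assumption about how a handler might hold the rows, not its actual data structure.

```python
# (method, file, pattern, cyclomatic complexity), copied from the sample output above.
examples = [
    ("init_executor", "execMain.c:42", "Initialization", 3),
    ("exec_scan", "execScan.c:88", "Handler", 5),
    ("get_plan_node", "execUtils.c:15", "Query/lookup", 2),
    ("check_perms", "execPerms.c:7", "Validation", 4),
    ("end_executor", "execMain.c:200", "Cleanup", 1),
]

# Ascending cyclomatic complexity: the simplest candidate template comes first.
templates = sorted(examples, key=lambda row: row[3])
```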
Domain-Agnostic Architecture¶
S04 is fully domain-agnostic — it uses get_active_domain() to load the domain plugin, which provides:
- get_extension_points(category): extension point functions for a category
- get_extension_categories(): available categories with keywords
- get_subsystem_functions(): functions grouped by subsystem
- get_entry_points(): top-level entry point function names
When querying the CPG for matching functions, the workflow uses a 4-level fallback chain:
- Domain plugin: get functions from plugin, query CPG by exact name
- Category-based CPG search: derive English terms from category name, ILIKE search
- Keyword-based CPG search: extract keywords from query, ILIKE search
- Generic hooks fallback: search for *_hook, *Hook* patterns; then regex extraction of CamelCase/snake_case names from query
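The chain can be read as "try each level in order and stop at the first one that returns anything". The sketch below shows that control flow together with one possible regular expression for the last-resort name extraction. The level functions are hypothetical stubs (real ones would query the domain plugin and the CPG), and the regex is an assumption about what counts as a CamelCase or snake_case name.

```python
import re

# CamelCase (two or more capitalized parts) or snake_case (at least one underscore).
IDENTIFIER_RE = re.compile(r"\b(?:[A-Z][a-z0-9]+){2,}\b|\b[a-z0-9]+(?:_[a-z0-9]+)+\b")


def extract_identifiers(query: str) -> list:
    """Pull function-like names out of free text, in order of appearance."""
    return IDENTIFIER_RE.findall(query)


def run_fallback_chain(query: str, levels: list) -> tuple:
    """Try each (name, search) level in order; stop at the first non-empty result."""
    for name, search in levels:
        results = search(query)
        if results:
            return name, results
    return None, []


# Hypothetical stand-ins for the four levels; the first three find nothing here.
levels = [
    ("domain_plugin", lambda q: []),
    ("category_search", lambda q: []),
    ("keyword_search", lambda q: []),
    ("generic_hooks", extract_identifiers),
]
```

For "Show similar features to heap_insert", the first three stubs return nothing, so the result comes from the last level, which extracts `heap_insert` from the query text.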
CLI Usage¶
# Ask about integration points
python -m src.cli query "Where to add a new join algorithm?"
# Find similar features
python -m src.cli query "Show similar features to heap_insert"
# Find extension points
python -m src.cli query "Find planner hook injection points"
# Analyze dependencies
python -m src.cli query "What modules does the buffer manager depend on?"
Example Questions¶
- “How does [feature_name] work?”
- “What functions are involved in [subsystem]?”
- “Where should I place new [feature] in [module]?”
- “Show similar implementations to [existing_feature]”
- “What’s the data flow for [operation]?”
- “Find integration points for [component]”
- “Show pattern examples for [subsystem]”
- “Recommend placement for new [feature_type]”
- “Find extension points for [category]”
- “What are the hook points in [module]?”
Related Scenarios¶
- Onboarding - Codebase exploration basics
- Architecture - Architectural understanding
- Refactoring - Code improvement
S04 vs S05 vs S13: S04 focuses on finding integration and extension points for adding new features to existing code. S05 focuses on refactoring existing code (detecting smells, suggesting improvements). S13 focuses on mass refactoring — coordinating large-scale changes across many files.