CPG-based untested code detection, test prioritization, test generation recommendations, runtime coverage import, and hybrid analysis.
Table of Contents
- Quick Start
- How It Works
- Architecture
- Intent Detection
- Handler Registry
- Detection Modes
- Heuristic Detection
- Hybrid Detection
- Handlers
- UntestedCodeHandler
- TestGenerationHandler
- TestPriorityHandler
- Impact Analysis
- CPG-Based Recommendations
- Runtime Coverage Import
- Configuration
- CLI Usage
- REST API
- Use Cases
- Example Questions
- Related Scenarios
Quick Start
/select 07
How It Works
Architecture
The test coverage module (src/workflow/scenarios/coverage_handlers/, 13 files) consists of 6 components:

    User Query
        |
        v
    CoverageIntentDetector (5 intent types, bilingual)
        |
        v
    HandlerRegistry (priority-ordered dispatch)
        |
        +---> TestGenerationHandler (priority 5)
        |       generate test recommendations for specific functions
        |
        +---> UntestedCodeHandler (priority 10)
        |       find untested code via heuristic or hybrid detection
        |
        +---> TestPriorityHandler (priority 20)
        |       rank untested functions by criticality
        |
        v
    CoverageReportFormatter (bilingual markdown output)
        |
        v
    CallGraphAnalyzer ──> ImpactAnalysis (impact_score, callers, callees)

| Component | Module | Purpose |
|---|---|---|
| CoverageIntentDetector | intent_detector.py | Detect query intent among 5 types with morphological matching |
| TestGenerationHandler | handlers/test_generator.py | Generate test recommendations for named functions |
| UntestedCodeHandler | handlers/untested.py | Find untested code (heuristic + hybrid runtime detection) |
| TestPriorityHandler | handlers/priority.py | Rank untested functions by 4 scoring factors |
| CoverageReportFormatter | formatters/coverage_report.py | Format reports with FormatterLocalization (EN/RU) |
| TestCoverageScanner | handlers/coverage_scanner.py | Interface coverage scanning (dual scan: edges_call + file path heuristic) |
Intent Detection
CoverageIntentDetector classifies queries into 5 intent types using morphological keyword matching with Cyrillic word boundaries:
| Intent Type | Priority | EN Keywords | RU Keywords |
|---|---|---|---|
| test_generation | 5 | generate test, create test, write test, test suite, mutation test, stress test, property-based, chaos engineering | сгенерировать тест, создать тест, написать тест, модульный тест, стресс-тесты, мутационное тестирование |
| untested_code_scan | 10 | untested, no test, missing test, not covered, uncovered | непротестированный, без теста, нет тестов, непокрытый |
| test_priority | 20 | test priority, should test, critical test, high priority test | приоритет теста, протестировать, критический тест, важный тест |
| coverage_gap | 30 | coverage gap, low coverage, coverage report, test coverage | пробел покрытия, низкое покрытие, отчет покрытия, покрытие тестами |
| coverage_improvement | 40 | improve coverage, increase coverage, better coverage | улучшить покрытие, увеличить покрытие, повысить покрытие |
Additional extraction:
- _extract_criticality(query) — returns "critical", "high", "medium", or "all"
- _extract_scope(query) — returns "method", "class", "module", or "all"
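The keyword matching described above can be sketched roughly as follows. This is a minimal illustration and not the actual CoverageIntentDetector: `INTENTS` holds only a handful of the listed keywords, and `detect_intent` and `_keyword_pattern` are hypothetical helpers. It assumes that "morphological matching" can be approximated by allowing extra word characters after each keyword word, and that a Unicode-aware lookbehind is enough to act as a Cyrillic word boundary.

```python
import re
from typing import Optional

# Hypothetical subset of the keyword table: (intent, priority, keywords).
INTENTS = [
    ("test_generation", 5, ["generate test", "write test", "написать тест"]),
    ("untested_code_scan", 10, ["untested", "missing test", "без теста"]),
    ("test_priority", 20, ["test priority", "should test", "приоритет теста"]),
    ("coverage_gap", 30, ["coverage gap", "coverage report", "пробел покрытия"]),
    ("coverage_improvement", 40, ["improve coverage", "улучшить покрытие"]),
]


def _keyword_pattern(keyword: str) -> re.Pattern:
    # (?<!\w) is a Unicode-aware left word boundary, so it also works for Cyrillic.
    # \w* after each word tolerates inflected endings ("tests", "тесты").
    words = [re.escape(word) + r"\w*" for word in keyword.split()]
    return re.compile(r"(?<!\w)" + r"\s+".join(words), re.IGNORECASE)


def detect_intent(query: str) -> Optional[str]:
    """Return the matching intent with the lowest priority value, if any."""
    for intent, _priority, keywords in sorted(INTENTS, key=lambda item: item[1]):
        if any(_keyword_pattern(keyword).search(query) for keyword in keywords):
            return intent
    return None


if __name__ == "__main__":
    print(detect_intent("Find untested functions"))
    print(detect_intent("Написать тесты для парсера"))
```

Checking intents in ascending priority order means that a query matching several keyword groups resolves to the intent with the lowest priority value, which mirrors the ordering in the table.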
Handler Registry
Handlers are registered via HandlerRegistry("coverage") and dispatched in priority order. A lower priority value means higher precedence:

    @coverage_registry.register(priority=5)
    class TestGenerationHandlerRegistered(TestGenerationHandler): ...

    @coverage_registry.register(priority=10)
    class UntestedCodeHandlerRegistered(UntestedCodeHandler): ...

    @coverage_registry.register(priority=20)
    class TestPriorityHandlerRegistered(TestPriorityHandler): ...

Each handler implements can_handle(query_info) -> bool and handle(query_info) -> HandlerResult. The registry tries handlers in priority order and uses the first one that matches.
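To make the dispatch rule concrete, here is a small self-contained sketch of a priority-ordered registry. It is not the project's HandlerRegistry: the class body, the `dispatch` method, and the two demo handlers are assumptions made for illustration, and only the `register(priority=...)` decorator shape and the `can_handle` / `handle` pair come from the text above.

```python
from dataclasses import dataclass, field


@dataclass
class HandlerRegistry:
    """Hypothetical minimal registry storing (priority, handler class) pairs."""
    name: str
    _entries: list = field(default_factory=list)

    def register(self, priority: int):
        def decorator(cls):
            self._entries.append((priority, cls))
            # Keep entries sorted so the lowest priority value is tried first.
            self._entries.sort(key=lambda entry: entry[0])
            return cls
        return decorator

    def dispatch(self, query_info: dict):
        # Try handlers in ascending priority; the first match wins.
        for _priority, cls in self._entries:
            handler = cls()
            if handler.can_handle(query_info):
                return handler.handle(query_info)
        return None


registry = HandlerRegistry("demo")


@registry.register(priority=99)
class FallbackHandler:
    def can_handle(self, query_info):
        return True  # matches everything, but is tried last

    def handle(self, query_info):
        return "fallback"


@registry.register(priority=10)
class UntestedHandler:
    def can_handle(self, query_info):
        return query_info.get("type") == "untested_code_scan"

    def handle(self, query_info):
        return "untested"


if __name__ == "__main__":
    print(registry.dispatch({"type": "untested_code_scan"}))  # untested
    print(registry.dispatch({"type": "something_else"}))      # fallback
```

Although the fallback handler is registered first and matches every query, the handler with priority 10 still wins for its own intent, because ordering follows the priority value and not the registration order.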
Detection Modes
Heuristic Detection
Default mode when no runtime coverage data is available. Methods without test_* callers in edges_call are flagged as untested:

    -- Methods with no test callers
    SELECT m.id, m.name, m.full_name
    FROM nodes_method m
    WHERE NOT EXISTS (
        SELECT 1 FROM edges_call ec
        JOIN nodes_method caller ON ec.source_id = caller.id
        WHERE ec.target_id = m.id
          AND caller.name LIKE 'test_%'
    )

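The query above can be tried on toy data. The sketch below uses Python's built-in sqlite3 module purely as a stand-in for the project's database, and the three sample methods are invented for the demonstration; only the table and column names come from the query itself.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE nodes_method (id INTEGER PRIMARY KEY, name TEXT, full_name TEXT);
    CREATE TABLE edges_call (source_id INTEGER, target_id INTEGER);
    INSERT INTO nodes_method VALUES
        (1, 'parse', 'app.parse'),
        (2, 'render', 'app.render'),
        (3, 'test_parse', 'tests.test_parse');
    INSERT INTO edges_call VALUES (3, 1);  -- test_parse calls parse
""")

UNTESTED_SQL = """
    SELECT m.id, m.name, m.full_name
    FROM nodes_method m
    WHERE NOT EXISTS (
        SELECT 1 FROM edges_call ec
        JOIN nodes_method caller ON ec.source_id = caller.id
        WHERE ec.target_id = m.id
          AND caller.name LIKE 'test_%'
    )
    ORDER BY m.id
"""

untested = [row[1] for row in conn.execute(UNTESTED_SQL)]
print(untested)  # ['render', 'test_parse']
```

Two details are visible in this toy run. The query as written also returns the test function itself, since nothing calls `test_parse`, so a real handler presumably filters test methods out elsewhere. In addition, `_` is a single-character wildcard in SQL `LIKE`, so `'test_%'` matches any name that starts with `test` followed by at least one character, not only names containing a literal underscore.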
Hybrid Detection
Automatically activated when the coverage_percent column exists in nodes_method (populated via coverage import):
- Methods with coverage_percent < 1.0 → flagged via runtime data
- Methods with NULL coverage_percent → fall back to heuristic test-caller analysis
- Each candidate is tagged with detection_method: "runtime" or "heuristic"
- The coverage estimate uses AVG(coverage_percent) from runtime data
The _has_coverage_data() guard ensures zero behavior change when no data is imported.
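The per-method decision in hybrid mode can be sketched as below. `classify_candidate` and `estimate_coverage` are hypothetical helpers, not functions from the codebase; they assume that a NULL `coverage_percent` is represented as `None` and that the heuristic result is already available as a boolean.

```python
from typing import Optional


def classify_candidate(coverage_percent: Optional[float],
                       has_test_caller: bool) -> Optional[str]:
    """Return the detection_method for an untested candidate, or None if tested."""
    if coverage_percent is not None:
        # Runtime data is available: apply the documented coverage_percent < 1.0 rule.
        return "runtime" if coverage_percent < 1.0 else None
    # No runtime data for this method: fall back to the test-caller heuristic.
    return None if has_test_caller else "heuristic"


def estimate_coverage(percents: list) -> Optional[float]:
    """Average the non-NULL coverage_percent values, as SQL AVG() would."""
    known = [p for p in percents if p is not None]
    return sum(known) / len(known) if known else None


if __name__ == "__main__":
    print(classify_candidate(0.0, True))        # runtime
    print(classify_candidate(None, False))      # heuristic
    print(estimate_coverage([0.0, 50.0, None])) # 25.0
```

Runtime data takes precedence whenever it exists, so a method with test callers but no executed lines is still reported, while the heuristic only applies to methods the coverage report says nothing about.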
Handlers
UntestedCodeHandler
Handles untested_code_scan intent. Key methods:
| Method | Description |
|---|---|
| can_handle(query_info) | Returns True when type == "untested_code_scan" |
| handle(query_info) | Finds untested code, classifies by criticality, generates recommendations |
| _find_untested_functions() | Heuristic + optional runtime coverage detection |
| _classify_by_criticality(candidates) | Groups by risk level (critical/high/medium/low) |
| _estimate_coverage(candidates) | Calculates coverage percentage |
| _has_coverage_data() | Checks for coverage_percent column |

The handler enriches the top 20 candidates with CPG-based test recommendations (branch coverage, parameter boundaries, error paths).
TestGenerationHandler
Handles test_generation intent. Key methods:
| Method | Description |
|---|---|
| can_handle(query_info) | Returns True when type == "test_generation" and function names extractable |
| handle(query_info) | Generates test recommendations for specific functions |
| _extract_function_names(query) | Extracts function names via priority patterns |
| _search_functions_by_keywords(query) | Concept-based function search (e.g., “tests for query execution”) |
| _get_function_info(func_name) | Case-insensitive function lookup |
| _get_function_callees(func_name) | Dependencies to mock |
| _get_function_callers(func_name) | Test scenarios from callers |
| _suggest_test_approach(...) | Generates unit/integration/edge-case strategy |
TestPriorityHandler
Handles test_priority intent. Ranks untested functions using 4 scoring factors:
| Factor | Score | Condition |
|---|---|---|
| Module criticality | +3.0 | api, interface, core modules |
| Module criticality | +2.0 | main, engine, system modules |
| Complexity | +1.5 | Signature length > 200 chars |
| Complexity | +1.0 | Signature length > 100 chars |
| Public API | +1.0 | No underscore prefix in name |
| Caller count | +2.0 | Above high contention threshold |
| Caller count | +1.0 | Above medium caller threshold |
Score is converted to priority level via _score_to_rating(): high (≥ threshold), medium, low.
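As a rough illustration of how the factor table adds up, the sketch below sums the listed bonuses. It is not the handler's code: `priority_score` is a hypothetical function, the caller thresholds (`high_callers`, `medium_callers`) are placeholder values because the document does not state them, module matching by substring is an assumption, and any base score is left out.

```python
CRITICAL_MODULES = ("api", "interface", "core")    # +3.0
IMPORTANT_MODULES = ("main", "engine", "system")   # +2.0


def priority_score(name: str, module: str, signature: str, caller_count: int,
                   high_callers: int = 10, medium_callers: int = 3) -> float:
    """Sum the bonuses from the factor table; thresholds are assumed values."""
    score = 0.0
    # Module criticality: only the higher tier applies (assumption).
    if any(part in module for part in CRITICAL_MODULES):
        score += 3.0
    elif any(part in module for part in IMPORTANT_MODULES):
        score += 2.0
    # Complexity, approximated by signature length.
    if len(signature) > 200:
        score += 1.5
    elif len(signature) > 100:
        score += 1.0
    # Public API: no leading underscore in the name.
    if not name.startswith("_"):
        score += 1.0
    # Caller count: only the higher tier applies (assumption).
    if caller_count > high_callers:
        score += 2.0
    elif caller_count > medium_callers:
        score += 1.0
    return score


if __name__ == "__main__":
    print(priority_score("run", "core.exec", "def run()", caller_count=12))  # 6.0
    print(priority_score("_helper", "utils", "x" * 150, caller_count=0))     # 1.0
```

Within each factor the sketch treats the tiers as mutually exclusive, so a function with a 250-character signature gets +1.5 and not +2.5. Whether the real handler stacks tiers is not stated in the table.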
Impact Analysis
The workflow uses CallGraphAnalyzer (Graph Method #2) from src/analysis/callgraph/analyzer.py for impact analysis on untested methods.
ImpactAnalysis dataclass:
| Field | Type | Description |
|---|---|---|
| method_name | str | Analyzed method |
| direct_callers | list[str] | Methods calling this directly |
| transitive_callers | list[str] | All transitive callers |
| direct_callees | list[str] | Methods called directly |
| transitive_callees | list[str] | All transitive callees |
| impact_score | float | 0.0–1.0 impact score |
Three graph insight categories are tracked in state["metadata"]:
| Insight | Condition |
|---|---|
| high_impact_untested | impact_score > thresholds.high_impact |
| untested_entry_points | Many callers and few callees |
| critical_untested | callers > min_callers && impact_score > impact_score_medium |
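The dataclass and the three insight conditions might look roughly like this in code. Everything beyond the field names is an assumption: `graph_insights` is a hypothetical helper, the threshold defaults are placeholders because the configuration section lists no defaults for them, "callers" is taken to mean direct callers, and "few callees" is modeled with an invented `max_entry_callees` parameter.

```python
from dataclasses import dataclass, field


@dataclass
class ImpactAnalysis:
    method_name: str
    direct_callers: list = field(default_factory=list)
    transitive_callers: list = field(default_factory=list)
    direct_callees: list = field(default_factory=list)
    transitive_callees: list = field(default_factory=list)
    impact_score: float = 0.0  # expected range 0.0 to 1.0


def graph_insights(ia: ImpactAnalysis, high_impact: float = 0.7,
                   min_callers: int = 5, impact_score_medium: float = 0.4,
                   max_entry_callees: int = 2) -> list:
    """Return the insight categories an untested method falls into."""
    callers = len(ia.direct_callers)
    callees = len(ia.direct_callees)
    insights = []
    if ia.impact_score > high_impact:
        insights.append("high_impact_untested")
    # "Many callers and few callees", with assumed cut-offs.
    if callers > min_callers and callees <= max_entry_callees:
        insights.append("untested_entry_points")
    if callers > min_callers and ia.impact_score > impact_score_medium:
        insights.append("critical_untested")
    return insights


if __name__ == "__main__":
    hot = ImpactAnalysis("f", direct_callers=[f"c{i}" for i in range(6)],
                         direct_callees=["g"], impact_score=0.8)
    print(graph_insights(hot))
    print(graph_insights(ImpactAnalysis("quiet", impact_score=0.1)))
```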
CPG-Based Recommendations
For each untested method (top 20 by criticality), the handler generates specific recommendations:
Branch Coverage Analysis
Counts control structures (IF, FOR, WHILE, SWITCH) from nodes_control_structure and estimates test cases needed.
Parameter Boundary Analysis
Maps parameter types from nodes_param to boundary test suggestions:
| Type | Boundary Tests |
|---|---|
| int, long, size_t, float | zero, negative, max, min |
| char*, string, str | empty, null, very long, special chars |
| Pointer types (*, ptr) | null pointer |
| bool | true, false |
| Variadic parameters | zero args, one arg, many args |
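The mapping in the table could be expressed as a simple lookup. `boundary_tests` below is a hypothetical function written for illustration; it assumes that parameter types arrive as plain strings, that a variadic parameter is represented as `...`, and that string types are checked before generic pointers so `char*` is not reduced to a null-pointer test.

```python
def boundary_tests(param_type: str) -> list:
    """Map a parameter type string to boundary test suggestions."""
    t = param_type.strip().lower()
    if t == "...":  # assumed representation of a variadic parameter
        return ["zero args", "one arg", "many args"]
    # String types first, because "char*" would also match the pointer rule.
    if t in ("char*", "char *", "string", "str"):
        return ["empty", "null", "very long", "special chars"]
    if "*" in t or "ptr" in t:
        return ["null pointer"]
    if t == "bool":
        return ["true", "false"]
    if t in ("int", "long", "size_t", "float"):
        return ["zero", "negative", "max", "min"]
    return []  # no suggestion for unknown types


if __name__ == "__main__":
    for type_name in ("int", "char*", "Node *", "bool", "...", "struct Foo"):
        print(type_name, "->", boundary_tests(type_name))
```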
Error Path Analysis
Counts TRY blocks in nodes_control_structure and RETURN statements in nodes_return. Multiple return statements suggest error handling paths.
Runtime Coverage Import
Import coverage data from external tools to enable hybrid detection.
Supported Formats
| Format | Tool | File Type |
|---|---|---|
| pytest-cov | pytest-cov (--cov-report=json) | JSON |
| lcov | gcov / lcov / geninfo | Text (.info, .lcov) |
| cobertura | Cobertura, JaCoCo, coverage.py XML | XML |
How It Works
- The parser reads the coverage report and extracts per-file line-level hit data
- The importer adds a coverage_percent column to nodes_method (if absent)
- Each method is matched to coverage data by suffix-matching the filename and intersecting the method’s line range with covered lines
- coverage_percent is computed as covered_lines_in_range / total_lines_in_range * 100
Path normalization: Coverage reports often contain absolute paths while the CPG stores relative paths. The importer normalizes paths (strips ./, converts backslashes) and falls back to suffix matching. Use --source-root to strip a common prefix.
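The matching and percentage steps can be sketched as follows. These helpers are illustrative and not the importer's code: `normalize_path`, `find_report_file`, and `coverage_percent` are hypothetical names, and the sketch assumes that "total lines in range" means the lines the coverage report tracks inside the method's range.

```python
def normalize_path(path: str, source_root: str = "") -> str:
    """Convert backslashes, strip an optional root prefix and a leading './'."""
    path = path.replace("\\", "/")
    root = source_root.replace("\\", "/").rstrip("/")
    if root and path.startswith(root + "/"):
        path = path[len(root) + 1:]
    if path.startswith("./"):
        path = path[2:]
    return path


def find_report_file(cpg_file: str, report_files: list, source_root: str = ""):
    """Exact match after normalization, else suffix match on whole components."""
    target = normalize_path(cpg_file)
    for original in report_files:
        candidate = normalize_path(original, source_root)
        if candidate == target or candidate.endswith("/" + target):
            return original
    return None


def coverage_percent(line_start: int, line_end: int,
                     tracked: set, covered: set):
    """covered_lines_in_range / total_lines_in_range * 100 for one method."""
    in_range = {n for n in tracked if line_start <= n <= line_end}
    if not in_range:
        return None  # nothing measurable; leave coverage_percent NULL
    return len(in_range & covered) / len(in_range) * 100


if __name__ == "__main__":
    files = ["/home/ci/project/src/other.py", "/home/ci/project/src/app.py"]
    print(find_report_file("src/app.py", files))
    print(coverage_percent(10, 13, tracked={10, 11, 12, 13, 20}, covered={10, 11, 20}))  # 50.0
```

Matching on `"/" + target` instead of a bare string suffix keeps `src/app.py` from matching a file such as `src/myapp.py`.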
Configuration
Coverage-related parameters from get_unified_config():
Thresholds
| Parameter | Default | Description |
|---|---|---|
| coverage_high | 0.75 | High coverage threshold |
| coverage_low | 0.25 | Low coverage threshold |
| test_coverage_minimum | 50 | Minimum required coverage % |
| test_coverage_good | 80 | Good coverage % |
| high_impact | — | Min score for high-impact methods |
| min_callers | — | Min callers for critical classification |
| impact_score_medium | — | Medium impact threshold |
Scoring Weights
| Parameter | Default | Description |
|---|---|---|
| coverage_base_score | 5.0 | Base priority score |
| coverage_critical_module | 3.0 | Bonus for critical modules |
| coverage_important_module | 2.0 | Bonus for important modules |
| coverage_long_signature | 1.5 | Bonus for complex signatures (>200 chars) |
| coverage_medium_signature | 1.0 | Bonus for medium signatures (>100 chars) |
Handler Limits
| Parameter | Default | Description |
|---|---|---|
| display_items | 15 | Items in report output |
| summary_items | 5 | Summary items |
| query_medium | 30 | Medium query limit (impact analysis) |
| cpg_results | 50 | CPG result limit |
| retrieved_functions | 25 | Functions for benchmark evaluation |
| priority_functions | 10 | Prioritized list size |
CLI Usage

    # Import pytest-cov JSON report
    python -m src.cli.import_commands coverage import --file coverage.json --format pytest-cov --db data/projects/postgres.duckdb

    # Import lcov trace file
    python -m src.cli.import_commands coverage import --file lcov.info --format lcov

    # Import Cobertura XML (e.g., from Java/C# tooling)
    python -m src.cli.import_commands coverage import --file coverage.xml --format cobertura --source-root /project

    # View imported coverage data
    python -m src.cli.import_commands coverage show
    python -m src.cli.import_commands coverage show --uncovered-only

REST API
Test coverage queries are handled via the scenario router:
| Method | Endpoint | Description |
|---|---|---|
| POST | /api/v1/scenarios/test_coverage/query | Execute test coverage analysis |
Request model (ScenarioQueryRequest):
| Field | Type | Description |
|---|---|---|
| query | str | Analysis query (1–10000 chars) |
| session_id | str? | Session identifier |
| language | str | "en" or "ru" (default: "en") |
Response model (ScenarioQueryResponse):
| Field | Type | Description |
|---|---|---|
| answer | str | Formatted analysis result |
| scenario_id | str | "test_coverage" |
| confidence | float | Intent confidence |
| evidence | list[dict] | Supporting evidence |
| processing_time_ms | float | Processing time |
Example:

    curl -X POST http://localhost:8000/api/v1/scenarios/test_coverage/query \
      -H "Content-Type: application/json" \
      -d '{"query": "Find untested functions", "language": "en"}'

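The same request can be built from Python using only the standard library. This is a sketch under the assumption that the server from the curl example is reachable at `http://localhost:8000`; `build_coverage_request` is a hypothetical helper, and the length check simply mirrors the 1–10000 character limit from the request model.

```python
import json
import urllib.request


def build_coverage_request(query: str, language: str = "en",
                           base_url: str = "http://localhost:8000"):
    """Build (but do not send) the POST request shown in the curl example."""
    if not 1 <= len(query) <= 10000:
        raise ValueError("query must be 1-10000 characters")
    body = json.dumps({"query": query, "language": language}).encode("utf-8")
    return urllib.request.Request(
        url=base_url + "/api/v1/scenarios/test_coverage/query",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    request = build_coverage_request("Find untested functions")
    print(request.get_method(), request.full_url)
    # Sending requires a running server; uncomment to try it:
    # with urllib.request.urlopen(request) as response:
    #     print(json.load(response)["answer"])
```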
Use Cases
Finding Untested Code
> What functions lack test coverage?
## Coverage Gaps
**Detection mode:** Heuristic
**Total untested:** 234 functions
**Coverage estimate:** 78%
### Critical (executor)
- ExecParallelHashJoinNewBatch()
- ExecReScanGather()
### High priority (storage)
- heap_lock_updated_tuple()
- heap_abort_speculative()
### Recommendations for `heap_lock_updated_tuple`
1. Add 5 test cases for branch coverage (IF: 3, SWITCH: 1)
2. Test boundary values for parameter `flags` (zero, negative, max)
3. Test error handling: 2 try/catch blocks
Test Prioritization
> Which critical functions need tests first?
## Test Priority Ranking
| Function | Score | Priority | Reason |
|----------|-------|----------|--------|
| heap_lock_updated_tuple | 8.5 | high | core module, 23 callers, public API |
| ExecParallelHashJoinNewBatch | 7.0 | high | engine module, complex signature |
| AtEOXact_RelationCache | 5.0 | medium | system module, 4 callers |
Generating Test Cases
> Generate test cases for heap_insert
## Test Recommendations: heap_insert()
**File:** src/backend/access/heap/heapam.c:2156
**Strategy:**
- Unit Tests: Test heap_insert in isolation by mocking 5 dependencies
- Integration Tests: Test through 23 callers to verify real-world usage
- Edge Cases: Test boundary conditions, null inputs, error handling
**Dependencies to Mock:**
- RelationGetBufferForTuple()
- heap_prepare_insert()
- XLogInsert()
**Test Scenarios from Callers:**
- simple_heap_insert() uses heap_insert for...
- toast_save_datum() uses heap_insert for...
Hybrid Detection (with runtime data)
> Find untested code
## Coverage Gaps (Hybrid)
**Detection mode:** Runtime + Heuristic
**Coverage estimate:** 62.3% (from runtime data)
| Method | Detection | Coverage | Reason |
|--------|-----------|----------|--------|
| parse_query() | Runtime | 0.0% | No lines covered |
| exec_plan() | Runtime | 12.5% | Partial coverage |
| helper_func() | Heuristic | --- | No test callers |
Example Questions
Untested code detection:
- “What functions lack test coverage?”
- “Find untested code”
- “Show functions without tests”
- “Which code is not covered by tests?”

Test prioritization:
- “Which critical functions need tests first?”
- “What should I test first?”
- “Test priority ranking”

Test generation:
- “Generate test cases for heap_insert”
- “Create tests for palloc function”
- “What edge cases should I test in ExecInitNode?”
- “Write mutation tests for query parser”

Coverage overview:
- “Show coverage gaps”
- “Coverage report”
- “How to improve test coverage?”
Related Scenarios
- Security Audit (S02) — Security testing
- Refactoring (S05) — Find unused code and refactor
- Debugging (S15) — Debug code issues