Scenario 21: Structural Pattern Search¶
A developer or security engineer searches for code that matches structural patterns, refined with CPG-aware constraints.
Quick Start¶
/select pattern_search
Or via CLI:
# Ad-hoc pattern search (single quotes keep the shell from expanding $x)
python -m src.cli patterns search 'malloc($x)' --lang c
# Scan with all rules
python -m src.cli patterns scan
# Scan specific rule
python -m src.cli patterns scan --rule unchecked-return
Overview¶
Structural Pattern Search combines tree-sitter concrete syntax tree (CST) parsing with code property graph (CPG) constraints (data flow, call graph, types, domain annotations) to find code matching complex patterns. Unlike regex-based grep, it matches on code structure rather than raw text and can match across AST boundaries.
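To make the contrast with text search concrete, here is a minimal sketch that uses Python's built-in `ast` module as a stand-in for tree-sitter. It is not how this tool is implemented; the sample source and the choice of `eval` as the target are assumptions made for illustration.

```python
import ast
import re

# Hypothetical sample: "eval(" appears in a string, in a comment, and in one real call.
SOURCE = '''
def load(path):
    note = "never call eval(data) here"  # mentions eval( too
    return eval(open(path).read())
'''

# Text search: every occurrence of "eval(" counts, wherever it appears.
text_hits = [m.start() for m in re.finditer(r"eval\(", SOURCE)]

# Structural search: walk the syntax tree and keep only real call nodes.
tree = ast.parse(SOURCE)
call_lines = [
    node.lineno
    for node in ast.walk(tree)
    if isinstance(node, ast.Call)
    and isinstance(node.func, ast.Name)
    and node.func.id == "eval"
]

print(len(text_hits), call_lines)
```

The regex reports three hits, two of them false positives, while the tree walk reports only the call on line 4 of the sample.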
Pattern Types¶
Syntactic Patterns¶
Tree-sitter CST patterns with metavariables:
| Metavariable | Matches |
|---|---|
| `$VAR` | Any single expression or identifier |
| `$$ARGS` | Zero or more arguments |
| `$_` | Any node (wildcard) |
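The three metavariable kinds can be illustrated with a toy matcher over nested lists. This is only a sketch of the semantics in the table, assuming a simplified tree where a node is either a string token or a list of child nodes; the real matcher works on tree-sitter nodes and its internals are not described here.

```python
def match(pattern, node, env):
    """Return the extended bindings if node matches pattern, else None."""
    if isinstance(pattern, list):
        if not isinstance(node, list):
            return None
        return match_seq(pattern, node, env)
    if pattern == "$_":
        return env                                  # wildcard: matches, binds nothing
    if pattern.startswith("$"):
        if pattern in env:                          # repeated metavariable must agree
            return env if env[pattern] == node else None
        return {**env, pattern: node}
    return env if pattern == node else None         # literal token

def match_seq(patterns, nodes, env):
    """Match sibling patterns against sibling nodes, left to right."""
    if not patterns:
        return env if not nodes else None
    head, rest = patterns[0], patterns[1:]
    if isinstance(head, str) and head.startswith("$$"):
        # Variadic metavariable: try absorbing 0, 1, 2, ... siblings.
        for k in range(len(nodes) + 1):
            out = match_seq(rest, nodes[k:], {**env, head: nodes[:k]})
            if out is not None:
                return out
        return None
    if not nodes:
        return None
    env2 = match(head, nodes[0], env)
    return None if env2 is None else match_seq(rest, nodes[1:], env2)

call_pattern = ["call", "$FUNC", "$$ARGS"]
print(match(call_pattern, ["call", "memcpy", "dst", "src", "n"], {}))
print(match(call_pattern, ["call", "rand"], {}))
print(match(["assign", "$_", ["call", "malloc", "$X"]],
            ["assign", "p", ["call", "malloc", "n"]], {}))
```

A failed match returns `None`, and a successful match with no bindings returns an empty dict, so callers should compare against `None` rather than test truthiness.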
# Find malloc calls
python -m src.cli patterns search 'malloc($x)' --lang c
# Find if-return without else
python -m src.cli patterns search 'if ($cond) { return $val; }' --lang c
CPG-Constrained Patterns¶
Patterns with data flow, call graph, type, and domain constraints:
id: unchecked-return
pattern: "$ret = $func($$args)"
language: c
constraints:
  - type: data_flow
    from: "$ret"
    not_reaches: "if ($ret"
  - type: call_graph
    callee: "$func"
    returns: "int"
message: "Return value of $func is not checked"
severity: warning
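One plausible reading of the `data_flow` constraint above is a reachability check: the rule fires only when the value bound to `$ret` never flows into an `if` condition. The sketch below assumes that reading and uses a hypothetical adjacency-list graph with statement text as node labels; the engine's actual evaluation is not documented here.

```python
from collections import deque

def reaches(flow, start, is_target):
    """Breadth-first search along data-flow edges, starting at `start`."""
    seen, queue = {start}, deque([start])
    while queue:
        node = queue.popleft()
        if node != start and is_target(node):
            return True
        for nxt in flow.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False

def is_if_condition(node):
    # Stand-in for matching the `not_reaches: "if ($ret"` fragment.
    return node.startswith("if (")

# Hypothetical data-flow edges: assignment -> statements that use the value.
checked = {"rc = read(fd, buf, n)": ["if (rc < 0)"]}
unchecked = {"rc = write(fd, buf, n)": ["log(rc)"]}

# The constraint holds (and the rule reports a finding) when no `if` is reachable.
for flow in (checked, unchecked):
    start = next(iter(flow))
    print(start, "-> finding:", not reaches(flow, start, is_if_condition))
```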
YAML Rules¶
Pre-defined rules in configs/rules/ — 190 rules across 14 languages.
# List all available rules
python -m src.cli patterns list
# Show rule statistics
python -m src.cli patterns stats
Example Queries¶
Find unchecked return values
Find malloc without free
Show functions matching error-handling anti-patterns
Find SQL query construction without parameterization
Find all functions with cyclomatic complexity > 20
Usage¶
CLI¶
# Search with pattern
python -m src.cli patterns search 'malloc($x)' --lang c --max-results 50
# Scan all rules
python -m src.cli patterns scan --db data/projects/postgres.duckdb
# Scan specific severity
python -m src.cli patterns scan --severity error
# Generate rule from natural language
python -m src.cli patterns generate "find unchecked return values" --lang c --output rule.yaml
# Validate rule
gocpg validate-rule --file rule.yaml
# Autofix (dry run)
python -m src.cli patterns fix --dry-run
# Autofix (apply)
python -m src.cli patterns fix --rule unchecked-return
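When the search command is launched from a script, passing the arguments as a list sidesteps shell quoting, because metavariables such as `$x` never pass through a shell. A minimal sketch, assuming it runs from the project root where `src.cli` is importable; `build_search_argv` is a hypothetical helper name, and only the `--lang` and `--max-results` flags shown above are used.

```python
import sys

def build_search_argv(pattern, lang, max_results=None):
    """Build the argv list for `patterns search` without going through a shell."""
    argv = [sys.executable, "-m", "src.cli", "patterns", "search",
            pattern, "--lang", lang]
    if max_results is not None:
        argv += ["--max-results", str(max_results)]
    return argv

argv = build_search_argv("malloc($x)", "c", max_results=50)
print(argv[5:])

# To execute it:
#   import subprocess
#   subprocess.run(argv, check=True)
```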
Programmatic¶
from src.workflow import MultiScenarioCopilot
copilot = MultiScenarioCopilot()
result = copilot.run("Find unchecked return values", scenario="pattern_search")
for finding in result.get('findings', []):
    print(f"{finding['rule_id']}: {finding['file']}:{finding['line']}")
    print(f" {finding['message']}")
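Once findings are available, they are often easier to triage when grouped. The sketch below assumes only the finding keys used in the loop above (`rule_id`, `file`, `line`, `message`); the sample data and the `summarize` helper are hypothetical.

```python
from collections import Counter, defaultdict

def summarize(findings):
    """Count findings per rule and collect sorted line numbers per file."""
    per_rule = Counter(f["rule_id"] for f in findings)
    per_file = defaultdict(list)
    for f in findings:
        per_file[f["file"]].append(f["line"])
    return per_rule, {path: sorted(lines) for path, lines in per_file.items()}

# Hypothetical findings shaped like the ones printed above.
sample = [
    {"rule_id": "unchecked-return", "file": "src/io.c", "line": 88,
     "message": "Return value of write is not checked"},
    {"rule_id": "unchecked-return", "file": "src/io.c", "line": 12,
     "message": "Return value of read is not checked"},
    {"rule_id": "unchecked-return", "file": "src/net.c", "line": 40,
     "message": "Return value of send is not checked"},
]

per_rule, per_file = summarize(sample)
print(dict(per_rule))
print(per_file)
```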
API¶
# Search patterns
POST /api/v1/patterns/search
{
  "pattern": "malloc($x)",
  "language": "c",
  "max_results": 50
}
# Get findings for a rule
POST /api/v1/patterns/findings
{
  "rule_id": "unchecked-return"
}
# Generate rule from description
POST /api/v1/patterns/generate
{
  "description": "find unchecked return values",
  "language": "c"
}
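A client for the search endpoint can be sketched with the standard library alone. The base URL below is an assumption (the host and port are not given here), as is the absence of authentication headers; the path and body fields are the ones listed above. The request is built but not sent.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"  # assumption: adjust to your deployment

def build_search_request(pattern, language, max_results=50):
    """Build a POST request for /api/v1/patterns/search with a JSON body."""
    body = json.dumps({
        "pattern": pattern,
        "language": language,
        "max_results": max_results,
    }).encode("utf-8")
    return urllib.request.Request(
        BASE_URL + "/api/v1/patterns/search",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = build_search_request("malloc($x)", "c")
print(req.get_method(), req.full_url)

# To send it and decode the JSON response:
#   with urllib.request.urlopen(req) as resp:
#       result = json.load(resp)
```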
MCP¶
Available as MCP tools: codegraph_pattern_search, codegraph_pattern_findings, codegraph_pattern_stats, codegraph_pattern_fix, codegraph_pattern_generate, codegraph_pattern_test.
Related¶
- Scenarios Overview — All 21 scenarios
- Security Audit — Vulnerability detection (uses patterns internally)
- Composite Workflows — Orchestration guide