Symbolic Execution (Z3 SMT Solver)

Symbolic Execution (Z3 SMT Solver)

Lightweight symbolic execution engine for path feasibility checking in taint analysis. Uses the Z3 SMT solver to filter out infeasible taint paths, reducing false positive rate by 5-7%.

Overview

When taint analysis finds a path from source to sink, the path may traverse branches with contradictory conditions (e.g., if (x > 10) followed by if (x < 5) on the same variable). Symbolic execution checks whether such paths are actually feasible by encoding branch conditions as Z3 constraints and calling the solver.

Key characteristics: - Optional component — gracefully degrades when Z3 is not installed - Adds ~20% time overhead, ~25% memory overhead - Maximum 20 constraints per path (configurable) - Supports integer comparisons, NULL checks, simple arithmetic

Installation

pip install z3-solver

Z3 is already included in requirements.txt. If Z3 is unavailable at runtime, the system logs a warning and returns conservative estimates (all paths assumed feasible).

Architecture

TaintPropagator (enable_symbolic_execution=True)
    
    ├─ find_taint_paths()    raw taint paths
    
    └─ PathConstraintTracker.filter_infeasible_paths()
        
        ├─ _collect_constraints(start, end)
           └─ Walk CFG edges, extract branch conditions from CDG
        
        ├─ _parse_constraint_to_z3(constraint, z3_vars)
           └─ Convert "x > 100"  z3.Int('x') > 100
        
        └─ z3.Solver().check()
            ├─ sat      path feasible (confidence 0.9)
            ├─ unsat    path infeasible, FILTER OUT (confidence 0.95)
            └─ unknown  conservative, keep path (confidence 0.4)

API Reference

PathConstraint

@dataclass
class PathConstraint:
    node_id: int       # CPG node ID of the branch
    condition: str     # Condition expression (e.g., "x > 100")
    location: str      # File:line location
    negated: bool      # True if this is the else-branch

PathConstraintTracker

class PathConstraintTracker:
    def __init__(self, cpg_service, max_constraints: int = 20):
        """
        Args:
            cpg_service: CPGQueryService instance for graph queries
            max_constraints: Maximum constraints per path (default: 20).
                Paths with more constraints are assumed feasible to avoid solver timeout.
        """

    def is_path_feasible(self, start_node: int, end_node: int) -> Tuple[bool, float]:
        """
        Check if a path between two nodes is feasible.

        Returns:
            (is_feasible, confidence) — confidence ranges from 0.0 to 1.0
        """

    def filter_infeasible_paths(
        self, paths: List[Any], min_confidence: float = 0.8
    ) -> List[Any]:
        """
        Filter a list of taint paths, removing infeasible ones.

        Args:
            paths: List of taint path objects (must have .source_id and .sink_id)
            min_confidence: Minimum confidence to consider a result reliable (default: 0.8)

        Returns:
            Filtered list with only feasible paths
        """

Usage

Direct API

from src.analysis.dataflow.symbolic_execution import PathConstraintTracker

tracker = PathConstraintTracker(cpg_service, max_constraints=20)

# Check a single path
is_feasible, confidence = tracker.is_path_feasible(node_123, node_456)
if not is_feasible and confidence > 0.9:
    print("Path is infeasible — likely false positive")

# Filter a batch of taint paths
feasible_paths = tracker.filter_infeasible_paths(all_paths, min_confidence=0.8)

Integration with Taint Analysis

from src.analysis.dataflow.taint import TaintPropagator

propagator = TaintPropagator(
    cpg_service,
    enable_symbolic_execution=True  # Enables Z3 path filtering
)

# Symbolic execution is applied automatically after taint propagation
paths = propagator.find_taint_paths(
    sources=["recv", "getenv"],
    sinks=["strcpy", "sprintf"]
)
# paths already filtered — infeasible paths removed

Supported Constraint Types

Constraint Example Z3 Encoding
Integer comparison x > 100 z3.Int('x') > 100
Equality x == 0 z3.Int('x') == 0
Inequality x != 5 z3.Int('x') != 5
NULL check ptr == NULL z3.Int('ptr') == 0
nullptr check ptr != nullptr z3.Int('ptr') != 0
Simple arithmetic x + y > z z3.Int('x') + z3.Int('y') > z3.Int('z')

Not supported: function calls in conditions, complex expressions, floating-point, string operations.

Confidence Levels

Solver Result Feasible? Confidence Action
sat Yes 0.9 Keep path
unsat No 0.95 Remove path (false positive)
unknown Assumed yes 0.4 Keep path (conservative)
Z3 unavailable Assumed yes 0.5 Keep path (degraded mode)
Too many constraints Assumed yes 0.5 Keep path (skip analysis)

Performance Impact

Metric Without Z3 With Z3 Delta
Analysis time Baseline +20% Solver overhead
Memory Baseline +25% Z3 internal structures
False positive rate ~12% ~5-8% Infeasible paths removed

Limitations

  1. Lightweight only — not a full symbolic executor; only tracks branch conditions
  2. Integer-only — no floating-point or string constraint support
  3. Max 20 constraints — longer paths assumed feasible to avoid timeout
  4. No interprocedural — constraints collected only within a single function’s CFG
  5. Pattern-based parsing — condition expressions parsed via regex, not full AST

Module: src/analysis/dataflow/symbolic_execution.py Last updated: February 2026