Scenario 17: File Editing

Surgical, AST-based code modifications with diff preview, backup/undo, and multi-interface access (CLI, REST API, MCP).

Quick Start

/select 17

How It Works

Architecture

The editing module (src/editing/) consists of 5 components connected in a pipeline:

User Request
    |
    v
ASTParser (tree-sitter + regex fallback)
    |
    v
TargetFinder (CPG + AST merge, 7 search methods)
    |
    v
CodeModifier (6 edit modes, backup/undo stack)
    |
    v
DiffGenerator (4 output formats)
    |
    v
Preview / Apply

SSRAdapter ──> CodeModifier
  (GoCPG RewriteResult → EditOperation bridge)

| Component | Module | Purpose |
|---|---|---|
| ASTParser | ast_parser.py | tree-sitter parsing with regex fallback for 14 file extensions |
| TargetFinder | target_finder.py | Find functions/classes by name, signature, location, or CPG query |
| CodeModifier | code_modifier.py | Apply edits preserving formatting, manage backup/undo (stack up to 50) |
| DiffGenerator | diff_generator.py | Generate diffs in unified, side-by-side, inline, and HTML formats |
| SSRAdapter | ssr_adapter.py | Convert GoCPG RewriteResult diffs into EditOperation objects |

Intent Detection

The workflow (src/workflow/scenarios/file_editing.py) detects editing intent via is_file_editing_query() using 27 bilingual keywords:

| Language | Keywords |
|---|---|
| English (16) | edit, modify, change, update, replace, refactor, rename, insert, delete, remove, add code, change code, update function, modify class, edit method |
| Russian (11) | редактировать, изменить, обновить, заменить, рефакторинг, переименовать, вставить, удалить, добавить код, изменить код, обновить функцию |

The file_editing_workflow() function is registered in LangGraph via graph_builder.py and routed through intent_classifier.py → router.py. Optional enrichment via EnrichmentAdapter adds vector context from docstrings and code comments.
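
A minimal sketch of keyword-based intent detection, assuming a case-insensitive substring check. The matching strategy of the real is_file_editing_query() is not described here, so the body below is hypothetical, and the keyword tuple is only a subset of the documented list.

```python
# Hypothetical sketch: the real is_file_editing_query() may match differently.
EDIT_KEYWORDS = (
    # English (subset of the documented keywords)
    "edit", "modify", "replace", "refactor", "rename", "insert", "delete",
    # Russian (subset of the documented keywords)
    "редактировать", "изменить", "заменить", "переименовать", "удалить",
)


def is_file_editing_query(query: str) -> bool:
    """Return True if the query contains any editing keyword (case-insensitive)."""
    lowered = query.lower()
    return any(keyword in lowered for keyword in EDIT_KEYWORDS)
```

Plain substring matching is crude: a query such as "What functions call heap_insert?" also matches, because heap_insert contains "insert". A production classifier would likely need word boundaries or additional context to avoid that.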

AST Parsing

ASTParser supports two parsing modes:

  1. tree-sitter (primary) — precise AST via tree-sitter-languages package
  2. Regex fallback — pattern-based parsing when tree-sitter is unavailable

Supported file extensions (14):

| Language | Extensions |
|---|---|
| Python | .py |
| C | .c, .h |
| C++ | .cpp, .hpp, .cc, .cxx |
| JavaScript | .js, .jsx |
| TypeScript | .ts, .tsx |
| Go | .go |
| Java | .java |
| C# | .cs |
| PHP | .php |
| Rust | .rs |
| Kotlin | .kt |

Regex fallback patterns are available for Python, C, C++, JavaScript, and Go — covering function, class, method, struct, interface, and arrow function definitions.
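
To illustrate the fallback mode, here is a small sketch of regex-based target discovery for Python sources. The patterns and the find_python_defs() helper are assumptions made for illustration; the actual patterns in ast_parser.py are not shown in this document.

```python
import re

# Assumed patterns; the module's real regexes may differ.
PY_DEF = re.compile(
    r"^[ \t]*(?:async[ \t]+)?def[ \t]+(?P<name>\w+)[ \t]*\(", re.MULTILINE
)
PY_CLASS = re.compile(r"^[ \t]*class[ \t]+(?P<name>\w+)", re.MULTILINE)


def find_python_defs(source: str) -> list[tuple[str, str, int]]:
    """Return (kind, name, 1-based line) for each def/class matched by regex."""
    results = []
    for kind, pattern in (("function", PY_DEF), ("class", PY_CLASS)):
        for match in pattern.finditer(source):
            # Count newlines before the match to get its line number.
            line = source.count("\n", 0, match.start()) + 1
            results.append((kind, match.group("name"), line))
    return sorted(results, key=lambda item: item[2])
```

A regex pass like this finds where a definition starts but not where it ends, which is one reason tree-sitter is the primary mode.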

TargetFinder provides 7 search methods combining CPG database queries with AST parsing:

| Method | Source | Purpose |
|---|---|---|
| find_by_name() | AST + CPG | Find targets by name or pattern, with exact match option |
| find_by_signature() | AST | Match function signatures against regex pattern |
| find_by_location() | AST | Find the smallest target containing a specific line number |
| find_by_cpg_query() | CPG | Execute custom SQL against the CPG database |
| find_callers() | CPG | Find all functions calling the specified function |
| find_callees() | CPG | Find all functions called by the specified function |
| find_references() | CPG | Find all references to a name (variables, functions, types) |

When both CPG and AST results are available, _merge_targets() combines them — preferring AST for precise location info and merging CPG metadata.
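
The merge rule can be sketched as follows. Target is a simplified stand-in for CodeTarget, and matching on (file_path, name) is an assumption; the real _merge_targets() may key on something else.

```python
from dataclasses import dataclass, field


@dataclass
class Target:
    """Simplified stand-in for CodeTarget (hypothetical)."""
    file_path: str
    name: str
    start_line: int
    end_line: int
    metadata: dict = field(default_factory=dict)


def merge_targets(ast_targets: list, cpg_targets: list) -> list:
    """Prefer AST locations; carry CPG metadata over to matching AST targets."""
    merged = {(t.file_path, t.name): t for t in cpg_targets}
    for ast in ast_targets:
        key = (ast.file_path, ast.name)
        cpg = merged.get(key)
        if cpg is not None:
            # AST line numbers win; AST metadata overrides duplicate CPG keys.
            ast = Target(ast.file_path, ast.name, ast.start_line, ast.end_line,
                         {**cpg.metadata, **ast.metadata})
        merged[key] = ast
    return list(merged.values())
```

Targets that appear in only one source pass through unchanged, so a CPG-only result is still returned with whatever location the CPG reported.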

Edit Modes & Target Types

Edit Modes (6)

| Mode | Enum | Description | Use Case |
|---|---|---|---|
| Replace | REPLACE | Replace entire target | Rewrite function logic |
| Insert Before | INSERT_BEFORE | Insert code before target | Add imports, declarations |
| Insert After | INSERT_AFTER | Insert code after target | Add new methods |
| Delete | DELETE | Remove target | Remove deprecated code |
| Wrap | WRAP | Wrap target with prefix/suffix (format: prefix\|\|\|suffix) | Add error handling, logging |
| Rename | RENAME | Rename identifier with word-boundary regex | Refactor naming across file |
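
As an illustration of the WRAP format, the sketch below splits new_code on a literal ||| separator and places the original target between the two halves. The wrap_code() helper is hypothetical, and it assumes the target is inserted verbatim without re-indentation.

```python
def wrap_code(original: str, new_code: str, separator: str = "|||") -> str:
    """Surround `original` with the prefix and suffix encoded in `new_code`."""
    prefix, found, suffix = new_code.partition(separator)
    if not found:
        raise ValueError(f"WRAP new_code must contain the separator {separator!r}")
    return f"{prefix}{original}{suffix}"


# Guard a C statement with a preprocessor conditional.
wrapped = wrap_code("log_state();\n", "#ifdef DEBUG\n|||#endif\n")
```

For indentation-sensitive languages such as Python, the wrapped body would also need to be indented, which this sketch does not attempt.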

Target Types (7)

| Type | Enum | Description |
|---|---|---|
| Function | FUNCTION | Standalone functions |
| Class | CLASS | Class, struct, interface definitions |
| Method | METHOD | Class methods |
| Variable | VARIABLE | Variable declarations and references |
| Import | IMPORT | Import statements |
| Block | BLOCK | Arbitrary code blocks |
| Line Range | LINE_RANGE | Explicit line range (used by SSR adapter) |

Data Models

CodeTarget (13 fields)

| Field | Type | Description |
|---|---|---|
| file_path | str | Path to the source file |
| target_type | TargetType | One of 7 target types |
| name | str | Target name |
| start_line | int | Start line number |
| end_line | int | End line number |
| start_column | int | Start column (default 0) |
| end_column | int | End column (default 0) |
| signature | str? | Function/class signature |
| docstring | str? | Associated docstring |
| parent_name | str? | For methods: enclosing class name |
| ast_node_type | str? | tree-sitter node type |
| source_code | str? | Full source code text |
| metadata | dict | Additional metadata (e.g., {"source": "cpg"}) |

EditOperation (8 fields)

| Field | Type | Description |
|---|---|---|
| target | CodeTarget | Target to edit |
| mode | EditMode | One of 6 edit modes |
| new_code | str | Replacement code |
| description | str? | Human-readable description |
| preserve_indentation | bool | Maintain original indentation (default: True) |
| preserve_comments | bool | Keep leading/trailing comments (default: True) |
| auto_format | bool | Auto-format result (default: True) |
| metadata | dict | Additional metadata |

EditResult (10 fields)

| Field | Type | Description |
|---|---|---|
| success | bool | Whether edit succeeded |
| file_path | str | Edited file path |
| operation | EditOperation | The applied operation |
| original_content | str | Content before edit |
| new_content | str | Content after edit |
| diff | str | Generated diff |
| backup_path | str? | Path to backup file |
| applied_at | datetime? | Timestamp (None for dry-run) |
| error_message | str? | Error details if failed |
| warnings | list[str] | Non-fatal warnings |
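
The CodeTarget and EditOperation tables above translate naturally into dataclasses. The sketch below follows the documented field names and defaults; the enum string values and the treatment of str? as Optional[str] with a None default are assumptions, not the module's actual definitions.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Optional


class TargetType(Enum):  # string values are assumed
    FUNCTION = "function"
    CLASS = "class"
    METHOD = "method"
    VARIABLE = "variable"
    IMPORT = "import"
    BLOCK = "block"
    LINE_RANGE = "line_range"


class EditMode(Enum):  # string values are assumed
    REPLACE = "replace"
    INSERT_BEFORE = "insert_before"
    INSERT_AFTER = "insert_after"
    DELETE = "delete"
    WRAP = "wrap"
    RENAME = "rename"


@dataclass
class CodeTarget:
    file_path: str
    target_type: TargetType
    name: str
    start_line: int
    end_line: int
    start_column: int = 0
    end_column: int = 0
    signature: Optional[str] = None
    docstring: Optional[str] = None
    parent_name: Optional[str] = None
    ast_node_type: Optional[str] = None
    source_code: Optional[str] = None
    metadata: dict = field(default_factory=dict)


@dataclass
class EditOperation:
    target: CodeTarget
    mode: EditMode
    new_code: str
    description: Optional[str] = None
    preserve_indentation: bool = True
    preserve_comments: bool = True
    auto_format: bool = True
    metadata: dict = field(default_factory=dict)


# Build a REPLACE operation for the validate_input example used in this document.
target = CodeTarget("src/utils/validation.c", TargetType.FUNCTION,
                    "validate_input", start_line=45, end_line=67)
op = EditOperation(target, EditMode.REPLACE,
                   new_code="bool validate_input(...) { ... }")
```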

Diff Formats

DiffGenerator supports 4 output formats via DiffFormat enum:

| Format | Enum | Description |
|---|---|---|
| Unified | UNIFIED | Standard unified diff with +/- markers (default) |
| Side-by-side | SIDE_BY_SIDE | Two-column comparison |
| Inline | INLINE | Dual line numbers with change markers |
| HTML | HTML | Full HTML page via difflib.HtmlDiff |

Additional methods:

- summarize_changes(): human-readable summary (e.g., “+4 lines added, -2 lines removed (+2 net)”)
- get_colored_diff(): ANSI-colored terminal output (red/green/cyan)
- get_changed_lines(): returns tuple of (added, removed, changed) line numbers
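
A sketch of the unified format and the change summary using only the standard difflib module. The function names mirror the ones above for readability, but the bodies are an assumed reimplementation, not DiffGenerator's code.

```python
import difflib


def make_unified_diff(original: str, modified: str, path: str, context: int = 3) -> str:
    """Produce a unified diff between two versions of one file."""
    lines = difflib.unified_diff(
        original.splitlines(keepends=True),
        modified.splitlines(keepends=True),
        fromfile=f"a/{path}",
        tofile=f"b/{path}",
        n=context,
    )
    return "".join(lines)


def summarize_changes(diff_text: str) -> str:
    """Count added and removed lines, skipping the ---/+++ file headers."""
    added = removed = 0
    for line in diff_text.splitlines():
        if line.startswith("+") and not line.startswith("+++"):
            added += 1
        elif line.startswith("-") and not line.startswith("---"):
            removed += 1
    return f"+{added} lines added, -{removed} lines removed ({added - removed:+d} net)"
```

The header check is a simplification: an added line whose own content begins with "++" would be skipped as if it were a file header. Counting only inside hunks would avoid that.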

SSR Integration

The SSRAdapter (ssr_adapter.py) bridges GoCPG structural search-and-replace results with the editing module:

GoCPG scan --fix --dry-run
    |
    v
RewriteResult diffs (unified diff + metadata)
    |
    v
rewrite_results_to_edit_operations()  ──> List[EditOperation]
    |
    v
CodeModifier.apply_multiple()  ──> List[EditResult]

Two adapter functions:

- rewrite_results_to_edit_operations(diffs): parse unified diff hunks or explicit line_start/fix_preview into EditOperation objects with EditMode.REPLACE
- edit_operations_from_findings(findings): convenience wrapper that filters findings with fixes from GoCPGScanResult
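
To make the hunk-parsing step concrete, the sketch below turns each hunk of a single-file unified diff into a line-range replacement. LineRangeEdit and hunks_to_edits() are hypothetical stand-ins for EditOperation and the adapter; the RewriteResult structure itself is not modelled, and pure-insertion hunks (an original count of 0) are not handled.

```python
import re
from dataclasses import dataclass

HUNK_HEADER = re.compile(r"^@@ -(\d+)(?:,(\d+))? \+(\d+)(?:,(\d+))? @@")


@dataclass
class LineRangeEdit:
    """Simplified stand-in for an EditOperation on a LINE_RANGE target."""
    start_line: int
    end_line: int
    new_code: str


def hunks_to_edits(diff_text: str) -> list:
    """Map each hunk to a REPLACE of the original line range it covers."""
    edits = []
    for line in diff_text.splitlines():
        header = HUNK_HEADER.match(line)
        if header:
            start = int(header.group(1))
            count = int(header.group(2) or 1)  # an omitted count means 1
            edits.append(LineRangeEdit(start, start + count - 1, ""))
        elif not edits or line.startswith(("-", "\\")):
            continue  # file headers, removed lines, "\ No newline" markers
        elif line.startswith(("+", " ")):
            # Context and added lines together form the replacement text.
            edits[-1].new_code += line[1:] + "\n"
    return edits


SAMPLE_DIFF = """\
--- a/validation.c
+++ b/validation.c
@@ -45,3 +45,4 @@
 bool validate_input(const char *input, size_t len)
 {
+    /* Bounds checking */
     if (input == NULL) {
"""
edits = hunks_to_edits(SAMPLE_DIFF)
```

Here the single hunk becomes a replacement of original lines 45 to 47 with four new lines.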

Use Cases

Finding Code Targets

> Find function heap_insert in src/backend/access/heap/heapam.c

## Found Code Target

**Name:** heap_insert
**Type:** function
**File:** src/backend/access/heap/heapam.c
**Lines:** 2156-2298

  void heap_insert(Relation relation,
                   HeapTuple tup,
                   CommandId cid,
                   int options,
                   BulkInsertState bistate)
  { ... }

Preview Changes with Diff

> Edit function validate_input to add bounds checking

## Edit Preview

**Target:** validate_input (function)
**File:** src/utils/validation.c:45-67

### Changes:
  @@ -45,6 +45,10 @@
   bool validate_input(const char *input, size_t len)
   {
  +    /* Bounds checking */
  +    if (len > MAX_INPUT_SIZE) {
  +        return false;
  +    }
       if (input == NULL) {
           return false;
       }

Use `/edit apply` to apply the changes.

Apply Edits with Backup

CodeModifier creates backups automatically (format: name.YYYYMMDD_HHMMSS.hash.bak) and maintains an undo stack (up to 50 entries). Operations on the same file are applied in descending line order, so an edit never shifts the line numbers of edits that are still pending.
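
The ordering rule is easy to demonstrate in isolation. In this sketch, apply_line_edits() is a hypothetical helper that takes (start_line, end_line, new_code) tuples with 1-based inclusive, non-overlapping ranges; it stands in for the multi-edit path of CodeModifier and omits backups entirely.

```python
def apply_line_edits(content: str, edits: list) -> str:
    """Replace line ranges bottom-up so earlier ranges keep their numbering."""
    lines = content.splitlines(keepends=True)
    for start, end, new_code in sorted(edits, key=lambda e: e[0], reverse=True):
        lines[start - 1:end] = new_code.splitlines(keepends=True)
    return "".join(lines)


result = apply_line_edits(
    "a\nb\nc\nd\n",
    [(1, 1, "A1\nA2\n"), (3, 4, "CD\n")],
)
```

Applied top-down, the first edit would grow the file by one line, and the second edit's range 3 to 4 would then cover "b" and "c" instead of "c" and "d".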

Rename Operation

The RENAME mode uses word-boundary regex (\bOldName\b) to replace all occurrences in a file without affecting partial matches.
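
A minimal sketch of word-boundary renaming with Python's re module. The rename_identifier() helper is illustrative and not the CodeModifier API.

```python
import re


def rename_identifier(source: str, old: str, new: str) -> tuple:
    """Replace whole-word occurrences of `old`; return (new_source, count)."""
    pattern = re.compile(rf"\b{re.escape(old)}\b")
    # A function replacement keeps backslashes in `new` from being read as escapes.
    return pattern.subn(lambda _match: new, source)


renamed, count = rename_identifier(
    "TxHandler h; MyTxHandler m; TxHandler2 t; new TxHandler();",
    "TxHandler",
    "TransactionHandler",
)
```

MyTxHandler and TxHandler2 are left alone because the characters adjacent to the match are word characters. A regex has no notion of syntax, though, so occurrences inside string literals and comments are renamed as well.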

Configuration

EditingConfig dataclass (9 parameters):

editing:
  ast_parser: "tree-sitter"         # Primary parser
  fallback_parser: "regex"           # Fallback when tree-sitter unavailable
  preserve_formatting: true          # Keep original formatting
  preserve_comments: true            # Keep comments around targets
  diff_context_lines: 5              # Context lines in diff output
  backup_before_edit: true           # Create backups before edits
  backup_dir: "./backups/edits"      # Backup directory
  max_file_size_mb: 10               # Maximum file size to edit
  supported_extensions:              # File types to process
    - .py
    - .c
    - .h
    - .cpp
    - .js
    - .ts
    - .go
    - .rb

CLI Usage

# Find targets by name
python -m src.cli.import_commands edit find "heap_insert" --file src/heap.c --exact

# Find all functions matching pattern in directory
python -m src.cli.import_commands edit find "validate_*" --file src/ --type function --limit 20

# Preview an edit (target format: file.py::name or file.py:line)
python -m src.cli.import_commands edit preview src/utils/validation.c::validate_input \
    --new-code "bool validate_input(...) { ... }" \
    --format unified --context 5

# Preview using new code from file
python -m src.cli.import_commands edit preview src/file.c::func_name \
    --new-code-file patch.txt

# Apply an edit
python -m src.cli.import_commands edit apply --backup

# Undo last edit (restores from backup)
python -m src.cli.import_commands edit undo --steps 1

# Show edit history
python -m src.cli.import_commands edit history --limit 10 --file src/utils/

REST API

5 endpoints in src/api/routers/editing.py:

| Method | Endpoint | Description |
|---|---|---|
| POST | /api/v1/edit/find-target | Find code targets by name/pattern |
| POST | /api/v1/edit/preview | Generate diff preview (dry-run) |
| POST | /api/v1/edit/apply | Apply edit with optional backup |
| POST | /api/v1/edit/undo | Undo the most recent edit |
| GET | /api/v1/edit/history | Get edit history (limit, default 10) |

Example:

# Find targets
curl -X POST http://localhost:8000/api/v1/edit/find-target \
  -H "Content-Type: application/json" \
  -d '{"file_path": "src/utils/validation.c", "name_pattern": "validate_input", "target_type": "function"}'

# Preview edit
curl -X POST http://localhost:8000/api/v1/edit/preview \
  -H "Content-Type: application/json" \
  -d '{"file_path": "src/file.c", "target_name": "func", "new_code": "...", "diff_format": "unified"}'
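
The same preview call can be made from Python with the standard library. The request fields are copied from the curl example; the helper names are hypothetical, and the response is assumed to be JSON since its schema is not documented here.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000/api/v1/edit"  # host taken from the curl examples


def build_preview_request(file_path: str, target_name: str, new_code: str,
                          diff_format: str = "unified") -> urllib.request.Request:
    """Build the POST request for /preview without sending it."""
    payload = {
        "file_path": file_path,
        "target_name": target_name,
        "new_code": new_code,
        "diff_format": diff_format,
    }
    return urllib.request.Request(
        f"{BASE_URL}/preview",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def preview_edit(file_path: str, target_name: str, new_code: str):
    """Send the preview request and return the decoded JSON body (assumed JSON)."""
    request = build_preview_request(file_path, target_name, new_code)
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Separating request construction from sending keeps the payload easy to inspect or test without a running server.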

MCP Tool

codegraph_edit_preview(file_path, target_name, target_type="function")

Locates the target in the CPG and returns its type, location, and code preview — useful for safe refactoring from an AI assistant. Uses TargetFinder with CPG service for semantic understanding.

Example Questions

Finding targets:

- “Find function heap_insert in src/heap.c”
- “Find all classes in src/backend/executor/”
- “What functions call heap_insert?”

Editing:

- “Edit function validate_input to add null check”
- “Insert logging before function process_request”
- “Delete unused function old_helper”
- “Wrap function with error handling”
- “Rename class TransactionHandler to TxHandler”

Preview & History:

- “Show me the current edit preview”
- “Show edit history”
- “Undo the last edit”