Code Extraction Reference

Complete reference documentation for Probe's code extraction capabilities, including AST parsing, language-specific extraction, and advanced usage techniques.

EXTRACT COMMAND

bash

probe extract <FILES> [OPTIONS]

CORE PARAMETERS

Parameter	Description
`<FILES>`	Required: Files to extract from (e.g., `main.rs:42` or `main.rs#function_name`)

KEY OPTIONS

Option	Description	Default
`-c, --context <N>`	Add N context lines	0
`--diff`	Process input as git diff format	Off
`-f, --format <TYPE>`	Output as: `markdown`, `plain`, `json`, `xml`, `color`	`color`
`-k, --keep-input`	Preserve and display original input content	Off
`--prompt <TEMPLATE>`	System prompt template for LLMs (`engineer`, `architect`, or path to file)	None
`--instructions <TEXT>`	User instructions for LLMs	None
`--to-clipboard`	Copy results to clipboard	Off
`--from-clipboard`	Read file paths from clipboard	Off

For complete option details, see probe extract --help.

FILE PATH SYNTAX

Probe supports several ways to specify what code to extract:

Entire file: file.rs
Specific line: file.rs:42
Line range: file.rs:10-20
Symbol name: file.rs#handle_extract
Multiple files: file1.rs:10 file2.go:15
Glob patterns: src/*.rs:42
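
The forms above can be told apart with a small parser. The following is a minimal Python sketch, not Probe's actual implementation: `parse_target` and the `Target` tuple are hypothetical names, and glob expansion is left out.

```python
import re
from typing import NamedTuple, Optional


class Target(NamedTuple):
    """One extraction target (hypothetical structure, for illustration only)."""
    path: str
    start: Optional[int] = None   # first line, 1-based
    end: Optional[int] = None     # last line, inclusive
    symbol: Optional[str] = None  # symbol name after '#'


# A path, then ':line' or ':start-end'.
_LINE_SPEC = re.compile(r"^(?P<path>.+?):(?P<start>\d+)(?:-(?P<end>\d+))?$")


def parse_target(spec: str) -> Target:
    """Split a spec such as 'file.rs:10-20' or 'file.rs#name' into its parts."""
    if "#" in spec:
        path, symbol = spec.split("#", 1)
        return Target(path=path, symbol=symbol)
    match = _LINE_SPEC.match(spec)
    if match:
        start = int(match["start"])
        end = int(match["end"]) if match["end"] else start
        return Target(path=match["path"], start=start, end=end)
    return Target(path=spec)  # no suffix: the entire file


print(parse_target("file.rs:10-20"))
```

A single line is represented here as a range whose start and end are equal, which keeps later code from needing a separate case for it.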

AST PARSING

Probe uses Abstract Syntax Tree (AST) parsing to understand code structure rather than treating source files as plain text.

EXTRACTION PROCESS

When you specify a line number, Probe:

Parses the file using tree-sitter to generate an AST
Locates the node in the AST containing the specified line
Navigates to find the smallest complete code unit (function, class, etc.)
Extracts the entire code block with proper formatting

This approach ensures you get complete, syntactically valid code blocks rather than arbitrary line ranges.

File Content → Tree-sitter Parser → AST Generation → Node Location → Parent Node Identification → Code Block Extraction
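
To make the node-location step concrete, here is a rough Python sketch of the same idea. It is not Probe's code: it swaps tree-sitter for Python's built-in `ast` module, so it only handles Python source, and `enclosing_block` is a hypothetical helper name.

```python
import ast
from typing import Optional

# Node types treated as "complete code units" in this sketch.
UNIT_TYPES = (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)


def enclosing_block(source: str, line: int) -> Optional[str]:
    """Return the source of the smallest function or class containing `line`."""
    tree = ast.parse(source)
    best = None
    for node in ast.walk(tree):
        if not isinstance(node, UNIT_TYPES):
            continue
        if node.lineno <= line <= node.end_lineno:
            # Prefer the narrowest span, i.e. the innermost unit.
            span = node.end_lineno - node.lineno
            if best is None or span < (best.end_lineno - best.lineno):
                best = node
    if best is None:
        return None  # the line is not inside any function or class
    lines = source.splitlines()
    return "\n".join(lines[best.lineno - 1 : best.end_lineno])


SAMPLE = '''\
class Greeter:
    def hello(self, name):
        message = "Hello, " + name
        return message

    def bye(self):
        return "Bye"
'''

print(enclosing_block(SAMPLE, 3))  # prints the whole `hello` method
```

Asking for line 1 instead returns the whole class, because the class is the only unit that contains that line.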

BENEFITS

Structural Understanding: Recognizes code structures, not just text
Complete Units: Extracts entire functions, classes, or blocks
Language Awareness: Applies language-specific extraction rules
Context Preservation: Maintains the full context of code elements
Documentation Inclusion: Captures associated comments and documentation

FALLBACK MECHANISMS

If AST parsing fails or isn't available for a particular language:

Context-Based Fallback: Extracts the specified line with configurable context
Line Range Extraction: For explicit line ranges, extracts exactly those lines
Symbol Text Search: For symbol references, falls back to text-based search
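
As an illustration of the context-based fallback, the sketch below returns a target line plus N lines on either side, clamped to the file boundaries. `context_window` is a hypothetical helper, and whether Probe clamps in exactly this way is an assumption.

```python
from typing import List, Tuple


def context_window(lines: List[str], line: int, context: int = 0) -> Tuple[int, int, List[str]]:
    """Return (start, end, text) for `line` with `context` lines around it.

    Line numbers are 1-based and inclusive; the window is clamped to the file.
    """
    if not 1 <= line <= len(lines):
        raise ValueError(f"line {line} is outside 1..{len(lines)}")
    start = max(1, line - context)
    end = min(len(lines), line + context)
    return start, end, lines[start - 1 : end]


text = ["a", "b", "c", "d", "e"]
print(context_window(text, 2, context=3))  # (1, 5, ['a', 'b', 'c', 'd', 'e'])
```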

LANGUAGE-SPECIFIC EXTRACTION

Probe provides specialized extraction for each supported language:

RUST

Function definitions with attributes and documentation
Struct and enum definitions
Implementation blocks
Macro definitions

bash

# Extract a Rust function
probe extract src/main.rs#handle_request

# Extract the block containing line 42 (for example, an impl block)
probe extract src/models.rs:42

JAVASCRIPT / TYPESCRIPT

Function and arrow function definitions
Class definitions with methods
JSX/TSX components
TypeScript interfaces and type definitions

bash

# Extract a JavaScript class
probe extract src/components/Button.js#Button

# Extract the block containing line 15 (for example, a React component)
probe extract src/App.tsx:15

PYTHON

Function definitions with docstrings
Class definitions with methods
Decorated functions and classes
Indentation-aware extraction

bash

# Extract a Python class
probe extract src/models.py#UserModel

# Extract the block containing line 42 (for example, a decorated function)
probe extract src/views.py:42

GO

Function definitions with documentation
Struct and interface definitions
Methods associated with types
Comment association

bash

# Extract a Go struct
probe extract pkg/models/user.go#User

# Extract the block containing line 42 (for example, a method)
probe extract pkg/handlers/auth.go:42

OTHER LANGUAGES

Probe supports extraction for many other languages including:

C/C++: Functions, classes, structs, templates, namespaces
Java: Methods, classes, interfaces, annotations
Ruby: Methods, classes, modules, blocks
PHP: Functions, classes, namespaces, attributes
Swift: Functions, classes, structs, protocols, extensions
C#: Methods, classes, interfaces, namespaces, attributes
Markdown: Sections, code blocks, lists, tables, frontmatter

Each language implementation understands the unique syntax and structures of that language.

ADVANCED USAGE TECHNIQUES

PRESERVING ORIGINAL INPUT WITH --KEEP-INPUT

The --keep-input (or -k) flag preserves and displays the original, unstructured input content alongside the extracted code blocks:

bash

# Extract code while preserving original input
probe extract src/main.rs:42 --keep-input

# Using the short form
probe extract src/main.rs:42 -k

When this flag is enabled, the output will include:

The original input text exactly as provided
The structured, extracted code blocks

This is particularly useful when:

Working with error messages or logs where the context is important
Debugging extraction issues by comparing input to output
Creating documentation that references both the original text and the extracted code
Preserving file paths and line numbers from compiler output

Output Format with --keep-input

In text output formats (color, plain, markdown), the original input is displayed first, followed by a separator, then the extracted code blocks:

--- Original Input ---
src/main.rs:42: error: invalid syntax

--- Extracted Code ---
fn process_data(data: &[u8]) -> Result<Vec<u8>, Error> {
    // Processing logic
    ...
}

In structured output formats (json, xml), the original input is included as an additional field:

json

{
  "original_input": "src/main.rs:42: error: invalid syntax",
  "results": [
    {
      "file": "src/main.rs",
      "lines": [40, 45],
      "node_type": "function",
      "code": "fn process_data(data: &[u8]) -> Result<Vec<u8>, Error> {\n    // Processing logic\n    ...\n}"
    }
  ],
  "summary": {
    "count": 1,
    "total_bytes": 85,
    "total_tokens": 25
  }
}
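
A downstream script might consume this JSON as sketched below. The field names are taken from the sample above; the sample is embedded as a string (with the function body shortened) so the snippet runs without Probe installed, and real output may carry more fields. `summarize` is a hypothetical helper.

```python
import json

# Shape copied from the sample output above; the code body is shortened.
SAMPLE_OUTPUT = """
{
  "original_input": "src/main.rs:42: error: invalid syntax",
  "results": [
    {
      "file": "src/main.rs",
      "lines": [40, 45],
      "node_type": "function",
      "code": "fn process_data(data: &[u8]) -> Result<Vec<u8>, Error> {\\n    ...\\n}"
    }
  ],
  "summary": {"count": 1, "total_bytes": 85, "total_tokens": 25}
}
"""


def summarize(raw: str) -> list:
    """Return one 'file:start-end (node_type)' label per extracted block."""
    data = json.loads(raw)
    labels = []
    for result in data["results"]:
        start, end = result["lines"]
        labels.append(f'{result["file"]}:{start}-{end} ({result["node_type"]})')
    return labels


print(summarize(SAMPLE_OUTPUT))  # ['src/main.rs:40-45 (function)']
```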

EXTRACTING FROM ERROR MESSAGES

Feed compiler errors directly to extract relevant code:

bash

# Extract code from compiler errors
rustc main.rs 2>&1 | probe extract

# Pull code from test failures (2>&1 also captures build errors written to stderr)
go test ./... 2>&1 | probe extract

# Extract code from errors while preserving the original error message
rustc main.rs 2>&1 | probe extract --keep-input
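
Turning piped error text into extraction targets comes down to spotting `path:line` references. The regular expression below is an illustrative assumption, not Probe's actual matcher, and `find_locations` is a hypothetical name.

```python
import re
from typing import List, Tuple

# A path-like token with a file extension, followed by ':<line>'.
LOCATION = re.compile(r"(?P<path>[\w./\\-]+\.\w+):(?P<line>\d+)")


def find_locations(text: str) -> List[Tuple[str, int]]:
    """Return unique (path, line) pairs in order of first appearance."""
    seen = []
    for match in LOCATION.finditer(text):
        location = (match["path"], int(match["line"]))
        if location not in seen:
            seen.append(location)
    return seen


log = """\
error[E0425]: cannot find value `x` in this scope
 --> src/main.rs:42:13
handler_test.go:17: unexpected status 500
"""
print(find_locations(log))  # [('src/main.rs', 42), ('handler_test.go', 17)]
```

Each pair could then be passed on as a `file:line` target, the same form accepted on the command line.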


GIT DIFF EXTRACTION

Extract code from git diff output with automatic format detection:

bash

# Extract code from git diff output (auto-detection)
git diff | probe extract

# Extract code from a diff file (auto-detection)
probe extract diff_file.patch

# Extract code from clipboard containing git diff (auto-detection)
probe extract --from-clipboard

Probe automatically detects git diff format when content starts with diff --git, making the --diff flag optional. This works with:

Piped git diff output
Diff files provided as arguments
Clipboard content with --from-clipboard

The --diff flag is still supported for backward compatibility and explicit format specification.
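
The detection rule, and the idea of mapping a diff back to source lines, can be sketched as follows. Only the `diff --git` prefix check comes from the description above. The hunk parsing relies on the standard unified diff header format, and how Probe actually maps hunks to AST nodes is not specified here. The function names are hypothetical.

```python
import re
from typing import Dict, List, Tuple

# Unified diff hunk header: @@ -old_start[,old_len] +new_start[,new_len] @@
HUNK = re.compile(r"^@@ -\d+(?:,\d+)? \+(\d+)(?:,(\d+))? @@")


def looks_like_git_diff(text: str) -> bool:
    """Mirror the rule described above: the content starts with 'diff --git'."""
    return text.startswith("diff --git")


def changed_ranges(diff: str) -> Dict[str, List[Tuple[int, int]]]:
    """Map each new-side file path to the (start, end) line ranges of its hunks."""
    ranges: Dict[str, List[Tuple[int, int]]] = {}
    current = None
    for line in diff.splitlines():
        if line.startswith("+++ "):
            path = line[4:]
            if path == "/dev/null":
                current = None  # deleted file: nothing to extract
            else:
                # git prefixes the new path with 'b/'
                current = path[2:] if path.startswith("b/") else path
        elif current is not None:
            match = HUNK.match(line)
            if match:
                start = int(match.group(1))
                length = int(match.group(2)) if match.group(2) is not None else 1
                if length > 0:
                    ranges.setdefault(current, []).append((start, start + length - 1))
    return ranges


# Illustrative input only; the index hashes are placeholders.
SAMPLE_DIFF = """\
diff --git a/src/main.rs b/src/main.rs
index 1111111..2222222 100644
--- a/src/main.rs
+++ b/src/main.rs
@@ -40,6 +40,7 @@ fn process_data(data: &[u8]) {
"""
print(changed_ranges(SAMPLE_DIFF))  # {'src/main.rs': [(40, 46)]}
```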

PIPELINE INTEGRATION

Chain with other tools for powerful workflows:

bash

# Find and extract error handlers
probe search "error handling" --files-only | xargs -I{} probe extract {} --format markdown

# Extract specific functions with context
grep -n "handleRequest" ./src/*.js | cut -d':' -f1,2 | probe extract --context 3

# Extract all functions matching a pattern
find . -name "*.py" | xargs grep -l "def test_" | xargs -I{} probe extract {}#test_

SYMBOL EXTRACTION EXAMPLES

Extract code by symbol name across different languages:

bash

# Extract a Rust function
probe extract src/main.rs#handle_extract

# Extract a JavaScript class method
probe extract src/components/User.js#User.authenticate

# Extract a Python class method
probe extract src/models.py#UserModel.save

# Extract a Go interface
probe extract pkg/service/interface.go#UserService
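
The dotted `Class.method` form used above suggests a lookup that descends one name at a time. The sketch below shows that idea with Python's `ast` module as a stand-in for tree-sitter. The resolution rule is an assumption for illustration, and `find_symbol` is a hypothetical helper.

```python
import ast
from typing import Optional

DEFINITIONS = (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)


def find_symbol(source: str, dotted_name: str) -> Optional[str]:
    """Return the source of `Class.method` or a top-level name, if present."""
    scope = ast.parse(source).body
    node = None
    for part in dotted_name.split("."):
        node = next(
            (
                child
                for child in scope
                if isinstance(child, DEFINITIONS) and child.name == part
            ),
            None,
        )
        if node is None:
            return None  # this part of the name was not found
        scope = node.body  # descend into the class or function body
    return ast.get_source_segment(source, node)


SAMPLE_MODULE = '''\
class UserModel:
    def save(self):
        return True
'''

print(find_symbol(SAMPLE_MODULE, "UserModel.save"))
```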

MULTI-FILE EXTRACTION

Extract code from multiple files in a single command:

bash

# Extract from multiple specific files
probe extract src/auth.js:15 src/api.js:27 src/models.rs:42

# Extract using glob patterns
probe extract src/*.rs:42

# Extract multiple symbols
probe extract src/main.rs#handle_request src/models.rs#User

PRACTICAL APPLICATIONS

CODE REVIEW WORKFLOWS

bash

# Extract changes for review (using auto-detection)
git diff | probe extract

# Extract changes from a specific commit
git show commit_hash | probe extract

# Extract changes between branches
git diff main..feature-branch | probe extract

# Extract the files changed in a PR that contain Rust functions (each matching file is extracted whole)
git diff --name-only origin/main | xargs grep -l "fn " | xargs -I{} probe extract {}

DOCUMENTATION GENERATION

bash

# Extract files that define public functions, for API documentation
find . -name "*.rs" | xargs grep -l "pub fn" | xargs -I{} probe extract {} --format markdown > api_docs.md

# Extract files that define classes, for an API reference
find . -name "*.py" | xargs grep -l "class " | xargs -I{} probe extract {} --format markdown > classes.md

AI INTEGRATION

bash

# Extract code for AI context (jq -r prints the raw code string instead of a JSON-quoted one)
probe extract src/main.rs:42 --format json | jq -r '.results[0].code' | ai-assistant "Explain this code"

# Extract multiple related functions for AI analysis
probe extract src/auth.rs#authenticate src/auth.rs#validate --format json --max-tokens 4000

# Code review with git diff and AST extraction
git diff | tee /tmp/changes.diff | ai-assistant "Here are the changes:" && git diff | probe extract | ai-assistant "Here are the complete functions that were modified:"

# Comprehensive code review with both diff and AST context
git diff > /tmp/changes.diff && git diff | probe extract > /tmp/ast_blocks.txt && cat /tmp/changes.diff /tmp/ast_blocks.txt | ai-assistant "Review these changes. The first part shows the diff, and the second part shows the complete AST blocks of modified functions."

# Extract code with LLM prompt and instructions
probe extract src/auth.rs#authenticate --format json --prompt engineer --instructions "Explain this authentication function"

The git diff auto-detection feature is particularly valuable for AI code review workflows. When you pipe a git diff to probe extract, it automatically extracts the complete AST nodes (functions, classes, methods) that contain the changes, providing AI tools with both the specific changes (from the diff) and the full context (from the AST extraction).

LLM INTEGRATION WITH PROMPTS

Probe's extract command supports direct integration with Large Language Models (LLMs) through the --prompt and --instructions flags:

bash

# Extract code with engineer prompt template
probe extract src/main.rs#handle_request --prompt engineer --instructions "Explain this function"

# Extract code with architect prompt template
probe extract src/auth.rs --prompt architect --instructions "Analyze this authentication module"

# Extract code with custom prompt template
probe extract src/api.js:42 --prompt /path/to/custom/prompt.txt --instructions "Refactor this code"

PROMPT TEMPLATES

The --prompt flag accepts two kinds of values:

Built-in templates:
- engineer: A prompt template for software engineering tasks, focused on code implementation
- architect: A prompt template for architectural analysis and planning
Custom templates:
- Path to a file containing a custom prompt template

The prompt and instructions are then included in the output:
- In structured formats (JSON, XML), they are included as fields
- In text formats, they appear as sections at the end of the output

This feature is particularly useful for:

Creating consistent AI prompting patterns
Providing context for code analysis
Standardizing code review workflows
Automating documentation generation

DEBUGGING ASSISTANCE

bash

# Extract code from error stack trace
cat error.log | probe extract

# Extract function with additional context
probe extract src/api.js:27 --context 10

BENEFITS OF CODE EXTRACTION

Complete Context: Get entire functions or classes, not just fragments
Language Awareness: Extracts code according to language-specific rules
Precise Targeting: Extract exactly what you need by line or symbol name
Format Flexibility: Output in various formats for different workflows
Tool Integration: Works seamlessly with other command-line tools
Intelligent Fallbacks: Gracefully handles cases where AST parsing isn't possible

For more information on how Probe works internally, see How Probe Works. For details on search capabilities, see Search Functionality.
