Skip to content

Node.js SDK

Probe provides a powerful Node.js SDK that allows you to integrate its code search capabilities directly into your JavaScript and TypeScript applications. This document covers the installation, usage, and advanced features of the Node.js SDK.

Installation

Local Installation

bash
npm install @buger/probe

Global Installation

bash
npm install -g @buger/probe

During installation, the package will automatically download the appropriate Probe binary for your platform (Windows, macOS, or Linux).

Features

  • Search Code: Search for patterns in your codebase using Elasticsearch-like query syntax
  • Query Code: Find specific code structures using tree-sitter patterns
  • Extract Code: Extract code blocks from files based on file paths and line numbers
  • AI Tools Integration: Ready-to-use tools for Vercel AI SDK, LangChain, and other AI frameworks
  • System Message: Default system message for AI assistants with instructions on using Probe tools
  • Cross-Platform: Works on Windows, macOS, and Linux
  • Automatic Binary Management: Automatically downloads and manages the Probe binary
  • Direct CLI Access: Use the Probe binary directly from the command line when installed globally

Basic Usage

Using as a Node.js Library

javascript
import { search, query, extract } from '@buger/probe';

// Search for code
const searchResults = await search({
  path: '/path/to/your/project',
  query: 'function',
  maxResults: 10
});

// Query for specific code structures
const queryResults = await query({
  path: '/path/to/your/project',
  pattern: 'function $NAME($$$PARAMS) $$$BODY',
  language: 'javascript'
});

// Extract code blocks
const extractResults = await extract({
  files: ['/path/to/your/project/src/main.js:42']
});

Using as a Command-Line Tool

When installed globally, the probe command will be available directly from the command line:

bash
# Search for code
probe search "function" /path/to/your/project

# Query for specific code structures
probe query "function $NAME($$$PARAMS) $$$BODY" /path/to/your/project

# Extract code blocks
probe extract /path/to/your/project/src/main.js:42

The package installs the actual Probe binary, not a JavaScript wrapper, so you get the full native performance and all features of the original Probe CLI.

Core Functions

The search function allows you to search for patterns in your codebase using Elasticsearch-like query syntax.

javascript
import { search } from '@buger/probe';

const results = await search({
  path: '/path/to/your/project',
  query: 'function',
  // Optional parameters
  filesOnly: false,
  ignore: ['node_modules', 'dist'],
  excludeFilenames: false,
  reranker: 'hybrid',
  frequencySearch: true,
  exact: false,
  maxResults: 10,
  maxBytes: 1000000,
  maxTokens: 40000,
  allowTests: false,
  anyTerm: false,
  noMerge: false,
  mergeThreshold: 5,
  json: false,
  binaryOptions: {
    forceDownload: false,
    version: '1.0.0'
  }
});

Parameters

ParameterTypeDescriptionDefault
pathstringPath to search in(required)
querystring | string[]Search query or queries(required)
filesOnlybooleanOnly output file pathsfalse
ignorestring[]Patterns to ignore[]
excludeFilenamesbooleanExclude filenames from searchfalse
rerankerstringReranking method ('hybrid', 'hybrid2', 'bm25', 'tfidf')'hybrid'
frequencySearchbooleanUse frequency-based searchtrue
exactbooleanUse exact matchingfalse
maxResultsnumberMaximum number of results10
maxBytesnumberMaximum bytes to return1000000
maxTokensnumberMaximum tokens to return40000
allowTestsbooleanInclude test filesfalse
anyTermbooleanMatch any termfalse
noMergebooleanDon't merge adjacent blocksfalse
mergeThresholdnumberMerge threshold5
sessionstringSession ID for caching results''
jsonbooleanReturn results as parsed JSON instead of stringfalse
binaryOptionsobjectOptions for getting the binary{}

Query

The query function allows you to find specific code structures using tree-sitter patterns.

javascript
import { query } from '@buger/probe';

const results = await query({
  path: '/path/to/your/project',
  pattern: 'function $NAME($$$PARAMS) $$$BODY',
  // Optional parameters
  language: 'javascript',
  ignore: ['node_modules', 'dist'],
  allowTests: false,
  maxResults: 10,
  format: 'markdown',
  json: false,
  binaryOptions: {
    forceDownload: false,
    version: '1.0.0'
  }
});

Parameters

ParameterTypeDescriptionDefault
pathstringPath to search in(required)
patternstringThe ast-grep pattern to search for(required)
languagestringProgramming language to search in(inferred from files)
ignorestring[]Patterns to ignore[]
allowTestsbooleanInclude test filesfalse
maxResultsnumberMaximum number of results10
formatstringOutput format ('markdown', 'plain', 'json', 'color')'markdown'
jsonbooleanReturn results as parsed JSON instead of stringfalse
binaryOptionsobjectOptions for getting the binary{}

Extract

The extract function allows you to extract code blocks from files based on file paths and line numbers.

javascript
import { extract } from '@buger/probe';

const results = await extract({
  files: [
    '/path/to/your/project/src/main.js',
    '/path/to/your/project/src/utils.js:42'  // Extract from line 42
  ],
  // Optional parameters
  allowTests: false,
  contextLines: 2,
  format: 'markdown',
  json: false,
  binaryOptions: {
    forceDownload: false,
    version: '1.0.0'
  }
});

Parameters

ParameterTypeDescriptionDefault
filesstring[]Files to extract from (can include line numbers with colon)(required)
allowTestsbooleanInclude test filesfalse
contextLinesnumberNumber of context lines to include0
formatstringOutput format ('markdown', 'plain', 'json')'markdown'
jsonbooleanReturn results as parsed JSON instead of stringfalse
binaryOptionsobjectOptions for getting the binary{}

Binary Management

The SDK provides functions for managing the Probe binary:

javascript
import { getBinaryPath, setBinaryPath } from '@buger/probe';

// Get the path to the probe binary
const binaryPath = await getBinaryPath({
  forceDownload: false,
  version: '1.0.0'
});

// Manually set the path to the probe binary
setBinaryPath('/path/to/probe/binary');

AI Tools Integration

The SDK provides built-in tools for integrating with AI frameworks. The latest version introduces tool generators that allow for better configuration and session isolation:

Vercel AI SDK Integration

javascript
import { generateText } from 'ai';
import { searchTool, queryTool, extractTool } from '@buger/probe';
import { randomUUID } from 'crypto';

// Generate a session ID
const sessionId = randomUUID();

// Create configured tools with the session ID
const configOptions = {
  sessionId,
  debug: process.env.DEBUG === 'true'
};

// Create configured tool instances
const configuredTools = {
  search: searchTool(configOptions),
  query: queryTool(configOptions),
  extract: extractTool(configOptions)
};

// Use the configured tools with Vercel AI SDK
async function chatWithAI(userMessage) {
  const result = await generateText({
    model: provider(modelName),
    messages: [{ role: 'user', content: userMessage }],
    system: "You are a code intelligence assistant. Use the provided tools to search and analyze code.",
    tools: configuredTools,
    maxSteps: 15,
    temperature: 0.7
  });
  
  return result.text;
}

Tool Generators and Configuration

The SDK provides tool generators that allow you to create configured instances of tools with specific options:

javascript
import { searchTool, queryTool, extractTool } from '@buger/probe';
import { randomUUID } from 'crypto';

// Generate a session ID
const sessionId = randomUUID();

// Configure tools with options
const configOptions = {
  sessionId,                        // Session ID for caching search results
  debug: process.env.DEBUG === '1', // Enable debug logging
  maxTokens: 30000                  // Override default max tokens (search tool only)
};

// Create configured tool instances
const search = searchTool(configOptions);
const query = queryTool({ debug: true });  // Each tool can have different options
const extract = extractTool({ debug: true });

Configuration Options

OptionTypeDescriptionApplicable Tools
sessionIdstringSession ID for caching search resultssearchTool
debugbooleanEnable debug loggingAll tools
maxTokensnumberOverride default max tokenssearchTool

Backward Compatibility

For backward compatibility, the package still exports pre-configured tools:

javascript
import { tools } from '@buger/probe';

// Use pre-configured tools
const { searchTool, queryTool, extractTool } = tools;

LangChain Integration

javascript
import { ChatOpenAI } from '@langchain/openai';
import { tools, createSearchTool, createQueryTool, createExtractTool } from '@buger/probe';
import { randomUUID } from 'crypto';

// Generate a session ID
const sessionId = randomUUID();

// Create the LangChain tools with configuration
const searchTool = tools.createSearchTool({
  sessionId,
  debug: process.env.DEBUG === 'true'
});
const queryTool = tools.createQueryTool({ debug: true });
const extractTool = tools.createExtractTool({ debug: true });

// Create a ChatOpenAI instance with tools
const model = new ChatOpenAI({
  modelName: "gpt-4o",
  temperature: 0.7
}).withTools([searchTool, queryTool, extractTool]);

// Use the model with tools
async function chatWithAI(userMessage) {
  const result = await model.invoke([
    { role: "system", content: "You are a code intelligence assistant. Use the provided tools to search and analyze code." },
    { role: "user", content: userMessage }
  ]);
  
  return result.content;
}

Default System Message

The package provides a default system message that you can use with your AI assistants:

javascript
import { DEFAULT_SYSTEM_MESSAGE, searchTool, queryTool, extractTool } from '@buger/probe';

// Use the default system message in your AI application
const systemMessage = DEFAULT_SYSTEM_MESSAGE;

// Generate a session ID
const sessionId = randomUUID();

// Create configured tools with the session ID
const configOptions = { sessionId };

// Example with Vercel AI SDK
const result = await generateText({
  model: provider(modelName),
  messages: [{ role: 'user', content: userMessage }],
  system: DEFAULT_SYSTEM_MESSAGE,
  tools: {
    search: searchTool(configOptions),
    query: queryTool(configOptions),
    extract: extractTool(configOptions)
  }
});

Advanced Examples

Building a Code Search API

javascript
import express from 'express';
import { search, query, extract } from '@buger/probe';

const app = express();
app.use(express.json());

// Search endpoint
app.post('/api/search', async (req, res) => {
  try {
    const { path, query, options } = req.body;
    const results = await search({
      path,
      query,
      ...options
    });
    res.json({ results });
  } catch (error) {
    res.status(500).json({ error: error.message });
  }
});

// Query endpoint
app.post('/api/query', async (req, res) => {
  try {
    const { path, pattern, language, options } = req.body;
    const results = await query({
      path,
      pattern,
      language,
      ...options
    });
    res.json({ results });
  } catch (error) {
    res.status(500).json({ error: error.message });
  }
});

// Extract endpoint
app.post('/api/extract', async (req, res) => {
  try {
    const { files, options } = req.body;
    const results = await extract({
      files,
      ...options
    });
    res.json({ results });
  } catch (error) {
    res.status(500).json({ error: error.message });
  }
});

app.listen(3000, () => {
  console.log('Code search API running on port 3000');
});

Creating a Custom AI Assistant

javascript
import { search, query, extract } from '@buger/probe';
import { ChatOpenAI } from '@langchain/openai';
import { PromptTemplate } from '@langchain/core/prompts';
import { StringOutputParser } from '@langchain/core/output_parsers';

// Create a custom AI assistant that can search code
async function createCodeAssistant() {
  // Create a chat model
  const model = new ChatOpenAI({
    modelName: "gpt-4o",
    temperature: 0.7
  });
  
  // Create a prompt template
  const promptTemplate = PromptTemplate.fromTemplate(`
    You are a code assistant. I'll provide you with a question and some code search results.
    Please analyze the code and answer the question.
    
    Question: {question}
    
    Code search results:
    {searchResults}
    
    Your analysis:
  `);
  
  // Create a chain
  const chain = promptTemplate
    .pipe(model)
    .pipe(new StringOutputParser());
  
  // Function to answer questions about code
  async function answerCodeQuestion(question, codebasePath) {
    // Search for relevant code
    const searchResults = await search({
      path: codebasePath,
      query: question,
      maxResults: 5,
      maxTokens: 10000
    });
    
    // Get the answer from the AI
    const answer = await chain.invoke({
      question,
      searchResults
    });
    
    return answer;
  }
  
  return { answerCodeQuestion };
}

// Usage
const assistant = await createCodeAssistant();
const answer = await assistant.answerCodeQuestion(
  "How is authentication implemented?",
  "/path/to/your/project"
);
console.log(answer);

Batch Processing Multiple Repositories

javascript
import { search } from '@buger/probe';
import fs from 'fs/promises';
import path from 'path';

async function batchSearch(repositories, searchQuery) {
  const results = {};
  
  for (const repo of repositories) {
    console.log(`Searching in ${repo}...`);
    try {
      const searchResults = await search({
        path: repo,
        query: searchQuery,
        maxResults: 20,
        json: true // Get structured results
      });
      
      results[repo] = searchResults;
    } catch (error) {
      console.error(`Error searching in ${repo}:`, error);
      results[repo] = { error: error.message };
    }
  }
  
  return results;
}

// Example usage
const repositories = [
  '/path/to/repo1',
  '/path/to/repo2',
  '/path/to/repo3'
];

const results = await batchSearch(repositories, 'security AND (vulnerability OR exploit)');

// Save results to a file
await fs.writeFile(
  path.join(process.cwd(), 'search-results.json'),
  JSON.stringify(results, null, 2)
);

console.log('Search completed and results saved to search-results.json');

Code Analysis Pipeline

javascript
import { search, query } from '@buger/probe';
import fs from 'fs/promises';

async function analyzeCodebase(codebasePath) {
  const analysis = {
    timestamp: new Date().toISOString(),
    codebasePath,
    metrics: {},
    patterns: {},
    potentialIssues: []
  };
  
  // Count functions by language
  const languages = ['javascript', 'typescript', 'python', 'rust', 'go'];
  const functionCounts = {};
  
  for (const lang of languages) {
    try {
      const pattern = lang === 'javascript' || lang === 'typescript'
        ? 'function $NAME($$$PARAMS) $$$BODY'
        : lang === 'python'
          ? 'def $NAME($$$PARAMS): $$$BODY'
          : lang === 'rust'
            ? 'fn $NAME($$$PARAMS) $$$BODY'
            : 'func $NAME($$$PARAMS) $$$BODY';
      
      const results = await query({
        path: codebasePath,
        pattern,
        language: lang,
        maxResults: 1000,
        json: true
      });
      
      functionCounts[lang] = results.matches ? results.matches.length : 0;
    } catch (error) {
      console.error(`Error counting functions in ${lang}:`, error);
      functionCounts[lang] = -1; // Error indicator
    }
  }
  
  analysis.metrics.functionCounts = functionCounts;
  
  // Find potential security issues
  const securityPatterns = [
    'password',
    'token',
    'api_key',
    'apikey',
    'secret',
    'credential',
    'eval(',
    'exec(',
    'shell_exec'
  ];
  
  for (const pattern of securityPatterns) {
    try {
      const results = await search({
        path: codebasePath,
        query: pattern,
        maxResults: 50,
        json: true
      });
      
      if (results.matches && results.matches.length > 0) {
        analysis.potentialIssues.push({
          pattern,
          matches: results.matches.map(match => ({
            file: match.file,
            line: match.line,
            content: match.content.substring(0, 100) + '...' // Truncate long content
          }))
        });
      }
    } catch (error) {
      console.error(`Error searching for pattern ${pattern}:`, error);
    }
  }
  
  // Save analysis to file
  await fs.writeFile(
    'codebase-analysis.json',
    JSON.stringify(analysis, null, 2)
  );
  
  return analysis;
}

// Usage
const analysis = await analyzeCodebase('/path/to/your/project');
console.log('Analysis complete. Results saved to codebase-analysis.json');
console.log(`Found ${Object.values(analysis.metrics.functionCounts).reduce((a, b) => a + (b > 0 ? b : 0), 0)} functions across all languages`);
console.log(`Found ${analysis.potentialIssues.length} potential security issues`);

How It Works

When you install the @buger/probe package:

  1. A placeholder binary is included in the package
  2. During installation, the postinstall script downloads the actual Probe binary for your platform
  3. The placeholder is replaced with the actual binary
  4. When installed globally, npm creates a symlink to this binary in your system path

This approach ensures that you get the actual native binary, not a JavaScript wrapper, providing full performance and all features of the original Probe CLI.

Troubleshooting

Common Issues

Binary Not Found

If you encounter a "Binary not found" error:

javascript
import { setBinaryPath } from '@buger/probe';

// Manually set the path to the probe binary
setBinaryPath('/path/to/probe/binary');

Permission Denied

If you encounter a "Permission denied" error:

bash
# Make the binary executable
chmod +x /path/to/probe/binary

Network Error During Binary Download

If you encounter a network error during binary download:

javascript
import { getBinaryPath } from '@buger/probe';

// Force download with a specific version
const binaryPath = await getBinaryPath({
  forceDownload: true,
  version: '1.0.0'
});

Timeout Error

If you encounter a timeout error:

javascript
import { search } from '@buger/probe';

// Increase the timeout by using the execAsync function directly
import { promisify } from 'util';
import { exec } from 'child_process';
const execAsync = promisify(exec);

// Get the binary path
import { getBinaryPath } from '@buger/probe';
const binaryPath = await getBinaryPath();

// Execute the command with a longer timeout
const { stdout } = await execAsync(`${binaryPath} search "query" /path/to/project`, {
  timeout: 60000 // 60 seconds
});

Best Practices

  1. Use Specific Queries: More specific queries yield better results
  2. Limit Result Size: Use maxResults and maxTokens to limit the size of results
  3. Handle Errors: Always wrap API calls in try/catch blocks
  4. Cache Results: Consider caching results for frequently used queries
  5. Use JSON Format: Use json: true for programmatic processing of results
  6. Combine with Other Tools: Use Probe alongside other tools for a more comprehensive understanding of your codebase
  7. Optimize for Performance: Use filesOnly for initial broad searches, then refine with more specific queries
  8. Use Session IDs: For related searches, use the same session ID to avoid seeing duplicate code blocks
  9. Use Tool Generators: Create configured tool instances with session IDs for better isolation in concurrent environments
  10. Enable Debug Logging: Set the debug option to true when troubleshooting tool execution

Session-Based Caching Example

There are two ways to use session-based caching:

1. Using the search function directly

javascript
import { search } from '@buger/probe';

// First search with empty session string (generates a session ID)
const results1 = await search({
  path: '/path/to/your/project',
  query: 'authentication',
  session: ''
});

// Get the session ID from the results
const sessionId = results1.session;
console.log(`Session ID: ${sessionId}`);

// Use the same session ID for related searches
const results2 = await search({
  path: '/path/to/your/project',
  query: 'login',
  session: sessionId
});

// This will skip code blocks already shown in the previous search
console.log(`Found ${results2.matches.length} new matches`);

2. Using the tool generators with Vercel AI SDK

javascript
import { searchTool } from '@buger/probe';
import { randomUUID } from 'crypto';

// Generate a session ID
const sessionId = randomUUID();
console.log(`Session ID: ${sessionId}`);

// Create a configured search tool with the session ID
const configuredSearchTool = searchTool({
  sessionId,
  debug: true // Enable debug logging
});

// Use the configured tool with Vercel AI SDK
// All searches performed with this tool will use the same session ID
const result = await generateText({
  model: provider(modelName),
  messages: [{ role: 'user', content: userMessage }],
  system: "You are a code intelligence assistant. Use the provided tools to search and analyze code.",
  tools: {
    search: configuredSearchTool
  },
  maxSteps: 15
});

Session-based caching is particularly useful when:

  • Building interactive search interfaces
  • Conducting progressive searches that refine or expand on previous queries
  • Creating AI assistants that need to avoid repeating the same code blocks
  • Implementing search workflows that build on previous results
  • Handling concurrent searches in multi-user environments

Released under the Apache 2.0 License.