Srimanth-Reddy/file-finder


File Finder - AI-Powered CLI Search Tool

A powerful CLI tool for finding files using both simple name matching and AI-powered contextual search.

Features

  • Simple Search: Fast filename-based search using pattern matching
  • Contextual Search: AI-powered semantic search using Google Gemini that understands file content
  • Concurrent Processing: Parallel worker pool for contextual search, roughly 10x faster than sequential processing at the default 10 workers
  • Efficient Traversal: Uses filepath.WalkDir for recursive directory traversal
  • Smart Filtering: Automatically skips binary files and large files during contextual search

Prerequisites

Before installing, make sure you have:

  • Go 1.19 or higher installed (download from https://go.dev/dl/)
  • Internet connection (for AI contextual search)
  • Gemini API key (optional, only for contextual search), available from Google AI Studio

Note: For simplicity in this assignment demo, the Gemini API key is currently hardcoded. In production it should be stored using environment variables.

Installation

Option 1: Local Build (Run from Project Directory)

Build the binary locally:

# 1. Navigate to the project directory (adjust the path to where you cloned it)
cd path/to/file-finder

# 2. Build the binary
go build -o finder

# 3. Run from current directory
./finder -h

Option 2: Install to System PATH (Recommended for Global Use)

Install to a directory in your PATH so you can use finder from anywhere:

# 1. Navigate to the project directory (adjust the path to where you cloned it)
cd path/to/file-finder

# 2. Build the binary
go build -o finder

# 3. Copy to a directory in your PATH
sudo cp finder /usr/local/bin/

# 4. Verify installation
finder -h

Now you can run finder from any directory without the ./ prefix.

Quick Start

After installation, try these commands:

# 1. Simple filename search in current directory
finder report

# 2. Search in a specific directory
finder -path ~/Documents report

# 3. AI contextual search (requires API key)
export GEMINI_API_KEY="your-api-key-here"
finder -q "find my tax documents"

# 4. Search with custom worker count
finder -q "meeting notes" -workers 15

Usage Guide

1. Simple Filename Search (Fast & Free)

Search for files by name using pattern matching. This mode works offline and is very fast:

# Search in current directory
finder report

# Search from your home directory
finder -path ~ config

# Search in Documents folder
finder -path ~/Documents budget

# Output example:
# /Users/sri/docs/annual_report.pdf
# /Users/sri/projects/report_2024.txt
# /Users/sri/work/quarterly_report.xlsx

When to use: Finding files by their filename when you know part of the name.

2. AI Contextual Search (Semantic Understanding)

Use the -q flag for semantic search that understands file content and meaning:

# Find files about groceries (searches file CONTENT)
finder -q "grocery list"

# Find files containing passwords or credentials
finder -q "find my password"

# Find meeting notes from specific topics
finder -q "meeting notes about project planning"

# Search only in Documents folder
finder -path ~/Documents -q "tax documents from 2024"

When to use: When you know what the file is about, but not its exact name.

3. Performance Tuning

Control how many files are processed concurrently:

# Default: 10 concurrent workers (balanced)
finder -q "your query"

# Fast: 20 workers (for large searches, may hit rate limits)
finder -q "your query" -workers 20

# Conservative: 5 workers (slower but safer for API limits)
finder -q "your query" -workers 5

# Debug: 1 worker (sequential, for troubleshooting)
finder -q "your query" -workers 1

Real-World Examples

Example 1: Find Shopping Lists

Given a file things_to_buy.txt with content:

rice
eggs
mango

Command:

finder -q "Grocery list"

Result: Finds things_to_buy.txt even though "grocery" doesn't appear anywhere in the file, because Gemini judges relevance by the meaning of the content rather than by keyword overlap.

Example 2: Find Credentials

Given a file app_config.json with content:

{
  "username": "admin@example.com",
  "password": "secret123"
}

Command:

finder -q "find my password"

Result: Finds app_config.json and other files containing credentials.

Example 3: Find Work Documents

Search your entire Documents folder for meeting notes:

cd ~/Documents
finder -q "meeting notes from this month"

Example 4: Find Code Files

Search for API-related code in your projects:

finder -path ~/projects -q "REST API endpoints or authentication code"

Command Reference

Complete Syntax

finder [OPTIONS] [SEARCH_TERM]

Flags

Flag      Type    Default          Description
-q        string  -                Semantic search query (enables AI mode)
-path     string  . (current dir)  Root directory to search from
-workers  int     10               Number of concurrent workers (AI mode only)

Usage Modes

Mode 1: Simple Search

finder <search-term>
finder -path <directory> <search-term>

Mode 2: AI Contextual Search

finder -q "semantic query"
finder -path <directory> -q "semantic query"
finder -q "semantic query" -workers <number>

Configuration

Setting Up Gemini API Key

Contextual search requires a Google Gemini API key.

Step 1: Get API Key

  1. Visit Google AI Studio
  2. Sign in with your Google account
  3. Click "Create API Key"
  4. Copy the key

Step 2: Set Environment Variable

Temporary (current session only):

export GEMINI_API_KEY="your-api-key-here"

Permanent (recommended):

For macOS/Linux with bash:

echo 'export GEMINI_API_KEY="your-api-key-here"' >> ~/.bashrc
source ~/.bashrc

For macOS with zsh (default on modern macOS):

echo 'export GEMINI_API_KEY="your-api-key-here"' >> ~/.zshrc
source ~/.zshrc

Step 3: Verify

echo $GEMINI_API_KEY
# Should print your API key
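
On the Go side, reading the key from the environment could look like the sketch below. This is not the project's actual code (the note above says the demo hardcodes the key). The helper name apiKeyFromEnv is hypothetical, and the lookup function is passed in only so the logic can be exercised without touching the real environment.

```go
package main

import (
	"errors"
	"fmt"
	"os"
)

// apiKeyFromEnv returns the Gemini API key using the supplied lookup
// function (normally os.LookupEnv). Hypothetical helper for illustration.
func apiKeyFromEnv(lookup func(string) (string, bool)) (string, error) {
	key, ok := lookup("GEMINI_API_KEY")
	if !ok || key == "" {
		return "", errors.New("GEMINI_API_KEY environment variable not set")
	}
	return key, nil
}

func main() {
	key, err := apiKeyFromEnv(os.LookupEnv)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	// Print only the length so the secret never lands in terminal history.
	fmt.Printf("API key loaded (%d characters)\n", len(key))
}
```

Failing early with a clear message keeps a missing key from surfacing later as an opaque API error.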

Common Use Cases

Use Case 1: Find Files in Current Directory

cd ~/Documents
finder budget

Use Case 2: Search Entire Home Directory

finder -path ~ "vacation"

Use Case 3: Semantic Search in Specific Folder

finder -path ~/work -q "project proposals from Q1"

Use Case 4: Find Configuration Files

finder -path ~ -q "files containing database credentials"

Use Case 5: Fast Search with More Workers

finder -path ~/Documents -q "research papers" -workers 20

Troubleshooting

Issue: "command not found: finder"

Solution 1: Use relative path from project directory

cd path/to/file-finder  # adjust to where you cloned the project
./finder -q "query"

Solution 2: Install to system PATH

cd path/to/file-finder  # adjust to where you cloned the project
go build -o finder
sudo cp finder /usr/local/bin/

Solution 3: Verify /usr/local/bin is in your PATH

echo $PATH | grep "/usr/local/bin"
# If not found, add it (use ~/.zshrc instead if your shell is zsh):
echo 'export PATH="/usr/local/bin:$PATH"' >> ~/.bashrc
source ~/.bashrc

Issue: "GEMINI_API_KEY environment variable not set"

Solution:

# Set the API key
export GEMINI_API_KEY="your-actual-api-key"

# Or add to your shell config (~/.zshrc if your shell is zsh)
echo 'export GEMINI_API_KEY="your-key"' >> ~/.bashrc

Issue: API rate limit errors

Solution: Reduce concurrent workers

finder -q "query" -workers 5

Issue: No results found

Checklist:

  1. Verify the search path is correct

    finder -path /correct/path -q "query"
  2. Check that the file extensions are supported (text files only)

    • Supported: .txt, .md, .json, .py, .js, etc.
    • Not supported: .pdf, .docx, .jpg, etc.
  3. Verify file size (must be under 1MB)

    find /path -type f -size +1M
  4. Try simple search first to verify files exist

    finder filename_part

Issue: Slow performance

Solutions:

  1. Increase workers (if no rate limits)

    finder -q "query" -workers 20
  2. Narrow search path

    finder -path ~/Documents/specific-folder -q "query"
  3. Use simple search when possible

    finder filename  # Much faster than -q

How It Works

Simple Search Mode

  1. Recursively walks through directories using filepath.WalkDir
  2. Matches filenames using case-insensitive substring matching
  3. Returns all matching file paths
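
The three steps above can be sketched in a few lines of Go. This is a minimal illustration, not the project's Find() implementation: the names matchesName and findByName are hypothetical, and skipping unreadable entries and directories is an assumption.

```go
package main

import (
	"fmt"
	"io/fs"
	"os"
	"path/filepath"
	"strings"
)

// matchesName reports whether name contains term, ignoring case.
func matchesName(name, term string) bool {
	return strings.Contains(strings.ToLower(name), strings.ToLower(term))
}

// findByName walks root recursively and returns the paths of files whose
// base name matches term. Hypothetical helper, not the project's Find().
func findByName(root, term string) ([]string, error) {
	var matches []string
	err := filepath.WalkDir(root, func(path string, d fs.DirEntry, err error) error {
		if err != nil {
			// Assumption: skip entries that cannot be read instead of aborting.
			return nil
		}
		if !d.IsDir() && matchesName(d.Name(), term) {
			matches = append(matches, path)
		}
		return nil
	})
	return matches, err
}

func main() {
	term := "report"
	if len(os.Args) > 1 {
		term = os.Args[1]
	}
	paths, err := findByName(".", term)
	if err != nil {
		fmt.Fprintln(os.Stderr, "walk failed:", err)
		os.Exit(1)
	}
	for _, p := range paths {
		fmt.Println(p)
	}
}
```

Lowercasing both sides is the simplest way to get case-insensitive substring matching, and it needs no network access, which is why this mode works offline.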

Contextual Search Mode

  1. Recursively walks through directories
  2. Filters out:
    • Binary files (based on extension)
    • Large files (>1MB)
    • Directories
  3. Reads text file contents
  4. Sends file content + query to Gemini AI
  5. Returns files that Gemini determines are semantically relevant
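
The filtering in step 2 might look like the sketch below. It is an illustration under assumptions: the helper isCandidate is hypothetical, and the extension set contains only the unsupported examples named in this README (.pdf, .docx, .jpg), not the project's full list.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// maxFileSize mirrors the 1MB limit described above.
const maxFileSize = 1 << 20

// binaryExts is an illustrative, partial set of skipped extensions.
// The project's real list is not shown in this README.
var binaryExts = map[string]bool{
	".pdf":  true,
	".docx": true,
	".jpg":  true,
}

// isCandidate reports whether an entry should have its content sent for
// contextual analysis. Hypothetical helper for illustration.
func isCandidate(name string, size int64, isDir bool) bool {
	if isDir || size > maxFileSize {
		return false
	}
	ext := strings.ToLower(filepath.Ext(name))
	return !binaryExts[ext]
}

func main() {
	fmt.Println(isCandidate("things_to_buy.txt", 18, false)) // small text file
	fmt.Println(isCandidate("scan.PDF", 4096, false))        // skipped by extension
	fmt.Println(isCandidate("dump.txt", 2<<20, false))       // skipped by size
}
```

Filtering before any file is read keeps large and binary files from consuming API calls.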

Architecture

file-finder/
├── main.go                    # CLI entry point and argument parsing
├── internal/
│   ├── search/
│   │   └── search.go         # Search logic (simple & contextual)
│   └── ai/
│       └── gemini.go         # Gemini API client
└── README.md

Key Components

  • main.go: Handles CLI argument parsing using the flag package
  • search.go: Implements both Find() (simple) and ContextualFind() (AI-powered)
  • gemini.go: Manages API communication with Google Gemini
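
For orientation, argument parsing with the standard flag package could be sketched as follows. The flag names, types, and defaults come from the Flags table in this README. The options struct, the parseArgs function, and the error text are hypothetical and may differ from what main.go actually does.

```go
package main

import (
	"errors"
	"flag"
	"fmt"
	"os"
)

// options holds the parsed command line. Hypothetical type for illustration.
type options struct {
	query   string // -q: semantic query, enables AI mode
	path    string // -path: root directory
	workers int    // -workers: concurrent workers (AI mode only)
	term    string // positional search term for simple mode
}

// parseArgs parses args (without the program name) into options.
func parseArgs(args []string) (options, error) {
	var o options
	fs := flag.NewFlagSet("finder", flag.ContinueOnError)
	fs.StringVar(&o.query, "q", "", "semantic search query (enables AI mode)")
	fs.StringVar(&o.path, "path", ".", "root directory to search from")
	fs.IntVar(&o.workers, "workers", 10, "number of concurrent workers (AI mode only)")
	if err := fs.Parse(args); err != nil {
		return o, err
	}
	o.term = fs.Arg(0) // empty string when no positional argument is given
	if o.query == "" && o.term == "" {
		return o, errors.New("provide a search term or a -q query")
	}
	return o, nil
}

func main() {
	opts, err := parseArgs(os.Args[1:])
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(2)
	}
	if opts.query != "" {
		fmt.Printf("contextual search for %q under %s with %d workers\n",
			opts.query, opts.path, opts.workers)
		return
	}
	fmt.Printf("simple search for %q under %s\n", opts.term, opts.path)
}
```

Go's flag package stops parsing at the first non-flag argument, which is why the examples in this README place -path before the search term.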

Performance Optimizations

Concurrent Processing (New!)

The tool now uses parallel processing for contextual search:

  • Worker Pool Pattern: Processes multiple files simultaneously
  • Configurable Concurrency: Control worker count with -workers flag
  • Default: 10 concurrent API calls (balances speed vs rate limits)
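
A worker pool of this kind can be sketched with channels and a sync.WaitGroup. This is an illustration under assumptions, not the code in search.go: contextualFilter and relevanceFunc are hypothetical names, and the Gemini call is replaced by a stub so the example runs offline.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
	"sync"
)

// relevanceFunc stands in for the per-file Gemini call. Hypothetical signature.
type relevanceFunc func(path, query string) bool

// contextualFilter checks paths with a fixed number of workers and returns
// the relevant ones, sorted so the output is stable across runs.
func contextualFilter(paths []string, query string, workers int, isRelevant relevanceFunc) []string {
	if workers < 1 {
		workers = 1
	}
	jobs := make(chan string)
	results := make(chan string)

	var wg sync.WaitGroup
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for p := range jobs {
				if isRelevant(p, query) {
					results <- p
				}
			}
		}()
	}

	// Feed the jobs, then signal the workers that no more are coming.
	go func() {
		for _, p := range paths {
			jobs <- p
		}
		close(jobs)
	}()

	// Close results once every worker has finished.
	go func() {
		wg.Wait()
		close(results)
	}()

	var matched []string
	for p := range results {
		matched = append(matched, p)
	}
	sort.Strings(matched)
	return matched
}

func main() {
	// Stub: a keyword check on the path replaces the real API call.
	stub := func(path, query string) bool { return strings.Contains(path, "notes") }
	paths := []string{"a/notes.txt", "b/todo.md", "c/meeting_notes.md"}
	fmt.Println(contextualFilter(paths, "meeting notes", 10, stub))
}
```

The worker count caps how many calls are in flight at once, which is the knob the -workers flag exposes.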

Performance Comparison:

  • Sequential (old): 1 file at a time → ~3-5 seconds per file
  • Concurrent (new): 10 files at once by default → roughly 10x faster when there are many more files than workers

Performance Examples

# Use default 10 workers
finder -q "your query"

# Increase to 20 workers for faster searches (watch rate limits!)
finder -q "your query" -workers 20

# Reduce to 5 workers for conservative API usage
finder -q "your query" -workers 5

Benchmarks

For 100 text files:

  • Sequential: ~5 minutes
  • 10 workers: ~30 seconds
  • 20 workers: ~15 seconds
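
These figures are consistent with a simple back-of-the-envelope model: files are processed in batches of one per worker, and each batch takes about one API round trip. The sketch below assumes 3 seconds per file (the low end of the range quoted above) and ignores rate limiting and variance. The function name estimate is hypothetical.

```go
package main

import (
	"fmt"
	"time"
)

// estimate returns a rough wall-clock time for processing files with the
// given number of workers, assuming every call takes perFile.
func estimate(files, workers int, perFile time.Duration) time.Duration {
	if workers < 1 {
		workers = 1
	}
	batches := (files + workers - 1) / workers // ceiling division
	return time.Duration(batches) * perFile
}

func main() {
	for _, w := range []int{1, 10, 20} {
		fmt.Printf("%2d workers: %v\n", w, estimate(100, w, 3*time.Second))
	}
	// Prints 5m0s, 30s and 15s for 1, 10 and 20 workers respectively.
}
```

Under this model, speedup is linear in the worker count until API rate limits intervene.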

Considerations

  • Simple search: Very fast, suitable for large directory trees
  • Contextual search: Each file costs one API call (~3-5 seconds), so total time scales with the file count divided by the worker count
  • API Rate Limits: Adjust -workers if you hit rate limits
  • File size limit: 1MB max for contextual search
  • Text file detection: Uses extension-based heuristics to skip binary files

Limitations

  • Contextual search requires internet connection (API calls)
  • Contextual search has API rate limits and costs
  • Binary files are skipped in contextual search
  • Maximum file size for content analysis is 1MB

Future Enhancements

  • Parallel/concurrent traversal for faster searches (implemented)
  • Local indexing/caching for instant repeated searches
  • Support for glob patterns (e.g., *.py)
  • Respect .gitignore patterns automatically
  • Custom file size limits via flags
  • Real-time progress bars for long searches
  • Result ranking by relevance score
  • Batch multiple files in single API call

License

MIT
