Quick Start · Key Features · Web UI · How it Works · FAQ
🔍 Agentic Search • 🧠 Knowledge Clustering • 🎲 Monte Carlo Evidence Sampling
⚡ Indexless Retrieval • 🌱 Self-Evolving Knowledge Base • 💬 Real-time Chat
Intelligence pipelines built upon vector-based retrieval can be rigid and brittle. They rely on static vector embeddings that are expensive to compute, blind to real-time changes, and detached from the raw context. We introduce Sirchmunk to usher in a more agile paradigm, where data is no longer treated as a snapshot, and insights can evolve together with the data.
Sirchmunk works directly with raw data -- bypassing the heavy overhead of squeezing your rich files into fixed-dimensional vectors.
- Instant Search: No complex pre-processing pipelines or hours-long indexing; just drop your files and search immediately.
- Full Fidelity: Zero information loss -- stay true to your data without vector approximation.
Data is a stream, not a snapshot. Sirchmunk is dynamic by design, while a vector DB can become obsolete the moment your data changes.
- Context-Aware: Evolves in real-time with your data context.
- LLM-Powered Autonomy: Designed for Agents that perceive data as it lives, utilizing token-efficient reasoning that triggers LLM inference only when necessary to maximize intelligence while minimizing cost.
Sirchmunk bridges massive local repositories and the web with high-scale throughput and real-time awareness.
It serves as a unified intelligent hub for AI agents, delivering deep insights across vast datasets at the speed of thought.
| Dimension | Traditional RAG | ✨ Sirchmunk |
|---|---|---|
| 💰 Setup Cost | High Overhead (VectorDB, GraphDB, Complex Document Parsers...) | ✅ Zero Infrastructure: direct-to-data retrieval without vector silos |
| 🔄 Data Freshness | Stale (Batch Re-indexing) | ✅ Instant & Dynamic: a self-evolving index that reflects live changes |
| 📈 Scalability | Linear Cost Growth | ✅ Extremely low RAM/CPU consumption; native elastic support efficiently handles large-scale datasets |
| 🎯 Accuracy | Approximate Vector Matches | ✅ Deterministic & Contextual: hybrid logic ensures semantic precision |
| ⚙️ Workflow | Complex ETL Pipelines | ✅ Drop-and-Search: zero-config integration for rapid deployment |
- 🎉 Jan 22, 2026: Introducing Sirchmunk: Initial Release v0.0.1 Now Available!
- Python 3.10+
- LLM API Key (OpenAI-compatible endpoint, local or remote)
- Node.js 18+ (Optional, for web interface)
```bash
# Create virtual environment (recommended)
conda create -n sirchmunk python=3.13 -y && conda activate sirchmunk

# Install from PyPI
pip install sirchmunk

# Or via uv:
uv pip install sirchmunk

# Alternatively, install from source:
git clone https://github.com/modelscope/sirchmunk.git && cd sirchmunk
pip install -e .
```

```python
import asyncio

from sirchmunk import AgenticSearch
from sirchmunk.llm import OpenAIChat
llm = OpenAIChat(
    api_key="your-api-key",
    base_url="your-base-url",  # e.g., https://api.openai.com/v1
    model="your-model-name",   # e.g., gpt-4o
)

async def main():
    agent_search = AgenticSearch(llm=llm)
    result: str = await agent_search.search(
        query="How does transformer attention work?",
        search_paths=["/path/to/documents"],
    )
    print(result)

asyncio.run(main())
```

- Upon initialization, `AgenticSearch` automatically checks whether `ripgrep-all` and `ripgrep` are installed. If they are missing, it will attempt to install them automatically; if the automatic installation fails, please install them manually.
- Replace `your-api-key`, `your-base-url`, `your-model-name`, and `/path/to/documents` with your actual values.
The web UI is built for fast, transparent workflows: chat, knowledge analytics, and system monitoring in one place.
```bash
git clone https://github.com/modelscope/sirchmunk.git && cd sirchmunk
pip install ".[web]"
npm install --prefix web
```

- Note: Node.js 18+ is required for the web interface.
```bash
# Start frontend and backend
python scripts/start_web.py

# Stop frontend and backend
python scripts/stop_web.py
```

By default, the web UI is available at:
- Backend APIs: http://localhost:8584/docs
- Frontend: http://localhost:8585
Configuration:
- Access `Settings` → `Environment Variables` to configure the LLM API and other parameters.
| Component | Description |
|---|---|
| AgenticSearch | Search orchestrator with LLM-enhanced retrieval capabilities |
| KnowledgeBase | Transforms raw results into structured knowledge clusters with supporting evidence |
| EvidenceProcessor | Evidence processing based on Monte Carlo importance sampling |
| GrepRetriever | High-performance indexless file search with parallel processing |
| OpenAIChat | Unified LLM interface supporting streaming and usage tracking |
| MonitorTracker | Real-time system and application metrics collection |
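As a rough sketch of how these components compose at the call site (only `AgenticSearch` and `OpenAIChat` are the public entry points shown in the Quick Start; the roles noted in the comments come from the table above, not from a documented wiring API):

```python
from sirchmunk import AgenticSearch
from sirchmunk.llm import OpenAIChat

# OpenAIChat: the unified LLM interface handed to the orchestrator.
llm = OpenAIChat(api_key="your-api-key", base_url="your-base-url", model="your-model-name")

# AgenticSearch: the orchestrator. Per the table above, it internally drives
# GrepRetriever (indexless file search), EvidenceProcessor (Monte Carlo
# evidence sampling), and KnowledgeBase (cluster structuring & persistence).
agent_search = AgenticSearch(llm=llm)
```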
All persistent data is stored in the configured `WORK_PATH` (default: `~/.sirchmunk/`):

```
{WORK_PATH}/
└── .cache/
    ├── history/       # Chat session history (DuckDB)
    │   └── chat_history.db
    ├── knowledge/     # Knowledge clusters (Parquet)
    │   └── knowledge_clusters.parquet
    └── settings/      # User settings (DuckDB)
        └── settings.db
```
How is this different from traditional RAG systems?
Sirchmunk takes an indexless approach:
- No pre-indexing: Direct file search without vector database setup
- Self-evolving: Knowledge clusters evolve based on search patterns
- Multi-level retrieval: Adaptive keyword granularity for better recall
- Evidence-based: Monte Carlo sampling for precise content extraction
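To make the "Evidence-based" point above concrete, here is a toy illustration of importance-weighted sampling over text chunks; the chunks, scores, and weighting scheme are invented for this sketch and are not Sirchmunk's actual `EvidenceProcessor` logic:

```python
import random

# Invented relevance scores for three candidate text chunks.
chunks = ["chunk A ...", "chunk B ...", "chunk C ..."]
scores = [0.7, 0.2, 0.1]

# Normalize scores into a sampling distribution.
weights = [s / sum(scores) for s in scores]

# Draw evidence candidates in proportion to relevance (with replacement),
# so high-signal chunks are examined more often without deterministically
# discarding the long tail.
evidence = random.choices(chunks, weights=weights, k=2)
print(evidence)
```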
What LLM providers are supported?
Any OpenAI-compatible API endpoint, including (but not limited to):
- OpenAI (GPT-4, GPT-4o, GPT-3.5)
- Local models served via Ollama, llama.cpp, vLLM, SGLang, etc.
- Claude via API proxy
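For example, the `OpenAIChat` client from the Quick Start can point at a locally served model; the URL below is Ollama's standard OpenAI-compatible endpoint, and the model name is illustrative:

```python
from sirchmunk.llm import OpenAIChat

llm = OpenAIChat(
    api_key="ollama",                      # many local servers ignore the key
    base_url="http://localhost:11434/v1",  # Ollama's OpenAI-compatible endpoint
    model="llama3.1",                      # illustrative: any model you have pulled
)
```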
How do I add documents to search?
Simply specify the path in your search query:
```python
result = await search.search(
    query="Your question",
    search_paths=["/path/to/folder", "/path/to/file.pdf"]
)
```

No pre-processing or indexing required!
Where are knowledge clusters stored?
Knowledge clusters are persisted in Parquet format at `{WORK_PATH}/.cache/knowledge/knowledge_clusters.parquet`. You can query them using DuckDB or the `KnowledgeManager` API.
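For instance, a minimal DuckDB query over the cluster file, assuming the default `WORK_PATH`; the Parquet schema is version-specific, so this sketch just selects all columns:

```python
import duckdb
from pathlib import Path

# Default WORK_PATH is ~/.sirchmunk/; adjust if you configured another one.
clusters = Path.home() / ".sirchmunk" / ".cache" / "knowledge" / "knowledge_clusters.parquet"

con = duckdb.connect()  # in-memory connection; the Parquet file is read in place
rows = con.execute(f"SELECT * FROM read_parquet('{clusters}') LIMIT 5").fetchall()
print(rows)
```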
How do I monitor LLM token usage?
- Web Dashboard: Visit the Monitor page for real-time statistics
- API: `GET /api/v1/monitor/llm` returns usage metrics
- Code: access `search.llm_usages` after search completion
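For example, a minimal check of the monitoring endpoint, assuming the backend started by `python scripts/start_web.py` is running on its default port (8584); the response schema is whatever the API returns:

```python
import requests

resp = requests.get("http://localhost:8584/api/v1/monitor/llm")
resp.raise_for_status()
print(resp.json())  # real-time LLM usage metrics as JSON
```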
- Text retrieval from raw files
- Knowledge structuring & persistence
- Real-time chat with RAG
- Web UI support
- Web search integration
- Multi-modal support (images, videos)
- Distributed search across nodes
- Knowledge visualization and deep analytics
- More file type support
We welcome contributions!
This project is licensed under the Apache License 2.0.
ModelScope · ⭐ Star us · 🐛 Report a bug · 💬 Discussions
✨ Sirchmunk: Raw data to self-evolving intelligence, in real time.



