sist2

File system indexer and web-based search for local files

1.2k stars
72 forks
Last commit: 6mo ago
Repo age: 7y old

sist2 (Simple incremental search tool) is a fast file system indexer that scans directories and builds a searchable index of file contents and metadata. It provides a mobile-friendly web interface and supports two search backends: Elasticsearch or a lightweight SQLite (FTS5) index.

Key Features

  • Incremental, multi-threaded scanning optimized for speed and low memory usage
  • Web UI for searching and browsing results, including thumbnails and metadata
  • Supports Elasticsearch indexing or a simpler SQLite-based search backend
  • Content extraction and metadata parsing for many common formats (documents, media, ebooks)
  • Recursive scanning inside archive files (including archives within archives)
  • Optional OCR via Tesseract for images and supported ebook/document formats
  • Manual tagging in the UI and automatic tagging via user scripts
  • Basic statistics and disk utilization visualizations
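
To make the idea of incremental scanning concrete, here is a minimal Python sketch. It is not sist2's actual implementation. It assumes a simple scheme in which a file is re-indexed only when its modification time differs from the value recorded on the previous scan. The function name `incremental_scan` and the snapshot dictionary are hypothetical and exist only for illustration.

```python
import os
from typing import Dict, List, Tuple


def incremental_scan(
    root: str, previous: Dict[str, float]
) -> Tuple[List[str], Dict[str, float]]:
    """Walk `root` and return (paths needing re-indexing, new mtime snapshot).

    Hypothetical helper: `previous` maps file paths to the modification
    times recorded by the last scan (empty on the first run).
    """
    changed: List[str] = []
    current: Dict[str, float] = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                mtime = os.stat(path).st_mtime
            except OSError:
                continue  # file vanished or is unreadable; skip it
            current[path] = mtime
            # New files are missing from `previous`; modified files differ.
            if previous.get(path) != mtime:
                changed.append(path)
    return changed, current
```

On the first run, pass an empty dictionary so every file is reported. On later runs, pass the snapshot returned by the previous call, and only new or modified files come back. Paths present in the old snapshot but absent from the new one correspond to deleted files and could be dropped from the index.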

Use Cases

  • Personal or team “desktop search” for large document and media collections
  • Building a searchable archive of mixed file types (PDFs, photos, videos, ebooks)
  • Indexing NAS or server directories to quickly locate files by content or metadata

Limitations and Considerations

  • Elasticsearch provides more features but has a significantly higher resource footprint than SQLite
  • Archive scanning is single-threaded, and support for some seek-heavy media formats inside archives may be limited

sist2 is well-suited for users who want fast local file indexing with a modern web search experience and flexible backend options depending on resources and feature needs.

Similar Services

Meilisearch

Fast search engine API with full-text, vector, and hybrid search

55.4k stars
2.3k forks
Last commit: 2d ago

Meilisearch is a lightning-fast search engine API for apps and websites, offering typo-tolerant full-text search plus vector and AI-ready hybrid retrieval.

Alternative to: Algolia (+16 more)

ArchiveBox

Open-source self-hosted web archiving and snapshotting tool

26.4k stars
1.4k forks
Last commit: 11d ago

Self-hosted tool to collect and preserve webpages, media, and bookmarks in durable formats (HTML, PDF, WARC, MP4) with a CLI, web UI, and search.

Alternative to: Internet Archive Wayback Machine (+3 more)

Typesense

Fast, typo-tolerant search engine with keyword and vector search

25k stars
850 forks
Last commit: 2d ago

Typesense is a developer-friendly search engine for instant, typo-tolerant search-as-you-type with faceting, filtering, geo search, and vector/semantic search APIs.

Alternative to: Algolia (+19 more)

SearXNG

Privacy-focused metasearch engine for aggregating web results

24.2k stars
2.4k forks
Last commit: 22h ago

SearXNG is a privacy-respecting metasearch engine that aggregates results from many search services without tracking or profiling users.

Alternative to: Google Search (+6 more)

ZincSearch

A lightweight open-source search engine for full-text indexing.

17.7k stars
762 forks
Last commit: 1mo ago

ZincSearch is a Go-based, lightweight search engine for full-text indexing with Elasticsearch API-compatible ingestion, a Vue UI, and a schema-less document model.

Alternative to: Elastic Cloud (Elasticsearch Service) (+7 more)

Onyx Community Edition

Self-hosted AI chat and enterprise search for any LLM

17.1k stars
2.3k forks
Last commit: 16h ago

Open-source platform for AI chat, RAG, agents, and enterprise search across your team’s connected knowledge sources, compatible with hosted and local LLMs.

Alternative to: Onyx (+19 more)