
Diskover
File system indexing, search, and storage analytics platform
Diskover is a data management and analytics platform for unstructured file data that crawls storage, enriches file metadata, and indexes it for fast search and reporting. It is designed to help teams understand what they have, where it lives, and how storage is being used.
Key Features
- Crawls and indexes heterogeneous storage (local file systems, NFS/SMB shares, and other supported sources)
- Elasticsearch-backed indexing for fast file search and filtering
- Storage usage analytics to identify cold data, growth trends, and large consumers
- Duplicate file discovery and wasted-space analysis
- Extensible metadata enrichment via plugins
- Web UI for search, reporting, and operational visibility
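As a rough illustration of what crawling, metadata collection, and duplicate discovery involve, here is a minimal Python sketch. It is not Diskover's code or API: the names `crawl`, `sha256_of`, and `find_duplicates` and the record fields are hypothetical, and it only walks a local directory tree using the standard library. A real deployment would send such records to Elasticsearch rather than hold them in memory.

```python
import hashlib
import os
import stat
from collections import defaultdict


def crawl(root):
    """Walk a directory tree and yield one metadata record per regular file."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.stat(path, follow_symlinks=False)
            except OSError:
                continue  # file vanished or is unreadable; skip it
            if not stat.S_ISREG(st.st_mode):
                continue  # ignore symlinks, sockets, devices
            yield {
                "path": path,
                "size": st.st_size,
                "mtime": st.st_mtime,
                "extension": os.path.splitext(name)[1].lower(),
            }


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large files are never loaded whole."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicates(records):
    """Group files by size first, then confirm duplicates by content hash."""
    by_size = defaultdict(list)
    for rec in records:
        by_size[rec["size"]].append(rec)

    groups = []
    for size, recs in by_size.items():
        if size == 0 or len(recs) < 2:
            continue  # a unique size cannot be a duplicate; skip empty files
        by_hash = defaultdict(list)
        for rec in recs:
            by_hash[sha256_of(rec["path"])].append(rec["path"])
        groups.extend(sorted(p) for p in by_hash.values() if len(p) > 1)
    return groups
```

Grouping by size before hashing keeps the expensive step (reading file contents) limited to files that could plausibly match, which matters when a share holds millions of files.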
Use Cases
- Storage capacity planning and cost optimization by finding cold/unused or duplicate data
- Rapid file discovery and investigation across large shares and mixed storage
- Data hygiene initiatives such as organizing, tagging, and preparing curated datasets for analytics
Limitations and Considerations
- Requires running and maintaining an Elasticsearch cluster for indexing and search
- Crawling very large environments may require tuning and scheduling to manage resource usage
Diskover fits organizations and advanced homelabs that need centralized visibility into file data sprawl and want searchable metadata at scale. It pairs a crawler/indexer with a web interface to turn unstructured storage into actionable insights for cleanup, governance, and operations.
Similar Services

Meilisearch
Fast search engine API with full-text, vector, and hybrid search
Meilisearch is a lightning-fast search engine API for apps and websites, offering typo-tolerant full-text search plus vector and AI-ready hybrid retrieval.

ArchiveBox
Open-source self-hosted web archiving and snapshotting tool
Self-hosted tool to collect and preserve webpages, media, and bookmarks in durable formats (HTML, PDF, WARC, MP4) with a CLI, web UI, and search.

Typesense
Fast, typo-tolerant search engine with keyword and vector search
Typesense is a developer-friendly search engine for instant, typo-tolerant search-as-you-type with faceting, filtering, geo search, and vector/semantic search APIs.

SearXNG
Privacy-focused metasearch engine for aggregating web results
SearXNG is a privacy-respecting metasearch engine that aggregates results from many search services without tracking or profiling users.

ZincSearch
Lightweight open-source search engine for full-text indexing
ZincSearch is a Go-based, lightweight search engine for full-text indexing with Elasticsearch API-compatible ingestion, a Vue UI, and a schema-less document model.

Onyx Community Edition
Self-hosted AI chat and enterprise search for any LLM
Open-source platform for AI chat, RAG, agents, and enterprise search across your team’s connected knowledge sources, compatible with hosted and local LLMs.

Tech Stack: JavaScript, HTML, Docker, Python, CSS, Elasticsearch, PHP