Distributed P2P web search engine and intranet search appliance

3.8k stars
472 forks
Last commit: 8d ago
Repo age: 11y old

YaCy is a self-hosted search engine stack combining a web crawler, an index, and a web UI for searching and managing content. It can run as a standalone search portal, an intranet search appliance, or as part of a decentralized peer-to-peer network that exchanges index data for web search.

Key Features

  • Built-in web crawler with scheduling to keep indexes fresh
  • Search UI plus administration interface for configuring crawls, indexes, and peers
  • Peer-to-peer mode for sharing index data without relying on a central operator
  • Standalone mode for private, local-only search results from your own index
  • Intranet search use case with network scanning to discover HTTP, FTP, and SMB servers
  • HTTP-based interfaces with XML/JSON outputs for many pages and functions
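
The HTTP interfaces mentioned in the last feature can be queried from a script. The following is a minimal sketch, not an authoritative client. It assumes a YaCy peer running locally on port 8090, a JSON search endpoint at `/yacysearch.json` that accepts `query` and `maximumRecords` parameters, and a response laid out as `channels` containing `items` with `title` and `link` fields. All of these are assumptions that should be checked against your own instance; the helper names `build_search_url`, `extract_results`, and `search` are hypothetical.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumption: a local YaCy peer on port 8090; change this for your setup.
BASE_URL = "http://localhost:8090"


def build_search_url(query, count=10, base_url=BASE_URL):
    """Build a URL for the (assumed) JSON variant of the search page."""
    params = urlencode({"query": query, "maximumRecords": count})
    return f"{base_url}/yacysearch.json?{params}"


def extract_results(payload):
    """Return (title, link) pairs from an assumed channels/items layout."""
    results = []
    for channel in payload.get("channels", []):
        for item in channel.get("items", []):
            results.append((item.get("title", ""), item.get("link", "")))
    return results


def search(query, count=10):
    """Fetch and parse search results; requires a reachable YaCy instance."""
    with urlopen(build_search_url(query, count), timeout=10) as resp:
        return extract_results(json.load(resp))
```

Calling `search("self-hosted search")` against a running instance would return a list of title and link pairs. Keeping URL construction and response parsing in separate functions means both can be checked without network access.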

Use Cases

  • Run a private search portal for a curated set of websites you crawl
  • Provide intranet search across internal web services and shared resources
  • Participate in a community-operated decentralized web search network

Limitations and Considerations

  • Precompiled packages may be released infrequently; building from source is commonly recommended
  • Requires Java (11+) and can be resource-intensive depending on crawl and index size

YaCy is suited to organizations and individuals who want control over crawling and indexing, and who prefer privacy-aware search without dependence on a centralized search provider. Its flexible modes make it useful both for private indexing and for distributed web search participation.

Similar Services

Meilisearch

Fast search engine API with full-text, vector, and hybrid search

55.4k stars
2.3k forks
Last commit: 2d ago

Meilisearch is a lightning-fast search engine API for apps and websites, offering typo-tolerant full-text search plus vector and AI-ready hybrid retrieval.

Alternative to: Algolia (+16 more)

ArchiveBox

Open-source self-hosted web archiving and snapshotting tool

26.4k stars
1.4k forks
Last commit: 11d ago

Self-hosted tool to collect and preserve webpages, media, and bookmarks in durable formats (HTML, PDF, WARC, MP4) with a CLI, web UI, and search.

Alternative to: Internet Archive Wayback Machine (+3 more)

Typesense

Fast, typo-tolerant search engine with keyword and vector search

25k stars
850 forks
Last commit: 2d ago

Typesense is a developer-friendly search engine for instant, typo-tolerant search-as-you-type with faceting, filtering, geo search, and vector/semantic search APIs.

Alternative to: Algolia (+19 more)

SearXNG

Privacy-focused metasearch engine for aggregating web results

24.2k stars
2.4k forks
Last commit: 22h ago

SearXNG is a privacy-respecting metasearch engine that aggregates results from many search services without tracking or profiling users.

Alternative to: Google Search (+6 more)

ZincSearch

A lightweight open-source search engine for full-text indexing.

17.7k stars
762 forks
Last commit: 1mo ago

ZincSearch is a Go-based, lightweight search engine for full-text indexing with Elasticsearch API-compatible ingestion, a Vue UI, and a schema-less document model.

Alternative to: Elastic Cloud (Elasticsearch Service) (+7 more)

Onyx Community Edition

Self-hosted AI chat and enterprise search for any LLM

17.1k stars
2.3k forks
Last commit: 16h ago

Open-source platform for AI chat, RAG, agents, and enterprise search across your team’s connected knowledge sources, compatible with hosted and local LLMs.

Alternative to: Onyx (+19 more)