
Khoj
Open-source personal AI for chat, semantic search and agents

Khoj is an open-source personal AI platform that combines chat, semantic document search, custom agents and scheduled automations. It can run locally or as a cloud-hosted service and integrates with local or remote LLMs to answer questions, generate content and automate research.
Key Features
- Multi-client access: web, desktop, Obsidian, Emacs, mobile (PWA) and chat integrations (e.g., WhatsApp).
- Model-agnostic LLM support: connect local GGUF models or remote OpenAI-compatible, Anthropic and Google endpoints, letting the same assistant run with on-device or cloud models (see the sketch after this list).
- Semantic search and embeddings: document ingestion (PDF, Markdown, Word, org-mode, Notion, images) with vector storage and retrieval for fast, contextual search.
- Custom agents and automations: build agents with distinct personas, tools and knowledge bases; schedule research tasks and email newsletters.
- Document processing and code tools: built-in extractors, simple code execution sandbox support (local Terrarium or remote sandboxes) and image generation features.
- Enterprise & self-hosting options: deploy via Docker or pip, use Postgres with pgvector for embeddings, and configure authentication and domains.
Use Cases
- Personal knowledge management: query a private document corpus and get grounded answers across notes, PDFs and files (see the sketch after this list).
- Research automation: schedule recurring research queries and receive summarized results by email.
- Team/private deployments: host a private assistant for a team with custom agents, model selection and on-premise data control.
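As a rough illustration of querying a self-hosted instance programmatically, the sketch below sends a chat question over HTTP. The endpoint path, port, query parameter and auth scheme are all assumptions about a typical self-hosted setup; consult the Khoj API documentation for the actual interface.

```python
# Hypothetical sketch of querying a self-hosted Khoj instance over HTTP.
# The endpoint path, port, parameter names and auth header are assumptions;
# check the Khoj API docs for the real interface before using this.
import requests

KHOJ_URL = "http://localhost:42110"   # assumed self-hosted address
API_KEY = "your-khoj-api-key"         # hypothetical token from the admin panel

resp = requests.get(
    f"{KHOJ_URL}/api/chat",
    params={"q": "What did my meeting notes say about Q3 deadlines?"},
    headers={"Authorization": f"Bearer {API_KEY}"},
    timeout=60,
)
resp.raise_for_status()
print(resp.json())
```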
Limitations and Considerations
- Some optional integrations require extra setup or external services (e.g., code sandboxes, email providers); self-hosting needs correct environment configuration.
- Some clients and plugins may be unmaintained or platform-specific; check the compatibility and maintenance status of a connector in the docs before relying on it.
Khoj is designed to be extensible and model-agnostic, emphasizing private data control and flexible deployment. It is suited for individuals and teams who need a searchable, automatable assistant that can run with local or cloud language models.
Tech Stack: Django, FastAPI, Docker, PyTorch, LangChain
Similar Services

Ollama
Run and manage large language models locally with an API
Ollama is a local LLM runtime that lets you pull, run, and customize models, offering a CLI and REST API for chat, generation, and embeddings.
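For context, a minimal sketch of calling Ollama's local REST API; it assumes the server is running on its default port and that the named model has already been pulled (e.g., `ollama pull llama3`).

```python
# Quick sketch of Ollama's local REST API (default port 11434).
# Assumes the "llama3" model has already been pulled with `ollama pull llama3`.
import requests

resp = requests.post(
    "http://localhost:11434/api/generate",
    json={
        "model": "llama3",
        "prompt": "Explain semantic search in one sentence.",
        "stream": False,  # return a single JSON object instead of a stream
    },
    timeout=120,
)
resp.raise_for_status()
print(resp.json()["response"])
```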

Open WebUI
Extensible, offline-capable web interface for LLM interactions
Feature-rich, self-hosted AI interface that integrates Ollama and OpenAI-compatible APIs, offers RAG, vector DB support, image tools, RBAC and observability.

LocalAI
OpenAI-compatible local AI inference server and API
Run LLMs, image, and audio models locally with an OpenAI-compatible API, optional GPU acceleration, and a built-in web UI for managing and testing models.

Jina
Cloud-native Python framework for serving multimodal AI services
Open-source Python framework to build, scale, and deploy multimodal AI services and pipelines with gRPC/HTTP/WebSocket support and Kubernetes/Docker integration.

Paperless-AI
AI extension for Paperless‑ngx providing automated analysis and RAG
Extension for Paperless‑ngx that uses OpenAI-compatible backends and Ollama to auto-classify, tag, index, and enable RAG-powered document chat and semantic search.
