Willow
Open-source, privacy-focused voice assistant platform
Willow is an open-source, privacy-focused voice assistant platform designed for low-cost ESP32-S3 hardware. It provides fast on-device wake-word and command recognition and can optionally integrate with a self-hosted inference server for high-quality speech-to-text, TTS, and LLM tasks. (heywillow.io)
Key Features
- On-device wake-word engine and voice-activity detection with configurable wake words and up to hundreds of on-device commands. (heywillow.io)
- Integration with Home Assistant, openHAB and generic REST endpoints for home automation and custom workflows. (heywillow.io)
- Willow Inference Server (WIS) option: a performance-optimized server that supports ASR/STT (Whisper models), TTS, and optional LLM inference over REST, WebRTC, and WebSocket transports. WIS targets CUDA GPUs for low-latency workloads and includes deployment scripts and Docker Compose support. (github.com)
- Device management and OTA flashing via the Willow Application Server (WAS) with a provided Docker image to simplify onboarding. (heywillow.io)
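Since WIS exposes a REST transport for ASR/STT, a client can submit audio over plain HTTP. The sketch below shows the general shape of such a call; the `/api/asr` route, `model` parameter, and `wis.local:19000` host are illustrative assumptions, not the documented WIS API, so check your server's docs for the real routes.

```python
# Minimal sketch of submitting WAV audio to a self-hosted Willow Inference
# Server over REST. Endpoint path, query parameter, and host are ASSUMPTIONS
# for illustration -- consult the WIS documentation for the actual API.
import urllib.request


def build_asr_request(wis_base: str, wav_bytes: bytes,
                      model: str = "whisper-base") -> urllib.request.Request:
    """Build a POST request that submits raw WAV audio for transcription."""
    url = f"{wis_base.rstrip('/')}/api/asr?model={model}"  # assumed route
    return urllib.request.Request(
        url,
        data=wav_bytes,
        headers={"Content-Type": "audio/wav"},
        method="POST",
    )


if __name__ == "__main__":
    req = build_asr_request("http://wis.local:19000", b"\x00" * 16)
    # urllib.request.urlopen(req)  # uncomment against a live WIS instance
    print(req.full_url)
```

Separating request construction from the network call keeps the transcription endpoint easy to swap once the real WIS route is known.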
Use Cases
- Privacy-first smart-home voice control: local wake-word and command recognition that triggers Home Assistant automations without cloud transcription.
- On-premises speech processing: self-hosted WIS for low-latency ASR/STT and TTS for accessibility, transcription, or edge assistant applications.
- Developer integrations: embed Willow devices into custom REST/WebRTC workflows or use WIS to add LLM-powered assistants to local networks. (github.com)
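For the custom REST workflow above, a device that posts recognized text to a generic REST endpoint needs only a small HTTP listener on the local network. This is a minimal sketch of such an endpoint; the plain-text payload shape handled here is an assumption, not Willow's documented wire format.

```python
# Minimal sketch of a custom REST endpoint that a Willow device could POST
# recognized command text to. The plain-text body format is an ASSUMPTION
# for illustration, not Willow's documented payload shape.
from http.server import BaseHTTPRequestHandler, HTTPServer


class CommandHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        body = self.rfile.read(length).decode("utf-8")
        print(f"received command text: {body}")  # e.g. "turn on the lights"
        self.send_response(200)
        self.end_headers()
        self.wfile.write(b"ok")


def make_server(port: int = 8000) -> HTTPServer:
    """Bind the command endpoint on all interfaces at the given port."""
    return HTTPServer(("0.0.0.0", port), CommandHandler)


if __name__ == "__main__":
    make_server().serve_forever()
```

From here the handler can forward the text to Home Assistant, openHAB, or any other automation backend.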
Limitations and Considerations
- Advanced WIS features (LLM, high-quality TTS) expect CUDA-capable GPUs and NVIDIA drivers; CPU-only setups are supported but significantly slower and may disable some features. (github.com)
- Primary device target is the ESP32-S3-BOX family; other hardware may require additional porting or tuning. (heywillow.io)
Willow combines a small-footprint device runtime with an optional, high-performance inference server to enable private, low-latency voice assistants and on-premises speech workflows. It is actively developed with documentation, Docker deployment options, and community discussion channels for support. (heywillow.io)
Similar Services

Open WebUI
Extensible, offline-capable web interface for LLM interactions
Feature-rich, self-hosted AI interface that integrates Ollama and OpenAI-compatible APIs, offers RAG, vector DB support, image tools, RBAC and observability.


AnythingLLM
All-in-one AI chat app with RAG, agents, and multi-model support
AnythingLLM is an all-in-one desktop and Docker app for chatting with documents using RAG, running AI agents, and connecting to local or hosted LLMs and vector databases.

LibreChat
Self-hosted multi-provider AI chat UI with agents and tools
LibreChat is a self-hosted AI chat platform that supports multiple LLM providers, custom endpoints, agents/tools, file and image chat, conversation search, and presets.


Netron
Visualizer for neural network and machine learning models
Netron is a model graph viewer for inspecting neural network and ML formats such as ONNX, TensorFlow Lite, PyTorch, Keras, Core ML, and more.

Khoj
Open-source personal AI for chat, semantic search and agents
Self-hostable personal AI 'second brain' for chat, semantic search, custom agents, automations and integration with local or cloud LLMs.

Perplexica
Privacy-focused AI answering engine with web search and citations
Self-hosted AI answering engine that combines web search with local or hosted LLMs to generate cited answers, with search history and file uploads.
Tech Stack: Docker, Python, WebRTC, C