Speaches

OpenAI API-compatible server for speech-to-text and text-to-speech

2.8k stars
356 forks
Last commit: 20d ago
Repo age: 2y
Speaches is an OpenAI API-compatible server for speech-to-text, translation, and text-to-speech, designed to be a local “model server” for voice workflows. It supports streaming and realtime interactions so applications can transcribe or generate audio with minimal integration changes.

Key Features

  • OpenAI API compatibility for integrating with existing OpenAI SDKs and tools
  • Streaming transcription via Server-Sent Events (SSE) for incremental results
  • Speech-to-text powered by faster-whisper, with support for transcription and translation
  • Text-to-speech using Piper and Kokoro models
  • Realtime API support for low-latency voice interactions
  • Dynamic model loading and offloading based on request parameters and inactivity
  • CPU and GPU execution support
  • Deployable with Docker or Docker Compose, and highly configurable
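
To illustrate the streaming feature above, here is a minimal sketch of how a client might read Server-Sent Events using only the Python standard library. It implements the generic SSE framing (`data:` fields, blank-line event terminators, `:` comment lines) and makes no assumption about what Speaches puts inside each payload; the function name `iter_sse_data` and the sample lines are hypothetical.

```python
from typing import Iterable, Iterator


def iter_sse_data(lines: Iterable[str]) -> Iterator[str]:
    """Yield the data payload of each complete event in an SSE stream."""
    buffer = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line terminates the current event.
            if buffer:
                yield "\n".join(buffer)
                buffer = []
            continue
        if line.startswith(":"):
            # Lines starting with a colon are comments (often keep-alives).
            continue
        field, _, value = line.partition(":")
        if field == "data":
            # One leading space after the colon is not part of the value.
            buffer.append(value[1:] if value.startswith(" ") else value)
    # An event that is never terminated by a blank line is discarded.


# Hypothetical sample stream: two events and one keep-alive comment.
sample = [
    "data: Hello\n",
    "\n",
    ": keep-alive\n",
    "data: world\n",
    "data: again\n",
    "\n",
]
events = list(iter_sse_data(sample))
```

In a real client, `lines` would be the decoded lines of the HTTP response body, so each partial transcript can be handled as soon as its event arrives.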

Use Cases

  • Replace hosted speech APIs with a self-managed, OpenAI-compatible voice backend
  • Build realtime voice assistants that need streaming STT and fast TTS responses
  • Run batch transcription and translation pipelines over recordings, with optional sentiment analysis

Speaches is a practical choice when you want OpenAI-style endpoints for voice features while retaining control over models and infrastructure. It fits well into existing OpenAI-oriented application stacks while focusing specifically on TTS/STT workloads.
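
As a rough illustration of what "OpenAI-style endpoints" means in practice, the sketch below builds (but does not send) a text-to-speech request with the Python standard library. The `/v1/audio/speech` path and the `model`, `voice`, and `input` fields follow the OpenAI API; the host, port, model name, and voice name are placeholders assumed for illustration, so substitute the values from your own deployment.

```python
import json
import urllib.request


def build_speech_request(base_url: str, model: str, voice: str, text: str) -> urllib.request.Request:
    """Build a POST request shaped like an OpenAI text-to-speech call."""
    payload = {"model": model, "voice": voice, "input": text}
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/v1/audio/speech",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Placeholder values: the address, model, and voice are assumptions.
request = build_speech_request(
    base_url="http://localhost:8000",
    model="example-tts-model",
    voice="example-voice",
    text="Hello from a local server.",
)

# Sending it requires a running server, so it is left commented out:
# with urllib.request.urlopen(request) as response:
#     audio_bytes = response.read()
```

Because only the base URL differs from a hosted call, an application that already targets the OpenAI API can usually be repointed by changing that one setting.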

Similar Services

Ollama

Run and manage large language models locally with an API

159.6k stars
14.2k forks
Last commit: 16h ago

Ollama is a local LLM runtime that lets you pull, run, and customize models, offering a CLI and REST API for chat, generation, and embeddings.

Alternative to: OpenAI API (+15 more)

LocalAI

OpenAI-compatible local AI inference server and API

42.1k stars
3.4k forks
Last commit: 19h ago

Run LLMs, image, and audio models locally with an OpenAI-compatible API, optional GPU acceleration, and a built-in web UI for managing and testing models.

Alternative to: OpenAI API (+19 more)

Jina

Cloud-native Python framework for serving multimodal AI services

21.8k stars
2.2k forks
Last commit: 9mo ago

Open-source Python framework to build, scale, and deploy multimodal AI services and pipelines with gRPC/HTTP/WebSocket support and Kubernetes/Docker integration.

Alternative to: Baseten (+12 more)

Willow

Open-source, privacy-focused voice assistant platform

3k stars
113 forks
Last commit: 6mo ago

Self-hosted voice assistant platform for ESP32 devices with on-device wake-word and command recognition, Home Assistant integration, and an optional inference server for...

Alternative to: Amazon Alexa (+9 more)

Unblink

AI camera monitoring with federated vision workers

1.3k stars
152 forks
Last commit: 1d ago

Open-source AI camera monitoring that routes camera streams through a relay/node proxy and broadcasts frames to federated AI workers for detections, summaries, and alerts...

Alternative to: Blue Iris (+10 more)

withoutBG

Open-source image background removal with local models and hosted API

755 stars
33 forks
Last commit: 1mo ago

Open-source background-removal toolkit offering Focus/Snap local models, a Docker web app and Python SDK, plus a Pro API (Inferentia‑accelerated) for production use.

Alternative to: remove.bg