What is the best free alternative to Transkriptor?

We have 5 open source alternatives to Transkriptor that you can self-host for free.

Can I self-host an alternative to Transkriptor?

Yes! All 5 alternatives listed here can be self-hosted on your own servers, giving you full control over your data and privacy.

Are these Transkriptor alternatives really free?

Yes, all alternatives are open source and free to use. Some may offer paid hosting or premium features, but the core software is always free.

Best Self-hosted Alternatives to Transkriptor

A curated collection of the 5 best self-hosted alternatives to Transkriptor.

Transkriptor is an AI transcription service that converts audio and video into editable text, offering speaker recognition, timestamps, in-browser editing and export options for meetings, interviews, podcasts and content creation workflows.

ebook2audiobook

Self-hostable tool to convert non-DRM eBooks into audiobooks with chapter support, metadata, multilingual TTS engines, and optional voice cloning via a web UI or CLI.

ebook2audiobook is a tool for generating audiobooks from non-DRM, legally acquired eBooks using multiple text-to-speech (TTS) engines. It can run with a Gradio web interface or in headless/CLI mode, and supports multilingual narration with optional voice cloning.

Key Features

Converts many input formats including EPUB, MOBI/AZW3, FB2, PDF, DOC/DOCX, HTML, RTF, TXT, and image-based documents
OCR support for scanned pages and image-based eBooks
Multiple TTS engine options (including XTTSv2 and others) with broad language coverage
Optional voice cloning using a provided reference voice file
Supports custom XTTSv2 model uploads (e.g., zipped model artifacts)
Outputs common audiobook/audio formats including MP3, M4B, M4A, AAC, FLAC, OGG, WAV, and WebM
Runs on CPU or accelerators (CUDA and other backends depending on environment)

Use Cases

Converting personal eBook libraries into listenable audiobooks with chapters and metadata
Producing multilingual narration for accessibility, language learning, or travel
Creating custom-voice narration for personal use using voice cloning

Limitations and Considerations

Intended for non-DRM, legally acquired eBooks; DRM-protected sources require separate lawful handling
OCR quality and document structure (especially EPUB chapter boundaries) can affect chapter splitting and narration results

It is well-suited for users who want a local web UI and batch-capable CLI for audiobook generation, while keeping flexibility in TTS engines, languages, and output formats. GPU acceleration and suitable TTS models can significantly improve throughput and audio quality for larger books.

18.3k stars

1.5k forks

Speakr

Speakr is a self-hosted web app for recording or uploading audio, transcribing with AI (including diarization), and turning conversations into searchable, shareable notes.

Speakr is a personal, self-hosted web application that turns audio recordings into organized, searchable notes using AI transcription and post-processing. It supports both cloud and self-hosted ASR/LLM backends and is designed for privacy-conscious individuals and teams.

Key Features

In-browser recording and audio file upload
AI transcription with optional speaker diarization and audio-transcript sync
Voice profiles via speaker embeddings when using a compatible WhisperX ASR service
Interactive chat and semantic “inquire” mode to query recordings using natural language
Tag-based organization with custom prompts, ASR settings, and prompt stacking
Sharing and collaboration with granular permissions, groups, and group-scoped tags
Retention policies and automatic deletion with tag-based protection
REST API v1 with OpenAPI/Swagger UI
Single Sign-On via OIDC providers
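
The retention feature listed above (automatic deletion with tag-based protection) can be illustrated with a minimal sketch. This is not Speakr's implementation: the `Recording` type, its fields, and the `protected_tags` parameter are hypothetical names chosen only to show the rule that a recording past its age limit is kept if it carries a protected tag.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta


@dataclass
class Recording:
    # Hypothetical record type for illustration; not Speakr's data model.
    title: str
    created_at: datetime
    tags: set = field(default_factory=set)


def expired_recordings(recordings, now, max_age_days, protected_tags):
    """Return recordings older than max_age_days that carry no protected tag."""
    cutoff = now - timedelta(days=max_age_days)
    protected = set(protected_tags)
    return [
        r for r in recordings
        if r.created_at < cutoff and not (r.tags & protected)
    ]
```

A scheduled job could call `expired_recordings` and delete whatever it returns; recordings tagged with any protected tag are never selected, regardless of age.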

Use Cases

Meeting and standup transcription with searchable summaries and action items
Research, interviews, and personal voice notes exported into a knowledge base
Team knowledge capture for architecture decisions and client calls with controlled sharing

Limitations and Considerations

Some advanced features (voice profiles/embeddings) require a separate WhisperX ASR service and typically a GPU
LLM-powered summaries/chat depend on configuring a compatible text model provider

Speakr combines transcription, organization, and collaboration in a single web UI, while keeping data under your control. Its tagging, sharing, and retention features make it suitable for both personal note-taking and team workflows around recorded conversations.

2.8k stars

220 forks

Scriberr

Scriberr is a self-hosted, privacy-focused AI transcription app for audio and video, with speaker diarization, word-level timestamps, summaries, and transcript chat.

Scriberr is an open-source application for transcribing audio and video locally, designed to keep recordings private by avoiding third-party cloud processing. It provides a web-based interface to upload, record, review, and work with transcripts, with optional integration for LLM-powered transcript chat and summaries.

Key Features

Local/offline transcription using modern speech-to-text models (including Whisper and newer model options)
Speaker diarization to separate and label different speakers
Word-level timestamps and transcript playback follow-along with seeking from text
Built-in audio recorder plus note-taking/annotation while listening
Transcript summarization and “chat with your audio” (supports local LLMs via Ollama and OpenAI-compatible providers)
Automation-friendly features such as an API and folder watcher for auto-processing new files
PWA support for a more native app-like experience on desktop and mobile
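
The folder-watcher idea mentioned above can be sketched as a simple polling check. This is an assumption-laden illustration, not Scriberr's code: `find_new_files`, the `seen` set, and the extension list are hypothetical, and a real watcher might rely on filesystem events rather than polling.

```python
from pathlib import Path

# Assumed set of audio extensions, chosen for illustration only.
AUDIO_EXTENSIONS = {".mp3", ".wav", ".m4a", ".flac", ".ogg"}


def find_new_files(folder, seen):
    """Return audio files in folder that are not yet in seen, and record them."""
    new_files = []
    for path in sorted(Path(folder).iterdir()):
        is_audio = path.is_file() and path.suffix.lower() in AUDIO_EXTENSIONS
        if is_audio and path not in seen:
            seen.add(path)
            new_files.append(path)
    return new_files
```

Calling `find_new_files` on a timer and handing each returned path to a transcription job gives the basic auto-processing loop; files already in `seen` are skipped on later passes.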

Use Cases

Transcribe meetings, interviews, and lectures without uploading sensitive audio to external services
Process large batches of recordings automatically via folder watching and API-driven workflows
Create searchable, annotated transcripts and generate summaries for personal knowledge capture

Limitations and Considerations

High-accuracy transcription and diarization can be resource-intensive; GPU acceleration is recommended for best performance
Some advanced features (like transcript chat) may require configuring external or local LLM providers

Scriberr is a strong fit for privacy-conscious users who want reliable local transcription with a polished review experience and workflow automation options. It combines transcription, organization, and AI-assisted analysis into a single self-hostable service.

2.1k stars

152 forks

File Wizard

Self-hosted web UI for file conversion, OCR for PDFs/images, and local Whisper-based audio transcription, wrapping common CLI tools with background jobs and history.

File Wizard is a browser-based utility for converting files, running OCR on PDFs/images, and transcribing audio. It provides a simple web UI that orchestrates common command-line tools and local ML models, with job tracking and a persistent history.

Key Features

Convert between many document, image, audio, and video formats by wrapping external tools (configurable via a YAML settings file)
OCR for PDFs and images using Tesseract and OCRmyPDF, including generating searchable PDFs
Audio transcription using local Whisper models (faster-whisper), with subtitle-style outputs supported by Whisper tooling
Drag-and-drop web interface with responsive dark UI
Background job processing with real-time status updates and stored job history
Optional OAuth/OIDC-based access control configuration (can run without auth in local-only mode)
Optional CUDA-enabled container image for GPU-accelerated transcription

Use Cases

Convert office documents and ebooks into consistent archival formats (PDF, EPUB, DOCX)
Turn scanned PDFs into searchable documents with OCR
Create transcripts/subtitles from meeting recordings and other audio files

Limitations and Considerations

Not safe to expose publicly without strong authentication and isolation; wrapping converters can introduce arbitrary command execution risk if misconfigured
Conversion fidelity and supported formats depend on the installed external tools and their build options
Transcription performance varies significantly by model size and whether GPU acceleration is available

File Wizard fits well for homelabs and internal teams that want a single, lightweight web interface to run conversions, OCR workflows, and local speech-to-text processing. Its tool-based architecture makes it extensible, but it should be deployed with careful security controls when used beyond local environments.

818 stars

50 forks

ZipCaptions

Open-source PWA that generates live captions and transcripts in the browser; supports broadcasts, OBS/vMix integration, and optional Azure AI captions.

ZipCaptions is a browser-native, open-source application that produces live closed-captions and transcripts from audio sources. It runs as a Progressive Web App and focuses on client-side captioning with optional cloud-backed AI captioning for higher accuracy.

Key Features

In-browser real-time speech-to-text captioning (browser engine) without mandatory server processing.
Optional cloud AI captions using Azure Cognitive Services for improved accuracy (paid feature).
PWA installable experience; supports persistent overlay and browser integrations for live streams and broadcasts.
Streaming/broadcast support with joinable caption streams and direct integration guidance for OBS, vMix, and other production tools.
Local transcript storage with export options (SRT, VTT, TXT) for use with video or documentation workflows.
Multiple languages and dialect selection in settings to improve recognition quality.
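
To make the SRT export option above concrete, here is a minimal sketch of how timed transcript segments map onto SRT cues. It is not ZipCaptions code: the function names and the `(start_seconds, end_seconds, text)` tuple layout are assumptions made for illustration. The SRT layout itself (cue number, `HH:MM:SS,mmm --> HH:MM:SS,mmm`, text, blank line) is the standard one.

```python
def srt_timestamp(seconds):
    """Format a time in seconds as an SRT timestamp: HH:MM:SS,mmm."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(segments):
    """Render (start_seconds, end_seconds, text) tuples as numbered SRT cues."""
    cues = []
    for index, (start, end, text) in enumerate(segments, start=1):
        timing = f"{srt_timestamp(start)} --> {srt_timestamp(end)}"
        cues.append(f"{index}\n{timing}\n{text}\n")
    # Joining with a newline leaves one blank line between consecutive cues.
    return "\n".join(cues)
```

WebVTT output is similar in shape: it starts with a `WEBVTT` header line and uses a period instead of a comma before the milliseconds.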

Use Cases

Live event accessibility: provide open or closed captions for conferences, worship services, classrooms, and streamed events.
Broadcast/production workflows: feed live captions into OBS, vMix, or browser-source panels for real-time on-screen titles.
Post-session captioning: record and export session transcripts in subtitle formats for video publishing and archiving.

Limitations and Considerations

Cloud AI captions require Azure Cognitive Services and are restricted to paying supporters; browser engine remains the free/default option.
Browser and OS differences can affect microphone access and caption reliability (known issues documented for specific Chrome versions and some mobile builds).
Transcripts are stored locally per device by design; syncing across devices requires manual export/import.

ZipCaptions prioritizes accessibility-first, client-side captioning with optional cloud AI for higher accuracy. It is intended for event captioning and production integration where low-cost, privacy-conscious captioning is required.

57 stars

8 forks

Why choose an open source alternative?

• Data ownership: Keep your data on your own servers
• No vendor lock-in: Freedom to switch or modify at any time
• Cost savings: Reduce or eliminate subscription fees
• Transparency: Audit the code and know exactly what's running
