ebook2audiobook
Self-hostable tool, usable through a web UI or CLI, that converts non-DRM eBooks into audiobooks with chapter support, metadata, multilingual TTS engines, and optional voice cloning.
ebook2audiobook is a tool for generating audiobooks from non-DRM, legally acquired eBooks using multiple text-to-speech (TTS) engines. It can run with a Gradio web interface or in headless/CLI mode, and supports multilingual narration with optional voice cloning.
Key Features
- Converts many input formats including EPUB, MOBI/AZW3, FB2, PDF, DOC/DOCX, HTML, RTF, TXT, and image-based documents
- OCR support for scanned pages and image-based eBooks
- Multiple TTS engine options (including XTTSv2 and others) with broad language coverage
- Optional voice cloning using a provided reference voice file
- Supports custom XTTSv2 model uploads (e.g., zipped model artifacts)
- Outputs common audiobook/audio formats including MP3, M4B, M4A, AAC, FLAC, OGG, WAV, and WebM
- Runs on CPU or accelerators (CUDA and other backends depending on environment)
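
As a rough illustration of the input and output format lists above, the following Python sketch pre-filters a set of files before a batch run. It is not part of ebook2audiobook: the names `SUPPORTED_INPUT`, `SUPPORTED_OUTPUT`, `is_supported_input`, and `plan_batch` are hypothetical, the extension sets are only transcribed from the feature list, and the tool's own accepted set may differ (image-based inputs are omitted because the list does not name their extensions).

```python
from pathlib import Path

# Assumption: extensions transcribed from the feature list above.
# The real tool may accept more (e.g. image formats) or fewer.
SUPPORTED_INPUT = {
    ".epub", ".mobi", ".azw3", ".fb2", ".pdf",
    ".doc", ".docx", ".html", ".rtf", ".txt",
}
SUPPORTED_OUTPUT = {"mp3", "m4b", "m4a", "aac", "flac", "ogg", "wav", "webm"}


def is_supported_input(path):
    """Return True if the file extension appears in the listed input formats."""
    return Path(path).suffix.lower() in SUPPORTED_INPUT


def plan_batch(paths, output_format="m4b"):
    """Split paths into (accepted, skipped) for a hypothetical batch run.

    Raises ValueError if the requested output format is not in the listed set.
    """
    fmt = output_format.lower()
    if fmt not in SUPPORTED_OUTPUT:
        raise ValueError(f"unsupported output format: {output_format!r}")
    accepted, skipped = [], []
    for p in paths:
        # Keep the original path string; only the extension is inspected.
        (accepted if is_supported_input(p) else skipped).append(str(p))
    return accepted, skipped


if __name__ == "__main__":
    ok, rejected = plan_batch(["novel.epub", "scan.pdf", "notes.xyz"], "mp3")
    print("convert:", ok)
    print("skip:", rejected)
```

A check like this only inspects file names; it says nothing about whether a file is DRM-protected or whether OCR will succeed on a scanned document.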
Use Cases
- Converting personal eBook libraries into listenable audiobooks with chapters and metadata
- Producing multilingual narration for accessibility, language learning, or travel
- Creating custom-voice narration for personal use using voice cloning
Limitations and Considerations
- Intended for non-DRM, legally acquired eBooks; DRM-protected files are outside its scope and require separate, lawful handling
- OCR quality and document structure (especially EPUB chapter boundaries) can affect chapter splitting and narration results
It is well suited to users who want a local web UI and a batch-capable CLI for audiobook generation while keeping flexibility in TTS engines, languages, and output formats. GPU acceleration can significantly improve throughput on larger books, and suitable TTS models can improve audio quality.


