File Wizard
Browser-based file conversion, OCR, and audio transcription UI
File Wizard is a browser-based utility for converting files, running OCR on PDFs/images, and transcribing audio. It provides a simple web UI that orchestrates common command-line tools and local ML models, with job tracking and a persistent history.
Key Features
- Convert between many document, image, audio, and video formats by wrapping external tools (configurable via a YAML settings file)
- OCR for PDFs and images using Tesseract and OCRmyPDF, including generating searchable PDFs
- Audio transcription using local Whisper models (via faster-whisper), including subtitle-style output
- Drag-and-drop web interface with responsive dark UI
- Background job processing with real-time status updates and stored job history
- Optional OAuth/OIDC-based access control configuration (can run without auth in local-only mode)
- Optional CUDA-enabled container image for GPU-accelerated transcription
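As a rough illustration of the background job processing and status tracking mentioned above, here is a minimal sketch. It is not File Wizard's actual implementation: the JobStore class, the status names, and the in-memory dictionary are all assumptions made for the example, and a real deployment would persist job history rather than hold it in memory.

```python
import uuid
from concurrent.futures import ThreadPoolExecutor
from threading import Lock


class JobStore:
    """Hypothetical in-memory job registry with simple status tracking."""

    def __init__(self, max_workers=2):
        self._pool = ThreadPoolExecutor(max_workers=max_workers)
        self._jobs = {}
        self._lock = Lock()

    def submit(self, func, *args):
        """Register a job as queued and hand it to the worker pool."""
        job_id = uuid.uuid4().hex
        with self._lock:
            self._jobs[job_id] = {"status": "queued", "result": None, "error": None}
        future = self._pool.submit(self._run, job_id, func, *args)
        return job_id, future

    def _run(self, job_id, func, *args):
        """Run the job in a worker thread, recording its outcome."""
        self._update(job_id, status="running")
        try:
            result = func(*args)
        except Exception as exc:
            self._update(job_id, status="failed", error=str(exc))
        else:
            self._update(job_id, status="completed", result=result)

    def _update(self, job_id, **fields):
        with self._lock:
            self._jobs[job_id].update(fields)

    def get(self, job_id):
        """Return a copy of the job record, e.g. for a status endpoint to serve."""
        with self._lock:
            return dict(self._jobs[job_id])
```

A web endpoint could call get(job_id) on each poll to report progress; the returned future is only there so callers (and tests) can wait for completion.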
Use Cases
- Convert office documents and ebooks into consistent archival formats (PDF, EPUB, DOCX)
- Turn scanned PDFs into searchable documents with OCR
- Create transcripts/subtitles from meeting recordings and other audio files
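To make the transcript-to-subtitle use case concrete, the sketch below turns timed transcript segments into SubRip (SRT) text. The choice of SRT is an assumption, since the listing only mentions subtitle-style output, and the helper names are hypothetical. faster-whisper yields segments that carry a start time, an end time, and text; plain (start, end, text) tuples stand in for them here so the example runs without the library or a model.

```python
def srt_timestamp(seconds):
    """Format a time in seconds as HH:MM:SS,mmm (the SubRip timestamp layout)."""
    millis = round(seconds * 1000)
    hours, millis = divmod(millis, 3_600_000)
    minutes, millis = divmod(millis, 60_000)
    secs, millis = divmod(millis, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def segments_to_srt(segments):
    """Build SRT text from an iterable of (start, end, text) tuples (assumed shape)."""
    blocks = []
    for index, (start, end, text) in enumerate(segments, start=1):
        blocks.append(
            f"{index}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text.strip()}\n"
        )
    # Cues are separated by a blank line.
    return "\n".join(blocks)


if __name__ == "__main__":
    demo = [(0.0, 1.5, " Hello everyone."), (1.5, 3.0, " Let's get started.")]
    print(segments_to_srt(demo))
```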
Limitations and Considerations
- Not safe to expose publicly without strong authentication and isolation; because conversions run by wrapping external command-line tools, a misconfigured tool definition can allow arbitrary command execution
- Conversion fidelity and supported formats depend on the installed external tools and their build options
- Transcription performance varies significantly by model size and whether GPU acceleration is available
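The command execution risk noted above can be reduced by how a wrapper builds its commands. The following is a minimal sketch under stated assumptions, not File Wizard's code: the TOOLS table is a hypothetical stand-in for what a parsed YAML settings file might contain, and the "echo-demo" tool simply invokes the Python interpreter so the example is runnable anywhere. It shows two habits: checking the requested output format against an allowlist, and passing file names as separate argument-list elements without a shell, so shell metacharacters in a file name stay inert.

```python
import subprocess
import sys

# Hypothetical tool table; in practice this would come from a settings file.
TOOLS = {
    "echo-demo": {
        # Argument list with whole-token placeholders, never a shell string.
        "command": [
            sys.executable, "-c",
            "import sys; print(sys.argv[1], '->', sys.argv[2])",
            "{input}", "{output}",
        ],
        "formats": {"txt", "pdf"},
    },
}


def build_command(tool_name, input_path, output_path):
    """Return an argv list for the tool, rejecting output formats not allowlisted."""
    tool = TOOLS[tool_name]  # raises KeyError for unknown tools
    ext = output_path.rsplit(".", 1)[-1].lower()
    if ext not in tool["formats"]:
        raise ValueError(f"output format {ext!r} is not allowed for {tool_name}")
    values = {"{input}": input_path, "{output}": output_path}
    # Only exact placeholder tokens are substituted; each path stays one argument.
    return [values.get(part, part) for part in tool["command"]]


def run_tool(tool_name, input_path, output_path):
    """Run the tool without a shell and return its standard output."""
    argv = build_command(tool_name, input_path, output_path)
    done = subprocess.run(argv, capture_output=True, text=True, check=True)
    return done.stdout.strip()
```

With this shape, a file named "in; echo pwned.docx" is handed to the tool as a single literal argument rather than being interpreted by a shell. It does not replace authentication or container isolation, which the listing recommends for anything beyond local use.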
File Wizard is a good fit for homelabs and internal teams that want a single, lightweight web interface for file conversion, OCR workflows, and local speech-to-text processing. Its tool-based architecture makes it extensible, but it should be deployed behind strong authentication and isolation when used beyond local environments.
Similar Services

Stirling PDF
Self-hosted PDF editing, conversion, OCR, and automation platform
Open-source PDF platform to edit, convert, OCR, sign, redact, and automate PDF workflows via a web UI and REST API.

Paperless-ngx
Document management system with OCR, search, and automated filing
Paperless-ngx is an open-source document management system that ingests scans and files, runs OCR, and turns them into a searchable, taggable document archive.

Reactive Resume
Privacy-focused, open-source resume builder
Open-source resume builder for creating, customizing, exporting and publishing resumes with templates, PDF export, public sharing and optional OpenAI assistance.

CyberChef
Browser-based toolkit for data decoding, encoding and analysis
CyberChef is a web-based “cyber” toolkit for encoding/decoding, encryption/decryption, compression, hashing, parsing, and data transformation using drag-and-drop recipes.

ArchiveBox
Open-source self-hosted web archiving and snapshotting tool
Self-hosted tool to collect and preserve webpages, media, and bookmarks in durable formats (HTML, PDF, WARC, MP4) with a CLI, web UI, and search.

ebook2audiobook
Convert eBooks into audiobooks with TTS and optional voice cloning
Self-hostable tool to convert non-DRM eBooks into audiobooks with chapter support, metadata, multilingual TTS engines, and optional voice cloning via a web UI or CLI.

Tech Stack: JavaScript, FastAPI, Uvicorn, HTML, Docker, Python, CSS