File Wizard

Browser-based file conversion, OCR, and audio transcription UI

777 stars
42 forks
Last commit: 2mo ago
Repo age: 1y old

File Wizard is a browser-based utility for converting files, running OCR on PDFs/images, and transcribing audio. It provides a simple web UI that orchestrates common command-line tools and local ML models, with job tracking and a persistent history.

Key Features

  • Convert between many document, image, audio, and video formats by wrapping external tools (configurable via a YAML settings file)
  • OCR for PDFs and images using Tesseract and OCRmyPDF, including generating searchable PDFs
  • Audio transcription using local Whisper models (via faster-whisper), including the subtitle-style output formats supported by the Whisper tooling
  • Drag-and-drop web interface with responsive dark UI
  • Background job processing with real-time status updates and stored job history
  • Optional OAuth/OIDC-based access control configuration (can run without auth in local-only mode)
  • Optional CUDA-enabled container image for GPU-accelerated transcription
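
To illustrate what a subtitle-style transcription output involves, here is a minimal Python sketch that turns timed transcript segments into SubRip (SRT) text. It is an illustration only, not File Wizard's code: the function names `srt_timestamp` and `to_srt` and the `(start, end, text)` tuple shape are assumptions made for this example.

```python
from typing import Iterable, Tuple

# Hypothetical segment shape for this sketch: (start_seconds, end_seconds, text).
Segment = Tuple[float, float, str]


def srt_timestamp(seconds: float) -> str:
    """Format a time offset in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    millis = round(seconds * 1000)
    hours, millis = divmod(millis, 3_600_000)
    minutes, millis = divmod(millis, 60_000)
    secs, millis = divmod(millis, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def to_srt(segments: Iterable[Segment]) -> str:
    """Render segments as numbered SRT cues separated by blank lines."""
    blocks = []
    for index, (start, end, text) in enumerate(segments, start=1):
        blocks.append(
            f"{index}\n"
            f"{srt_timestamp(start)} --> {srt_timestamp(end)}\n"
            f"{text.strip()}\n"
        )
    return "\n".join(blocks)


if __name__ == "__main__":
    print(to_srt([(0.0, 1.25, " Hello "), (1.25, 2.0, "world")]))
```

A transcription backend that yields start time, end time, and text per segment could feed this function directly; whether File Wizard structures its output this way is not stated in the description above.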

Use Cases

  • Convert office documents and ebooks into consistent archival formats (PDF, EPUB, DOCX)
  • Turn scanned PDFs into searchable documents with OCR
  • Create transcripts/subtitles from meeting recordings and other audio files

Limitations and Considerations

  • Not safe to expose publicly without strong authentication and isolation; because it wraps external converters, a misconfigured tool definition can allow arbitrary command execution
  • Conversion fidelity and supported formats depend on the installed external tools and their build options
  • Transcription performance varies significantly by model size and whether GPU acceleration is available
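
The command execution risk can be made concrete with a short Python sketch. It assumes a hypothetical command template that uses `{input}` and `{output}` placeholders; the template syntax and the `build_argv` helper are inventions for this example and do not describe File Wizard's actual settings format. The idea is to split the template into arguments first and substitute file names afterwards, so an uploaded file name can never add arguments or shell syntax of its own.

```python
import re
import shlex
from typing import List

# Hypothetical placeholder syntax for this sketch: {input} and {output}.
PLACEHOLDER = re.compile(r"\{(input|output)\}")


def build_argv(template: str, input_path: str, output_path: str) -> List[str]:
    """Expand a command template into an argv list suitable for shell=False.

    The template is tokenized BEFORE substitution, so spaces, quotes, or
    semicolons inside a file name stay within a single argument.
    """
    values = {"input": input_path, "output": output_path}
    return [
        # A function replacement inserts the value literally, in one pass.
        PLACEHOLDER.sub(lambda match: values[match.group(1)], token)
        for token in shlex.split(template)
    ]


if __name__ == "__main__":
    argv = build_argv("convert {input} {output}", "scan 1; rm -rf ~.png", "out.jpg")
    print(argv)
    # The list would then be passed without a shell, for example:
    # subprocess.run(argv, check=True, timeout=300)
```

This does not cover every risk. A file name that begins with `-` can still be read as an option by the wrapped tool, so storing uploads under server-generated names and running converters in an isolated container remain sensible precautions.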

File Wizard is a good fit for homelabs and internal teams that want a single, lightweight web interface for file conversion, OCR workflows, and local speech-to-text processing. Its tool-based architecture makes it extensible, but it should be deployed with careful security controls when used outside a local environment.

Similar Services

Stirling PDF

Self-hosted PDF editing, conversion, OCR, and automation platform

73.1k stars
6.2k forks
Last commit: 16h ago

Open-source PDF platform to edit, convert, OCR, sign, redact, and automate PDF workflows via a web UI and REST API.

Alternative to: Adobe Acrobat (+19 more)

Paperless-ngx

Document management system with OCR, search, and automated filing

35.7k stars
2.3k forks
Last commit: 17h ago

Paperless-ngx is an open-source document management system that ingests scans and files, runs OCR, and turns them into a searchable, taggable document archive.

Alternative to: DocuWare (+6 more)

Reactive Resume

Privacy-focused, open-source resume builder

34.5k stars
3.8k forks
Last commit: 9d ago

Open-source resume builder for creating, customizing, exporting, and publishing resumes with templates, PDF export, public sharing, and optional OpenAI assistance.

Alternative to: Resume.io (+5 more)

CyberChef

Browser-based toolkit for data decoding, encoding, and analysis

33.8k stars
3.8k forks
Last commit: 5mo ago

CyberChef is a web-based "cyber" toolkit for encoding/decoding, encryption/decryption, compression, hashing, parsing, and data transformation using drag-and-drop recipes.

ArchiveBox

Open-source self-hosted web archiving and snapshotting tool

26.4k stars
1.4k forks
Last commit: 11d ago

Self-hosted tool to collect and preserve webpages, media, and bookmarks in durable formats (HTML, PDF, WARC, MP4) with a CLI, web UI, and search.

Alternative to: Internet Archive Wayback Machine (+3 more)

ebook2audiobook

Convert eBooks into audiobooks with TTS and optional voice cloning

17k stars
1.4k forks
Last commit: 1d ago

Self-hostable tool to convert non-DRM eBooks into audiobooks with chapter support, metadata, multilingual TTS engines, and optional voice cloning via a web UI or CLI.

Alternative to: Speechify (+7 more)