
Paperless-ngx
Document management system with OCR, search, and automated filing

Paperless-ngx is a community-supported document management system that turns scanned paperwork and digital files into a searchable online archive. It ingests documents, performs OCR, and helps you organize and retrieve files using metadata and full-text search.
Key Features
- OCR processing to make scanned documents searchable and selectable, leveraging the Tesseract OCR engine
- Full-text search with relevance sorting, highlighting, auto-complete, and “similar documents” discovery
- Organization with tags, correspondents, document types, and configurable storage paths/filenames
- Modern web UI with dashboards, saved views, filtering, bulk edits, drag-and-drop uploads, and dark mode
- Workflow automation to apply rules and actions throughout the document pipeline
- Email ingestion with multiple accounts and rules, plus post-processing actions (mark read, delete, etc.)
- Multi-user permissions with global and per-object/document access control
- Document archival options including PDF/A storage for long-term preservation alongside originals
Use Cases
- Digitize and archive household or small-office paperwork (invoices, contracts, tax documents)
- Centralize document intake from scanners, folders, and email for consistent filing and retrieval
- Build a searchable compliance or record-keeping archive with controlled user access
Limitations and Considerations
- Documents, including their extracted text, are stored unencrypted by default, so deploy Paperless-ngx only on trusted infrastructure with appropriate access controls and backups

Paperless-ngx is well suited to replacing paper filing with a searchable digital archive while adding automation for tagging and routing. Its OCR and search capabilities make it practical for long-term document retention and fast retrieval.
Similar Services

Stirling PDF
Self-hosted PDF editing, conversion, OCR, and automation platform
Open-source PDF platform to edit, convert, OCR, sign, redact, and automate PDF workflows via a web UI and REST API.

Reactive Resume
Privacy-focused, open-source resume builder
Open-source resume builder for creating, customizing, exporting and publishing resumes with templates, PDF export, public sharing and optional OpenAI assistance.

CyberChef
Browser-based toolkit for data decoding, encoding and analysis
CyberChef is a web-based “cyber” toolkit for encoding/decoding, encryption/decryption, compression, hashing, parsing, and data transformation using drag-and-drop recipes.

ArchiveBox
Open-source self-hosted web archiving and snapshotting tool
Self-hosted tool to collect and preserve webpages, media, and bookmarks in durable formats (HTML, PDF, WARC, MP4) with a CLI, web UI, and search.

ebook2audiobook
Convert eBooks into audiobooks with TTS and optional voice cloning
Self-hostable tool to convert non-DRM eBooks into audiobooks with chapter support, metadata, multilingual TTS engines, and optional voice cloning via a web UI or CLI.

ConvertX
Self-hosted web-based file converter for 1000+ formats
ConvertX is a self-hosted web file converter supporting 1000+ formats across documents, images, audio/video, ebooks, and 3D assets, with multi-file processing and account...