Paperless-ngx

Document management system with OCR, search, and automated filing

36.9k stars
2.3k forks
Last commit: 1d ago
Repo age: 4y old
Paperless-ngx is a community-supported document management system that turns scanned paperwork and digital files into a searchable online archive. It ingests documents, performs OCR, and helps you organize and retrieve files using metadata and full-text search.

Key Features

  • OCR processing to make scanned documents searchable and selectable, leveraging the Tesseract OCR engine
  • Full-text search with relevance sorting, highlighting, auto-complete, and “similar documents” discovery
  • Organization with tags, correspondents, document types, and configurable storage paths/filenames
  • Modern web UI with dashboards, saved views, filtering, bulk edits, drag-and-drop uploads, and dark mode
  • Workflow automation to apply rules and actions throughout the document pipeline
  • Email ingestion with multiple accounts and rules, plus post-processing actions (mark read, delete, etc.)
  • Multi-user permissions with global and per-object/document access control
  • Document archival options including PDF/A storage for long-term preservation alongside originals
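
The full-text search listed above can also be driven from a script. The following is a minimal sketch, assuming a Paperless-ngx instance that exposes its REST API under `/api/documents/` with token authentication; the base URL, the token value, and the helper name `build_search_request` are illustrative placeholders, not part of this listing. The function only builds the request, so it can be inspected without a running server.

```python
from urllib.parse import urlencode, urljoin
from urllib.request import Request


def build_search_request(base_url: str, token: str, query: str) -> Request:
    """Build (but do not send) a full-text search request.

    Assumes the documents endpoint lives at /api/documents/ and accepts a
    `query` parameter; base_url and token are hypothetical values supplied
    by the caller.
    """
    # Normalise the base URL so urljoin appends the path instead of replacing it.
    endpoint = urljoin(base_url.rstrip("/") + "/", "api/documents/")
    # urlencode escapes spaces and special characters in the search terms.
    url = endpoint + "?" + urlencode({"query": query})
    headers = {
        "Authorization": f"Token {token}",
        "Accept": "application/json",
    }
    return Request(url, headers=headers)


if __name__ == "__main__":
    req = build_search_request("http://localhost:8000", "YOUR_TOKEN", "tax invoice")
    print(req.full_url)
```

To actually run the search, the request can be passed to `urllib.request.urlopen` and the JSON body decoded; that step is left out here because it depends on a live instance and valid credentials.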

Use Cases

  • Digitize and archive household or small-office paperwork (invoices, contracts, tax documents)
  • Centralize document intake from scanners, folders, and email for consistent filing and retrieval
  • Build a searchable compliance or record-keeping archive with controlled user access

Limitations and Considerations

  • Documents, including their extracted text, are stored unencrypted by default, so Paperless-ngx should be deployed only on trusted infrastructure with appropriate access controls and backups

Paperless-ngx is well-suited for replacing paper filing with a searchable digital archive while adding automation for tagging and routing. Its OCR and search capabilities make it practical for long-term document retention and fast retrieval.

Similar Services

Stirling PDF

Self-hosted PDF editing, conversion, OCR, and automation platform

74.6k stars
6.3k forks
Last commit: 7h ago

Open-source PDF platform to edit, convert, OCR, sign, redact, and automate PDF workflows via a web UI and REST API.

Alternative to: Adobe Acrobat (+19 more)

Reactive Resume

Privacy-focused, open-source resume builder

35.4k stars
3.9k forks
Last commit: 1d ago

Open-source resume builder for creating, customizing, exporting and publishing resumes with templates, PDF export, public sharing and optional OpenAI assistance.

Alternative to: Resume.io (+5 more)

CyberChef

Browser-based toolkit for data decoding, encoding and analysis

34.1k stars
3.9k forks
Last commit: 1d ago

CyberChef is a web-based “cyber” toolkit for encoding/decoding, encryption/decryption, compression, hashing, parsing, and data transformation using drag-and-drop recipes.

ArchiveBox

Open-source self-hosted web archiving and snapshotting tool

26.9k stars
1.5k forks
Last commit: 1d ago

Self-hosted tool to collect and preserve webpages, media, and bookmarks in durable formats (HTML, PDF, WARC, MP4) with a CLI, web UI, and search.

Alternative to: Internet Archive Wayback Machine (+3 more)

ebook2audiobook

Convert eBooks into audiobooks with TTS and optional voice cloning

18.3k stars
1.5k forks
Last commit: 5d ago

Self-hostable tool to convert non-DRM eBooks into audiobooks with chapter support, metadata, multilingual TTS engines, and optional voice cloning via a web UI or CLI.

Alternative to: Speechify (+7 more)

ConvertX

Self-hosted web-based file converter for 1000+ formats

16k stars
874 forks
Last commit: 2d ago

ConvertX is a self-hosted web file converter supporting 1000+ formats across documents, images, audio/video, ebooks, and 3D assets, with multi-file processing and account...

Alternative to: CloudConvert (+7 more)