Papermerge

Open-source document management system for scanned documents

2.9k stars

303 forks

Last commit: 3mo ago

Repo age: 6y old

Papermerge is a web-based document management system focused on scanned documents and digital archives. It extracts text via OCR, indexes documents for full-text search, and provides a desktop-like web UI for organizing and managing document collections.

Key Features

OCR processing of scanned PDFs and images (uses open-source OCR tooling to extract searchable text).
Full-text search with support for multiple search backends and indexing options.
OpenAPI-compliant REST API for automation and integrations.
Document versioning that retains both the original and the processed versions (for example, the OCRed version) of each document.
Categories, tags and user-defined custom fields (metadata) per document type for structured organization.
Page management: reorder, rotate, cut, move or extract individual pages within documents.
Multi-user access, group ownership and share controls for documents and folders.
Modern, responsive frontend with dual-panel browsing, drag-and-drop and internationalization.
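
To illustrate what "indexing OCR text for full-text search" means in practice, here is a minimal sketch of an inverted index. It is not Papermerge's implementation: the names `tokenize`, `build_index` and `search` are hypothetical, and real search backends add ranking, stemming and persistence on top of this idea.

```python
import re
from collections import defaultdict


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(docs: dict[str, str]) -> dict[str, set[str]]:
    """Map every token to the set of document IDs whose text contains it."""
    index: defaultdict[str, set[str]] = defaultdict(set)
    for doc_id, text in docs.items():
        for token in tokenize(text):
            index[token].add(doc_id)
    return dict(index)


def search(index: dict[str, set[str]], query: str) -> set[str]:
    """Return the IDs of documents that contain every token of the query."""
    tokens = tokenize(query)
    if not tokens:
        return set()
    hits = [index.get(token, set()) for token in tokens]
    return set.intersection(*hits)


# Example: OCR output keyed by a made-up document ID.
ocr_text = {
    "invoice-1": "Invoice No. 42 from ACME",
    "receipt-7": "Receipt from ACME store",
}
index = build_index(ocr_text)
matches = search(index, "acme invoice")  # only "invoice-1" contains both words
```

The index is built once per document after OCR finishes, so queries only touch the token-to-document map instead of rescanning every page.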

Use Cases

Long-term archival of scanned documents for small-to-medium organizations and personal archives.
Processing receipts, invoices and administrative paperwork with metadata and searchable OCR text.
Managing contract and record versioning with searchable history and page-level edits.

Limitations and Considerations

Robust full-text search typically requires deploying an external search backend (e.g., Elasticsearch, Solr, Xapian) for large archives; bundled minimal setups may omit advanced search.
OCR and indexing are resource-intensive at scale and commonly run in background workers; production deployments should provision worker processes and sufficient CPU/RAM.
The public demo instance is intentionally limited (for example, OCR and full-text search may be disabled) and is reset periodically, so it is useful only for exploring the UI and basic flows.
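
The background-worker pattern mentioned above can be sketched with a task queue and a small pool of threads. This is an illustration under stated assumptions, not how Papermerge schedules its jobs: `fake_ocr` is a stand-in for a real OCR call, `run_workers` is a hypothetical helper, and a production deployment would normally use separate worker processes because OCR is CPU-bound.

```python
import queue
import threading


def fake_ocr(page_image: bytes) -> str:
    """Stand-in for a real OCR engine: it only decodes the bytes as text."""
    return page_image.decode("utf-8", errors="replace")


def run_workers(pages: dict[str, bytes], num_workers: int = 2) -> dict[str, str]:
    """Process every page on background threads and return {page_id: text}."""
    tasks: queue.Queue = queue.Queue()
    results: dict[str, str] = {}
    lock = threading.Lock()

    def worker() -> None:
        while True:
            item = tasks.get()
            if item is None:  # sentinel: no more work for this worker
                tasks.task_done()
                return
            page_id, image = item
            text = fake_ocr(image)
            with lock:  # protect the shared results dict
                results[page_id] = text
            tasks.task_done()

    threads = [threading.Thread(target=worker, daemon=True) for _ in range(num_workers)]
    for thread in threads:
        thread.start()
    for item in pages.items():
        tasks.put(item)
    for _ in threads:
        tasks.put(None)  # one sentinel per worker
    tasks.join()
    for thread in threads:
        thread.join()
    return results
```

Raising `num_workers` only helps while CPU and RAM are available, which is why worker capacity should be sized together with the expected ingestion volume.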

Papermerge is a focused solution for turning scanned documents into searchable, organized archives with metadata and version control. It exposes a programmable API and can be integrated into automated ingestion pipelines for document-centric workflows.
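
As a rough sketch of how an ingestion script might talk to a REST API like this one, the snippet below builds (but does not send) an authenticated JSON request using only the Python standard library. The endpoint path `/api/documents/`, the payload fields `title` and `parent_id`, and the bearer-token header are all assumptions made for illustration; the instance's own OpenAPI schema is the authority on the real routes and fields.

```python
import json
import urllib.request


def build_create_request(
    base_url: str, token: str, title: str, parent_id: str
) -> urllib.request.Request:
    """Build a POST request that would register a document.

    HYPOTHETICAL: the path and the payload field names are placeholders,
    not taken from the Papermerge API reference.
    """
    payload = json.dumps({"title": title, "parent_id": parent_id}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/api/documents/",  # assumed path
        data=payload,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",  # assumed auth scheme
            "Content-Type": "application/json",
        },
    )


# Build the request for a made-up instance; nothing is sent over the network.
request = build_create_request("https://dms.example.org/", "t0ken", "scan.pdf", "inbox")
```

Passing the request to `urllib.request.urlopen` would send it; keeping construction separate from sending makes the script easy to test without a running server.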

Similar Services

Stirling PDF

Self-hosted PDF editing, conversion, OCR, and automation platform

74.6k stars

6.3k forks

Last commit: 7h ago

Open-source PDF platform to edit, convert, OCR, sign, redact, and automate PDF workflows via a web UI and REST API.

Alternative to: Adobe Acrobat and 19 others

Paperless-ngx

Document management system with OCR, search, and automated filing

36.9k stars

2.3k forks

Last commit: 1d ago

Paperless-ngx is an open-source document management system that ingests scans and files, runs OCR, and turns them into a searchable, taggable document archive.

Alternative to: DocuWare and 6 others

Reactive Resume

Privacy-focused, open-source resume builder

35.4k stars

3.9k forks

Last commit: 1d ago

Open-source resume builder for creating, customizing, exporting and publishing resumes with templates, PDF export, public sharing and optional OpenAI assistance.

Alternative to: Resume.io and 5 others

CyberChef

Browser-based toolkit for data decoding, encoding and analysis

34.1k stars

3.9k forks

Last commit: 1d ago

CyberChef is a web-based “cyber” toolkit for encoding/decoding, encryption/decryption, compression, hashing, parsing, and data transformation using drag-and-drop recipes.

ArchiveBox

Open-source self-hosted web archiving and snapshotting tool

26.9k stars

1.5k forks

Last commit: 1d ago

Self-hosted tool to collect and preserve webpages, media, and bookmarks in durable formats (HTML, PDF, WARC, MP4) with a CLI, web UI, and search.

Alternative to: Internet Archive Wayback Machine and 3 others

ebook2audiobook

Convert eBooks into audiobooks with TTS and optional voice cloning

18.3k stars

1.5k forks

Last commit: 5d ago

Self-hostable tool to convert non-DRM eBooks into audiobooks with chapter support, metadata, multilingual TTS engines, and optional voice cloning via a web UI or CLI.

Alternative to: Speechify and 7 others
