Scraperr


Self-hosted no-code web scraping platform

4.8k stars
237 forks
Last commit: 3mo ago
Repo age: 2y old

Scraperr is a self-hosted web scraping platform that lets you extract data from websites through a web interface, without writing code. It focuses on repeatable scraping jobs with structured results, exports, and optional crawling within a single domain.

Key Features

  • No-code web UI for creating and managing scraping jobs
  • XPath-based extraction for precise element targeting
  • Queue management to submit and run multiple scraping jobs
  • Optional domain spidering to crawl and scrape pages within a site
  • Custom request headers provided as JSON
  • Media downloads for images, videos, and other assets
  • Results visualization in a structured table view
  • Export scraped data to CSV and Markdown
  • Completion notifications via supported channels
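
To make the extraction and export features above concrete, here is a minimal Python sketch of the general technique: select elements with XPath, collect them into rows, and write the rows out as CSV or a Markdown table. It is not Scraperr's own code. The sample page, the field names, the header values, and the helper functions are all hypothetical, and the standard library's ElementTree supports only a subset of XPath and needs well-formed markup, so a real scraper would typically use a more tolerant parser.

```python
import csv
import io
import json
import xml.etree.ElementTree as ET

# Hypothetical, well-formed sample page used only for illustration.
SAMPLE_PAGE = """
<html>
  <body>
    <div class="product"><h2>Widget</h2><span class="price">9.99</span></div>
    <div class="product"><h2>Gadget</h2><span class="price">19.50</span></div>
  </body>
</html>
"""

# Hypothetical custom request headers, written as JSON the way a form field might accept them.
HEADERS_JSON = '{"User-Agent": "example-bot/1.0", "Accept-Language": "en-US"}'


def parse_headers(raw: str) -> dict[str, str]:
    """Parse a JSON object into a header mapping, rejecting non-object input."""
    headers = json.loads(raw)
    if not isinstance(headers, dict):
        raise ValueError("headers must be a JSON object")
    return {str(key): str(value) for key, value in headers.items()}


def extract_rows(page: str) -> list[dict[str, str]]:
    """Select each product block with XPath and pull out its name and price."""
    root = ET.fromstring(page.strip())
    rows = []
    for product in root.findall(".//div[@class='product']"):
        rows.append({
            "name": product.findtext("h2", default=""),
            "price": product.findtext("span[@class='price']", default=""),
        })
    return rows


def to_csv(rows: list[dict[str, str]]) -> str:
    """Serialize rows as CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=["name", "price"], lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


def to_markdown(rows: list[dict[str, str]]) -> str:
    """Serialize rows as a simple Markdown table."""
    lines = ["| name | price |", "| --- | --- |"]
    lines += [f"| {row['name']} | {row['price']} |" for row in rows]
    return "\n".join(lines)


if __name__ == "__main__":
    rows = extract_rows(SAMPLE_PAGE)
    print(parse_headers(HEADERS_JSON))
    print(to_csv(rows))
    print(to_markdown(rows))
```

In the web UI the equivalent inputs would be the target URL, one XPath expression per field, and the headers JSON; the sketch only shows how those pieces relate to the exported table.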

Use Cases

  • Collect product, directory, or listing data for internal analysis
  • Crawl and extract structured content from documentation or knowledge sites
  • Download and catalog media assets from permitted web sources

Limitations and Considerations

  • Uses browser automation; large crawls can be resource-intensive and may require careful rate limiting
  • Scraping capability and reliability depend on target site complexity and anti-bot measures
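
The crawl-scoping and rate-limiting concerns above can be sketched in a few lines of Python. This is an illustrative assumption about how a domain-limited crawl might be kept polite, not a description of Scraperr's internals; the function same_domain_links, the class MinIntervalLimiter, and the example URLs are hypothetical names chosen for this sketch.

```python
import time
from urllib.parse import urljoin, urlparse


def same_domain_links(base_url: str, hrefs: list[str]) -> list[str]:
    """Resolve hrefs against base_url and keep unique http(s) links on the same host."""
    base_host = urlparse(base_url).netloc
    seen = set()
    result = []
    for href in hrefs:
        # Resolve relative links and drop the fragment so "#section" variants deduplicate.
        absolute = urljoin(base_url, href).split("#", 1)[0]
        parsed = urlparse(absolute)
        if parsed.scheme not in ("http", "https"):
            continue
        if parsed.netloc != base_host or absolute in seen:
            continue
        seen.add(absolute)
        result.append(absolute)
    return result


class MinIntervalLimiter:
    """Enforce a minimum delay between consecutive requests."""

    def __init__(self, min_interval_s: float, clock=time.monotonic, sleep=time.sleep):
        self._min_interval = min_interval_s
        self._clock = clock
        self._sleep = sleep
        self._last = None  # time of the previous request, if any

    def wait(self) -> float:
        """Block until the interval has elapsed; return the seconds slept."""
        slept = 0.0
        now = self._clock()
        if self._last is not None:
            remaining = self._min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                slept = remaining
                now = self._clock()
        self._last = now
        return slept


if __name__ == "__main__":
    links = same_domain_links(
        "https://example.com/docs/",
        ["intro", "/about", "https://other.example.org/x", "intro#top"],
    )
    limiter = MinIntervalLimiter(min_interval_s=1.0)
    for link in links:
        limiter.wait()  # a real crawler would fetch the page after this call
        print("would fetch", link)
```

A fixed minimum interval is the simplest policy; because each page in a browser-driven crawl is costly to render, a longer interval or a cap on total pages is often a reasonable starting point.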

Scraperr fits teams and individuals who want a practical, UI-driven scraper they can run on their own infrastructure. It is well-suited for scheduled or repeated data collection workflows where exports and job management matter.


Similar Services

n8n

Workflow automation platform with visual builder and code support

169.5k stars
53.7k forks
Last commit: 23h ago

Self-hostable workflow automation platform combining a visual builder with JavaScript/Python code steps, 400+ integrations, and AI-assisted automation.

Alternative to: Zapier (+17 more)

Ansible

Agentless IT automation and configuration management engine

67.7k stars
24.2k forks
Last commit: 22h ago

Open source, agentless automation engine for configuration management, app deployment, orchestration, and infrastructure provisioning using YAML playbooks over SSH.

Alternative to: Red Hat Ansible Automation Platform (+4 more)

NocoDB

No-code spreadsheet interface for SQL databases with APIs

61.5k stars
4.6k forks
Last commit: 1d ago

Open-source Airtable alternative that turns Postgres/MySQL/SQLite into a no-code spreadsheet UI with views, permissions, integrations, and REST APIs.

Alternative to: Airtable (+10 more)

Huginn

Open-source platform for self-hosted automation agents

48.5k stars
4.2k forks
Last commit: 24d ago

Huginn is an open-source, self-hosted automation platform that runs agents to monitor web data, process events, and trigger actions. It is extensible with custom agents.

Alternative to: IFTTT (+17 more)

Apache Airflow

Platform to author, schedule, and monitor workflows as code

43.9k stars
16.3k forks
Last commit: 19h ago

Apache Airflow is a workflow orchestration platform to define, schedule, and monitor data pipelines and other batch jobs using Python-defined DAGs.

Alternative to: Astronomer (+5 more)

Appsmith

Open-source low-code platform for internal tools and dashboards

38.9k stars
4.4k forks
Last commit: 2d ago

Build and deploy internal tools, admin panels, and dashboards with a low-code UI builder that connects to databases and APIs and supports JavaScript logic and Git workflo...

Alternative to: Retool (+14 more)