Apache Airflow

Platform to author, schedule, and monitor workflows as code

43.9k stars
16.3k forks
Last commit: 19h ago
Repo age: 11 years

Apache Airflow is an open-source platform for programmatically authoring, scheduling, and monitoring workflows. Workflows are defined in Python code as directed acyclic graphs (DAGs), which makes them maintainable, versionable, and easier to test and operate at scale.

Key Features

  • Define workflows in Python with dynamic DAG generation and parametrization
  • Scheduling and dependency management for complex task graphs
  • Scalable execution using a scheduler and distributed workers (for example, Celery workers backed by a message queue, or Kubernetes pods)
  • Web UI to visualize DAGs, monitor runs, inspect logs, and troubleshoot failures
  • Extensible architecture with a large ecosystem of operators, hooks, and provider integrations
  • Templating support (Jinja) for runtime parameters and task configuration
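
To make "dynamic DAG generation" and "dependency management" concrete, here is a minimal sketch that uses only the Python standard library. It is not Airflow's API: the helper names `build_pipeline` and `execution_order` and the table names are hypothetical, chosen for illustration. It shows the general idea of building a task graph in a loop and deriving an order in which every task runs after its upstream tasks.

```python
from graphlib import TopologicalSorter


def build_pipeline(tables):
    """Return a mapping of task name -> set of upstream task names.

    The graph is generated from a parameter (the table list), which is
    the idea behind dynamic DAG generation.
    """
    graph = {"start": set()}
    for table in tables:
        graph[f"extract_{table}"] = {"start"}
        graph[f"load_{table}"] = {f"extract_{table}"}
    # The report runs only after every table has been loaded.
    graph["report"] = {f"load_{table}" for table in tables}
    return graph


def execution_order(graph):
    """Return one valid order in which each task follows its upstreams."""
    return list(TopologicalSorter(graph).static_order())


# Hypothetical table names, used only for illustration.
graph = build_pipeline(["orders", "customers"])
order = execution_order(graph)
print(order)
```

A real scheduler would also run independent branches (here, the two extract/load chains) in parallel and track retries and state; this sketch only covers the ordering constraint.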

Use Cases

  • Orchestrating ETL/ELT data pipelines and batch data processing
  • Running scheduled machine learning and analytics workflows
  • Coordinating infrastructure or application automation that requires dependency-aware execution

Limitations and Considerations

  • Best suited for mostly static, slowly changing workflow structures rather than highly dynamic per-run graphs
  • Not a streaming engine; common patterns process near-real-time data in batches
  • Tasks should be idempotent and should avoid passing large datasets between tasks (use external storage/services and pass metadata instead)
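
The last point can be illustrated with a small standard-library sketch. It is not Airflow code: the functions `export_daily_totals` and `read_total`, the file layout, and the sample rows are assumptions made for this example. The upstream task writes its result to a path derived from the run date, so a retry overwrites the same file, and it hands downstream only a small metadata dictionary rather than the data itself (in Airflow, small values like this are typically exchanged through XComs).

```python
import json
import tempfile
from pathlib import Path


def export_daily_totals(run_date, rows, out_dir):
    """Write the total for run_date to a deterministic path; return metadata."""
    total = sum(amount for day, amount in rows if day == run_date)
    target = Path(out_dir) / f"totals_{run_date}.json"
    # Overwrite rather than append, so a retry leaves the same result.
    target.write_text(json.dumps({"date": run_date, "total": total}))
    # Return a small reference to the data, not the dataset itself.
    return {"path": str(target), "date": run_date}


def read_total(metadata):
    """Downstream task: load the result from storage via the metadata."""
    return json.loads(Path(metadata["path"]).read_text())["total"]


# Hypothetical sample data: (date, amount) pairs.
rows = [("2024-01-01", 10), ("2024-01-01", 5), ("2024-01-02", 7)]

with tempfile.TemporaryDirectory() as out_dir:
    first = export_daily_totals("2024-01-01", rows, out_dir)
    second = export_daily_totals("2024-01-01", rows, out_dir)  # simulated retry
    total = read_total(second)
    file_count = len(list(Path(out_dir).iterdir()))
```

Because the output path depends only on the run date, rerunning the task for the same date produces one file with the same content, which is what makes retries and backfills safe.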

Apache Airflow is a strong fit when you need reliable, observable orchestration for batch workflows with clear dependencies and operational controls. Its extensibility and broad integration ecosystem make it adaptable across many data and automation environments.

Similar Services

Portainer

Web UI and API for managing Docker and Kubernetes environments

36.2k stars
2.8k forks
Last commit: 2d ago

Lightweight web-based platform to manage Docker, Swarm and Kubernetes resources with a GUI and API, including access control and multi-environment operations.

Alternative to: Portainer Business Edition (Portainer Cloud) (+6 more)

Dokploy

Self-hosted PaaS to deploy and manage containerized apps and databases.

29.1k stars
1.9k forks
Last commit: 1d ago

Open-source self-hostable PaaS for deploying containerized applications and managing databases with Docker Compose, Traefik, monitoring, and backups.

Alternative to: Vercel (+9 more)

Kestra

Open-source, event-driven workflow orchestration and scheduling platform

26.2k stars
2.5k forks
Last commit: 3d ago

Declarative, API-first orchestration platform for scheduled and event-driven workflows with a plugin ecosystem, UI editor, CI/CD and Terraform integration.

Alternative to: Dagster Cloud (+16 more)

XPipe

Connection hub and remote file manager for managing server infrastructure

13.5k stars
517 forks
Last commit: 22h ago

Desktop application that centralizes SSH, containers, VMs, Kubernetes and remote file management; integrates local CLI tools and syncs connection data via git.

Alternative to: MobaXterm (+6 more)

Coder

Self-hosted cloud development environments for teams and agents

12k stars
1.1k forks
Last commit: 21h ago

Open-source platform to provision secure, self-hosted developer workspaces (VMs, containers, Kubernetes) defined in Terraform, with IDE integrations and AI agent support.

Alternative to: Coder (Coder Cloud) (+6 more)

Komodo

Build and deployment system for managing software across servers

9.6k stars
258 forks
Last commit: 3mo ago

Komodo is a self-hosted build and deployment platform that automates builds and deploys Docker containers and Compose stacks across many servers, with a web UI and API.

Alternative to: Coolify Cloud (+19 more)