git-annex
Manage large files with Git by storing content outside repositories

git-annex lets Git track and manage large files without storing their contents inside the Git repository. It records file identity, location, and metadata while keeping actual file content in annexed object stores, enabling sync, backup and archival workflows across local drives and remote backends.
Key Features
- Keeps file contents outside commits while storing pointers and metadata in Git, enabling versioning of metadata without bloating repositories
- Supports many "special remotes" and storage backends (S3, WebDAV, directory, rsync, bup, Tahoe-LAFS, Glacier, BitTorrent and more) for flexible storage and transfer
- Optional encryption and signing of content (GnuPG/gcrypt) and multiple checksum/key-value backends for verifying integrity and detecting corruption
- git-annex assistant and webapp provide an easier UI for folder synchronisation and managing queued transfers; core functionality remains CLI-first
- Annexed files appear in the work tree as symlinks (or as pointer files when unlocked), so the whole tree stays browsable without a full local copy of every file
- Location tracking and preferred-content policies let you locate which drive or remote holds specific files and automate movement (cron-friendly)
- Scales to very large collections and offline/archival workflows, with features for deduplication, clustering, and selective retrieval
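
The pointer-and-deduplication idea in the list above can be illustrated with a small sketch. It builds keys in the style of git-annex's SHA256E backend (`SHA256E-s<size>--<digest><extension>`) and uses a plain dictionary as a stand-in for the annexed object store. The function names and the dictionary store are hypothetical, and keeping only the final file extension is a simplifying assumption; this is not how git-annex itself is implemented.

```python
import hashlib
import os
from typing import Dict


def annex_style_key(content: bytes, filename: str) -> str:
    """Build a content-derived key laid out like a SHA256E key.

    Layout: SHA256E-s<size in bytes>--<sha256 hex digest><extension>
    Assumption: only the final extension of the filename is kept.
    """
    digest = hashlib.sha256(content).hexdigest()
    _, ext = os.path.splitext(filename)
    return f"SHA256E-s{len(content)}--{digest}{ext}"


def add_to_store(store: Dict[str, bytes], content: bytes, filename: str) -> str:
    """Store content under its key and return the key.

    The key is what would be recorded in Git in place of the content.
    Identical content with the same extension maps to the same key,
    so it is stored only once (hypothetical in-memory store).
    """
    key = annex_style_key(content, filename)
    store.setdefault(key, content)
    return key


if __name__ == "__main__":
    store: Dict[str, bytes] = {}
    first = add_to_store(store, b"hello world", "notes.txt")
    second = add_to_store(store, b"hello world", "backup/notes-copy.txt")
    print(first)
    print("same key:", first == second, "| objects stored:", len(store))
```

Because the key depends only on the content and its extension, two files with identical bytes share one stored object, which is the basis for deduplication and for checking integrity after a transfer.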
Use Cases
- Long-term archiving: track datasets and media across multiple offline archival drives while preserving the Git history of how they are organized
- Nomadic/offline workflows: selectively fetch or drop large files on laptops, USB drives, and servers to save space while keeping metadata in Git
- Distributed sharing of large assets: exchange big files between collaborators or servers using multiple special-remote backends and sync policies
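
As a rough illustration of the selective fetch-and-drop workflow described above, the sketch below assembles command lines for the `git annex get`, `drop`, `whereis`, and `copy --to` subcommands. The helper names, the file paths, and the remote name `usbdrive` are hypothetical; nothing is executed unless `dry_run=False` is passed, which assumes git-annex is installed and the current directory is an initialized annex.

```python
import shlex
import subprocess
from typing import List, Optional

# Subcommands covered by this sketch; git-annex has many more.
_ACTIONS = {"get", "drop", "whereis", "copy"}


def annex_argv(action: str, *paths: str, remote: Optional[str] = None) -> List[str]:
    """Return the argument list for a git-annex subcommand.

    'copy' requires a destination remote (passed as --to REMOTE);
    the other actions in this sketch do not accept one.
    """
    if action not in _ACTIONS:
        raise ValueError(f"unsupported action: {action}")
    argv = ["git", "annex", action]
    if action == "copy":
        if remote is None:
            raise ValueError("copy needs a destination remote")
        argv += ["--to", remote]
    elif remote is not None:
        raise ValueError(f"{action} does not take a remote in this sketch")
    return argv + list(paths)


def run(argv: List[str], dry_run: bool = True) -> Optional[subprocess.CompletedProcess]:
    """Print the command, and execute it only when dry_run is False."""
    print(shlex.join(argv))
    if dry_run:
        return None
    return subprocess.run(argv, check=True)


if __name__ == "__main__":
    # Fetch one file, back another up to a drive, then free local space.
    run(annex_argv("get", "talks/keynote.mkv"))
    run(annex_argv("copy", "raw/session1.wav", remote="usbdrive"))
    run(annex_argv("drop", "raw/session1.wav"))
```

Keeping the command construction separate from execution makes the sequence easy to review or log before it touches any data, which suits scheduled jobs.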
Limitations and Considerations
- Many convenience features (webapp, assistant, some backend support) require optional build-time libraries; packaged builds from distributions may omit features
- The webapp listens on localhost by default and must be configured for remote access; some platform ports (Windows/Android) are marked beta
- There is a learning curve in mapping git-annex onto your workflow and configuring special remotes; operations are primarily command-line oriented
git-annex is a mature, integrity-focused tool for bringing Git-style tracking to large files while avoiding repository bloat. It is best suited to users who need flexible backend support, offline/archival workflows, and programmable control over file placement and retrieval.
Tech Stack: Haskell
Similar Services

rclone
Command-line sync and management tool for cloud and remote storage
rclone is a CLI tool to sync, copy, mount, and serve files across cloud storage providers and standard protocols like S3, WebDAV, FTP, and SFTP.

restic
Fast, secure, deduplicating backup tool for files and directories
Restic is a fast, efficient backup program with encryption, deduplication, snapshots, and multiple storage backends including local, SFTP, REST, and S3-compatible stores.


Duplicati
Encrypted, incremental backups to cloud and remote storage
Open-source backup client for encrypted, compressed, incremental backups to cloud storage and remote servers via SFTP, WebDAV, and more.

BorgBackup
Deduplicating backup tool with encryption and compression.
BorgBackup is a deduplicating backup program with authenticated encryption and compression for Unix-like systems.


Kopia
Cross-platform snapshot-based backup tool with encryption and deduplication
Cross-platform backup and restore tool with snapshot-based incremental backups, client-side end-to-end encryption, compression, and deduplication via CLI and GUI.


Duplicacy
Lock-free deduplicating backup tool with cloud and SFTP storage support
Cross-platform backup tool with lock-free deduplication, encryption, snapshots, and pruning, supporting local disk, SFTP, and many cloud storage backends.
