git-annex

Manage large files with Git by storing content outside repositories

git-annex lets Git track and manage large files without storing their contents inside the Git repository. It records file identity, location, and metadata while keeping actual file content in annexed object stores, enabling sync, backup and archival workflows across local drives and remote backends.
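
The idea of recording file identity instead of file content can be sketched in a few lines of Python. This is a simplified illustration, not git-annex's own code: it builds a key in the style of the SHA256E backend (backend name, size, content hash, file extension). The function name `sha256e_key` is hypothetical, and the extension handling is an assumption that ignores the extra rules git-annex applies.

```python
import hashlib
import os


def sha256e_key(path: str) -> str:
    """Build a SHA256E-style key: SHA256E-s<size>--<hex digest><extension>."""
    digest = hashlib.sha256()
    size = 0
    with open(path, "rb") as handle:
        # Read in 1 MiB chunks so large files never have to fit in memory.
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
            size += len(chunk)
    # Assumption: keep only the final extension; git-annex has its own rules here.
    extension = os.path.splitext(path)[1]
    return f"SHA256E-s{size}--{digest.hexdigest()}{extension}"
```

Only a short key like this needs to be committed to Git; the content it names can live on whichever drive or remote currently holds it.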

Key Features

  • Keeps file contents outside commits while storing pointers and metadata in Git, enabling versioning of metadata without bloating repositories
  • Supports many "special remotes" and backends (S3, WebDAV, directory, rsync, bup, Tahoe-LAFS, Glacier, BitTorrent and more) for flexible storage and transfer
  • Optional encryption and signing of content (GnuPG/gcrypt) and multiple checksum-based key-value backends for verifying integrity and detecting corruption
  • git-annex assistant and webapp provide an easier UI for folder synchronisation and managing queued transfers; core functionality remains CLI-first
  • Presents annexed files in the work tree as symlinks into the repository's object store (or as pointer files when unlocked), so the full directory tree is visible without holding every file's content locally
  • Location tracking and preferred-content policies let you find which drive or remote holds a specific file and automate moving content between them, including from cron jobs
  • Scales to very large collections and offline/archival workflows, with features for deduplication, clustering, and selective retrieval
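
Location tracking combined with a minimum-copies rule can be pictured with a toy model. The sketch below is an illustration under stated assumptions, not git-annex's data structures: `LocationLog`, `record_copy`, and `min_copies` are hypothetical names, with `min_copies` playing a role similar to git-annex's numcopies setting.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class LocationLog:
    """Toy location log: maps each key to the repositories known to hold it."""

    min_copies: int = 1
    holders: Dict[str, Set[str]] = field(default_factory=dict)

    def record_copy(self, key: str, repo: str) -> None:
        """Note that `repo` now holds the content for `key`."""
        self.holders.setdefault(key, set()).add(repo)

    def whereis(self, key: str) -> List[str]:
        """Return the repositories holding `key`, sorted by name."""
        return sorted(self.holders.get(key, set()))

    def drop(self, key: str, repo: str) -> bool:
        """Drop `key` from `repo` only if enough copies remain elsewhere."""
        current = self.holders.get(key, set())
        others = current - {repo}
        if repo not in current or len(others) < self.min_copies:
            return False  # refusing protects the last remaining copies
        self.holders[key] = others
        return True
```

With `min_copies=1`, a file held on both a laptop and a USB drive can be dropped from the laptop, but the later attempt to drop it from the USB drive is refused.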

Use Cases

  • Long-term archiving: track datasets and media across multiple offline archival drives while preserving Git history of organization
  • Nomadic/offline workflows: selectively fetch or drop large files on laptops, USB drives, and servers to save space while keeping metadata in Git
  • Distributed sharing of large assets: exchange big files between collaborators or servers using multiple special-remote backends and sync policies
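
The space-saving workflow described above can be scripted around the git-annex command line. The following is a minimal sketch that assumes an already initialised git-annex repository and a configured remote; the helper names and the remote name `usbdrive` are hypothetical. It only uses the `git annex copy --to` and `git annex drop` subcommands, and it prints the commands by default instead of running them.

```python
import subprocess
from typing import List, Sequence


def annex(subcommand: str, *args: str) -> List[str]:
    """Build the argument list for one git-annex invocation."""
    return ["git", "annex", subcommand, *args]


def free_space_plan(paths: Sequence[str], backup_remote: str) -> List[List[str]]:
    """Plan to copy each file to a remote, then drop the local content."""
    plan = []
    for path in paths:
        plan.append(annex("copy", "--to", backup_remote, path))
        plan.append(annex("drop", path))
    return plan


def run_plan(plan: Sequence[Sequence[str]], repo_dir: str, dry_run: bool = True) -> None:
    """Print the planned commands, or execute them inside repo_dir."""
    for argv in plan:
        if dry_run:
            print(" ".join(argv))
        else:
            # check=True stops the plan as soon as a command fails.
            subprocess.run(list(argv), cwd=repo_dir, check=True)
```

Calling `run_plan(free_space_plan(["talk.mkv"], "usbdrive"), ".")` only prints the two commands; passing `dry_run=False` would execute them in the given repository.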

Limitations and Considerations

  • Many convenience features (webapp, assistant, some backend support) require optional build-time libraries; packaged builds from distributions may omit features
  • The webapp listens on localhost by default and must be configured for remote access; some platform ports (Windows/Android) are marked beta
  • There is a learning curve in mapping your workflow onto git-annex and configuring special remotes; operations are primarily command-line oriented

git-annex is a mature, integrity-focused tool for bringing Git-style tracking to large files while avoiding repository bloat. It is best suited to users who need flexible backend support, offline/archival workflows, and programmable control over file placement and retrieval.

Similar Services

rclone

Command-line sync and management tool for cloud and remote storage

55k
4.8k
Last commit: 2d ago

rclone is a CLI tool to sync, copy, mount, and serve files across cloud storage providers and standard protocols like S3, WebDAV, FTP, and SFTP.

Alternative to: GoodSync (+8 more)

restic

Fast, secure, deduplicating backup tool for files and directories

31.8k
1.7k
Last commit: 1mo ago

Restic is a fast, efficient backup program with encryption, deduplication, snapshots, and multiple storage backends including local, SFTP, REST, and S3-compatible stores.

Alternative to: Arq Backup (+15 more)

Duplicati

Encrypted, incremental backups to cloud and remote storage

14.1k
1k
Last commit: 1d ago

Open-source backup client for encrypted, compressed, incremental backups to cloud storage and remote servers via SFTP, WebDAV, and more.

Alternative to: Duplicacy (Commercial) (+15 more)

BorgBackup

Deduplicating backup tool with encryption and compression.

12.9k
811
Last commit: 3d ago

BorgBackup is a deduplicating backup program with authenticated encryption and compression for Unix-like systems.

Alternative to: BorgBase (+14 more)

Kopia

Cross-platform snapshot-based backup tool with encryption and deduplication

12.3k
594
Last commit: 11d ago

Cross-platform backup and restore tool with snapshot-based incremental backups, client-side end-to-end encryption, compression, and deduplication via CLI and GUI.

Alternative to: Duplicacy (Commercial) (+15 more)

Duplicacy

Lock-free deduplicating backup tool with cloud and SFTP storage support

5.6k
349
Last commit: 8mo ago

Cross-platform backup tool with lock-free deduplication, encryption, snapshots, and pruning, supporting local disk, SFTP, and many cloud storage backends.

Alternative to: Duplicacy (Commercial) (+15 more)