[Exherbo-dev] git-annex for distfiles

Calvin Walton calvin.walton at kepstin.ca
Tue Mar 5 15:35:19 UTC 2013

On Wed, 2013-02-20 at 19:17 +0200, Ali Polatel wrote:
> git-annex is a way to manage files with git without checking their
> contents into git¹. There have been some chit-chat about using git-annex
> for distfiles management on IRC but we never really discussed it
> thoroughly.
> git-annex may provide several advantages to our distfiles management.
> One of the main advantages is integrity checking of files managed by
> git-annex². The users can use several remotes which make it a practical
> way to handle mirrors for them³.
> The deployment may not look simple at first sight but I do not think
> this is the case. git-annex is written in Haskell but it is fairly easy
> to build it as a static linked binary which may be distributed in
> ::arbor and stages. I have been using this approach with our radio
> station for a while now and haven't had any problems⁴. We can also use
> git-annex' special "web" remote to distribute files through http protocol⁵.
> The special remotes and especially the "hook" remote can even make it
> possible to distribute files via p2p or other protocols⁶.
> CC'ing infra monkeys for comments.
> Please discuss!

I was curious about this, so I've installed git-annex to take a look at
the features and see how such an integration would work.

It looks like the key issue that I'm hitting right now is that
performing any git-annex operation (such as 'git-annex get' to download
a distfile into the local repository) requires write access to the
repository git tree. (It will actually add a commit on the 'git-annex'
management branch.)

This of course conflicts with running the fetch operation with reduced
privileges (and sandboxing, were I to use src_fetch_extra). The two
solutions I see for this would be to either run syncers with reduced
privileges as well (so the repositories are owned by 'paludisbuild';
probably a good idea anyways), or run the fetch operation as root (less

Other than that, the fetch operation could be handled using either a
custom fetcher script (probably the best idea, since these aren't
sandboxed?) or an exlib (and src_fetch_extra), along with a git-annex
special 'hook' remote that supports fetching files from the user's
configured mirrors (from mirrors.conf).
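
As a rough illustration of what the retrieve side of such a 'hook' remote
could look like, here is a minimal Python sketch. It relies on the ANNEX_KEY
and ANNEX_FILE environment variables that git-annex sets for hook special
remotes. Everything else is an assumption made for the example: the mirror
base URLs are passed as command-line arguments rather than parsed from
mirrors.conf, and each mirror is assumed to serve the file at <base>/<key>.
How a git-annex key would map to an upstream distfile name is left open.

```python
import os
import shutil
import sys
import urllib.error
import urllib.request


def candidate_urls(mirrors, name):
    """Join each mirror base URL with the file name, keeping mirror order."""
    return [base.rstrip("/") + "/" + name for base in mirrors]


def fetch_first(urls, dest, opener=urllib.request.urlopen):
    """Try each URL in turn and write the first successful response to dest.

    Returns the URL that worked, or None if every mirror failed.
    """
    for url in urls:
        try:
            with opener(url) as resp, open(dest, "wb") as out:
                shutil.copyfileobj(resp, out)
            return url
        except (urllib.error.URLError, OSError):
            # This mirror is unreachable or lacks the file; try the next one.
            continue
    return None


# Only act when invoked by git-annex, which exports ANNEX_KEY and ANNEX_FILE
# to hook remote commands. Mirror base URLs come from argv (an assumption).
if __name__ == "__main__" and "ANNEX_KEY" in os.environ:
    urls = candidate_urls(sys.argv[1:], os.environ["ANNEX_KEY"])
    sys.exit(0 if fetch_first(urls, os.environ["ANNEX_FILE"]) else 1)
```

A script like this would be wired up through the remote's retrieve hook in
git config; a non-zero exit status tells git-annex that the retrieval failed.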

Cleaning up old distfiles won't be that hard; git-annex has built-in
support for finding and removing 'unused' (no longer referenced by any
branch head) files.
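
For what it's worth, the cleanup could be scripted roughly like this. The
sketch wraps 'git annex unused' and 'git annex dropunused', which both exist;
the parsing assumes the numbered "NUMBER  KEY" listing that 'git annex unused'
prints, so treat the regular expression as an assumption about the output
format. The function names are made up for the example.

```python
import re
import subprocess


def parse_unused(output):
    """Extract (number, key) pairs from `git annex unused` style output."""
    pairs = []
    for line in output.splitlines():
        # Data rows are assumed to look like "    1       SHA256-s10--abc".
        m = re.match(r"\s*(\d+)\s+(\S+)\s*$", line)
        if m:
            pairs.append((int(m.group(1)), m.group(2)))
    return pairs


def drop_command(pairs):
    """Build a dropunused command covering the listed numbers, or None."""
    if not pairs:
        return None
    numbers = [n for n, _ in pairs]
    return ["git", "annex", "dropunused", f"{min(numbers)}-{max(numbers)}"]


def clean(repo):
    """List unused content in repo and drop it (requires git-annex)."""
    out = subprocess.run(
        ["git", "annex", "unused"],
        cwd=repo, check=True, capture_output=True, text=True,
    ).stdout
    cmd = drop_command(parse_unused(out))
    if cmd:
        subprocess.run(cmd, cwd=repo, check=True)
```

Keeping the parsing separate from the subprocess calls means the listing can
be inspected (or filtered) before anything is actually dropped.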

Calvin Walton <calvin.walton at kepstin.ca>
