Managing and storing large files in Git

It is not uncommon for projects to include high-quality images and videos that are large in size. When your repository contains large files like these, Git keeps a full copy of the file in the repo every time you commit a change to it. Because Git versions the whole file, many versions of these files in your repo will dramatically increase the time it takes to check out, branch, fetch, and clone the code.

Luckily, this problem is addressed by Git Large File Storage (LFS). LFS is an extension to Git; it replaces large files, such as audio samples, videos, datasets, and graphics, with text pointers inside Git. Git LFS commits the data that describes each large file to your repo and stores the binary file contents in separate remote storage.
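To make the idea of a text pointer concrete, the following minimal Python sketch builds and parses the small text file that stands in for a large file inside Git. It assumes the version 1 pointer layout (a "version" line, a SHA-256 "oid" line, and a "size" line in bytes). The function names make_pointer and parse_pointer are hypothetical and are not part of Git LFS; in practice the Git LFS client writes these pointers for you.

```python
import hashlib

# Spec URL that appears on the first line of a version 1 pointer file.
LFS_SPEC_URL = "https://git-lfs.github.com/spec/v1"


def make_pointer(content: bytes) -> str:
    """Return pointer text describing `content` (hypothetical helper)."""
    oid = hashlib.sha256(content).hexdigest()
    return f"version {LFS_SPEC_URL}\noid sha256:{oid}\nsize {len(content)}\n"


def parse_pointer(text: str) -> dict:
    """Split pointer text into its key/value fields (hypothetical helper)."""
    fields = {}
    for line in text.splitlines():
        # Each line is "<key> <value>", separated by the first space.
        key, _, value = line.partition(" ")
        fields[key] = value
    return fields


if __name__ == "__main__":
    pointer = make_pointer(b"pretend this is a large video file")
    print(pointer)
    print(parse_pointer(pointer))
```

The pointer is only a few lines long regardless of how large the real file is, which is why history stays small: Git versions the pointer, and the binary contents live in the separate LFS storage.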

When you clone and switch branches in your repo, Git LFS automatically downloads the correct version from that remote storage. Your local development tools will transparently work with the files as if they were committed directly to your repo.

Git LFS provides your teams with a seamless experience, as they can use the familiar end-to-end Git workflow whether they work on small or large files. LFS files can be as big as you need them to be. As of version 2.0, Git LFS also supports file-locking (https://github.com/git-lfs/git-lfs/wiki/File-Locking) to help your team work on large, undiffable assets, such as videos, sounds, and game maps.

You should be aware of a few things before using Git LFS:

  • The Git LFS client must be installed alongside every Git client used by your team, and everyone must understand its tracking configuration (https://github.com/github/git-lfs/tree/master/docs).
  • If the Git LFS client is not installed and configured correctly, you will not see the binary files committed through Git LFS when you clone your repo. Git will download the data that describes the large file (which is what Git LFS commits to the repo) and not the actual binary file. Committing large binaries without the Git LFS client installed will push the binary to your repo.
  • Git cannot merge the changes from two different versions of a binary file even if both versions have a common parent. If two people are working on the same file at the same time, they must work together to reconcile their changes to avoid overwriting the other's work. Git LFS provides file-locking to help. Users must still take care to always pull the latest copy of a binary asset before beginning work.
  • Azure DevOps Server currently does not support using SSH in repos with Git LFS tracked files.
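One symptom from the list above, a clone that contains pointer data instead of the real binaries, can be checked for with a short script. The sketch below is illustrative only: it assumes that a file left as a pointer begins with the version 1 pointer header line, and the helper names is_lfs_pointer and find_unresolved_pointers are hypothetical, not part of Git or Git LFS.

```python
from pathlib import Path

# First line of a version 1 Git LFS pointer file.
LFS_POINTER_PREFIX = b"version https://git-lfs.github.com/spec/v1\n"


def is_lfs_pointer(path) -> bool:
    """Return True if the file starts with the LFS pointer header."""
    with Path(path).open("rb") as fh:
        return fh.read(len(LFS_POINTER_PREFIX)) == LFS_POINTER_PREFIX


def find_unresolved_pointers(root) -> list:
    """List files under `root` that are still pointers, not real content."""
    hits = []
    for p in Path(root).rglob("*"):
        # Skip Git's own metadata directory and anything that is not a file.
        if ".git" in p.parts or not p.is_file():
            continue
        if is_lfs_pointer(p):
            hits.append(p)
    return sorted(hits)


if __name__ == "__main__":
    for stub in find_unresolved_pointers("."):
        print(f"still a pointer: {stub}")
```

If the script reports any files, the usual cause is that the Git LFS client was missing or not configured when the repo was cloned; after installing it, running git lfs pull fetches the actual contents.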