Managing Large Repositories
Because of its distributed nature, Git includes the full change history in each copy of the repository. Every clone gets not only all the files but every revision of every file ever committed. This allows for efficient development (local operations that don't involve the network are usually fast enough that they are rarely a bottleneck) and efficient collaboration (the distributed model supports many collaborative workflows).
But what happens when the repository you want to work on is huge? Can we avoid devoting a large amount of disk space to version control storage? Is it possible to reduce the amount of data that users need to retrieve when cloning the repository? Do we need to have all files present to be able to work on a project?
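Git offers several answers to these questions out of the box. The sketch below demonstrates one of them, a shallow clone, against a small throwaway repository built on the spot (with a real large project you would point the `file://` URL at the actual remote); related options such as `git clone --filter=blob:none` (partial clone, Git 2.19+) and `git sparse-checkout` (Git 2.25+) address the other questions and are covered later.

```shell
# A minimal sketch of a shallow clone. The tiny two-commit "upstream"
# repository here is a stand-in for a project with a long history.
set -e
tmp=$(mktemp -d) && cd "$tmp"

# Build a repository with two commits.
git init -q upstream && cd upstream
git -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m "first"
git -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m "second"
cd ..

# --depth 1 fetches only the most recent commit, not the full history.
git clone -q --depth 1 "file://$tmp/upstream" shallow

# The shallow copy contains a single commit even though upstream has two.
git -C shallow rev-list --count HEAD
```

The trade-off is that a shallow clone cannot see commits beyond its depth; it can be deepened later with `git fetch --deepen=<n>` or converted to a full clone with `git fetch --unshallow`.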
If you think about it, there are broadly three main reasons for repositories to become massive: they can accumulate a very long history (growing in the number-of-revisions direction), they can include huge binary assets that...