Search icon CANCEL
Arrow left icon
Explore Products
Best Sellers
New Releases
Books
Videos
Audiobooks
Learning Hub
Conferences
Free Learning
Arrow right icon
Arrow up icon
GO TO TOP
Git Version Control Cookbook

You're reading from   Git Version Control Cookbook Leverage version control to transform your development workflow and boost productivity

Arrow left icon
Product type Paperback
Published in Jul 2018
Publisher
ISBN-13 9781789137545
Length 354 pages
Edition 2nd Edition
Tools
Arrow right icon
Authors (4):
Arrow left icon
Aske Olsson Aske Olsson
Author Profile Icon Aske Olsson
Aske Olsson
Emanuele Zattin(EUR) Emanuele Zattin(EUR)
Author Profile Icon Emanuele Zattin(EUR)
Emanuele Zattin(EUR)
Kenneth Geisshirt Kenneth Geisshirt
Author Profile Icon Kenneth Geisshirt
Kenneth Geisshirt
Rasmus Voss Rasmus Voss
Author Profile Icon Rasmus Voss
Rasmus Voss
Arrow right icon
View More author details
Toc

Table of Contents (14) Chapters Close

Preface 1. Navigating Git FREE CHAPTER 2. Configuration 3. Branching, Merging, and Options 4. Rebasing Regularly and Interactively, and Other Use Cases 5. Storing Additional Information in Your Repository 6. Extracting Data from the Repository 7. Enhancing Your Daily Work with Git Hooks, Aliases, and Scripts 8. Recovering from Mistakes 9. Repository Maintenance 10. Patching and Offline Sharing 11. Tips and Tricks 12. Git Providers, Integrations, and Clients 13. Other Books You May Enjoy

Git's objects

Now, since you know that Git stores every commit as a full tree state or snapshot, let's take a closer look at the object's Git store in the repository.

Git's object storage is a key-value storage, the key being the ID of the object and the value being the object itself. The key is an SHA-1 hash of the object, with some additional information, such as size. There are four types of objects in Git, as well as branches (which are not objects, but which are important) and the special HEAD pointer that refers to the branch/commit currently being checked out. The four object types are as follows:

  • Files, or blobs as they are also called in the Git context
  • Directories, or trees in the Git context
  • Commits
  • Tags

We will start by looking at the most recent commit object in the repository we just cloned, keeping in mind that the special HEAD pointer points to the branch that is currently being checked out.

Getting ready

To view the objects in the Git database, we first need a repository to be examined. For this recipe, we will clone an example repository in the following location:

$ git clone https://github.com/PacktPublishing/Git-Version-Control-Cookbook-Second-Edition.git
$ cd Git-Version-Control-Cookbook-Second-Edition

Now you are ready to look at the objects in the database. We will start by looking first at the commit object, followed by the trees, the files, and finally, the branches and tags.

How to do it...

Let's take a closer look at the object's Git stores in the repository.

The commit object

The Git's special HEAD object always points to the current snapshot/commit, so we can use that as the target for our request of the commit that we want to have a look at:

$ git cat-file -p HEAD
tree 34fa038544bcd9aed660c08320214bafff94150b
parent 5c662c018efced42ca5e9cce709787c40a849f34
author John Doe <john.doe@example.com> 1386933960 +0100
committer John Doe <john.doe@example.com> 1386941455 +0100

This is the subject line of the commit message. It should be followed by a blank line and then the body, which is this text. Here, you can use multiple paragraphs to explain your commit. It's like an email with a subject and a body to try to attract people's attention to the subject.

The cat-file command with the -p option prints the object given on the command line; in this case, HEAD, points to master, which, in turn, points to the most recent commit on the branch.

We can now see the commit object, consisting of the root tree (tree), the parent commit object's ID (parent), the author and timestamp information (author), the committer and timestamp information (committer), and the commit message.

The tree object

To see the tree object, we can run the same command on the tree, but with the tree ID (34fa038544bcd9aed660c08320214bafff94150b) as the target:

$ git cat-file -p 34fa038544bcd9aed660c08320214bafff94150b 
100644 blob f21dc2804e888fee6014d7e5b1ceee533b222c15    README.md
040000 tree abc267d04fb803760b75be7e665d3d69eeed32f8    a_sub_directory
100644 blob b50f80ac4d0a36780f9c0636f43472962154a11a    another-file.txt
100644 blob 92f046f17079aa82c924a9acf28d623fcb6ca727    cat-me.txt
100644 blob bb2fe940924c65b4a1cefcbdbe88c74d39eb23cd    hello_world.c

We can also specify that we want the tree object from the commit pointed to by HEAD by specifying git cat-file -p HEAD^{tree}, which would give the same results as the previous command. The special notation HEAD^{tree} means that from the reference given, HEAD recursively dereferences the object at the reference until a tree object is found.

The first tree object is the root tree object found from the commit pointed to by the master branch, which is pointed to by HEAD. A generic form of the notation is <rev>^<type>, and will return the first object of <type>, searching recursively from <rev>.

From the tree object, we can see what it contains: the file type/permissions, type (tree/blob), ID, and pathname:

Type/

Permissions

Type

ID/SHA-1

Pathname

100644

blob

f21dc2804e888fee6014
d7e5b1ceee533b222c15

README.md

040000

tree

abc267d04fb803760b75
be7e665d3d69eeed32f8

a_sub_directory

100644

blob

b50f80ac4d0a36780f9c
0636f43472962154a11a

another-file.txt

100644

blob

92f046f17079aa82c924
a9acf28d623fcb6ca727

cat-me.txt

100644

blob

bb2fe940924c65b4a1ce
fcbdbe88c74d39eb23cd

hello-world.c

The blob object

Now, we can investigate the blob (file) object. We can do this using the same command, giving the blob ID as the target for the cat-me.txt file:

$ git cat-file -p 92f046f17079aa82c924a9acf28d623fcb6ca727

The content of the file is cat-me.txt.

Not really that exciting, huh?

This is simply the content of the file, which we can also get by running a normal cat cat-me.txt command. So, the objects are tied together, blobs to trees, trees to other trees, and the root tree to the commit object, all connected by the SHA-1 identifier of the object.

The branch object

The branch object is not really like any other Git objects; you can't print it using the cat-file command as we can with the others (if you specify the -p pretty print, you'll just get the commit object it points to), as shown in the following code:

$ git cat-file master
usage: git cat-file (-t|-s|-e|-p|<type>|--textconv) <object>
or: git cat-file (--batch|--batch-check) < <list_of_objects>
    
<type> can be one of: blob, tree, commit, tag.
...
$ git cat-file -p master
tree 34fa038544bcd9aed660c08320214bafff94150b
parent a90d1906337a6d75f1dc32da647931f932500d83
...

Instead, we can take a look at the branch inside the .git folder where the whole Git repository is stored. If we open the text file .git/refs/heads/master, we can actually see the commit ID that the master branch points to. We can do this using cat, as follows:

$ cat .git/refs/heads/master
13dcada077e446d3a05ea9cdbc8ecc261a94e42d 

We can verify that this is the latest commit by running git log -1:

$ git log -1
commit 34acc370b4d6ae53f051255680feaefaf7f7850d (HEAD -> master, origin/master, origin/HEAD)
Author: John Doe <john.doe@example.com>
Date:   Fri Dec 13 12:26:00 2013 +0100
    
This is the subject line of the commit message
...

We can also see that HEAD is pointing to the active branch by using cat with the .git/HEAD file:

$ cat .git/HEAD
ref: refs/heads/master

The branch object is simply a pointer to a commit, identified by its SHA-1 hash.

The tag object

The last object to be analyzed is the tag object. There are three different kinds of tag: a lightweight (just a label) tag, an annotated tag, and a signed tag. In the example repository, there are two annotated tags:

$ git tag
v0.1
v1.0

Let's take a closer look at the v1.0 tag:

$ git cat-file -p v1.0
object f55f7383b57ad7c11cf56a7c55a8d738af4741ce
type commit
tag v1.0
tagger John Doe <john.doe@example.com> 1526017989 +0200

We got the hello world C program merged, let's call that a release 1.0

As you can see, the tag consists of an object—which, in this case, is the latest commit on the master branch—the object's type (commits, blobs, and trees can be tagged), the tag name, the tagger and timestamp, and finally the tag message.

How it works...

The Git command git cat-file -p will print the object given as an input. Normally, it is not used in everyday Git commands, but it is quite useful to investigate how it ties the objects together.

We can also verify the output of git cat-file by rehashing it with the Git command git hash-object; for example, if we want to verify the commit object at HEAD (34acc370b4d6ae53f051255680feaefaf7f7850d), we can run the following command:

$ git cat-file -p HEAD | git hash-object -t commit --stdin
13dcada077e446d3a05ea9cdbc8ecc261a94e42d

If you see the same commit hash as HEAD pointing towards you, you can verify whether it is correct using git log -1.

There's more...

There are many ways to see the objects in the Git database. The git ls-tree command can easily show the content of trees and subtrees, and git show can show the Git objects, but in a different way.

You have been reading a chapter from
Git Version Control Cookbook - Second Edition
Published in: Jul 2018
Publisher:
ISBN-13: 9781789137545
Register for a free Packt account to unlock a world of extra content!
A free Packt account unlocks extra newsletters, articles, discounted offers, and much more. Start advancing your knowledge today.
Unlock this book and the full library FREE for 7 days
Get unlimited access to 7000+ expert-authored eBooks and videos courses covering every tech area you can think of
Renews at €18.99/month. Cancel anytime