You're reading from AWS Automation Cookbook Continuous Integration and Continuous Deployment using AWS services

Product type Paperback

Published in Nov 2017

Publisher Packt

ISBN-13 9781788394925

Length 388 pages

Edition 1st Edition

Tools

AWS

Concepts

Cloud Computing

Author (1):

Nikit Swaraj


Introducing VCS and Git

A version control system (VCS) is a category of software development tool that helps a software team manage changes to source code over time. A VCS keeps track of every modification to the code in a database. If a mistake is made, the developer can compare earlier versions of the code and fix the mistake while minimizing disruption to the rest of the team.

The most widely used VCS in the world is Git. It's a mature and actively maintained open source project developed by Linus Torvalds in 2005.

What is VCS?

A version control system (VCS) is a system that records changes to a file (or a set of files) over time so that we can recall a specific version whenever we want. In this book, we mostly work with the source code of software or applications, but that does not mean that we can track version changes only for source code. If you are a graphic designer or work on infrastructure automation and want to keep every version of an image layout or configuration file, a VCS is a good tool for the job.

Why VCS?

There are lots of benefits to using VCS for a project. A few of them are mentioned here:

Collaboration: Anyone in the team can work on any file of the project at any time. There is never a question of where the latest version of a file or of the whole project is. It is in a common, central place: your version control system.
Storing versions properly: Saving a version of a file or an entire project after making changes is an essential habit, but without a VCS it becomes tedious and error-prone. With a VCS, we can save the entire project and give each version a name. We can also describe the project, and what has changed in the current version compared to the previous one, in a README file.
Restoring previous versions: If you break your current code, you can undo the changes in a few minutes.

There are many more benefits to using a VCS while developing a project.

Types of VCS

There are three main types of VCS:

Local version control system: In a local VCS, all the changes to a file are kept on the local machine, in a database that holds every change to the files under revision control. An example is the Revision Control System (RCS).
Centralized version control system: In a centralized VCS, we can collaborate with other developers on different machines. A single server contains all the versioned files, and any number of clients can check out files from that server. An example is Subversion (SVN).
Distributed version control system: In a distributed VCS, the client not only checks out the latest version of the files but also mirrors the whole repository. If the server that the clients were collaborating through dies, any of the client repositories can be copied back to the server to restore it. An example is Git.

What is Git?

Git is a distributed VCS that grew out of the maintenance needs of the Linux kernel. The Linux kernel development community had been using a proprietary distributed version control system (DVCS) called BitKeeper. In 2005, the relationship between the community and the company behind BitKeeper broke down, which led the Linux developers (in particular Linux creator Linus Torvalds) to build their own DVCS tool, Git. They took a radical approach that sets it apart from other VCSs such as CVS and SVN.

Why Git over other VCSs?

It wouldn't be appropriate to say Git is better than SVN or any other VCS. It depends on the scenario and the requirements of the project. But nowadays, most enterprises have chosen Git as their VCS for the following reasons:

Distributed nature: Git has been designed as a distributed VCS, which means every user can have a complete copy of the repository data stored locally, so they can access the file history extremely fast. It also allows full functionality when the user is not connected to the network, whereas in a centralized VCS, such as SVN, only the central repository has the complete history. This means the user needs to connect to the network to access the history from the central repository.
Branch handling: This is one of the major differences. Git has built-in support for branches and strongly encourages developers to use them. SVN can also have branches, but branching is not as central to its usual practice and workflow. In Git, we can have multiple branches of a repository, and in each branch, you can carry out development, test it, and then merge, so the history forms a tree. In SVN, everything is linear; whenever you add, delete, or modify any file, the revision number simply increments by one. Even if you roll back some changes in SVN, the rollback is recorded as a new revision.
Smaller space requirements: Git repositories and working directories are typically smaller than their SVN counterparts.
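As a rough illustration of the branch handling described above, the following sketch creates a throwaway repository, develops on a separate branch, and merges it back. The temporary directory, the branch name feature-greeting, the file names, and the identity passed with -c are all made up for this example.

```shell
#!/bin/sh
set -e

# Work in a throwaway directory so nothing else on the machine is touched.
repo=$(mktemp -d)
cd "$repo"
git init -q

# Hypothetical identity, supplied per command so the global config is not modified.
ident="-c user.name=awsstar -c user.email=awsstar@foo.com"

# First commit on the default branch (its name depends on the local Git setup).
echo 'base' > base.txt
git add base.txt
git $ident commit -q -m "base commit"
main_branch=$(git rev-parse --abbrev-ref HEAD)

# Create a branch, commit on it, then switch back to the default branch.
git checkout -q -b feature-greeting
echo 'hello' > greeting.txt
git add greeting.txt
git $ident commit -q -m "add greeting"
git checkout -q "$main_branch"

# Merge the branch; the file created on it now exists on the default branch too.
git $ident merge -q feature-greeting
git log --oneline --graph --all
```

The default branch name is read back with git rev-parse rather than hard-coded, because it may be master or something else depending on the Git version and configuration.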

Features of Git

The following are some of the features of Git:

Captures snapshots, not differences: This is a major difference between Git and other VCSs. Most other VCSs record revisions as a list of file-based changes; that is, they keep a set of files together with the changes made to each file over time. Git accounts for changes in another way. Every time you commit, or save the state of your project in Git, it takes a snapshot of what all your files look like at that moment and stores a reference to that snapshot. If a file has not changed, Git does not store the file again; it stores a link to the previous identical file it has already stored.
Data integrity: Before any data is stored in a Git repository, it is first checksummed, and is then referred to by that checksum. This means that the contents of a file cannot change without Git knowing about it. The mechanism Git uses for checksumming is the SHA-1 hash.
A SHA-1 hash is a 40-character hexadecimal string that looks something like this:
```
b52af1db10a8c915cfbb9c1a6c9679dc47052e34
```
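To make the checksum idea concrete, here is a small sketch of how the ID of a file's content can be reproduced by hand, assuming Git's default SHA-1 object format: Git hashes a short header (the word blob, the size in bytes, and a NUL byte) followed by the content itself. The helper name blob_id and the sample strings are made up for this example, and sha1sum from GNU coreutils is assumed to be available.

```shell
#!/bin/sh
set -e

# blob_id CONTENT: print the SHA-1 that Git would assign to a file
# containing CONTENT followed by a newline (SHA-1 object format assumed).
blob_id() {
    # Size in bytes of the content, including the trailing newline.
    size=$(printf '%s\n' "$1" | wc -c | tr -d ' ')
    # Hash "blob <size>" + NUL + content, and keep only the hex digest.
    printf 'blob %s\0%s\n' "$size" "$1" | sha1sum | cut -d' ' -f1
}

id_a=$(blob_id 'HelloWorld')
id_b=$(blob_id 'HelloWorld!')

# Two contents that differ by one character get unrelated IDs.
echo "$id_a"
echo "$id_b"
```

If Git is installed, `printf 'HelloWorld\n' | git hash-object --stdin` should print the same value as the first line.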

States and areas: Git views every file as being in one of three main states:
- Modified: The file has been changed, but the change has not yet been written or committed to the database.
- Committed: The source code and related data are safely stored in your local database or machine.
- Staged: The modified file has been marked in its current version and is ready to go into the next commit.
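The three states can be observed with git status. The following is a minimal sketch in a throwaway repository; the file name notes.txt and the identity passed with -c are placeholders chosen for this example.

```shell
#!/bin/sh
set -e

repo=$(mktemp -d)
cd "$repo"
git init -q

# Commit an initial version (hypothetical identity passed per command).
echo 'v1' > notes.txt
git add notes.txt
git -c user.name=awsstar -c user.email=awsstar@foo.com commit -q -m "first version"

# Committed: the working tree matches the repository, so nothing is listed.
committed=$(git status --short)

# Modified: the file differs from the last commit but is not staged yet.
echo 'v2' > notes.txt
modified=$(git status --short)

# Staged: the change is recorded in the index, ready for the next commit.
git add notes.txt
staged=$(git status --short)

printf 'committed: [%s]\nmodified:  [%s]\nstaged:    [%s]\n' \
    "$committed" "$modified" "$staged"
```

In the short format, the first status column reflects the staging area and the second reflects the working tree, so the M moves from the second column to the first once the file is staged.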

How to do it...

Here are the steps and commands that will guide you through installing and setting up Git and creating a repository on a very popular hosted Git service, GitHub.

Installation of Git and its implementation using GitHub

To use Git, we first have to install the Git package on our system:
- For Red Hat-based distributions (RHEL/CentOS):

    # yum install git

- For Debian-based distributions (Debian/Ubuntu):

    # apt-get install git

Configure your identity with Git, because every Git commit records this information. For example, the following commands set the username to awsstar and the email to awsstar@foo.com:

    # git config --global user.name "awsstar"
    # git config --global user.email "awsstar@foo.com"

Check your settings. The output should include the username and email address configured above:

    # git config --list
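When only one or two values matter, they can be read back individually instead of scanning the full list. The sketch below does this in a throwaway repository and writes the identity without --global, so it is stored in that repository's own configuration and the machine-wide settings stay untouched; the temporary directory is an assumption of this example.

```shell
#!/bin/sh
set -e

repo=$(mktemp -d)
cd "$repo"
git init -q

# Without --global, the values are written to this repository's .git/config only.
git config user.name "awsstar"
git config user.email "awsstar@foo.com"

# Read back single keys instead of scanning the full --list output.
name=$(git config --get user.name)
email=$(git config --get user.email)
echo "$name <$email>"
```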

Now, let's try to create a repository on GitHub:
- Open www.github.com in your web browser and log in with your credentials
- Click on New repository

We then get a form for creating the repository. We have to enter a Repository name and a Description of the repository. After that, we select Public or Private based on our requirements. With Public, anyone can see the repository, but we pick who can commit; with Private, we pick who can see and who can commit, so by default it is not visible to anyone else. Finally, we choose to initialize the repository with a README, where we can give a detailed description of the project, and click on Create Repository.

Once we have a repository, HelloWorld, let's clone it to our local machine and add some program files. Cloning a repository means creating a local copy of it, and it can be done as follows:
- Fetch the Git URL by clicking on Clone or Download (https://github.com/awsstar/HelloWorld.git).

- Now, clone the URL:

    root@awsstar:~# git clone https://github.com/awsstar/HelloWorld.git
    Cloning into 'HelloWorld'...
    remote: Counting objects: 4, done.
    remote: Compressing objects: 100% (3/3), done.
    remote: Total 4 (delta 0), reused 0 (delta 0), pack-reused 0
    Unpacking objects: 100% (4/4), done.
    Checking connectivity... done.
    root@awsstar:~# ls
    HelloWorld
    root@awsstar:~# cd HelloWorld
    root@awsstar:~/HelloWorld# ls
    LICENSE README.md
    root@awsstar:~/HelloWorld#

We have the HelloWorld repository on our local machine. So, let's add index.html and push it back to the repository. Create a file, index.html, and write HelloWorld inside it:

    root@awsstar:~/HelloWorld# echo '<h1> HelloWorld </h1>' > index.html

The git status command checks the current status and reports whether there is anything left to commit or not:

    root@awsstar:~/HelloWorld# git status
    On branch master
    Your branch is up-to-date with 'origin/master'.
    Untracked files:
      (use "git add <file>..." to include in what will be committed)
        index.html
    nothing added to commit but untracked files present (use "git add" to track)

Now, to stage the changes so that they are included in the next commit, we have to enter this command:

    root@awsstar:~/HelloWorld# git add .

To store the current contents of the index in a new commit, along with a log message from the user describing the changes, we need to enter this command:

    root@awsstar:~/HelloWorld# git commit -m "index.html added"
    [master 7be5f57] index.html added
     1 file changed, 1 insertion(+)
     create mode 100644 index.html

Push your local changes to the remote repository:

    root@awsstar:~/HelloWorld# git push origin master
     Username for 'https://github.com': awsstar
     Password for 'https://awsstar@github.com':
     Counting objects: 3, done.
     Delta compression using up to 4 threads.
     Compressing objects: 100% (2/2), done.
     Writing objects: 100% (3/3), 327 bytes | 0 bytes/s, done.
     Total 3 (delta 0), reused 0 (delta 0)
     To https://github.com/awsstar/HelloWorld.git
     a0a82b2..7be5f57 master -> master

On GitHub, we can now see that index.html is in our repository.
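The same clone, commit, and push round trip can be rehearsed offline before touching GitHub. In the sketch below, a local bare repository stands in for the remote; the temporary paths and the identity passed with -c are assumptions of this example, not part of the recipe.

```shell
#!/bin/sh
set -e

work=$(mktemp -d)

# A bare repository plays the role of the GitHub remote in this sketch.
git init -q --bare "$work/HelloWorld.git"

# Clone it (the warning about cloning an empty repository is silenced).
git clone -q "$work/HelloWorld.git" "$work/HelloWorld" 2>/dev/null
cd "$work/HelloWorld"

# Add index.html, commit it, and push the current branch to the remote.
echo '<h1> HelloWorld </h1>' > index.html
git add .
git -c user.name=awsstar -c user.email=awsstar@foo.com commit -q -m "index.html added"
git push -q origin HEAD

# List the files of the pushed commit as stored in the stand-in remote.
pushed=$(git --git-dir="$work/HelloWorld.git" ls-tree --name-only "$(git rev-parse HEAD)")
echo "$pushed"
```

Pushing HEAD rather than a named branch keeps the sketch independent of whether the local default branch is called master or something else.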