Whether you work in Python or any other language, you should use a version control system. A version control system is a tool that records changes to files over time. This allows a programmer to revert to an earlier version of a file and to identify bugs more easily. You can test new ideas without fear of breaking your current code, and your team can work using a predefined workflow without stepping on each other's toes. Git was developed by Linus Torvalds, the father of Linux. It's decentralized, lightweight, and has great features that get the job done the right way.
Version control with Git
Installing Git
Installing Git is very simple. Go to http://www.git-scm.com/downloads and click on the operating system (OS) that you are running. An installer will begin to download, and it will walk you through the basic installation process.
Git on Windows
Git was originally developed solely for Unix-like OSes (for example, Linux and macOS). Consequently, using Git on Windows is not seamless. During the installation, the installer will ask whether you want to install Git alongside the normal Windows Command Prompt. Do not pick this option. Choose the default option that will install a new command processor on your system named Bash (Bourne-again shell), which is the same command processor that many Unix-like systems use. Bash is much more powerful than the default Windows command line, and this is what we will be using for all the examples in this book.
Git basics
Git is a very complex tool; only the basics that are needed for this book will be covered in this section.
Git does not track your changes automatically. In order for Git to run properly, we have to give it the following information:
- Which folders to track
- When to save the state of the code
- What to track and what not to track
Before we can do anything, we have to tell Git to initialize a new Git repository in our directory. Run the following command in your terminal:
$ git init
Git can now track changes in our project, although it will not track any file until we add it. We can see the status of our tracked files, and of any files that are not tracked, by typing the following command:
$ git status
Now we can save our first commit, which is a snapshot of our code at the time that we run the commit command:
# In Bash, comments are marked with a #, just like Python
# Add any files that have changes and you wish to save in this
# commit
$ git add main.py
# Commit the changes, add in your commit message with -m
$ git commit -m "Our first commit"
Now, at any point in the future, we can return to this point in our project. Adding files that are to be committed is called staging files in Git. Remember that you should only stage files if you are ready to commit them. Once a file is staged, any further changes to it are not staged until you add the file again. For an example of more advanced Git usage, add any text to your main.py file with your text editor and then run the following:
# To see the changes from the last commit
$ git diff
# To see the history of your changes
$ git log
# As an example, we will stage main.py
# and then remove any added files from the stage
$ git add main.py
$ git status
$ git reset HEAD main.py
# After any complicated changes, be sure to run status
# to make sure everything went well
$ git status
# Let's delete the changes to main.py, reverting to its state at the
# last commit
# This only works as described on files that aren't staged; a staged
# file is restored to its staged version instead
$ git checkout -- main.py
Your terminal should look something like the following:
Note that in the preceding example I have modified the main.py file by adding the comment # Changed to show the git diff command.
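To make the relationship between the working directory, the staging area, and the last commit more concrete, the following is a minimal Python sketch of the commands above. It is only a toy model under simplifying assumptions: the class TinyRepo and its methods are hypothetical names invented for illustration, file contents are plain strings, and nothing here calls Git itself.

```python
class TinyRepo:
    """Toy model of Git's three file states (not real Git)."""

    def __init__(self, committed):
        # Last commit, staging area, and working directory start out identical.
        self.head = dict(committed)
        self.index = dict(committed)
        self.working = dict(committed)

    def add(self, name):
        # Like `git add <file>`: copy the working copy into the staging area.
        self.index[name] = self.working[name]

    def reset(self, name):
        # Like `git reset HEAD <file>`: unstage, keeping the working copy.
        self.index[name] = self.head[name]

    def checkout(self, name):
        # Like `git checkout -- <file>`: overwrite the working copy
        # with the staged version.
        self.working[name] = self.index[name]

    def status(self, name):
        # Report whether the file has staged and/or unstaged changes.
        staged = self.index[name] != self.head[name]
        unstaged = self.working[name] != self.index[name]
        return {"staged": staged, "unstaged": unstaged}


repo = TinyRepo({"main.py": "print('hello')\n"})
repo.working["main.py"] += "# Changed to show the git diff command\n"

repo.add("main.py")
after_add = repo.status("main.py")

repo.reset("main.py")
after_reset = repo.status("main.py")

repo.checkout("main.py")
after_checkout = repo.status("main.py")

print(after_add, after_reset, after_checkout)
```

In this model, the edit is staged after add, becomes an unstaged change again after reset, and disappears entirely after checkout, which mirrors what git status reports at each step.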
One important step to include in every Git repository is a .gitignore file. This file tells Git which files to ignore. This way you can safely add and commit all your files. The following are some common files that you can ignore:
- Python's byte code files (*.pyc)
- Databases (especially, for our examples, SQLite database files) (*.db)
- Secrets (never push secrets such as passwords and keys to your repositories)
- IDE metadata files (.idea)
- The Virtualenv directory (env or venv)
Here's a simple example of a .gitignore file:
*.pyc
*.pem
*.pub
*.tar.gz
*.zip
*.sql
*.db
secrets.txt
/tmp
/build/*
.idea/*
.idea
env
venv
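As a rough illustration of how the wildcard patterns in this file select names, here is a small Python sketch that uses the standard library's fnmatch module. This is an approximation, not Git's matching algorithm: it only checks each path component against simple patterns, and the helper is_ignored and the sample paths are made up for the example. To check a real repository, Git's own git check-ignore command is the authority.

```python
from fnmatch import fnmatchcase

# A subset of the patterns from the example file above.
PATTERNS = ["*.pyc", "*.pem", "*.db", "secrets.txt", ".idea", "env", "venv"]


def is_ignored(path, patterns=PATTERNS):
    """Return True if any component of path matches any pattern.

    Approximation only: real .gitignore rules also handle negation (!),
    anchoring with /, and ** wildcards, none of which are modeled here.
    """
    parts = path.split("/")
    return any(
        fnmatchcase(part, pattern) for part in parts for pattern in patterns
    )


samples = [
    "main.py",
    "main.pyc",
    "database.db",
    "venv/bin/python",
    ".idea/workspace.xml",
]
for path in samples:
    print(path, "->", "ignored" if is_ignored(path) else "not ignored")
```

Only main.py is reported as not ignored here; every other sample either matches a wildcard pattern or sits inside an ignored directory.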
Now we can safely add all the files to Git and commit them:
$ git add --all
$ git status
$ git commit -a -m "Added gitignore and all the projects missing files"
Git's checkout command is rather advanced for this simple introduction, but it is used to move Git's HEAD pointer, which refers to the current location of our code in the history of our project. This will be shown in the next example.
Now, if we wish to see the code in a previous commit, we should first run the following command:
$ git log
commit cd88be37f12fb596be743ccba7e8283dd567ac05 (HEAD -> master)
Author: Daniel Gaspar
Date: Sun May 6 16:59:46 2018 +0100
Added gitignore and all the projects missing files
commit beb471198369e64a8ee8f6e602acc97250dce3cd
Author: Daniel Gaspar
Date: Fri May 4 19:06:57 2018 +0100
Our first commit
The string of characters after the word commit in each log entry is the hash of that commit, and beb4711 is the first seven characters of the hash of our first commit. The hash is the unique identifier of the commit, and we can use it, or an unambiguous abbreviation of it, to return to the saved state. Now, to take the project back to the previous state, run the following command:
$ git checkout beb4711
Your Git project is now in a special state, known as a detached HEAD, where any commits you make do not belong to any branch and do not affect the commits that were made after the one you checked out. This state is mainly for viewing old code. To return to the normal mode of Git, run the following command:
$ git checkout master
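The hashes shown by git log are SHA-1 digests of the objects that Git stores. As a hedged illustration, the sketch below computes the identifier that Git assigns to a file's contents (a blob object) by hashing a short header followed by the bytes, and then abbreviates it to seven characters in the same way that beb4711 abbreviates a commit hash. Commit hashes are derived in the same manner from the commit's own text, which is not reproduced here. The function names git_blob_id and abbreviate are hypothetical, and the sketch assumes a repository that uses the default SHA-1 object format.

```python
import hashlib


def git_blob_id(content: bytes) -> str:
    """Return the SHA-1 object ID Git would give a blob with this content."""
    # Git hashes a header of the form "blob <size>\0" followed by the bytes.
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()


def abbreviate(object_id: str, length: int = 7) -> str:
    """Shorten a full hash to its leading characters, as Git often shows it."""
    return object_id[:length]


full_id = git_blob_id(b"")  # the contents of an empty file
print(full_id)              # e69de29bb2d1d6434b8b29ae775ad8c2e48c5391
print(abbreviate(full_id))  # e69de29
```

The same value can be obtained from Git itself by running git hash-object on an empty file.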
Git branches and flow
Source control branches are an important feature that works well in team projects. A developer can create a new line of development from a specific point in time, revision, or tag. In this way, developing new features, creating releases, and making bugfixes or hotfixes can be done safely and subjected to team review and/or automatic integration tools (such as tests, code coverage, and lint tools). A branch can be merged with other branches until it finally reaches the main line of code, called the master branch.
But let's get our hands on a practical exercise. Let's say that we want to develop a new feature. Our first chapter example displays the traditional "Hello World" message, but we want it to say "good morning" to the users. First, we create a new branch called feature/good-morning from the master branch; for now, it is a copy of master, as shown in the following code:
# Display our branches
$ git branch
* master
# Create a branch called feature/good-morning from master
$ git branch feature/good-morning
# Display our branches again
$ git branch
feature/good-morning
* master
# Check out the new feature/good-morning branch
$ git checkout feature/good-morning
This can be shortened to the following:
$ git checkout -b feature/good-morning master
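A branch in Git is essentially a name that points at a commit, and HEAD records which branch is currently checked out. The toy Python sketch below mirrors the commands above under that assumption. The class BranchTable and its method names are hypothetical, and the abbreviated hash is borrowed from the earlier git log output.

```python
class BranchTable:
    """Toy model: branch names pointing at commits, plus a HEAD marker."""

    def __init__(self, name, commit):
        self.tips = {name: commit}
        self.head = name

    def branch(self, name, start=None):
        # Like `git branch <name> [<start>]`: the new name points at the
        # same commit as the starting branch; HEAD does not move.
        self.tips[name] = self.tips[start or self.head]

    def checkout(self, name):
        # Like `git checkout <name>`: HEAD now refers to that branch.
        if name not in self.tips:
            raise KeyError("unknown branch: " + name)
        self.head = name

    def checkout_b(self, name, start=None):
        # Like `git checkout -b <name> [<start>]`: create, then switch.
        self.branch(name, start)
        self.checkout(name)


table = BranchTable("master", "cd88be3")
table.checkout_b("feature/good-morning", "master")

print(table.head)          # feature/good-morning
print(sorted(table.tips))  # ['feature/good-morning', 'master']
print(table.tips["feature/good-morning"] == table.tips["master"])  # True
```

Creating the branch copies nothing but a pointer, which is why the new branch starts out as an exact copy of master.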
Now let's change our code to display good morning to the visitors of a certain URL, along with their names. To do this, we change main.py, which currently looks like the following code:
@app.route('/')
def home():
    return '<h1>Hello World!</h1>'
We change main.py to the following:
@app.route('/<username>')
def home(username):
    return '<h1>Good Morning %s</h1>' % username
Let's look at what we have done:
$ git diff
diff --git a/main.py b/main.py
index 3e0aacc..1a930d9 100755
--- a/main.py
+++ b/main.py
@@ -5,9 +5,9 @@ app = Flask(__name__)
 app.config.from_object(DevConfig)
 # Changed to show the git diff command
-@app.route('/')
-def home():
-    return '<h1>Hello World!</h1>'
+@app.route('/<username>')
+def home(username):
+    return '<h1>Good Morning %s</h1>' % username
 if __name__ == '__main__':
     app.run()
Looks good. Let's stage the change and commit it, as shown in the following code:
$ git add main.py
$ git commit -m "Display good morning because its nice"
[feature/good-morning d4f7fb8] Display good morning because its nice
1 file changed, 3 insertions(+), 3 deletions(-)
Now, if we were working as part of a team, or if our work was open source (or if we just wanted to back up our work), we should upload (push) our code to a centralized remote origin. One way of doing this is to push our code to a hosting service, such as Bitbucket or GitHub, and then open a pull request to the master branch. This pull request will show our changes. It may then need approval from other team members, and it can make use of the many other features that these services provide.
For our example, let's just merge into master, as shown in the following code:
# Get back to the master branch
$ git checkout master
Switched to branch 'master'
$ git log
commit 139d121d6ecc7508e1017f364e6eb2e4c5f57d83 (HEAD -> master)
Author: Daniel Gaspar
Date: Fri May 4 23:32:42 2018 +0100
Our first commit
# Merge our feature into the master branch
$ git merge feature/good-morning
Updating 139d121..5d44a43
Fast-forward
main.py | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
$ git log
commit 5d44a4380200f374c879ec1f7bda055f31243263 (HEAD -> master, feature/good-morning)
Author: Daniel Gaspar
Date: Fri May 4 23:34:06 2018 +0100
Display good morning because its nice
commit 139d121d6ecc7508e1017f364e6eb2e4c5f57d83
Author: Daniel Gaspar <daniel.gaspar@miniclip.com>
Date: Fri May 4 23:32:42 2018 +0100
Our first commit
As you can see from the output, Git performs a fast-forward merge by default when it can, that is, when master has no new commits of its own since the branch was created. If we wanted to keep an extra commit log message that mentions the merge itself, then we could have used the --no-ff flag on the git merge command. This flag disables fast-forwarding and always creates a merge commit.
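Whether a fast-forward is possible depends only on the shape of the history: the tip of the branch being merged into must be an ancestor of the tip of the branch being merged. The following Python sketch models that check on a toy, strictly linear history. The dictionary PARENTS and the function names are assumptions made for illustration, and the abbreviated hashes are borrowed from the output above.

```python
# Toy history: each commit maps to its single parent (None for the first).
PARENTS = {
    "139d121": None,       # "Our first commit"
    "5d44a43": "139d121",  # "Display good morning because its nice"
}


def is_ancestor(ancestor, commit, parents=PARENTS):
    """Walk back from commit and report whether ancestor is reached."""
    while commit is not None:
        if commit == ancestor:
            return True
        commit = parents[commit]
    return False


def can_fast_forward(target_tip, source_tip, parents=PARENTS):
    """A merge of source into target can fast-forward when the target
    branch has no commits that the source branch lacks."""
    return is_ancestor(target_tip, source_tip, parents)


branches = {"master": "139d121", "feature/good-morning": "5d44a43"}
if can_fast_forward(branches["master"], branches["feature/good-morning"]):
    # Fast-forward: just move the master pointer; no merge commit is made.
    branches["master"] = branches["feature/good-morning"]

print(branches["master"])  # 5d44a43
```

If master had gained a commit of its own in the meantime, the check would fail and Git would need to create a merge commit instead.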
Now imagine that we regret our change and want to revert the feature that we have just created back to an earlier version. To do this, we can revert the most recent commit, which HEAD points to, using the following code:
$ git revert HEAD
With Git, you can actually delete your commits, but this is considered a really bad practice. Note that the revert command did not delete our merge, but created a new commit with the reverted changes. It's considered a good practice not to rewrite the past.
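The difference between reverting and deleting can be sketched with a toy history in Python. This is a simplified model, not Git's implementation: the history is a plain list of snapshots, the helper revert_last is a hypothetical name, and only the case of reverting the most recent commit is covered.

```python
# Toy history: a list of (message, file content) snapshots, oldest first.
history = [
    ("Our first commit", "<h1>Hello World!</h1>"),
    ("Display good morning because its nice", "<h1>Good Morning %s</h1>"),
]


def revert_last(commits):
    """Return a new history with an extra commit that restores the
    snapshot from before the latest commit. Nothing is removed."""
    message, _ = commits[-1]
    _, previous_content = commits[-2]
    return commits + [('Revert "%s"' % message, previous_content)]


new_history = revert_last(history)

print(len(new_history))    # 3
print(new_history[-1][1])  # <h1>Hello World!</h1>
```

The two original commits are still present in new_history, so anyone who already has them sees a consistent past with one more commit on top.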
What we have shown is a simple feature branch workflow. With big teams or projects, more complex workflows are normally adopted to better isolate features, fixes, and releases, and to keep a stable line of code. This is what the git-flow process proposes.
Now that we have a version control system, we are ready to cover Python's package management system.