Configuring GitLab CI/CD pipelines
We’ve mentioned that you can configure your project’s CI/CD pipeline to define its stages, jobs, and commands. But how do you do that? All CI/CD pipeline configuration happens within a file called .gitlab-ci.yml, which lives in the root of your project’s repository. Look through any public GitLab project, and you’re sure to see a file with that name that determines what happens in that project’s pipeline.
Every .gitlab-ci.yml file uses a domain-specific language that consists of keywords, values, and some syntactical glue. Some keywords define stages and jobs within those stages. Other keywords configure jobs to do different things within the pipeline. Still other keywords set variables, specify Docker images for jobs, or affect the overall pipeline in various ways. This domain-specific language is rich enough to let you do just about anything you’d like in your CI/CD pipelines, but not so rich as to be overwhelming (at least, once you’ve had some experience writing and reading these CI/CD configuration files).
There are about 30 keywords available to use in a .gitlab-ci.yml file. Rather than trying to memorize the details and configuration options available for each, we recommend that you concentrate on the big picture of what’s possible with CI/CD pipelines, and then learn the nuances of the relevant keywords as needed. The official GitLab documentation is the best source of information on these keywords, especially since they change from time to time.
We’ll spend much of the rest of this book demonstrating some of the key CI/CD pipeline tasks you can accomplish with these keywords, so this is a good time to dip your toe into the CI/CD pipeline configuration water by looking at a bare-bones .gitlab-ci.yml file. The contents of this file will drive an actual pipeline, albeit a simple one. Let’s walk through it, explaining each line as we go.
Since .gitlab-ci.yml files use the YAML format for structured data, this would be a good time to learn or review the extremely simple YAML syntax. The Wikipedia article on YAML is a good place to find that information. We’ll wait for you here until you feel confident using YAML.
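As a quick refresher, the handful of YAML constructs that CI/CD configuration files lean on can be sketched as follows. The keys and values here are made up purely for illustration; none of them are GitLab keywords:

```yaml
# A comment starts with a hash mark and runs to the end of the line.

# A mapping is a set of "key: value" pairs.
name: hats-for-cats
language: python

# A sequence (list) puts each item on its own line after a dash.
colors:
  - red
  - green

# Indentation (spaces, never tabs) nests one structure inside another.
owner:
  name: Tabby
  favorite_hats:
    - fedora
    - beret
```

Mappings, sequences, and nesting by indentation are all you need to read the configuration in the rest of this section.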
Now that that’s out of the way, let’s get started. Most CI/CD configuration files begin by defining the pipeline’s stages. If you don’t define any stages, your pipeline will have build, test, and deploy stages by default. If you do define stages, these will replace, not augment, the three default stages. For this simple pipeline, we only need the build and test stages, so let’s define those explicitly in a new file called .gitlab-ci.yml at the root level of the hats-for-cats project repository:
stages:
  - build
  - test
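For comparison, leaving out the stages keyword altogether should behave roughly as if the file had spelled out the three default stages itself, as in this sketch:

```yaml
# Roughly what GitLab assumes when no stages are defined
stages:
  - build
  - test
  - deploy
```

Stages run in the order in which they are listed, so in either version the jobs in the build stage finish before the jobs in the test stage start.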
We’re going to have two jobs in this pipeline, with one job in each of the two stages we just defined. Let’s say that this project is Python-based, so both jobs will use Python-related tools. In the next chapter, we’ll explain more about how GitLab Runners can run jobs within Docker containers. For now, all you need to know is that we can specify a Docker image within our CI/CD configuration file for jobs to run within. In this case, both of our jobs will need access to Python tools, so we’ll tell the pipeline to use a Python Docker image for all jobs:
image: python:3.10
Our first job will run mypy, which is a tool that makes sure Python source code uses the right data types in its functions and variables. This task could reasonably be put in either the build or test stage, but let’s put it in the build stage just so we can have at least one job in that stage. Here’s how we define the job:
data-type-check:
  stage: build
  script:
    - pip install mypy
    - mypy src/hats-for-cats.py
Since the first word on the first line is not a keyword that GitLab recognizes, GitLab assumes it’s the name of a new job to be defined. This name can contain spaces instead of hyphens if you prefer, but sometimes, that can be harder to parse visually.
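For illustration, the same job with spaces in its name might be written as follows. The name data type check is just an alternative spelling chosen for this sketch, and the rest of the job is unchanged:

```yaml
# Same job as before; only the name differs
data type check:
  stage: build
  script:
    - pip install mypy
    - mypy src/hats-for-cats.py
```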
The next line assigns this job to the build stage.
The third line starts with the script keyword, which tells GitLab that we’re about to list the commands for this job. The following two lines do exactly that. The first uses the pip package manager to install the mypy package into the Python Docker container that the job is running in. The second runs the newly installed mypy command on the src/hats-for-cats.py file. If mypy finds any problems with how our code uses data types, it will fail this job, which will fail the build stage that the job lives in, which, in turn, will fail the entire pipeline instance.
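If you would rather check every Python file under src/ than name a single file, mypy also accepts a directory. A variant of the job along those lines, sketched here under the assumption that all of the project’s source code lives in src/, could look like this:

```yaml
data-type-check:
  stage: build
  script:
    - pip install mypy
    # Passing a directory makes mypy check the Python files it finds inside it
    - mypy src/
```

Either way, it is a non-zero exit status from the mypy command that marks the job as failed.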
Now, let’s define a job for running automated unit tests:
unit-tests:
  stage: test
  script:
    - pip install pytest
    - pytest test/ --junitxml=unit_test_results.xml
  artifacts:
    reports:
      junit: unit_test_results.xml
    when: always
Since the first line is not a recognized keyword, GitLab knows that this is the name of a new job that we’re defining.
The second line assigns the job to the test stage.
Following the script keyword, we define two commands for the job. The first installs the pytest package, while the second runs the newly installed pytest tool on any unit tests that live in the test/ directory. Furthermore, it specifies that pytest should output the results of the unit tests to a file called unit_test_results.xml, which will be in JUnit XML format.
The section that begins with the artifacts keyword allows GitLab to preserve the unit test results file when the job finishes, instead of throwing it away. In GitLab terminology, any files that are generated by a job and then preserved are called artifacts. It’s important to understand that any files that were generated by a job but not declared to be artifacts are deleted as soon as the job finishes.
The exact syntax that’s used in this example artifacts section isn’t too important because it can easily be looked up in the GitLab documentation when needed. What matters here is that we are telling GitLab that this artifact contains unit test results in the JUnit XML format, an industry-standard format that GitLab requires in order to ingest the test results and display them in the Tests tab on the pipeline details page.
The last line in the artifacts section tells GitLab to preserve the results file as an artifact, even if the unit-tests job fails. The job will have a failed status if there are any test failures, but we want to display the test results every time this job runs, even if (or especially if!) there are any test failures.
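To make the effect of that last line more concrete, here is the artifacts portion of the job again, with the possible values of when noted in comments. The option names reflect the GitLab keyword reference as we understand it, so check the current documentation before relying on them:

```yaml
# Fragment of the unit-tests job; stage and script are omitted here
unit-tests:
  artifacts:
    reports:
      junit: unit_test_results.xml
    # when decides whether artifacts are uploaded, based on the job's result:
    #   on_success  upload only if the job succeeds (the default)
    #   on_failure  upload only if the job fails
    #   always      upload whether the job succeeds or fails
    when: always
```

With the default of on_success, a run with failing tests would leave nothing for GitLab to display, which is exactly the case where the results are most useful.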
Combining all of the configuration code listed previously, the complete .gitlab-ci.yml file looks like this:
stages:
  - build
  - test

image: python:3.10

data-type-check:
  stage: build
  script:
    - pip install mypy
    - mypy src/hats-for-cats.py

unit-tests:
  stage: test
  script:
    - pip install pytest
    - pytest test/ --junitxml=unit_test_results.xml
  artifacts:
    reports:
      junit: unit_test_results.xml
    when: always
The following screenshot shows the pipeline details page after this pipeline has finished. Don’t worry about the unit-tests job’s failed status. That’s expected whenever any of the tests that it runs fail:
Figure 4.9 – Details page for the completed pipeline that validates Python data types and runs unit tests