In this article by H M Irfan Sadiq, the author of the book Using Yocto Project with BeagleBone Black, we will move one step ahead by detailing different aspects of the basic engine behind Yocto Project, and other similar projects. This engine is BitBake. Covering all the various aspects of BitBake in one article is not possible; it will require a complete book. We will familiarize you as much as possible with this tool.
We will cover the following topics in this article:
(For more resources related to this topic, see here.)
This discussion does not intend to invoke any religious row between other alternatives and BitBake. Every step in the evolution has its own importance, which cannot be denied, and so do other available tools. BitBake was developed keeping in mind the Embedded Linux Development domain. So, it tried to solve the problems faced in this core area, and in my opinion, it addresses these in the best way till date. You might get the same output using other tools, such as Buildroot, but the flexibility and ease provided by BitBake in this domain is second to none. The major difference is in addressing the problem. Legacy tools are developed considering packages in mind, but BitBake evolved to solve the problems faced during the creation of BSPs, or embedded distributions. Let's go through the challenges faced in this specific domain and understand how BitBake helps us face them.
BitBake takes care of cross compilation. You do not have to worry about it for each package you are building. You can use the same set of packages and build for different platforms seamlessly.
This is the real pain of resolving dependencies of packages on each other and fulfilling them. In this case, we need to specify the different dependency types available, and BitBake takes care of them for us. We can handle both build and runtime dependencies.
BitBake supports a variety of target distribution creations. We can define a full new distribution of our own, by choosing package management, image types, and other artifacts to fulfill our requirements.
BitBake is not very dependent on the build system we use to build our target images. We don't use libraries and tools installed on the system; we build their native versions and use them instead. This way, we are not dependent on the build system's root filesystem.
Since BitBake is very loosely coupled to the build system's distribution type, it's very easy to use on various distributions.
We have to support different architectures. We don't have to modify our recipes for each package. We can write our recipes so that features, parameters, and flags are picked up conditionally.
For the simplest projects, we have to build images and do more than a thousand tasks. These tasks require us to use the full power available to us, whether they are computational or related to memory. BitBake's architecture supports us in this regard, using its scheduler to run as many tasks in parallel as it can, or as we configure. Also, when we say task, it should not be confused with package, but it is a part of package. A package can contain many tasks, (fetch, compile, configure, package, populate_sysroot, and so on), and all these can run in parallel.
Keeping and relying on metadata keeps things simple and configurable. Almost nothing is hard coded. Thus, we can configure things according to our requirements. Also, BitBake provides us with a mechanism to reuse things that are already developed. We can keep our metadata structured, so that it gets applied/extended conditionally. You will learn these tricks when we will explore layers.
To get us to a successful package or image, BitBake performs some steps that we need to go through to get an understanding of the workflow. In certain cases, some of these steps can be avoided; but we are not discussing such cases, considering them as corner cases. For details on these, we should refer to the BitBake user manual.
When we invoke the BitBake command to build our image, the first thing it does is parse our base configuration metadata. This metadata consists of build_bb/conf/bblayers.conf, multiple layer/conf/layer.conf, and poky/meta/conf/bitbake.conf. This data can be of the following types:
Key variables BBFILES and BBPATH, which are constructed from the layer.conf file. Thus, the constructed BBPATH variable is used to locate configuration files under conf/ and class files under class/ directories. The BBFILES variable is used to find recipe files (.bb and .bbappend). bblayers.conf is used to set these variables.
Next, the bitbake.conf file is parsed.
If there is no bblayers.conf file, it is assumed that the user has set BBFILES and BBPATH directly in the environment.
After having dealt with configuration files, class files inclusion and parsing are taken care of. These class files are specified using the INHERIT variable. Next, BitBake will use the BBFILES variable to construct a list of recipes to parse, along with any append files. Thus, after parsing, recipe values for various variables are stored into datastore. After the completion of a recipe parsing BitBake has:
BitBake starts looking through the PROVIDES set in recipe files. The PROVIDES set defaults to the recipe name, and we can define multiple values to it. We can have multiple recipes providing a similar package. This task is accomplished by setting PROVIDES in the recipes. While actually making such recipes part of the build, we have to define PRREFERED_PROVIDER_foo so that our specific recipe foo can be used. We can do this in multiple locations. In the case of kernels, we use it in the manchin.conf file. BitBake iterates through the list of targets it has to build and resolves them, along with their dependencies.
If PRREFERED_PROVIDER is not set and multiple versions of a package exist, BitBake will choose the highest version.
Each target/recipe has multiple tasks, such as fetch, unpack, configure, and compile. BitBake considers each of these tasks as independent units to exploit parallelism in a multicore environment. Although these tasks are executed sequentially for a single package/recipe, for multiple packages, they are run in parallel. We may be compiling one recipe, configuring the second, and unpacking the third in parallel. Or, may be at the start, eight packages are all fetching their sources. For now, we should know the dependencies between tasks that are defined using DEPENDS and RDEPENDS. In DEPENDS, we provide the dependencies that our package needs to build successfully. So, BitBake takes care of building these dependencies before our package is built. RDEPENDS are the dependencies that are required for our package to execute/run successfully on the target system. So, BitBake takes care of providing these dependencies on the target's root filesystem.
Tasks can be defined using the shell syntax or Python. In the case of shell tasks, a shell script is created under a temporary directory as run.do_taskname.pid and then, it is executed. The generated shell script contains all the exported variables and the shell functions, with all the variables expanded. Output from the task is saved in the same directory with log.do_taskname.pid. In the case of errors, BitBake shows the full path to this logfile. This is helpful for debugging.
In this article, you learned the goals and problem areas that BitBake has considered, thus making itself a unique option for Embedded Linux Development. You also learned how BitBake actually works.
Further resources on this subject: