The Software Developer's Guide to Linux: A practical, no-nonsense guide to using the Linux command line and utilities as a software developer

Product type: Paperback
Published in: Jan 2024
Publisher: Packt
ISBN-13: 9781804616925
Length: 300 pages
Edition: 1st Edition
Authors: Christian Sturm, David Cohen

Process basics

When we refer to a “process” in Linux, we’re referring to the operating system’s internal model of what exactly a running program is. Linux needs a general abstraction that works for all programs, which can encapsulate the things the operating system cares about. A process is that abstraction, and it enables the OS to track some of the important context around programs that are executing; namely:

  • Memory usage
  • Processor time used
  • Other system resource usage (disk access, network usage)
  • Communication between processes
  • Related processes that a program starts, for example, firing off a shell command

You can get a listing of all system processes (at least the ones your user is allowed to see) by running the ps program with the aux flags:

Figure 2.1: List of system processes
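
If you want the same listing inside a program, one option is to run ps aux as a subprocess and split each row into its columns. The following is a minimal sketch rather than a robust parser: parse_ps_aux is a hypothetical helper name, the sample text is made up for illustration, and the code assumes the common column layout in which only the final COMMAND column can contain spaces.

```python
import shutil
import subprocess


def parse_ps_aux(output):
    """Turn the text printed by `ps aux` into a list of dicts, one per process."""
    lines = output.strip().splitlines()
    header = lines[0].split()  # e.g. USER, PID, %CPU, ..., COMMAND
    rows = []
    for line in lines[1:]:
        # Split on whitespace, but keep the last column (the command line) whole.
        fields = line.split(None, len(header) - 1)
        rows.append(dict(zip(header, fields)))
    return rows


# Illustrative sample, not real output from any machine.
SAMPLE = """\
USER         PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root           1  0.0  0.1 167000 11000 ?        Ss   Jan01   0:03 /sbin/init splash
dcohen      3210  0.2  0.5 900000 45000 pts/0    Sl+  10:12   0:01 python3 app.py --debug
"""

sample_rows = parse_ps_aux(SAMPLE)
print(sample_rows[1]["PID"], sample_rows[1]["COMMAND"])

# Only call the real program if it is installed (minimal containers often lack it).
if shutil.which("ps"):
    result = subprocess.run(["ps", "aux"], capture_output=True, text=True)
    if result.returncode == 0 and result.stdout.strip():
        print(len(parse_ps_aux(result.stdout)), "processes visible to this user")
```

Each row becomes a dictionary keyed by the column headers, so row["STAT"] or row["RSS"] gives you the attributes discussed in the rest of this section.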

We’ll cover the attributes most relevant to your work as a developer in this chapter.

What is a Linux process made of?

From the perspective of the operating system, a “process” is simply a data structure that makes it easy to access information like:

  • Process ID (PID in the ps output above). When a process is created, it gets the next available process ID, in sequential order. PID 1 is the init system – the original parent of all other processes, which bootstraps the system. The kernel starts it as one of the first things it does after it begins executing. Because init is so important to the normal functioning of the operating system, it cannot be killed, even by the root user. Different Unix operating systems use different init systems – for example, most Linux distributions use systemd, macOS uses launchd, and many other Unixes use SysV-style init. Regardless of the specific implementation, we’ll refer to this process by the name of the role it fills: “init.”

    Note

    In containers, processes are namespaced: each container process has one PID inside the container (1..n, where n is the number of running processes in the container) and a different PID in the “real” environment on the host. For example, the process that is PID 1 inside a container might be PID 3210 on the host. You can see this mapping from outside the container but not from inside it.

  • Parent process ID (PPID). Each process is spawned by a parent. If the parent process dies while the child is still alive, the child becomes an “orphan.” Orphaned processes are re-parented to init (PID 1).
  • Status (STAT in the ps output above). man ps will show you an overview:
    • D – uninterruptible sleep (usually IO)
    • I – idle kernel thread
    • R – running or runnable (on run queue)
    • S – interruptible sleep (waiting for an event to complete)
    • T – stopped by job control signal
    • t – stopped by debugger during tracing
    • X – dead (should never be seen)
    • Z – defunct (“zombie”) process, terminated but not reaped by its parent
  • Priority status (“niceness” – does this process allow other processes to take priority over it?).
  • Process owner (USER in the ps output above): the effective user ID (EUID).
  • Effective Group ID (EGID), which is used together with the effective user ID to determine what the process is allowed to do.
  • An address map of the process’s memory space.
  • Resource usage – open files, network ports, and other resources the process is using (VSZ and RSS for memory usage in the ps output above).

(Citation: UNIX and Linux System Administration Handbook, 5th edition, p. 91.)
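
On Linux, several of the attributes listed above are exposed as text under /proc/<pid>/status. The sketch below is one way to read them for the current process. It assumes a Linux-style /proc filesystem; parse_status, summarize, and the sample text are hypothetical names and data used only for illustration.

```python
def parse_status(text):
    """Parse the 'Key:<tab>value' lines of a /proc/<pid>/status file into a dict."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def summarize(fields):
    """Pick out the attributes discussed above from the parsed status fields."""
    # Uid and Gid each list four IDs: real, effective, saved, filesystem.
    return {
        "pid": int(fields["Pid"]),
        "ppid": int(fields["PPid"]),
        "state": fields["State"].split()[0],  # e.g. "S" from "S (sleeping)"
        "euid": int(fields["Uid"].split()[1]),
        "egid": int(fields["Gid"].split()[1]),
    }


# Illustrative sample, shaped like a few lines of a real status file.
SAMPLE_STATUS = (
    "Name:\tbash\n"
    "State:\tS (sleeping)\n"
    "Pid:\t3210\n"
    "PPid:\t1\n"
    "Uid:\t1000\t1000\t1000\t1000\n"
    "Gid:\t1000\t1000\t1000\t1000\n"
)
print(summarize(parse_status(SAMPLE_STATUS)))

# Read the same fields for the running Python process, if /proc is available.
try:
    with open("/proc/self/status") as status_file:
        print(summarize(parse_status(status_file.read())))
except (OSError, KeyError):
    print("no Linux-style /proc available here")
```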

Let’s take a closer look at a few of the process attributes that are most important for developers and occasional troubleshooters to understand.

Process ID (PID)

Each process is identified by its process ID (PID), an integer that is assigned to the process when it starts and that no other running process shares. Much like a relational database with IDs that uniquely identify each row of data, the Linux operating system keeps track of each process by its PID.

A PID is by far the most useful label for you to use when interacting with processes.
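
To see PIDs and the parent/child relationship from code, a program can ask for its own PID and start a child that reports its PID and PPID. This is a small sketch that assumes a POSIX system; the one-line child program is made up for the example.

```python
import os
import subprocess
import sys

# Start a child Python interpreter that prints its own PID and its parent's PID.
child = subprocess.Popen(
    [sys.executable, "-c", "import os; print(os.getpid(), os.getppid())"],
    stdout=subprocess.PIPE,
    text=True,
)
stdout, _ = child.communicate()
reported_pid, reported_ppid = (int(number) for number in stdout.split())

print("parent PID:", os.getpid())
print("child PID as seen by the parent:", child.pid)
print("child PID as seen by the child:", reported_pid)
print("child's PPID:", reported_ppid)
```

The PID the parent gets back from Popen is the same number the child sees for itself, and the child's PPID is the parent's PID.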

Effective User ID (EUID) and Effective Group ID (EGID)

These determine which system user and group your process is running as. Together, user and group permissions determine what a process is allowed to do on the system.

As you’ll see in Chapter 5, Introducing Files, files have user and group ownership set on them, which determines who their permissions apply to. If a file’s ownership and permissions are essentially a lock, then a process with the right user/group permissions is like a key that opens the lock and allows access to the file. We’ll dive deeper into this later, when we talk about permissions.
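
As a rough illustration of the lock-and-key idea, the sketch below prints the effective user and group IDs of the current process and checks that a file it creates is owned by that same user. The helper names process_identity and owned_by_this_process are hypothetical, and the example assumes a POSIX system.

```python
import os
import tempfile


def process_identity():
    """Return the effective user and group IDs of the current process."""
    return {"euid": os.geteuid(), "egid": os.getegid()}


def owned_by_this_process(path):
    """True if the file's owning user matches this process's effective user ID."""
    return os.stat(path).st_uid == os.geteuid()


identity = process_identity()
print("running as", identity)

# A file created by this process is owned by the user the process runs as.
with tempfile.NamedTemporaryFile() as tmp:
    tmp_owned = owned_by_this_process(tmp.name)
    print(tmp.name, "owned by this process's user:", tmp_owned)
```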

Environment variables

You’ve probably used environment variables in your applications – they’re a way for the operating system environment that launches your process to pass in data that the process needs. This commonly includes things like configuration directives (LOG_DEBUG=1) and secret keys (AWS_SECRET_KEY), and every programming language has some way to read them out from the context of the program.

For example, this Python script gets the user’s home directory from the HOME environment variable, and then prints it:

import os
home_dir = os.environ['HOME']
print("The home directory for this user is", home_dir)

In my case, running this program in the python3 REPL on a Linux machine results in the following output:

The home directory for this user is /home/dcohen 
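
The passing-in direction can be sketched the same way: a parent process chooses the environment its child starts with. In this example the parent adds LOG_DEBUG=1 (the configuration directive mentioned above) to a copy of its own environment and hands that copy to a child Python process. The one-line child program is made up for illustration.

```python
import os
import subprocess
import sys

# Copy the parent's environment and add one variable for the child only.
child_env = dict(os.environ, LOG_DEBUG="1")

result = subprocess.run(
    [sys.executable, "-c", "import os; print(os.environ.get('LOG_DEBUG', 'unset'))"],
    env=child_env,
    capture_output=True,
    text=True,
    check=True,
)
child_output = result.stdout.strip()
print("child sees LOG_DEBUG =", child_output)
```

Because only the copy was modified, the parent's own os.environ is left unchanged.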

Working directory

A process has a “current working directory,” just like your shell (which is just a process, anyway); typing pwd in your shell prints the shell’s current working directory. Relative paths are resolved against this directory. A process’s working directory can change while it runs, so don’t rely on it too much.
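
The sketch below shows why relying on the working directory can surprise you: the same relative path resolves to a different absolute path after the process changes directory. The function name and the file name notes.txt are hypothetical, and the file is never actually created.

```python
import os
import tempfile


def demo_working_directory():
    """Change into a temporary directory and show how a relative path resolves there."""
    original = os.getcwd()
    with tempfile.TemporaryDirectory() as tmp:
        os.chdir(tmp)
        try:
            inside = os.getcwd()
            # abspath resolves the relative name against the current working directory.
            resolved = os.path.abspath("notes.txt")
        finally:
            # Always switch back so the rest of the program is unaffected.
            os.chdir(original)
    return original, inside, resolved


original, inside, resolved = demo_working_directory()
print("started in:", original)
print("changed to:", inside)
print("notes.txt resolved to:", resolved)
```

If your program needs a stable location, building an absolute path once at startup is usually safer than depending on wherever the process happens to be.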

This concludes our overview of the process attributes that you should know about. In the next section, we’ll step away from theory and look at some commands you can use to start working with processes right away.
