Defining scope – visibility, extent, and linkage
Often, when the scope of a variable or function is mentioned, only its visibility is meant. Visibility determines which functions or statements can see the variable to either access it or modify it. If the variable is visible, it can be accessed and modified, with one exception: as you may recall from Chapter 4, Using Variables and Assignments, a variable declared as const can be accessed but cannot be changed. As we will see, visibility is but one component of a variable's scope. The other components of scope are extent (the lifetime of the variable) and linkage (in which file the variable exists).
The visibility, extent, and linkage of variables and functions depend upon where they are declared and how they are defined. However, regardless of how or where they are defined, they must be declared before the point where they are accessed. This is true for both functions and variables.
Scope applies to both variables and functions. However, the considerations for each of them are slightly different. We will address the scope of variables first, and then expand those concepts to the scope of functions.
Exploring visibility
The visibility of a variable is largely determined by its location within a source file. There are several places where a variable can appear, which determines its visibility. Some of these we have already explored. The following is a comprehensive listing of types of visibility:
- Block/local scope: This occurs in function blocks, conditional statement blocks, loop statement-body blocks, and unnamed blocks. These are also called internal variables. The visibility of variables declared in this scope is limited to the boundaries of the block where they are declared.
- Function parameter scope: Even though this scope occurs in function parameters, the function parameters are actually within the block scope of the function body.
- File scope: These are also called external variables. A variable declared outside any function parameter or block is visible to all functions and blocks that follow its declaration in that file. External scope enables access to functions and variables within a single file. Here, external refers to scope outside of block scope.
- Global scope: Global scope is when an external variable in one file is specially referenced in other files to make it visible to them. This is also called program scope. Global scope enables access to functions and variables across multiple files.
- Static scope: This is when a variable has block scope within a function but its extent, or lifetime, differs from that of automatic variables. When a function or variable has static file scope, also called static external scope, it is visible only within that file. We will explore static function scope later in this chapter.
We have primarily been relying upon block scope for all of our programs. In some cases, we have had brief encounters with both external scope variables and static variables.
Note that internal variables exist within a block, whereas external variables exist in a source file outside of any function blocks. The block of the internal variable may be a function body, or, within a given function, it may be a loop body, a conditional expression block, or an unnamed block. We will explore examples of these later in this chapter.
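The visibility rules listed above can be sketched in a few lines of C. This is only a minimal illustration; the names fileCounter, addToCounter(), and shadowDemo() are hypothetical and do not come from the programs in this book.

```c
/* File scope (an external variable): visible to every function
   that follows this declaration in the file. */
int fileCounter = 10;

/* 'amount' has function parameter scope: it is visible only
   inside the body of addToCounter(). */
int addToCounter(int amount)
{
    int result = fileCounter + amount;  /* block scope: the function body */
    return result;
}

int shadowDemo(void)
{
    int value = 1;             /* visible throughout this function body */
    {
        int value = 2;         /* unnamed block: hides the outer 'value' */
        fileCounter += value;  /* the file-scope variable is still visible */
    }
    for (int i = 0; i < 3; i++) {
        value += i;            /* 'i' is visible only inside the loop */
    }
    return value;              /* the outer 'value' again: 1 + 0 + 1 + 2 */
}
```

Trying to use result outside addToCounter(), or i after the loop, would be rejected by the compiler because those names are not visible there.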
However, the scope of a variable involves more than just visibility. While visibility is a major component of scope, we must also understand extent and linkage.
Exploring extent
The scope is also determined by the lifetime, or extent, of the variable. We explored the lifetime of variables and memory in Chapter 17, Understanding Memory Allocation and Lifetime. We revisit this topic here since it relates to the other components of scope: visibility and linkage.
The extent of a variable begins when a variable is created (memory is allocated for it) and ends when the variable is deallocated or destroyed. Within that extent, a variable is accessible and modifiable. Attempting to access or modify a variable outside of its extent will either raise a compiler error or lead to unpredictable program behavior.
Internal variables have a somewhat limited extent, which begins within a block when the variable is declared and ends when the block ends. External variables are allocated when the program loads and exist until the program ends.
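As a rough sketch of the difference in extent, compare an internal variable with an external one. The names callsSoFar and countCall() are hypothetical, chosen only for this illustration.

```c
/* External variable: allocated when the program loads and
   kept until the program ends. */
int callsSoFar = 0;

int countCall(void)
{
    int local = 0;  /* internal variable: created anew on every entry */
    local++;        /* always becomes 1; lost when the block ends */
    callsSoFar++;   /* keeps accumulating from one call to the next */
    return local;
}
```

However many times countCall() is called, it returns 1, because local begins a new lifetime on each call, while callsSoFar keeps growing for the life of the program.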
A variable's extent is also specified by a storage class, or how it is allocated, used, and subsequently deallocated. There are five classes of storage, as follows:
- auto: This is the default storage class for a variable declared within a block when no other storage class is specified. An auto variable has an internal variable extent. The auto keyword cannot be used outside of a block; a variable declared there without a storage class has an external variable extent.
- register: This is equivalent to auto but it provides a suggestion to the compiler to put the variable in one of the registers of the central processing unit (CPU). This is often ignored by modern compilers.
- extern: Specifies that the variable has been defined (its memory has been allocated) in another file; in that other file, the variable must be an external variable. Therefore, its extent is the life of the program.
- static: A variable declared with this class has the visibility of the block scope but the extent of an external variable, that is, the life of the program; whenever that block is re-entered, the static variable retains the value it was last assigned.
- typedef: Formally, this is a storage class, but when used, a new data type is declared and no storage is actually allocated. A typedef scope is similar to a function scope, described later in this chapter.
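The storage classes that apply within a single file might be sketched as follows. The names Count, nextTicket(), and sumTo() are hypothetical and exist only for this example; extern is left out here because it involves a second source file.

```c
typedef unsigned int Count;   /* typedef: names a type; no storage allocated */

Count nextTicket(void)
{
    static Count ticket = 0;  /* static: block visibility, program-long extent;
                                 initialized once, not on every call */
    ticket++;
    return ticket;
}

Count sumTo(Count n)
{
    auto Count total = 0;     /* auto: the default in a block, rarely written */
    register Count i;         /* register: a hint the compiler may ignore */
    for (i = 1; i <= n; i++) {
        total += i;
    }
    return total;
}
```

Successive calls to nextTicket() return 1, 2, 3, and so on, because ticket retains its value between calls, whereas total in sumTo() starts at 0 on every call.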
Perhaps you can now see why memory allocation and deallocation are closely related to the extent component of the scope.
We can now turn to the last component of scope—linkage.
Exploring linkage
In a single source file program, the concept of linkage doesn't really apply since everything is contained within the single source file (even if it has its own header file). However, when we employ multiple source files in a program, a variable's scope is also determined by its linkage. Linkage determines whether a declaration is confined to a single source file, or compilation unit, or can be referenced from others.
Understanding compilation units
A compilation unit is essentially a single source file and its header file. That source file may be a complete program or it may be just one among several or many source files that make up a final executable. Each source file is preprocessed and compiled individually in the compilation phase. The result of this is an intermediate object file. An object file knows about external functions and variables via header declarations but defers the resolution of their actual addresses until later.
When all source files have been successfully compiled into object files, the link phase is entered. In the link phase, the addresses of functions in other files or libraries are resolved and the addresses of external global variables are resolved. When all unresolved addresses have been successfully resolved (linked together), the object files are then combined into a single executable.
In the dealer.c program, there were four source files. Each of those four files was an individual compilation unit. At compile time, the four source files were compiled into four separate object files. Those four object files were then linked together and combined to form a single executable.
Everything within a compilation unit is visible and accessible to everything else within that compilation unit. The linkage of functions and variables is typically limited to just that compilation unit. To cross linkage boundaries (source files), we must employ header files with the proper storage classes for variables (extern) as well as typedef declarations and function prototypes.
So, the linkage component of scope involves making function and variable declarations available in one or more other compilation units.
Putting visibility, extent, and linkage all together
We now have an idea of the components involved in a scope. Within a single file, the visibility and extent components are somewhat intertwined and take primary consideration. With multiple files, the linkage component of scope requires more consideration.
We can think of a scope as starting from a very narrow range and expanding to the entire program. Block and function scope have the narrowest range. External variables and function prototypes have a wider scope, encompassing an entire file. The broadest scope occurs when declarations from a single file are extended across multiple files.
Note
Some clarification is needed regarding global scope. Global scope means that a function or variable is accessible in two or more source files. It is very often confused with file scope, where a function or variable is only accessible in the single file where it is declared. So, when a programmer refers to a global variable, they often mean an external variable with file scope.
The preferred way to give a function or variable global scope is to define and initialize it in the originating source file with file scope, and make it accessible in any other file via linkage through the use of the extern declaration for that variable (extern is optional for functions).
Older compilers would allow any external variables with file scope to be accessible across all source files in a program, making them truly global variables. Linkage scope was therefore assumed across all source files in the program. This led to much misuse of global variables and to name clashes. Most modern compilers no longer make such an assumption; linkage scope across file/compilation unit boundaries must now be explicit with the use of extern. Such extern variable declarations are easily done through the use of header files.
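One way the header-file approach could look is sketched below. To keep the sketch in one listing, the three pieces that would normally live in separate files are shown together and separated by comments. The file names counter.h and counter.c and all identifiers are hypothetical.

```c
/* ---- would live in counter.h (hypothetical header) ---- */
extern int sharedCount;        /* declaration only: no storage allocated */
void bumpSharedCount(void);    /* prototype; extern is implied for functions */

/* ---- would live in counter.c (hypothetical source file) ---- */
int sharedCount = 0;           /* the single definition, with file scope */
static int bumpsMade = 0;      /* static file scope: hidden from other files */

void bumpSharedCount(void)
{
    sharedCount++;
    bumpsMade++;
}

/* ---- would live in any other source file that includes counter.h ---- */
int readSharedCount(void)
{
    return sharedCount;        /* resolved at link time to the definition above */
}
```

In a real multi-file program, each source file would include counter.h, and only counter.c would contain the definition of sharedCount. Because bumpsMade is static, another file could declare its own bumpsMade without a name clash.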
We can now focus on the specifics of the scope of variables.