Working with variables
All applications process data. Data comes in, data is processed, and then data goes out.
Data usually comes into our program from files, databases, or user input, and it can be put temporarily into variables that will be stored in the memory of the running program. When the program ends, the data in memory is lost. Data is usually output to files and databases, or to the screen or a printer. When using variables, you should think about, firstly, how much space the variable takes in the memory, and, secondly, how fast it can be processed.
We control this by picking an appropriate type. You can think of simple common types such as int and double as being different-sized storage boxes, where a smaller box would take less memory but may not be as fast at being processed; for example, adding 16-bit numbers might not be processed as fast as adding 64-bit numbers on a 64-bit operating system. Some of these boxes may be stacked close by, and some may be thrown into a big heap further away.
Naming things and assigning values
There are naming conventions for things, and it is good practice to follow them, as shown in the following table:
Naming convention | Examples | Used for
Camel case | | Local variables, private fields
Title case aka Pascal case | | Types, non-private fields, and other members like methods
Good Practice: Following a consistent set of naming conventions will enable your code to be easily understood by other developers (and yourself in the future!).
The following code block shows an example of declaring a named local variable and assigning a value to it with the = symbol. You should note that you can output the name of a variable using a keyword introduced in C# 6.0, nameof:
// let the heightInMetres variable become equal to the value 1.88
double heightInMetres = 1.88;
Console.WriteLine($"The variable {nameof(heightInMetres)} has the value
{heightInMetres}.");
The message in double quotes in the preceding code wraps onto a second line because the width of a printed page is too narrow. When entering a statement like this in your code editor, type it all in a single line.
Literal values
When you assign to a variable, you often, but not always, assign a literal value. But what is a literal value? A literal is a notation that represents a fixed value. Data types have different notations for their literal values, and over the next few sections, you will see examples of using literal notation to assign values to variables.
Storing text
For text, a single letter, such as an A, is stored as a char type.
Good Practice: Actually, it can be more complicated than that. Egyptian Hieroglyph A002 (U+13001) needs two System.Char values (known as a surrogate pair) to represent it: \uD80C and \uDC01. Do not always assume one char equals one letter or you could introduce weird bugs into your code.
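The surrogate pair described above can be observed directly. The following is a minimal sketch, assuming a .NET console app with top-level statements; the variable names are illustrative only:

```csharp
using System;

// Egyptian Hieroglyph A002 (U+13001), written as its two UTF-16 code units.
string hieroglyph = "\uD80C\uDC01";

// Length counts char values (UTF-16 code units), not letters.
int charCount = hieroglyph.Length;

// True when the two char values form a high/low surrogate pair.
bool isPair = char.IsSurrogatePair(hieroglyph[0], hieroglyph[1]);

// Combine the pair back into the single Unicode code point.
int codePoint = char.ConvertToUtf32(hieroglyph, 0);

Console.WriteLine($"char values: {charCount}");
Console.WriteLine($"surrogate pair: {isPair}");
Console.WriteLine($"code point: U+{codePoint:X}");
```

One visible symbol reports a length of two, which is why code that indexes or truncates strings by char can split a character in half.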
A char is assigned using single quotes around the literal value, or assigning the return value of a fictitious function call, as shown in the following code:
char letter = 'A'; // assigning literal characters
char digit = '1';
char symbol = '$';
char userChoice = GetSomeKeystroke(); // assigning from a fictitious function
For text, multiple letters, such as Bob, are stored as a string type and are assigned using double quotes around the literal value, or assigning the return value of a function call, as shown in the following code:
string firstName = "Bob"; // assigning literal strings
string lastName = "Smith";
string phoneNumber = "(215) 555-4256";
// assigning a string returned from a fictitious function
string address = GetAddressFromDatabase(id: 563);
Understanding verbatim strings
When storing text in a string variable, you can include escape sequences, which represent special characters like tabs and new lines using a backslash, as shown in the following code:
string fullNameWithTabSeparator = "Bob\tSmith";
But what if you are storing the path to a file on Windows, and one of the folder names starts with a T, as shown in the following code?
string filePath = "C:\televisions\sony\bravia.txt";
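The compiler will convert the \t into a tab character and the \b into a backspace character, and it will reject \s because it is not a valid escape sequence, so you will get errors!
You must prefix with the @ symbol to use a verbatim literal string, as shown in the following code: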
string filePath = @"C:\televisions\sony\bravia.txt";
To summarize:
- Literal string: Characters enclosed in double-quote characters. They can use escape characters like \t for tab. To represent a backslash, use two: \\.
- Verbatim string: A literal string prefixed with @ to disable escape characters so that a backslash is a backslash. It also allows the string value to span multiple lines because the white space characters are treated as themselves instead of instructions to the compiler.
- Interpolated string: A literal string prefixed with $ to enable embedded formatted variables. You will learn more about this later in this chapter.
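The three kinds of string in the summary above can be compared side by side. This is a small sketch rather than part of the chapter's project; the folder variable is a hypothetical name introduced only for illustration:

```csharp
using System;

// Literal string: each backslash must be written as \\.
string escaped = "C:\\televisions\\sony\\bravia.txt";

// Verbatim string: backslashes are taken as they are.
string verbatim = @"C:\televisions\sony\bravia.txt";

// Interpolated verbatim string: {folder} is replaced by the variable's value.
string folder = "sony"; // hypothetical variable for this sketch
string interpolated = $@"C:\televisions\{folder}\bravia.txt";

bool sameAsEscaped = verbatim == escaped;
bool sameAsInterpolated = verbatim == interpolated;

Console.WriteLine($"verbatim equals escaped: {sameAsEscaped}");
Console.WriteLine($"verbatim equals interpolated: {sameAsInterpolated}");
```

All three variables end up holding the same characters; the prefixes only change how the compiler reads the source text.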
Storing numbers
Numbers are data that we want to perform an arithmetic calculation on, for example, multiplying. A telephone number is not a number. To decide whether a variable should be stored as a number or not, ask yourself whether you need to perform arithmetic operations on the number or whether the number includes non-digit characters such as parentheses or hyphens to format the number, such as (414) 555-1234. In this case, the number is a sequence of characters, so it should be stored as a string.
Numbers can be natural numbers, such as 42, used for counting (also called whole numbers); they can also be negative numbers, such as -42 (called integers); or, they can be real numbers, such as 3.9 (with a fractional part), which are called single- or double-precision floating-point numbers in computing.
Let's explore numbers:
- Use your preferred code editor to add a new Console Application to the Chapter02 workspace/solution named Numbers:
  - In Visual Studio Code, select Numbers as the active OmniSharp project. When you see the pop-up warning message saying that required assets are missing, click Yes to add them.
  - In Visual Studio, set the startup project to the current selection.
- In Program.cs, delete the existing code and then type statements to declare some number variables using various data types, as shown in the following code:
// unsigned integer means positive whole number or 0
uint naturalNumber = 23;
// integer means negative or positive whole number or 0
int integerNumber = -23;
// float means single-precision floating point
// F suffix makes it a float literal
float realNumber = 2.3F;
// double means double-precision floating point
double anotherRealNumber = 2.3; // double literal
Storing whole numbers
You might know that computers store everything as bits. The value of a bit is either 0 or 1. This is called a binary number system. Humans use a decimal number system.
The decimal number system, also known as Base 10, has 10 as its base, meaning there are ten digits, from 0 to 9. Although it is the number base most commonly used by human civilizations, other number base systems are popular in science, engineering, and computing. The binary number system, also known as Base 2, has two as its base, meaning there are two digits, 0 and 1.
The following table shows how computers store the decimal number 10. Take note of the bits with the value 1 in the 8 and 2 columns; 8 + 2 = 10:
128 | 64 | 32 | 16 | 8 | 4 | 2 | 1
0 | 0 | 0 | 0 | 1 | 0 | 1 | 0
So, 10 in decimal is 00001010 in binary.
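The table above can be checked in code. The following sketch uses the Convert class to move between Base 10 and Base 2; the variable names are illustrative:

```csharp
using System;

int ten = 10;

// Convert to Base 2 text, then pad to eight digits to match the table.
string bits = Convert.ToString(ten, 2).PadLeft(8, '0');

// Convert the Base 2 text back to an int.
int roundTrip = Convert.ToInt32(bits, 2);

Console.WriteLine($"{ten} in binary is {bits}");
Console.WriteLine($"{bits} in decimal is {roundTrip}");
```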
Improving legibility by using digit separators
Two of the improvements seen in C# 7.0 and later are the use of the underscore character _ as a digit separator, and support for binary literals.
You can insert underscores anywhere into the digits of a number literal, including decimal, binary, or hexadecimal notation, to improve legibility.
For example, you could write the value for 1 million in decimal notation, that is, Base 10, as 1_000_000.
You can even use the 2/3 grouping common in India: 10_00_000.
Using binary notation
To use binary notation, that is, Base 2, using only 1s and 0s, start the number literal with 0b. To use hexadecimal notation, that is, Base 16, using 0 to 9 and A to F, start the number literal with 0x.
Exploring whole numbers
Let's enter some code to see some examples:
- In Program.cs, type statements to declare some number variables using underscore separators, as shown in the following code:
// three variables that store the number 2 million
int decimalNotation = 2_000_000;
int binaryNotation = 0b_0001_1110_1000_0100_1000_0000;
int hexadecimalNotation = 0x_001E_8480;
// check the three variables have the same value
// both statements output true
Console.WriteLine($"{decimalNotation == binaryNotation}");
Console.WriteLine(
  $"{decimalNotation == hexadecimalNotation}");
- Run the code and note the result is that all three numbers are the same, as shown in the following output:
True
True
Computers can always exactly represent integers that fit within the range of the int type or one of its sibling types, such as long and short.
Storing real numbers
Computers cannot always represent real, aka decimal or non-integer, numbers precisely. The float and double types store real numbers using single- and double-precision floating points.
Most programming languages implement the IEEE Standard for Floating-Point Arithmetic. IEEE 754 is a technical standard for floating-point arithmetic established in 1985 by the Institute of Electrical and Electronics Engineers (IEEE).
The following table shows a simplification of how a computer represents the number 12.75 in binary notation. Note the bits with the value 1 in the 8, 4, ½, and ¼ columns.
8 + 4 + ½ + ¼ = 12¾ = 12.75.
128 | 64 | 32 | 16 | 8 | 4 | 2 | 1 | . | ½ | ¼ | 1/8 | 1/16
0 | 0 | 0 | 0 | 1 | 1 | 0 | 0 | . | 1 | 1 | 0 | 0
So, 12.75 in decimal is 00001100.1100 in binary. As you can see, the number 12.75 can be exactly represented using bits. However, some numbers can't, something that we'll be exploring shortly.
Writing code to explore number sizes
C# has an operator named sizeof() that returns the number of bytes that a type uses in memory. Some types have members named MinValue and MaxValue, which return the minimum and maximum values that can be stored in a variable of that type. We are now going to use these features to create a console application to explore number types:
- In Program.cs, type statements to show the size of three number data types, as shown in the following code:
Console.WriteLine($"int uses {sizeof(int)} bytes and can store numbers in the range {int.MinValue:N0} to {int.MaxValue:N0}.");
Console.WriteLine($"double uses {sizeof(double)} bytes and can store numbers in the range {double.MinValue:N0} to {double.MaxValue:N0}.");
Console.WriteLine($"decimal uses {sizeof(decimal)} bytes and can store numbers in the range {decimal.MinValue:N0} to {decimal.MaxValue:N0}.");
The width of the printed pages in this book makes the string values (in double quotes) wrap over multiple lines. You must type them on a single line, or you will get compile errors.
- Run the code and view the output, as shown in Figure 2.3:
Figure 2.3: Size and range information for common number data types
An int variable uses four bytes of memory and can store positive or negative numbers up to about 2 billion. A double variable uses eight bytes of memory and can store much bigger values! A decimal variable uses 16 bytes of memory and can store big numbers, but not as big as a double type.
But you may be asking yourself, why might a double variable be able to store bigger numbers than a decimal variable, yet it's only using half the space in memory? Well, let's now find out!
Comparing double and decimal types
You will now write some code to compare double and decimal values. Although it isn't hard to follow, don't worry about understanding the syntax right now:
- Type statements to declare two double variables, add them together and compare them to the expected result, and write the result to the console, as shown in the following code:
Console.WriteLine("Using doubles:");
double a = 0.1;
double b = 0.2;
if (a + b == 0.3)
{
  Console.WriteLine($"{a} + {b} equals {0.3}");
}
else
{
  Console.WriteLine($"{a} + {b} does NOT equal {0.3}");
}
- Run the code and view the result, as shown in the following output:
Using doubles:
0.1 + 0.2 does NOT equal 0.3
In locales that use a comma for the decimal separator the result will look slightly different, as shown in the following output:
0,1 + 0,2 does NOT equal 0,3
The double type is not guaranteed to be accurate because some numbers like 0.1 literally cannot be represented exactly as binary floating-point values.
As a rule of thumb, you should only use double when accuracy, especially when comparing the equality of two numbers, is not important. An example of this may be when you're measuring a person's height and you will only compare values using greater than or less than, but never equals.
The problem with the preceding code is illustrated by how the computer stores the number 0.1, or multiples of it. To represent 0.1 in binary, the computer stores 1 in the 1/16 column, 1 in the 1/32 column, 1 in the 1/256 column, 1 in the 1/512 column, and so on.
The number 0.1 in decimal is 0.00011001100110011… in binary, repeating forever:
4 | 2 | 1 | . | ½ | ¼ | 1/8 | 1/16 | 1/32 | 1/64 | 1/128 | 1/256 | 1/512 | 1/1024 | 1/2048
0 | 0 | 0 | . | 0 | 0 | 0 | 1 | 1 | 0 | 0 | 1 | 1 | 0 | 0
Good Practice: Never compare double values using ==. During the First Gulf War, an American Patriot missile battery kept time in tenths of a second using a binary format that cannot represent 0.1 exactly. The accumulated inaccuracy caused it to fail to track and intercept an incoming Iraqi Scud missile, and 28 soldiers were killed; you can read about this at https://www.ima.umn.edu/~arnold/disasters/patriot.html.
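One common alternative to == for double values is to test whether two values are within a small tolerance of each other. The following is a sketch only; the tolerance value is an assumption chosen for illustration, and a suitable value depends on the magnitude of your data:

```csharp
using System;
using System.Globalization;

double a = 0.1;
double b = 0.2;
double sum = a + b;

// "G17" shows enough digits to reveal the approximation that is stored.
string storedSum = sum.ToString("G17", CultureInfo.InvariantCulture);
Console.WriteLine($"0.1 + 0.2 is stored as {storedSum}");

// Hypothetical tolerance for this sketch; choose one to suit your data.
const double tolerance = 1e-9;

bool exactlyEqual = sum == 0.3;
bool closeEnough = Math.Abs(sum - 0.3) < tolerance;

Console.WriteLine($"Equal using ==: {exactlyEqual}");
Console.WriteLine($"Equal within tolerance: {closeEnough}");
```

The exact comparison fails while the tolerance comparison succeeds, because the stored sum differs from the stored 0.3 only far beyond the digits that matter here.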
- Copy and paste the statements that you wrote before (that used the double variables).
- Modify the statements to use decimal and rename the variables to c and d, as shown in the following code:
Console.WriteLine("Using decimals:");
decimal c = 0.1M; // M suffix means a decimal literal value
decimal d = 0.2M;
if (c + d == 0.3M)
{
  Console.WriteLine($"{c} + {d} equals {0.3M}");
}
else
{
  Console.WriteLine($"{c} + {d} does NOT equal {0.3M}");
}
- Run the code and view the result, as shown in the following output:
Using decimals:
0.1 + 0.2 equals 0.3
The decimal type is accurate because it stores the number as a large integer and shifts the decimal point. For example, 0.1 is stored as 1, with a note to shift the decimal point one place to the left. 12.75 is stored as 1275, with a note to shift the decimal point two places to the left.
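The integer and the shift described above can be inspected with decimal.GetBits, which returns the four 32-bit parts of a decimal value. This is a minimal sketch; the variable names are illustrative:

```csharp
using System;

decimal value = 12.75M;

// parts[0] to parts[2] hold the 96-bit integer; parts[3] holds sign and scale.
int[] parts = decimal.GetBits(value);

// The lowest 32 bits of the integer are enough for a small value like this.
int storedInteger = parts[0];

// Bits 16 to 23 of parts[3] hold how many places to shift the decimal point.
int scale = (parts[3] >> 16) & 0xFF;

Console.WriteLine($"Stored integer: {storedInteger}");
Console.WriteLine($"Places to shift left: {scale}");
```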
Good Practice: Use int for whole numbers. Use double for real numbers that will not be compared for equality to other values; it is okay to compare double values being less than or greater than, and so on. Use decimal for money, CAD drawings, general engineering, and wherever the accuracy of a real number is important.
The double type has some useful special values: double.NaN represents not-a-number (for example, the result of dividing zero by zero), double.Epsilon represents the smallest positive number that can be stored in a double, and double.PositiveInfinity and double.NegativeInfinity represent infinitely large positive and negative values.
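The special values can be produced with ordinary arithmetic. The sketch below divides by a double variable holding zero; the variable names are illustrative:

```csharp
using System;

double zero = 0.0;

double notANumber = zero / zero;        // zero divided by zero
double anotherNaN = zero / zero;
double positiveInfinity = 1.0 / zero;   // positive value divided by zero
double negativeInfinity = -1.0 / zero;  // negative value divided by zero

// NaN never compares as equal, even to another NaN, so use double.IsNaN.
bool nanEqualsNaN = notANumber == anotherNaN;
bool isNaN = double.IsNaN(notANumber);

Console.WriteLine($"NaN == NaN: {nanEqualsNaN}");
Console.WriteLine($"IsNaN: {isNaN}");
Console.WriteLine($"Positive infinity: {double.IsPositiveInfinity(positiveInfinity)}");
Console.WriteLine($"Negative infinity: {double.IsNegativeInfinity(negativeInfinity)}");
Console.WriteLine($"Epsilon is greater than zero: {double.Epsilon > 0}");
```

Unlike integer division by zero, which throws an exception, floating-point division by zero returns one of these special values.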
Storing Booleans
Booleans can only contain one of the two literal values true or false, as shown in the following code:
bool happy = true;
bool sad = false;
They are most commonly used to branch and loop. You don't need to fully understand them yet, as they are covered more in Chapter 3, Controlling Flow, Converting Types, and Handling Exceptions.
Storing any type of object
There is a special type named object that can store any type of data, but its flexibility comes at the cost of messier code and possibly poor performance. For those two reasons, you should avoid it whenever possible. The following steps show how to use object types if you need to use them:
- Use your preferred code editor to add a new Console Application to the Chapter02 workspace/solution named Variables.
- In Visual Studio Code, select Variables as the active OmniSharp project. When you see the pop-up warning message saying that required assets are missing, click Yes to add them.
- In Program.cs, type statements to declare and use some variables using the object type, as shown in the following code:
object height = 1.88; // storing a double in an object
object name = "Amir"; // storing a string in an object
Console.WriteLine($"{name} is {height} metres tall.");
int length1 = name.Length; // gives compile error!
int length2 = ((string)name).Length; // tell compiler it is a string
Console.WriteLine($"{name} has {length2} characters.");
- Run the code and note that the fourth statement cannot compile because the data type of the name variable is not known by the compiler, as shown in Figure 2.4:
Figure 2.4: The object type does not have a Length property
- Add comment double slashes to the beginning of the statement that cannot compile to "comment out" the statement to make it inactive.
- Run the code again and note that the compiler can access the length of a string if the programmer explicitly tells the compiler that the object variable contains a string by prefixing with a cast expression like (string), as shown in the following output:
Amir is 1.88 metres tall.
Amir has 4 characters.
The object type has been available since the first version of C#, but C# 2.0 and later have a better alternative called generics, which we will cover in Chapter 6, Implementing Interfaces and Inheriting Classes. Generics provide the flexibility we want, but without the performance overhead.
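A cast like (string) throws an exception at runtime if the object variable holds some other type. When you cannot avoid object, one defensive option, not covered in this section, is the is pattern available in C# 7.0 and later, which tests the type and assigns a typed variable in one step. A minimal sketch, with -1 used as an arbitrary placeholder for "not a string":

```csharp
using System;

object name = "Amir"; // storing a string in an object
object height = 1.88; // storing a double in an object

// The pattern variable is only assigned when the object holds a string.
int nameLength = (name is string nameText) ? nameText.Length : -1;
int heightLength = (height is string heightText) ? heightText.Length : -1;

Console.WriteLine($"name length: {nameLength}");
Console.WriteLine($"height length: {heightLength}");
```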
Storing dynamic types
There is another special type named dynamic that can also store any type of data, but even more than object, its flexibility comes at the cost of performance. The dynamic keyword was introduced in C# 4.0. However, unlike object, the value stored in the variable can have its members invoked without an explicit cast. Let's make use of a dynamic type:
- Add statements to declare a dynamic variable and then assign a string literal value, and then an integer value, and then an array of integer values, as shown in the following code:
// storing a string in a dynamic object
// string has a Length property
dynamic something = "Ahmed";
// int does not have a Length property
// something = 12;
// an array of any type has a Length property
// something = new[] { 3, 5, 7 };
- Add a statement to output the length of the dynamic variable, as shown in the following code:
// this compiles but would throw an exception at run-time
// if you later store a data type that does not have a
// property named Length
Console.WriteLine($"Length is {something.Length}");
- Run the code and note it works because a string value does have a Length property, as shown in the following output:
Length is 5
- Uncomment the statement that assigns an int value.
- Run the code and note the runtime error because int does not have a Length property, as shown in the following output:
Unhandled exception. Microsoft.CSharp.RuntimeBinder.RuntimeBinderException: 'int' does not contain a definition for 'Length'
- Uncomment the statement that assigns the array.
- Run the code and note the output because an array of three int values does have a Length property, as shown in the following output:
Length is 3
One limitation of dynamic is that code editors cannot show IntelliSense to help you write the code. This is because the compiler cannot check what the type is during build time. Instead, the CLR checks for the member at runtime and throws an exception if it is missing.
Exceptions are a way to indicate that something has gone wrong at runtime. You will learn more about them and how to handle them in Chapter 3, Controlling Flow, Converting Types, and Handling Exceptions.
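As a preview of exception handling, the runtime failure from the dynamic example can be caught instead of ending the program. This is only a sketch; the outcome variable is a hypothetical name used to capture the result:

```csharp
using System;
using Microsoft.CSharp.RuntimeBinder;

// int does not have a Length property
dynamic something = 12;
string outcome;

try
{
  // The member lookup happens at runtime, so this compiles.
  outcome = $"Length is {something.Length}";
}
catch (RuntimeBinderException ex)
{
  // Thrown when the runtime cannot find the member on the actual type.
  outcome = $"Binding failed: {ex.Message}";
}

Console.WriteLine(outcome);
```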
Declaring local variables
Local variables are declared inside methods and exist only during the execution of that method. Once the method returns, the memory allocated to any local variables is released.
Strictly speaking, value types are released while reference types must wait for a garbage collection. You will learn about the difference between value types and reference types in Chapter 6, Implementing Interfaces and Inheriting Classes.
Specifying the type of a local variable
Let's explore local variables declared with specific types and using type inference:
- Type statements to declare and assign values to some local variables using specific types, as shown in the following code:
int population = 66_000_000; // 66 million in UK
double weight = 1.88; // in kilograms
decimal price = 4.99M; // in pounds sterling
string fruit = "Apples"; // strings use double-quotes
char letter = 'Z'; // chars use single-quotes
bool happy = true; // Booleans have value of true or false
Depending on your code editor and color scheme, it will show green squiggles under each of the variable names and lighten their text color to warn you that the variable is assigned but its value is never used.
Inferring the type of a local variable
You can use the var keyword to declare local variables. The compiler will infer the type from the value that you assign after the assignment operator, =.
A literal number without a decimal point is inferred as an int variable, that is, unless you add a suffix, as described in the following list:
- L: infers long
- UL: infers ulong
- M: infers decimal
- D: infers double
- F: infers float
A literal number with a decimal point is inferred as double unless you add the M suffix, in which case, it infers a decimal variable, or the F suffix, in which case, it infers a float variable.
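The suffix rules above can be checked by asking each variable for its runtime type. In this sketch the variable names are illustrative, and GetType().Name returns the .NET type name (for example, Int32 for int and Single for float):

```csharp
using System;

var whole = 42;               // no suffix, no decimal point
var wholeLong = 42L;          // L suffix
var wholeUnsignedLong = 42UL; // UL suffix
var wholeDouble = 42D;        // D suffix
var real = 4.2;               // decimal point, no suffix
var realDecimal = 4.2M;       // M suffix
var realFloat = 4.2F;         // F suffix

Console.WriteLine($"{nameof(whole)}: {whole.GetType().Name}");
Console.WriteLine($"{nameof(wholeLong)}: {wholeLong.GetType().Name}");
Console.WriteLine($"{nameof(wholeUnsignedLong)}: {wholeUnsignedLong.GetType().Name}");
Console.WriteLine($"{nameof(wholeDouble)}: {wholeDouble.GetType().Name}");
Console.WriteLine($"{nameof(real)}: {real.GetType().Name}");
Console.WriteLine($"{nameof(realDecimal)}: {realDecimal.GetType().Name}");
Console.WriteLine($"{nameof(realFloat)}: {realFloat.GetType().Name}");
```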
Double quotes indicate a string variable, single quotes indicate a char variable, and the true and false values infer a bool type:
- Modify the previous statements to use var, as shown in the following code:
var population = 66_000_000; // 66 million in UK
var weight = 1.88; // in kilograms
var price = 4.99M; // in pounds sterling
var fruit = "Apples"; // strings use double-quotes
var letter = 'Z'; // chars use single-quotes
var happy = true; // Booleans have value of true or false
- Hover your mouse over each of the var keywords and note that your code editor shows a tooltip with information about the type that has been inferred.
- At the top of the class file, import the namespace for working with XML to enable us to declare some variables using types in that namespace, as shown in the following code:
using System.Xml;
Good Practice: If you are using .NET Interactive Notebooks, then add using statements in a separate code cell above the code cell where you write the main code. Then click Execute Cell to ensure the namespaces are imported. They will then be available in subsequent code cells.
- Under the previous statements, add statements to create some new objects, as shown in the following code:
// good use of var because it avoids the repeated type
// as shown in the more verbose second statement
var xml1 = new XmlDocument();
XmlDocument xml2 = new XmlDocument();
// bad use of var because we cannot tell the type, so we
// should use a specific type declaration as shown in
// the second statement
var file1 = File.CreateText("something1.txt");
StreamWriter file2 = File.CreateText("something2.txt");
Good Practice: Although using var is convenient, some developers avoid using it, to make it easier for a code reader to understand the types in use. Personally, I use it only when the type is obvious. For example, in the preceding code statements, the first statement is just as clear as the second in stating what the type of the xml variables is, but it is shorter. However, the third statement isn't clear in showing the type of the file variable, so the fourth is better because it shows that the type is StreamWriter. If in doubt, spell it out!
Using target-typed new to instantiate objects
With C# 9, Microsoft introduced another syntax for instantiating objects known as target-typed new. When instantiating an object, you can specify the type first and then use new without repeating the type, as shown in the following code:
XmlDocument xml3 = new(); // target-typed new in C# 9 or later
If you have a type with a field or property that needs to be set, then the type can be inferred, as shown in the following code:
class Person
{
public DateTime BirthDate;
}
Person kim = new();
kim.BirthDate = new(1967, 12, 26); // instead of: new DateTime(1967, 12, 26)
Good Practice: Use target-typed new to instantiate objects unless you must use a pre-version 9 C# compiler. I have used target-typed new throughout the rest of this book. Please let me know if you spot any cases that I missed!
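Target-typed new tends to save the most typing with long generic type names. The following sketch assumes a C# 9 or later compiler; the variable names and the stored age are made up for illustration:

```csharp
using System;
using System.Collections.Generic;

// The type on the left tells the compiler what new() should create.
List<string> names = new();
Dictionary<string, int> ages = new();
DateTime birthDate = new(1967, 12, 26);

names.Add("Kim");
ages["Kim"] = 54; // hypothetical value for this sketch

Console.WriteLine($"{names[0]} was born in {birthDate.Year}.");
Console.WriteLine($"Stored ages: {ages.Count}");
```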
Getting and setting the default values for types
Most of the primitive types except string are value types, which means that they must have a value. You can determine the default value of a type by using the default() operator and passing the type as a parameter. You can assign the default value of a type by using the default keyword.
The string type is a reference type. This means that string variables contain the memory address of a value, not the value itself. A reference type variable can have a null value, which is a literal that indicates that the variable does not reference anything (yet). null is the default for all reference types.
You'll learn more about value types and reference types in Chapter 6, Implementing Interfaces and Inheriting Classes.
Let's explore default values:
- Add statements to show the default values of an int, bool, DateTime, and string, as shown in the following code:
Console.WriteLine($"default(int) = {default(int)}");
Console.WriteLine($"default(bool) = {default(bool)}");
Console.WriteLine($"default(DateTime) = {default(DateTime)}");
Console.WriteLine($"default(string) = {default(string)}");
- Run the code and view the result, noting that your output for the date and time might be formatted differently if you are not running it in the UK, and that null values output as an empty string, as shown in the following output:
default(int) = 0
default(bool) = False
default(DateTime) = 01/01/0001 00:00:00
default(string) =
- Add statements to declare a number, assign a value, and then reset it to its default value, as shown in the following code:
int number = 13;
Console.WriteLine($"number has been set to: {number}");
number = default;
Console.WriteLine($"number has been reset to its default: {number}");
- Run the code and view the result, as shown in the following output:
number has been set to: 13
number has been reset to its default: 0
Storing multiple values in an array
When you need to store multiple values of the same type, you can declare an array. For example, you may do this when you need to store four names in a string array.
The code that you will write next will allocate memory for an array for storing four string values. It will then store string values at index positions 0 to 3 (arrays usually have a lower bound of zero, so the index of the last item is one less than the length of the array).
Good Practice: Do not assume that all arrays count from zero. The most common type of array in .NET is an szArray, a single-dimension zero-indexed array, and these use the normal [] syntax. But .NET also has mdArray, a multi-dimensional array, and they do not have to have a lower bound of zero. These are rarely used but you should know they exist.
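To see an array that does not count from zero, you can ask Array.CreateInstance for specific lower bounds. This is a sketch for illustration only; the lengths, bounds, and stored values are arbitrary:

```csharp
using System;

// Two dimensions with lengths 2 and 3, both starting at index 1.
Array grid = Array.CreateInstance(
  typeof(string), new[] { 2, 3 }, new[] { 1, 1 });

// Store values at the first and last valid index positions.
grid.SetValue("first", 1, 1);
grid.SetValue("last", 2, 3);

int lowerBound = grid.GetLowerBound(0); // lowest index in dimension 0
int upperBound = grid.GetUpperBound(1); // highest index in dimension 1

Console.WriteLine($"Dimension 0 starts at {lowerBound}");
Console.WriteLine($"Dimension 1 ends at {upperBound}");
Console.WriteLine($"Item at 2, 3: {grid.GetValue(2, 3)}");
```

Code that loops over such an array should use GetLowerBound and GetUpperBound rather than assuming the range 0 to Length - 1.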
Finally, it will loop through each item in the array using a for statement, something that we will cover in more detail in Chapter 3, Controlling Flow, Converting Types, and Handling Exceptions.
Let's look at how to use an array:
- Type statements to declare and use an array of string values, as shown in the following code:
string[] names; // can reference any size array of strings
// allocating memory for four strings in an array
names = new string[4];
// storing items at index positions
names[0] = "Kate";
names[1] = "Jack";
names[2] = "Rebecca";
names[3] = "Tom";
// looping through the names
for (int i = 0; i < names.Length; i++)
{
  // output the item at index position i
  Console.WriteLine(names[i]);
}
- Run the code and note the result, as shown in the following output:
Kate
Jack
Rebecca
Tom
Arrays are always of a fixed size at the time of memory allocation, so you need to decide how many items you want to store before instantiating them.
An alternative to defining the array in three steps as above is to use array initializer syntax, as shown in the following code:
string[] names2 = new[] { "Kate", "Jack", "Rebecca", "Tom" };
When you use the new[] syntax to allocate memory for the array, you must have at least one item in the curly braces so that the compiler can infer the data type.
Arrays are useful for temporarily storing multiple items, but collections are a more flexible option when adding and removing items dynamically. You don't need to worry about collections right now, as we will cover them in Chapter 8, Working with Common .NET Types.