Learning basic Rust concepts
Rust is an extensive language, and covering it completely would require a book of its own, so you are advised to supplement your learning with a dedicated Rust book, such as the one available at https://doc.rust-lang.org/book/. However, the concepts that we will most often need when working with blockchains can be covered quickly, and that is what we will attempt to do here.
First, let’s talk about variables and constants. These are the basic building blocks of any programming language.
Variables and constants
Variables are crucial to Rust as they are values that may change multiple times throughout the lifetime of the program. As can be seen in the following example, defining variables in Rust is extremely simple:
fn main() {
    let x = 5;
    println!("The value of x is: {x}");
}
In the preceding code, we assign a value of 5
to x
, where x
is the variable, after which we print out the value of x
. The code is enclosed in the main function, which is the entry point in all Rust programs.
However, by default, as a safety check and to avoid unpredictable behavior, all variables in Rust are immutable (you cannot assign a new value to them once they are bound), so the following example won’t work:
fn main() {
    let x = 5;
    println!("The value of x is: {x}");
    x = 6; // error: cannot assign twice to immutable variable
    println!("The value of x is: {x}");
}
The preceding program will give us an error. However, there is a way to make variables mutable, and that is by adding the mut
keyword:
fn main() {
    let mut x = 5;
    println!("The value of x is: {x}");
    x = 6;
    println!("The value of x is: {x}");
}
Let’s quickly talk about constants. They’re similar to immutable variables but with some differences. First, you can’t use the mut keyword with them: constants aren’t just immutable by default, they are always immutable. Next, you use the const keyword instead of let to define constants, and you must always annotate the type. The last difference is that you can’t set a constant to a value that is computed at runtime; it must always be a constant expression. The following code block shows an example of a constant declaration:
const HOURS_IN_A_DAY: u32 = 24;
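To illustrate what a constant expression can look like, here is a minimal sketch. The extra constant names and the seconds_in_days helper are assumptions made for this example and are not part of the original text. It shows that a constant may be built from other constants, because the compiler can evaluate the whole expression at compile time:

```rust
// Illustrative constants; apart from HOURS_IN_A_DAY, the names are
// assumptions made for this sketch.
const SECONDS_IN_A_MINUTE: u32 = 60;
const MINUTES_IN_AN_HOUR: u32 = 60;
const HOURS_IN_A_DAY: u32 = 24;

// A constant may be computed from other constants, because the whole
// expression can be evaluated at compile time.
const SECONDS_IN_A_DAY: u32 = SECONDS_IN_A_MINUTE * MINUTES_IN_AN_HOUR * HOURS_IN_A_DAY;

// Hypothetical helper that uses the constant at runtime.
fn seconds_in_days(days: u32) -> u32 {
    days * SECONDS_IN_A_DAY
}

fn main() {
    println!("Seconds in a day: {SECONDS_IN_A_DAY}");
    println!("Seconds in a week: {}", seconds_in_days(7));
}
```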
An interesting property that needs to be pointed out here is shadowing. If we use the let keyword again to declare a new variable named x, we will not get an error. Let’s look at an example:
fn main() {
    let mut x = 5;
    println!("The value of x is: {x}");
    let x = 6;
    println!("The value of x is: {x}");
}
In this example, the second let does not reassign the original x; it declares a new variable with the same name that shadows the first one, so from that point on, x refers to the value 6. Because the first x is never actually mutated, its mut keyword is unnecessary. Please note that when you run this code example, you will receive a warning that the variable does not need to be mutable, but there will be no error.
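The difference between declaring a new variable with let and changing a mutable one can be sketched as follows. The function names are hypothetical and chosen only for this illustration. The first function reuses a name for a value of a different type, which only a second let allows; the second changes a value in place, which requires mut:

```rust
// Declaring a variable again with `let` creates a new variable that
// reuses the name; it may even have a different type.
fn shadowed_length() -> usize {
    let spaces = "   ";        // a string slice (&str)
    let spaces = spaces.len(); // a new variable of type usize
    spaces
}

// Mutation changes the value stored in the same variable; the type
// must stay the same.
fn mutated_value() -> i32 {
    let mut x = 5;
    x += 1;
    x
}

fn main() {
    println!("shadowed: {}", shadowed_length());
    println!("mutated: {}", mutated_value());
}
```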
Data types
As we learned in the Statically typed section, Rust is a statically typed language, meaning it has a requirement that it needs to be aware of the type of every single variable at compile time. Due to this, it is important to take a look at the various data types in Rust.
Let’s get started. In this section, we will look at the scalar data types in Rust, namely integers, floating-point numbers, Booleans, and characters. In the Tuples and arrays section, we will learn about compound data types – tuples and arrays.
Let’s start with integers – they can either be signed or unsigned. Signed integers are represented with an i – for example, i16
and i32
, while unsigned integers are represented with a u – for example, u16
, and u32
. Here, signed and unsigned refer to the possibility of the occurrence of a negative number and the digits 16 and 32 represent the number of bits of space that the variable is going to take. Additionally, Rust includes isize and usize types, which are architecture-dependent integer types. The isize and usize types are primarily used for indexing collections and interfacing with system calls, with their size varying based on the underlying machine architecture.
The most commonly used bit widths are 8, 16, 32, 64, and 128, and the default integer type that Rust infers when no type is specified is i32.
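As a rough sketch of what the bit width means in practice, the following example prints the range of a few integer types and shows one way to detect overflow. The safe_add helper is a hypothetical name introduced for this illustration; checked_add, MIN, MAX, and BITS are part of the standard library:

```rust
// Adds two u8 values, returning None instead of overflowing.
fn safe_add(a: u8, b: u8) -> Option<u8> {
    a.checked_add(b)
}

fn main() {
    // Each integer type exposes its range through MIN and MAX.
    println!("i8:  {} to {}", i8::MIN, i8::MAX);
    println!("u8:  {} to {}", u8::MIN, u8::MAX);
    println!("i32: {} to {}", i32::MIN, i32::MAX);

    // usize matches the pointer width of the target machine.
    println!("usize is {} bits wide", usize::BITS);

    println!("{:?}", safe_add(200, 50));  // the result fits in a u8
    println!("{:?}", safe_add(200, 100)); // the result does not fit
}
```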
The next scalar data type is floating-point numbers, which (unlike integers) can represent values with a fractional part. All floating-point types are signed (again, this is different from integers). The two sub-types are f32 and f64, which are 32 bits and 64 bits in size, respectively. Unlike integers, the default here is 64 bits. While f32 has single precision, f64 has double precision at roughly the same speed on modern CPUs.
The following code block shows floating-point numbers in action:
fn main() {
    let x = 2.0;
    let y: f32 = 3.0;
}
In the preceding example, in the first line, since we don’t mention the number of bits, Rust selects 64-bit by default. In the second line, we have explicitly specified 32-bit.
The next data type is Boolean and it has two possible values – true
and false
. Let’s look at an example:
fn main() {
    let t = true;
    let f: bool = false;
}
In the preceding example, we can see that we can either assign the Boolean value directly to the variable or we can mention bool
(as can be observed on the second line), which is how we indicate the Boolean type to Rust.
The last scalar type is char
and we can use it like this:
fn main() {
    let t: char = 'z';
    let f: char = '😄';
}
As we can see, char
can represent a lot more than just American Standard Code for Information Interchange (ASCII) characters. Now, it’s time to move on to compound data types.
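To make this concrete, here is a small sketch that inspects how much space a character takes. The utf8_len helper is a name made up for this example; size_of and len_utf8 come from the standard library. A char is stored as a four-byte Unicode scalar value, while its UTF-8 encoding inside a string takes between one and four bytes:

```rust
use std::mem::size_of;

// Number of bytes a character needs when encoded as UTF-8.
fn utf8_len(c: char) -> usize {
    c.len_utf8()
}

fn main() {
    // A char value always occupies four bytes in memory.
    println!("size of char: {} bytes", size_of::<char>());
    // Inside a string, the UTF-8 encoding length depends on the character.
    println!("'z' in UTF-8: {} byte(s)", utf8_len('z'));
    println!("'😄' in UTF-8: {} byte(s)", utf8_len('😄'));
}
```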
Tuples and arrays
The two compound data types present in Rust are tuples and arrays. First, let’s take a look at tuples. These are comma-separated lists of values inside parentheses where each value has a type.
There are two important things to remember with tuples – firstly, each value in the tuple can have a different type and they don’t have to be the same, and secondly, tuples have a fixed length and once this length has been declared, tuples cannot grow or shrink in size. An important additional point is that tuples are product data types, which means they hold a value of every element type at once, and their total size is the aggregate of the sizes of all contained elements, along with any necessary padding. To better understand tuples, let’s look at an example:
fn main() {
    let tup = (500, 6.4, 1);
    let (x, y, z) = tup;
    println!("the value of y is: {y}");
    let x: (i32, f64, u8) = (500, 6.4, 1);
    let five_hundred = x.0;
}
There are five lines in the preceding example. We will break these down to expand our understanding.
The first line defines a tuple called tup and assigns three comma-separated values to it. To access the individual values, we can destructure the tuple, as we have done in the second line, and then print the value of y in the third line.
In the fourth line, we are more explicit in defining our tuple and annotate the type of every single value for more control. The fifth line shows another way to access individual values from the tuple: by position, using a dot followed by the index. So, with this example, not only have we learned how to assign values to a tuple but also how to access those values. One thing to note is that in the preceding case, the size of the explicitly typed tuple, x, will be at least the combined size of an i32, an f64, and a u8, plus any padding the compiler adds for alignment.
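The relationship between element sizes and padding can be checked with std::mem::size_of. The following is a minimal sketch; the Sample alias and the fields_size helper are names introduced here for illustration. Rust does not guarantee a particular layout for tuples, so the example prints the total rather than assuming it:

```rust
use std::mem::{align_of, size_of};

// Hypothetical alias for the tuple type used in the discussion above.
type Sample = (i32, f64, u8);

// Sum of the sizes of the individual elements, ignoring padding.
fn fields_size() -> usize {
    size_of::<i32>() + size_of::<f64>() + size_of::<u8>()
}

fn main() {
    println!("elements alone: {} bytes", fields_size());
    // The whole tuple is at least as large as its elements and is a
    // multiple of its alignment, so it may include padding bytes.
    println!("whole tuple:    {} bytes", size_of::<Sample>());
    println!("alignment:      {} bytes", align_of::<Sample>());
}
```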
Now, let’s look at the second compound data type in Rust: array. Now, arrays are also a collection of values, just like tuples, but with a major difference – all the values in an array must be of the same type. Just like tuples, arrays also have a fixed length and are useful when you want to ensure that you always have a fixed number of elements. Let’s look at an example to explore arrays:
fn main() {
    let a = [1, 2, 3, 4, 5];
    let months = ["jan", "feb", "mar"];
    let b: [i32; 5] = [1, 2, 3, 4, 5];
    let z = [3; 5];
    let first = a[0];
}
In the preceding example, in the first line, we can see how we can assign an array to a variable. The second line demonstrates the same, but all the values being assigned are string literals, and these two variables (a and months) make it clear that all the values in an array need to be of the same type. In the third line, there’s a more explicit definition where we mention not only the element type of the array but also its length.
The fourth line is slightly different and if you were to print the value of z
, you would get [3, 3, 3, 3, 3]
as the output. This is because in [3; 5]
, 3
denotes the value that will be stored in the array and 5
represents the number of times it will be stored or the length of the array. So, this gives us a great way to define an array that has repeat values.
The last line demonstrates how to access the values of an array with the help of the index, where the index starts with 0
and goes all the way up to n-1
, where n
is the length of the array. So, the a
array starts with 0
for the first value and goes to 4
for the last value.
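Since valid indices only run from 0 to n-1, it helps to see what happens outside that range. The sketch below uses a hypothetical element_at helper built on the standard get method, which returns an Option instead of failing when the index is too large:

```rust
// Returns the element at `index`, or None when the index is out of range.
fn element_at(values: &[i32; 5], index: usize) -> Option<i32> {
    values.get(index).copied()
}

fn main() {
    let a = [1, 2, 3, 4, 5];
    println!("length: {}", a.len());
    println!("{:?}", element_at(&a, 0)); // first element
    println!("{:?}", element_at(&a, 4)); // last element
    // Indexing out of bounds with square brackets is an error;
    // get() returns None instead.
    println!("{:?}", element_at(&a, 5));
}
```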
Numeric operations
All the basic mathematical operations are supported in Rust – for example, addition, subtraction, multiplication, division, and remainder. Let’s look at an example to understand this:
fn main() {
    let sum = 15 + 2;
    let difference = 15.3 - 2.2;
    let multiplication = 2 * 20;
    let division = 20 / 2;
    let remainder = 21 % 2;
}
The mathematical operators that are used in Rust, such as / for division and % for the remainder, are standard and are the same as in most other programming languages. The preceding example is quite straightforward and self-explanatory.
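One detail that the operators alone do not reveal is how division behaves for integers compared with floating-point numbers. The following sketch uses helper names invented for this example. Integer division truncates toward zero, and the remainder takes the sign of the left-hand operand:

```rust
// Integer division discards the fractional part (truncates toward zero).
fn int_div(a: i32, b: i32) -> i32 {
    a / b
}

// The remainder has the same sign as the left-hand operand.
fn int_rem(a: i32, b: i32) -> i32 {
    a % b
}

// Floating-point division keeps the fractional part.
fn float_div(a: f64, b: f64) -> f64 {
    a / b
}

fn main() {
    println!("7 / 2 = {}", int_div(7, 2));
    println!("-7 / 2 = {}", int_div(-7, 2));
    println!("-7 % 2 = {}", int_rem(-7, 2));
    println!("7.0 / 2.0 = {}", float_div(7.0, 2.0));
}
```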
In Rust, the memory layout is an essential concept and mainly comprises the stack and the heap, along with virtual tables (v-tables) for polymorphism. First, let’s take a look at the stack.
Stack
The stack in Rust is a region of memory that’s used for static memory allocation. It operates in a very organized last-in, first-out manner. Variables stored on the stack have fixed sizes known at compile time. This makes stack operations incredibly fast as it’s just about moving the stack pointer up and down:
fn main() {
    let x = 5;    // Integer stored on the stack
    let y = true; // Boolean stored on the stack
    let z = 'a';  // Character stored on the stack
}
In the preceding example, x
, y
, and z
are all variables with sizes known at compile time. Rust allocates space for these variables directly on the stack. Each of these variables is pushed onto the stack when main()
starts, and popped off the stack when main()
completes. The efficiency of the stack comes from its predictability and the simplicity of the push/pop operations. Now, let’s move on to the heap.
Heap
The heap is crucial for dynamic memory allocation in Rust. It is where variables or data structures whose size might change or is unknown at compile time are stored. Since accessing the heap involves following pointers and more complex management, it’s slower compared to stack operations:
fn main() {
    let mut s = String::from("hello"); // String stored on the heap
    s.push_str(", world!");            // Modifying the string
}
In the preceding code, s
is a String
type, which is mutable and can change size. Initially, Rust allocates memory on the heap for "hello"
. When we modify s
using push_str
, it might need more space than initially allocated. The heap allows this flexibility, but it requires Rust to manage the memory, keep track of its size, and potentially move it if more space is needed. Next, we’ll look at v-tables.
V-tables
V-tables enable Rust to support dynamic dispatch, particularly with trait objects. A v-table (or virtual method table) is a mechanism that’s used in object-oriented programming for method resolution at runtime:
trait Animal {
    fn speak(&self);
}

struct Dog;
struct Cat;

impl Animal for Dog {
    fn speak(&self) {
        println!("Dog says Bark!");
    }
}

impl Animal for Cat {
    fn speak(&self) {
        println!("Cat says Meow!");
    }
}

fn make_sound(animal: &dyn Animal) {
    animal.speak();
}

fn main() {
    let dog = Dog;
    let cat = Cat;
    make_sound(&dog);
    make_sound(&cat);
}
In this expanded example, make_sound
is a function that takes a trait
object, &dyn Animal
. We have two structs, Dog
and Cat
, each implementing the Animal
trait. When make_sound(&dog)
and make_sound(&cat)
are called, Rust uses the v-table of each object (Dog
and Cat
) to look up and call the appropriate speak
method. The v-table is essentially a lookup table where Rust can find the correct method implementations for a trait object at runtime, allowing for polymorphism.
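One way to see the v-table at work is to store different types behind the same trait object type and to compare pointer sizes. The sketch below is a variation on the example above and rests on two assumptions made for illustration: speak returns a String instead of printing, and chorus is a helper name invented here. A reference to a trait object is twice as wide as a plain reference because it carries a second pointer to the v-table:

```rust
use std::mem::size_of;

trait Animal {
    fn speak(&self) -> String;
}

struct Dog;
struct Cat;

impl Animal for Dog {
    fn speak(&self) -> String {
        String::from("Bark!")
    }
}

impl Animal for Cat {
    fn speak(&self) -> String {
        String::from("Meow!")
    }
}

// Collects the sound of every animal; each call to speak() is resolved
// at runtime through the v-table of the concrete type.
fn chorus(animals: &[Box<dyn Animal>]) -> Vec<String> {
    animals.iter().map(|a| a.speak()).collect()
}

fn main() {
    let animals: Vec<Box<dyn Animal>> = vec![Box::new(Dog), Box::new(Cat)];
    println!("{:?}", chorus(&animals));

    // A plain reference is one pointer wide; a trait-object reference
    // holds a data pointer plus a pointer to the v-table.
    println!("&Dog:        {} bytes", size_of::<&Dog>());
    println!("&dyn Animal: {} bytes", size_of::<&dyn Animal>());
}
```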
Slices
Slices are defined as follows:
“A slice is a pointer to a block of memory and can be used to access portions of data stored in contiguous memory blocks.”
To learn more, please refer to https://www.tutorialspoint.com/rust/rust_slices.htm. This definition might be slightly confusing to us right now as we haven’t covered the concepts of accessing memory and pointers yet. So, to understand slices with ease, let’s break down the complexity slightly.
Slices enable us to refer to a part of a string or an array. Let’s look at two separate examples that will help us understand this statement:
fn main() {
    let n1 = "example".to_string();
    let c1 = &n1[4..7];
}
In the preceding example, if you were to print the value of c1
, you would get ple
, because c1
only refers to a part of the n1
string.
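Let’s see how this works. The values in the square brackets, [4..7], indicate that we want to refer to the characters starting at index 4 and ending just before index 7. Since counting starts at 0, index 4 is the fifth character, p, and index 6 is the last character, e. The start of the range is included and the end is excluded, hence ple is taken from n1 into c1. Note that the indices of a string slice are byte offsets, which coincide with character positions here because every character in example is ASCII.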
Something interesting happens on line 2: the presence of &. This is the reference operator, and it lets c1 borrow the selected part of n1 without copying the data or taking ownership of n1. We will look at this in more detail in the Ownership and borrowing section. You may have also noticed to_string() in the first line; we will discuss this in the next section.
Slices are not just useful for strings, but for arrays of numbers as well, so let’s learn about this via an example:
fn main() {
    let arr = [5, 7, 9, 11, 13];
    let slice = &arr[1..3];
    assert_eq!(slice, &[7, 9]);
}
In the preceding example, we take an array called arr and then a slice that selects the values starting at index 1 and ending just before index 3, which simply means 7 and 9. This is why, in the last line, we assert that the slice we created from arr is equal to a slice containing 7 and 9.
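Ranges can also be open-ended or inclusive, and a range that does not fit inside the array is an error when used with square brackets. The following sketch shows these forms; the window helper is a hypothetical name, built on the standard get method, which returns None for a range that is out of bounds:

```rust
// Returns the sub-slice for `start..end`, or None when the range does
// not fit inside `values`.
fn window(values: &[i32], start: usize, end: usize) -> Option<&[i32]> {
    values.get(start..end)
}

fn main() {
    let arr = [5, 7, 9, 11, 13];
    println!("{:?}", &arr[..2]);   // from the start up to, but not including, index 2
    println!("{:?}", &arr[3..]);   // from index 3 to the end
    println!("{:?}", &arr[1..=3]); // inclusive range: indices 1, 2, and 3
    println!("{:?}", window(&arr, 1, 3)); // range fits inside the array
    println!("{:?}", window(&arr, 3, 9)); // range runs past the end
}
```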
Strings
Working with strings is straightforward in Rust, but it’s important to know the difference between the String type and string literals. String literals enable us to hard-code some text into our program, as shown in the following code block:
fn main() { let s = "hello"; }
In the preceding example, s is a variable that is bound to the string literal "hello". We can’t modify this text because, in Rust, string literals are immutable: they are stored directly in the compiled program. This is why we have the String type:
fn main() {
    let mut hello = String::from("hello");
    hello.push('w');
    hello.push_str("orld!");
}
In the preceding example, note that we create a mutable variable called hello of the String type from the string literal "hello" with the help of String::from. String is part of the std library. Here, we get access to methods such as push, which enables us to append a single char ('w', in this case) to the String. The push_str method enables us to append a string slice ("orld!") to our existing string, so hello ends up holding helloworld!.
The topic of strings is incomplete without discussing to_string(). Here’s an example:
fn main() {
    let i = 5;
    let five = String::from("5");
    assert_eq!(five, i.to_string());
}
In the preceding example, we take a variable, i, which has a value of 5. This is a number, and since no type is given, Rust infers i32. After that, we take a variable, five, which is a String created from the string literal "5".
In the last line, we use the assert_eq! macro to compare the values of five and i. We know that one of them is a string and the other is a number, so we use the to_string() method to convert the value of i into a string. Now, when we compare their values, they will be equal.
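The conversion also works in the opposite direction. As a hedged sketch, the following example turns a number into a String and back again using the standard parse method. The parse_number helper is a name introduced only for this illustration, and it returns None when the text is not a valid number:

```rust
// Converts text to a number, returning None when the text is not a
// valid i32.
fn parse_number(text: &str) -> Option<i32> {
    text.trim().parse::<i32>().ok()
}

fn main() {
    let i = 5;
    let as_text = i.to_string();       // number -> String
    let back = parse_number(&as_text); // String -> number
    println!("{as_text} -> {back:?}");
    println!("{:?}", parse_number("five")); // not a number
}
```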
Enums
Enums in Rust provide us with a way of listing the different values that a particular type can take. For example, if you wanted to list the different types of proxy servers, you would list forward proxy and reverse proxy as the two values that can be used. Let’s consider an example where we define different types of cache strategies.
First, let’s define a CacheType enum to represent two types of cache strategies – least recently used (LRU) and most recently used (MRU):
enum CacheType {
    LRU,
    MRU,
}
Now, we can create variables of this enum type:
let lru_cache = CacheType::LRU;
let mru_cache = CacheType::MRU;
Here, lru_cache
and mru_cache
are instances of CacheType
. We use the ::
syntax to specify which variant of the enum we want to create.
Next, let’s consider a struct named Cache
. This struct will have a field that uses our CacheType
enum. Note that we avoid using “type” as a field name since it’s a reserved keyword in Rust. Instead, we’ll use cache_type
:
struct Cache {
    level: String,
    cache_type: CacheType,
}
In this example, the Cache
struct has two fields: level
, which is a string, and cache_type
, which is of the CacheType
type. This demonstrates how enums can be integrated into structs, offering a structured and clear way to define data.
By using enums, we set a clear, defined set of values for a variable type, reducing errors and misunderstandings in our code. This is especially helpful for collaborative programming, ensuring consistency and clarity. The use of structs, which we will explore in more detail later, further enhances this by allowing us to create complex data types that incorporate these enums.
Enums in Rust are not just for listing values; they offer a wide range of functionalities. Beyond the basic use demonstrated earlier, enums can serve several other purposes:
- C-type enums: In Rust, enums can behave like C-type enums, where each variant is automatically assigned an integer value starting from 0, or a predefined value if specified:
enum StatusCode {
    Ok = 200,
    BadRequest = 400,
    NotFound = 404,
}
Here, StatusCode::Ok has a value of 200, BadRequest has a value of 400, and so on.
- Enums with values: Rust enums can also hold data. This feature allows more complex data structures than simple named values:
enum CacheStrategy {
    LRU(String),
    MRU(i32),
}
In this example, LRU holds a String type, and MRU holds an i32 type. This makes enums incredibly powerful for diverse data representations.
- Enum size: An important characteristic of enums in Rust is that the size of an enum type is determined by the largest variant it can hold, plus any space needed to record which variant is active. This is crucial for understanding memory allocation when using enums.
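The points in the preceding list can be sketched in one short program. The code_of and describe helpers, and the sample values passed to them, are assumptions made for this illustration. The example casts a C-type enum to its integer value, uses match to extract the data held by each variant, and prints the size of the enum without assuming a particular number:

```rust
use std::mem::size_of;

enum StatusCode {
    Ok = 200,
    BadRequest = 400,
    NotFound = 404,
}

enum CacheStrategy {
    LRU(String),
    MRU(i32),
}

// A C-type enum can be cast to its integer value with `as`.
fn code_of(status: StatusCode) -> i32 {
    status as i32
}

// Matching on the variant gives access to the data stored inside it.
fn describe(strategy: &CacheStrategy) -> String {
    match strategy {
        CacheStrategy::LRU(name) => format!("LRU cache named {name}"),
        CacheStrategy::MRU(capacity) => format!("MRU cache holding {capacity} entries"),
    }
}

fn main() {
    println!("{}", code_of(StatusCode::Ok));
    println!("{}", code_of(StatusCode::BadRequest));
    println!("{}", code_of(StatusCode::NotFound));

    println!("{}", describe(&CacheStrategy::LRU(String::from("sessions"))));
    println!("{}", describe(&CacheStrategy::MRU(128)));

    // The enum must be large enough for its largest variant (String here).
    println!("String:        {} bytes", size_of::<String>());
    println!("CacheStrategy: {} bytes", size_of::<CacheStrategy>());
}
```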
By understanding these advanced aspects of enums, we can appreciate their versatility and the powerful role they play in Rust’s type system. Enums go beyond simple value substitution, allowing for complex data representations and controlled memory usage. With this knowledge, we can use enums to create more efficient and expressive programs in Rust.
Now that we have a firm grip on the basic concepts of Rust, let’s learn some intermediate concepts that will help us in the next chapters when we build actual projects.