If you have coded in another language, you will have used these data types already. However, Rust has some quirks that can throw developers, especially if they come from dynamic languages. In order to see the motivation behind these quirks, it's important that we explore why Rust is such a paradigm-shifting language.
Why Rust?
With programming, there is usually a trade-off between speed/resources and development speed/safety. Low-level languages such as C/C++ can give the developer fine-grained control over the computer with fast code execution and minimal resource consumption. However, this is not free. Manual memory management can induce bugs and security vulnerabilities. On top of this, it takes more code and time to solve a problem in a low-level language. As a result of this, C++ web frameworks do not take up a large share of web development. Instead, it made sense to go for high-level programming languages where developers can solve problems safely and quickly.
However, it has to be noted that this memory safety comes at a cost. Languages such as Python, JavaScript, PHP, and Java keep track of all the variables defined and their references to a memory address. When there are no more variables pointing to a memory address, the data in that memory address gets deleted. This process is called garbage collection and consumes extra resources and time.
With Rust, memory safety is ensured without the costly garbage collection process. Instead, the compiler maps the variables, enforcing rules to ensure safety via a mechanism called the borrow checker. Because of this, Rust has enabled rapid, safe problem solving with truly performant code, thus breaking the speed/safety trade-off. As more data processing, traffic, and complex tasks are lifted into the web stack, Rust, with its growing number of web frameworks and libraries, has now become a viable choice for web development.
Before we get into developing a web app in Rust, we're going to briefly cover the basics of Rust. All of the code examples provided can be run in the online Rust playground at https://play.rust-lang.org/.
In the Rust playground, you may have the following layout:
fn main() {
    println!("Hello, world!");
}
The main function is the entry point where the code is run. If you're coming from a JavaScript or PHP background, your entry point is the first line of the file that is directly run, and the whole code block is essentially a main function. This is also true of Python; however, a closer analogy would be the main block that would be run if the file is directly run by the interpreter:
if __name__ == "__main__":
    print("Hello, World!")
This is often used to define an entry point in something such as a Flask application.
Using strings in Rust
Rust, like other languages, has typical data formats such as strings, integers, floats, arrays, and hash maps (dictionaries). However, because of the way in which Rust manages memory, there are some quirks we have to look out for when using them. These quirks can be easily understood and handled but can trip up experienced developers from dynamic languages if they are not warned about them.
In this section, we will cover enough memory management that we can start defining and using various data types and variables. We will dive into the concepts of memory management in more detail in the Controlling variable ownership section, later in this chapter.
We will start off with strings. We can create our own print function that accepts a string and prints it:
fn print(input_string: String) {
    println!("{}", input_string);
}
fn main() {
    let test_string = String::from("Hello, World!");
    print(test_string);
}
Here, we defined a string using the from function of the String type, and then passed it through our own print function to print it using Rust's built-in println! macro. (The ! denotes that println! is a macro rather than a function; unlike a function, a macro can accept a variable number of arguments. We will cover macros later.)
Notice that the print function expects a String to be passed through, so we must annotate the parameter with its type. This is the minimum amount of typing that's needed for a function. Now, we can try something a bit more familiar for a dynamic language. We don't call a String function; we just define the string using quotation marks:
fn print(input_string: str) {
    println!("{}", input_string);
}
fn main() {
    let test_string = "Hello, World!";
    print(test_string);
}
What we have done here is defined a string literal and passed it through the print function to be printed. However, we get the following error:
error[E0277]: the size for values of type `str` cannot be known at compilation time
In order to understand this, we have to have a high-level understanding of stack and heap memory.
Stack memory is fast, but every value stored on it must have a size that is known at compile time. Heap memory is slower to allocate and is requested at runtime, so it can hold data whose size is only known while the program runs. The str type is the string data itself, which can be of any length, so it has no fixed size. A String, on the other hand, has a fixed size on the stack that consists of a pointer to the string data on the heap, the capacity of that buffer, and the length of the string. When we declare a parameter as a bare str, the compiler has no idea how much stack space to reserve for the argument, hence the error. String literals can be converted into strings with to_string:
fn print(input_string: String) {
    println!("{}", input_string);
}
fn main() {
    let test_string = "Hello, World!";
    print(test_string.to_string());
}
Here, we converted the string literal just before passing it through the print function. We can also get the print function to accept a reference to the string data by using the & operator in the parameter type:
fn print(input_string: &str) {
    println!("{}", input_string);
}
fn main() {
    let test_string = "Hello, World!";
    print(test_string);
}
Borrowing will be covered later in this chapter. What is essentially happening here is that test_string is merely a reference to the string literal (a string literal already has the &str type, so no extra & is needed when defining it), and this reference is then passed through to the print function. A reference has a fixed size, so the compiler is happy. One last thing we must note about strings is that we can get a &str from a String with the as_str method.
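As a minimal sketch of the as_str method, the following example borrows the contents of a String and passes it to a function that takes a &str. The shout function is a made-up name for illustration only. The final line checks the fixed-size layout described earlier (a pointer, a length, and a capacity); this holds for current Rust releases on common platforms but should be treated as an implementation detail rather than a guarantee:

```rust
// Hypothetical helper: accepts a borrowed string slice, so it works for
// string literals and for Strings converted with as_str.
fn shout(input: &str) -> String {
    input.to_uppercase()
}

fn main() {
    let owned: String = String::from("Hello, World!");

    // as_str borrows the contents of the String as a &str without copying.
    let slice: &str = owned.as_str();
    println!("{}", shout(slice));

    // A string literal is already a &str, so it can be passed directly.
    println!("{}", shout("hello again"));

    // Compares the size of the String handle with three pointer-sized fields.
    println!(
        "{}",
        std::mem::size_of::<String>() == 3 * std::mem::size_of::<usize>()
    );
}
```

Accepting &str in function signatures is usually the more flexible choice, because callers holding either a literal or a String can use the function without giving up ownership.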
Understanding integers and floats
Rust has signed integers (denoted by i) and unsigned integers (denoted by u) that consist of 8, 16, 32, 64, and 128 bits. The math behind binary notation is not relevant for the scope of this book. What we do need to understand, though, is the range of numbers allowed in terms of bits. Because each bit is either 0 or 1, we can calculate the number of values an integer can hold by raising two to the power of the number of bits. For example, for 8 bits, 2 to the power of 8 equates to 256. Considering the 0, this means that a u8 integer should have a range of 0 to 255, which can be tested by using the following code:
let number: u8 = 255;
Let's take a look at the following code:
let number: u8 = 256;
It's not surprising that the preceding code gives us the following overflow error:
literal `256` does not fit into the type `u8` whose range is `0..=255`
What may be less expected is what happens if we change it to a signed integer:
let number: i8 = 255;
Here, we get the following error:
literal `255` does not fit into the type `i8` whose range is `-128..=127`
This is because unsigned integers only house zero and positive integers, whereas signed integers house both positive and negative integers. Since the number of bits is fixed, a signed integer has to spread the same 256 values across both sides of zero, so the largest magnitude it can hold is essentially half that of the unsigned integer.
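The ranges above do not need to be memorized, as each integer type exposes its bounds as the MIN and MAX constants. The following sketch prints them and uses a made-up helper, fits_in_i8, to check whether a wider integer would fit into an i8:

```rust
// Hypothetical helper: reports whether a value fits into an i8 by comparing
// it with the bounds of the type.
fn fits_in_i8(value: i32) -> bool {
    value >= i32::from(i8::MIN) && value <= i32::from(i8::MAX)
}

fn main() {
    println!("u8 range: {}..={}", u8::MIN, u8::MAX); // u8 range: 0..=255
    println!("i8 range: {}..={}", i8::MIN, i8::MAX); // i8 range: -128..=127

    println!("{}", fits_in_i8(127)); // true
    println!("{}", fits_in_i8(255)); // false
}
```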
In terms of floats, Rust accommodates f32 and f64 floating points, which can be both negative and positive. Declaring a floating-point variable requires the same syntax as integers:
let float: f32 = 20.6;
It has to be noted that we can also annotate numbers with suffixes, as shown in the following code:
let x = 1u8;
Here, x has a value of 1 with the type of u8. Now that we have covered floats and integers, we can use vectors and arrays to store them.
Storing data in vectors and arrays
Rust stores sequenced data in vectors and arrays. Arrays have a fixed length that is part of their type, so they don't have push functions (append for Python), whereas vectors can grow and shrink at runtime. Both only accommodate one data type. This can be managed using structs and traits, but this will be covered later on in this chapter. You can define and loop through arrays and vectors with fairly standard syntax:
let int_array: [i32; 3] = [1, 2, 3];
for i in int_array.iter() {
    println!("{}", i);
}
let str_vector: Vec<&str> = vec!["one", "two", "three"];
for i in str_vector.iter() {
    println!("{}", i);
}
let second_int_array: [i32; 3] = [1, 2, 3];
let two = second_int_array[1];
Let's try and append "four" to our str_vector:
str_vector.push("four");
Here, we get an error about how we cannot borrow str_vector as mutable. This is because, by default, variables defined in Rust are not mutable. This can be easily remedied by putting the mut keyword in front of the variable's name:
let mut str_vector: Vec<&str> = vec!["one", "two", "three"];
This also works for strings and numbers. While it might be tempting to define everything as a mut variable, this default immutability can help the compiler optimize, and it also improves safety. If you are not expecting a variable to change in a complex system, then not allowing it to mutate will throw up the error at compile time as opposed to allowing silent bugs to run in your system.
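Putting the pieces together, here is a small runnable sketch of a mutable vector that grows and a mutable array whose elements change while its length stays fixed. The build_numbers function is a made-up name for illustration:

```rust
// Hypothetical helper: builds a vector and then grows it. The mut keyword is
// required because push changes the vector.
fn build_numbers() -> Vec<&'static str> {
    let mut numbers: Vec<&str> = vec!["one", "two", "three"];
    numbers.push("four");
    numbers
}

fn main() {
    let numbers = build_numbers();
    println!("{:?} has {} items", numbers, numbers.len());

    // An array can be mutable too, but only its elements can change;
    // its length of 3 is part of its type.
    let mut int_array: [i32; 3] = [1, 2, 3];
    int_array[0] = 10;
    println!("{:?}", int_array); // [10, 2, 3]
}
```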
Mapping data with hash maps
In some languages, hash maps are referred to as dictionaries. In order to define a hash map in Rust, we must import the hash maps from the standard library. Once we've defined a new hash map, we can insert an entry, get it out of the hash map, and then print it:
use std::collections::HashMap;
fn main() {
    let mut general_map: HashMap<&str, i8> = HashMap::new();
    general_map.insert("test", 25);
    let outcome: i8 = general_map.get("test");
    println!("{}", outcome);
}
With this, we get the following error for defining the outcome variable:
expected `i8`, found enum `std::option::Option`
Here, we can see that the get method does not actually return an i8 type, despite us inserting an i8 type into the hash map. It's returning an Option enum instead. This is because the get method could fail: we could pass in a key that does not exist. Therefore, we have to unwrap the option to get the value we're aiming to get:
let outcome: Option<&i8> = general_map.get("test");
println!("here is the outcome {}", outcome.unwrap());
However, directly unwrapping the option causes the program to panic if the key is missing. Because Option is either Some or None, we can exploit Rust's match statement to handle the outcome:
match general_map.get("test") {
    None => println!("it failed"),
    Some(result) => println!("Here is the result: {}", result)
}
Here, if the result is None, then we print that it failed. If the result is Some, we access the result in the Option wrapper and print it. The arms of the match statement can have their own code blocks. For instance, we can nest a match statement within a match statement to perform another lookup if the original lookup fails. In the following code, we check to see if there's an entry under the "testing" key. If it's not there, we then check to see if there's an entry under the "test" key. If that fails too, we must give up:
match general_map.get("testing") {
    None => {
        match general_map.get("test") {
            None => println!("Both testing and test failed"),
            Some(result) => println!("testing failed but test is: {}", result)
        }
    },
    Some(result) => println!("Here is the result: {}", result)
}
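Nested match statements can become hard to read as the number of fallbacks grows. As an alternative sketch, and not the approach taken in the rest of this chapter, the same fallback lookup can be written with the or_else and copied methods of Option from the standard library. The lookup_with_fallback function is a made-up name for illustration:

```rust
use std::collections::HashMap;

// Hypothetical helper: tries the "testing" key first, then falls back to the
// "test" key. copied turns the Option<&i8> into an Option<i8>.
fn lookup_with_fallback(map: &HashMap<&str, i8>) -> Option<i8> {
    map.get("testing").or_else(|| map.get("test")).copied()
}

fn main() {
    let mut general_map: HashMap<&str, i8> = HashMap::new();
    general_map.insert("test", 25);

    match lookup_with_fallback(&general_map) {
        None => println!("Both testing and test failed"),
        Some(result) => println!("Found: {}", result), // Found: 25
    }
}
```

The trade-off is that this version no longer reports which of the two keys produced the value, so the nested match remains the better fit when that distinction matters.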
Calling the insert function again with the same key will merely update the value under that key. Calling the remove function from the hash map with the desired key will remove the entry if it exists. There are also functions for managing allocation, such as reserve and capacity, and the documentation flags any functions that are still experimental and will move to the stable build of Rust in time. Be sure to check the official Rust documentation for more functions for the hash map at https://doc.rust-lang.org/beta/std/collections/struct.HashMap.html.
Crates, tooling, and documentation will be covered in Chapter 2, Designing Your Web Application in Rust. Note that the hash map in this example can only accept i8 integers as values. We will cover how to enable different data types so that they can be stored with structs later in this chapter.
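The update and removal behavior described above can be sketched as follows. Both insert and remove return an Option holding the value that was previously stored under the key, if there was one. The build_map function and the "temporary" key are made-up names for illustration:

```rust
use std::collections::HashMap;

// Hypothetical helper: shows insert overwriting a value and remove deleting
// an entry, then returns the resulting map.
fn build_map() -> HashMap<&'static str, i8> {
    let mut general_map: HashMap<&str, i8> = HashMap::new();
    general_map.insert("test", 25);

    // Inserting under an existing key replaces the value and hands back the old one.
    let previous: Option<i8> = general_map.insert("test", 30);
    println!("previous value: {:?}", previous); // previous value: Some(25)

    general_map.insert("temporary", 1);
    // remove returns the removed value, or None if the key was absent.
    let removed: Option<i8> = general_map.remove("temporary");
    println!("removed: {:?}", removed); // removed: Some(1)

    general_map
}

fn main() {
    let general_map = build_map();
    println!("{:?}", general_map.get("test")); // Some(30)
    println!("{}", general_map.len()); // 1
}
```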
Handling results and errors
Like other languages, Rust raises and handles errors. It manages them through two different types: Option and Result. We saw Option in action in the hash map, where we had to unwrap the outcome of the get function to access the data in the hash map. Whereas Option is either None or Some, Result is either Err or Ok.
This is fairly similar; however, if we unwrap a Result that is an Err, the Rust program panics and crashes with what is in the Err. While there will be plenty of opportunities to throw errors, we will also want to throw our own when needed. When systems become more complex, it can be handy to purposefully throw errors if there is any undesired behavior. A good example is inserting data into a Redis cache.
Technically, there is nothing stopping us from inserting a range of keys into Redis. In order to prevent this, if the key is not an expected variant of what we want, we should throw an error. Let's demonstrate how to throw an error, depending on the data:
fn error_check(check: bool) -> Result<i8, &'static str> {
    if check {
        Err("this is an error")
    } else {
        Ok(1)
    }
}
fn main() {
    let result: i8 = error_check(false).unwrap();
    println!("{}", result);
}
Note that there is no return keyword. This is because the function returns the final expression in the function when there is no semicolon at the end of the expression. In our function, if we set the input to true, we get the following error:
thread 'main' panicked at 'called `Result::unwrap()` on an `Err` value: "this is an error"'
This Result wrapper gives us a lot of control of the outcome. Instead of wrapping code in try and except blocks, we can wait until we're ready to handle the error. We can build a simple error handling function with a match statement:
fn error_check(check: bool) -> Result<i8, &'static str> {
    if check {
        Err("this is an error")
    } else {
        Ok(1)
    }
}
fn describe_result(result: Result<i8, &'static str>) {
    match result {
        Ok(x) => println!("it's a result of: {}", x),
        Err(x) => println!("{}", x)
    }
}
fn main() {
    let result: Result<i8, &'static str> = error_check(true);
    describe_result(result);
}
In the wild, this comes in useful when we must roll back a database entry or clean up a process before throwing an error. We also have to note the typing for Result. In this result, the Ok variant holds an i8 integer (we can return other types), and the Err variant holds a reference to a string literal that has the 'static notation. This is the lifetime notation. We will cover lifetime notation in more detail later in this chapter, but for now, the 'static notation is telling the compiler that the error string will stay around for the entire runtime of the program.
This makes sense, as we would hate to lose the error message because we moved out of scope. Also, it's an error, so we should be ending the program soon. If we want to tolerate a missing outcome, we should be reaching for Option and handling None. We can also signpost a little more with the expect function as opposed to using unwrap. It still unwraps the result, but adds an extra message in the error trace:
let result: i8 = error_check(true).expect("this has been caught");
We can also directly throw errors with the panic! macro:
panic!("throwing some error");
We can also check for an error using is_err:
result.is_err()
This returns a bool value. As we can see, Rust supports a range of error handling. It is advised to keep these as simple as possible. For most processes in a simple web app, unwrapping straight away and throwing the error as soon as possible will manage most situations.
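To tie this back to the cache example mentioned earlier, the following is a minimal sketch of rejecting unexpected keys before they reach a cache. No real Redis client is used here; the validate_key function and the rule that keys must start with "user:" are assumptions made purely for illustration:

```rust
// Hypothetical rule: only keys that start with "user:" are accepted.
fn validate_key(key: &str) -> Result<&str, String> {
    if key.starts_with("user:") {
        Ok(key)
    } else {
        Err(format!("unexpected key: {}", key))
    }
}

fn main() {
    // is_err lets us check the outcome without unwrapping it.
    let good = validate_key("user:42");
    println!("{}", good.is_err()); // false

    // match lets us decide what to do with each outcome when we are ready.
    match validate_key("session:42") {
        Ok(key) => println!("would insert under {}", key),
        Err(message) => println!("rejected: {}", message), // rejected: unexpected key: session:42
    }
}
```

Here, the error type is an owned String rather than a &'static str so that the message can include the offending key, which is only known at runtime.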
Now that we can utilize basic data structures while navigating Rust's quirks, we have to address problems around controlling the ownership of these data structures.