In this article, Paul Johnson, author of the book Learning Rust, takes a look at loops and conditions, which are a fundamental aspect of operation in any programming language. You may be looping around a list attempting to find when something matches and, when a match occurs, branching out to perform some other task; or you may just want to check a value to see if it meets a condition. In either case, Rust allows you to do this.
In this article, we will cover the following topics:
Types of loop available
Different types of branching within loops
Recursive methods
When the semi-colon (;) can be omitted and what it means
Loops
Rust has essentially three types of loop—for, loop, and while.
The for loop
This type of loop is simple to understand, yet rather powerful in operation. It is simple in that we have a start value, an end condition, and some form of value change. The power comes from those last two points.
Let's take a simple example to start with: a loop that goes from 0 up to (but not including) 10 and outputs each value:
for x in 0..10
{
println!("{},", x);
}
We create a variable x that takes each value produced by the expression 0..10 in turn and does something with it. In Rust terminology, 0..10 is a range, and a range is an iterator: it gives back one value at a time from a series of elements, and x is bound to each of those values.
This is obviously a very simple example. We can also count down, but the syntax is slightly different. In C, you would expect something akin to for (i = 10; i > 0; --i). Rust has no C-style for loop. Instead, we use the rev() method, as follows:
for x in (0..10).rev()
{
println!("{},", x);
}
It is worth noting that, as with the C family, the last number is excluded. So, for this second example, the values output are 9 to 0; essentially, the program generates the values from 0 to 9 and then outputs them in reverse. Notice also that the range is in parentheses. Without them, 0..10.rev() would try to call rev() on the number 10 rather than on the whole range.
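As a minimal sketch of how the end value and the value change can be varied, the following collects three ranges into vectors so they are easy to inspect. The function names are hypothetical and chosen only for illustration; rev(), the inclusive range syntax ..=, and step_by() are standard library features.

```rust
// Counts down from 9 to 0: 0..10 excludes 10, and rev() reverses it.
fn countdown() -> Vec<i32> {
    (0..10).rev().collect()
}

// 0..=10 is an inclusive range, so 10 is part of the output.
fn inclusive() -> Vec<i32> {
    (0..=10).collect()
}

// step_by(3) keeps every third value, starting with the first.
fn every_third() -> Vec<i32> {
    (0..10).step_by(3).collect()
}

fn main() {
    println!("{:?}", countdown());
    println!("{:?}", inclusive());
    println!("{:?}", every_third());
}
```

Collecting into a Vec is only done here to make the values visible; in a real loop, the range would normally be used directly in the for statement.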
In C#, this will be the equivalent of a foreach. In Rust, it will be as follows:
for var in condition
{
// do something
}
The C# equivalent for the preceding code is:
foreach(var t in condition)
// do something
Using enumerate
A loop condition can also be more complex using multiple conditions and variables. For example, the for loop can be tracked using enumerate. This will keep track of how many times the loop has executed, as shown here:
for (i, j) in (10..20).enumerate()
{
    println!("loop has executed {} times. j = {}", i, j);
}
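The following is the output:
loop has executed 0 times. j = 10
loop has executed 1 times. j = 11
loop has executed 2 times. j = 12
loop has executed 3 times. j = 13
loop has executed 4 times. j = 14
loop has executed 5 times. j = 15
loop has executed 6 times. j = 16
loop has executed 7 times. j = 17
loop has executed 8 times. j = 18
loop has executed 9 times. j = 19
The enumeration count is given in the first variable, with the value from the range in the second.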
This example is not of that much use, but where it comes into its own is when looping over an iterator.
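One case where the count from enumerate is useful is finding the position of the first element that matches a condition. The following is a minimal sketch; first_index_over is a hypothetical helper written for this illustration, not a standard library function.

```rust
// Return the position of the first element greater than `limit`, if any.
// `first_index_over` is a hypothetical helper used only for illustration.
fn first_index_over(values: &[i32], limit: i32) -> Option<usize> {
    for (index, value) in values.iter().enumerate() {
        // `value` is a reference into the slice, so it is dereferenced here.
        if *value > limit {
            return Some(index);
        }
    }
    None
}

fn main() {
    let my_array = [1, 3, 5, 7, 9, 11, 13];
    match first_index_over(&my_array, 6) {
        Some(index) => println!("first value over 6 is at index {}", index),
        None => println!("no value over 6"),
    }
}
```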
Say we have an array that we need to iterate over to obtain the values. Here, enumerate can be used to obtain the values of the array members. However, each value returned is a reference (&i32) rather than an i32. At the time this was written, code such as the following failed to compile, because line is a reference whereas an i32 was expected; more recent compilers accept it, since += is also implemented for references to the primitive numeric types:
fn main()
{
let my_array: [i32; 7] = [1i32,3,5,7,9,11,13];
let mut value = 0i32;
for(_, line) in my_array.iter().enumerate()
{
value += line;
}
println!("{}", value);
}
The reference can simply be dereferenced with * to get back the value, as follows:
for(_, line) in my_array.iter().enumerate()
{
value += *line;
}
The iter().enumerate() method can equally be used with the Vec type, as shown in the following code:
fn main()
{
let my_array = vec![1i32,3,5,7,9,11,13];
let mut value = 0i32;
for(_,line) in my_array.iter().enumerate()
{
value += *line;
}
println!("{}", value);
}
In both cases, the value printed at the end will be 49.
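When the index is not needed, the same sum can be written without enumerate. The following sketch shows two alternatives under that assumption: a loop that copies each value out of its reference in the pattern, and the standard iterator method sum(). The function names are hypothetical.

```rust
fn sum_with_loop(values: &[i32]) -> i32 {
    let mut total = 0;
    // Iterating over a slice yields references; `&value` in the pattern
    // copies the i32 out, so no explicit * is needed in the body.
    for &value in values {
        total += value;
    }
    total
}

fn sum_with_iterator(values: &[i32]) -> i32 {
    // sum() adds up every element produced by the iterator.
    values.iter().sum()
}

fn main() {
    let my_array = [1, 3, 5, 7, 9, 11, 13];
    let my_vec = vec![1, 3, 5, 7, 9, 11, 13];
    println!("{}", sum_with_loop(&my_array));
    println!("{}", sum_with_iterator(&my_vec));
}
```

Both functions take a slice, so they accept an array or a Vec equally.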
The _ parameter
You may be wondering what the _ parameter is. It tells Rust that there is a value in that position, but that we'll never do anything with it, so it is only there to ensure that the code compiles. It's a throw-away. The _ parameter cannot be referred to either: we can do something with linenumber in for(linenumber, line), but we can't do anything with _ in for(_, line).
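The same throw-away works anywhere a value is being split into parts, not only in a for loop. A small sketch, where second_of is a hypothetical helper used only for illustration:

```rust
// `second_of` is a hypothetical helper used only for illustration.
fn second_of(pair: (u32, &str)) -> String {
    // The first element is matched but never bound to a name.
    let (_, name) = pair;
    name.to_string()
}

fn main() {
    let names = ["for", "loop", "while"];
    // The index produced by enumerate() is ignored here.
    for (_, name) in names.iter().enumerate() {
        println!("{}", name);
    }
    println!("{}", second_of((7, "seven")));
}
```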
The simple loop
A simple form of the loop is called loop:
loop
{
println!("Hello");
}
The preceding code will either output Hello until the application is terminated or the loop reaches a terminating statement.
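As a sketch of a loop that does reach a terminating statement, the following uses break to leave the loop and to hand a value back from it. The function name doublings_to_reach is hypothetical and exists only for this illustration.

```rust
// Count how many doublings it takes for 1 to reach at least `target`.
// `doublings_to_reach` is a hypothetical helper used only for illustration.
fn doublings_to_reach(target: u32) -> u32 {
    // u64 is used for the running value so that doubling cannot overflow
    // for any u32 target.
    let mut value = 1u64;
    let mut steps = 0u32;
    // `break` can carry a value out of `loop`; that value becomes the
    // value of the whole loop expression, and here the function result.
    loop {
        if value >= u64::from(target) {
            break steps;
        }
        value *= 2;
        steps += 1;
    }
}

fn main() {
    println!("{}", doublings_to_reach(100));
}
```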
While…
The while loop is of slightly more use, as it runs for as long as its condition holds. Unlike in C, the condition does not need parentheses, as you will see in the following code snippet:
while condition
{
    // do something
}
Let's take a look at the following example:
fn main() {
let mut done = 0u32;
while done != 32
{
println!("done = {}", done);
done+=1;
}
}
The preceding code will output done = 0 to done = 31. The loop terminates when done equals 32.
Prematurely terminating a loop
Depending on the size of the data being iterated over within a loop, the loop can be costly on processor time. For example, say a server is receiving data from a data-logging application, such as one measuring values from a gas chromatograph. Over the entire scan, it may record roughly half a million data points, each with an associated time position.
For our purposes, we want to add up all of the recorded values until a value is over 1.5; once that is reached, we can stop the loop.
Sound easy? There is one thing not mentioned: there is no guarantee that a recorded value will ever be over 1.5, so how can we terminate the loop whether or not the value is reached?
We can do this in one of two ways. The first is to use a while loop and introduce a Boolean to act as the test condition. In the following example, my_array represents a very small subsection of the data sent to the server.
fn main()
{
    let my_array = vec![0.6f32, 0.4, 0.2, 0.8, 1.3, 1.1, 1.7, 1.9];
    let mut counter: usize = 0;
    let mut result = 0f32;
    let mut test = false;
    // Also stop at the end of the data, in case no value is over 1.5
    while !test && counter < my_array.len()
    {
        if my_array[counter] > 1.5
        {
            test = true;
        }
        else
        {
            result += my_array[counter];
            counter += 1;
        }
    }
    println!("{}", result);
}
The result here is 4.4. This code is perfectly acceptable, if slightly long-winded. Rust also allows the use of the break and continue keywords (if you're familiar with C, they work in the same way).
Our code using break will be as follows:
fn main()
{
let my_array = vec![0.6f32, 0.4, 0.2, 0.8, 1.3, 1.1, 1.7, 1.9];
let mut result = 0f32;
for(_, value) in my_array.iter().enumerate()
{
if *value > 1.5
{
break;
}
else
{
result += *value;
}
}
println!("{}", result);
}
Again, this will give an answer of 4.4, indicating that the two methods used are the equivalent of each other.
If we replace break with continue in the preceding code example, we will get the same result (4.4), but only because every value after the first one over 1.5 is also over 1.5. The difference between break and continue is that continue jumps to the next value in the iteration rather than jumping out of the loop, so if the final value of my_array were 1.3, the output with continue would be 5.7, whereas with break it would still be 4.4.
When using break and continue, always keep in mind this difference. While it may not crash the code, mistaking break and continue may lead to results that you may not expect or want.
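The difference can be checked with a short sketch. The two function names are hypothetical, and the data is the array used above with a final value of 1.3 appended.

```rust
// Stop at the first value above `limit`.
fn sum_with_break(values: &[f32], limit: f32) -> f32 {
    let mut result = 0f32;
    for value in values {
        if *value > limit {
            break;
        }
        result += *value;
    }
    result
}

// Skip values above `limit`, but keep going through the rest of the data.
fn sum_with_continue(values: &[f32], limit: f32) -> f32 {
    let mut result = 0f32;
    for value in values {
        if *value > limit {
            continue;
        }
        result += *value;
    }
    result
}

fn main() {
    let my_array = [0.6f32, 0.4, 0.2, 0.8, 1.3, 1.1, 1.7, 1.9, 1.3];
    println!("break: {}", sum_with_break(&my_array, 1.5));
    println!("continue: {}", sum_with_continue(&my_array, 1.5));
}
```

With break, the trailing 1.3 is never seen; with continue, it is added to the total.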
Using loop labels
Rust allows us to label our loops. This can be very useful (for example with nested loops). These labels act as symbolic names to the loop and as we have a name to the loop, we can instruct the application to perform a task on that name.
Consider the following simple example:
fn main()
{
    'outer_loop: for x in 0..10
    {
        'inner_loop: for y in 0..10
        {
            if x % 2 == 0 { continue 'outer_loop; }
            if y % 2 == 0 { continue 'inner_loop; }
            println!("x: {}, y: {}", x, y);
        }
    }
}
What will this code do?
Here, x % 2 == 0 (or y % 2 == 0) means that if the variable divided by two leaves no remainder, the condition is met and the code in the braces is executed. When x % 2 == 0, that is, when x is an even number, we tell the application to skip to the next iteration of outer_loop, where x is odd. However, we also have an inner loop. Again, when y is an even value, we tell the application to skip to the next iteration of inner_loop.
In this case, the application will output a line for every combination of odd x and odd y, 25 lines in all, from x: 1, y: 1 through to x: 9, y: 9.
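Labels work with break as well as continue. The following is a minimal sketch of leaving two nested loops at once; first_pair_over is a hypothetical helper written for this illustration.

```rust
// Find the first (x, y) pair, scanning x then y, whose product exceeds `limit`.
// `first_pair_over` is a hypothetical helper used only for illustration.
fn first_pair_over(limit: u32) -> Option<(u32, u32)> {
    let mut found = None;
    'outer_loop: for x in 0..10 {
        for y in 0..10 {
            if x * y > limit {
                found = Some((x, y));
                // Leaves both loops at once; a plain `break` would only
                // leave the inner loop.
                break 'outer_loop;
            }
        }
    }
    found
}

fn main() {
    println!("{:?}", first_pair_over(20));
    println!("{:?}", first_pair_over(100));
}
```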
While this example may seem very simple, it does allow for a great deal of speed when checking data. Let's go back to our previous example of data being sent to the web service. Recall that we have two values: the recorded data and some other value which, for ease, will be a time point. Each data point is recorded 0.2 seconds apart; therefore, every fifth data point falls on a whole second.
This time, we want all of the values where the data is greater than 1.5, along with the associated time of that data point, but only when the time is dead on a second. As we want the code to be understandable and human readable, we can use a loop label on each loop.
The following code compiles, but it is not quite correct. Can you spot why?
fn main()
{
let my_array = vec![0.6f32, 0.4, 0.2, 0.8, 1.3, 1.1, 1.7, 1.9, 1.3, 0.1, 1.6, 0.6, 0.9, 1.1, 1.31, 1.49, 1.5, 0.7];
let my_time = vec![0.2f32, 0.4, 0.6, 0.8, 1.0, 1.2, 1.4, 1.6, 1.8, 2.0, 2.2, 2.4, 2.6, 2.8, 3.0, 3.2, 3.4, 3.6, 3.8];
'time_loop: for(_, time_value) in my_time.iter().enumerate()
{
'data_loop: for(_, value) in my_array.iter().enumerate()
{
if *value < 1.5
{
continue 'data_loop;
}
if *time_value % 5f32 == 0f32
{
continue 'time_loop;
}
println!("Data point = {} at time {}s", *value, *time_value);
}
}
}
This example is a good one to demonstrate the importance of using the correct operator on the correct type. The issue is the if *time_value % 5f32 == 0f32 line. We are taking a float value and using the modulus of another float to see if we end up with exactly 0 as a float.
Testing floating-point values for exact equality is rarely a good plan, especially if the value is returned by some form of calculation; integer, string, and bool comparisons do not have this problem. We also cannot simply use continue on the time loop, so how can we solve this problem?
If you recall, we're using _ instead of a named variable for the enumeration of the loop. These values are always integers; therefore, if we replace _ with a variable name, we can use % 5 to perform the calculation, and the code becomes:
'time_loop: for(time_enum, time_value) in my_time.iter().enumerate()
{
'data_loop: for(_, value) in my_array.iter().enumerate()
{
if *value < 1.5
{
continue 'data_loop;
}
if time_enum % 5 == 0
{
continue 'time_loop;
}
println!("Data point = {} at time {}s", *value, *time_value);
}
}
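As an aside on why exact float comparison is fragile, the following sketch compares a calculated value with == and with a tolerance. The helper approx_eq and the tolerance of 1e-9 are assumptions made for this illustration; a suitable tolerance depends on the data.

```rust
// Compare two floats within a tolerance rather than with ==.
// `approx_eq` is a hypothetical helper; the tolerance is chosen by the caller.
fn approx_eq(a: f64, b: f64, tolerance: f64) -> bool {
    (a - b).abs() <= tolerance
}

fn main() {
    // 0.1 and 0.2 cannot be represented exactly in binary floating point,
    // so their f64 sum is not exactly equal to the f64 value 0.3.
    let sum = 0.1_f64 + 0.2_f64;
    println!("exact comparison: {}", sum == 0.3);
    println!("approximate comparison: {}", approx_eq(sum, 0.3, 1e-9));
}
```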
The next problem is that the output isn't correct. The code gives output that begins as follows:
Data point = 1.7 at time 0.4s
Data point = 1.9 at time 0.4s
Data point = 1.6 at time 0.4s
Data point = 1.5 at time 0.4s
Data point = 1.7 at time 0.6s
Data point = 1.9 at time 0.6s
Data point = 1.6 at time 0.6s
Data point = 1.5 at time 0.6s
The data point is correct, but the time is way out and continually repeats. We still need the continue statement for the data point step, but the time step is incorrect. There are a couple of solutions, but possibly the simplest will be to store the data and the time into a new vector and then display that data at the end.
The following code gets closer to what is required:
fn main()
{
let my_array = vec![0.6f32, 0.4, 0.2, 0.8, 1.3, 1.1, 1.7, 1.9, 1.3, 0.1, 1.6, 0.6, 0.9, 1.1, 1.31, 1.49, 1.5, 0.7];
let my_time = vec![0.2f32, 0.4, 0.6, 0.8, 1.0, 1.2, 1.4, 1.6, 1.8, 2.0, 2.2, 2.4, 2.6, 2.8, 3.0, 3.2, 3.4, 3.6, 3.8];
let mut my_new_array = vec![];
let mut my_new_time = vec![];
'time_loop: for(t, _) in my_time.iter().enumerate()
{
'data_loop: for(v, value) in my_array.iter().enumerate()
{
if *value < 1.5
{
continue 'data_loop;
}
else
{
if t % 5 != 0
{
my_new_array.push(*value);
my_new_time.push(my_time[v]);
}
}
if v == my_array.len()
{
break;
}
}
}
for(m, my_data) in my_new_array.iter().enumerate()
{
println!("Data = {} at time {}", *my_data, my_new_time[m]);
}
}
We will now get output that begins as follows:
Data = 1.7 at time 1.4
Data = 1.9 at time 1.6
Data = 1.6 at time 2.2
Data = 1.5 at time 3.4
Data = 1.7 at time 1.4
Yes, we now have the correct data, but the time starts again. We're close, but it's not right yet. We aren't continuing the time_loop loop, and we will also need to introduce a break statement. To trigger the break, we will create a new variable called done. When v, the enumerator for my_array, reaches the last index of the vector (the number of elements in the vector minus one), we will change done from false to true. This is then tested outside of data_loop: if done is true, we break out of the loop.
The final version of the code is as follows:
fn main()
{
let my_array = vec![0.6f32, 0.4, 0.2, 0.8, 1.3, 1.1, 1.7, 1.9, 1.3, 0.1, 1.6, 0.6, 0.9, 1.1, 1.31, 1.49, 1.5, 0.7];
let my_time = vec![0.2f32, 0.4, 0.6, 0.8, 1.0, 1.2, 1.4, 1.6, 1.8, 2.0, 2.2, 2.4, 2.6, 2.8, 3.0, 3.2, 3.4, 3.6];
let mut my_new_array = vec![];
let mut my_new_time = vec![];
let mut done = false;
'time_loop: for(t, _) in my_time.iter().enumerate()
{
'data_loop: for(v, value) in my_array.iter().enumerate()
{
if v == my_array.len() - 1
{
done = true;
}
if *value < 1.5
{
continue 'data_loop;
}
else
{
if t % 5 != 0
{
my_new_array.push(*value);
my_new_time.push(my_time[v]);
}
else
{
continue 'time_loop;
}
}
}
if done {break;}
}
for(m, my_data) in my_new_array.iter().enumerate()
{
println!("Data = {} at time {}", *my_data, my_new_time[m]);
}
}
Our final output from the code is this:
Data = 1.7 at time 1.4
Data = 1.9 at time 1.6
Data = 1.6 at time 2.2
Data = 1.5 at time 3.4
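For comparison, the same pairs of data and time can be produced without nested loops by walking the two collections side by side. This is a different technique from the labelled loops above, shown only as a sketch: it uses the standard iterator methods zip, filter, and map, and the function name readings_at_or_above is hypothetical.

```rust
// Pair each reading with its time and keep those at or above `threshold`.
// `readings_at_or_above` is a hypothetical helper used only for illustration.
fn readings_at_or_above(data: &[f32], times: &[f32], threshold: f32) -> Vec<(f32, f32)> {
    data.iter()
        // zip() yields (value, time) pairs and stops at the shorter input.
        .zip(times.iter())
        .filter(|(value, _)| **value >= threshold)
        .map(|(value, time)| (*value, *time))
        .collect()
}

fn main() {
    let my_array = vec![0.6f32, 0.4, 0.2, 0.8, 1.3, 1.1, 1.7, 1.9, 1.3, 0.1, 1.6, 0.6, 0.9, 1.1, 1.31, 1.49, 1.5, 0.7];
    let my_time = vec![0.2f32, 0.4, 0.6, 0.8, 1.0, 1.2, 1.4, 1.6, 1.8, 2.0, 2.2, 2.4, 2.6, 2.8, 3.0, 3.2, 3.4, 3.6];
    for (value, time) in readings_at_or_above(&my_array, &my_time, 1.5) {
        println!("Data = {} at time {}", value, time);
    }
}
```

The threshold test is >= here to match the if *value < 1.5 check in the code above, which lets a value of exactly 1.5 through.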
Recursive functions
The final form of loop to consider is known as a recursive function. This is a function that calls itself until a condition is met. In pseudocode, the function looks like this:
i32 my_function(i32 a)
{
    // do something with a
    if (a != 32)
    {
        return my_function(a);
    }
    else
    {
        return a;
    }
}
An actual implementation of a recursive function would look like this:
fn recurse(n:i32)
{
let v = match n % 2
{
0 => n / 2,
_ => 3 * n + 1
};
println!("{}", v);
if v != 1
{
recurse(v)
}
}
fn main()
{
recurse(25)
}
The idea of a recursive function is very simple, but we need to consider two parts of this code. The first is the let line in the recurse function and what it means:
let v = match n % 2
{
0 => n / 2,
_ => 3 * n + 1
};
Another way of writing this is as follows:
let mut v = 0i32;
if n % 2 == 0
{
v = n / 2;
}
else
{
v = 3 * n + 1;
}
In C#, this will equate to the following:
var v = n % 2 == 0 ? n / 2 : 3 * n + 1;
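The closest Rust form to the C# line is an if used as an expression, since in Rust if produces a value. The sketch below writes the same step both ways; the function names are hypothetical.

```rust
// One step of the sequence used by recurse(), written with `if`.
fn next_with_if(n: i32) -> i32 {
    // `if` is an expression, so it can be the function's result directly
    // or sit on the right-hand side of a `let`.
    if n % 2 == 0 { n / 2 } else { 3 * n + 1 }
}

// The same step written with `match`, as in recurse().
fn next_with_match(n: i32) -> i32 {
    match n % 2 {
        0 => n / 2,
        _ => 3 * n + 1,
    }
}

fn main() {
    let v = if 25 % 2 == 0 { 25 / 2 } else { 3 * 25 + 1 };
    println!("{} {} {}", v, next_with_if(25), next_with_match(25));
}
```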
The second part is that the semicolon is not being used everywhere. Consider the following example:
fn main()
{
recurse(25)
}
What is the difference between having and not having a semicolon?
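Rust is an expression-based language, and a block in braces is itself an expression. The semicolon turns an expression into a statement and discards its value; if the last line of a block has no semicolon, its value becomes the value of the block. Let's see what that means. Consider the following code as an example: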
fn main()
{
let x = 5u32;
let y =
{
let x_squared = x * x;
let x_cube = x_squared * x;
x_cube + x_squared + x
};
let z =
{
2 * x;
};
println!("x is {:?}", x);
println!("y is {:?}", y);
println!("z is {:?}", z);
}
We have two different uses of the semicolon. If we look at the let y line first:
let y =
{
let x_squared = x * x;
let x_cube = x_squared * x;
x_cube + x_squared + x // no semi-colon
};
This code does the following:
The code within the braces is processed.
The final line, without the semicolon, is assigned to y.
Essentially, the block behaves like an inline function that returns the value of the line without the semicolon into the variable.
The second line to consider is for z:
let z =
{
2 * x;
};
Again, the code within the braces is evaluated. In this case, the line ends with a semicolon, so the result is discarded and the empty value () is assigned to z.
When it is executed, we will get the following results:
x is 5
y is 155
z is ()
In the earlier code example, the line within fn main calling recurse gives the same result with or without the semicolon, because recurse returns nothing (that is, ()), which is also what main returns.
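The semicolon does matter once a recursive function returns a value. The following sketch is a variation on recurse that counts the steps instead of printing them; steps_to_one is a hypothetical name, and the function assumes n is at least 1, since an input of 0 would never reach 1.

```rust
// Count the steps the sequence in recurse() takes to get from n to 1.
// `steps_to_one` is a hypothetical helper; it assumes n >= 1.
fn steps_to_one(n: u64) -> u32 {
    if n == 1 {
        return 0;
    }
    let next = if n % 2 == 0 { n / 2 } else { 3 * n + 1 };
    // No semicolon: this expression is the function's return value.
    // Adding one would leave the function returning () and fail to compile.
    1 + steps_to_one(next)
}

fn main() {
    println!("{}", steps_to_one(25));
}
```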
Summary
In this article, we've covered the different types of loops that are available within Rust, as well as gained an understanding of when to use a semicolon and what it means to omit it. We have also considered enumeration and iteration over a vector and an array, and how to handle the data held within them.