In this chapter, we covered the performance characteristics in Julia of the array, the most important data structure in scientific computing. We discussed why Julia's design enables extremely fast array operations and how to get the best performance from code that operates on arrays. This brings us to the end of our journey of creating the fastest possible code in Julia. Using all the tips discussed so far, the performance of your code should approach that of well-written C.
Sometimes, however, this isn't enough, and we want higher performance: our data may be larger or our computations more intensive. In that case, the only remaining option is to parallelize our processing across multiple CPUs and systems. In the next chapter, we will take a brief look at the features that Julia provides to write parallel systems easily.