Experimenting with user-defined functions
Optimus tries to provide the most commonly used functions out of the box so that you can focus on your work instead of writing code. Of course, there are times when you will need to write custom functions to accomplish a task.
Before we dive deep into user-defined functions (UDFs), let's explore two ways in which data can be processed: vectorized and non-vectorized execution. The difference is important to understand because it can have a significant impact on performance.
Vectorized execution refers to operations that are performed on multiple components of a vector at the same time, in one statement. A vector is just a list of elements like the following:
[0, 1, 2, 3, 4, 5]
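As a rough illustration, the following sketch applies a single vectorized statement to that list and compares the result with an element-by-element loop. It assumes NumPy as the array library, which the text does not name at this point, and the variable names are made up for the example.

```python
import numpy as np

# The example vector from the text, stored as a NumPy array (an assumption).
values = np.array([0, 1, 2, 3, 4, 5])

# Vectorized: one statement operates on every element of the array.
vectorized = values * 2

# Element by element: the same operation applied to one value at a time.
looped = [v * 2 for v in values.tolist()]

print(vectorized.tolist())  # [0, 2, 4, 6, 8, 10]
print(looped)               # [0, 2, 4, 6, 8, 10]
```

Both approaches produce the same values; the difference lies in how the work is carried out.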
In the case of non-vectorized operations, the function is executed on every element, one at a time. For the previous list, we would need to pass each element to the function individually to perform the operation. That's why using vectorized functions...