An alternative to using the standard parallel algorithms would be to leverage OpenMP's pragmas. They're an easy way to parallelize many types of computations by just adding a few lines of code. And if you want to distribute your code across a cluster, you might want to see what MPI can do for you. The two can also be combined: MPI distributes work across the nodes of a cluster, while OpenMP parallelizes the work within each node.
With OpenMP, you can use various pragmas to easily parallelize code. For instance, you can write #pragma omp parallel for before a for loop to have its iterations executed by parallel threads. The framework can do much more, such as offloading computations to GPUs and other accelerators.
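As a minimal sketch of that pattern, consider the following loop (the scale function and its names are illustrative, not from any particular library); the pragma tells the compiler to split the independent iterations across a team of threads:

```cpp
#include <cstddef>
#include <vector>

// Compile with an OpenMP-enabled compiler, e.g. g++ -fopenmp.
std::vector<double> scale(const std::vector<double> &input, double factor) {
  std::vector<double> output(input.size());
  // Each iteration is independent, so OpenMP can safely divide
  // the loop's iterations among parallel threads.
  #pragma omp parallel for
  for (std::ptrdiff_t i = 0;
       i < static_cast<std::ptrdiff_t>(input.size()); ++i) {
    output[i] = input[i] * factor;
  }
  return output;
}
```

Note that the loop body must be free of data races for this to be correct; here, each iteration writes to a distinct element of output.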
Integrating MPI into your project is harder than just adding an appropriate pragma. Here, you'll need to use the MPI API in your code base to send or receive data between processes (using calls such as MPI_Send and MPI_Recv), or perform collective operations such as broadcasts, gathers, and reductions (calling MPI_Bcast, MPI_Gather, and MPI_Reduce, among others).
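As a rough sketch of what this looks like in practice, the following toy program (all values and the computed quantity are illustrative) broadcasts a value from the root process and then reduces per-rank partial results back onto it; compile with an MPI wrapper such as mpic++ and launch with, for example, mpirun -np 4:

```cpp
#include <mpi.h>
#include <cstdio>

int main(int argc, char **argv) {
  MPI_Init(&argc, &argv);

  int rank = 0;
  int size = 0;
  MPI_Comm_rank(MPI_COMM_WORLD, &rank);  // this process's ID
  MPI_Comm_size(MPI_COMM_WORLD, &size);  // total number of processes

  // The root process owns the initial value; MPI_Bcast then
  // distributes it to every other rank.
  int base = (rank == 0) ? 42 : 0;
  MPI_Bcast(&base, 1, MPI_INT, 0, MPI_COMM_WORLD);

  // Each rank computes its own share of the work.
  int partial = base + rank;

  // MPI_Reduce combines the partial results (here, by summing)
  // into a single value on the root process.
  int total = 0;
  MPI_Reduce(&partial, &total, 1, MPI_INT, MPI_SUM, 0, MPI_COMM_WORLD);

  if (rank == 0) {
    std::printf("total across %d ranks = %d\n", size, total);
  }

  MPI_Finalize();
  return 0;
}
```

Unlike the OpenMP example, each rank here is a separate process with its own address space, which is what allows the same program to scale out across the machines of a cluster.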