The parallel scalar product example is typical of many other tasks in how results are handled: the data coming from all processors is reduced to a single number in the last step. Here, the root processor sums the partial results from all processors. The method reduce handles this task efficiently, since it gathers the partial results and sums them in one step. We modify the last lines of the preceding code accordingly:
# ... modification of the script above ...
# Each processor reports its result back to the root
# and these results are summed up
total_dot = comm.reduce(partial_dot, op=MPI.SUM, root=0)
if rank == 0:
    print(f'The parallel scalar product of u and v'
          f' on {nprocessors} processors is {total_dot}.\n'
          f'The difference to the serial computation'
          f' is {abs(total_dot - u@v)}')
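To see what the reduction with op=MPI.SUM computes, the following is a minimal serial sketch that emulates it without MPI. It is not the mpi4py implementation: the helper names split_indices and emulated_reduce_dot are hypothetical, the vectors are small integer lists chosen for illustration, and the "processors" are simply chunks handled one after another in a single process.

```python
from functools import reduce
from operator import add


def split_indices(n, nparts):
    """Partition range(n) into nparts contiguous (start, stop) chunks."""
    base, extra = divmod(n, nparts)
    bounds = []
    start = 0
    for p in range(nparts):
        # The first `extra` chunks receive one additional element
        stop = start + base + (1 if p < extra else 0)
        bounds.append((start, stop))
        start = stop
    return bounds


def emulated_reduce_dot(u, v, nprocessors):
    """Serial stand-in for comm.reduce(partial_dot, op=MPI.SUM, root=0)."""
    # Each emulated rank computes the scalar product of its own chunk
    partial_dots = [
        sum(a * b for a, b in zip(u[lo:hi], v[lo:hi]))
        for lo, hi in split_indices(len(u), nprocessors)
    ]
    # The partial results are combined by addition, as MPI.SUM would do
    return reduce(add, partial_dots, 0)


# Illustrative data (an assumption, not taken from the script above)
u = list(range(1, 11))
v = list(range(10, 0, -1))

serial = sum(a * b for a, b in zip(u, v))
total_dot = emulated_reduce_dot(u, v, nprocessors=4)
print(f'Emulated reduction: {total_dot}, serial result: {serial}')
```

In the MPI version, each chunk lives on a different process, and only the process with rank 0 receives the sum; on all other ranks reduce returns None.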
Other frequently applied...