You're reading from Learning Concurrency in Python: Build highly efficient, robust, and concurrent applications

Product type Paperback

Published in Aug 2017

Publisher Packt

ISBN-13 9781787285378

Length 360 pages

Edition 1st Edition

Languages

Python

Concepts

Concurrency

Author (1):

Elliot Forbes

Improving number crunching with multiprocessing

We've seen how we can speed up I/O-bound work such as downloading images, but how do we improve the performance of our number crunching? This is where multiprocessing shines, provided it is used correctly.

In this example, we'll try to find the prime factors of 10,000 random numbers that fall between 20,000 and 100,000,000. We are not necessarily fussed about the order of execution so long as the work gets done, and we aren't sharing memory between any of our processes.

Sequential prime factorization

Again, we'll write a script that does this in a sequential manner, which we can easily verify is working correctly:

import time
import random
def calculatePrimeFactors(n):
  primfac = []
  d = 2
  while d*d <= n:
    while (n % d) == 0:
      primfac.append(d)  # repeated factors are kept, e.g. 8 gives [2, 2, 2]
      n //= d
    d += 1
  if n > 1:
    primfac.append(n)
  return primfac
def main():
  print("Starting number crunching")
  t0 = time.time()
  
  for i in range(10000):
    rand = random.randint(20000, 100000000)
    print(calculatePrimeFactors(rand))
  
  t1 = time.time()
  totalTime = t1 - t0
  print("Execution Time: {}".format(totalTime))
if __name__ == '__main__':
  main()

Breaking it down

The first two lines make up our required imports: we need both the time and the random modules. After the imports, we define the calculatePrimeFactors function, which takes an input n. It finds the prime factors of n by trial division: each candidate divisor d is divided out for as long as it divides n, and the search stops once d*d exceeds what remains of n. If the remainder is greater than 1, it is itself prime and is added as the final factor. The factors are appended to a list, which is returned once the function completes execution.
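
To make the behaviour of the factorization loop concrete, here is a minimal sketch that runs the same function on a few small inputs and checks that multiplying the factors back together reproduces the input. The product helper and the sample inputs are assumptions added for illustration; they are not part of the original listing.

```python
def calculatePrimeFactors(n):
  # Trial division: divide out each candidate d for as long as it divides n.
  primfac = []
  d = 2
  while d * d <= n:
    while n % d == 0:
      primfac.append(d)
      n //= d
    d += 1
  # Whatever remains above 1 is itself a prime factor.
  if n > 1:
    primfac.append(n)
  return primfac


def product(factors):
  # Hypothetical helper (not in the original listing):
  # multiply the factors back together.
  result = 1
  for f in factors:
    result *= f
  return result


if __name__ == '__main__':
  for n in (360, 97, 20000):
    factors = calculatePrimeFactors(n)
    # The product of the prime factors should reproduce the input.
    assert product(factors) == n
    print(n, factors)
```

For 360 the function returns [2, 2, 2, 3, 3, 5], and for a prime such as 97 it returns [97], because no divisor up to the square root divides it.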

After this, we define the main function, which records the start time and then loops 10,000 times, generating a random number on each iteration with random's randint. We pass each generated number to the calculatePrimeFactors function and print out the result. Finally, once the for loop has finished, we record the end time and print out the elapsed time.

If you execute this on your computer, you should see the list of prime factors being printed out for 10,000 different random numbers, as well as the total execution time for this code. For me, it took roughly 3.6 seconds to execute on my MacBook.
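
Note that the timed loop above also includes generating the random numbers and printing 10,000 lists to the console. As a rough sketch, assuming we only want to time the factorization itself, we could generate the numbers up front and collect the results instead of printing them. The timeFactorization function and its seed parameter are hypothetical names introduced here, and time.perf_counter is swapped in for time.time because it is intended for measuring short durations.

```python
import random
import time


def calculatePrimeFactors(n):
  primfac = []
  d = 2
  while d * d <= n:
    while n % d == 0:
      primfac.append(d)
      n //= d
    d += 1
  if n > 1:
    primfac.append(n)
  return primfac


def timeFactorization(count, seed=None):
  # Hypothetical helper: factorize `count` random numbers and
  # return (results, elapsed seconds) for the factorization only.
  rng = random.Random(seed)  # a fixed seed makes runs repeatable
  numbers = [rng.randint(20000, 100000000) for _ in range(count)]
  t0 = time.perf_counter()
  results = [calculatePrimeFactors(n) for n in numbers]
  elapsed = time.perf_counter() - t0
  return results, elapsed


if __name__ == '__main__':
  # 1,000 numbers keeps this demo quick; raise it to 10,000 to match the text.
  results, elapsed = timeFactorization(1000, seed=42)
  print("Factorized {} numbers in {:.3f} seconds".format(len(results), elapsed))
```

Passing the same seed to two runs factorizes the same numbers each time, which makes before-and-after comparisons a little fairer.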

Concurrent prime factorization

So now let us have a look at how we can improve the performance of this program by utilizing multiple processes.

In order for us to split this workload up, we'll define an executeProc function, which generates and factorizes 1,000 random numbers rather than 10,000. We'll create 10 processes that each run this function once, so the total number of calculations is exactly the same as when we performed the sequential test:

import time
import random
from multiprocessing import Process
# This does all of our prime factorization on a given number 'n'
def calculatePrimeFactors(n):
  primfac = []
  d = 2
  while d*d <= n:
    while (n % d) == 0:
      primfac.append(d)  # repeated factors are kept, e.g. 8 gives [2, 2, 2]
      n //= d
    d += 1
  if n > 1:
    primfac.append(n)
  return primfac
# We split our workload from one batch of 10,000 calculations
# into 10 batches of 1,000 calculations
def executeProc():
  for i in range(1000):
    rand = random.randint(20000, 100000000)
    print(calculatePrimeFactors(rand))
def main():
  print("Starting number crunching")
  t0 = time.time() 
  procs = []
  # Here we create our processes and kick them off
  for i in range(10):
    proc = Process(target=executeProc, args=())
    procs.append(proc)
    proc.start()
  # Again we use the .join() method in order to wait for 
  # execution to finish for all of our processes
  for proc in procs:
    proc.join()
  t1 = time.time()
  totalTime = t1 - t0
  # we print out the total execution time for our 10
  # procs.
  print("Execution Time: {}".format(totalTime))
if __name__ == '__main__':
  main()

Breaking it down

This last listing performs exactly the same job as our original sequential script. The first change is on line three, where we import the Process class from the multiprocessing module. The calculatePrimeFactors function that follows has not been touched.

You should then see that we pulled out the for loop that initially ran for 10,000 iterations. We have placed it in a function called executeProc, and we have also reduced the for loop's range to 1,000.

Within the main function, we then create an empty list called procs. We then create 10 different processes, setting the target of each to the executeProc function and passing in no args. We append each newly created process to our procs list, and then we start it by calling proc.start().
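
Because the listing passes no arguments, the batch size of 1,000 is hard-coded inside executeProc. As a hedged sketch of one alternative, the args tuple of Process can carry the batch size to each process. The splitWorkload helper and the count parameter are assumptions introduced for illustration, not part of the original code.

```python
import random
from multiprocessing import Process


def calculatePrimeFactors(n):
  primfac = []
  d = 2
  while d * d <= n:
    while n % d == 0:
      primfac.append(d)
      n //= d
    d += 1
  if n > 1:
    primfac.append(n)
  return primfac


def splitWorkload(total, workers):
  # Hypothetical helper: divide `total` items into `workers` batch
  # sizes that differ by at most one and always sum to `total`.
  base, extra = divmod(total, workers)
  return [base + 1 if i < extra else base for i in range(workers)]


def executeProc(count):
  # Same body as the original executeProc, but the batch size
  # now arrives as an argument instead of being fixed at 1,000.
  for _ in range(count):
    rand = random.randint(20000, 100000000)
    print(calculatePrimeFactors(rand))


def main():
  procs = []
  # Small numbers keep this demo quick; the listing above
  # corresponds to splitWorkload(10000, 10).
  for count in splitWorkload(100, 4):
    # args must be a tuple, hence the trailing comma.
    proc = Process(target=executeProc, args=(count,))
    procs.append(proc)
    proc.start()
  for proc in procs:
    proc.join()


if __name__ == '__main__':
  main()
```

With this shape, changing the number of processes or the total workload means changing only the two numbers passed to splitWorkload.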

After we've created 10 individual processes, we then cycle through these processes, which are now in our procs list, and join them. This ensures that every process has finished its calculations before we proceed to calculate the total execution time.

If you execute this now, you should see the 10,000 outputs print out in your console, and you should also see a far lower execution time than for the sequential version. For reference, the sequential program executed in 3.9 seconds on my computer compared to 1.9 seconds when running the multiprocessing version. Timings vary from run to run, so expect small differences such as the 3.6 seconds quoted earlier for the same sequential script.
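
The two timings can be turned into a speedup figure with a little arithmetic. The sketch below uses the 3.9 and 1.9 second figures quoted above; the speedup and efficiency functions are hypothetical helpers, not part of the original code. One plausible reason the gain is about 2x rather than 10x, which the text does not confirm, is that ten processes cannot all run at the same time on a machine with fewer than ten cores.

```python
import os


def speedup(sequentialSeconds, parallelSeconds):
  # Hypothetical helper: how many times faster the parallel run was.
  return sequentialSeconds / parallelSeconds


def efficiency(sequentialSeconds, parallelSeconds, workers):
  # Speedup per worker process; 1.0 would mean perfect scaling.
  return speedup(sequentialSeconds, parallelSeconds) / workers


if __name__ == '__main__':
  print("Speedup: {:.2f}x".format(speedup(3.9, 1.9)))
  print("Efficiency across 10 processes: {:.2f}".format(
    efficiency(3.9, 1.9, 10)))
  # os.cpu_count() may return None if the count cannot be determined.
  print("Logical CPUs reported: {}".format(os.cpu_count()))
```

Comparing the reported CPU count with the number of processes gives a rough upper bound on the speedup to expect from CPU-bound work like this.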

This is just a very basic demonstration of how we can use multiprocessing in our applications. In future chapters, we'll explore how we can create pools and utilize executors. The key point to take away from this is that we can improve the performance of some CPU-bound tasks by spreading the work across multiple cores.