Parallel Programming with Python

Parallel Programming with Python: Develop efficient parallel systems using the robust Python environment.


Chapter 2. Designing Parallel Algorithms

When developing parallel systems, several aspects must be considered before you write any code. Outlining the problem and how it will be parallelized from the beginning is essential to succeeding at the task. In this chapter, we'll look at some techniques for arriving at parallel solutions.

This chapter covers the following topics:

  • The divide and conquer technique
  • Data decomposition
  • Decomposing tasks with pipeline
  • Processing and mapping

The divide and conquer technique

When you face a complex problem, the first step is to decompose it in order to identify parts that can be handled independently. In general, the parallelizable parts of a solution are the pieces that can be divided and distributed to different workers for processing. The divide and conquer technique splits the domain recursively until an indivisible unit of the problem is found and solved. Sorting algorithms such as merge sort and quicksort can be solved using this approach.

The following diagram shows merge sort applied to a vector of six elements, which makes the divide and conquer technique visible:


Merge sort (divide and conquer)
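
To make the idea concrete, here is a minimal sketch of merge sort written in the divide and conquer style. It is not code from the book: the function names and the six input values are hypothetical, and only the two top-level halves are handed to separate workers. A thread pool is used to keep the example short; in CPython, CPU-bound work like this would normally need processes (for example `ProcessPoolExecutor`) to run truly in parallel.

```python
from concurrent.futures import ThreadPoolExecutor


def merge(left, right):
    """Merge two sorted lists into one sorted list."""
    merged = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged


def merge_sort(values):
    """Divide recursively until single elements remain, then merge."""
    if len(values) <= 1:
        return list(values)
    middle = len(values) // 2
    return merge(merge_sort(values[:middle]), merge_sort(values[middle:]))


def parallel_merge_sort(values):
    """Sort the two top-level halves in separate workers, then merge."""
    if len(values) <= 1:
        return list(values)
    middle = len(values) // 2
    with ThreadPoolExecutor(max_workers=2) as executor:
        left = executor.submit(merge_sort, values[:middle])
        right = executor.submit(merge_sort, values[middle:])
        return merge(left.result(), right.result())


vector = [4, 3, 2, 1, 6, 5]  # hypothetical six-element input
sorted_vector = parallel_merge_sort(vector)
```

The two halves share no data, which is what makes them safe to hand to different workers; the final merge is the only step that needs both results.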

Using data decomposition

One way to parallelize a problem is through data decomposition. Imagine that the task is to multiply a 2 x 2 matrix, which we will call Matrix A, by a scalar value of 4. In a sequential system, we perform each multiplication one after the other and obtain the final result once all the instructions have run. Depending on the size of Matrix A, the sequential solution may be time-consuming. However, when data decomposition is applied, Matrix A can be broken into pieces, with each piece assigned to a worker that processes the data it receives in parallel with the others. The following diagram illustrates the concept of data decomposition applied to the example of a 2 x 2 matrix multiplied by a scalar value:


Data decomposition in a matrix example

The matrix problem presented in the preceding diagram had a certain symmetry where each necessary operation to get to the final result was executed...
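
The matrix example can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the book's code: the element values of Matrix A and the helper names are hypothetical, and each element is treated as one piece of data handed to a worker in a thread pool.

```python
from concurrent.futures import ThreadPoolExecutor

SCALAR = 4
matrix_a = [[1, 2], [3, 4]]  # hypothetical values for Matrix A


def multiply_element(value):
    """The work done by one worker on one piece of the matrix."""
    return value * SCALAR


def scale_matrix(matrix):
    """Decompose the matrix into elements, process them, and reassemble."""
    columns = len(matrix[0])
    elements = [value for row in matrix for value in row]
    with ThreadPoolExecutor(max_workers=len(elements)) as executor:
        # executor.map returns results in the same order as the inputs.
        scaled = list(executor.map(multiply_element, elements))
    return [scaled[i:i + columns] for i in range(0, len(scaled), columns)]


result = scale_matrix(matrix_a)
```

Every element needs exactly one multiplication, so the load is evenly split; with a larger matrix it would usually make more sense to give each worker a row or a block rather than a single element.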

Decomposing tasks with pipeline

The pipeline technique is used to organize tasks that must be executed collaboratively to solve a problem. A pipeline breaks large tasks into smaller independent tasks that run in parallel. The pipeline model can be compared to an assembly line at a vehicle factory, where the chassis is the raw material, the input. As the raw material goes through the different stages of production, several workers perform different actions one after another until, at the end of the process, a car is ready. This model is very similar to the sequential paradigm of development: tasks are executed on data one after another, and normally a task receives as input the result of the previous task. So what differentiates this model from the sequential technique? Each stage of the pipeline has its own workers, which act on the problem in parallel.

An example in the context of computing could be one in which a system processes images...
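
The assembly-line analogy can be sketched with the standard `queue` and `threading` modules. This is a simplified illustration, not the book's implementation: the stage functions and chassis names are hypothetical, and each stage gets a single worker thread, whereas a real pipeline may assign several workers to a stage.

```python
import queue
import threading

STOP = object()  # sentinel that tells a stage no more items are coming


def run_stage(action, inbox, outbox):
    """Apply `action` to every item from `inbox` and pass the result on."""
    while True:
        item = inbox.get()
        if item is STOP:
            outbox.put(STOP)  # propagate shutdown to the next stage
            return
        outbox.put(action(item))


def run_pipeline(items, actions):
    """Run each action as its own stage, with one worker thread per stage."""
    queues = [queue.Queue() for _ in range(len(actions) + 1)]
    workers = [
        threading.Thread(target=run_stage, args=(action, queues[i], queues[i + 1]))
        for i, action in enumerate(actions)
    ]
    for worker in workers:
        worker.start()
    for item in items:
        queues[0].put(item)
    queues[0].put(STOP)

    finished = []
    while True:
        item = queues[-1].get()
        if item is STOP:
            break
        finished.append(item)
    for worker in workers:
        worker.join()
    return finished


# Hypothetical assembly-line stages: each one adds a part to the chassis.
def add_engine(car):
    return car + ["engine"]


def add_wheels(car):
    return car + ["wheels"]


def paint(car):
    return car + ["paint"]


chassis = [["chassis-1"], ["chassis-2"], ["chassis-3"]]
cars = run_pipeline(chassis, [add_engine, add_wheels, paint])
```

While the painting stage works on the first chassis, the earlier stages can already be working on the second and third, which is where the pipeline differs from running the three functions sequentially on each item.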

Processing and mapping

The number of workers is not always large enough to resolve a specific problem in a single step. Therefore, the decomposition techniques presented in the previous sections are necessary. However, decomposition techniques should not be applied arbitrarily; there are factors that can influence the performance of the solution. After decomposing data or tasks, the question we ought to ask is, "How do we divide the processing load among workers to obtain good performance?" This is not an easy question to answer, as it all depends on the problem under study.

Basically, we could mention two important steps when defining process mapping:

  • Identifying independent tasks
  • Identifying tasks that require data exchange

Identifying independent tasks

Identifying independent tasks in a system allows us to distribute them among different workers, since these tasks do not need constant communication. Because they do not depend on a particular data location, the tasks can be executed on different workers...
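
As a rough illustration of independent tasks, the sketch below counts the words in three unrelated pieces of text. The task, the input strings, and the function names are all hypothetical; the point is only that no task reads or writes anything another task needs, so they can be submitted to any available worker and collected in whatever order they finish.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def count_words(text):
    """An independent task: it needs only its own input."""
    return len(text.split())


# Hypothetical inputs; each entry can be processed without the others.
documents = {
    "a": "divide and conquer",
    "b": "data decomposition",
    "c": "decomposing tasks with pipeline",
}


def count_all(docs, max_workers=3):
    """Distribute one task per document and gather the results."""
    counts = {}
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        futures = {
            executor.submit(count_words, text): name
            for name, text in docs.items()
        }
        # Completion order does not matter because the tasks share no state.
        for future in as_completed(futures):
            counts[futures[future]] = future.result()
    return counts


word_counts = count_all(documents)
```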

Summary

In this chapter, we discussed some ways to create parallel solutions. Your focus should be on the importance of dividing the processing load among different workers, considering the location and not the data.

In the next chapter, we will study how to identify a parallelizable problem.


Description

Starting with the basics of parallel programming, you will learn how to design parallel algorithms and implement them. You will then gain the expertise to evaluate problem domains, identify whether a particular problem can be parallelized, and use the threading and multiprocessing modules in Python. The Parallel Python (PP) module, another mechanism for parallel programming, is covered in depth to help you optimize your use of PP. You will also delve into using Celery to perform distributed tasks efficiently and easily. Furthermore, you will learn about asynchronous I/O using the asyncio module. By the end of this book, you will have an in-depth understanding of what the Python language offers, in terms of built-in and external modules, for an effective implementation of parallel programming. This guide aims to teach you what you need to know to develop and maintain high-performance parallel computing systems using the feature-rich Python.

What you will learn

  • Explore techniques to parallelize problems
  • Integrate the Parallel Python module to implement Python code
  • Execute parallel solutions on simple problems
  • Achieve communication between processes using Pipe and Queue
  • Use Celery Distributed Task Queue
  • Implement asynchronous I/O using the Python asyncio module
  • Create threadsafe structures

Product Details

Publication date: Jun 25, 2014
Length: 124 pages
Edition: 1st
Language: English
ISBN-13: 9781783288403

Table of Contents

9 Chapters
1. Contextualizing Parallel, Concurrent, and Distributed Programming
2. Designing Parallel Algorithms
3. Identifying a Parallelizable Problem
4. Using the threading and concurrent.futures Modules
5. Using Multiprocessing and ProcessPoolExecutor
6. Utilizing Parallel Python
7. Distributing Tasks with Celery
8. Doing Things Asynchronously
Index

Customer reviews

Rating: 3 out of 5 (9 ratings)

  • 5 star: 11.1%
  • 4 star: 22.2%
  • 3 star: 33.3%
  • 2 star: 22.2%
  • 1 star: 11.1%

Fitzgerald Stowers Aug 04, 2014
Rating: 5 out of 5
Great book and very easy to understand!
Amazon Verified review
Pedro Medeiros Jan 30, 2015
Rating: 4 out of 5
Very useful information. Could be better written, not suitable for parallel programming neophytes.
Amazon Verified review
Zippy Aug 15, 2014
Rating: 4 out of 5
Python’s roots go so far back in time that, despite the prevalence of today’s multicore CPU computers, the most popular, “official” version of the language (CPython) is limited by the GIL, or Global Interpreter Lock. The GIL restricts thread processing so that only one thread at a time runs for each interpreter process executing on the system.

Among other things, the GIL prevents the sharing of code that’s not thread-safe with other threads, such as the code often found in the many C libraries that are often called upon by Python programmers, and it allows for simplified memory management.

The GIL certainly puts the kibosh on anyone attempting to implement concurrency or pure parallelism on a multiprocessing machine running Python. What one gets instead is a sort of cooperative multitasking of sections of threads that run in spurts and stretches between moments of I/O (read, write, send, etc.) when the GIL is briefly released (CPU-bound threads that never deal with I/O are handled as a special case).

Running a Python program on a multiple core PC is therefore asking for trouble, since each core “wants” to schedule its own runnable thread, and all the scheduling of threads is simultaneous. But the GIL only allows one thread at a time to run, so a Python programmer who divvies his algorithm into multiple threads will get quite a surprise when he discovers that his software is running slower than it should, since threads are competing for exclusive access to the GIL. Indeed, running such a multi-thread program on a multicore system can slow things down even more!

The limitations imposed by the GIL are probably the principal reason why some Python programmers have switched to working in Google’s Go language, which is built from the ground up to harness the power of multiple cores and processors via what its originators call “goroutines;” these vaguely resemble Unix pipes and can all run in the same memory address space via multiplexing onto multiple, parallel OS threads.

But all is not lost for those Python programmers longing to delve into the realm of parallel and concurrent programming.

For those completely clueless about how to thumb one’s nose at the GIL and tackle concurrency and parallelism in Python, such as using the open source and cross-platform “PP” or Parallel Python module, a good place to start is this book.

That being the case, I won’t burden the reader with the quality of the author’s coverage in specific areas, since a beginner will have no idea of what I’m talking about. (Besides, others have commented on that elsewhere on this site.) Suffice it to say that, provided you can jump through enough coding hoops, you can bring a performance boost to your software, particularly if it has a sufficiently “granular” structure.
Amazon Verified review
A. Zubarev Aug 07, 2014
Rating: 3 out of 5
Parallel Programming is an increasingly hot topic in today's IT circles. For those who wonder why, I can tell you in short: it is because CPU clock speeds have stagnated. We, software engineers, are dealing with ever increasing volumes of data and are asked to deliver even faster, more robust applications and websites. This is tough. Parallel Programming is the answer. I hope I have whetted your appetite for exploring Parallel Programming, so now I can switch the focus to the book.

It is not terribly long. Not costly either. In fact, if you care, I managed to read the whole of it in 3 hours plus (stats are from my ebook reader app) and managed to run a few examples that worked on my laptop with Windows 7. I am planning on running more examples later on a POSIX machine. The thing with the examples is that they are classic ones: the Fibonacci series, which is boring to me and far from what anybody would be dealing with at work, and web crawling, which is better done using, say, Nutch. The same code examples go through the entire book, just with different techniques applied. What I wish Jan had done is explain at least which technique helps in which case in real life. My other pet peeves are that there was no mention of how to leverage the GPU or how to eliminate the For Loops - this is actually a must in my opinion - and there was no coverage of how to debug parallel processes. Let me stop at debugging a tad longer: since Python allows mutability, it becomes critical to exterminate nasty mutation bugs!

In closing, I have some advice for the author: it is hard to write a technical book, but I wish it could be longer and covered more ground. Another piece of advice is for the publisher: this book qualifies for the "Instant" moniker type of books from Packt. By the way, I like your website redesign!

Three stars out of five.
Amazon Verified review
Brisard Aug 14, 2014
Rating: 3 out of 5
For a long time, programmers have been relying exclusively on Moore's law to resolve their performance issues. In other words, they trusted the fact that the CPU frequency would increase, making their program faster without changing one single line. Today however, "the free lunch is over", as already argued by Herb Sutter in 2004. The frequency of CPUs tends to stagnate, while the number of cores in even the cheapest laptop has increased. For Sutter, the immediate consequence is: "applications will increasingly need to be concurrent if they want to fully exploit CPU throughput gains".

Python is no exception to this paradigm change and this book introduces several ways to go parallel with Python using the standard library modules threading, multiprocessing and asyncio, and the third-party modules Parallel Python and Celery. This is quite an impressive list for such a short book, but I personally think that IPython.parallel and mpi4py (both quite popular in scientific computing) should have made it into this book.

Parallel programming in Python is no trivial task because of the Global Interpreter Lock (GIL) which "prevents multiple native threads from executing Python bytecodes at once". While this issue is briefly mentioned in this book (in a section called "Taking Care of Python GIL"), I think that the author should have made it clearer which modules suffer from this limitation, and which don't. For example, the following statement (which can be found on page 30) is far too vague: "Within the Python programming language, the use of CPU-bound threads may harm performance of the application due to GIL."

"may harm performance"? Could you be more precise, please? The answer can be found in the module's official documentation: "CPython implementation detail: In CPython, due to the Global Interpreter Lock, only one thread can execute Python code at once (even though certain performance-oriented libraries might overcome this limitation). If you want your application to make better use of the computational resources of multi-core machines, you are advised to use multiprocessing or concurrent.futures.ProcessPoolExecutor. However, threading is still an appropriate model if you want to run multiple I/O-bound tasks simultaneously."

So the truth is that "may harm" should really read "does harm". To be fair, a much more clear-cut statement can be found on page 94: "A way to solve this problem is delegating a blocking task to ThreadPoolExecutor (remember this works well if the processing is I/O bound; if it is CPU-bound, use ProcessPoolExecutor)".

The book starts with two introductory chapters, in which the reader will find useful material. Chapter 1 presents a historical background, defines various forms of parallelization as well as the most common pitfalls of parallel programming (deadlock, starvation, race condition). Chapter 2 reviews a few parallelization strategies.

In chapter 3, the author presents two simple problems which will be used in the remainder of the book to illustrate the use of the various modules I mentioned above. I find the idea of reusing the same two examples over and over again quite interesting, as it makes it easier for the reader to understand the differences and similarities between these modules. Besides, the two examples are simple, and so is their parallel implementation. Maybe these two problems are too simple, though. Indeed, both are embarrassingly parallel problems, for which very little communication is required (besides scattering the data at the beginning and gathering the results at the end). In such a favorable situation, the potential "parallel programming problems" listed in chapter 1 are simply swept under the rug, which is probably better for an introductory book on parallel programming.

In Chapters 4, 5, 6, 7 and 8, these two problems are implemented using the following modules in turn: threading, multiprocessing, pp, celery and asyncio. Of course, the API of these modules is not (cannot be) described in detail in such a short book: only the most elementary classes and functions are described. So be ready to dive into the API documentation after reading this book! It should be mentioned that the chapters on celery and asyncio (new in Python 3.4) provide more details, and are very enjoyable. I do think that a closing comparison between all these modules is clearly missing.

To conclude, I would recommend this book to those of you who are willing to go parallel in Python, but do not know where to start. This book usefully lists various options and provides the keys to each of those. Happy reading!
Amazon Verified review