Mastering Python High Performance

Chapter 1. Profiling 101

Just like an infant needs to learn to crawl before running a 100-meter obstacle course in under 12 seconds, programmers need to understand the basics of profiling before trying to master that art. So, before we delve into the mysteries of performance optimization and profiling of Python programs, we need a clear understanding of the basics.

Once you know the basics, you'll be able to learn about the tools and techniques. To start us off, this chapter will cover everything you need to know about profiling but were too afraid to ask. In this chapter, we will do the following:

  • We will provide a clear definition of what profiling is and describe the different profiling techniques.
  • We will explain the importance of profiling in the development cycle. Profiling is not something you do only once and then forget about; it should be an integral part of the development process, just like writing tests is.
  • We will cover the things we can profile. We'll go over the different types of resources we can measure and how they'll help us find our problems.
  • We will discuss the risk of premature optimization, that is, why optimizing before profiling is generally a bad idea.
  • You will learn about running time complexity. Understanding profiling techniques is one step toward successful optimization, but we also need to understand how to measure the complexity of an algorithm in order to decide whether we need to improve it.
  • We will also look at good practices. Finally, we'll go over some good practices to keep in mind when starting the profiling process of your project.

What is profiling?

A program that hasn't been optimized will normally spend most of its CPU cycles in a few particular subroutines. Profiling is the analysis of how the code behaves in relation to the resources it uses. For instance, profiling will tell you how much CPU time an instruction is using or how much memory the full program is consuming. It is achieved by modifying either the source code of the program or the binary executable form (when possible) to use a tool called a profiler.

Normally, developers profile their programs when they need to either optimize their performance or when those programs are suffering from some kind of weird bug, which can normally be associated with memory leaks. In such cases, profiling can help them get an in-depth understanding of how their code is using the computer's resources (that is, how many times a certain function is being called).

A developer can use this information, along with a working knowledge of the source code, to find the program's bottlenecks and memory leaks. The developer can then fix whatever is wrong with the code.

There are two main methodologies for profiling software: event-based profiling and statistical profiling. Both have pros and cons that you should keep in mind when choosing between them.

Event-based profiling

Not every programming language supports this type of profiling; Python is one of the languages that does, as we'll see in a moment.

Event-based profilers (also known as tracing profilers) work by gathering data on specific events during the execution of our program. These profilers generate a large amount of data. Basically, the more events they listen to, the more data they will gather. This makes them somewhat impractical to use, and they are not the first choice when starting to profile a program. However, they are a good last resort when other profiling methods aren't enough or just aren't specific enough. Consider the case where you'd want to profile all the return statements. This type of profiler would give you the granularity you'd need for this task, while others would simply not allow you to execute this task.

A simple example of an event-based profiler in Python is the following code (we'll cover this topic in more depth in the upcoming chapters):

import sys

def profiler(frame, event, arg):
    print('PROFILER: %r %r' % (event, arg))

sys.setprofile(profiler)

# A simple (and very inefficient) way to calculate the
# Fibonacci sequence for a number.
def fib(n):
    if n == 0:
        return 0
    elif n == 1:
        return 1
    else:
        return fib(n - 1) + fib(n - 2)

def fib_seq(n):
    seq = []
    if n > 0:
        seq.extend(fib_seq(n - 1))
    seq.append(fib(n))
    return seq

print(fib_seq(2))

The preceding code produces the following output:

PROFILER: 'call' None
PROFILER: 'call' None
PROFILER: 'call' None
PROFILER: 'call' None
PROFILER: 'return' 0
PROFILER: 'c_call' <built-in method append of list object at 0x7f570ca215f0>
PROFILER: 'c_return' <built-in method append of list object at 0x7f570ca215f0>
PROFILER: 'return' [0]
PROFILER: 'c_call' <built-in method extend of list object at 0x7f570ca21bd8>
PROFILER: 'c_return' <built-in method extend of list object at 0x7f570ca21bd8>
PROFILER: 'call' None
PROFILER: 'return' 1
PROFILER: 'c_call' <built-in method append of list object at 0x7f570ca21bd8>
PROFILER: 'c_return' <built-in method append of list object at 0x7f570ca21bd8>
PROFILER: 'return' [0, 1]
PROFILER: 'c_call' <built-in method extend of list object at 0x7f570ca55bd8>
PROFILER: 'c_return' <built-in method extend of list object at 0x7f570ca55bd8>
PROFILER: 'call' None
PROFILER: 'call' None
PROFILER: 'return' 1
PROFILER: 'call' None
PROFILER: 'return' 0
PROFILER: 'return' 1
PROFILER: 'c_call' <built-in method append of list object at 0x7f570ca55bd8>
PROFILER: 'c_return' <built-in method append of list object at 0x7f570ca55bd8>
PROFILER: 'return' [0, 1, 1]
[0, 1, 1]
PROFILER: 'return' None
PROFILER: 'call' None
PROFILER: 'c_call' <built-in method discard of set object at 0x7f570ca8a960>
PROFILER: 'c_return' <built-in method discard of set object at 0x7f570ca8a960>
PROFILER: 'return' None
PROFILER: 'call' None
PROFILER: 'c_call' <built-in method discard of set object at 0x7f570ca8f3f0>
PROFILER: 'c_return' <built-in method discard of set object at 0x7f570ca8f3f0>
PROFILER: 'return' None

As you can see, PROFILER is called on every event. We can print or gather whatever information we deem relevant inside the profiler function. The last line of the sample code shows that the simple execution of fib_seq(2) generates a lot of output data. If we were dealing with a real-world program, this output would be several orders of magnitude bigger. This is why event-based profiling is normally the last option when it comes to profiling. There are other alternatives out there (as we'll see) that generate much less output, but, of course, with lower accuracy.

Statistical profiling

Statistical profilers work by sampling the program counter at regular intervals. This in turn allows the developer to get an idea of how much time the target program is spending on each function. Since it works by sampling the PC, the resulting numbers will be a statistical approximation of reality instead of exact numbers. Still, it should be enough to get a glimpse of what the profiled program is doing and where the bottlenecks are.
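The sampling idea is simple enough to sketch in pure Python. The following toy sampler (an illustration only, not a production profiler) polls the main thread's stack from a background thread at a fixed interval and counts which function it finds at the top; the names `sampler` and `busy_work` are ours, invented for this sketch:

```python
import collections
import sys
import threading
import time

samples = collections.Counter()
done = threading.Event()

def sampler(interval=0.001):
    # Periodically look at the main thread's current stack frame and
    # record which function it is in -- the essence of statistical profiling.
    main_id = threading.main_thread().ident
    while not done.is_set():
        frame = sys._current_frames().get(main_id)
        if frame is not None:
            samples[frame.f_code.co_name] += 1
        time.sleep(interval)

def busy_work():
    # a deliberately CPU-heavy function that should dominate the samples
    total = 0
    for i in range(2_000_000):
        total += i * i
    return total

thread = threading.Thread(target=sampler, daemon=True)
thread.start()
busy_work()
done.set()
thread.join()
print(samples.most_common(3))
```

Because the sampler only wakes up once per millisecond, the data it records is a statistical approximation of where the program spends its time, at a tiny fraction of the cost of tracing every event.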

Some advantages of this type of profiling are as follows:

  • Less data to analyze: Since we're only sampling the program's execution instead of saving every little piece of data, the amount of information to analyze will be significantly smaller.
  • Smaller profiling footprint: Due to the way the sampling is done (using OS interrupts), the target program takes a smaller performance hit. Although the presence of the profiler is not entirely unnoticed, statistical profiling does less damage than the event-based kind.

Here is an example of the output of OProfile (http://oprofile.sourceforge.net/news/), a Linux statistical profiler:

Function name,File name,Times Encountered,Percentage
"func80000","statistical_profiling.c",30760,48.96%
"func40000","statistical_profiling.c",17515,27.88%
"func20000","static_functions.c",7141,11.37%
"func10000","static_functions.c",3572,5.69%
"func5000","static_functions.c",1787,2.84%
"func2000","static_functions.c",768,1.22%
"func1500","statistical_profiling.c",701,1.12%
"func1000","static_functions.c",385,0.61%
"func500","statistical_profiling.c",194,0.31%

Here is the output of profiling the same Fibonacci code from the preceding code using a statistical profiler for Python called statprof:

  %   cumulative      self          
 time    seconds   seconds  name    
100.00      0.01      0.01  B02088_01_03.py:11:fib
  0.00      0.01      0.00  B02088_01_03.py:17:fib_seq
  0.00      0.01      0.00  B02088_01_03.py:21:<module>
---
Sample count: 1
Total time: 0.010000 seconds

As you can see, there is quite a difference between the output of both profilers for the same code.

The importance of profiling

Now that we know what profiling means, it is also important to understand how relevant it is to actually do it during the development cycle of our applications.

Profiling is not something everyone is used to doing, especially with non-critical software (unlike pacemaker embedded software or other execution-critical examples). Profiling takes time and is normally useful only after we've detected that something is wrong with our program. However, it can still be performed before that even happens, to catch possible unseen bugs, which in turn helps chip away at the time spent debugging the application at a later stage.

As hardware keeps advancing, getting faster and cheaper, it is increasingly hard to understand why we, as developers, should spend resources (mainly time) on profiling our creations. After all, we have practices such as test-driven development, code review, pair programming and others that assure us our code is solid and that it'll work as we want it. Right?

However, what we sometimes fail to realize is that the higher level our languages become (we've gone from assembler to JavaScript in just a few decades), the less we think about CPU cycles, memory allocation, CPU registers, and so on. New generations of programmers learn their craft using higher-level languages because they're easier to understand and provide more power out of the box. However, they also abstract away the hardware and our interaction with it. As this tendency keeps growing, the chances that new developers will even consider profiling their software as another step in its development grow slimmer.

Let's look at the following scenario:

As we know, profiling measures the resources our program uses. As stated earlier, those resources keep getting cheaper and cheaper. So, the cost of getting our software out and making it available to a large number of users is also getting lower.

These days, it is increasingly easy to create and publish an application that will reach thousands of people. If they like it and spread the word through social media, that number can grow exponentially. Once that happens, very commonly the software will crash, or it'll become impossibly slow, and the users will just go away.

A possible explanation for the preceding scenario is, of course, a badly thought-out, non-scalable architecture. After all, a single server with a limited amount of RAM and processing power will only get you so far before it becomes your bottleneck. However, another possible explanation, one that proves true many times, is that we failed to stress-test our application. We didn't think about resource consumption; we just made sure our tests passed, and we were happy with that. In other words, we failed to go that extra mile, and as a result, our project crashed and burned.

Profiling can help avoid that crash and burn outcome, since it provides a fairly accurate view of what our program is doing, no matter the load. So, if we profile it with a very light load, and the result is that we're spending 80 percent of our time doing some kind of I/O operation, it might raise a flag for us. Even if, during our test, the application performed correctly, it might not do so under heavy stress. Think of a memory leak-type scenario. In those cases, small tests might not generate a big enough problem for us to detect it. However, a production deployment under heavy stress will. Profiling can provide enough evidence for us to detect this problem before it even turns into one.

What can we profile?

Going deeper into profiling, it is very important to understand what we can actually profile. Measuring is the core of profiling, so let's take a detailed look at the things we can measure during a program's execution.

Execution time

The most basic of the numbers we can gather when profiling is the execution time. The execution time of the entire process or just of a particular portion of the code will shed some light on its own. If you have experience in the area your program is running (that is, you're a web developer and you're working on a web framework), you probably already know what it means for your system to take too much time. For instance, a simple web server might take up to 100 milliseconds when querying the database, rendering the response, and sending it back to the client. However, if the same piece of code starts to slow down and now it takes 60 seconds to do the same task, then you should start thinking about profiling. You also have to consider that numbers here are relative. Let's assume another process: a MapReduce job that is meant to process 2 TB of information stored on a set of text files takes 20 minutes. In this case, you might not consider it as a slow process, even when it takes considerably more time than the slow web server mentioned earlier.

To get this type of information, you don't really need a lot of profiling experience or even complex tools to get the numbers. Just add the required lines into your code and run the program.

For instance, the following code will calculate the Fibonacci sequence for the number 30:

import datetime

tstart = None

def start_time():
    global tstart
    tstart = datetime.datetime.now()

def get_delta():
    # returns the time elapsed since the last call to start_time()
    return datetime.datetime.now() - tstart

def fib(n):
    return n if n == 0 or n == 1 else fib(n - 1) + fib(n - 2)

def fib_seq(n):
    seq = []
    if n > 0:
        seq.extend(fib_seq(n - 1))
    seq.append(fib(n))
    return seq

start_time()
print("About to calculate the Fibonacci sequence for the number 30")
delta1 = get_delta()

start_time()
seq = fib_seq(30)
delta2 = get_delta()

print("Now we print the numbers: ")
start_time()
for n in seq:
    print(n)
delta3 = get_delta()

print("====== Profiling results =======")
print("Time required to print a simple message: %s" % delta1)
print("Time required to calculate fibonacci: %s" % delta2)
print("Time required to iterate and print the numbers: %s" % delta3)
print("======  =======")

Now, the code will produce the following output:

About to calculate the Fibonacci sequence for the number 30
Now we print the numbers: 
0
1
1
2
3
5
8
13
21
#...more numbers
4181
6765
10946
17711
28657
46368
75025
121393
196418
317811
514229
832040
====== Profiling results =======
Time required to print a simple message: 0:00:00.000030
Time required to calculate fibonacci: 0:00:00.642092
Time required to iterate and print the numbers: 0:00:00.000102

Based on the last three lines, we see the obvious results: the most expensive part of the code is the actual calculation of the Fibonacci sequence.
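For quick measurements like these, Python's standard library also ships the timeit module, which handles the clock bookkeeping for you. A minimal sketch, reusing the naive Fibonacci function from above:

```python
import timeit

def fib(n):
    # same naive recursive Fibonacci as before
    return n if n == 0 or n == 1 else fib(n - 1) + fib(n - 2)

# run fib(20) ten times per trial, three trials, and keep the best total
best = min(timeit.repeat(lambda: fib(20), number=10, repeat=3))
print("best of 3 trials (10 calls each): %f seconds" % best)
```

Taking the minimum over several trials is the usual convention, since the fastest run is the one least disturbed by other activity on the machine.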

Tip

Downloading the example code

You can download the example code files from your account at http://www.packtpub.com for all the Packt Publishing books you have purchased. If you purchased this book elsewhere, you can visit http://www.packtpub.com/support and register to have the files e-mailed directly to you.

Where are the bottlenecks?

Once you've measured how much time your code needs to execute, you can profile it by paying special attention to the slow sections. These are the bottlenecks, and normally, they are related to one or a combination of the following reasons:

  • Heavy I/O operations, such as reading and parsing big files, executing long-running database queries, calling external services (such as HTTP requests), and so on
  • Unexpected memory leaks that start building up until there is no memory left for the rest of the program to execute properly
  • Unoptimized code that gets executed frequently
  • Intensive operations that are not cached when they could be
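The last bullet, uncached intensive operations, is often the cheapest one to fix. As a sketch, memoizing the naive Fibonacci function from earlier with functools.lru_cache eliminates the repeated work entirely:

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def fib(n):
    # each distinct n is now computed only once and then served from cache,
    # turning the exponential recursion into linear work
    return n if n == 0 or n == 1 else fib(n - 1) + fib(n - 2)

print(fib(100))  # 354224848179261915075, computed almost instantly
```

Without the cache, fib(100) would never finish in practice; with it, the call returns immediately.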

I/O-bound code (file reads/writes, database queries, and so on) is usually harder to optimize, because doing so would imply changing the way the program deals with that I/O (normally using core functions of the language). Instead, when optimizing compute-bound code (such as a function using a badly implemented algorithm), getting a performance improvement is easier (although not necessarily easy), because it mostly implies rewriting it.

A general indicator that you're near the end of a performance optimization process is when most of the bottlenecks left are due to I/O-bound code.

Memory consumption and memory leaks

Another very important resource to consider when developing software is memory. Regular software developers don't really care much about it, since the era of the 640 KB RAM PC is long dead. However, a memory leak in a long-running program can turn any server into a 640 KB computer. Memory consumption is not just about having enough memory for your program to run; it's also about having control over the memory that your programs use.

There are some developments, such as embedded systems, that actually require developers to pay extra attention to the amount of memory they use, because it is a limited resource in those systems. However, an average developer can expect their target system to have the amount of RAM they require.

With plenty of RAM and higher-level languages that come with automatic memory management (such as garbage collection), developers are less likely to pay much attention to memory utilization, trusting the platform to do it for them.

Keeping track of memory consumption is relatively straightforward. At least for a basic approach, just use your OS's task manager. It'll display, among other things, the amount of memory used or at least the percentage of total memory used by your program. The task manager is also a great tool to check your CPU time consumption. As you can see in the next screenshot, a simple Python program (the preceding one) is taking up almost the entire CPU power (99.8 percent), and barely 0.1 percent of the total memory that is available:

[Screenshot: OS task manager (the top command) showing the Python program at 99.8 percent CPU and 0.1 percent memory]

With a tool like that (the top command line tool from Linux), spotting memory leaks can be easy, but that will depend on the type of software you're monitoring. If your program is constantly loading data, its memory consumption rate will be different from another program that doesn't have to deal much with external resources.
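Beyond the task manager, Python's standard library includes tracemalloc, which tracks allocations from inside the process itself. A minimal sketch of simulating a growing allocation and measuring it (the leak here is deliberately manufactured for illustration):

```python
import tracemalloc

tracemalloc.start()

# simulate a leak: keep references to data that is never released
leaked = []
for _ in range(100):
    leaked.append(bytearray(10_000))  # roughly 10 KB per iteration

current, peak = tracemalloc.get_traced_memory()
print("current: %d bytes, peak: %d bytes" % (current, peak))
tracemalloc.stop()
```

Taking snapshots like this at two points in time and comparing them is a simple way to see which allocations survive when you expected them to be released.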

For instance, if we were to chart the memory consumption over time of a program dealing with lots of external data, it would look like the following chart:

[Chart: memory consumption over time for a program that loads and releases external resources, with peaks and drops around a stable baseline marked as a green box]

There will be peaks, when these resources get fully loaded into memory, but there will also be some drops, when those resources are released. Although the memory consumption numbers fluctuate quite a bit, it's still possible to estimate the average amount of memory that the program will use when no resources are loaded. Once you define that area (marked as a green box in the preceding chart), you can spot memory leaks.

Let's look at how the same chart would look with bad resource handling (not fully releasing allocated memory):

[Chart: the same program with bad resource handling; memory consumption climbs out of the green box over time]

In the preceding chart, you can clearly see that not all memory is released when a resource is no longer used, which is causing the line to move out of the green box. This means the program is consuming more and more memory every second, even when the resources loaded are released.

The same can be done with programs that aren't resource heavy, for instance, scripts that execute a particular processing task for a considerable period of time. In those cases, the memory consumption and the leaks should be easier to spot.

Let's take a look at an example:

[Chart: memory consumption of a long-running processing script stabilizing within a clearly defined range]

When the processing stage starts, the memory consumption should stabilize within a clearly defined range. If we spot numbers outside that range, especially if it goes out of it and never comes back, we're looking at another example of a memory leak.

Let's look at an example of such a case:

[Chart: a processing script whose memory consumption leaves the stable range and never returns, indicating a leak]

The risk of premature optimization

Optimization is normally considered a good practice. However, this doesn't hold true when the act of optimization ends up driving the design decisions of the software solution.

A very common pitfall developers face while starting to code a new piece of software is premature optimization.

When this happens, the end result is often quite the opposite of the intended optimized code. It can be an incomplete version of the required solution, or it can even contain errors derived from the optimization-driven design decisions.

As a normal rule of thumb, if you haven't measured (profiled) your code, optimizing it might not be the best idea. First, focus on readable code. Then, profile it and find out where the real bottlenecks are, and as a final step, perform the actual optimization.

Running time complexity

When profiling and optimizing code, it's really important to understand what Running time complexity (RTC) is and how we can use that knowledge to properly optimize our code.

RTC helps quantify the execution time of a given algorithm. It does so by providing a mathematical approximation of the time a piece of code will take to execute for any given input. It is an approximation precisely so that we can group similar algorithms by that value.

RTC is expressed using Big O notation. In mathematics, Big O notation is used to express the limiting behavior of a given function as its terms tend to infinity. Applying that concept to computer science, we can use Big O notation to express the limiting behavior of the function describing the execution time.

In other words, this notation will give us a broad idea of how long our algorithm will take to process an arbitrarily large input. It will not, however, give us a precise number for the time of execution, which would require a more in-depth analysis of the source code.

As I've said earlier, we can use this tendency to group algorithms. Here are some of the most common groups:

Constant time – O(1)

This is the simplest of them all. This notation basically means that the action we're measuring will always take a constant amount of time, and this time is not dependent on the size of the input.

Here are some examples of code that have O(1) execution time:

  • Determining whether a number is odd or even:
    if number % 2:
        odd = True
    else:
        odd = False
  • Printing a message to standard output:
    print("Hello world!")

Even something more conceptually complex, like finding the value of a key inside a dictionary (or hash table), if implemented correctly, can be done in constant time. Technically speaking, accessing an element on the hash takes O(1) amortized time, which roughly means that the average time each operation takes (without taking into account edge cases) is a constant O(1) time.
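You can observe this difference empirically. The following sketch times a membership test against a plain list (a linear scan) versus the same test against a hash-based set; the sizes and counts are arbitrary choices for the demonstration:

```python
import timeit

n = 100_000
data_list = list(range(n))
data_set = set(data_list)
target = n - 1  # worst case for the linear scan

list_time = timeit.timeit(lambda: target in data_list, number=100)
set_time = timeit.timeit(lambda: target in data_set, number=100)

# the O(1) amortized hash lookup should beat the O(n) scan by a wide margin
print("list: %f s, set: %f s" % (list_time, set_time))
```

On any reasonable machine, the set lookup wins by several orders of magnitude, and the gap widens as n grows.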

Linear time – O(n)

Linear time dictates that for a given input of arbitrary length n, the amount of time required for the execution of the algorithm is linearly proportional to n, for instance, 3n, 4n + 5, and so on.

[Chart: the functions n, 3n, and 4n + 5 plotted together; all grow linearly]

The preceding chart clearly shows that both the 3n line and the 4n + 5 line grow at the same rate as the plain n line as x tends to infinity. So, to simplify, we can just say that all three functions are O(n).

Examples of algorithms with this execution order are:

  • Finding the smallest value in an unsorted list
  • Comparing two strings
  • Deleting the last item inside a linked list
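The first of these can be written in a few lines; the single pass over the data is exactly what makes it O(n):

```python
def find_min(items):
    # examine every element exactly once: n elements -> n - 1 comparisons
    smallest = items[0]
    for value in items[1:]:
        if value < smallest:
            smallest = value
    return smallest

print(find_min([7, 3, 9, 1, 4]))  # 1
```

No matter how the input is arranged, every element must be looked at once, so the running time is always proportional to the length of the list.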

Logarithmic time – O(log n)

An algorithm with logarithmic execution time is one whose execution time grows quickly at first but slows down as the input size gets bigger. It never stops growing, but the amount it grows by becomes so small as to be irrelevant.

[Chart: three logarithmic functions, all with a similar shape, growing ever more slowly toward infinity]

The preceding chart shows three different logarithmic functions. You can clearly see that they all possess a similar shape: they keep increasing toward infinity, but ever more slowly.

Some examples of algorithms that have logarithmic execution time are:

  • Binary search
  • Calculating Fibonacci numbers (using matrix multiplications)
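Binary search is the classic example: each comparison halves the remaining search interval, so a sorted list of n elements needs at most about log2(n) steps:

```python
def binary_search(sorted_items, target):
    # halve the search interval on every iteration -> O(log n)
    lo, hi = 0, len(sorted_items) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        if sorted_items[mid] == target:
            return mid
        elif sorted_items[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return -1  # not found

print(binary_search([1, 3, 5, 7, 9, 11], 7))  # 3
```

Doubling the size of the list adds only a single extra comparison to the worst case, which is what the flat shape of the logarithmic curves reflects.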

Linearithmic time – O(nlog n)

A particular combination of the previous two orders of execution is the linearithmic time. It grows quickly as soon as the value of x starts increasing.

Here are some examples of algorithms that have this order of execution:

  • Merge sort
  • Heap sort
  • Quick sort (at least its average time complexity)
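Merge sort illustrates where the n log n comes from: the input is halved log n times, and each level of halving does O(n) work to merge the pieces back together. A compact sketch:

```python
def merge_sort(items):
    # log n levels of splitting, O(n) merging work per level -> O(n log n)
    if len(items) <= 1:
        return items
    mid = len(items) // 2
    left = merge_sort(items[:mid])
    right = merge_sort(items[mid:])

    # merge the two sorted halves back into one sorted list
    merged = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged

print(merge_sort([5, 2, 9, 1, 5, 6]))  # [1, 2, 5, 5, 6, 9]
```
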

Let's see a few examples of plotted linearithmic functions to understand them better:

[Chart: several linearithmic (n log n) functions plotted together]

Factorial time – O(n!)

Factorial time is one of the worst execution times we might get out of an algorithm. It grows so quickly that it's hard to plot.

Here is a rough approximation of how the execution time of our algorithm would look with factorial time:

[Chart: rough approximation of factorial (n!) growth over time]

An example of an algorithm with factorial execution time is the solution for the traveling salesman using brute force search (basically checking every single possible solution).

Quadratic time – O(n^2)

Quadratic execution time is another example of a fast-growing order. The bigger the input size, the longer it's going to take (this is true for most complexities, but it is especially true for this one). Quadratic execution time is even less efficient than linearithmic time.

Some examples of algorithms having this order of execution are:

  • Bubble sort
  • Traversing a 2D array
  • Insertion sort
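Bubble sort shows the quadratic shape clearly: two nested loops over the data yield on the order of n * n comparisons:

```python
def bubble_sort(items):
    # two nested passes over the data -> O(n^2) comparisons
    items = list(items)  # work on a copy, leave the input untouched
    for i in range(len(items)):
        # after pass i, the last i elements are already in place
        for j in range(len(items) - 1 - i):
            if items[j] > items[j + 1]:
                items[j], items[j + 1] = items[j + 1], items[j]
    return items

print(bubble_sort([4, 1, 3, 2]))  # [1, 2, 3, 4]
```

Doubling the input size roughly quadruples the work, which is why quadratic algorithms that feel fine on small test data can fall over in production.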

Here are some examples of plotted quadratic functions:

[Chart: several quadratic functions plotted together]

Finally, let's look at all examples plotted together to get a clear idea of algorithm efficiency:

[Chart: constant, logarithmic, linear, linearithmic, quadratic, and factorial growth plotted together for comparison]

Leaving aside constant execution time, which is clearly faster but most of the time impossible to achieve in complex algorithms, the order of preference should be:

  • Logarithmic
  • Linear
  • Linearithmic
  • Quadratic
  • Factorial

Obviously, there are cases when you'll have no choice but to get a quadratic execution time as the best possible result. The idea is to always aim for the faster algorithms, but the limitations of your problems and technology will affect the actual result.

Note

Note that between quadratic and factorial times, there are several other alternatives (cubic, n ^ 4, and so on).

Another important consideration is that most algorithms don't have a single order of execution time. They can have up to three: best case, average case, and worst case, determined by the properties of the input data. For instance, the insertion sort algorithm runs much faster if the input is already sorted (best case, linear time), and it is at its worst (quadratic time) for input sorted in reverse order.

Other interesting cases to look at are the data types used. They inherently come with execution time that is associated with actions you can perform on them (lookup, insert, search, and so on). Let's look at some of the most common data types and their associated actions:

Data structure      | Average case                                | Worst case
                    | Indexing | Search   | Insertion | Deletion  | Indexing | Search | Insertion | Deletion
List                | O(1)     | O(n)     | -         | -         | O(1)     | O(n)   | -         | -
Linked list         | O(n)     | O(n)     | O(1)      | O(1)      | O(n)     | O(n)   | O(1)      | O(n)
Doubly linked list  | O(n)     | O(n)     | O(1)      | O(1)      | O(n)     | O(n)   | O(1)      | O(1)
Dictionary          | -        | O(1)     | O(1)      | O(1)      | -        | O(n)   | O(n)      | O(n)
Binary search tree  | O(log n) | O(log n) | O(log n)  | O(log n)  | O(n)     | O(n)   | O(n)      | O(n)

Profiling best practices

Profiling is a repetitive task. You'll do it several times inside the same project in order to get the best results, and you'll do it again on the next project. Just like with any other repetitive task in software development, there is a set of best practices you can follow to ensure that you get the most out of the process. Let's look at some of them:

Build a regression-test suite

Before starting any kind of optimization process, you need to make sure that the changes you make to the code will not affect its functioning in a bad way. The best way to do this, especially on a big code base, is to create a test suite. Make sure that your code coverage is high enough to provide the confidence you need to make the changes; a test suite with only 60 percent code coverage leaves too many paths unverified and can lead to very bad results.

A regression-test suite will allow you to make as many optimization tries as you need to without fear of breaking the code.
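As a minimal sketch, even a single unittest case that pins down known-good outputs gives you a safety net for optimization work. The function and the expected values here are just illustrative:

```python
import unittest

def fib(n):
    # the function we intend to optimize later
    return n if n == 0 or n == 1 else fib(n - 1) + fib(n - 2)

class TestFib(unittest.TestCase):
    def test_known_values(self):
        # lock in current behavior; any optimization that changes these
        # results is a regression and fails immediately
        self.assertEqual([fib(i) for i in range(7)], [0, 1, 1, 2, 3, 5, 8])

suite = unittest.TestLoader().loadTestsFromTestCase(TestFib)
result = unittest.TextTestRunner(verbosity=0).run(suite)
print("all tests pass" if result.wasSuccessful() else "regressions detected")
```

With this in place, you can swap fib for a memoized or iterative version and know within seconds whether the behavior is still correct.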

Mind your code

Functional-style code tends to be easier to refactor, mainly because functions structured that way avoid side effects. This reduces the risk of affecting unwanted parts of your system. If your functions also avoid local mutable state, that's another winning point for you, because the code should be pretty straightforward to understand and change. Functions that don't follow these guidelines will require more work and care while refactoring.

Be patient

Profiling is not fast, not easy, and not an exact process. What this means is that you should not expect to just run the profiler and expect the data from it to point directly to your problem. That could happen, yes. However, most of the time, the problems you're trying to solve are the ones that simple debugging couldn't fix. This means you'll be browsing through data, plotting it to try to make sense of it, and narrowing down the source of your problem until you either need to start again, or you find it.

Keep in mind that the deeper you get into the profiled data, the deeper into the rabbit hole you get. Numbers will stop making sense right away, so make sure you know what you're doing and that you have the right tools for the job before you start. Otherwise, you'll waste your time and end up with nothing but frustration.

Gather as much data as you can

Depending on the type and size of the software you're dealing with, you might want to gather as much data as you can before you start analyzing it. Profilers are a great source for this, but there are others, such as server logs from web applications, custom logs, system resource snapshots (like those from the OS task manager), and so on.

Preprocess your data

After you have all the information from your profilers, logs, and other sources, you will probably need to preprocess it before analyzing it. Don't shy away from unstructured data just because a profiler can't understand it; your analysis will benefit from the extra numbers.

For instance, getting the web server logs is a great idea if you're profiling a web application, but those files are normally just text files with one line per request. By parsing them and loading the data into some kind of database system (MongoDB, MySQL, or the like), you'll be able to give that data meaning (by parsing dates, geolocating source IP addresses, and so on) and query it afterwards.
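A sketch of this parse-and-load step might look like the following. The log lines here are made-up examples in Apache's common log format, and SQLite stands in for whatever database system you actually use:

```python
import re
import sqlite3

# Hypothetical sample lines in Apache "common log format".
LOG_LINES = [
    '203.0.113.7 - - [09/Sep/2015:18:04:21 +0000] "GET /index.html HTTP/1.1" 200 5123',
    '198.51.100.2 - - [09/Sep/2015:23:11:02 +0000] "POST /login HTTP/1.1" 302 211',
]

LOG_RE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<ts>[^\]]+)\] "(?P<method>\S+) (?P<path>\S+) [^"]+" '
    r'(?P<status>\d{3}) (?P<size>\d+)'
)

def etl(lines, db):
    """Extract fields from raw log lines, transform them into typed
    columns, and load them into a queryable SQLite table."""
    db.execute("CREATE TABLE requests (ip TEXT, ts TEXT, method TEXT, "
               "path TEXT, status INTEGER, size INTEGER)")
    for line in lines:
        m = LOG_RE.match(line)
        if m:  # skip lines that don't parse instead of crashing
            db.execute("INSERT INTO requests VALUES (?, ?, ?, ?, ?, ?)",
                       (m["ip"], m["ts"], m["method"], m["path"],
                        int(m["status"]), int(m["size"])))

conn = sqlite3.connect(":memory:")
etl(LOG_LINES, conn)
rows = conn.execute("SELECT method, COUNT(*) FROM requests GROUP BY method").fetchall()
print(rows)
```

Once the data is in a table, questions like "requests per hour" or "slowest endpoints" become one-line SQL queries instead of ad hoc text munging.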

The formal name for this stage is ETL, which stands for extracting the data from its sources, transforming it into something meaningful, and loading it into another system that you can later query.

Visualize your data

If you don't know exactly what you're looking for, and you're simply trying to optimize your code before something goes wrong, a great way to gain insight into the data you've already preprocessed is to visualize it. Computers are great with numbers, but we humans are far better at spotting patterns and extracting insight when the information is presented as an image.

For instance, to continue with the web server logs example, a simple plot (such as the ones you can do with MS Excel) for the requests by hour can provide some insight into the behavior of your users:

(Chart: web server requests by hour of day)

The preceding chart clearly shows that the majority of requests arrive in the late afternoon and continue into the night. You can use this insight later on for further profiling. For instance, one possible improvement to your setup here would be to provision more resources for your infrastructure during those hours (something that service providers such as Amazon Web Services make easy).
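As a rough sketch of the aggregation behind such a chart (the request hours here are made up, and a real report would use Excel or a plotting library rather than ASCII bars), counting requests per hour is a one-liner with `collections.Counter`:

```python
from collections import Counter

# Hypothetical request hours extracted from a parsed web server log.
hours = [17, 18, 18, 19, 19, 19, 20, 20, 21, 22, 22, 23]

per_hour = Counter(hours)
for hour in sorted(per_hour):
    # One '#' per request gives a quick, throwaway bar chart.
    print(f"{hour:02d}:00 {'#' * per_hour[hour]}")
```

The same `per_hour` mapping can be fed straight into a spreadsheet or a plotting call to produce the kind of chart described above.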

Another example, using custom profiling data, could be the following chart:

(Chart: count of each event type passed to the profile function)

It uses data from the first code example of this chapter, counting how many times each event type triggers the profile function. Plotting those counts gives us an idea of the most common events; in our case, the call and return events clearly dominate.
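The chapter's first code example registered a profile function with `sys.setprofile`; a reconstruction along those lines (the recursive `fib` workload is our own stand-in, not the book's) shows how such event counts can be gathered before plotting them:

```python
import sys
from collections import Counter

events = Counter()

def tracer(frame, event, arg):
    # Called by the interpreter for every profiling event:
    # 'call', 'return', 'c_call', 'c_return', or 'c_exception'.
    events[event] += 1

def fib(n):
    return n if n < 2 else fib(n - 1) + fib(n - 2)

sys.setprofile(tracer)
fib(10)
sys.setprofile(None)   # always uninstall the profile function when done

# 'call' and 'return' dominate, since every fib() invocation emits one of each.
for event, count in events.most_common():
    print(event, count)
```

Dumping `events` to a file after each run gives you exactly the kind of per-event data that the chart above visualizes.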

Summary

In this chapter, we covered the basics of profiling: what it is, why it matters, and how to leverage it to get the most out of your code.

In the next chapter, we'll start getting our hands dirty by looking at some Python profilers and how we can use them on our applications.


Description

Simply knowing how to code is not enough; on mission-critical pieces of code, every bit of memory and every CPU cycle counts, and knowing how to squeeze every bit of processing power out of your code is a crucial and sought-after skill. Nowadays, Python is used for many scientific projects, and sometimes the calculations done in those projects require some serious fine-tuning. Profilers are tools designed to help you measure the performance of your code and help you during the optimization process, so knowing how to use them and read their output is very handy. This book starts from the basics and progressively moves on to more advanced topics. You’ll learn everything from profiling all the way up to writing a real-life application and applying a full set of tools designed to improve it in different ways. In the middle, you’ll stop to learn about the major profilers used in Python and about some graphic tools to help you make sense of their output. You’ll then move from generic optimization techniques onto Python-specific ones, going over the main constructs of the language that will help you improve your speed without much of a change. Finally, the book covers some number-crunching-specific libraries and how to use them properly to get the best speed out of them. After reading this book, you will know how to take any Python code, profile it, find out where the bottlenecks are, and apply different techniques to remove them.

What you will learn

  • Master code optimization step by step and learn how to use different tools
  • Understand what a profiler is and how to read its output
  • Interpret visual output from profiling tools and improve the performance of your script
  • Use Cython to create fast applications using Python and C
  • Take advantage of PyPy to improve performance of Python code
  • Optimize number-crunching code with NumPy, Numba, Parakeet, and Pandas
Product Details

Publication date : Sep 09, 2015
Length: 260 pages
Edition : 1st
Language : English
ISBN-13 : 9781783989300




Table of Contents

9 Chapters
1. Profiling 101
2. The Profilers
3. Going Visual – GUIs to Help Understand Profiler Output
4. Optimize Everything
5. Multithreading versus Multiprocessing
6. Generic Optimization Options
7. Lightning Fast Number Crunching with Numba, Parakeet, and pandas
8. Putting It All into Practice
Index

Customer reviews

Rating distribution: 4.2 out of 5 (5 ratings)
5 star 20%
4 star 80%
3 star 0%
2 star 0%
1 star 0%

Its Me Hara, Jun 09, 2018 (5/5)
Good item. Good book for Python developers.
Amazon Verified review

Duncan W. Robinson, Oct 16, 2015 (4/5)
Though I’ve programmed in R for some time, I am relatively new to Python. “Mastering Python High Performance” helped me with some Python syntax issues. I found Chapter 4, “Optimize Everything”, which includes coverage of dictionary lookups, list comprehensions, and miscellaneous tips and tricks, the most useful. Chapter 5 was also interesting, as it helped me understand the pros and cons of multithreading versus multiprocessing. Overall, I found this a helpful and informative text for a Python programmer with some prior experience in the language.
Amazon Verified review

Mike Driscoll, Oct 15, 2015 (4/5)
When I was originally reading this book, I was interested in seeing how the author would speed up their code. He covers a wide variety of topics, which is good, but that also hampers the book, as none of the topics are covered in depth. He uses lots of small examples and shows how to profile before moving on to optimizing them. Let’s go over each of the chapters so you can get a taste of what the book covers. Full disclosure: I was a technical reviewer of the book for Packt.

Chapter one is all about profiling your code. It covers the differences between statistical and event-based profiling, what profiling is and why it’s important, bottlenecks, and memory leaks. It also goes over running time complexity (linear, factorial, quadratic, etc.) and profiling best practices.

Then we logically move into chapter two, where we learn about some profilers we can use with Python. The two that are covered are cProfile (included with Python) and line_profiler. The author demonstrates various ways to use cProfile to measure your code and also how to use Python’s pstats module, which is used for analyzing the results you receive from cProfile. Then the author moves on to using line_profiler and kernprof to analyze the same (or similar) examples used with cProfile. I think this is one of the best balanced chapters in the book and really quite interesting all by itself.

Chapter three goes into using visual tools to help you understand your profiler’s output. In this chapter, you will learn about KCacheGrind / pyprof2calltree and RunSnakeRun. For the most part, you’ll just learn how to use these tools to figure out what your data means.

In chapter four, we learn about optimizing your code. Topics covered are memoization, list comprehensions, generators, ctypes, loading your own custom C library, and some other tips and tricks. While this chapter is short, it has some good ideas and is worth reading or at least skimming.

Chapter five digs into multithreading and multiprocessing. You’ll learn about the pros and cons of each. You will also learn about the Global Interpreter Lock and how that affects you when you choose one of these techniques.

Chapter six goes into using PyPy and Cython and how they can be useful for additional optimizations to your code. I enjoyed this chapter, although I didn’t feel that PyPy got as much attention as Cython. There also weren’t very many coding examples.

If you’re into number crunching, then chapter seven is for you. It goes over how to use Numba, Parakeet, and pandas. Quite frankly, out of the three libraries, I had only ever heard of pandas. I personally don’t need to do a lot of number crunching in my line of work, but it was interesting to see how each of the libraries worked and get a general idea of what they could be used for.

Finally, in chapter eight, the author attempts to put it all together. This chapter probably should have been twice as long as it is so that he could have actually covered everything. But in the end, it covers just barely enough, and you do get to see a full example get optimized from beginning to end.

Overall, I enjoyed this book. I would recommend it to anyone who needs ideas for optimizing their Python code or just for learning about profiling in general.
Amazon Verified review

Adriano Ribeiro, Jan 23, 2016 (4/5)
I think this book is practical for impatient programmers. For me, the most interesting chapters were 4: Optimize Everything, 6: Generic Optimization Options, and 7: Lightning Fast Number Crunching with Numba, Parakeet, and pandas. A very good starting point.
Amazon Verified review

Chris, Dec 01, 2015 (4/5)
Mastering Python High Performance is a worthwhile buy for Python programmers who are looking to take their skills to a new level. It gives you a pretty good explanation of how the code is executed when it is run, which can be useful for anyone who wants more in-depth knowledge of computing. If I had one gripe, it would be that it seems to rush into the topic a bit too quickly, jumping into algorithms without sufficiently going over the topic. Overall, high-performance Python is a vast topic, and this book does a good enough job getting into the finer points of Python coding.
Amazon Verified review
