Search icon CANCEL
Subscription
0
Cart icon
Your Cart (0 item)
Close icon
You have no products in your basket yet
Arrow left icon
Explore Products
Best Sellers
New Releases
Books
Videos
Audiobooks
Learning Hub
Conferences
Free Learning
Arrow right icon
Arrow up icon
GO TO TOP
Mastering Python High Performance

You're reading from   Mastering Python High Performance Learn how to optimize your code and Python performance with this vital guide to Python performance profiling and benchmarking

Arrow left icon
Product type Paperback
Published in Sep 2015
Publisher
ISBN-13 9781783989300
Length 260 pages
Edition 1st Edition
Languages
Arrow right icon
Author (1):
Arrow left icon
Fernando Donglio Fernando Donglio
Author Profile Icon Fernando Donglio
Fernando Donglio
Arrow right icon
View More author details
Toc

What can we profile?

Going deeper into profiling, it is very important to understand what we can actually profile. Measuring is the core of profiling, so let's take a detailed look at the things we can measure during a program's execution.

Execution time

The most basic of the numbers we can gather when profiling is the execution time. The execution time of the entire process or just of a particular portion of the code will shed some light on its own. If you have experience in the area your program is running (that is, you're a web developer and you're working on a web framework), you probably already know what it means for your system to take too much time. For instance, a simple web server might take up to 100 milliseconds when querying the database, rendering the response, and sending it back to the client. However, if the same piece of code starts to slow down and now it takes 60 seconds to do the same task, then you should start thinking about profiling. You also have to consider that numbers here are relative. Let's assume another process: a MapReduce job that is meant to process 2 TB of information stored on a set of text files takes 20 minutes. In this case, you might not consider it as a slow process, even when it takes considerably more time than the slow web server mentioned earlier.

To get this type of information, you don't really need a lot of profiling experience or even complex tools to get the numbers. Just add the required lines into your code and run the program.

For instance, the following code will calculate the Fibonnacci sequence for the number 30:

import datetime

tstart = None
tend = None

def start_time():
    global tstart
    tstart = datetime.datetime.now()
def get_delta():
    global tstart
    tend = datetime.datetime.now()
    return tend - tstart
    
  

 def fib(n):
     return n if n == 0 or n == 1 else fib(n-1) + fib(n-2)

def fib_seq(n):
    seq = [ ]
    if n > 0:
        seq.extend(fib_seq(n-1))
    seq.append(fib(n))
    return seq

start_time()
print "About to calculate the fibonacci sequence for the number 30"
delta1 = get_delta()

start_time()
seq = fib_seq(30) 
delta2 = get_delta()

print "Now we print the numbers: "
start_time()
for n in seq:
    print n
delta3 = get_delta()

print "====== Profiling results ======="
print "Time required to print a simple message: %(delta1)s" % locals()
print "Time required to calculate fibonacci: %(delta2)s" % locals()
print "Time required to iterate and print the numbers: %(delta3)s" % locals()
print "======  ======="

Now, the code will produce the following output:

About to calculate the Fibonacci sequence for the number 30
Now we print the numbers: 
0
1
1
2
3
5
8
13
21
#...more numbers
4181
6765
10946
17711
28657
46368
75025
121393
196418
317811
514229
832040
====== Profiling results =======
Time required to print a simple message: 0:00:00.000030
Time required to calculate fibonacci: 0:00:00.642092
Time required to iterate and print the numbers: 0:00:00.000102

Based on the last three lines, we see the obvious results: the most expensive part of the code is the actual calculation of the Fibonacci sequence.

Tip

Downloading the example code

You can download the example code files from your account at http://www.packtpub.com for all the Packt Publishing books you have purchased. If you purchased this book elsewhere, you can visit http://www.packtpub.com/support and register to have the files e-mailed directly to you.

Where are the bottlenecks?

Once you've measured how much time your code needs to execute, you can profile it by paying special attention to the slow sections. These are the bottlenecks, and normally, they are related to one or a combination of the following reasons:

  • Heavy I/O operations, such as reading and parsing big files, executing long-running database queries, calling external services (such as HTTP requests), and so on
  • Unexpected memory leaks that start building up until there is no memory left for the rest of the program to execute properly
  • Unoptimized code that gets executed frequently
  • Intensive operations that are not cached when they could be

I/O-bound code (file reads/write, database queries, and so on) is usually harder to optimize, because that would imply changing the way the program is dealing with that I/O (normally using core functions from the language). Instead, when optimizing compute-bound code (like a function that is using a badly implemented algorithm), getting a performance improvement is easier (although not necessarily easy). This is because it just implies rewriting it.

A general indicator that you're near the end of a performance optimization process is when most of the bottlenecks left are due to I/O-bound code.

You have been reading a chapter from
Mastering Python High Performance
Published in: Sep 2015
Publisher:
ISBN-13: 9781783989300
Register for a free Packt account to unlock a world of extra content!
A free Packt account unlocks extra newsletters, articles, discounted offers, and much more. Start advancing your knowledge today.
Unlock this book and the full library FREE for 7 days
Get unlimited access to 7000+ expert-authored eBooks and videos courses covering every tech area you can think of
Renews at $19.99/month. Cancel anytime
Banner background image