You're reading from Clean Code in Python Refactor your legacy code base

Product type Paperback

Published in Aug 2018

Publisher Packt

ISBN-13 9781788835831

Length 332 pages

Edition 1st Edition

Languages

Python

Concepts

Object Oriented Programming

Author (1):

Mariano Anaya

View More author details

The importance of having clean code

There are a huge number of reasons why clean code is important. Most of them revolve around the ideas of maintainability, reducing technical debt, working effectively with agile development, and managing a successful project.

The first idea I would like to explore is in regards to agile development and continuous delivery. If we want our project to be able to successfully deliver features constantly at a steady and predictable pace, then having a good and maintainable code base is a must.

Imagine you are driving a car on a road toward a destination you want to reach at a certain point in time. You have to estimate your arrival time so that you can tell the person who is waiting for you. If the car works fine, and the road is flat and perfect, then I do not see why you would miss your estimation by a large margin. Now, if the road is broken and you have to step out to move rocks out of the way, or avoid cracks, stop to check the engine every few kilometers, and so on, then it is very unlikely that you will know for sure when are you going to arrive (or if you are). I think the analogy is clear; the road is the code. If you want to move at a steady, constant, and predictable pace, the code needs to be maintainable and readable. If it is not, every time product management asks for a new feature, you will have to stop to refactor and fix the technical debt.

Technical debt refers to the concept of problems in the software as a result of a compromise, and a bad decision being made. In a way, it's possible to think about technical debt in two ways. From the present to the past. What if the problems we are currently facing are the result of previously written bad code? From the present to the future—if we decide to take the shortcut now, instead of investing time in a proper solution, what problems are we creating for ourselves in the future?

The word debt is a good choice. It's a debt because the code will be harder to change in the future than it would be to change it now. That incurred cost is the interests of the debt. Incurring in technical debt means that tomorrow, the code will be harder and more expensive (it would be possible to even measure this) than today, and even more expensive the day after, and so on.

Every time the team cannot deliver something on time and has to stop to fix and refactor the code is paying the price of technical debt.

The worst thing about technical debt is that it represents a long-term and underlying problem. It is not something that raises a high alarm. Instead, it is a silent problem, scattered across all parts of the project, that one day, at one particular time, will wake up and become a show-stopper.

The role of code formatting in clean code

Is clean code about formatting and structuring the code, according to some standards (for example, PEP-8, or a custom standard defined by the project guidelines)? The short answer is no.

Clean code is something else that goes way beyond coding standards, formatting, linting tools, and other checks regarding the layout of the code. Clean code is about achieving quality software and building a system that is robust, maintainable, and avoiding technical debt. A piece of code or an entire software component could be 100% with PEP-8 (or any other guideline), and still not satisfy these requirements.

However, not paying attention to the structure of the code has some perils. For this reason, we will first analyze the problems with a bad code structure, how to address them, and then we will see how to configure and use tools for Python projects in order to automatically check and correct problems.

To sum this up, we can say that clean code has nothing to do with things like PEP-8 or coding styles. It goes way beyond that, and it means something more meaningful to the maintainability of the code and the quality of the software. However, as we will see, formatting the code correctly is important in order to work efficiently.

Adhering to a coding style guide on your project

A coding guideline is a bare minimum a project should have to be considered being developed under quality standards. In this section, we will explore the reasons behind this, so in the following sections, we can start looking at ways to enforce this automatically by the means of tools.

The first thing that comes to my mind when I try to find good traits in a code layout is consistency. I would expect the code to be consistently structured so that it is easier to read and follow. If the code is not correct or consistently structured, and everyone on the team is doing things in their own way, then we will end up with code that will require extra effort and concentration to be followed correctly. It will be error-prone, misleading, and bugs or subtleties might slip through easily.

We want to avoid that. What we want is exactly the opposite of that—code that we can read and understand as quickly as possible at a single glance.

If all members of the development team agree on a standardized way of structuring the code, the resulting code would look much more familiar. As a result of that, you will quickly identify patterns (more about this in a second), and with these patterns in mind, it will be much easier to understand things and detect errors. For example, when something is amiss, you will notice that somehow, there is something odd in the patterns you are used to seeing, which will catch your attention. You will take a closer look, and you will more than likely spot the mistake!

As it was stated in the classical book, Code Complete, an interesting analysis of this was done on the paper titled Perceptions in Chess (1973), where an experiment was conducted in order to identify how different people can understand or memorize different chess positions. The experiment was conducted on players of all levels (novices, intermediate, and chess masters), and with different chess positions on the board. They found out that when the position was random, the novices did as well as the chess masters; it was just a memorization exercise that anyone could do at reasonably the same level. When the positions followed a logical sequence that might occur in a real game (again, consistency, adhering to a pattern), then the chess masters performed exceedingly better than the rest.

Now imagine this same situation applied to software. We, as the software engineers experts in Python, are like the chess masters in the previous example. When the code is structured randomly, without following any logic, or adhering to any standard, then it would be as difficult for us to spot mistakes as a novice developer. On the other hand, if we are used to reading code in a structured fashion, and we have learned to quickly get the ideas from the code by following patterns, then we are at a considerable advantage.

In particular, for Python, the sort of coding style you should follow is PEP-8. You can extend it or adopt some of its parts to the particularities of the project you are working on (for example, the length of the line, the notes about strings, and so on). However, I do suggest that regardless of whether you are using just plain PEP-8 or extending it, you should really stick to it instead of trying to come up with another different standard from scratch.

The reason for this is that this document already takes into consideration many of the particularities of the syntax of Python (that would not normally apply for other languages), and it was created by core Python developers who actually contributed to the syntax of Python. For this reason, it is hard to think that the accuracy of PEP-8 can be otherwise matched, not to mention, improved.

In particular, PEP-8 has some characteristics that carry other nice improvements when dealing with code, such as following:

Grepability: This is the ability to grep tokens inside the code; that is, to search in certain files (and in which part of those files) for the particular string we are looking for. One of the items introduced by this standard is something that differentiates the way of writing the assignment of values to variables, from the keyword arguments being passed to functions.

To see this better, let's use an example. Let's say we are debugging, and we need to find where the value to a parameter named location is being passed. We can run the following grep command, and the result will tell us the file and the line we are looking for:

$ grep -nr "location=" . 
./core.py:13: location=current_location,

Now, we want to know where this variable is being assigned this value, and the following command will also give us the information we are looking for:

$ grep -nr "location =" .
./core.py:10: current_location = get_location()

PEP-8 establishes the convention that, when passing arguments by keyword to a function, we don't use spaces, but we do when we assign variables. For that reason, we can adapt our search criteria (no spaces around the = on the first search, and one space on the second) and be more efficient in our search. That is one of the advantages of following a convention.

Consistency: If the code looks like a uniform format, the reading of it will be much easier. This is particularly important for onboarding, if you want to welcome new developers to your project, or even hire new (and probably less experienced) programmers on your team, and they need to become familiar with the code (which might even consist of several repositories). It will make their lives much easier if the code layout, documentation, naming convention, and such is identical across all files they open, in all repositories.
Code quality: By looking at the code in a structured fashion, you will become more proficient at understanding it at a glance (again, like in Perception in Chess), and you will spot bugs and mistakes more easily. In addition to that, tools that check for the quality of the code will also hint at potential bugs. Static analysis of the code might help to reduce the ratio of bugs per line of code.

You're reading from Clean Code in Python Refactor your legacy code base

Table of Contents (12) Chapters

The importance of having clean code

The role of code formatting in clean code

Adhering to a coding style guide on your project

Authors (1)

Other recommended products

Personalised recommendations for you