You're reading from Modern Python Cookbook: The latest in modern Python recipes for the busy modern programmer

Product type Paperback

Published in Nov 2016

Publisher Packt

ISBN-13 9781786469250

Length 692 pages

Edition 1st Edition

Languages

Python

Concepts

Programming Language

Building complex strings with "template".format()

Creating complex strings is, in many ways, the polar opposite of parsing a complex string. We generally find that we'll use a template with substitution rules to put data into a more complex format.

Getting ready

Let's say we have pieces of data that we need to turn into a nicely formatted message. We might have data including the following:

>>> id = "IAD"
>>> location = "Dulles Intl Airport"
>>> max_temp = 32
>>> min_temp = 13
>>> precipitation = 0.4

And we'd like a line that looks like this:

IAD  : Dulles Intl Airport :   32 /  13 /  0.40

How to do it...

Create a template string from the result, replacing all of the data items with {} placeholders. Inside each placeholder, put the name of the data item.

      '{id} : {location} : {max_temp} / {min_temp} / {precipitation}'

For each data item, append :data type information to the placeholders in the template string. The basic data type codes are:
- s for string
- d for decimal integer
- f for floating-point number

It would look like this:

       '{id:s}  : {location:s} : {max_temp:d} / {min_temp:d} / {precipitation:f}'

Add length information where required. Length is not always required, and in some cases, it's not even desirable. In this example, though, the length information ensures that each message has a consistent format. For strings and decimal numbers, prefix the format with the length like this: 19s or 3d. For floating-point numbers, use a two-part prefix like this: 5.2f to specify a total length of five characters with two to the right of the decimal point. Here's the whole format:

      '{id:3s}  : {location:19s} :  {max_temp:3d} / {min_temp:3d} / {precipitation:5.2f}'

Use the format() method of this string to create the final string:

      >>> '{id:3s}  : {location:19s} :  {max_temp:3d} / {min_temp:3d} / {precipitation:5.2f}'.format(
      ... id=id, location=location, max_temp=max_temp,
      ... min_temp=min_temp, precipitation=precipitation
      ... )
      'IAD  : Dulles Intl Airport :   32 /  13 /  0.40'

We've provided all of the variables by name in the format() method of the template string. This can get tedious. In some cases, we might want to build a dictionary object with the variables. In that case, we can use the format_map() method:

>>> data = dict(
... id=id, location=location, max_temp=max_temp,
... min_temp=min_temp, precipitation=precipitation
... )
>>> '{id:3s}  : {location:19s} :  {max_temp:3d} / {min_temp:3d} / {precipitation:5.2f}'.format_map(data)
'IAD  : Dulles Intl Airport :   32 /  13 /  0.40'

We'll return to dictionaries in Chapter 4, Built-in Data Structures – list, set, dict.

The built-in vars() function builds a dictionary of all of the local variables for us:

>>> '{id:3s}  : {location:19s} :  {max_temp:3d} / {min_temp:3d} / {precipitation:5.2f}'.format_map(
...    vars()
... )
'IAD  : Dulles Intl Airport :   32 /  13 /  0.40'

The vars() function is very handy for building a dictionary automatically.

How it works...

The string format() and format_map() methods can do a lot of relatively sophisticated string assembly for us.

The basic feature is to interpolate data into a string based on names of keyword arguments or keys in a dictionary. Variables can also be interpolated by position—we can provide position numbers instead of names. We can use a format specification like {0:3s} to use the first positional argument to format().
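As a minimal sketch of positional interpolation, the following uses position numbers, automatic numbering with empty braces, and a mix of positions and keywords. The variable names and sample values here are illustrative assumptions that mirror the recipe's data.

```python
# Illustrative sample values, mirroring the recipe's data.
airport_id = "IAD"
location = "Dulles Intl Airport"

# {0} and {1} select positional arguments by index.
by_position = "{0:3s} : {1:19s}".format(airport_id, location)

# Empty braces are numbered automatically, left to right.
auto_numbered = "{:3s} : {:19s}".format(airport_id, location)

# A position can be reused, and positions can be mixed with keywords.
mixed = "{0} / {0} / {name}".format(airport_id, name=location)

print(by_position)
print(auto_numbered)
print(mixed)
```

Explicit position numbers and empty braces can't be combined in one template, so we generally pick one style and stay with it.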

We've seen three of the formatting conversions (s, d, and f), but there are many others. Details are in Section 6.1.3 of the Python Standard Library documentation. Here are some of the format conversions we might use:

- b is for binary, base 2.
- c is for Unicode character. The value must be a number, which is converted to a character. Often, we use hexadecimal numbers for this, so you might want to try values such as 0x2661 through 0x2666 for fun.
- d is for decimal integers.
- E and e are for scientific notation. The output looks like 6.626E-34 or 6.626e-34, depending on which E or e character is used.
- F and f are for floating-point. For not a number, the f format shows lowercase nan; the F format shows uppercase NAN.
- G and g are for general. This switches automatically between E and F (or e and f) based on the magnitude of the value and the precision. For a format of 20.5G, values whose decimal exponent is at least -4 and less than 5 are displayed using F-style formatting with five significant digits in a 20-character field. Values outside that range use E formatting.
- n is for locale-specific decimal numbers. This will insert , or . characters depending on the current locale settings. The default locale may not have a thousands separator defined. For more information, see the locale module.
- o is for octal, base 8.
- s is for string.
- X and x are for hexadecimal, base 16. The digits include uppercase A-F or lowercase a-f, depending on which X or x format character is used.
- % is for percentage. The number is multiplied by 100 and displayed with a trailing %.
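Here's a short sketch that exercises several of these conversions. The values are arbitrary illustrations, not part of the recipe's data, and the variable names are our own.

```python
# Illustrative values only; none of these come from the recipe's data.
binary = "{0:b}".format(42)          # base 2
octal = "{0:o}".format(42)           # base 8
hex_lower = "{0:x}".format(255)      # base 16, lowercase digits
hex_upper = "{0:X}".format(255)      # base 16, uppercase digits
heart = "{0:c}".format(0x2661)       # code point converted to a character
sci = "{0:.3e}".format(6.626e-34)    # scientific notation
percent = "{0:.1%}".format(0.25)     # multiplied by 100, with a trailing %
nan_lower = "{0:f}".format(float("nan"))
nan_upper = "{0:F}".format(float("nan"))

# With a precision of 5, g stays in fixed-point form while the
# decimal exponent is below 5...
general_fixed = "{0:.5g}".format(1234.5678)
# ...and switches to scientific notation for larger magnitudes.
general_sci = "{0:.5g}".format(1234567.0)

for text in (binary, octal, hex_lower, hex_upper, heart, sci, percent,
             nan_lower, nan_upper, general_fixed, general_sci):
    print(text)
```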

We have a number of prefixes we can use for these different types. The most common one is the length. We might use {name:5d} to put in a 5-digit number. There are several prefixes for the preceding types:

- Fill and alignment: We can specify a specific filler character (space is the default) and an alignment. Numbers are generally aligned to the right and strings to the left. We can change that using <, >, or ^. This forces left alignment, right alignment, or centering. There's a peculiar = alignment that's used to put padding after a leading sign.
- Sign: The default rule is a leading negative sign where needed. We can use + to put a sign on all numbers, - to put a sign only on negative numbers, and a space to use a space instead of a plus for positive numbers. In scientific output, we might use {value: 5.3f}. The space makes sure that room is left for the sign, ensuring that all the decimal points line up nicely.
- Alternate form: We can use the # to get an alternate form. We might have something like {0:#x}, {0:#o}, {0:#b} to get a prefix on hexadecimal, octal, or binary values. With a prefix, the numbers will look like 0xnnn, 0onnn, or 0bnnn. The default is to omit the two-character prefix.
- Leading zero: We can include 0 to get leading zeros to fill in the front of a number. Something like {code:08x} will produce a hexadecimal value with leading zeros to pad it out to eight characters.
- Width and precision: For integer values and strings, we only provide the width. For floating-point values, we often provide width.precision.
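The prefixes above can be sketched with a few small examples. The values and variable names are illustrative assumptions. Note that the width is a minimum: a value that needs more room is not truncated.

```python
# Fill and alignment: '*' as the fill character, '^' for centering.
centered = "{0:*^11s}".format("IAD")
# '>' right-aligns a string; '<' left-aligns a number.
right_string = "{0:>6s}".format("IAD")
left_number = "{0:<5d}|".format(42)
# '=' places the padding between the sign and the digits.
sign_then_pad = "{0:=+6d}".format(42)

# Sign: '+' shows a sign on every number.
plus_sign = "{0:+d} {1:+d}".format(13, -13)
# A space reserves room for the sign so decimal points line up.
space_sign = "{0: 6.2f}|{1: 6.2f}".format(0.4, -0.4)

# Alternate form: '#' adds the 0x, 0o, or 0b prefix.
alternate = "{0:#x} {0:#o} {0:#b}".format(10)

# Leading zero: pad a hexadecimal value out to eight characters.
zero_padded = "{0:08x}".format(0xBEEF)

# Width is a minimum; longer values are shown in full.
too_long = "{0:3s}".format("Dulles")

for text in (centered, right_string, left_number, sign_then_pad,
             plus_sign, space_sign, alternate, zero_padded, too_long):
    print(repr(text))
```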

There are some times when we won't use a {name:format} specification. Sometimes we'll need to use a {name!conversion} specification. There are only three conversions available.

{name!r} shows the representation that would be produced by repr(name)
{name!s} shows the string value that would be produced by str(name)
{name!a} shows the ASCII value that would be produced by ascii(name)
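A quick sketch of the three conversions, using a sample string with a non-ASCII character so the difference shows up. The sample value and variable names are our own illustrations.

```python
name = "Zürich"  # illustrative value with a non-ASCII character

as_str = "{0!s}".format(name)     # same text as str(name)
as_repr = "{0!r}".format(name)    # same text as repr(name), with quotes
as_ascii = "{0!a}".format(name)   # same text as ascii(name), with escapes

# A conversion can be followed by a format specification; the
# specification applies to the converted string.
padded_repr = "{0!r:>10}".format("IAD")

print(as_str)
print(as_repr)
print(as_ascii)
print(padded_repr)
```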

In Chapter 6, Basics of Classes and Objects, we'll leverage the idea of the {name!r} conversion to simplify displaying information about related objects.

There's more...

A handy debugging hack is this:

print("some_variable={some_variable!r}".format_map(vars()))

The vars() function—with no arguments—collects all of the local variables into a mapping. We provide that mapping for format_map(). The format template can use lots of {variable_name!r} to display details about various objects we have in local variables.
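To see the hack in context, here is a small sketch inside a function. The function name describe and its variables are hypothetical, chosen only for illustration.

```python
def describe(min_temp, max_temp):
    """Hypothetical helper that reports its own local variables."""
    spread = max_temp - min_temp
    # vars() with no arguments returns this function's local namespace,
    # so the template can name any local variable.
    return (
        "min_temp={min_temp!r} max_temp={max_temp!r} spread={spread!r}"
        .format_map(vars())
    )

message = describe(13, 32)
print(message)
```

Because vars() is called inside the function, only that function's locals are visible to the template. A name that isn't local raises a KeyError.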

Inside a class definition we can use techniques such as vars(self). This looks forward to Chapter 6, Basics of Classes and Objects:

>>> class Summary:
...     def __init__(self, id, location, min_temp, max_temp, precipitation):
...         self.id = id
...         self.location = location
...         self.min_temp = min_temp
...         self.max_temp = max_temp
...         self.precipitation = precipitation
...     def __str__(self):
...         return '{id:3s}  : {location:19s} :  {max_temp:3d} / {min_temp:3d} / {precipitation:5.2f}'.format_map(
...             vars(self)
...         )
>>> s = Summary('IAD', 'Dulles Intl Airport', 13, 32, 0.4)
>>> print(s)
IAD  : Dulles Intl Airport :   32 /  13 /  0.40

Our class definition includes a __str__() method. This method relies on vars(self) to create a useful dictionary of just the attributes of the object.
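As a rough illustration of what vars(self) contains, the sketch below uses a hypothetical Reading class (not part of the recipe). It shows that only the attributes set on the instance appear in the mapping; class-level names and methods do not.

```python
class Reading:
    """Hypothetical class used only to show what vars() returns."""

    units = "C"  # class-level attribute: not part of vars(instance)

    def __init__(self, min_temp, max_temp):
        self.min_temp = min_temp
        self.max_temp = max_temp

    def __str__(self):
        # vars(self) is the instance's attribute dictionary.
        return "{min_temp:3d} / {max_temp:3d}".format_map(vars(self))

r = Reading(13, 32)
attributes = vars(r)
text = str(r)

print(attributes)
print(text)
```

This relies on the instance having a __dict__. For objects without one, vars() raises a TypeError, so this technique applies to ordinary classes like the one shown here.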