The factory pattern
We will start with the first creational design pattern from the Gang of Four book: the factory design pattern. In the factory design pattern, a client (meaning client code) asks for an object without knowing where the object is coming from (that is, which class is used to generate it). The idea behind a factory is to simplify the object creation process. It is easier to track which objects are created if this is done through a central function, compared to letting a client create objects using a direct class instantiation. A factory reduces the complexity of maintaining an application by decoupling the code that creates an object from the code that uses it.
Factories typically come in two forms—the factory method, which is a method (or simply a function for a Python developer) that returns a different object per input parameter, and the abstract factory, which is a group of factory methods used to create a family of related objects.
Let’s discuss the two forms of the factory pattern, starting with the factory method.
The factory method
The factory method is based on a single function written to handle our object creation task. We execute it, passing a parameter that provides information about what we want, and, as a result, the wanted object is created.
Interestingly, when using the factory method, we are not required to know any details about how the resulting object is implemented and where it is coming from.
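To make the idea concrete, here is a minimal sketch that is not taken from the chapter's example files. The Circle and Square classes and the make_shape() function are hypothetical names chosen only for illustration. The client asks for a shape by name and never refers to a class directly:

```python
class Circle:
    def draw(self) -> str:
        return "drawing a circle"


class Square:
    def draw(self) -> str:
        return "drawing a square"


def make_shape(kind: str):
    """Factory method: return a shape object for the requested kind."""
    # Hypothetical mapping from a name to the class that implements it.
    shapes = {"circle": Circle, "square": Square}
    if kind not in shapes:
        raise ValueError(f"Unknown shape: {kind!r}")
    return shapes[kind]()


shape = make_shape("circle")
print(shape.draw())  # drawing a circle
```

Adding a new shape in this sketch means touching only the mapping inside make_shape(); the code that calls it stays the same.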
Real-world examples
We can find the factory method pattern used in real life in the context of a plastic toy construction kit. The molding material used to construct plastic toys is the same, but different toys (different figures or shapes) can be produced using the right plastic molds. This is like having a factory method in which the input is the name of the toy that we want (for example, a duck or car) and the output (after the molding) is the plastic toy that was requested.
In the software world, the Django web framework uses the factory method pattern for creating the fields of a web form. The forms module included in Django (https://github.com/django/django/blob/main/django/forms/forms.py) supports the creation of different kinds of fields (for example, CharField, EmailField, and so on). Parts of their behavior can be customized using attributes such as max_length and required.
Use cases for the factory method pattern
If you realize that you cannot track the objects created by your application because the code that creates them is in many different places instead of in a single function/method, you should consider using the factory method pattern. The factory method centralizes object creation and tracking your objects becomes much easier. Note that it is fine to create more than one factory method, and this is how it is typically done in practice. Each factory method logically groups the creation of objects that have similarities. For example, one factory method might be responsible for connecting you to different databases (MySQL, SQLite); another factory method might be responsible for creating the geometrical object that you request (circle, triangle); and so on.
The factory method is also useful when you want to decouple object creation from object usage. We are not coupled to a specific class when creating an object; we just provide partial information about what we want by calling a function. This means that introducing changes to the function is easy and does not require any changes to the code that uses it.
Another use case worth mentioning is related to improving the performance and memory usage of an application. A factory method can improve performance and memory usage by creating new objects only if it is necessary. When we create objects using a direct class instantiation, extra memory is allocated every time a new object is created (unless the class uses caching internally, which is usually not the case). We can see that in practice in the following code (ch03/factory/id.py), which creates two instances of the same class, MyClass, and uses the id() function to compare their memory addresses (in CPython, id() returns the address of the object in memory). The addresses are also printed in the output so that we can inspect them. The fact that the memory addresses are different means that two distinct objects are created. The code is as follows:

    class MyClass:
        pass


    if __name__ == "__main__":
        a = MyClass()
        b = MyClass()
        print(id(a) == id(b))
        print(id(a))
        print(id(b))

Executing the code (ch03/factory/id.py) on my computer results in the following output:

    False
    4330224656
    4331646704
Note
The addresses that you see when you execute the file are not the same as the ones I see because they depend on the current memory layout and allocation. But the result must be the same: the two addresses should be different. There’s one exception that happens if you write and execute the code in the Python Read-Eval-Print Loop (REPL), or, simply put, the interactive prompt, but that’s a REPL-specific optimization that does not happen normally.
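By contrast, a factory function can hand back an existing object instead of allocating a new one each time. The following is a minimal sketch under our own assumptions: the get_instance() function and the _instances cache are hypothetical names and are not part of ch03/factory/id.py.

```python
class MyClass:
    pass


# Cache of already-created objects, keyed by a caller-supplied name.
_instances = {}


def get_instance(key: str) -> MyClass:
    """Return the cached object for key, creating it only on first use."""
    if key not in _instances:
        _instances[key] = MyClass()
    return _instances[key]


a = get_instance("config")
b = get_instance("config")
c = get_instance("other")

print(a is b)  # True: the second call reuses the first object
print(a is c)  # False: a different key creates a new object
```

Whether such caching is appropriate depends on the objects involved; it only makes sense when sharing one instance between callers is safe.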
Implementing the factory method pattern
Data comes in many forms. There are two main file categories for storing/retrieving data: human-readable files and binary files. Examples of human-readable files are XML, RSS/Atom, YAML, and JSON. Examples of binary files are the .sq3 file format used by SQLite and the .mp3 audio file format used to listen to music.
In this example, we will focus on two popular human-readable formats—XML and JSON. Although human-readable files are generally slower to parse than binary files, they make data exchange, inspection, and modification much easier. For this reason, it is advised that you work with human-readable files unless there are other restrictions that do not allow it (mainly unacceptable performance or proprietary binary formats).
In this case, we have some input data stored in an XML and a JSON file, and we want to parse them and retrieve some information. At the same time, we want to centralize the client’s connection to those (and all future) external services. We will use the factory method to solve this problem. The example focuses only on XML and JSON, but adding support for more services should be straightforward.
First, let’s look at the data files.
The JSON file, movies.json, is a sample of a dataset containing information about American movies (title, year, director name, genre, and so on):

    [
      {
        "title": "After Dark in Central Park",
        "year": 1900,
        "director": null,
        "cast": null,
        "genre": null
      },
      {
        "title": "Boarding School Girls' Pajama Parade",
        "year": 1900,
        "director": null,
        "cast": null,
        "genre": null
      },
      {
        "title": "Buffalo Bill's Wild West Parad",
        "year": 1900,
        "director": null,
        "cast": null,
        "genre": null
      },
      {
        "title": "Caught",
        "year": 1900,
        "director": null,
        "cast": null,
        "genre": null
      },
      {
        "title": "Clowns Spinning Hats",
        "year": 1900,
        "director": null,
        "cast": null,
        "genre": null
      },
      {
        "title": "Capture of Boer Battery by British",
        "year": 1900,
        "director": "James H. White",
        "cast": null,
        "genre": "Short documentary"
      },
      {
        "title": "The Enchanted Drawing",
        "year": 1900,
        "director": "J. Stuart Blackton",
        "cast": null,
        "genre": null
      },
      {
        "title": "Family Troubles",
        "year": 1900,
        "director": null,
        "cast": null,
        "genre": null
      },
      {
        "title": "Feeding Sea Lions",
        "year": 1900,
        "director": null,
        "cast": "Paul Boyton",
        "genre": null
      }
    ]
The XML file, person.xml, contains information about individuals (firstName, lastName, gender, and so on), as follows:
- We start with the enclosing tag of the persons XML container:

    <persons>

- Then, an XML element representing a person’s data is presented as follows:

    <person>
      <firstName>John</firstName>
      <lastName>Smith</lastName>
      <age>25</age>
      <address>
        <streetAddress>21 2nd Street</streetAddress>
        <city>New York</city>
        <state>NY</state>
        <postalCode>10021</postalCode>
      </address>
      <phoneNumbers>
        <number type="home">212 555-1234</number>
        <number type="fax">646 555-4567</number>
      </phoneNumbers>
      <gender>
        <type>male</type>
      </gender>
    </person>

- An XML element representing another person’s data is shown by the following code:

    <person>
      <firstName>Jimy</firstName>
      <lastName>Liar</lastName>
      <age>19</age>
      <address>
        <streetAddress>18 2nd Street</streetAddress>
        <city>New York</city>
        <state>NY</state>
        <postalCode>10021</postalCode>
      </address>
      <phoneNumbers>
        <number type="home">212 555-1234</number>
      </phoneNumbers>
      <gender>
        <type>male</type>
      </gender>
    </person>

- An XML element representing a third person’s data is shown by the following code:

    <person>
      <firstName>Patty</firstName>
      <lastName>Liar</lastName>
      <age>20</age>
      <address>
        <streetAddress>18 2nd Street</streetAddress>
        <city>New York</city>
        <state>NY</state>
        <postalCode>10021</postalCode>
      </address>
      <phoneNumbers>
        <number type="home">212 555-1234</number>
        <number type="mobile">001 452-8819</number>
      </phoneNumbers>
      <gender>
        <type>female</type>
      </gender>
    </person>

- Finally, we close the XML container:

    </persons>
We will use two libraries that are part of the Python distribution for working with JSON and XML: json and xml.etree.ElementTree.
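Before looking at the extractor classes, it may help to see what these two libraries return. The short sketch below is our own illustration, not code from the chapter's files; it reads from in-memory strings (via io.StringIO) instead of the movies.json and person.xml files so that it runs on its own.

```python
import io
import json
import xml.etree.ElementTree as ET

# json.load() reads from a file-like object and returns plain Python objects.
movies = json.load(io.StringIO('[{"title": "Caught", "year": 1900}]'))

# ET.parse() also accepts a file-like object and returns an ElementTree.
tree = ET.parse(io.StringIO("<persons><person/></persons>"))

print(type(movies).__name__)  # list
print(type(tree).__name__)    # ElementTree
print(tree.getroot().tag)     # persons
```

The two results have different types (a list of dictionaries versus a tree of elements), which matters later when we write the code that consumes them.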
We start by importing what we need for the various manipulations (json, ElementTree, and pathlib), and we define a JSONDataExtractor class, loading the data from the file and using the parsed_data property to get it. That part of the code is as follows:

    import json
    import xml.etree.ElementTree as ET
    from pathlib import Path


    class JSONDataExtractor:
        def __init__(self, filepath: Path):
            self.data = {}
            with open(filepath) as f:
                self.data = json.load(f)

        @property
        def parsed_data(self):
            return self.data
We also define an XMLDataExtractor class, loading the data in the file via ElementTree’s parser, and using the parsed_data property to get the result, as follows:

    class XMLDataExtractor:
        def __init__(self, filepath: Path):
            self.tree = ET.parse(filepath)

        @property
        def parsed_data(self):
            return self.tree
Now, we provide the factory function that helps select the right data extractor class depending on the target file’s extension (or raise an exception if it is not supported), as follows:

    def extract_factory(filepath: Path):
        ext = filepath.name.split(".")[-1]
        if ext == "json":
            return JSONDataExtractor(filepath)
        elif ext == "xml":
            return XMLDataExtractor(filepath)
        else:
            raise ValueError("Cannot extract data")

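The if/elif chain can also be written as a lookup table. The sketch below is one possible variation, not the code from the chapter's file: it uses Path.suffix instead of splitting the file name, and the two stand-in classes (JSONStub and XMLStub), the EXTRACTORS mapping, and the extract_factory_by_suffix() name are hypothetical placeholders so that the block runs on its own.

```python
from pathlib import Path


class JSONStub:
    """Hypothetical stand-in for JSONDataExtractor (does no parsing)."""

    def __init__(self, filepath: Path):
        self.filepath = filepath


class XMLStub:
    """Hypothetical stand-in for XMLDataExtractor (does no parsing)."""

    def __init__(self, filepath: Path):
        self.filepath = filepath


# Map a file suffix (including the leading dot) to the class that handles it.
EXTRACTORS = {".json": JSONStub, ".xml": XMLStub}


def extract_factory_by_suffix(filepath: Path):
    """Return an extractor for filepath, chosen by its suffix."""
    extractor_class = EXTRACTORS.get(filepath.suffix.lower())
    if extractor_class is None:
        raise ValueError(f"Cannot extract data from {filepath.name}")
    return extractor_class(filepath)


print(type(extract_factory_by_suffix(Path("movies.json"))).__name__)  # JSONStub
print(type(extract_factory_by_suffix(Path("person.xml"))).__name__)   # XMLStub
```

One behavioral difference is worth knowing: for a file named json with no extension, filepath.name.split(".")[-1] yields "json", whereas Path("json").suffix is an empty string, so the lookup version rejects that file.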
Next, we define the main function of our program, extract(); in the first part of the function, the code handles the JSON case, as follows:

    def extract(case: str):
        dir_path = Path(__file__).parent

        if case == "json":
            path = dir_path / Path("movies.json")
            factory = extract_factory(path)
            data = factory.parsed_data

            for movie in data:
                print(f"- {movie['title']}")
                director = movie["director"]
                if director:
                    print(f" Director: {director}")
                genre = movie["genre"]
                if genre:
                    print(f" Genre: {genre}")
We add the final part of the extract() function, working with the XML file using the factory method. XPath is used to find all person elements that have the last name Liar. For each matched person, the basic name and phone number information are shown. The code is as follows:

        elif case == "xml":
            path = dir_path / Path("person.xml")
            factory = extract_factory(path)
            data = factory.parsed_data

            search_xpath = ".//person[lastName='Liar']"
            items = data.findall(search_xpath)
            for item in items:
                first = item.find("firstName").text
                last = item.find("lastName").text
                print(f"- {first} {last}")
                for pn in item.find("phoneNumbers"):
                    pn_type = pn.attrib["type"]
                    pn_val = pn.text
                    phone = f"{pn_type}: {pn_val}"
                    print(f" {phone}")

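The XPath predicate used in the XML branch can be tried in isolation. The following sketch is our own illustration: it parses a shortened, in-memory version of the person data (only the name elements are kept) rather than the person.xml file.

```python
import xml.etree.ElementTree as ET

# A trimmed-down copy of the data, kept in a string for this sketch only.
xml_text = (
    "<persons>"
    "<person><firstName>John</firstName><lastName>Smith</lastName></person>"
    "<person><firstName>Jimy</firstName><lastName>Liar</lastName></person>"
    "<person><firstName>Patty</firstName><lastName>Liar</lastName></person>"
    "</persons>"
)
root = ET.fromstring(xml_text)

# [lastName='Liar'] keeps only person elements that have a lastName child
# whose text is exactly "Liar".
matches = root.findall(".//person[lastName='Liar']")
names = [person.find("firstName").text for person in matches]
print(names)  # ['Jimy', 'Patty']
```

ElementTree supports only a limited subset of XPath, so more elaborate queries may need a different approach.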
Finally, we add some testing code:

    if __name__ == "__main__":
        print("* JSON case *")
        extract(case="json")
        print("* XML case *")
        extract(case="xml")

Here is a summary of the implementation (in the ch03/factory/factory_method.py file):
- After importing the modules we need, we start by defining a JSON data extractor class (JSONDataExtractor) and an XML data extractor class (XMLDataExtractor).
- We add a factory function, extract_factory(), to get the right data extractor class to instantiate.
- We also add our wrapper and main function, extract().
- Finally, we add testing code, where we extract data from a JSON file and an XML file and parse the resulting text.
To test the example, run the following command:
python ch03/factory/factory_method.py
You should get the following output:

    * JSON case *
    - After Dark in Central Park
    - Boarding School Girls' Pajama Parade
    - Buffalo Bill's Wild West Parad
    - Caught
    - Clowns Spinning Hats
    - Capture of Boer Battery by British
     Director: James H. White
     Genre: Short documentary
    - The Enchanted Drawing
     Director: J. Stuart Blackton
    - Family Troubles
    - Feeding Sea Lions
    * XML case *
    - Jimy Liar
     home: 212 555-1234
    - Patty Liar
     home: 212 555-1234
     mobile: 001 452-8819

Notice that although JSONDataExtractor and XMLDataExtractor have the same interface, what is returned by the parsed_data property is not handled in a uniform way; in one case we have a list, and in the other, we have a tree. Different Python code must be used to work with each data extractor. Although it would be nice to be able to use the same code for all extractors, this is not realistic for the most part unless we use some kind of common mapping for the data, which is often provided by external data providers. Assuming that you can use the same code for handling the XML and JSON files, what changes are required to support a third format, for example, SQLite? Find an SQLite file or create your own and try it.
Should you use the factory method pattern?
The main critique that veteran Python developers often express toward the factory method pattern is that it can be considered over-engineered or unnecessarily complex for many use cases. Python’s dynamic typing and first-class functions often allow for simpler, more straightforward solutions to problems that the factory method aims to solve. In Python, you can often use simple functions or class methods to create objects directly without needing to create separate factory classes or functions. This keeps the code more readable and Pythonic, adhering to the language’s philosophy of “Simple is better than complex.”
Also, Python’s support for default arguments, keyword arguments, and other language features often makes it easier to extend constructors in a backward-compatible way, reducing the need for separate factory methods. So, while the factory method pattern is a well-established design pattern in statically typed languages such as Java or C++, it is often seen as too cumbersome or verbose for Python’s more flexible and dynamic nature.
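The following sketch illustrates both points with a hypothetical Movie class of our own (it is not part of the chapter's example files): a default argument lets us add a field without breaking existing calls, and a class method serves as an alternative constructor in place of a separate factory.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass
class Movie:
    title: str
    year: int
    # Added later with a default, so older two-argument calls keep working.
    director: Optional[str] = None

    @classmethod
    def from_json(cls, text: str) -> "Movie":
        """Alternative constructor: build a Movie from a JSON object string."""
        return cls(**json.loads(text))


old_style = Movie("Caught", 1900)
from_text = Movie.from_json(
    '{"title": "The Enchanted Drawing", "year": 1900,'
    ' "director": "J. Stuart Blackton"}'
)

print(old_style.director)  # None
print(from_text.director)  # J. Stuart Blackton
```

Note that from_json() in this sketch assumes the JSON keys match the field names exactly; extra keys would raise a TypeError.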
To show how one could deal with simple use cases without the factory method pattern, an alternative implementation has been provided in the ch03/factory/factory_method_not_needed.py file. As you can see, there is no more factory. And the following extract from the code shows what we mean when we say that in Python, you just create objects where you need them, without an intermediary function or class, which makes your code more Pythonic:

    if case == "json":
        path = dir_path / Path("movies.json")
        data = JSONDataExtractor(path).parsed_data

The abstract factory pattern
The abstract factory pattern is a generalization of the factory method idea. Basically, an abstract factory is a (logical) group of factory methods, where each factory method is responsible for generating a different kind of object.
We are going to discuss some examples, use cases, and a possible implementation.
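As a first, deliberately small illustration of "a group of factory methods", consider the sketch below. It is our own example, separate from the game implemented later in this section; the WidgetFactory, LightThemeFactory, DarkThemeFactory, and build_form() names are hypothetical, and the "products" are plain strings to keep it short.

```python
from abc import ABC, abstractmethod
from typing import List


class WidgetFactory(ABC):
    """Abstract factory: one factory method per kind of product."""

    @abstractmethod
    def make_button(self) -> str: ...

    @abstractmethod
    def make_checkbox(self) -> str: ...


class LightThemeFactory(WidgetFactory):
    def make_button(self) -> str:
        return "light button"

    def make_checkbox(self) -> str:
        return "light checkbox"


class DarkThemeFactory(WidgetFactory):
    def make_button(self) -> str:
        return "dark button"

    def make_checkbox(self) -> str:
        return "dark checkbox"


def build_form(factory: WidgetFactory) -> List[str]:
    # Client code depends only on the factory interface, so every
    # product it receives belongs to the same family.
    return [factory.make_button(), factory.make_checkbox()]


print(build_form(LightThemeFactory()))  # ['light button', 'light checkbox']
print(build_form(DarkThemeFactory()))   # ['dark button', 'dark checkbox']
```

Using an abstract base class is optional in Python; as the game example in this section shows, two classes that simply share method names work just as well.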
Real-world examples
The abstract factory is used in car manufacturing. The same machinery is used for stamping the parts (doors, panels, hoods, fenders, and mirrors) of different car models. The model that is assembled by the machinery is configurable and easy to change at any time.
In the software category, the factory_boy package (https://github.com/FactoryBoy/factory_boy) provides an abstract factory implementation for creating Django models in tests. An alternative tool is model_bakery (https://github.com/model-bakers/model_bakery). Both packages are used for creating instances of models that support test-specific attributes. This is important because, this way, the readability of your tests is improved, and you avoid sharing unnecessary code.
Note
Django models are special classes used by the framework to help store and interact with data in the database (tables). See the Django documentation (https://docs.djangoproject.com) for more details.
Use cases for the abstract factory pattern
Since the abstract factory pattern is a generalization of the factory method pattern, it offers the same benefits: it makes tracking object creation easier, it decouples object creation from object usage, and it gives us the potential to improve the memory usage and performance of our application.
Implementing the abstract factory pattern
To demonstrate the abstract factory pattern, I will reuse one of my favorite examples, included in the book Python 3 Patterns, Recipes and Idioms, by Bruce Eckel. Imagine that we are creating a game or we want to include a mini-game as part of our application to entertain our users. We want to include at least two games, one for children and one for adults. We will decide which game to create and launch at runtime, based on user input. An abstract factory takes care of the game creation part.
Let’s start with the kids’ game. It is called FrogWorld. The main hero is a frog who enjoys eating bugs. Every hero needs a good name, and in our case, the name is given by the user at runtime. The interact_with() method is used to describe the interaction of the frog with an obstacle (for example, a bug, a puzzle, or another frog) as follows:

    class Frog:
        def __init__(self, name):
            self.name = name

        def __str__(self):
            return self.name

        def interact_with(self, obstacle):
            act = obstacle.action()
            msg = f"{self} the Frog encounters {obstacle} and {act}!"
            print(msg)
There can be many kinds of obstacles, but for our example, an obstacle can only be a bug. When the frog encounters a bug, only one action is supported. It eats it:

    class Bug:
        def __str__(self):
            return "a bug"

        def action(self):
            return "eats it"

The FrogWorld class is an abstract factory. Its main responsibilities are creating the main character and the obstacle(s) in the game. Keeping the creation methods separate and their names generic (for example, make_character() and make_obstacle()) allows us to change the active factory (and, therefore, the active game) dynamically without any code changes. The code is as follows:

    class FrogWorld:
        def __init__(self, name):
            print(self)
            self.player_name = name

        def __str__(self):
            return "\n\n\t------ Frog World -------"

        def make_character(self):
            return Frog(self.player_name)

        def make_obstacle(self):
            return Bug()
The WizardWorld game is similar. The only difference is that the wizard battles against monsters such as orks instead of eating bugs!
Here is the definition of the Wizard class, which is similar to the Frog one:

    class Wizard:
        def __init__(self, name):
            self.name = name

        def __str__(self):
            return self.name

        def interact_with(self, obstacle):
            act = obstacle.action()
            msg = f"{self} the Wizard battles against {obstacle} and {act}!"
            print(msg)
Then, the definition of the Ork class is as follows:

    class Ork:
        def __str__(self):
            return "an evil ork"

        def action(self):
            return "kills it"
We also need to define a WizardWorld class, similar to the FrogWorld one that we have discussed; the obstacle, in this case, is an Ork instance:

    class WizardWorld:
        def __init__(self, name):
            print(self)
            self.player_name = name

        def __str__(self):
            return "\n\n\t------ Wizard World -------"

        def make_character(self):
            return Wizard(self.player_name)

        def make_obstacle(self):
            return Ork()
The GameEnvironment class is the main entry point of our game. It accepts the factory as an input and uses it to create the world of the game. The play() method initiates the interaction between the created hero and the obstacle, as follows:

    class GameEnvironment:
        def __init__(self, factory):
            self.hero = factory.make_character()
            self.obstacle = factory.make_obstacle()

        def play(self):
            self.hero.interact_with(self.obstacle)
The validate_age() function prompts the user to give a valid age. If the age is not valid, it returns a tuple with the first element set to False. If the age is fine, the first element of the tuple is set to True, and that’s the case where we care about the second element of the tuple, which is the age given by the user, as follows:

    def validate_age(name):
        age = None
        age_input = input(f"Welcome {name}. How old are you? ")
        try:
            age = int(age_input)
        except ValueError:
            # Report what the user typed; age is still None at this point
            print(f"Age {age_input} is invalid, please try again...")
            return False, age
        return True, age

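Because validate_age() calls input(), its return contract is awkward to try out in isolation. The sketch below separates the parsing step into a hypothetical helper, parse_age(), which is our own addition and not part of ch03/factory/abstract_factory.py; it returns the same kind of (is_valid, age) tuple.

```python
from typing import Optional, Tuple


def parse_age(text: str) -> Tuple[bool, Optional[int]]:
    """Return (True, age) if text is a whole number, else (False, None)."""
    try:
        return True, int(text)
    except ValueError:
        return False, None


for raw in ("13", "thirteen"):
    ok, age = parse_age(raw)
    print(raw, ok, age)
# 13 True 13
# thirteen False None
```

Like the original function, this sketch only checks that the text is an integer; it does not reject values such as negative ages.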
Finally comes the main() function definition, followed by calling it. It asks for the user’s name and age and decides which game should be played, given the age of the user, as follows:

    def main():
        name = input("Hello. What's your name? ")
        valid_input = False
        while not valid_input:
            valid_input, age = validate_age(name)
        game = FrogWorld if age < 18 else WizardWorld
        environment = GameEnvironment(game(name))
        environment.play()


    if __name__ == "__main__":
        main()
The summary for the implementation we just discussed (see the complete code in the ch03/factory/abstract_factory.py file) is as follows:
- We define Frog and Bug classes for the FrogWorld game.
- We add a FrogWorld class, where we use our Frog and Bug classes.
- We define Wizard and Ork classes for the WizardWorld game.
- We add a WizardWorld class, where we use our Wizard and Ork classes.
- We define a GameEnvironment class.
- We add a validate_age() function.
- Finally, we have the main() function, followed by the conventional trick for calling it. The following are the aspects of this function:
  - We get the user’s input for name and age.
  - We decide which game class to use based on the user’s age.
  - We instantiate the right game class, and then the GameEnvironment class.
  - We call .play() on the environment object to play the game.
Let’s call this program using the python ch03/factory/abstract_factory.py command and see some sample output.
The sample output for a teenager is as follows:

    Hello. What's your name? Arthur
    Welcome Arthur. How old are you? 13


        ------ Frog World -------
    Arthur the Frog encounters a bug and eats it!

The sample output for an adult is as follows:

    Hello. What's your name? Tom
    Welcome Tom. How old are you? 34


        ------ Wizard World -------
    Tom the Wizard battles against an evil ork and kills it!

Try extending the game to make it more complete. You can go as far as you want; create many obstacles, many enemies, and whatever else you like.
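One possible direction, sketched under our own assumptions, is to let a world produce more than one kind of obstacle and to play several rounds. The Puzzle class, the rng and rounds parameters, and the returned messages are hypothetical additions; the Frog, Bug, FrogWorld, and GameEnvironment classes are simplified copies so that the block runs on its own.

```python
import random


class Bug:
    def __str__(self):
        return "a bug"

    def action(self):
        return "eats it"


class Puzzle:
    """Hypothetical second obstacle, added for this sketch."""

    def __str__(self):
        return "a puzzle"

    def action(self):
        return "solves it"


class Frog:
    def __init__(self, name):
        self.name = name

    def __str__(self):
        return self.name

    def interact_with(self, obstacle):
        msg = f"{self} the Frog encounters {obstacle} and {obstacle.action()}!"
        print(msg)
        return msg


class FrogWorld:
    def __init__(self, name, rng=None):
        self.player_name = name
        # An injectable random generator keeps the obstacle choice testable.
        self.rng = rng if rng is not None else random.Random()

    def make_character(self):
        return Frog(self.player_name)

    def make_obstacle(self):
        # Pick one of the obstacle classes at random and instantiate it.
        return self.rng.choice([Bug, Puzzle])()


class GameEnvironment:
    def __init__(self, factory, rounds=3):
        self.hero = factory.make_character()
        self.obstacles = [factory.make_obstacle() for _ in range(rounds)]

    def play(self):
        return [self.hero.interact_with(o) for o in self.obstacles]


GameEnvironment(FrogWorld("Arthur"), rounds=3).play()
```

GameEnvironment still talks only to make_character() and make_obstacle(), so a WizardWorld extended in the same way could be swapped in without changing it.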