Yet more __init__() techniques
We'll take a look at a few other, more advanced __init__()
techniques. These aren't quite so universally useful as the techniques in the previous sections.
The following is a definition for the Player
class that uses two strategy objects and a table
object. This shows an unpleasant-looking __init__()
method:
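A sketch of what such a definition might look like; Table, BettingStrategy, and GameStrategy are assumed collaborator classes that aren't shown here:

```python
class Player:
    def __init__(self, table, bet_strategy, game_strategy):
        """Nothing but bookkeeping: copy each parameter to a
        same-named instance variable."""
        self.table = table
        self.bet_strategy = bet_strategy
        self.game_strategy = game_strategy
    # Game-playing methods that delegate to the table and the two
    # strategy objects would follow here.
```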
The __init__()
method for Player
seems to do little more than bookkeeping. We're simply transferring named parameters to same-named instance variables. With numerous parameters, this transfer amounts to a lot of redundant-looking code.
We can use this Player
class (and related objects) as follows:
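Assuming Table, Flat (a betting strategy), and GameStrategy classes along the lines of the sketch above, the usage might look like this:

```python
table = Table()
flat_bet = Flat()        # an assumed, simple betting strategy
dumb = GameStrategy()    # an assumed, simple game strategy
p = Player(table, flat_bet, dumb)
```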
We can provide a very short and very flexible initialization by simply transferring keyword argument values directly into the internal instance variables.
The following is a way to build a Player
class using keyword argument values:
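A sketch of that keyword-based version:

```python
class Player2:
    def __init__(self, **kw):
        """Expects table, bet_strategy, and game_strategy keyword arguments."""
        # Every keyword argument becomes a same-named instance variable.
        self.__dict__.update(kw)
```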
This sacrifices a great deal of readability for succinctness. It crosses over into a realm of potential obscurity.
Since the __init__()
method is reduced to one line, it removes a certain level of "wordiness" from the method. This wordiness, however, is transferred to each individual object constructor expression. We have to add the keywords to the object initialization expression since we're no longer using positional parameters, as shown in the following code snippet:
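Using the names from the earlier sketches:

```python
p2 = Player2(table=table, bet_strategy=flat_bet, game_strategy=dumb)
```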
Why do this?
It does have a potential advantage. A class defined like this is quite open to extension. We can, with only a few specific worries, supply additional keyword parameters to a constructor.
The following is the expected use case:
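Again using the sketched names:

```python
>>> p2 = Player2(table=table, bet_strategy=flat_bet, game_strategy=dumb)
```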
The following is a bonus use case:
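Sticking with the sketched names, and adding an extra log_name keyword (the string value here is just an example):

```python
>>> p2_log = Player2(table=table, bet_strategy=flat_bet,
...     game_strategy=dumb, log_name="Flat bet, dumb strategy")
```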
We've added a log_name
attribute without touching the class definition. This can be used, perhaps, as part of a larger statistical analysis. The Player2.log_name
attribute can be used to annotate logs or other collected data.
We are limited in what we can add; we can only add parameters that fail to conflict with the names already in use within the class. Some knowledge of the class implementation is required to create a subclass that doesn't abuse the set of keywords already in use. Since the **kw
parameter provides little information, we need to read carefully. In most cases, we'd rather trust the class to work than review the implementation details.
This kind of keyword-based initialization can be done in a superclass definition to make it slightly simpler to implement subclasses. We can avoid writing an additional __init__() method in each subclass when the unique feature of the subclass involves simple new instance variables.
The disadvantage of this is that we have obscure instance variables that aren't formally documented via a subclass definition. For just one small variable, an entire subclass might seem like too much programming overhead. However, one small variable often leads to a second and a third. Before long, we'll realize that a subclass would have been smarter than an extremely flexible superclass.
We can (and should) hybridize this with a mixed positional and keyword implementation as shown in the following code snippet:
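One possible sketch of such a hybrid; the Player3 name is just for illustration:

```python
class Player3:
    def __init__(self, table, bet_strategy, game_strategy, **extras):
        # Required parameters are positional and explicit.
        self.table = table
        self.bet_strategy = bet_strategy
        self.game_strategy = game_strategy
        # Any nonrequired values become instance variables, too.
        self.__dict__.update(extras)
```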
This is more sensible than a completely open definition. We've made the required parameters positional parameters. We've left any nonrequired parameters as keywords. This clarifies the use of any extra keyword arguments given to the __init__()
method.
This kind of flexible, keyword-based initialization depends on having relatively transparent class definitions. This openness to change requires some care to avoid name clashes that are hard to debug, because the keyword parameter names are open-ended.
Initialization with type validation
Type validation is rarely a sensible requirement. In a way, this might be a failure to fully understand Python. The notional objective is to validate that all of the arguments are of a proper type. The issue with trying to do this is that the definition of proper is often far too narrow to be truly useful.
This is different from validating that objects meet other criteria. Numeric range checking, for example, may be essential to prevent infinite loops.
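As a sketch of the kind of check that does pay off, consider a made-up Countdown class whose step size must be positive to guarantee termination:

```python
class Countdown:
    """Hypothetical example: a range check that prevents an infinite loop."""
    def __init__(self, start, step=1):
        if step <= 0:
            raise ValueError("step must be positive; otherwise run() never ends")
        self.start = start
        self.step = step

    def run(self):
        # Yields start, start - step, ... while the value stays positive.
        value = self.start
        while value > 0:
            yield value
            value -= self.step
```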
What can create problems is trying to do something like the following in an __init__()
method:
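A sketch of that anti-pattern; BettingStrategy and GameStrategy stand in for assumed superclass names:

```python
class ValidatingPlayer:
    def __init__(self, table, bet_strategy, game_strategy):
        # These checks force every argument into a narrow class hierarchy.
        assert isinstance(table, Table)
        assert isinstance(bet_strategy, BettingStrategy)
        assert isinstance(game_strategy, GameStrategy)
        self.table = table
        self.bet_strategy = bet_strategy
        self.game_strategy = game_strategy
```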
These isinstance() checks circumvent Python's normal duck typing.
We write a casino game simulation in order to experiment with endless variations on GameStrategy. These variations are so simple (merely four methods) that there's little real benefit to inheriting from a common superclass. We could define the classes independently, with no overall superclass at all.
The initialization error-checking shown in this example would force us to create subclasses merely to pass the error check. No usable code is inherited from the abstract superclass.
One of the biggest duck typing issues surrounds numeric types. Different numeric types will work in different contexts. Attempts to validate the types of arguments may prevent a perfectly sensible numeric type from working properly. When attempting validation, we have the following two choices in Python:
We write validation so that a relatively narrow collection of types is permitted, and someday the code will break because a new type that would have worked sensibly is prohibited
We eschew validation so that a broad collection of types is permitted, and someday the code will break because a type that would not work sensibly was used
Note that both choices are essentially the same: the code could break someday. It breaks either because a sensible type was prevented from being used or because a type that wasn't really sensible was used anyway.
Tip
Just allow it
Generally, it's considered better Python style to simply permit any type of data to be used.
We'll return to this in Chapter 4, The ABCs of Consistent Design.
The question is this: why restrict potential future use cases?
And the usual answer is that there's no good reason to restrict potential future use cases.
Rather than prevent a sensible, but possibly unforeseen, use case, we can provide documentation, testing, and debug logging to help other programmers understand any restrictions on the types that can be processed. We have to provide the documentation, logging, and test cases anyway, so there's minimal additional work involved.
The following is an example docstring that provides the expectations of the class:
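A sketch of how such a docstring might read:

```python
class Player:
    def __init__(self, table, bet_strategy, game_strategy, **extras):
        """Creates a new player associated with a table and configured
        with betting and game strategies.

        :param table: an instance of Table
        :param bet_strategy: an instance of BettingStrategy
        :param game_strategy: an instance of GameStrategy
        """
        self.table = table
        self.bet_strategy = bet_strategy
        self.game_strategy = game_strategy
        self.__dict__.update(extras)
```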
The programmer using this class has been warned about what the type restrictions are. The use of other types is permitted. If a type isn't compatible with the expected types, then things will break. Ideally, we'll use a tool like unittest or doctest to uncover the breakage.
Initialization, encapsulation, and privacy
The general Python policy regarding privacy can be summed up as follows: we're all adults here.
Object-oriented design makes an explicit distinction between interface and implementation. This is a consequence of the idea of encapsulation. A class encapsulates a data structure, an algorithm, an external interface, or something meaningful. The idea is to have the capsule separate the class-based interface from the implementation details.
However, no programming language reflects every design nuance. Python, typically, doesn't implement all design considerations as explicit code.
One aspect of a class design that is not fully carried into code is the distinction between the private (implementation) and public (interface) methods or attributes of an object. The notion of privacy in languages that support it (C++ or Java are two examples) is already quite complex. These languages include settings such as private, protected, and public as well as "not specified", which is a kind of semiprivate. The private keyword is often used incorrectly, making subclass definition needlessly difficult.
Python's notion of privacy is simple, as follows:
It's all essentially public. The source code is available. We're all adults. Nothing can be truly hidden.
Conventionally, we'll treat some names in a way that's less public. They're generally implementation details that are subject to change without notice, but there's no formal notion of private.
Names that begin with _
are honored as less public by some parts of Python. The help()
function generally ignores these methods. Tools such as Sphinx can conceal these names from documentation.
Python's internal names begin (and end) with __
. This is how Python internals are kept from colliding with application features above the internals. The collection of these internal names is fully defined by the language reference. Further, there's no benefit to trying to use __
to attempt to create a "super private" attribute or method in our code. All that happens is that we create a potential future problem if a release of Python ever starts using a name we chose for internal purposes. Also, we're likely to run afoul of the internal name mangling that is applied to these names.
The rules for the visibility of Python names are as follows:
Most names are public.
Names that start with _
are somewhat less public. Use them for implementation details that are truly subject to change.
Names that begin and end with __
are internal to Python. We never make these up; we use the names defined by the language reference.
Generally, the Python approach is to register the intent of a method (or attribute) using documentation and a well-chosen name. Often, the interface methods will have elaborate documentation, possibly including doctest
examples, while the implementation methods will have more abbreviated documentation and may not have doctest
examples.
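A sketch of how these conventions tend to look in practice; the Deck class here is made up for illustration:

```python
import random

class Deck:
    """A deck of cards.

    >>> d = Deck(["2S", "3S", "4S"])
    >>> d.deal() in ["2S", "3S", "4S"]
    True
    """
    def __init__(self, cards):
        self._cards = list(cards)   # leading _: an implementation detail
        self._shuffle()

    def deal(self):
        """Interface method: remove and return the top card."""
        return self._cards.pop()

    def _shuffle(self):
        # Implementation detail; a brief comment is documentation enough.
        random.shuffle(self._cards)
```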
For programmers new to Python, it's sometimes surprising that privacy is not more widely used. For programmers experienced in Python, it's surprising how many brain calories get burned sorting out private and public declarations that aren't really very helpful because the intent is obvious from the method names and the documentation.