In simple terms, we can say that docstrings are basically documentation embedded in the source code. A docstring is basically a literal string, placed somewhere in the code, with the intention of documenting that part of the logic.
Notice the emphasis on the word documentation. This subtlety is important because it's meant to represent explanation, not justification. Docstrings are not comments; they are documentation.
Having comments in the code is a bad practice for multiple reasons. First, comments represent our failure to express our ideas in the code. If we actually have to explain why or how we are doing something, then that code is probably not good enough. For starters, it fails to be self-explanatory. Second, it can be misleading. Worst than having to spend some time reading a complicated section is to read a comment on how it is supposed to work, and figuring out that the code actually does something different. People tend to forget to update comments when they change the code, so the comment next to the line that was just changed will be outdated, resulting in a dangerous misdirection.
Sometimes, on rare occasions, we cannot avoid having comments. Maybe there is an error on a third-party library that we have to circumvent. In those cases, placing a small but descriptive comment might be acceptable.
With docstrings, however, the story is different. Again, they do not represent comments, but the documentation of a particular component (a module, class, method, or function) in the code. Their use is not only accepted but also encouraged. It is a good practice to add docstrings whenever possible.
The reason why they are a good thing to have in the code (or maybe even required, depending on the standards of your project) is that Python is dynamically typed. This means that, for example, a function can take anything as the value for any of its parameters. Python will not enforce, nor check, anything like this. So, imagine that you find a function in the code that you know you will have to modify. You are even lucky enough that the function has a descriptive name, and that its parameters do as well. It might still not be quite clear what types you should pass to it. Even if this is the case, how are they expected to be used?
Here is where a good docstring might be of help. Documenting the expected input and output of a function is a good practice that will help the readers of that function understand how it is supposed to work.
Consider this good example from the standard library:
In [1]: dict.update??
Docstring:
D.update([E, ]**F) -> None. Update D from dict/iterable E and F.
If E is present and has a .keys() method, then does: for k in E: D[k] = E[k]
If E is present and lacks a .keys() method, then does: for k, v in E: D[k] = v
In either case, this is followed by: for k in F: D[k] = F[k]
Type: method_descriptor
Here, the docstring for the update method on dictionaries gives us useful information, and it is telling us that we can use it in different ways:
- We can pass something with a .keys() method (for example, another dictionary), and it will update the original dictionary with the keys from the object passed per parameter:
>>> d = {}
>>> d.update({1: "one", 2: "two"})
>>> d
{1: 'one', 2: 'two'}
- We can pass an iterable of pairs of keys and values, and we will unpack them to update:
>>> d.update([(3, "three"), (4, "four")])
>>> d
{1: 'one', 2: 'two', 3: 'three', 4: 'four'}
In any case, the dictionary will be updated with the rest of the keyword arguments passed to it.
This information is crucial for someone that has to learn and understand how a new function works, and how they can take advantage of it.
Notice that in the first example, we obtained the docstring of the function by using the double question mark on it (dict.update??). This is a feature of the IPython interactive interpreter. When this is called, it will print the docstring of the object you are expecting. Now, imagine that in the same way, we obtained help from this function of the standard library; how much easier could you make the lives of your readers (the users of your code), if you place docstrings on the functions you write so that others can understand their workings in the same way?
The docstring is not something separated or isolated from the code. It becomes part of the code, and you can access it. When an object has a docstring defined, this becomes part of it via its __doc__ attribute:
>>> def my_function():
... """Run some computation"""
... return None
...
>>> my_function.__doc__
'Run some computation'
This means that it is even possible to access it at runtime and even generate or compile documentation from the source code. In fact, there are tools for that. If you run Sphinx, it will create the basic scaffold for the documentation of your project. With the autodoc extension (sphinx.ext.autodoc) in particular, the tool will take the docstrings from the code and place them in the pages that document the function.
Once you have the tools in place to build the documentation, make it public so that it becomes part of the project itself. For open source projects, you can use read the docs, which will generate the documentation automatically per branch or version (configurable). For companies or projects, you can have the same tools or configure these services on-premise, but regardless of this decision, the important part is that the documentation should be ready and available to all members of the team.
There is, unfortunately, one downside to docstrings, and it is that, as it happens with all documentation, it requires manual and constant maintenance. As the code changes, it will have to be updated. Another problem is that for docstrings to be really useful, they have to be detailed, which requires multiple lines.
Maintaining proper documentation is a software engineering challenge that we cannot escape from. It also makes sense to be like this. If you think about it, the reason for documentation to be manually written is because it is intended to be read by other humans. If it were automated, it would probably not be of much use. For the documentation to be of any value, everyone on the team must agree that it is something that requires manual intervention, hence the effort required. The key is to understand that software is not just about code. The documentation that comes with it is also part of the deliverable. Therefore, when someone is making a change on a function, it is equally important to also update the corresponding part of the documentation to the code that was just changed, regardless of whether its a wiki, a user manual, a README file, or several docstrings.