Modern Python Practices

Page content

Python has a long history, and it has evolved over time. This article describes some agreed modern best practices.

Tooling

Use Python 3

Use Python 3.9 or above for new work. Third-party libraries may require Python 3.6 or above, and several features of modern Python work best with version 3.7 or later.

Avoid using Python 2. It is not supported by the Python development team. The current versions of many libraries are not compatible with Python 2.

Use Virtual Environments for Development

The virtual environments feature enables you to define separate sets of packages for each Python project, so that they do not conflict with each other.

There are several tools that help you to manage your Python projects, and use virtual environments. Poetry is currently popular, but it may be superseded by Hatch. If you prefer, you can also manually set up and manage virtual environments.

Format Your Code

Use a formatting tool with a plugin to your editor, so that your code is automatically formatted to a consistent style.

If possible, use Black to format your code. Black is now the leading code formatter for Python. Black has been adopted by the Python Software Foundation and other major Python projects. It formats Python code to a style that follows the PEP 8 standard, but allows longer line lengths.

Language Syntax

Format Strings with f-strings

The new f-string syntax is both more readable and has better performance than older methods. Use f-strings instead of % formatting, str.format() or str.Template().

The older features for formatting strings will not be removed, to avoid breaking backward compatibility.

The f-strings feature was added in version 3.6 of Python. Alternate implementations of Python may include this specific feature, even when they do not support version 3.6 syntax.

PEP 498 explains f-strings in detail.

Use Datetime Objects with Time Zones

Always use datetime objects that are aware of time zones. By default, Python creates datetime objects that do not include a time zone. The documentation refers to datetime objects without a time zone as naive.

Avoid using date objects, except where the time of day is completely irrelevant. The date objects are always naive, and do not include a time zone.

Use aware datetime objects with the UTC time zone for timestamps, logs and other internal features.

To get the current time and date in UTC as an aware datetime object, specify the UTC time zone with now(). For example:

from datetime import datetime, timezone

dt = datetime.now(timezone.utc)

Python 3.9 and above include the zoneinfo module. This provides access to the standard IANA database of time zones. Previous versions of Python require a third-party library for time zones.

PEP 615 describes support for the IANA time zone database with zoneinfo.

Use enum or Named Tuples for Immutable Sets of Key-Value Pairs

Use the enum type in Python 3.4 or above for immutable collections of key-value pairs. Python 3 also has collections.namedtuple() for immutable key-value pairs.

Create Data Classes for Custom Data Objects

The data classes feature enables you to reduce the amount of code that you need to define classes for objects that exist to store values. The new syntax for data classes does not affect the behavior of the classes that you define with it. Each data class is a standard Python class.

Data classes were introduced in version 3.7 of Python.

PEP 557 describes data classes.

Use collections.abc for Custom Collection Types

The abstract base classes in collections.abc provide the components for building your own custom collection types.

Use these classes, because they are fast and well-tested. The implementations in Python 3.7 and above are written in C, to provide better performance than Python code.

Use breakpoint() for Debugging

This function drops you into the debugger at the point where it is called. Both the built-in debugger and external debuggers can use these breakpoints.

The breakpoint() feature was added in version 3.7 of Python.

PEP 553 describes the breakpoint() function.

Consider Using Type Hinting

Current versions of Python support type hinting. If you include these annotations, the Mypy tool can check your code.

PEP 484 - Type Hints and PEP 526 – Syntax for Variable Annotations define the notation for type hinting.

Application Design

Use Logging for Diagnostic Messages, Rather Than print()

The built-in print() statement is convenient for adding debugging information, but you should include logging in your scripts and applications. Use the logging module in the standard library, or a third-party logging module.

Only Use asyncio Where It Makes Sense

The asynchronous features of Python enable a single process to avoid blocking on I/O operations. You can achieve concurrency by running multiple Python processes, with or without asynchronous I/O.

To run multiple Web application processes, use Gunicorn or another WSGI server. Use the multiprocessing package in the Python standard library to build custom applications that run as multiple processes.

Code that needs asynchronous I/O must not call any function in the standard library that synchronous I/O, such as open(), or the logging module.

If you would like to work with asyncio, always use the most recent version of Python. Each new version of Python has improved the performance and features of async. For example, version 3.7 of Python introduced context variables, which enable you to have data that is local to a specific task.

The initial version of asyncio was included in version 3.4 of Python. Keywords for async and await were added in Python 3.5. Context variables and the asyncio.run() function were introduced in version 3.7 of Python.

PEP 0567 describes context variables.

Libraries

Handle Command-line Input with argparse

The argparse module is now the recommended way to process command-line input. Use argparse, rather than the older optparse and getopt.

The optparse module is officially deprecated, so update code that uses optparse to use argparse instead.

Refer to the argparse tutorial in the official documentation for more details.

Use pathlib for File and Directory Paths

Use pathlib objects instead of strings whenever you need to work with file and directory pathnames.

Consider using the the pathlib equivalents for os functions.

The existing methods in the standard library have been updated to support Path objects.

To list all of the the files in a directory, use either the .iterdir() function of a Path object, or the os.scandir() function.

This RealPython article provides a full explanation of the different Python functions for working with files and directories.

The pathlib module was added to the standard library in Python 3.4, and other standard library functions were updated to support Path objects in version 3.5 of Python.

Use os.scandir() Instead of os.listdir()

The os.scandir() function is significantly faster and more efficient than os.listdir(). Use os.scandir() wherever you previously used the os.listdir() function.

This function provides an iterator, and works with a context manager:

import os

with os.scandir('some_directory/') as entries:
    for entry in entries:
        print(entry.name)

The context manager frees resources as soon as the function completes. Use this option if you are concerned about performance or concurrency.

The os.walk() function now calls os.scandir(), so it automatically has the same improved performance as this function.

The os.scandir() function was added in version 3.5 of Python.

PEP 471 explains os.scandir().

Run External Commands with subprocess

The subprocess module provides a safe way to run external commands. Use subprocess rather than shell backquoting or the functions in os, such as spawn, popen2 and popen3. The subprocess.run() function in current versions of Python is sufficient for most cases.

PEP 324 explains the technical details of subprocess in detail.

Use Requests for HTTP Clients

Use the requests package for HTTP, rather than the urllib.request in the standard library.

Test with pytest

Use pytest for testing. It has has superceded nose as the most popular testing system for Python. Use the unittest module in the standard library for situations where you cannot add pytest to the project.