Modern Good Practices for Python Development


Python has a long history, and it has evolved over time. This article describes the current, widely agreed good practices for Python development.

Using Python #

Avoid Using the Python Installation in Your Operating System #

If your operating system includes a Python installation, avoid using it for your projects. This Python installation is for operating system tools. It is likely to use an older version of Python, and may not include all of the standard features. An operating system copy of Python should be marked to prevent you from installing packages into it, but not all operating systems set the marker.

Install Python With Tools That Support Multiple Versions #

Use a tool like mise or pyenv to install copies of Python on your development systems, so that you can switch between different versions of Python for your projects. This enables you to upgrade each project to a new version of Python without interfering with other tools and projects that use Python.

Alternatively, consider using Development Containers, which enable you to define an isolated environment for a software project. This also allows you to use a separate version of Python for each project.

Ensure that the tool compiles Python, rather than downloading standalone builds. The standalone builds are modified versions of Python that are maintained by a third party. Both the pyenv tool and the Visual Studio Code Dev Container feature automatically compile Python, but you must change the mise configuration to use compilation.

Only use the Python installation features of PDM and Hatch for experimental projects. These tools always download the third-party standalone builds when they manage versions of Python.

Use the Most Recent Version of Python That You Can #

For new projects, choose the most recent stable version of Python 3. This ensures that you have the latest security fixes, as well as the fastest performance.

Upgrade your projects as new Python versions are released. The Python development team usually support each version for five years, but some Python libraries may only support each version of Python for a shorter period of time. If you use tools that support multiple versions of Python and automated testing, you can test your projects on new Python versions with little risk.

Avoid using Python 2. It is not supported by the Python development team or by the developers of most popular Python libraries.

Use pipx To Run Developer Applications #

Use pipx to run Python applications on development systems, rather than installing the applications with pip or another method. The pipx tool automatically puts the libraries for each application into a separate Python virtual environment.

Use the pipx run command for most applications, rather than pipx install. The pipx run command downloads and runs the application without installing it. For example, this command downloads and runs the latest version of bpytop, a system monitoring tool:

pipx run bpytop

Each application is cached for 14 days after the first download, which means that the second use of pipx run bpytop will run as quickly as an installed application.

Use pipx install for tools that are essential for your development process, such as pre-commit:

pipx install pre-commit

If you use pre-commit, it automatically runs every time that you commit a change to version control. Installing it with pipx install means that you keep the same version of pre-commit until you decide to upgrade it.

To set up pipx, follow the instructions on the pipx Website for your operating system. This ensures that pipx works with an appropriate Python installation.

PEP 668 - Marking Python base environments as “externally managed” recommends that users install Python applications with pipx.

Developing Python Projects #

Use a Project Tool #

A project tool sets up and manages your projects in line with current best practices. Consider using the PDM tool to help you develop Python applications. Hatch is another well-known project tool, but it is most useful for developing Python libraries. Rye is a more experimental tool for Python projects, and it is likely to be superseded in future.

Avoid using the Poetry tool for new projects. Poetry uses non-standard implementations of key features. For example, it does not use the standard format in pyproject.toml files, which may cause compatibility issues with other tools.

Format Your Code #

Use a formatting tool with a plugin to your editor, so that your code is automatically formatted to a consistent style.

Black is currently the most popular code formatting tool for Python, but consider using Ruff. Ruff provides both code formatting and quality checks for Python code.

Use pre-commit to run the formatting tool before each commit to source control. You should also run the formatting tool with your CI system, so that it rejects any code that does not match the format for your project.

Use a Code Linter #

Use a code linting tool with a plugin to your editor, so that your code is automatically checked for issues.

flake8 is currently the most popular linter for Python, but consider using Ruff. Ruff includes the features of both flake8 itself and the most popular plugins for flake8.

Use pre-commit to run the linting tool before each commit to source control. You should also run the linting tool with your CI system, so that it rejects any code that does not meet the standards for your project.

Test with pytest #

Use pytest for testing. Use the unittest module in the standard library for situations where you cannot add pytest to the project.
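
As a minimal sketch, a pytest test is a plain function whose name starts with test_, in a file whose name starts with test_ (the add() function and file name here are only illustrative):

# test_calculations.py
def add(a: int, b: int) -> int:
    return a + b


def test_add() -> None:
    assert add(2, 3) == 5

Running the pytest command in the project directory discovers and runs tests that follow this naming convention.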

By default, pytest runs tests in the order that they appear in the test code. To avoid issues where tests interfere with each other, always add the pytest-randomly plugin to pytest. This plugin causes pytest to run tests in random order. Randomizing the order of tests is a common good practice for software development.

To see how much of your code is covered by tests, add the pytest-cov plugin to pytest. This plugin uses coverage to analyze your code.

Package Your Projects #

Always package the tools and code libraries that you would like to share with other people. Packages enable people to use your code with the tools and systems that they prefer to work with, and select the version of your code that is best for them.

Use wheel packages for libraries. You can also use wheel packages for development tools. If you publish your Python application as a wheel, other developers can use it with pipx. Remember that all wheel packages require an existing installation of Python.

In most cases, you should package an application in a format that enables you to include your code, the dependencies and a copy of the required version of Python. This ensures that your code runs with the expected version of Python, and has the correct version of each dependency.

Use container images to package applications that provide a network service, such as a Web application. Use PyInstaller to publish desktop and command-line applications as a single executable file. Each container image and PyInstaller file includes a copy of Python, along with your code and the required dependencies.

Language Syntax #

Use Type Hinting #

Current versions of Python support type hinting. Consider using type hints in any critical application. If you develop a shared library, use type hints.

Once you add type hints, the mypy tool can check your code as you develop it. Code editors can also read type hints to display information about the code that you are working with.
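
For example, this sketch of a function with type hints is enough for mypy and code editors to check the types of the arguments and the return value (the function itself is only illustrative):

def describe(name: str, count: int = 1) -> str:
    return f"{name}: {count}"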

If you use Pydantic in your application, it can work with type hints. Use the mypy plugin for Pydantic to improve the integration between mypy and Pydantic.

PEP 484 - Type Hints and PEP 526 - Syntax for Variable Annotations define the notation for type hinting.

Create Data Classes for Custom Data Objects #

Python code frequently has classes for data objects, items that exist to store values, but do not carry out actions. If your application could have a number of classes for data objects, consider using either Pydantic or the built-in data classes feature.

Pydantic provides validation, serialization and other features for data objects. You need to define the classes for Pydantic data objects with type hints.

The built-in syntax for data classes simply reduces the amount of code that you need to define data objects. It also provides some features, such as the ability to mark instances of a data class as frozen. Each data class acts as a standard Python class, because the syntax for data classes does not change the behavior of the classes that you define with it.
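
As a minimal sketch, a frozen data class needs only a few lines (the Point class here is just an illustration):

from dataclasses import dataclass

@dataclass(frozen=True)
class Point:
    x: float
    y: float

origin = Point(0.0, 0.0)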

Data classes were introduced in version 3.7 of Python.

PEP 557 describes data classes.

Use enum or Named Tuples for Immutable Sets of Key-Value Pairs #

Use the enum type in Python 3.4 or above for immutable collections of key-value pairs. Enums can use class inheritance.

Python 3 also has collections.namedtuple() for immutable key-value pairs. The namedtuple() factory creates a tuple subclass for you, so you do not need to write a class definition to use it.
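
As a sketch of both options (the Color and Point names here are only examples):

from collections import namedtuple
from enum import Enum

class Color(Enum):
    RED = 1
    GREEN = 2
    BLUE = 3

Point = namedtuple("Point", ["x", "y"])
origin = Point(x=0, y=0)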

Format Strings with f-strings #

F-strings are both more readable and faster than the older methods for formatting strings. Use f-strings instead of % formatting, str.format() or string.Template.
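
For example, this sketch embeds the variables directly in the string:

name = "world"
count = 3
message = f"Hello, {name}! You have {count} new messages."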

The older features for formatting strings will not be removed, to avoid breaking backward compatibility.

The f-strings feature was added in version 3.6 of Python. Alternate implementations of Python may include this specific feature, even when they do not support version 3.6 syntax.

PEP 498 explains f-strings in detail.

Use Datetime Objects with Time Zones #

Always use datetime objects that are aware of time zones. By default, Python creates datetime objects that do not include a time zone. The documentation refers to datetime objects without a time zone as naive.

Avoid using date objects, except where the time of day is completely irrelevant. The date objects are always naive, and do not include a time zone.

Use aware datetime objects with the UTC time zone for timestamps, logs and other internal features.

To get the current time and date in UTC as an aware datetime object, specify the UTC time zone with now(). For example:

from datetime import datetime, timezone

dt = datetime.now(timezone.utc)

Python 3.9 and above include the zoneinfo module. This provides access to the standard IANA database of time zones. Previous versions of Python require a third-party library for time zones.
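
As a sketch, this uses zoneinfo on Python 3.9 or above to create an aware datetime for a specific time zone (the time zone name is only an example):

from datetime import datetime
from zoneinfo import ZoneInfo

dt = datetime.now(ZoneInfo("Europe/London"))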

PEP 615 describes support for the IANA time zone database with zoneinfo.

Use collections.abc for Custom Collection Types #

The abstract base classes in collections.abc provide the components for building your own custom collection types.

Use these classes, because they are fast and well-tested. The implementations in Python 3.7 and above are written in C, to provide better performance than Python code.
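
As a minimal sketch, a custom read-only collection can subclass collections.abc.Sequence and implement only __getitem__() and __len__(); the base class then provides the other sequence methods, such as __contains__() and __iter__() (the Deck class here is only illustrative):

from collections.abc import Sequence

class Deck(Sequence):
    def __init__(self, cards):
        self._cards = list(cards)

    def __getitem__(self, index):
        return self._cards[index]

    def __len__(self):
        return len(self._cards)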

Use breakpoint() for Debugging #

This function drops you into the debugger at the point where it is called. Both the built-in debugger and external debuggers can use these breakpoints.
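
For example, this sketch pauses in the debugger just before the return statement:

def calculate_total(items):
    total = sum(items)
    breakpoint()  # Drops into the debugger at this point
    return total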

The breakpoint() feature was added in version 3.7 of Python.

PEP 553 describes the breakpoint() function.

Application Design #

Use Logging for Diagnostic Messages, Rather Than print() #

The built-in print() function is convenient for adding debugging information, but you should include logging in your scripts and applications. Use the logging module in the standard library, or a third-party logging library.
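
As a minimal sketch with the standard logging module (the configuration and messages are only examples):

import logging

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)

logger.info("Processing started")
logger.warning("Input file is empty")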

Use the TOML Format for Configuration #

Use TOML for data files that must be written or edited by human beings. Use the JSON format for data that is transferred between computer programs. Avoid using the INI or YAML formats.

Python 3.11 and above include tomllib to read the TOML format. Use tomli to add support for reading TOML to applications that run on older versions of Python.
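
As a sketch, this reads a TOML file with tomllib on Python 3.11 or above, and falls back to the third-party tomli package on older versions (the file name is only an example):

import sys

if sys.version_info >= (3, 11):
    import tomllib
else:
    import tomli as tomllib

with open("config.toml", "rb") as f:  # Example file name
    config = tomllib.load(f)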

If your Python software needs to generate TOML, add Tomli-W.

PEP 680 - tomllib: Support for Parsing TOML in the Standard Library explains why TOML is now included with Python.

Only Use async Where It Makes Sense #

The asynchronous features of Python enable a single process to avoid blocking on I/O operations. To achieve parallelism with Python, you must run multiple Python processes. Each of these processes may or may not use asynchronous I/O.

To run multiple application processes, either use a container system, with one container per process, or an application server like Gunicorn. If you need to build a custom application that manages multiple processes, use the multiprocessing package in the Python standard library.

Code that uses asynchronous I/O must not call any function that uses synchronous I/O, such as open(), or the logging module in the standard library. Instead, you need to use either the equivalent functions from asyncio in the standard library or a third-party library that is designed to support asynchronous code.

The FastAPI Web framework supports using both synchronous and asynchronous functions in the same application. You must still ensure that asynchronous functions never call any synchronous function.

If you would like to work with asyncio, use Python 3.7 or above. Version 3.7 of Python introduced context variables, which enable you to have data that is local to a specific task, as well as the asyncio.run() function.
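
As a minimal sketch of asyncio.run() with an asynchronous function:

import asyncio

async def main():
    await asyncio.sleep(1)  # A placeholder for real asynchronous I/O
    print("Done")

asyncio.run(main())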

PEP 567 describes context variables.

Libraries #

Handle Command-line Input with argparse #

The argparse module is now the recommended way to process command-line input. Use argparse, rather than the older optparse and getopt.
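
As a minimal sketch (the argument names are only examples):

import argparse

parser = argparse.ArgumentParser(description="Example command-line tool")
parser.add_argument("path", help="File to process")
parser.add_argument("--verbose", action="store_true", help="Show more output")
args = parser.parse_args()

print(args.path, args.verbose)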

The optparse module is officially deprecated, so update code that uses optparse to use argparse instead.

Refer to the argparse tutorial in the official documentation for more details.

Use pathlib for File and Directory Paths #

Use pathlib objects instead of strings whenever you need to work with file and directory pathnames.

Consider using the pathlib equivalents for os functions.

The existing methods in the standard library have been updated to support Path objects.

To list all of the files in a directory, use either the .iterdir() method of a Path object, or the os.scandir() function.
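
As a sketch of pathlib in use (the file and directory names are only examples):

from pathlib import Path

config_file = Path.home() / "projects" / "example" / "pyproject.toml"

if config_file.exists():
    content = config_file.read_text()

for entry in Path("some_directory").iterdir():
    print(entry.name)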

This RealPython article provides a full explanation of the different Python functions for working with files and directories.

The pathlib module was added to the standard library in Python 3.4, and other standard library functions were updated to support Path objects in version 3.5 of Python.

Use os.scandir() Instead of os.listdir() #

The os.scandir() function is significantly faster and more efficient than os.listdir(). Use os.scandir() wherever you previously used the os.listdir() function.

This function provides an iterator, and works with a context manager:

import os

with os.scandir('some_directory/') as entries:
    for entry in entries:
        print(entry.name)

The context manager frees resources as soon as the with block completes. Use this option if you are concerned about performance or concurrency.

The os.walk() function now calls os.scandir(), so it automatically has the same improved performance as this function.

The os.scandir() function was added in version 3.5 of Python.

PEP 471 explains os.scandir().

Run External Commands with subprocess #

The subprocess module provides a safe way to run external commands. Use subprocess rather than shell backquoting or the older functions in os, such as os.system() and os.popen(). The subprocess.run() function in current versions of Python is sufficient for most cases.
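
As a sketch, subprocess.run() accepts a list of arguments and avoids invoking a shell (the git command here is only an example):

import subprocess

result = subprocess.run(
    ["git", "status", "--short"],
    capture_output=True,
    text=True,
    check=True,
)
print(result.stdout)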

PEP 324 explains subprocess in detail.

Use httpx for Web Clients #

Use httpx for Web client applications. It supports HTTP/2 and async. The httpx package supersedes requests, which only supports HTTP/1.1.
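
As a minimal sketch of a synchronous request with httpx (the URL is only an example):

import httpx

response = httpx.get("https://www.example.com/")  # Example URL
response.raise_for_status()
print(response.status_code)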

Avoid using urllib.request from the Python standard library. It was designed as a low-level library, and lacks the features of httpx.

Best Practices for Python Projects #

Consider using a project tool to set up and develop your Python projects. If you decide not to use a project tool, set up your projects to follow the best practices in this section.

Use a pyproject.toml File #

Create a pyproject.toml file in the root directory of each Python project. Use this file as the central place to store configuration information about the project and the tools that it uses. For example, you list the dependencies of your project in the pyproject.toml file.

Python project tools like PDM and Hatch automatically create and use a pyproject.toml file.

The pyOpenSci project documentation on pyproject.toml provides an introduction to the file format. The various features of pyproject.toml files are defined in these PEPs: PEP 517, PEP 518, PEP 621 and PEP 660.

Create a Directory Structure That Uses the src Layout #

Python itself does not require a specific directory structure for your projects. The Python packaging documentation describes two popular directory structures: the src layout and the flat layout. The pyOpenSci project documentation on directory structures explains the practical differences between the two.

For modern Python projects, use the src layout. This requires you to use editable installs of the packages in your project. PDM and Hatch support editable installs.
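
As a sketch, a project that uses the src layout looks something like this (the project and package names are only examples):

example-project/
├── pyproject.toml
├── src/
│   └── example_package/
│       ├── __init__.py
│       └── main.py
└── tests/
    └── test_main.py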

Use Virtual Environments for Development #

The virtual environments feature enables you to define one or more separate sets of packages for each Python project, and switch between them. This ensures that a set of packages that you use for a specific purpose do not conflict with any other Python packages on the system. Always use Python virtual environments for your projects.

Several tools automate virtual environments. The mise version manager includes support for virtual environments. The pyenv version manager supports virtual environments with the virtualenv plugin. If you use a tool like PDM or Hatch to develop your projects, these also manage Python virtual environments for you.

You can set up and use virtual environments with venv, which is part of the Python standard library. This is a manual process.
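
For example, these commands create and activate a virtual environment in the directory .venv on macOS or Linux (the directory name is only a convention):

python3 -m venv .venv
source ./.venv/bin/activate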

Use Requirements Files to Install Packages Into Environments #

Avoid using pip commands to install individual packages into virtual environments. If you use PDM or Hatch to develop your project, they can manage the contents of virtual environments for development and testing.

For other cases, use requirements files. A requirements file can specify the exact version and hash for each required package.

You run a tool to read the dependencies in the pyproject.toml file and generate a requirements file that lists the specific packages that are needed to provide those dependencies for the Python version and operating system. PDM, pip-tools and uv include features to create requirements files.

You can then use pip-sync or the pip sync feature of uv to make the packages in a target virtual environment match the list in the requirements file. This process ensures that any extra packages are removed from the virtual environment.

You can also run pip install with a requirements file. This only attempts to install the specified packages. For example, these commands install the packages that are specified by the file requirements-macos-dev.txt into the virtual environment .venv-dev:

source ./.venv-dev/bin/activate
python3 -m pip install --require-virtualenv -r requirements-macos-dev.txt

Ensure That Requirements Files Include Hashes #

Some tools require extra configuration to include package hashes in the requirements files that they generate. For example, you must set the generate-hashes option for the pip-compile and uv utilities to generate requirements.txt files that include hashes. Add this option to the relevant section of the pyproject.toml file.

For pip-tools, add the option to the tool.pip-tools section:

[tool.pip-tools]
# Set generate-hashes for pip-compile
generate-hashes = true

For uv, add the option to the tool.uv.pip section:

[tool.uv.pip]
# Set generate-hashes for uv
generate-hashes = true

pip-compile: Use the Correct Virtual Environment #

If you do not already have a tool that can create requirements files, you can use the pip-compile utility that is provided by pip-tools.

To ensure that it calculates the correct requirements for your application, the pip-compile tool must be run in a virtual environment that includes your application package. This means that you cannot use pipx to install pip-compile.