Modern Good Practices for Python Development
Python has a long history, and it has evolved over time. This article describes some modern good practices.
Installing Python #
Install Python With Tools That Support Multiple Versions #
Instead of manually installing Python on to your development systems, use tools that provide copies of Python on demand. This means that you can choose a Python version for each of your projects, and upgrade projects to new versions of Python later without interfering with other tools and projects that use Python.
The official Python Install Manager for Microsoft Windows does this. It supports multiple Python versions, and you can use the py.exe tool to choose which version of Python to run. Version manager tools like pyenv also allow you to switch between different versions of Python at will, as well as providing the defined version for each of your projects. I provide a separate article on using version managers.
Modern project development tools like uv can install copies of Python as needed, in addition to their other features. They do this by using standalone builds, which are modified versions of Python that are maintained by Astral, not the Python project. These standalone builds have some limitations that are not present with other copies of Python.
If you use Development Containers, you define a complete environment for a software project, which means that the project will always have a separate installation of Python. Development containers are a feature of Visual Studio Code and JetBrains IDEs.
Both the pyenv tool and the Visual Studio Code Dev Container feature automatically compile Python from source code, rather than using the third-party standalone builds. If you use mise, you will need to change the configuration to compile Python from the official sources rather than downloading standalone builds.
Use the Most Recent Version of Python That You Can #
For new projects, choose the most recent stable version of Python 3. This ensures that you have the latest security fixes, as well as the fastest performance.
Upgrade your projects as new Python versions are released. The Python development team usually support each version for five years, but some Python libraries may only support each version of Python for a shorter period of time. If you use tools that support multiple versions of Python and automated testing, you can test your projects on new Python versions with little risk.
Avoid using Python 2. Older operating systems include Python 2, but it is not supported by the Python development team or by the developers of most popular Python libraries.
Avoid Using the Python Installation for Your Operating System #
If your operating system includes a Python installation, avoid using it for your projects. This Python installation will be for system tools. It is likely to use an older version of Python, and may not include all of the standard features. An operating system copy of Python should be marked to prevent you from installing packages into it, but not all operating systems set this marker.
Use a Helper to Run Python Scripts and Tools #
Use either pipx or uv to run Python tools and your single-file scripts on your computer. Both pipx and uv automatically provide each script and application with a separate Python virtual environment.
Running Python Scripts #
A Python script is a file that has the extension .py at the end of the name. Python reads and executes the script file from the top to the bottom. Optionally, the script can start with a metadata block that specifies the packages that it needs to use. This example script uses the packages requests and rich:
# /// script
# requires-python = ">=3.12"
# dependencies = [
#   "requests<3",
#   "rich<16"
# ]
# ///
import requests
from rich.console import Console
from rich.table import Table

table = Table(title="Cat Facts")
table.add_column("Fact", style="magenta")

response = requests.get("https://catfact.ninja/facts?limit=5")
items = response.json()["data"]
for item in items:
    table.add_row(item["fact"])

console = Console()
console.print(table)
Use pipx or uv to run Python scripts. These tools will automatically download the required dependencies for each script into a separate Python virtual environment before running the code. For example:
pipx run my_script.py
uv run my_script.py
Avoid using the python or py commands to run single-file scripts. Python interpreters do not support metadata blocks and cannot manage dependencies.
You can add a shebang line to a script file, to tell operating systems to run the script by using the tool that is specified by the shebang. Avoid using a shebang for your Python scripts unless you have a specific need to do so. The script file itself must be marked as executable for a shebang to work, and this is a security risk because the content of the script could be changed or replaced later. A shebang can also tie the script to a specific tool.
A Python script should be a single file that you have created to run on computers that are used for development. If you need more than this, create a Python application. To minimize complexity, you can use the flat project layout for applications. The project configuration will enable you to manage the code and dependencies, as well as providing support for packaging applications for distribution to other systems.
Running Python Tools #
Use the pipx run feature of pipx for most Python applications, or uvx, which is the equivalent command for uv. These download the application to a cache and run it. For example, these commands download and run the latest version of bpytop, a system monitoring tool:
pipx run bpytop
uvx bpytop
The bpytop tool is cached after the first download, which means that the second use of it will run as quickly as an installed application.
Use pipx install or uv tool install for tools that are essential for your development process. These options install the tool on to your system. This ensures that the tool is available if you have no Internet access, and that you keep the same version of the tool until you decide to upgrade it. For example, if you use Posting to test services that you develop on your laptop then you should install it, rather than use a temporary copy. To install posting as a Python package, run the appropriate command for pipx or uv:
pipx install posting
uv tool install posting
Application Design #
Use a Project Tool #
Use a project tool when you work with Python. There are several of these tools, which all provide the key features for managing Python projects. They generate directory structures that follow best practices, manage package dependencies and automate Python virtual environments, so that you do not need to manually create and activate environments as you work.
Project tools can also manage the versions of Python, so that you will automatically have the correct version of Python for each project. Your project tool can install copies of Python as needed. The section on installing Python explains this in more detail.
Poetry is currently the most popular tool for managing Python projects, and it is a good choice for most cases. It is well-supported and has been steadily developed. The feature to provide versions of Python is currently experimental, so you should use a version manager alongside Poetry for significant projects.
There are currently several other well-known project tools for Python. The PDM project and uv from Astral are less conservative than Poetry, adopting new features and standards more rapidly. Some teams use Hatch, which provides a well-integrated set of features for building and testing Python software products.
Avoid using Rye. Rye has been superseded by uv.
You may need to create projects that include Python but cannot use Python project tools. In these cases, think carefully about the tools and directory structure that you will need, and ensure that you are familiar with the current best practices for Python projects.
Use a Modern Framework for Command-Line Applications #
If you are building a command-line application, you can use the flat project layout. This is the default layout for projects that are created by uv.
Consider using the Cyclopts framework or the Typer library for building new command-line applications. Both of these use type hints and are built for modern Python. Many projects still use the older Click framework.
To add a command-line interface to a script or library, use the argparse module. Refer to the argparse tutorial in the official documentation for more details.
Use Products That Enable Concurrency and async #
By default, each Python process uses a single thread on a single CPU. This means that each process only performs one operation at a time. You can have multiple threads within a process, but this only enables switching between threads. If you need to achieve full concurrency with Python, you must run multiple Python processes, so that each process can run its threads on a separate CPU.
When you need to run operations at any scale, look for existing products that suit your needs. For example:
- Use workflow platforms such as Apache Airflow or Prefect to run sets of tasks.
- Build a Web application by combining a framework like FastAPI with an application server such as Granian or Gunicorn.
- Run computations that are distributed across many computers with the Dask or Ray frameworks.
All of these products enable you to run your Python code concurrently on multiple CPUs and on multiple computers, and can use asynchronous code.
The asynchronous features of Python enable threads to avoid blocking on I/O operations. To use asynchronous I/O in your code, you must use a Python library or framework that supports it. For example, the FastAPI Web framework supports both types of function in the same application. Code that uses asynchronous I/O must not call any other function that uses synchronous I/O, such as open(), or the logging module in the standard library. Instead, you need to use either the equivalent functions from asyncio in the standard library or ensure that the products and libraries that you use are designed to support asynchronous code.
If you use a framework that supports asynchronous I/O it may provide safe functions for services like logging, but you must still ensure that asynchronous functions never call any synchronous functions.
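If you cannot avoid a blocking call, one option in the standard library is asyncio.to_thread(), which runs the synchronous function in a worker thread so that the event loop is not blocked. This is only a sketch: blocking_read() and heartbeat() are hypothetical stand-ins, and your framework may provide its own mechanism that you should prefer.

```python
import asyncio
import time


def blocking_read() -> str:
    """Hypothetical synchronous call, standing in for blocking file or network I/O."""
    time.sleep(0.05)
    return "data"


async def heartbeat(ticks: list[int]) -> None:
    """Asynchronous task that should keep running while the blocking call is in progress."""
    for i in range(3):
        ticks.append(i)
        await asyncio.sleep(0.01)


async def main() -> tuple[str, list[int]]:
    ticks: list[int] = []
    # to_thread() moves the blocking function to a worker thread,
    # so the event loop stays free to run heartbeat() at the same time
    result, _ = await asyncio.gather(
        asyncio.to_thread(blocking_read),
        heartbeat(ticks),
    )
    return result, ticks
```

Run the sketch with asyncio.run(main()).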
If you need to build a custom application with concurrency, consider using the concurrent.futures package in the Python standard library. This includes executors for distributing work across a pool of multiple threads or separate CPUs.
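A minimal sketch of an executor from concurrent.futures, assuming a hypothetical word_count() task that stands in for slower work such as fetching a document:

```python
from concurrent.futures import ThreadPoolExecutor


def word_count(text: str) -> int:
    """Hypothetical task, standing in for slower work."""
    return len(text.split())


def count_all(texts: list[str], max_workers: int = 4) -> list[int]:
    # The context manager waits for the workers to finish and shuts down the pool.
    # Executor.map() returns results in the same order as the inputs.
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        return list(executor.map(word_count, texts))
```

ProcessPoolExecutor offers the same interface and runs the tasks in separate processes, which is usually the better fit for CPU-bound work.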
Plan for Distributing Your Work #
Always plan for how you will distribute the work that you produce in a project. The simplest method is through version control, but packages enable people to use your code with the operating systems and tools that they prefer to work with, rather than requiring that each system has developer tools installed.
Project tools like Poetry include support for building wheel packages. The wheel format is for sharing between Python installations. You can use wheel packages to publish tools for other developers, as well as libraries for use in other Python projects. If you publish your Python application as a wheel, other people can run it with the pipx and uv tools, as explained in the section on helpers.
Read the Python Packaging User Guide for more about wheel packages.
For other cases, you should use extra tools to package your work into a format that includes a copy of the required version of Python as well as your code and the dependencies. This ensures that your code runs with the expected version of Python, and that it has the correct version of each dependency.
Use OCI container images to package Python applications that are intended to be run by a service, such as Docker or a workflow engine, especially if the application provides a network service itself, such as a Web application. You can build OCI container images with Docker, buildah and other tools to include a copy of Python, along with your code and the required dependencies. OCI container images can run on any system that uses Docker, Podman or Kubernetes, as well as on cloud infrastructure. Consider using the official Python container image as the base image for your application container images.
Use PyInstaller or Nuitka to compile desktop and command-line applications as a single executable file. Each executable file includes a copy of Python, along with your code and the required dependencies. Each executable will only run on the type of operating system and CPU that it was compiled to use. For example, an executable for Microsoft Windows on Intel-compatible machines will work on all editions of Windows, but it will not run on macOS. Optionally, you can put executables in an operating system package to work with package management tools, such as an RPM or DEB package for Linux.
Requirements files: If you use requirements files to build or deploy projects then configure your tools to use hashes.
Configuration: Use Environment Variables or TOML #
Use environment variables for options that must be passed to an application each time that it starts, especially secrets like API tokens. If your application is a command-line tool, you should also provide options that can override the environment variables.
This approach enables you to set the variables in whatever way is appropriate for the current environment without changing the code. For example, you can use a tool like fnox to manage environment variables in development, and configure the orchestration system that runs the code on cloud services to set the variables as needed.
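The pattern can be sketched with the standard library alone. In this example the MYAPP_API_URL variable, the --api-url option and the default URL are all hypothetical names chosen for illustration:

```python
import argparse
from collections.abc import Mapping


def resolve_api_url(argv: list[str], environ: Mapping[str, str]) -> str:
    """Return the API URL, preferring the command line over the environment."""
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "--api-url",
        # The environment variable supplies the default for the option
        default=environ.get("MYAPP_API_URL", "http://localhost:8000"),
    )
    return parser.parse_args(argv).api_url
```

In an application you would call resolve_api_url(sys.argv[1:], os.environ). Passing the environment in as a parameter means that tests do not need to modify the real environment.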
Use python-dotenv for environment variables that are defined in files, or Pydantic Settings for a full configuration system that supports files, environment variables and getting credentials from secure services.
If you need configuration files that are written or edited by human beings, use the TOML format. This format is an open standard that is used across Python projects and is also supported by other programming languages. For example, TOML is the default configuration file format for Rust projects. Python 3.11 and above include tomllib to read the TOML format. If your Python software must generate TOML, you need to add Tomli-W to your project.
TOML replaces the INI format. Avoid using INI for projects, even though the module for INI support has not yet been removed from the Python standard library.
Set Up Logging for Diagnostic Messages, Rather Than print() #
The built-in print() function is convenient for adding debugging information, but you should use logging in applications.
Use a structured format for your logs so that they can be parsed and analyzed later. The format should always include timestamps with timezones. We include the timezones so that the data can be accurately searched and analyzed by other systems. We should expect servers and shared systems to use the UTC timezone, but log analyzers can never make this assumption.
Many frameworks use the logging module in the Python standard library, but this module was not designed to modern standards and requires some configuration to produce well-formatted logs. When you implement logging, consider using loguru or structlog.
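If you stay with the logging module from the standard library, the following sketch shows the kind of configuration that is needed to emit JSON log lines with UTC timestamps. The JsonFormatter class, the build_logger() helper and the field names are assumptions for illustration, not a standard:

```python
import json
import logging
from datetime import datetime, timezone


class JsonFormatter(logging.Formatter):
    """Format each log record as one JSON object per line."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            # An aware datetime, so the ISO 8601 timestamp includes the UTC offset
            "timestamp": datetime.fromtimestamp(record.created, tz=timezone.utc).isoformat(),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        return json.dumps(payload)


def build_logger(name: str, stream=None) -> logging.Logger:
    # StreamHandler writes to sys.stderr when stream is None
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter())
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.addHandler(handler)
    logger.propagate = False
    return logger
```

Libraries such as loguru and structlog provide this kind of output with less setup.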
Decide On A HTTP Client Library #
Avoid using urllib.request from the Python standard library. It was designed as a low-level library for HTTP, and lacks the features of modern Web client libraries. Many Python applications include requests, but this only supports HTTP/1.1, and cannot be used with async code. Consider alternative Web client libraries like aiohttp when you want to use async I/O.
Python SDKs for cloud services will have a dependency on a Web client library. Check which client library an SDK uses before you include it in your project.
Developing Python Projects #
Format Your Code #
Use a formatting tool with a plugin to your editor, so that your code is automatically formatted to a consistent style.
Consider using Ruff, which provides both code formatting and quality checks for Python code. Black is supported by the Python Software Foundation, and was the most popular code formatting tool for Python before the release of Ruff.
Use Git hooks to run the formatting tool before each commit to source control. You should also run the formatting tool with your CI system, so that it rejects any code that does not match the format for your project.
Use a Code Linter #
Use a code linting tool with a plugin to your editor, so that your code is automatically checked for issues.
Consider using Ruff for linting Python code. Before Ruff, the most popular code linter was flake8. Ruff includes the features of both flake8 and the most popular plugins for flake8, along with many other capabilities.
Use Git hooks to run the linting tool before each commit to source control. You should also run the linting tool with your CI system, so that it rejects any code that does not meet the standards for your project.
Use Type Hinting #
Current versions of Python support type hinting. Consider using type hints in any critical application. If you develop a shared library, use type hints.
Once you add type hints, type checkers like mypy and pyright can check your code as you develop it. Code editors will read type hints to display information about the code that you are working with. You can also add a type checker to your Git hooks and CI to validate that the code in your project is consistent.
If you use attrs or Pydantic in your application, they can work with type hints. If you use mypy, add the plugin for Pydantic to improve the integration between mypy and Pydantic.
PEP 484 – Type Hints and PEP 526 – Syntax for Variable Annotations define the notation for type hinting.
Test with pytest #
Use pytest for testing. Use the unittest module in the standard library for situations where you cannot add pytest to the project.
By default, pytest runs tests in the order that they appear in the test code. To avoid issues where tests interfere with each other, always add the pytest-randomly plugin to pytest. This plugin causes pytest to run tests in random order. Randomizing the order of tests is a common good practice for software development.
To see how much of your code is covered by tests, add the pytest-cov plugin to pytest. This plugin uses coverage to analyze your code.
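A minimal sketch of tests in the pytest style, assuming a hypothetical slugify() function. In a real project the function would live in your package and the tests in a separate file, such as test_slugify.py:

```python
def slugify(title: str) -> str:
    """Hypothetical function under test."""
    return "-".join(title.lower().split())


# pytest collects functions whose names start with test_
# and reports a failure when a plain assert statement fails
def test_slugify_joins_words_with_hyphens():
    assert slugify("Modern Good Practices") == "modern-good-practices"


def test_slugify_collapses_whitespace():
    assert slugify("  Hello   World ") == "hello-world"
```

Each test is independent of the others, so the tests give the same results when pytest-randomly shuffles their order.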
Ensure That Requirements Files Include Hashes #
Python tools support hash checking to ensure that packages are valid. Some tools require extra configuration to include package hashes in the requirements files that they generate. For example, you must set the generate-hashes option for the pip-compile and uv utilities to generate requirements.txt files that include hashes. Add this option to the relevant section of the pyproject.toml file.
For pip-tools, add the option to the tool.pip-tools section:
[tool.pip-tools]
# Set generate-hashes for pip-compile
generate-hashes = true
For uv, add the option to the tool.uv.pip section:
[tool.uv.pip]
# Set generate-hashes for uv
generate-hashes = true
Language Syntax #
Create Data Classes for Custom Data Objects #
Python code frequently has classes for data objects: items that exist to store values, but do not carry out actions. If you are creating classes for data objects in your Python code, consider using Pydantic, attrs or the built-in data classes feature.
Pydantic provides validation, serialization and other features for data objects. You need to define the classes for Pydantic data objects with type hints. Classes that use attrs may use type hints, but it is optional.
The built-in Python syntax for data classes offers fewer capabilities than Pydantic or attrs. The data class syntax does enable you to reduce the amount of code that you need to define data objects. Each data class acts as a standard Python class. Data classes also provide a limited set of extra features, such as the ability to mark instances of a data class as frozen.
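As a brief sketch of the built-in feature, this defines a frozen data class. The Book class and its fields are hypothetical:

```python
from dataclasses import FrozenInstanceError, dataclass


@dataclass(frozen=True)
class Book:
    title: str
    author: str
    tags: tuple[str, ...] = ()


book = Book(title="Dune", author="Frank Herbert")

# The decorator generates __init__, __repr__ and __eq__ from the fields
same_book = Book("Dune", "Frank Herbert")

try:
    book.title = "Changed"  # frozen instances reject assignment
    mutated = True
except FrozenInstanceError:
    mutated = False
```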
PEP 557 describes data classes.
Use enum or Named Tuples for Immutable Sets of Key-Value Pairs #
Use the enum type for immutable collections of key-value pairs. Enums can use class inheritance.
Python also has collections.namedtuple() for immutable key-value pairs. This feature was created before enum types. The namedtuple() factory function generates a tuple subclass for you, rather than using the class definition syntax.
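The two options can be compared in a short sketch. The Status and Point names are hypothetical:

```python
from collections import namedtuple
from enum import Enum


class Status(Enum):
    PENDING = "pending"
    DONE = "done"


# Look up a member by its value, for example when reading stored data
status = Status("done")

# A named tuple gives names to the positions of a tuple
Point = namedtuple("Point", ["x", "y"])
point = Point(x=1, y=2)
```

Both reject changes after creation: assigning to Status.PENDING or to point.x raises an AttributeError.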
Format Strings with f-strings or t-strings #
The f-string syntax is more readable and performs better than older methods for formatting strings. Python 3.14 also includes the t-string syntax, which supports more advanced cases. Use f-strings or t-strings instead of % formatting, str.format() or string.Template.
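A short sketch of f-strings, using made-up values. Format specifications follow the colon inside the braces:

```python
name = "Ada"
balance = 1234.5

# Left-align the name in 10 characters, and right-align the number in 12
# characters with a thousands separator and two decimal places
line = f"{name:<10}|{balance:>12,.2f}"

# The = specifier includes the expression itself, which is handy for debugging
debug = f"{balance=}"
```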
The older features for formatting strings will not be removed, to avoid breaking backward compatibility.
PEP 498 explains f-strings in detail. PEP 750 explains t-strings.
Use Datetime Objects with Time Zones #
Always use datetime objects that are aware of time zones. By default, Python creates datetime objects that do not include a time zone. The documentation refers to datetime objects without a time zone as naive.
Avoid using date objects, except where the time of day is completely irrelevant. The date objects are always naive, and do not include a time zone.
Use aware datetime objects with the UTC time zone for timestamps, logs and other internal features.
To get the current time and date in UTC as an aware datetime object, specify the UTC time zone with now(). For example:
from datetime import datetime, timezone
dt = datetime.now(timezone.utc)
Python 3.9 and above include the zoneinfo module. This provides access to the standard IANA database of time zones. Previous versions of Python require a third-party library for time zones.
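For example, this sketch converts an aware UTC datetime to local time with zoneinfo. The date and the Europe/Paris zone are arbitrary choices, and the example assumes that the IANA database is available on the system. On platforms without it, such as Windows, you need to install the tzdata package.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo

utc_time = datetime(2024, 7, 1, 12, 0, tzinfo=timezone.utc)

# astimezone() keeps the same instant and changes how it is represented
paris_time = utc_time.astimezone(ZoneInfo("Europe/Paris"))
```

The two objects compare as equal, because aware datetime objects are compared by the instant that they represent.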
PEP 615 describes support for the IANA time zone database with zoneinfo.
Use pathlib for File and Directory Paths #
Use pathlib objects instead of strings whenever you need to work with file and directory pathnames. Consider using the pathlib equivalents for os functions as well.
Methods in the standard library support Path objects. For example, to list all of the files in a directory, you can use either the .iterdir() method of a Path object, or the os.scandir() function.
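A minimal sketch of working with Path objects. It uses a temporary directory so that it can run anywhere, and the file names are made up:

```python
import tempfile
from pathlib import Path


def list_files(directory: Path) -> list[str]:
    """Return the sorted names of the regular files in a directory."""
    return sorted(p.name for p in directory.iterdir() if p.is_file())


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    # The / operator joins path components
    (root / "notes.txt").write_text("hello", encoding="utf-8")
    (root / "data").mkdir()
    names = list_files(root)
```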
This RealPython article provides a full explanation of the different Python functions for working with files and directories.
Use os.scandir() Instead of os.listdir() #
The os.scandir() function is significantly faster and more efficient than os.listdir(). If you previously used the os.listdir() function, update your code to use os.scandir().
This function provides an iterator, and works with a context manager:
import os

with os.scandir('some_directory/') as entries:
    for entry in entries:
        print(entry.name)
The context manager closes the iterator and frees its resources as soon as the with block exits. Use this option if you are concerned about performance or concurrency.
The os.walk() function now calls os.scandir(), so it automatically has the same improved performance as this function.
The os.scandir() function was added in version 3.5 of Python.
PEP 471 explains os.scandir().
Run External Commands with subprocess #
The subprocess module provides a safe way to run external commands. Use subprocess rather than shell backquoting or the functions in os, such as spawn, popen2 and popen3. The subprocess.run() function in current versions of Python is sufficient for most cases.
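As a sketch, this runs a child process with subprocess.run() and captures the output. It uses the current Python interpreter as the external command so that the example works on any system:

```python
import subprocess
import sys

result = subprocess.run(
    # Pass the command as a list, so that no shell parses the arguments
    [sys.executable, "-c", "print('hello from a child process')"],
    capture_output=True,  # collect stdout and stderr
    text=True,            # decode the output to str
    check=True,           # raise CalledProcessError on a non-zero exit code
)
output = result.stdout.strip()
```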
PEP 324 explains subprocess in detail.
Use collections.abc for Custom Collection Types #
The abstract base classes in collections.abc provide the components for building your own custom collection types.
Use these classes, because they are fast and well-tested. Since Python 3.7, the abc module that underpins them is implemented in C, which makes defining abstract base classes and checking instances against them faster than the earlier pure-Python implementation.
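To illustrate, this sketch builds a custom set type on MutableSet. The CaseInsensitiveSet class is a hypothetical example. It defines only the five required methods, and the base class supplies the rest, such as remove(), isdisjoint() and the set operators.

```python
from collections.abc import Iterator, MutableSet


class CaseInsensitiveSet(MutableSet):
    """A set of strings that treats upper and lower case as equal."""

    def __init__(self, items=()):
        # Maps the lower-case form to the first spelling that was added
        self._items: dict[str, str] = {}
        for item in items:
            self.add(item)

    def __contains__(self, item: object) -> bool:
        return isinstance(item, str) and item.lower() in self._items

    def __iter__(self) -> Iterator[str]:
        return iter(self._items.values())

    def __len__(self) -> int:
        return len(self._items)

    def add(self, item: str) -> None:
        self._items.setdefault(item.lower(), item)

    def discard(self, item: str) -> None:
        self._items.pop(item.lower(), None)
```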
Use breakpoint() for Debugging #
This function drops you into the debugger at the point where it is called. Both the built-in debugger and external debuggers can use these breakpoints.
The breakpoint() feature was added in version 3.7 of Python.
PEP 553 describes the breakpoint() function.
Data Formats and Storage #
There are now data file formats that are open, standardized and portable. If possible, use these formats, and avoid older formats. Modern formats are standardized, can be reliably read by many different systems and can be processed efficiently, even with large quantities of data. Some older formats are not standardized, which means that different systems can write different variations, which then cause errors when you move data between systems.
Modern Data Formats #
If possible, use JSON for structured data. It is a plain-text format for data objects. You can then also use these file formats:
- SQLite - Binary format for self-contained and robust database files
- Apache Parquet - Binary format for efficient storage of tabular data in files
All of the versions of Python 3 include modules for JSON and SQLite. The Pandas dataframe library supports Parquet, JSON and SQLite. DuckDB also supports all three formats.
If you need to work with other data formats, consider using a modern file format in your application and adding features to import data or generate exports in other formats when necessary. For example, DuckDB and Pandas include features to import and export data to files in the Excel format.
In most cases, you should use the JSON format to transfer data between systems, especially if the systems must communicate with HTTP. JSON documents can be used for any kind of data. Since JSON is plain-text, data in this format can be stored in either files or in a database. Every programming language and modern SQL database supports JSON.
You can validate JSON documents with JSON Schemas. Pydantic enables you to export your Python data objects to JSON and generate JSON Schemas from the data models.
Each SQLite database is a single file. Use SQLite files for data and configuration for applications as well as for queryable databases. They are arguably more portable and resilient than sets of plain-text files. SQLite is widely supported and robust, and the file format is guaranteed to be stable and portable for decades. Each SQLite database file can safely be gigabytes in size.
You can use SQLite databases for any kind of data. They can be used to store and query data in JSON format, they hold plain text with optional full-text search, and they can store binary data.
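A minimal sketch that stores a JSON document in SQLite with the modules in the standard library. The events table and the shape of the document are hypothetical, and the example uses an in-memory database rather than a file:

```python
import json
import sqlite3

event = {"type": "login", "user": "ada"}

connection = sqlite3.connect(":memory:")
connection.execute(
    "CREATE TABLE events (id INTEGER PRIMARY KEY, payload TEXT NOT NULL)"
)

# Use ? placeholders so that SQLite handles quoting of the values
connection.execute("INSERT INTO events (payload) VALUES (?)", (json.dumps(event),))

row = connection.execute("SELECT payload FROM events WHERE id = 1").fetchone()
restored = json.loads(row[0])
connection.close()
```

Replace ":memory:" with a file path to keep the data in a single database file.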
If you need to query a large set of tabular data, put a copy in Apache Parquet files and use that copy for your work. The Parquet format is specifically designed for large-scale data operations, and scales to tables with millions of rows. Parquet can store data that is in JSON format, as well as standard data types.
I provide a separate article with more details about modern data formats.
Avoid Problematic File Formats #
Avoid these older file formats:
- INI - Use TOML instead
- CSV - Use SQLite or Apache Parquet instead
- YAML - Use TOML or JSON instead
Systems can implement legacy formats in different ways, which means that there is a risk that data will not be read correctly when you use a file that has been created by another system. Files that are edited by humans are also more likely to contain errors, due to the complexities and inconsistency of these formats.
Working with CSV Files #
Python does include a module for CSV files, but consider using DuckDB instead. DuckDB provides CSV support that is tested for its ability to handle incorrectly formatted files.
Avoid creating CSV files, because modern data formats are all more capable. If you use DuckDB or Pandas then you can import and export data to Parquet, SQLite and Excel file formats. Unlike CSV, these file formats store explicit data types for items.
Working with YAML Files #
If you need to work with YAML in Python, use ruamel.yaml. This supports YAML version 1.2. Avoid using PyYAML, because it only supports version 1.1 of the YAML format.
Avoid creating YAML files, because modern formats offer better options. Consider using TOML for application configuration, and JSON or table-based storage like SQLite for larger sets of data.