Using Python packages

Making effective use of the work of others

Modern tooling making
Author

Mads-Peter V. Christiansen

These notes explore ways of using Python packages. As mentioned in previous notes, a package is:

A collection of modules (.py-files), usually organized as a directory, and often distributed and installed together.

The central hub for packages is the Python Package Index (PyPI) where hundreds of thousands of packages are published.

Ways of old

The tool that accompanies PyPI is pip[1], as discussed in another note. The typical way of working on a project with this setup is to make a directory, create a virtual environment, and use pip to install the required packages.

This note explores how uv makes this process both easier and more robust.

Projects with uv

uv formalizes this workflow into the concept of a project. A project can be started with the command

uv init <project_name>

which creates a directory called <project_name> with the following structure

<project_name>/
├── .python-version
├── main.py
├── pyproject.toml
├── README.md

The purpose of these files is:

  • .python-version: Specifies the default Python version for the project; uv uses it to choose which version of Python to create the virtual environment with.
  • main.py: The main Python script in the project. For small projects consisting of a single script, you can keep it as is or rename it to something more suitable. For larger projects with multiple modules (.py-files), it helps to make sure that main.py really is the main script.
  • pyproject.toml: This is, as the name suggests, a project configuration file - in the current context the main purpose is to specify dependencies. You should not rename or delete this.
  • README.md: This is a Markdown document[2]. If you were to share a small project, this is where you’d put documentation, example usage, development notes, to-dos and so on.

With the project initialized, it’s time to start configuring it. Say you want to do some linear algebra, a perfectly reasonable thing to feel like doing. You will then want numpy as a dependency, which can be added with the command

uv add numpy

This will do a few things

  1. It will add numpy to the dependency list in pyproject.toml.
  2. It will create (or update) a virtual environment, at .venv, that contains numpy with the Python version specified in .python-version.
  3. It will create (or update) a uv.lock file.

This command creates two new items in the project directory: .venv (the virtual environment we’re familiar with) and uv.lock. Both pyproject.toml and uv.lock specify dependencies; the important difference is the scope. Typically, pyproject.toml specifies version requirements broadly, for example that a project needs a certain package at or above some version, whereas uv.lock specifies exactly which version is installed in the project’s virtual environment.
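The difference in scope can be illustrated with a small sketch. The helpers parse_version and satisfies below are hypothetical and deliberately simplified: they only understand the >= and == operators on plain three-part version numbers, whereas real tools such as uv follow the full Python version specifier rules.

```python
def parse_version(text):
    """Turn a plain dotted version such as '2.4.1' into a tuple of integers."""
    return tuple(int(part) for part in text.split("."))


def satisfies(version, requirement):
    """Check a version against a simplified requirement like '>=2.4.1' or '==2.4.1'.

    Only '>=' and '==' are handled; this is an illustration, not a full parser.
    """
    if requirement.startswith(">="):
        return parse_version(version) >= parse_version(requirement[2:])
    if requirement.startswith("=="):
        return parse_version(version) == parse_version(requirement[2:])
    raise ValueError(f"Unsupported requirement: {requirement}")


# pyproject.toml style: a broad requirement that many versions satisfy
broad = ">=2.4.1"
# uv.lock style: one exact version
pinned = "==2.4.1"

for candidate in ["2.3.0", "2.4.1", "2.5.0"]:
    print(candidate, satisfies(candidate, broad), satisfies(candidate, pinned))
```

Several candidate versions can satisfy the broad requirement, but only one matches the pin, which is why the lockfile is what makes an environment exactly reproducible.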

The panel below shows pyproject.toml and part of uv.lock for a project created with the two steps above.

pyproject.toml:

[project]
name = "example"
version = "0.1.0"
description = "Add your description here"
readme = "README.md"
requires-python = ">=3.12"
dependencies = [
    "numpy>=2.4.1",
]

uv.lock (excerpt):

version = 1
revision = 3
requires-python = ">=3.12"

[[package]]
name = "example"
version = "0.1.0"
source = { virtual = "." }
dependencies = [
    { name = "numpy" },
]

[package.metadata]
requires-dist = [{ name = "numpy", specifier = ">=2.4.1" }]

[[package]]
name = "numpy"
version = "2.4.1"
source = { registry = "https://pypi.org/simple" }
sdist = { url = "https://files.pythonhosted.org/packages/24/62/ae72ff66c0f1fd959925b4c11f8c2dea61f47f6acaea75a08512cdfe3fed/numpy-2.4.1.tar.gz", hash = "sha256:a1ceafc5042451a858231588a104093474c6a5c57dcc724841f5c888d237d690", size = 20721320, upload-time = "2026-01-10T06:44:59.619Z" }
wheels = [
    { url = "https://files.pythonhosted.org/packages/78/7f/ec53e32bf10c813604edf07a3682616bd931d026fcde7b6d13195dfb684a/numpy-2.4.1-cp312-cp312-macosx_10_13_x86_64.whl", hash = "sha256:d3703409aac693fa82c0aee023a1ae06a6e9d065dba10f5e8e80f642f1e9d0a2", size = 16656888, upload-time = "2026-01-10T06:42:40.913Z" },
    ...
    ...
    ...
]

requirements.txt:

numpy==2.4.1

As you can see, uv.lock is much more detailed. Generally you can think of pyproject.toml as what your project needs and uv.lock as the details needed to recreate the exact environment you are working with. Note that uv.lock is not meant to be edited by hand, but will be kept up to date by uv. Additionally, if you want to specify project requirements in a tool-agnostic way the requirements.txt format is an option; it can be generated with any of the following commands

pip freeze > requirements.txt
uv pip freeze > requirements.txt
uv export --format requirements.txt > requirements.txt

and installed with

pip install -r requirements.txt
uv pip install -r requirements.txt
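To make the requirements.txt format a little more concrete, here is a minimal sketch of how its simplest lines can be read. The function parse_requirements is a hypothetical helper written for this note; it ignores options, extras, and environment markers, all of which the real format allows.

```python
def parse_requirements(text):
    """Parse simplified requirements.txt content into (name, specifier) pairs.

    Blank lines and '#' comments are skipped. Anything beyond
    'name<operator>version' is not handled in this sketch.
    """
    operators = ["==", ">=", "<=", "~=", "!=", "<", ">"]
    requirements = []
    for line in text.splitlines():
        # Drop trailing comments and surrounding whitespace
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        # Locate the earliest comparison operator in the line, if any
        positions = [line.find(op) for op in operators if op in line]
        if positions:
            index = min(positions)
            requirements.append((line[:index].strip(), line[index:].strip()))
        else:
            # A bare package name with no version constraint
            requirements.append((line, ""))
    return requirements


example = "numpy==2.4.1\n# an older pandas\npandas<2.0.0\n"
print(parse_requirements(example))
```

Each entry is just a package name followed by an optional version constraint, which is what makes the format easy for different tools to share.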

Additional calls of uv add will update pyproject.toml, .venv and uv.lock so that they are all kept current and in sync. Once the required dependencies have been added, you will want to run the script. One way is to activate the virtual environment and run it with Python:

source .venv/bin/activate
python main.py

This is perfectly valid but uv also provides a bit of convenience for this, through the command

uv run main.py

which automatically runs the script in the project’s virtual environment, even if a different environment is active. So to recap, the commands

  • uv init <project_name>
  • uv add <package_name>
  • uv run <script_name>

get your script running in a completely reproducible environment. As a project grows, a dependency may become obsolete, so uv provides the command

uv remove <package_name>

which like uv add will update the project’s virtual environment, pyproject.toml and uv.lock.
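If you are unsure which environment a script actually runs in, for instance when comparing python main.py with uv run main.py, a small check like the sketch below can help. It uses only the standard library; the function name in_virtual_environment is chosen for illustration.

```python
import sys


def in_virtual_environment():
    """Return True when the interpreter is running inside a virtual environment."""
    # Inside a venv, sys.prefix points at the environment,
    # while sys.base_prefix points at the Python installation it was created from.
    return sys.prefix != sys.base_prefix


def main():
    print("Interpreter:", sys.executable)
    print("Environment prefix:", sys.prefix)
    print("Inside a virtual environment:", in_virtual_environment())


if __name__ == "__main__":
    main()
```

When run through uv run inside a project, the interpreter path would be expected to point into the project’s .venv directory.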

Caution: uv pip

An important difference between uv add and uv pip is that the latter does not change pyproject.toml or uv.lock - it just installs a package into the virtual environment. Because of this, using uv add is typically preferred, as it provides the additional benefit of keeping the dependency files in sync.

Finally, uv provides a command to synchronize the virtual environment with the lockfile (uv.lock),

uv sync

You’d, for example, use this after downloading a project from someone else, which then ensures you get the exact same environment that they had. An additional use case of uv sync is that it effectively resets the environment to exactly what is specified by the lockfile. This is useful, for example, if you have been experimenting with additional packages not installed through uv add.
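The idea of packages that are installed but not declared can be sketched in Python. The helpers below are hypothetical and are not part of uv; installed_package_names relies on importlib.metadata from the standard library, and the example sets are made up.

```python
from importlib.metadata import distributions


def installed_package_names():
    """Return the lower-cased names of all packages visible to this interpreter."""
    names = set()
    for dist in distributions():
        name = dist.metadata.get("Name")
        if name:
            names.add(name.lower())
    return names


def undeclared_packages(installed, declared):
    """Return installed packages that do not appear in the declared set."""
    return sorted(set(installed) - set(declared))


# Made-up example: numpy was added with uv add, matplotlib with uv pip install
print(undeclared_packages({"numpy", "matplotlib"}, {"numpy"}))  # ['matplotlib']
print(len(installed_package_names()), "packages installed in this environment")
```

In a real environment the declared set would also need to include the dependencies of your dependencies, which is the information uv.lock records; uv sync takes care of that comparison for you.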

CLI tools

There are many great programs that can assist us with common tasks; a number of these are available as command-line interface (CLI) tools. We may, for example, want to use a linter/formatter to apply consistent formatting to our code.

An example with ruff

ruff is a linter/formatter. It can check if Python code follows a configurable style guide among other things. When using such a tool, we may not want to consider it a dependency of the project we are working on, as the project does not need it to run. So we don’t really want to use uv add; uv provides a few ways of using such a tool. Firstly:

uvx ruff

uvx creates a temporary virtual environment isolated from everything else and immediately runs the tool. This is great for getting a feel for whether a tool is something you want to use.

If you end up wanting to use the tool consistently, a dedicated environment for it can be created with

uv tool install ruff

and can still be used with the same command, but now from its own dedicated non-temporary virtual environment

uvx ruff

An example with httpie

Another convenient tool is httpie, which provides the http and https commands for making web requests from the command line. For example, we can download a file using

uvx --from httpie http --download <web_address> -o <save_path>
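For cases where the download should happen inside a script instead, roughly the same can be sketched with Python’s standard library. The function name download is an assumption made for illustration, and this is not how httpie works internally.

```python
import shutil
from urllib.request import urlopen


def download(url, save_path):
    """Fetch url and write the response body to save_path."""
    # Stream the response to disk instead of loading it all into memory
    with urlopen(url) as response, open(save_path, "wb") as target:
        shutil.copyfileobj(response, target)
    return save_path
```

It would be called as download("<web_address>", "<save_path>"), with the placeholders filled in just as in the command above.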

An example with jupyter

Sometimes you may want to use an interactive platform like a Jupyter notebook to explore, debug, etc. You can open a notebook in an isolated environment using

uvx jupyter lab

Or you can open the Jupyter Lab interface in the project’s environment using

uv run --with jupyter jupyter lab

Exercises

Exercise 1: Initialize a project with uv

In this exercise, you will recreate the environment discussed in Section 2. Start by creating the project directory using

uv init project_example

Afterwards, inspect the files that were created. Now add numpy as a dependency with

uv add "numpy>2.0.0"

Inspect the files in the directory again, and confirm that pyproject.toml was updated and that uv.lock was created. Add the following code to the main.py script

import numpy as np

def main():
    print('Numpy version:', np.version.full_version)


if __name__ == "__main__":
    main()

Run the script, either by activating the virtual environment and using python main.py or with uv run main.py. How does the version printed by the script compare to those stated by pyproject.toml and uv.lock?

Exercise 2: Compare uv add and uv pip

In the project from the previous exercise, run the command

uv pip install matplotlib 

Then check the output of

uv pip list

and see if pyproject.toml or uv.lock have been updated. Then run

uv sync

to restore the environment to the state described by the lockfile and rerun

uv pip list

Note

You can use this kind of workflow to experiment with other packages or to add packages to help you debug, etc.; uv sync allows you to easily reset to only those packages that are explicitly stated as dependencies.


Exercise 3: Reproduce a pandas environment

A colleague gives you the script below

import pandas as pd

def combine_experiment_data(baseline_data, treatment_data):
    """Combine experimental data from two groups."""
    baseline_df = pd.DataFrame(baseline_data)
    treatment_df = pd.DataFrame(treatment_data)
    combined = baseline_df.append(treatment_df, ignore_index=True)
    return combined

def analyze_results(df):
    """Analyze combined experimental results."""
    print(f"Total samples: {len(df)}")
    print("\nMean values by group:")
    print(df.groupby("group")["value"].mean())
    print("\nOverall statistics:")
    print(df["value"].describe())

def main():
    print("Pandas version:", pd.__version__)
    print()

    # Sample experimental data
    baseline = {"group": ["control"] * 5, "value": [23.5, 24.1, 22.8, 25.3, 23.9]}

    treatment = {"group": ["treatment"] * 5, "value": [28.2, 29.1, 27.5, 30.2, 28.8]}

    combined_data = combine_experiment_data(baseline, treatment)
    analyze_results(combined_data)


if __name__ == "__main__":
    main()

From the imports, you conclude that you need an environment with the pandas package to run the script. Use uv to set up such an environment.

Does the script run? What seems to be the problem?

You send the colleague a passive-aggressive email stating that the script does not work, and they provide you with a requirements.txt file shown below

pandas<2.0.0

Make this file and install it with uv pip install -r requirements.txt, then try rerunning the script.

With just one dependency, an issue like this could be resolved manually, but for a project with many dependencies it becomes extremely frustrating.

Note

Tutorials typically assume the newest version of the package the tutorial is about and generally do not provide any additional specifications.

However, for your own scientific work, reproducibility is key!


Exercise 4: Format code with ruff

You’re provided the script below

def calculate_stats(numbers):
    total=sum(numbers);average=total/len(numbers)
    maximum = max( numbers );minimum=min(numbers)
    return {'total':total,'average':average,'max':maximum,'min':minimum}

data=[1,2,3,4,5,6,7,8,9,10]
result=calculate_stats(data)
print(f"Results: Total={result['total']}, Average={result['average']}, Max={result['max']}, Min={result['min']}")

if result['average']>5:print("Average is greater than 5")
else:   print("Average is 5 or less")

This is a perfectly valid Python script; it is, however, not written in a particularly friendly way. Save the script, e.g., as messy_script.py and run

uvx ruff format messy_script.py

This uses uvx to run the ruff formatter on the script. Compare the readability of the formatted script to the original.

Note

While for a small script like the above, it may seem somewhat trivial to worry about formatting, consistent formatting makes working on larger projects much more pleasant.

Exercise 5: Analyze FASTA data with Biopython

Make a new project using uv init; you can, for example, call it bio-example.

Use uv add to add biopython as a dependency. The script below also imports matplotlib, so add that as a dependency as well.

Use httpie through uvx to download the FASTA file found at

https://raw.githubusercontent.com/biopython/biopython/master/Doc/examples/ls_orchid.fasta

In the main.py file, replace the default content with

from Bio import SeqIO
import matplotlib.pyplot as plt

def main():
    # Parse FASTA file
    records = list(SeqIO.parse("ls_orchid.fasta", "fasta"))

    # Extract sequence lengths
    lengths = [len(record.seq) for record in records]

    # Plot
    plt.hist(lengths, bins=10, edgecolor="black")
    plt.xlabel("Sequence length (bp)")
    plt.ylabel("Count")
    plt.title("Length distribution of orchid DNA sequences")
    plt.show()
    
if __name__ == "__main__":
    main()

and run the script using uv run main.py.
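To get a feel for what the FASTA format looks like, the sketch below reads it with plain Python. The function read_fasta is a simplified stand-in written for this note, not Biopython’s implementation, and the two records are made up and are not taken from ls_orchid.fasta.

```python
def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA-formatted lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            # Sequence data may be split over several lines
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


# Made-up two-record example
example = """>seq1 first made-up record
ACGTAC
GT
>seq2 second made-up record
GGCC
"""
lengths = [len(sequence) for _, sequence in read_fasta(example.splitlines())]
print(lengths)  # [8, 4]
```

SeqIO.parse handles many more formats and edge cases, which is a good reason to depend on the package instead of maintaining a parser like this yourself.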


Exercise 6: Run an ASE tutorial

The Atomic Simulation Environment (ASE) is a Python package that facilitates quantum mechanical simulations at the atomic level with interfaces to a variety of simulation codes.

This tutorial describes how to perform a set of calculations to find the volume of the unit cell of crystalline silver[3].

Your task is to set up a project (ase-example) in which you can run the code shown in the tutorial.


Exercise 7: Use Jupyter with PyTorch

Sometimes it reduces friction to try a package with a Jupyter notebook. You may, for example, have found the Introduction to PyTorch tutorial series.

Create a project, e.g., torch-example, add torch as a dependency and use the command

uv run --with jupyter jupyter lab

to open a notebook in the environment. Then copy some of the code from the tutorial, for example from the section on PyTorch Models, and run it.

Exercise 8: Reproduce a package tutorial

You’ve now created environments and run small examples for several packages. Now you will need to find a package, create a project, and reproduce a tutorial or documentation example of your own choosing.

To find a package, you can browse PyPI, use a search engine, or ask an LLM for suggestions. Make sure the package has documentation, or at least some example scripts.

Then follow the same steps as used above to set up an environment and work through your chosen tutorial/example.


Footnotes

  1. Perhaps this is not technically true, but the two are intertwined in spirit.

  2. Markdown, unlike say docx, can be read by any plain-text editor but can still be rendered in a way that is at least as nice as Word - these notes are written in markdown.

  3. This calculation is done with an empirical potential, so it can easily run on any laptop.