An Introduction to Python Package Managers
Python is a general-purpose, high-level language which, thanks to its simplicity and versatility, has become very popular, especially within the data science community. Over the years the Python community has developed and contributed thousands of libraries and packages across many disciplines to aid developers with their applications. Managing these packages can be a challenging task without the correct tools. That’s where Python package managers come in. In this blog post we will explore what a package manager is and why package managers are important. We will then cover some popular examples, including how to install them, how to use them and the pros and cons of each.
Whilst we will briefly touch on virtual environments in places, we will explore these in more depth in an upcoming post.
What is a Python Package Manager?
Python package managers are essential tools that help developers install, manage, and update external libraries or packages used in Python projects. These packages can contain reusable code, modules, and functions developed by other programmers, making it easier for developers to build applications without reinventing the wheel. Package managers automate the process of fetching, installing, and handling dependencies, streamlining the workflow and ensuring a smooth development experience.
Managing Package Dependencies
One of the key challenges in software development is dealing with dependencies — the external libraries and packages that your project relies on. Python package managers help alleviate this challenge by managing dependencies automatically. When you install a package, the package manager will also fetch and install any dependencies required by that package, recursively handling all transitive dependencies whilst making sure all package versions integrate with each other.
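To make the idea of transitive dependencies concrete, here is a minimal sketch of how an install order can be derived from a dependency graph. The package names are made up for illustration, the graph is assumed to contain no cycles, and version resolution is left out entirely, so this is a toy model rather than how any real package manager is implemented.

```python
# Toy dependency graph: the package names below are hypothetical.
DEPENDENCIES = {
    "webapp": ["http_client", "templating"],
    "http_client": ["url_parser"],
    "templating": ["url_parser"],
    "url_parser": [],
}


def install_order(package, graph, seen=None):
    """Return packages ordered so that dependencies come before dependents.

    Assumes the graph has no circular dependencies.
    """
    if seen is None:
        seen = []
    for dependency in graph[package]:
        if dependency not in seen:
            # Recurse so that dependencies of dependencies are handled too.
            install_order(dependency, graph, seen)
    if package not in seen:
        seen.append(package)
    return seen


print(install_order("webapp", DEPENDENCIES))
```

Note that url_parser is needed by two packages but only appears once in the result, which mirrors how a package manager installs a shared dependency a single time.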
Additionally, some package managers provide support for creating virtual environments. Virtual environments enable developers to create isolated and self-contained environments for each project, ensuring that the dependencies installed for one project do not interfere with another.
Popular Python Package Managers
There are many different Python package managers out there. Attempting to write about all of these would lead to an almost never-ending blog post and no one would want to read that! Instead, we will talk about some of the most popular options that a lot of Python developers use. These are: pip, conda and poetry. Each has its advantages and disadvantages, which we will talk through below.
pip
The most widely used Python package manager is pip (short for “pip installs packages”). It comes pre-installed with Python versions 3.4 and later. Pip allows developers to easily install packages from the Python Package Index (PyPI) and other repositories. It also handles package versioning, so you can install specific versions of packages when needed.
How to Install pip
Typically, pip is installed by default when you install Python. If this is not the case, there are two ways to install pip:
ensurepip
get-pip.py
ensurepip
The ensurepip module has been part of the Python standard library since Python 3.4. The command to run depends on your operating system:
Windows
In your preferred terminal run:
py -m ensurepip --upgrade
macOS
In your preferred terminal run:
python3 -m ensurepip --upgrade
Linux
In your preferred terminal run:
python3 -m ensurepip --upgrade
get-pip.py
An alternative way to install pip is by using the Python script get-pip.py.
Windows
- Firstly, download get-pip.py by visiting bootstrap.pypa.io/get-pip.py
- Open the Command Prompt, navigate to the directory where you have downloaded get-pip.py and then run:
py get-pip.py
macOS
- Open your preferred terminal
- Download get-pip.py:
curl https://bootstrap.pypa.io/get-pip.py -o get-pip.py
- Install pip by running:
python3 get-pip.py
Linux
- Open your preferred terminal
- Download get-pip.py:
wget https://bootstrap.pypa.io/get-pip.py
- Install pip by running:
python3 get-pip.py
To check that pip has been installed, run:
pip3 --version
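The same check can be made from inside Python, which is handy when several Python installations exist on one machine. The sketch below assumes only the standard library; the helper name pip_available is our own and not part of pip.

```python
import importlib.util
import subprocess
import sys


def pip_available():
    """Return True if the running interpreter can import the pip module."""
    return importlib.util.find_spec("pip") is not None


if pip_available():
    # "python -m pip" runs the pip that belongs to this exact interpreter.
    result = subprocess.run(
        [sys.executable, "-m", "pip", "--version"],
        capture_output=True,
        text=True,
    )
    print(result.stdout.strip())
else:
    print("pip is not installed for", sys.executable)
```

Running pip through sys.executable avoids accidentally checking a pip that belongs to a different Python installation.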
How to use pip
To install a package:
pip3 install package_name
To uninstall a package:
pip3 uninstall package_name
To upgrade a package:
pip3 install --upgrade package_name
Pip is one of the easier Python package managers to get started with. It is most likely already installed alongside Python and is simple to use. When you install a package with pip, it will also install any other packages that the desired package depends on. However, when you upgrade a package, pip may not automatically upgrade all of that package’s dependencies, which can lead to conflicts.
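If you are curious which dependencies an installed package declares, the standard library module importlib.metadata (available from Python 3.8) can report them. This is a small sketch; the helper name describe is hypothetical, and "pip" is used as the example only because it is likely to be installed.

```python
from importlib import metadata


def describe(package_name):
    """Return (version, declared requirements), or None if not installed."""
    try:
        version = metadata.version(package_name)
    except metadata.PackageNotFoundError:
        return None
    # requires() returns None when a package declares no dependencies.
    return version, metadata.requires(package_name) or []


print(describe("pip"))
```

The requirement strings returned are the ones the package author declared, so they show what pip would need to install alongside that package.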
conda
While pip is excellent for most projects, there are cases when you may need a more comprehensive package manager like conda. Conda is primarily associated with Anaconda and Miniconda, two Python distributions aimed at scientific computing and data science. Conda can manage not only Python packages but also non-Python libraries and binary packages, which it installs from conda channels rather than directly from PyPI. Furthermore, conda excels at handling dependencies and managing virtual environments (which will be discussed in a later blog).
How to install conda
Conda can be installed in two ways: by installing either Anaconda or Miniconda. We will only consider installing Miniconda in this blog.
Windows
- Download the Miniconda installer for Windows from docs.conda.io/en/latest/miniconda.html.
- Run the installer and follow the prompts to install Miniconda
macOS
- Download the Miniconda installer for macOS or Linux from docs.conda.io/en/latest/miniconda.html
- Open a terminal of your choice and navigate to the directory containing the downloaded installer
- Run the installer script:
zsh Miniconda3-latest-MacOSX-x86_64.sh
- Follow the prompts to install Miniconda
Linux
- Download the Miniconda installer for macOS or Linux from docs.conda.io/en/latest/miniconda.html
- Open a terminal of your choice and navigate to the directory containing the downloaded installer
- Run the installer script:
bash Miniconda3-latest-Linux-x86_64.sh
- Follow the prompts to install Miniconda
To check that conda has been installed, run:
conda --version
How to use conda
To install a package:
conda install package_name
To uninstall a package:
conda remove package_name
To upgrade a package:
conda update package_name
By default, conda will give preference to packages that are included in the Anaconda distribution. If you need to install PyPI packages that are not in the default conda distribution, you can install pip by running conda install pip, then follow the pip instructions above. This will install a version of pip within your conda environment. You need to be careful when using pip inside of conda. For more information, Anaconda have written a useful blog on the subject, including some best practices.
poetry
Poetry is a modern and comprehensive Python package manager that combines dependency management and project packaging. It aims to simplify the workflow of managing dependencies and version control, making it an attractive choice for Python developers.
How to install poetry
Installation of poetry is slightly more involved than pip and conda, but thankfully poetry have released a Python script to aid in installation, which can be accessed at install.python-poetry.org.
Windows
- If you are comfortable with using PowerShell, download and execute the installer script by running:
(Invoke-WebRequest -Uri https://install.python-poetry.org -UseBasicParsing).Content | py -
Otherwise, copy and paste the content of the Python script from install.python-poetry.org into a file called get-poetry.py and run:
py get-poetry.py
- The installer script will have created a poetry wrapper at %APPDATA%\Python\Scripts. This path needs to be added to your PATH environment variable if it has not already been added. You can find out more information on how to edit the PATH variable in this blog post
- You may need to restart your machine before the poetry command will work
macOS
- Using your preferred terminal download and execute the installer script by running:
curl -sSL https://install.python-poetry.org | python3 -
- The installer script will have created a poetry wrapper at $HOME/.local/bin. This path needs to be added to your $PATH if it has not already been added. To do this run:
vim ~/.zshrc
Press i (to enter insert mode) and add the following line to the file:
export PATH="$HOME/.local/bin:$PATH"
Press Esc and then enter :wq (which will write the file and quit)
- Finally, to make the poetry command available in your current terminal, run:
source ~/.zshrc
Linux
- Using your preferred terminal download and execute the installer script by running:
curl -sSL https://install.python-poetry.org | python3 -
- The installer script will have created a poetry wrapper at $HOME/.local/bin. This path needs to be added to your $PATH if it has not already been added. To do this run:
vim ~/.bashrc
Press i (to enter insert mode) and add the following line to the file:
export PATH="$HOME/.local/bin:$PATH"
Press Esc and then enter :wq (which will write the file and quit)
- Finally, to make the poetry command available in your current terminal, run:
source ~/.bashrc
To check that poetry has been installed, run:
poetry --version
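If the poetry command is not found, the usual cause is that the directory containing the wrapper is missing from your PATH. The following sketch checks this from Python using only the standard library. The helper name directory_on_path is our own, and $HOME/.local/bin is assumed to be the install location, as on macOS and Linux above.

```python
import os
import shutil
from pathlib import Path


def directory_on_path(directory, path_value):
    """Return True if directory is one of the entries in a PATH-style string."""
    target = Path(directory).expanduser()
    entries = [
        Path(entry).expanduser()
        for entry in path_value.split(os.pathsep)
        if entry
    ]
    return target in entries


# Assumed install location for the poetry wrapper on macOS and Linux.
local_bin = Path.home() / ".local" / "bin"
print("on PATH:", directory_on_path(local_bin, os.environ.get("PATH", "")))
# shutil.which returns the full path of the command, or None if not found.
print("poetry found at:", shutil.which("poetry"))
```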
How to use poetry
First you need to create a new project:
poetry new project_name
To install a package:
poetry add package_name
To uninstall a package:
poetry remove package_name
To upgrade a package:
poetry update package_name
Package Management in Project Workflows
Python package managers are a critical component of project workflows and can be used in various ways:
- Setting up development environments: Package managers help developers create consistent development environments across different machines by specifying the package version numbers. In pip this is in the form of a requirements.txt file, in conda this is an environment.yml file and in poetry this is a pyproject.toml file (which includes more than just Python packages).
- Continuous Integration (CI) and Deployment: Package managers facilitate the installation of dependencies in CI systems and deployment servers, ensuring that the application runs as expected in these environments.
- Version Control: Similar to setting up a development environment, by including a requirements.txt, environment.yml or pyproject.toml file in version control systems like Git, developers can ensure that collaborators and other team members have the same environment setup.
Creating and using these files is pretty straightforward. Let’s take a look at how to do this in pip, conda and poetry.
pip
When using pip, it is easy to create a requirements file. All you need to do is run the following command in the terminal.
pip3 freeze > requirements.txt
This will produce a file whose content looks similar to the example below.
flake8==4.0.1
numpy==1.25.2
pandas==2.1.0
scikit-learn==1.3.0
pip freeze will produce a list of all the packages you have installed, along with their dependencies and the versions for each package. This list is then written to a file called requirements.txt by using the > operator to redirect the output from pip freeze.
To install all the packages at the versions listed in a requirements file, run the following command in the directory containing the file.
pip3 install -r requirements.txt
We use the same command as before when installing a package, however the -r flag is needed to tell pip to look inside requirements.txt and pull all the packages and versions from this file.
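Because a requirements file of exact pins is just lines of name==version, it is easy to read from Python. The sketch below parses such lines and reports any package whose installed version differs from its pin. The function names are our own, only exact == pins are handled, and versions are compared as plain strings, so treat it as an illustration rather than a replacement for pip.

```python
from importlib import metadata


def parse_pins(text):
    """Parse 'name==version' lines into a dict, skipping blanks and comments."""
    pins = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        name, separator, version = line.partition("==")
        if separator:  # ignore lines that are not exact pins
            pins[name.strip()] = version.strip()
    return pins


def installed_version(name):
    """Look up the installed version of a package, or None if it is missing."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def find_mismatches(pins, lookup=installed_version):
    """Return {name: (pinned, installed)} for every pin that is not satisfied."""
    return {
        name: (pinned, lookup(name))
        for name, pinned in pins.items()
        if lookup(name) != pinned
    }


example = """flake8==4.0.1
numpy==1.25.2
pandas==2.1.0
scikit-learn==1.3.0
"""
print(find_mismatches(parse_pins(example)))
```

An empty dictionary means every pinned package is installed at exactly the pinned version. The lookup parameter exists so the comparison can be tried out without touching the real environment.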
conda
In pip a requirements.txt file is used to store package versions (in practice the file could be given any name, but it is standard practice to name it requirements.txt). In conda a YAML file is used instead, which is typically named environment.yml. YAML (which stands for YAML Ain’t Markup Language) files are human-readable and are often used for configuration. Within conda, YAML files store the necessary information about your conda environment, including the packages for the project you are working on and the version of Python being used (in practice this could be another programming language). To create an environment.yml file in conda you can use the command below.
conda env export > environment.yml
This will produce an environment.yml file which will be similar to the example below.
name: <environment_name>
channels:
  - defaults
dependencies:
  - flake8=4.0.1
  - numpy=1.25.2
  - pandas=2.1.0
  - python=3.10.8
  - scikit-learn=1.3.0
conda env export is similar to pip freeze and will export all the relevant packages from your environment with their versions. However, instead of a plain list, the output is in a format suitable for a YAML file. As with pip, we use the > operator to write the output of conda env export into environment.yml.
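To highlight how the two file formats differ, here is a small sketch that renders a minimal environment.yml from a dictionary of pinned versions. The function name and the demo_env environment name are made up, and it assumes each package has the same name on the conda channel as in your pins, which is not always true in practice.

```python
def to_environment_yml(name, pins, channels=("defaults",)):
    """Render a minimal environment.yml from {package: version} pins."""
    lines = [f"name: {name}", "channels:"]
    lines += [f"  - {channel}" for channel in channels]
    lines.append("dependencies:")
    # conda uses a single '=' for an exact version, where pip uses '=='.
    lines += [
        f"  - {package}={version}"
        for package, version in sorted(pins.items())
    ]
    return "\n".join(lines) + "\n"


print(to_environment_yml("demo_env", {"python": "3.10.8", "numpy": "1.25.2"}))
```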
To install all the packages and their dependencies with the specific versions from an environment.yml file use the following command.
conda env create -f environment.yml
This will create a new conda environment with all the packages and versions specified in environment.yml. We will talk more about Python environments and their uses in an upcoming blog.
poetry
By default, when you create a new poetry project (using poetry new <PROJECT-NAME>) a pyproject.toml file will be generated. Once you have added packages to your poetry project, your pyproject.toml file will look similar to the example below:
[tool.poetry]
name = "<environment_name>"
version = "0.1.0"
description = ""
authors = ["Jane Doe <jane.doe@123evergreenterrace.com>"]
[tool.poetry.dependencies]
python = "^3.10"
numpy = "^1.25.2"
pandas = "^2.1.0"
[build-system]
requires = ["poetry-core"]
build-backend = "poetry.core.masonry.api"
The Python packages you install can be seen under [tool.poetry.dependencies] along with the Python version. You can add extra requirements to the pyproject.toml file by either manually editing it, or by using poetry add <package>. If you want to manually edit the TOML file, note that the caret notation ^ allows any later version that keeps the leftmost non-zero part of the version number unchanged, e.g. python = "^3.10" allows Python 3.10 or above but below 4.0.
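The caret rule is easier to see in code. The sketch below follows the behaviour described in Poetry's documentation, where the leftmost non-zero component of the version may not change. It is our own simplified illustration, not Poetry's implementation: the function names are hypothetical, and it only handles plain numeric versions with up to three components.

```python
def _parse(version):
    """Turn '3.10' into a (3, 10, 0) tuple, padding missing parts with zeros."""
    parts = [int(part) for part in version.split(".")]
    return tuple(parts + [0] * (3 - len(parts)))


def caret_bounds(constraint):
    """Return (lower, upper) tuples: lower is inclusive, upper is exclusive."""
    version = constraint.lstrip("^")
    parts = [int(part) for part in version.split(".")]
    # The leftmost non-zero component is the one that must not change;
    # if every component is zero, fall back to the last one given.
    index = next(
        (i for i, part in enumerate(parts) if part != 0),
        len(parts) - 1,
    )
    upper = parts[:index] + [parts[index] + 1]
    return _parse(version), tuple(upper + [0] * (3 - len(upper)))


def satisfies(version, constraint):
    """Return True if version falls inside the caret constraint's range."""
    lower, upper = caret_bounds(constraint)
    return lower <= _parse(version) < upper


print(caret_bounds("^3.10"))
print(satisfies("3.12.1", "^3.10"), satisfies("4.0.0", "^3.10"))
```

So a constraint such as numpy = "^1.25.2" accepts later 1.x releases but would reject a 2.0.0 release.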
When you install dependencies in a poetry project, the exact version numbers of the installed packages and their dependencies are added to a “poetry.lock” file located in the same directory. The pyproject.toml and poetry.lock files can then be shared with a colleague, who can install the dependencies by running the following command in the same directory as the files:
poetry install
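To use the dependencies installed by poetry, you need to activate the poetry environment. In Poetry 1.x this is done with the command below. From Poetry 2.0 the shell command is provided by a separate plugin, and poetry run <command> can be used instead: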
poetry shell
You will now be using the same development environment as any colleagues that are working on the same project. We will learn more about poetry and other virtual environments in an upcoming blog.
If instead you have a requirements.txt file, we can still install the packages and relevant versions using poetry. This can be done as follows:
poetry add $( cat requirements.txt )
This will add each package and version to the pyproject.toml file. The $( cat requirements.txt ) part is shell command substitution: it reads requirements.txt and passes each entry to poetry add as an argument.
Conclusion
Package manager | Easy to install | Online support | Latest packages always available | Virtual environment manager | Handles package dependencies | Small installation size | Multi-platform | Access to PyPI | Easy Python package publishing |
---|---|---|---|---|---|---|---|---|---|
pip | ✅ | ✅ | ✅ | ❌ | ❌ | ✅ | ❌ | ✅ | ❌ |
conda | ✅ | ✅ | ❌ | ✅ | ✅ | ❌ | ✅ | ✅* | ❌ |
poetry | ❌ | ✅ | ✅ | ✅ | ✅ | ✅ | ❌ | ✅ | ✅ |
Installing Python package managers is a straightforward process that varies slightly based on your operating system. Regardless of whether you’re using Windows, macOS, or Linux, setting up these tools is a small investment that pays off in significantly improved project management and development practices. The table above summarises some pros and cons of each package manager we have covered. The asterisk against conda reflects that PyPI packages are reached by installing pip inside the conda environment, as described earlier. I would not recommend installing all three package managers at once, as it may become confusing to remember what you have installed with which package manager. Instead, choose whichever you like the look of best and try that one first. Personally, I would recommend either pip or conda if this is your first introduction to Python, and poetry if you are working on a collaborative project. Ultimately, choose the package manager that best suits your needs and enjoy the benefits of efficient dependency management and streamlined development workflows.