Configuring Python Workspace: Poetry
In the previous article, I described my approach to configuring a Python workspace. I mentioned there that I do not use poetry because it "cannot be used to specify dependencies when you work with Jupyter notebooks". However, people (@BasicWolf and @iroln) from the Russian tech website Habr recommended that I take a closer look at poetry, as it apparently can fulfil all my requirements. "Two heads are better than one", so I started to explore this tool more deeply. Indeed, I have managed to fulfil all my requirements with it, albeit with some configuration. In this post, I describe how to configure poetry to meet my requirements and how to use it.
I have modified my script for configuring a Python workspace to add the possibility of using poetry for dependency management. In particular, compared to the previous version, I have added the following part:
```bash
if [ $USE_POETRY -eq 1 ]; then
    echo "Installing poetry..."
    pyenv activate tools3
    curl -sSL https://raw.githubusercontent.com/python-poetry/poetry/master/get-poetry.py | python
    source $HOME/.poetry/env
    # configuring poetry to create venv directories inside the project
    poetry config virtualenvs.in-project true

    # adding lookup of bash completion files in user's directory
    mkdir -p ~/.bash_completion.d/
    if ! [ -f ~/.bash_completion ]; then
        echo '' >> ~/.bash_completion
        echo 'for bcfile in ~/.bash_completion.d/* ; do' >> ~/.bash_completion
        echo '  [ -f "$bcfile" ] && . $bcfile' >> ~/.bash_completion
        echo 'done' >> ~/.bash_completion
        echo '' >> ~/.bashrc
        echo 'if [ -f ~/.bash_completion ]; then' >> ~/.bashrc
        echo '    source ~/.bash_completion' >> ~/.bashrc
        echo 'fi' >> ~/.bashrc
    fi
    if ! [ -f ~/.bash_completion.d/poetry.bash-completion ]; then
        poetry completions bash > ~/.bash_completion.d/poetry.bash-completion
    fi
    pyenv deactivate
else
    echo "Installing virtualenv..."
    pyenv activate tools3
    pip install virtualenv
    pyenv deactivate

    mkdir -p ~/.bash_fns
    curl -L https://raw.githubusercontent.com/zyrikby/blog_related/master/2020-02-configuring-python-workspace/pip_functions.sh > ~/.bash_fns/pip_functions.sh
    echo '' >> ~/.bashrc
    echo 'if [ -f ~/.bash_fns/pip_functions.sh ]; then' >> ~/.bashrc
    echo '    source ~/.bash_fns/pip_functions.sh' >> ~/.bashrc
    echo 'fi' >> ~/.bashrc
    # activating for current shell session
    source ~/.bash_fns/pip_functions.sh
fi
```
The `if` statement checks whether the user wants to use poetry for dependency management or prefers my custom solution (described in the previous article). If `USE_POETRY` is equal to `1`, the script installs poetry and configures it. Let's consider in detail what is happening there.
```bash
pyenv activate tools3
# installing poetry
curl -sSL https://raw.githubusercontent.com/python-poetry/poetry/master/get-poetry.py | python
```
In this code fragment, the script activates the `tools3` pyenv global environment and installs poetry according to the official recommendation. Poetry adds its directory to the path in the `.profile` file, so the poetry commands would become available only after the next login. In order to make them available right away, the script runs `source $HOME/.poetry/env`, which activates poetry for the current shell session.
The command `poetry config virtualenvs.in-project true` tells poetry to create the virtual environment directory (`.venv`) inside the project directory. By default, poetry uses a separate cache directory where it stores all virtual-environment-related files. In the previous article, I mentioned that in this case VSCode would not be able to activate the virtual environment automatically. Thus, this configuration is required to fulfil my requirement of automatic virtual environment activation.
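If you want to double-check this setting later, `poetry config` can print a single value back when given the key as an argument (the transcript below is illustrative):

```
$ poetry config virtualenvs.in-project
true
```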
The following fragment activates bash completion for the poetry commands. The poetry documentation recommends enabling bash completion with the command `poetry completions bash > /etc/bash_completion.d/poetry.bash-completion`. Unfortunately, on (k)Ubuntu 18.04 this command fails with a `Permission denied` error because poetry does not have permission to write to the `/etc/bash_completion.d/` directory. To avoid this error, you can either run the command with elevated privileges (e.g., using `sudo`) or add support for loading bash completion files from your user directory:
```bash
# adding lookup of bash completion files in user's directory
mkdir -p ~/.bash_completion.d/
if ! [ -f ~/.bash_completion ]; then
    echo '' >> ~/.bash_completion
    echo 'for bcfile in ~/.bash_completion.d/* ; do' >> ~/.bash_completion
    echo '  [ -f "$bcfile" ] && . $bcfile' >> ~/.bash_completion
    echo 'done' >> ~/.bash_completion
    echo '' >> ~/.bashrc
    echo 'if [ -f ~/.bash_completion ]; then' >> ~/.bashrc
    echo '    source ~/.bash_completion' >> ~/.bashrc
    echo 'fi' >> ~/.bashrc
fi
```
After that, you just need to add the poetry bash completion file to this directory if it is not already there:
```bash
if ! [ -f ~/.bash_completion.d/poetry.bash-completion ]; then
    poetry completions bash > ~/.bash_completion.d/poetry.bash-completion
fi
```
After you have executed this installation script, please log out and log in again to activate the configuration.
Development Workflows with Poetry
Poetry stores the project configuration in the `pyproject.toml` file. Mainly, you use it to specify your package and development dependencies (they are all stored in the same file but in different toml sections). When you install dependencies, poetry creates the `poetry.lock` file, where it stores the exact versions of the dependencies after resolving them. You should put this file under version control so that your collaborators can later replicate your virtual environment.
If you do not find something in this tutorial, please refer to the official documentation of this tool.
Python Development Workflow
As I described in my previous article, I use Python mainly to prepare some scripts required for my research. They are not supposed to be installed into the system. Thus, most often I make a root directory for the scripts related to a research project and initialize the poetry tool there:
```
$ mkdir -p ~/projects/new_scripts_proj
$ cd ~/projects/new_scripts_proj
$ poetry init
```
This command asks several questions about the project settings and, as a result, creates a `pyproject.toml` file inside the directory with the following content (I have not added any dependencies yet):
```toml
[tool.poetry]
name = "new_scripts_proj"
version = "0.1.0"
description = ""
authors = ["Yury Zhauniarovich <email@example.com>"]

[tool.poetry.dependencies]
python = "^3.8"

[tool.poetry.dev-dependencies]

[build-system]
requires = ["poetry>=0.12"]
build-backend = "poetry.masonry.api"
```
Now, you can add your dependencies either directly to this file (note that there are two different sections for package and development dependencies) or using the
poetry add command:
```
$ poetry add "numpy==1.18.0"
```
The `poetry add` command adds the dependency to the `pyproject.toml` file and installs it. Thus, the first time you run this command, it also creates a `.venv` directory with the virtual environment files inside the project folder, and automatically generates the `poetry.lock` file. The `--dev` option allows you to add a development dependency. For instance, you can install pylint using the command `poetry add --dev pylint`. After these operations, your `pyproject.toml` should look similar to this:
```toml
[tool.poetry]
name = "new_scripts_proj"
version = "0.1.0"
description = ""
authors = ["Yury Zhauniarovich <firstname.lastname@example.org>"]

[tool.poetry.dependencies]
python = "^3.8"
numpy = "1.18.0"

[tool.poetry.dev-dependencies]
pylint = "^2.4.4"

[build-system]
requires = ["poetry>=0.12"]
build-backend = "poetry.masonry.api"
```
As you can see, the numpy library has been added to the package dependencies with the exact version we have provided. At the same time, the pylint package has been added to development dependencies, and its version is specified using caret requirements. You can read about all ways of specifying dependency versions supported by poetry in the documentation.
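For reference, here is a short summary of the most common constraint styles (the semantics below follow the poetry documentation; the comments only illustrate what each form allows):

```toml
[tool.poetry.dev-dependencies]
# caret: "^2.4.4" allows >=2.4.4,<3.0.0 (compatible releases)
pylint = "^2.4.4"
# tilde: "~2.4" would allow >=2.4,<2.5
# wildcard: "1.18.*" would allow >=1.18.0,<1.19.0
# exact: "1.18.0" pins a single version
```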
You can list all dependencies in the current virtual environment using the `poetry show` command. By default, this command shows all dependencies and their versions; however, it has several useful options. For instance, the `--tree` option shows the dependency tree, while `--no-dev` suppresses the output of development dependencies. However, the most useful option is `--outdated` (`-o` for short): it shows the list of outdated packages. For instance, in the output below we can see that the latest available version of numpy is 1.18.1:
```
$ poetry show
astroid           2.3.3  An abstract syntax tree for Python with inference support.
isort             4.3.21 A Python utility / library to sort Python imports.
lazy-object-proxy 1.4.3  A fast and thorough lazy object proxy.
mccabe            0.6.1  McCabe checker, plugin for flake8
numpy             1.18.0 NumPy is the fundamental package for array computing with Python.
pylint            2.4.4  python code static checker
six               1.14.0 Python 2 and 3 compatibility utilities
wrapt             1.11.2 Module for decorators, wrappers and monkey patching.

$ poetry show --tree
numpy 1.18.0 NumPy is the fundamental package for array computing with Python.
pylint 2.4.4 python code static checker
├── astroid >=2.3.0,<2.4
│   ├── lazy-object-proxy >=1.4.0,<1.5.0
│   ├── six >=1.12,<2.0
│   └── wrapt >=1.11.0,<1.12.0
├── colorama *
├── isort >=4.2.5,<5
└── mccabe >=0.6,<0.7

$ poetry show --no-dev
numpy 1.18.0 NumPy is the fundamental package for array computing with Python.

$ poetry show -o
numpy 1.18.0 1.18.1 NumPy is the fundamental package for array computing with Python.
wrapt 1.11.2 1.12.0 Module for decorators, wrappers and monkey patching.
```
Poetry can also be used to update dependencies. You can either update all package dependencies using the `poetry update` command, or update a particular library by providing its name as an argument. By default, this command updates the dependencies to the latest allowed versions. This means that in our case the command `poetry update numpy` will not update the numpy library, because only version `1.18.0` is allowed by our `pyproject.toml` file. If you want to update this dependency, you first need to modify `pyproject.toml`, weakening the specified constraint, e.g., changing the line to `numpy = "1.18.*"`. You can read more about how to specify dependency versions in the documentation.
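For example, after weakening the constraint the dependencies section would look as follows, and a subsequent `poetry update numpy` would then be allowed to pick up a newer 1.18.x release such as 1.18.1:

```toml
[tool.poetry.dependencies]
python = "^3.8"
# wildcard constraint: any 1.18.x release is now allowed
numpy = "1.18.*"
```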
After the project is created, you can open VSCode in this directory, and it should automatically activate the virtual environment thanks to the configuration we made previously. However, if you use other tools, you may need to run some commands inside the created virtual environment. In this case, you can use two commands: `poetry run <command>` and `poetry shell`. The former runs a single provided command, while the latter spawns a new shell session with the virtual environment activated.
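For example, to run a script inside the project's virtual environment (the file name `analysis.py` below is just an illustration), or to get an interactive shell with the environment activated:

```
$ poetry run python analysis.py
$ poetry shell
```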
If you plan to develop a package, you may consider creating a project using the
poetry new command. In this case, poetry will create some boilerplate code and files so that you can start developing your package faster:
```
$ poetry new --src poetry-demo
Created package poetry_demo in poetry-demo
$ cd poetry-demo/
$ tree
.
├── pyproject.toml
├── README.rst
├── src
│   └── poetry_demo
│       └── __init__.py
└── tests
    ├── __init__.py
    └── test_poetry_demo.py
```
The `--src` option puts the sources of the package into the `src` directory. If you plan to distribute packages, this is the preferred way of organizing your project.
Then, you can use the `poetry build` and `poetry publish` commands to build the package and publish it on PyPI. However, as I previously mentioned, I currently do not publish packages, so if you need this functionality, please refer to the poetry documentation.
Finally, if you collaborate with someone who still uses the `requirements.txt` file to define dependencies, poetry can export the dependencies into this format:
```
$ poetry export -f requirements.txt
numpy==1.18.0 \
    --hash=sha256:b091e5d4cbbe79f0e8b6b6b522346e54a282eadb06e3fd761e9b6fafc2ca91ad \
    --hash=sha256:443ab93fc35b31f01db8704681eb2fd82f3a1b2fa08eed2dd0e71f1f57423d4a \
    --hash=sha256:88c5ccbc4cadf39f32193a5ef22e3f84674418a9fd877c63322917ae8f295a56 \
    --hash=sha256:e1080e37c090534adb2dd7ae1c59ee883e5d8c3e63d2a4d43c20ee348d0459c5 \
    --hash=sha256:f084d513de729ff10cd72a1f80db468cff464fedb1ef2fea030221a0f62d7ff4 \
    --hash=sha256:1baefd1fb4695e7f2e305467dbd876d765e6edd30c522894df76f8301efaee36 \
    --hash=sha256:cc070fc43a494e42732d6ae2f6621db040611c1dde64762a40c8418023af56d7 \
    --hash=sha256:6f8113c8dbfc192b58996ee77333696469ea121d1c44ea429d8fd266e4c6be51 \
    --hash=sha256:a30f5c3e1b1b5d16ec1f03f4df28e08b8a7529d8c920bbed657f4fde61f1fbcd \
    --hash=sha256:3c68c827689ca0ca713dba598335073ce0966850ec0b30715527dce4ecd84055 \
    --hash=sha256:f6a7421da632fc01e8a3ecd19c3f7350258d82501a646747664bae9c6a87c731 \
    --hash=sha256:905cd6fa6ac14654a6a32b21fad34670e97881d832e24a3ca32e19b455edb4a8 \
    --hash=sha256:854f6ed4fa91fa6da5d764558804ba5b0f43a51e5fe9fc4fdc93270b052f188a \
    --hash=sha256:ac3cf835c334fcc6b74dc4e630f9b5ff7b4c43f7fb2a7813208d95d4e10b5623 \
    --hash=sha256:62506e9e4d2a39c87984f081a2651d4282a1d706b1a82fe9d50a559bb58e705a \
    --hash=sha256:9d6de2ad782aae68f7ed0e0e616477fbf693d6d7cc5f0f1505833ff12f84a673 \
    --hash=sha256:1c35fb1131362e6090d30286cfda52ddd42e69d3e2bf1fea190a0fad83ea3a18 \
    --hash=sha256:56710a756c5009af9f35b91a22790701420406d9ac24cf6b652b0e22cfbbb7ff \
    --hash=sha256:03bbde29ac8fba860bb2c53a1525b3604a9b60417855ac3119d89868ec6041c3 \
    --hash=sha256:712f0c32555132f4b641b918bdb1fd3c692909ae916a233ce7f50eac2de87e37 \
    --hash=sha256:a9d72d9abaf65628f0f31bbb573b7d9304e43b1e6bbae43149c17737a42764c4
```
If you do not need the hashes, add the `--without-hashes` option to the command. In this case, the output should look more familiar.
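For instance, for the example project above, the export without hashes would reduce to the pinned dependency versions:

```
$ poetry export -f requirements.txt --without-hashes
numpy==1.18.0
```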
Data Analysis Workflow
The data analysis workflow is similar to the Python development workflow: you create a new directory, initialize a new poetry project, and install all the necessary dependencies. Then, you just need to run Jupyter Notebook from the virtual environment:
```
$ poetry run jupyter notebook
```
Note that we do not need to install Jupyter Notebook into our environment: it is globally available thanks to the pyenv `jupyter` global environment. At the same time, the configuration we have made ensures that the packages installed by poetry into the virtual environment are also available in the notebooks.
Personally, from now on I plan to use poetry for dependency management because it fulfils my requirements. Moreover, contrary to my custom approach described in the previous article, which relies on pip (which currently lacks a proper dependency resolver), poetry employs one; therefore, you should always get consistent versions of the dependencies in your virtual environment.
However, if you mostly develop Python packages rather than simple scripts, you may want to consider a more advanced project management tool called DepHell. It was also recommended to me; however, I have found it quite complicated to adapt to my scenarios.