Configure Poetry, Git & VS Code for Jupyter notebooks
The idea of this tutorial is to set up Python so that I can start working with JupyterLab. With this setup I would like to be able to:
- Create a virtual environment in Python to manage my packages
- Version control my Jupyter notebooks and other relevant files
- Configure VS Code to run Jupyter notebooks
- Finally, deal with the annoying problem of Git and Jupyter (more about it later)
Configuring Poetry
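In this step we install Poetry and the necessary packages.
- Install python-poetry to get started with Poetry
- Create a new project using poetry new, which takes the project directory as its argument (<project-name> below is a placeholder)
poetry new <project-name>
- From inside the new project directory, add the necessary packages using poetry add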
poetry add jupyterlab
poetry add scikit-learn
poetry add matplotlib
Configuring VS Code
- After installing VS Code, install the following extensions
- Python (ms-python)
- Jupyter (ms-toolsai)
[!important]
The extension installation might fail with recent builds, such as the one on Manjaro. If it does, follow the procedure below.

- Press Ctrl + Shift + P and run Preferences: Configure Runtime Arguments, which opens argv.json
- Add the following entry, separated by a comma from the entry before it
"enable-proposed-api": ["ms-toolsai.jupyter", "ms-python.python", "ms-toolsai.jupyter-renderers"]
To diagnose these errors, run Developer: Toggle Developer Tools from the command palette and go to the Console tab. The errors printed there are a hint on what to do.
Configure Git to avoid committing every run of the same notebook as a change
The problem
Generally, every run of a notebook shows up as a change. This happens because a lot of metadata about the run, such as the cell run count and time, changes on each run. These repeated changes make it difficult to use Git effectively, because the code changes get mixed up with the run changes.
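To make the noise concrete, here is a minimal sketch that writes two snapshots of one made-up code cell and diffs them. The cell contents, the file names and the write_cell helper are all hypothetical; a real .ipynb file wraps many such cells in a larger JSON document. The source line is identical in both snapshots, yet the diff still reports changes.

```shell
# Hypothetical helper: print one notebook code cell as JSON.
# $1 = execution count, $2 = captured output text
write_cell() {
  cat <<EOF
{
 "cell_type": "code",
 "execution_count": $1,
 "metadata": {},
 "outputs": [{"name": "stdout", "output_type": "stream", "text": ["$2\n"]}],
 "source": ["import time; print(time.time())"]
}
EOF
}

# Scratch directory so the demo leaves no files in the project.
workdir="$(mktemp -d)"
write_cell 1 "1700000000.1" > "$workdir/first_run.json"
write_cell 2 "1700000042.7" > "$workdir/second_run.json"

# diff exits with status 1 when the files differ, hence the `|| true`.
diff "$workdir/first_run.json" "$workdir/second_run.json" || true
```

Only the execution_count and outputs lines appear in the diff; the source line does not. Those are the fields that clearing the notebook before a commit gets rid of.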
Solutions
- The general approach to this problem is to clean up the Jupyter notebook before every commit. The problem with this approach is that one has to remember to do it. Also, if many files have been run, each file has to be cleaned up manually.
- Automatically cleaning the notebooks before commit is a better solution. It doesn't require people to remember to clean the files before every commit.
The steps below automate this cleanup every time a notebook is added to the staging area.
- Initialize Git in the folder using git init
git init
- Create a .gitignore file to ignore all files except
- notebook files
- Poetry lock and toml files
- Create a .gitattributes file in the source directory with a single line
*.ipynb filter=strip-notebook-output
- Append the following at the end of the file .git/config
[filter "strip-notebook-output"]
clean = "jupyter nbconvert --ClearOutputPreprocessor.enabled=True --to=notebook --stdin --stdout --log-level=ERROR"
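The same setup can also be done from the command line rather than by editing .git/config by hand. The sketch below runs in a throwaway repository created with mktemp (my assumption, so that nothing real is touched) and uses a hypothetical file name, analysis.ipynb. It writes the .gitattributes line, stores the clean command with git config, and then asks Git to confirm both.

```shell
# Throwaway repository for the demo; in a real project, start from its root instead.
repo="$(mktemp -d)"
cd "$repo"
git init --quiet

# Route every notebook through the filter when it is staged.
printf '*.ipynb filter=strip-notebook-output\n' > .gitattributes

# Has the same effect as appending the [filter "strip-notebook-output"] block to .git/config.
git config filter.strip-notebook-output.clean \
  'jupyter nbconvert --ClearOutputPreprocessor.enabled=True --to=notebook --stdin --stdout --log-level=ERROR'

# Print the stored command to confirm it was written.
git config --get filter.strip-notebook-output.clean

# Ask Git which filter applies to a notebook path (the file does not need to exist).
git check-attr filter -- analysis.ipynb
```

git check-attr should report strip-notebook-output as the filter for the notebook path. The filter definition lives in .git/config, which is not versioned, so every fresh clone needs the git config step again, while .gitattributes travels with the repository.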
Sample .gitignore file
*.*
!.gitattributes
!.gitignore
!*.py
!*.ipynb
!*.md
!pyproject.toml
!poetry.lock
!poetry.toml
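A quick way to check that the ignore patterns behave as intended is to try them in a scratch repository. The sketch below is only an illustration: the temporary directory and the empty files (data.csv, analysis.ipynb and so on) are made up, and the pattern list is written inline, including !pyproject.toml on the assumption that the project file should be versioned next to poetry.lock.

```shell
# Scratch repository so the experiment does not touch a real project.
repo="$(mktemp -d)"
cd "$repo"
git init --quiet

# Ignore every name containing a dot, then re-include the files worth versioning.
printf '%s\n' '*.*' '!.gitattributes' '!.gitignore' '!*.py' '!*.ipynb' \
  '!*.md' '!pyproject.toml' '!poetry.lock' '!poetry.toml' > .gitignore

# Hypothetical project files, created empty just to test the patterns.
touch data.csv plot.png analysis.ipynb notes.md pyproject.toml poetry.lock

echo "Would be tracked:"
git ls-files --others --exclude-standard

echo "Ignored:"
git ls-files --others --ignored --exclude-standard
```

With these patterns, data.csv and plot.png end up in the ignored list while the notebook, the Markdown file and the Poetry files remain visible to Git. Since *.* only matches names that contain a dot, files such as Makefile or LICENSE are not ignored by this pattern set.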
Sample .gitattributes file
*.ipynb filter=strip-notebook-output
Sample .git/config file
[core]
repositoryformatversion = 0
filemode = true
bare = false
logallrefupdates = true
[gitflow "branch"]
master = master
develop = develop
[gitflow "prefix"]
feature = feature/
bugfix = bugfix/
release = release/
hotfix = hotfix/
support = support/
versiontag =
[gitflow "path"]
hooks = /home/bharad/Work/ML/.git/hooks
[filter "strip-notebook-output"]
clean = "jupyter nbconvert --ClearOutputPreprocessor.enabled=True --to=notebook --stdin --stdout --log-level=ERROR"
[gitg]
mainline = refs/heads/master