How to Organize a Self-contained Python Project
I often use and study open-source code from the Internet and run into the following situation:
I find a nice code repo on GitHub, clone it to my local computer, and then spend a lot of time just getting the code up and running, because I have to figure out things like the Python version, the required packages and their specific versions, Jupyter configurations, etc.
In this short tutorial, I show how I organize a self-contained Python project that can be up and running with minimal effort. I use a data science project with Jupyter as an example and store it on GitHub.
Prerequisite: you need to set up Python 3, Git, and, optionally, a code editor such as Atom. See How to Setup Mac for Python Development.
- Create a virtual environment
- Create a requirements.txt file and specify required packages and versions
- Install packages within the virtual environment and run Python programs
First, let’s create a new repo on GitHub. Make sure you choose the Python .gitignore file (if you don’t know what this is, see https://help.github.com/en/github/using-git/ignoring-files).
Clone the repo to your local Mac:
$ git clone https://github.com/harrywang/self-contained-project.git
Go inside the newly created folder:
$ cd self-contained-project/
Then do the following.
Note: creating and activating the virtual environment works differently on Windows 10; please refer to How to Setup Python 3 on Windows 10.
1. Create a Python 3 virtual environment in a folder named “venv”:
$ python3 -m venv venv
Note that you can change the virtual environment name to anything you like, but “venv” is a convention and is already listed in the default GitHub Python .gitignore file, so you won’t accidentally push the virtual environment folder to GitHub.
2. Activate the virtual environment:
$ source venv/bin/activate
3. Create a requirements.txt file and add the packages you need for the project.
Here I install Jupyter in the virtual environment, which saves the trouble of configuring a system-level Jupyter to work with different virtual environments. The drawback is that it adds about 200 MB to the virtual environment and makes package installation slightly slower. However, if you prefer to install Jupyter once at the system level and configure it to use different virtual environments (that works perfectly well too), just remove jupyter from the requirements.txt file and follow the steps at the end of this tutorial.
I also added pandas for data analysis. Next, install the packages listed in the file:
$ pip install -r requirements.txt
The terminal output shows the version of pandas you just installed.
If you don’t specify package versions in the requirements.txt file, the latest versions will be installed, which may or may not be what you want: packages keep evolving, and your code may break with future versions. Therefore, it is good practice to pin each version explicitly with a line of the form package==version. You can add other packages in the same way.
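As a rough sketch of how pinned lines could be generated instead of typed by hand, the Python snippet below looks up the version of each named package in the current environment and prints it in the package==version form. The helper name pin_requirements is hypothetical, and the snippet assumes Python 3.8 or later (for importlib.metadata) and that it is run with the virtual environment activated.

```python
from importlib import metadata


def pin_requirements(names):
    """Return one 'name==version' line per installed package.

    Hypothetical helper for illustration; packages that are not installed
    in the current environment are reported as comment lines instead.
    """
    lines = []
    for name in names:
        try:
            lines.append(f"{name}=={metadata.version(name)}")
        except metadata.PackageNotFoundError:
            lines.append(f"# {name} is not installed in this environment")
    return lines


if __name__ == "__main__":
    # The two packages used in this tutorial; adjust the list as needed.
    print("\n".join(pin_requirements(["jupyter", "pandas"])))
```

The printed lines can be pasted into requirements.txt. Running pip freeze is a related option, but it prints every installed package, including dependencies, whereas this sketch only lists the packages you name.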
4. Start Jupyter to work on your project:
$ jupyter notebook
5. You can now save the project, commit all changes, and push the code to GitHub:
$ git add .
$ git commit -am 'finished the tutorial'
$ git push
Now you have created a highly portable and self-contained Python project. To sum up, anyone who has Python 3 can get the code up and running by simply doing the following:
$ git clone https://github.com/harrywang/self-contained-project.git
$ cd self-contained-project
$ python3 -m venv venv
$ source venv/bin/activate
$ pip install -r requirements.txt
$ jupyter notebook
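To double-check that a script or notebook really runs inside the virtual environment, a small Python check along these lines may help. It is only a sketch: the function name in_virtualenv is made up for illustration, and it relies on the fact that environments created with python3 -m venv set sys.prefix to the environment folder while sys.base_prefix keeps pointing at the base installation.

```python
import sys


def in_virtualenv():
    """Return True when the running interpreter belongs to a virtual environment."""
    # Inside a venv, sys.prefix is the venv folder and sys.base_prefix is the
    # base Python installation; outside a venv the two values are equal.
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    print("interpreter:", sys.executable)
    print("inside a virtual environment:", in_virtualenv())
```

Pasting the function into a notebook cell and calling it there shows whether the kernel is using the project's environment or the system-level Python.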
How to configure Jupyter with different virtual environments
Assuming you have Jupyter installed at the system level, follow these steps to set up a virtual environment as a kernel in Jupyter:
- go to your project folder and create a virtual environment named “venv”
- activate the virtual environment
- install ipykernel in the virtual environment
- install the current virtual environment as a kernel for Jupyter. NOTE: you have to use a unique name for each project.
$ python3 -m venv venv
$ source venv/bin/activate
(venv) $ pip install ipykernel
(venv) $ ipython kernel install --user --name=unique_project_name
- list all kernels and remove the kernel for a virtual environment if needed
$ jupyter kernelspec list
$ jupyter kernelspec uninstall unique_project_name
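Because each project needs its own kernel name, one option is to derive the name from the project folder. The Python sketch below illustrates that idea; kernel_name_for and install_command are hypothetical helpers, and the character set kept in the name (lowercase letters, digits, dot, underscore, hyphen) is a conservative assumption, not something this tutorial specifies. The command it builds uses python -m ipykernel install, which is an alternative to the ipython kernel install form shown above.

```python
import re
import sys
from pathlib import Path


def kernel_name_for(project_dir):
    """Derive a kernel name from the project folder name (hypothetical helper)."""
    raw = Path(project_dir).resolve().name.lower()
    # Assumption: keep only letters, digits, '.', '_' and '-';
    # collapse every other run of characters into a single hyphen.
    name = re.sub(r"[^a-z0-9._-]+", "-", raw).strip("-")
    return name or "project"


def install_command(project_dir):
    """Build the kernel installation command for the current interpreter."""
    name = kernel_name_for(project_dir)
    return [sys.executable, "-m", "ipykernel", "install", "--user", f"--name={name}"]


if __name__ == "__main__":
    # Print the command instead of running it, so nothing is installed by accident.
    print(" ".join(install_command(".")))
```

Two projects in different locations can still share a folder name, so it is worth checking the result against the output of jupyter kernelspec list before installing.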