In this post I will demonstrate step-by-step how to set up a Jupyter Lab server inside a docker container. Of course, scripts, workflows, docker images etc. exist online which efficiently automate all these steps. What I would like to do, though, is give a sense of what is roughly behind the scenes so you can develop a better understanding. Let’s start.
Docker container set-up
First of all, after opening the terminal what you need to do is download the Ubuntu docker image:
$ docker pull ubuntu:latest
Then you need to create and optionally name a docker container using the above image and connect to its bash interactive terminal:
$ docker run -it --name my-jupyter-lab -p 0.0.0.0:56000:8888 ubuntu bash
Check my previous article for more info about ports.
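To make the -p argument above a bit more concrete, here is a minimal Python sketch that splits a publish spec into its parts. The names parse_publish_spec and PortMapping are made up for this illustration, and the sketch only handles the two simple forms shown, not everything Docker accepts.

```python
from typing import NamedTuple


class PortMapping(NamedTuple):
    host_ip: str
    host_port: int
    container_port: int


def parse_publish_spec(spec: str) -> PortMapping:
    """Split 'ip:host_port:container_port' or 'host_port:container_port'."""
    parts = spec.split(":")
    if len(parts) == 3:
        host_ip, host_port, container_port = parts
    elif len(parts) == 2:
        # Assumption for this sketch: no IP means "all host interfaces".
        host_ip = "0.0.0.0"
        host_port, container_port = parts
    else:
        raise ValueError(f"unsupported publish spec: {spec!r}")
    return PortMapping(host_ip, int(host_port), int(container_port))


mapping = parse_publish_spec("0.0.0.0:56000:8888")
print(mapping)
```

In our case 56000 is the port you will type in the browser on the host side, and 8888 is the port Jupyter Lab listens on inside the container.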
Now, inside the container’s terminal, create a user and give it elevated privileges. This is optional but recommended as an additional layer of security for the container. For extra info on Docker security check this and this.
To add the new user (named, e.g., testuser), use the high-level and more user-friendly adduser tool below. You only need to enter a password; the rest of the requested information can be left blank.
$ adduser testuser
In order to give elevated privileges you need to add the new user to the sudo group. However, you first need to install sudo. Along with it you can also install the rest of the software that we will need later (I will explain each piece when we get to it):
$ apt update
$ apt install sudo nano openssl python3.8 python3-pip -y
Now you can add the new user to the sudo group and exit the interactive terminal:
$ adduser testuser sudo
$ exit
The Docker container has now stopped running, which you can verify by typing $ docker ps on your host system. Therefore, you will need to start the container again and get back into the interactive terminal, but this time as testuser:
$ docker container start my-jupyter-lab
$ docker exec -itu testuser my-jupyter-lab bash
Jupyter Lab set-up
Next, simply install Jupyter Lab in the container’s system:
$ pip install jupyterlab
After Jupyter Lab has been installed you will probably notice some yellow warnings. They tell you that the ~/.local/bin path is not in the $PATH environment variable. This essentially means that when you type the command jupyter lab it won’t run, since the system does not know where to find the executable. If you want to test it, simply type $ jupyter lab in the container’s terminal. To make the system aware of the executable’s path, modify the .bashrc file using the text editor nano.
$ cd ~
$ nano .bashrc
Go to the end of the file and add the following line. After adding it, press Ctrl+O to save the change, Enter to confirm, and Ctrl+X to close the .bashrc file.
export PATH="$HOME/.local/bin:$PATH"
Next, execute the commands in .bashrc in the interactive shell so the Jupyter Lab executable’s path is added to the $PATH environment variable:
$ source .bashrc
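If you are curious why the PATH change matters, the small Python sketch below imitates the lookup the shell performs. It creates a dummy executable in a temporary directory (the name jupyter-demo-tool is made up for this demo) and shows that it is only found once its directory is part of the search path.

```python
import os
import shutil
import stat
import tempfile

with tempfile.TemporaryDirectory() as bin_dir:
    # Create a dummy executable; "jupyter-demo-tool" is a made-up name.
    tool = os.path.join(bin_dir, "jupyter-demo-tool")
    with open(tool, "w") as fh:
        fh.write("#!/bin/sh\necho hello\n")
    os.chmod(tool, os.stat(tool).st_mode | stat.S_IXUSR)

    old_path = os.environ.get("PATH", "")
    # Lookup with the unmodified search path: the tool is not found.
    found_before = shutil.which("jupyter-demo-tool", path=old_path)

    # Same idea as: export PATH="<bin_dir>:$PATH"
    new_path = bin_dir + os.pathsep + old_path
    found_after = shutil.which("jupyter-demo-tool", path=new_path)

print("before:", found_before)
print("after: ", found_after)
```

The export line in .bashrc does the same thing for ~/.local/bin, which is where pip placed the jupyter executables for testuser.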
You then need to create a notebook configuration file:
$ jupyter lab --generate-config
The notebook configuration file initializes certain variables at Jupyter Lab startup. You can (and will) modify these values. But before that, you need to ensure that the connection to the Jupyter Lab web interface through the browser is secured. This can be achieved by encrypting the communication channel with the TLS protocol (remember the openssl package we installed in the beginning).
You can achieve this by running the following line, which creates a self-signed certificate. You will be asked for several details that you can leave empty. For more info on openssl check this.
$ openssl req -x509 -nodes -days 365 -newkey rsa:2048 -keyout /home/testuser/.jupyter/mykey.key -out /home/testuser/.jupyter/mycert.pem
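If you would rather not answer the interactive questions, openssl req also accepts a -subj option that supplies the certificate subject up front. The Python sketch below only assembles such a command so you can inspect it before running it; build_openssl_command is a hypothetical helper, and the subject /CN=localhost is an assumption you should adapt to your own host.

```python
import shlex
from pathlib import PurePosixPath
from typing import List


def build_openssl_command(
    cert_dir: PurePosixPath, common_name: str = "localhost", days: int = 365
) -> List[str]:
    """Assemble a non-interactive variant of the openssl command above."""
    return [
        "openssl", "req", "-x509", "-nodes",
        "-days", str(days),
        "-newkey", "rsa:2048",
        "-keyout", str(cert_dir / "mykey.key"),
        "-out", str(cert_dir / "mycert.pem"),
        # -subj answers the subject questions up front (assumed value).
        "-subj", f"/CN={common_name}",
    ]


cmd = build_openssl_command(PurePosixPath("/home/testuser/.jupyter"))
print(shlex.join(cmd))
```

You can paste the printed line into the container’s terminal, or pass the list to subprocess.run if you prefer to stay in Python.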
You will notice that two files are created, i.e., mykey.key and mycert.pem, which you will use later. After that it’s good to create a hashed password. For this example I will use the SHA-512 algorithm. So, just run python3 and type the following (you will be prompted to enter the password twice):
from jupyter_server.auth import passwd  # on older installs: from notebook.auth import passwd
passwd(algorithm='sha512')
exit()
The result will be a string which you need to keep for later.
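To give a rough idea of what such a string contains, here is a minimal sketch of a salted hash in the same algorithm:salt:hash layout. The functions hash_password and check_password are my own illustrations, and the exact way the password and salt are combined is an assumption; use the passwd function above for the real thing.

```python
import hashlib
import secrets


def hash_password(passphrase: str, algorithm: str = "sha512", salt_len: int = 12) -> str:
    """Return 'algorithm:salt:hexdigest' for a salted hash of passphrase."""
    salt = secrets.token_hex(salt_len // 2)  # salt_len hex characters
    digest = hashlib.new(algorithm)
    # Assumed scheme: hash the password bytes followed by the salt.
    digest.update(passphrase.encode("utf-8") + salt.encode("ascii"))
    return ":".join((algorithm, salt, digest.hexdigest()))


def check_password(hashed: str, passphrase: str) -> bool:
    """Recompute the hash with the stored salt and compare."""
    algorithm, salt, expected = hashed.split(":", 2)
    digest = hashlib.new(algorithm)
    digest.update(passphrase.encode("utf-8") + salt.encode("ascii"))
    return secrets.compare_digest(digest.hexdigest(), expected)


hashed = hash_password("correct horse")
print(hashed)
print(check_password(hashed, "correct horse"), check_password(hashed, "wrong"))
```

The point of the salt is that two users with the same password still end up with different strings, and the server only ever needs to store the hash, never the password itself.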
Going back to the notebook configuration file, open it (the -l flag makes nano show line numbers) and change/uncomment the following lines:
$ nano .jupyter/jupyter_lab_config.py -l
# path to mycert.pem
c.ServerApp.certfile = '/home/testuser/.jupyter/mycert.pem'
# denotes all IPs
c.ServerApp.ip = '*'
# path to mykey.key
c.ServerApp.keyfile = '/home/testuser/.jupyter/mykey.key'
c.ServerApp.open_browser = False
# the SHA-512 hashed password string generated in the previous step
c.ServerApp.password = 'sha512:228d0b43180c:e00bf58...'
# port that the Jupyter server will bind to
c.ServerApp.port = 8888
# force use of bash shell
c.ServerApp.terminado_settings = {"shell_command": ["/bin/bash"]}
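That was it! Now you only need to run the following lines and close the terminal. Don’t worry, the Jupyter Lab server will keep running thanks to & disown. Then just go to your browser and type https://<public IP>:56000. Because the certificate is self-signed, the browser will show a security warning that you need to accept before continuing.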
$ mkdir notebooks && cd notebooks
$ jupyter lab & disown
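For the curious, roughly the same "start it and let go" idea can be sketched in Python with subprocess. This is only an analogy to & disown, not what the shell does internally: start_detached is a hypothetical helper, and a harmless stand-in command is used so the sketch runs anywhere.

```python
import subprocess
import sys
from typing import List


def start_detached(argv: List[str]) -> subprocess.Popen:
    """Start argv in its own session so closing the terminal does not stop it."""
    return subprocess.Popen(
        argv,
        stdin=subprocess.DEVNULL,
        stdout=subprocess.DEVNULL,  # output is discarded in this sketch
        stderr=subprocess.DEVNULL,
        start_new_session=True,  # POSIX: detach from the terminal's session
    )


# Stand-in command; in the container you would pass ["jupyter", "lab"].
proc = start_detached([sys.executable, "-c", "print('server would run here')"])
print("started process", proc.pid)
```

Note that this sketch throws the server output away; redirect stdout and stderr to a file instead if you want to keep the logs.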
Extra info: How to accelerate the above process
You can also bypass many of the steps above by starting with a Docker image dedicated to Python or data science. You can also skip the step of exposing the Jupyter Lab app to the outside world by using an internal instead of the external IP address and then connecting to it through an SSH tunnel (i.e., exploiting the proxy server concept).
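As a rough sketch of the SSH tunnel idea, the snippet below assembles a local port-forwarding command. It assumes the container port was published only on the server’s loopback address (for example with -p 127.0.0.1:56000:8888), and user@my-server is a placeholder for your own SSH login; build_tunnel_command is a hypothetical helper.

```python
import shlex
from typing import List


def build_tunnel_command(
    ssh_target: str, local_port: int = 8888, remote_port: int = 56000
) -> List[str]:
    """Forward local_port on your machine to remote_port on the server."""
    return [
        "ssh",
        "-N",  # do not run a remote command, only forward ports
        "-L", f"{local_port}:localhost:{remote_port}",
        ssh_target,
    ]


cmd = build_tunnel_command("user@my-server")
print(shlex.join(cmd))
# prints: ssh -N -L 8888:localhost:56000 user@my-server
```

With the tunnel open you would browse to https://localhost:8888 on your own machine, and the Jupyter Lab port never has to be reachable from the public internet.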
Thanks for reading and keep reading!