HTTP Authentication for Jupyter Notebooks

Run Jupyter Notebooks from a remote server with HTTP Authentication using Apache as a reverse proxy

If, like me, you use a lot of Jupyter Notebooks, you might want to run them from a remote server. More RAM, more computing power and dedicated resources can be quite useful!

Jupyter notebooks provide a solution to run remotely with a simple password protection. It works, but it’s not very secure.
I wanted to add a layer of security with some basic HTTP authentication. Here’s what I did, using Let’s Encrypt and Apache’s built-in authentication and mod_proxy.


Note: I used Apache as a reverse proxy, but you could use the same configuration principles with any other webserver.


Configure your Jupyter Notebook server

First, you’ll need to configure Jupyter to run as a server.
Create a configuration file with this command:

jupyter notebook --generate-config

A wild jupyter_notebook_config.py file appears!
Before editing it, we’ll generate a self-signed certificate and its private key so that our Jupyter server can run on HTTPS. As your Jupyter server will only communicate with Apache, a basic self-signed certificate is enough:

openssl req -x509 -nodes -days 365 -newkey rsa:2048 -keyout privkey.pem -out fullchain.pem

Note: you could use Let’s Encrypt to generate a trusted certificate, but that’s not essential. See the example below if you want to see how.


Open the jupyter_notebook_config.py file and insert or edit these lines:

c.NotebookApp.certfile = '/path/to/fullchain.pem'
c.NotebookApp.keyfile = '/path/to/privkey.pem'
c.NotebookApp.ip = 'localhost'
c.NotebookApp.allow_origin = 'https://example.com'
c.NotebookApp.trust_xheaders = True
c.NotebookApp.base_url = '/jupyter'
c.NotebookApp.open_browser = False

certfile and keyfile point to your newly generated SSL certificate and its private key.
Your server will only answer to localhost (ip), but will have to allow requests from https://example.com, as indicated in allow_origin (remember to replace with your own domain). You also need to set trust_xheaders to True for the same reason.
I found out that this setup only works if base_url is something other than the root of your domain. However, you can easily redirect traffic from the root of your domain to this base_url (we’ll do that later).
Finally, you can ask Jupyter not to try and open a browser, which probably wouldn’t work on a remote server anyway.
We’ll leave the other options to default values.


That’s all for Jupyter. You can now run your notebook server:

jupyter notebook


I recommend trying to access it using your remote server IP address or hostname (https://XXX.XXX.XXX.XXX:8888/jupyter for instance), in order to check if everything is indeed secured.
If everything went well, you should not be able to access anything, because Jupyter only listens on localhost.
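If you’d rather script that check than poke at it in a browser, here’s a minimal Python sketch to run from another machine. It only tests whether a TCP connection to the Jupyter port can be opened. The function name is mine, and the address in the usage comment is a placeholder for your own server.

```python
import socket


def is_port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, unreachable or timed out: nothing is answering on that port.
        return False


# Example (placeholder address, replace with your server's IP or hostname):
# is_port_open("203.0.113.10", 8888)
```

Run from outside the server, this should return False for port 8888, since Jupyter is bound to localhost only.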


Configure your proxy

Now we’ll configure a new Apache VirtualHost on our remote server, which will work as a proxy for our Jupyter instance. This means instead of accessing your notebooks with a URL such as http://localhost:8888 (which you can’t on a remote server), you’ll be able to use them with your own domain.

We’ll create a new file in /etc/apache2/sites-available/ to store this new VirtualHost:

sudo nano /etc/apache2/sites-available/jupyter-notebooks.conf  

Let’s add some very basic configuration:

<VirtualHost *:80>
        ServerName example.com
</VirtualHost>
<IfModule mod_ssl.c>
    <VirtualHost *:443>
            ServerName example.com
            ServerAdmin you@example.com

            ErrorLog ${APACHE_LOG_DIR}/error.log
            CustomLog ${APACHE_LOG_DIR}/com.example-access.log combined

            Redirect permanent / https://example.com/jupyter
    </VirtualHost>
</IfModule>

Your host will now answer both for HTTP (*:80) and HTTPS (*:443) (we need HTTP for now because it will be used by Let’s Encrypt to validate our SSL certificate).
A simple redirect leads from https://example.com/ to https://example.com/jupyter.
The rest is pretty basic: we just store access logs in a specific file.


We’ll need to generate an auth file, with htpasswd. Depending on your system, you might need to install it first: on Debian for instance, it’s part of apache2-utils.
Then simply run:

htpasswd -c auth_file_name user_name  

You’ll be prompted to choose a new password for user_name, and it will be stored (hashed, not in plain text) in a new auth_file_name file.


You can now edit your Apache conf file, and add these lines at the end of the <VirtualHost *:443> section:

<Location "/jupyter">
        AuthName "Please login"
        AuthType basic
        AuthBasicProvider file
        AuthUserFile "/path/to/auth_file_name"
        Require valid-user
</Location>

This tells Apache that users need to be authenticated in order to access https://example.com/jupyter, using credentials stored in our auth_file_name file.
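To see why Basic authentication should only ever travel over HTTPS, it helps to look at what the browser actually sends: an Authorization header containing user:password, base64-encoded but not encrypted. Here’s a small Python sketch of that encoding and how trivially it is reversed. The function names are mine, not part of any library.

```python
import base64


def basic_auth_header(username, password):
    """Build the value of the Authorization header for HTTP Basic auth."""
    credentials = f"{username}:{password}".encode("utf-8")
    return "Basic " + base64.b64encode(credentials).decode("ascii")


def decode_basic_auth_header(value):
    """Recover (username, password) from a Basic Authorization header value."""
    scheme, _, token = value.partition(" ")
    if scheme.lower() != "basic":
        raise ValueError("not a Basic authorization header")
    # Split on the first colon only: the password itself may contain colons.
    username, _, password = base64.b64decode(token).decode("utf-8").partition(":")
    return username, password
```

Anyone who can read the header can read the password, which is why the VirtualHost on port 443 is the one carrying the authentication block.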


At this point, your new VirtualHost is functional. If you set your domain’s DNS to point to your server, you should be able to access it. Simply enable the VirtualHost and reload Apache’s config:

sudo a2ensite jupyter-notebooks.conf  
sudo service apache2 reload  


Now, let’s enable a few Apache mods:

sudo a2enmod proxy proxy_http proxy_wstunnel ssl headers  


We can now configure Apache as a proxy for our Jupyter notebooks. Edit your VirtualHost as follows:

<VirtualHost *:80>
        ServerName example.com
</VirtualHost>
<IfModule mod_ssl.c>
    <VirtualHost *:443>
            ServerName example.com
            ServerAdmin you@example.com

            ErrorLog ${APACHE_LOG_DIR}/error.log
            CustomLog ${APACHE_LOG_DIR}/com.example-access.log combined

            SSLProxyEngine On
            SSLProxyVerify none
            SSLProxyCheckPeerCN off
            SSLProxyCheckPeerName off
            SSLProxyCheckPeerExpire off

            Redirect permanent / https://example.com/jupyter

            <Location "/jupyter">
                    AuthName "Please login"
                    AuthType basic
                    AuthBasicProvider file
                    AuthUserFile "/path/to/auth_file_name"
                    Require valid-user

                    ProxyPass https://localhost:8888/jupyter
                    ProxyPassReverse https://localhost:8888/jupyter
                    ProxyPassReverseCookieDomain localhost example.com
                    RequestHeader set Origin "https://localhost:8888"
            </Location>

            <Location "/jupyter/api/kernels">
                    ProxyPass ws://localhost:8888/jupyter/api/kernels
                    ProxyPassReverse ws://localhost:8888/jupyter/api/kernels
                    ProxyPass wss://localhost:8888/jupyter/api/kernels
                    ProxyPassReverse wss://localhost:8888/jupyter/api/kernels
            </Location>
    </VirtualHost>
</IfModule>

This simply proxies HTTPS traffic from https://example.com/jupyter to https://localhost:8888/jupyter and back, and does the same for Jupyter’s internal API, which uses WebSockets.
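If the “and back” part feels abstract, here’s a rough Python model of the two directions: ProxyPass maps the public URL to the backend one, and ProxyPassReverse rewrites backend URLs found in redirect headers back to the public one. This is only an illustration of the idea using the prefixes from the config above, not how Apache implements it, and the function names are mine.

```python
PUBLIC_PREFIX = "https://example.com/jupyter"
BACKEND_PREFIX = "https://localhost:8888/jupyter"


def swap_prefix(url, old, new):
    """Replace the old prefix with the new one if url lives under old."""
    if url == old or url.startswith(old + "/") or url.startswith(old + "?"):
        return new + url[len(old):]
    return url  # not under the proxied path, leave untouched


def to_backend(public_url):
    """Roughly what ProxyPass does with an incoming request URL."""
    return swap_prefix(public_url, PUBLIC_PREFIX, BACKEND_PREFIX)


def to_public(backend_url):
    """Roughly what ProxyPassReverse does with a Location header."""
    return swap_prefix(backend_url, BACKEND_PREFIX, PUBLIC_PREFIX)
```

For instance, a redirect from Jupyter to https://localhost:8888/jupyter/login would reach the browser as https://example.com/jupyter/login.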


Generate an SSL certificate with Let’s Encrypt

Let’s Encrypt is a free, automated and open certificate authority that issues trusted SSL certificates.
Go to the website to get the exact setup instructions for your system. For example on Debian:

sudo apt-get install python-certbot-apache -t stretch-backports


Now you can run certbot:

sudo certbot

You’ll be asked to choose which of the available VirtualHosts you want to create a certificate for. Certbot will generate and verify your new certificate, and ask you if you want to redirect HTTP traffic to HTTPS (which you should).
It will then edit your /etc/apache2/sites-available/jupyter-notebooks.conf file, which should now look like this:

<VirtualHost *:80>
        ServerName example.com
        RewriteEngine on
        RewriteCond %{SERVER_NAME} =example.com
        RewriteRule ^ https://%{SERVER_NAME}%{REQUEST_URI} [END,NE,R=permanent]
</VirtualHost>
<IfModule mod_ssl.c>
    <VirtualHost *:443>
            ServerName example.com
            ServerAdmin you@example.com

            ErrorLog ${APACHE_LOG_DIR}/error.log
            CustomLog ${APACHE_LOG_DIR}/com.example-access.log combined

            SSLProxyEngine On
            SSLProxyVerify none
            SSLProxyCheckPeerCN off
            SSLProxyCheckPeerName off
            SSLProxyCheckPeerExpire off

            Redirect permanent / https://example.com/jupyter

            <Location "/jupyter">
                    AuthName "Please login"
                    AuthType basic
                    AuthBasicProvider file
                    AuthUserFile "/path/to/auth_file_name"
                    Require valid-user

                    ProxyPass https://localhost:8888/jupyter
                    ProxyPassReverse https://localhost:8888/jupyter
                    ProxyPassReverseCookieDomain localhost example.com
                    RequestHeader set Origin "https://localhost:8888"
            </Location>

            <Location "/jupyter/api/kernels">
                    ProxyPass ws://localhost:8888/jupyter/api/kernels
                    ProxyPassReverse ws://localhost:8888/jupyter/api/kernels
                    ProxyPass wss://localhost:8888/jupyter/api/kernels
                    ProxyPassReverse wss://localhost:8888/jupyter/api/kernels
            </Location>

            Include /etc/letsencrypt/options-ssl-apache.conf
            SSLCertificateFile /etc/letsencrypt/live/example.com/fullchain.pem
            SSLCertificateKeyFile /etc/letsencrypt/live/example.com/privkey.pem
    </VirtualHost>
</IfModule>

Note: Certbot can automatically renew your certificates before they expire. I advise you to enable this very useful feature (RTFM).
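If you want to keep an eye on renewal yourself, here’s a minimal Python sketch that reads the expiry date of the certificate a host is serving and converts it to a number of days. It uses only the standard library. The function names are mine, and example.com stands in for your own domain.

```python
import socket
import ssl
import time


def days_until_expiry(not_after, now=None):
    """Days left before a certificate expires.

    not_after is the 'notAfter' string returned by getpeercert(),
    for example 'Jan 15 09:34:43 2030 GMT'.
    """
    if now is None:
        now = time.time()
    return (ssl.cert_time_to_seconds(not_after) - now) / 86400


def fetch_not_after(hostname, port=443, timeout=10):
    """Connect over TLS and return the served certificate's 'notAfter' field."""
    context = ssl.create_default_context()
    with socket.create_connection((hostname, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=hostname) as tls:
            return tls.getpeercert()["notAfter"]


# Example (replace example.com with your own domain):
# print(days_until_expiry(fetch_not_after("example.com")))
```

A value that keeps shrinking toward zero is a good hint that automatic renewal is not running.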


Now it’s time to reload Apache’s conf:

sudo service apache2 reload  


Wrapping up

You should now be all set to run your notebooks from a pretty and secure URL: https://example.com/jupyter.
If everything went fine, you’ll be asked for your username and password before accessing the Jupyter notebooks.
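To check this without a browser, here’s a short Python sketch that requests the URL with and without credentials and reports the HTTP status code. I’d expect 401 without credentials and 200 with valid ones, assuming the setup above. The function name is mine, and the URL and credentials in the usage comment are placeholders.

```python
import urllib.error
import urllib.request


def status_code(url, username=None, password=None, timeout=10):
    """Return the HTTP status of a GET on url, optionally using Basic auth."""
    # Ignore any system-wide proxy settings so we talk to the server directly.
    handlers = [urllib.request.ProxyHandler({})]
    if username is not None:
        manager = urllib.request.HTTPPasswordMgrWithDefaultRealm()
        manager.add_password(None, url, username, password)
        # Answers the server's 401 challenge with the stored credentials.
        handlers.append(urllib.request.HTTPBasicAuthHandler(manager))
    opener = urllib.request.build_opener(*handlers)
    try:
        with opener.open(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        return error.code


# Example (placeholders, use your own domain and htpasswd credentials):
# status_code("https://example.com/jupyter")
# status_code("https://example.com/jupyter", "user_name", "your_password")
```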

Now get back to real work ;-)
