Infrastructure code to prepare AWS resources and deploy a JupyterHub instance

davidalber/jupyterhub-aws

JupyterHub AWS

This repository contains Terraform and Ansible code to set up AWS resources and deploy a JupyterHub instance using The Littlest JupyterHub (TLJH).

The infrastructure code here offers an alternative to the manual steps in TLJH's Installing on Amazon Web Services guide. You can set up TLJH by following that guide, but if you are running a production instance of JupyterHub, you may prefer to have your infrastructure defined in code. That makes it simple to create, replace, and destroy the infrastructure reproducibly.

The downside of this approach is that you will need to install additional dependencies, whereas the manual approach from the guide is done entirely from the AWS console.

The code in this repository also sets up HTTPS, which is covered in a different TLJH guide: Enable HTTPS.

The process is broken down into steps:

When the process is complete, you will have a web service through which users can work on Jupyter notebooks without needing to set up Python and Jupyter on their own machines.

[Figure: New JupyterHub user account]

Feedback

I set this up because I was curious what operationalizing JupyterHub could look like, but I also put in effort to make it general enough for others to customize it easily to their situation. If you find this useful, please consider letting me know (star or email); that would mean a lot to me. Feel free to send feedback or other thoughts.

If I were using this in a shared environment with real users, there's more that I would do. If something can be added to help you, let me know. I might find time to add it. Some potential features follow.

Backups

A shared deployment needs a way to back up and restore data in case something goes wrong. The most obvious option is snapshotting the EBS volume, but I would also investigate how JupyterHub can be backed up without a complete snapshot.
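As a rough illustration of the EBS snapshot option, the following Terraform sketch uses Amazon Data Lifecycle Manager to take a daily snapshot of tagged volumes. It is not part of this repository: the variable name, the `Backup` tag, the schedule, and the retention count are all assumptions, and the IAM role that DLM assumes is expected to exist already.

```hcl
# Hypothetical input; this repository does not define this variable.
variable "dlm_execution_role_arn" {
  description = "ARN of an existing IAM role that Data Lifecycle Manager can assume"
  type        = string
}

resource "aws_dlm_lifecycle_policy" "jupyterhub_backup" {
  description        = "Daily snapshots of the JupyterHub EBS volume"
  execution_role_arn = var.dlm_execution_role_arn
  state              = "ENABLED"

  policy_details {
    resource_types = ["VOLUME"]

    # Only volumes carrying this tag are snapshotted (the tag is an assumption).
    target_tags = {
      Backup = "jupyterhub"
    }

    schedule {
      name = "daily"

      create_rule {
        interval      = 24
        interval_unit = "HOURS"
        times         = ["03:00"] # UTC
      }

      retain_rule {
        count = 7 # keep the seven most recent snapshots
      }

      copy_tags = true
    }
  }
}
```

For this to have any effect, the instance's volume would need to carry the matching tag. Restoring from a snapshot is a separate step that this sketch does not cover.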

Installing System Packages with Ansible

Users on JupyterHub can install packages into their own environments, but some packages are likely to be used by many users. In that situation, it makes sense to install them once for everyone. TLJH has the guide Install conda / pip packages for all users, but I would investigate doing this in an Ansible playbook so that the environment is more reproducible.

DNS Configuration with Route53

For a fully-coded setup, I would manage DNS with Route53. To avoid extra complication for potential users, I didn't do that here. However, it might be nice to have optional configuration for Route53 users.
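One way the optional Route53 configuration could look is sketched below. It is not implemented in this repository, and every variable name here is hypothetical. The record is created only when a hosted zone ID is supplied, so users who manage DNS elsewhere would be unaffected.

```hcl
# Hypothetical inputs; none of these variables exist in this repository.
variable "route53_zone_id" {
  description = "Hosted zone ID; leave empty to skip Route53 management"
  type        = string
  default     = ""
}

variable "jupyterhub_domain" {
  description = "Fully qualified domain name for the JupyterHub instance"
  type        = string
}

variable "jupyterhub_public_ip" {
  description = "Public IP address of the EC2 instance running JupyterHub"
  type        = string
}

resource "aws_route53_record" "jupyterhub" {
  # Create the record only when a hosted zone ID was provided.
  count = var.route53_zone_id == "" ? 0 : 1

  zone_id = var.route53_zone_id
  name    = var.jupyterhub_domain
  type    = "A"
  ttl     = 300
  records = [var.jupyterhub_public_ip]
}
```

In a real configuration the IP address would more likely come from the EC2 instance resource than from a variable; it is a variable here only to keep the sketch self-contained.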

Features Considered and Cut

Elastic IP

An option that would simplify DNS configuration (at least if you ever need to replace the EC2 instance) would be to attach an Elastic IP. This would give the user a fixed IP address across EC2 instance redeployments.

In the past, Elastic IP addresses were free while in use. AWS, however, recently began charging for in-use addresses. Given that, I did not think that there would be demand for using an Elastic IP and did not implement it.
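For readers who decide the fixed address is worth the fee, attaching an Elastic IP could look roughly like the sketch below. This is not part of the repository, and the variable and resource names are hypothetical.

```hcl
# Hypothetical input; this repository does not define this variable.
variable "jupyterhub_instance_id" {
  description = "ID of the EC2 instance running JupyterHub"
  type        = string
}

resource "aws_eip" "jupyterhub" {
  domain   = "vpc" # AWS provider v5 and later; older versions used `vpc = true`
  instance = var.jupyterhub_instance_id
}

# The address to point the DNS record at; it survives instance replacement
# as long as the aws_eip resource itself is not destroyed.
output "jupyterhub_elastic_ip" {
  value = aws_eip.jupyterhub.public_ip
}
```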
