DEPRECATED: Zero to JupyterHub in 15 minutes
THIS HAS NOW MOVED TO THE GITHUB REPOSITORY BELOW. WE’LL BUILD OFF OF IT. 

A quick guide to creating your own JupyterHub deployment. See an overview by hovering over the table of contents at the left edge of the screen. This is very much a work in progress, so it will probably change quite a bit.

TODO

  • Improve the “extending your setup” section with explanations for requesting more resources.
  • Add a “managing your JupyterHub” section that explains how to list users, resources, etc.
  • Improve the OAuth instructions so people know how to generate authentication credentials themselves.
  • Build the mechanism for people to specify new packages.
  • Build the mechanism for people to specify a GitHub repo they want loaded in the root folder.
  • Add instructions for connecting a Docker image / GitHub repo / etc.
  • Switch the backend in Binder because now we’ve basically recreated Binder ;-)
  • Create a repository that contains field: list_of_dependency pairs that we can use to compose domain-specific Docker images @Chris H
  • Move these docs into a repository with a Sphinx build.
  • 🍺🍺🍺🍺🍺🍺🍺
  • Begin receiving angry emails from people for whom this didn’t work.

Intro

This tutorial shows you how to set up a JupyterHub installation on a Kubernetes cluster (on Google Cloud), using Helm to manage installation and upgrades. First, here’s a short breakdown of the tools we’ll use and how they fit together:

Tools for setting up JupyterHub

  • Google Cloud will provide the computing power that we’ll use. It is a service from Google that essentially lets us rent time on their computers, with lots of tools for scaling our usage up or down depending on our needs. There are other cloud services out there (such as Microsoft Azure and Amazon Web Services), but we’ll focus on Google’s for now.
  • You can access Google Cloud from its web console or via the “gcloud” command-line tool, part of the Google Cloud SDK that you can download to your computer.
  • Kubernetes is a system that runs on cloud infrastructure and coordinates a group of computers (a “cluster”). A big challenge of cloud computing is that you want to use potentially lots of different computers while keeping a single point of contact for controlling them. This is what Kubernetes offers, and it’s what JupyterHub will rely on to make use of more computers when we need them.
  • We will interact with Kubernetes via the Google Cloud terminal or the SDK.
  • Git and GitHub are used for managing repositories of code, as well as keeping track of how these repositories change over time. In particular, we’ll use a GitHub repository that the JupyterHub team has put together, which contains a lot of useful configuration files for connecting with Google Cloud and Kubernetes. In addition, you’ll probably want to have some code show up on JupyterHub instances once users log in, and a good way to do this is by hosting your code on GitHub.
  • We can push and pull repositories to and from GitHub using our terminal, and then instruct JupyterHub to automatically pull them into a new instance.
  • Docker is a technology for “containerized” computing environments. This basically means packaging a very specific combination of software (operating system libraries, packages, and files) into an image that can be easily moved to and run on any computer. It’s useful for standardizing the environment in which development happens. We use it to generate the live computing environments that users of a JupyterHub will experience, so they’ll all have the same basic set of files and packages.
  • Helm is a separate tool from Kubernetes, but it works closely with it and is worth describing here. Helm is a package manager for Kubernetes: it takes a “chart” (a bundle of templates describing an application) plus our configuration, and turns them into the instructions that Kubernetes uses to build a particular computing architecture in the cloud. We can think of charts like recipes for deploying the particular setup that we want.
  • We’ll create a Helm configuration file for our setup so that Kubernetes knows how to deploy.
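
Before going further, it can help to check which of the tools above are already installed on your computer. The snippet below is a minimal sketch, not part of the official setup. It assumes the usual executable names (gcloud, kubectl, helm, git, docker), where kubectl is the standard Kubernetes command-line client and is not named in the list above. Depending on how you work (for example, entirely in the Google Cloud terminal), you may not need all of them locally, so treat the list as an assumption to adjust.

```python
import shutil

# Command-line tools this guide relies on. The executable names are the
# usual ones, but the list itself is an assumption: adjust it for your setup.
REQUIRED_TOOLS = ["gcloud", "kubectl", "helm", "git", "docker"]


def find_tools(names=REQUIRED_TOOLS):
    """Map each tool name to its location on PATH, or None if not found."""
    return {name: shutil.which(name) for name in names}


def missing_tools(names=REQUIRED_TOOLS):
    """Return the tool names that could not be found on PATH."""
    return [name for name, path in find_tools(names).items() if path is None]


if __name__ == "__main__":
    # Print one line per tool so it is easy to see what still needs installing.
    for name, path in find_tools().items():
        print(f"{name:8s} {path or 'NOT FOUND'}")
```

Running the script prints one line per tool with either the path where it was found or NOT FOUND. It only checks that an executable exists, not that it is configured or logged in.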