JupyterHub#
JupyterHub is the best way to serve Jupyter notebook for multiple users. Because JupyterHub manages a separate Jupyter environment for each user, it can be used in a class of students, a corporate data science group, or a scientific research group. It is a multi-user Hub that spawns, manages, and proxies multiple instances of the single-user Jupyter notebook server.
Distributions#
JupyterHub can be used in a collaborative environment by both small (0-100 users) and large (more than 100 users) teams, such as a class of students, a corporate data science group, or a scientific research group. It has two main distributions, each developed to serve the needs of one of these team sizes.
The Littlest JupyterHub distribution is suitable if you need a small number of users (1-100) and a single server with a simple environment.
Zero to JupyterHub with Kubernetes allows you to deploy dynamic servers on the cloud if you need even more users. This distribution runs JupyterHub on top of Kubernetes.
Note
It is important to evaluate these distributions before you continue with the configuration of JupyterHub.
Subsystems#
JupyterHub is made up of four subsystems:
a Hub (tornado process) that is the heart of JupyterHub
a configurable http proxy (node-http-proxy) that receives the requests from the client’s browser
multiple single-user Jupyter notebook servers (Python/IPython/tornado) that are monitored by Spawners
an authentication class that manages how users can access the system
Additionally, optional configurations can be added through a config.py file, and users and kernels can be managed from an admin panel. A simplification of the whole system is displayed in the figure below:

JupyterHub performs the following functions:
The Hub launches a proxy
The proxy forwards all requests to the Hub by default
The Hub handles user login and spawns single-user servers on demand
The Hub configures the proxy to forward URL prefixes to the single-user notebook servers
For convenient administration of the Hub, its users, and services, JupyterHub also provides a REST API.
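As a brief illustration of what an authenticated REST API call looks like (the Hub's API URL and token value here are assumptions for the example; substitute your deployment's values), a request can be built with Python's standard library:

```python
import urllib.request

def build_api_request(path, token, hub_api="http://127.0.0.1:8081/hub/api"):
    """Build an authenticated request for the Hub's REST API.

    hub_api and token are illustrative placeholders; real deployments
    supply their own API URL and a token generated for this purpose.
    """
    return urllib.request.Request(
        hub_api + path,
        headers={"Authorization": "token " + token},
    )

# build (but do not send) a request listing all users
req = build_api_request("/users", "abc123")
print(req.full_url)  # http://127.0.0.1:8081/hub/api/users
```

Sending the request (for example with urllib.request.urlopen) requires a running Hub and a valid API token.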
The JupyterHub team and Project Jupyter value our community, and JupyterHub follows the Jupyter Community Guides.
Documentation structure#
Tutorials#
This section of the documentation contains step-by-step tutorials that help outline the capabilities of JupyterHub and how you can achieve specific aims, such as installing it. The tutorials are recommended if you do not have much experience with JupyterHub.
Tutorials#
Tutorials provide step-by-step lessons to help you achieve a specific goal. They should be a good place to start learning about JupyterHub and how it works.
Installation#
This section covers how to get up-and-running with JupyterHub. It covers some basics of the tools needed to deploy JupyterHub as well as how to get it running on your own infrastructure.
Quickstart#
Prerequisites#
Before installing JupyterHub, you will need:
a Linux/Unix-based system
Python 3.8 or greater. An understanding of using pip or conda for installing Python packages is helpful.
Node.js 12 or greater, along with npm. Install Node.js/npm using your operating system’s package manager.
If you are using conda, the nodejs and npm dependencies will be installed for you by conda.
If you are using pip, install a recent version of nodejs/npm. For example, install it on Linux (Debian/Ubuntu) using:
sudo apt-get install nodejs npm
nodesource is a great resource to get more recent versions of the nodejs runtime, if your system package manager only has an old version of Node.js.
A pluggable authentication module (PAM) to use the default Authenticator. PAM is often available by default on most distributions; if it is not, it can be installed using the operating system’s package manager.
TLS certificate and key for HTTPS communication
Domain name
Before running the single-user notebook servers (which may be on the same system as the Hub or not), you will need:
JupyterLab version 3 or greater, or Jupyter Notebook 4 or greater.
Installation#
JupyterHub can be installed with pip (and the proxy with npm) or conda:
pip, npm:
python3 -m pip install jupyterhub
npm install -g configurable-http-proxy
python3 -m pip install jupyterlab notebook # needed if running the notebook servers in the same environment
conda (one command installs jupyterhub and proxy):
conda install -c conda-forge jupyterhub # installs jupyterhub and proxy
conda install jupyterlab notebook # needed if running the notebook servers in the same environment
Test your installation. If installed, these commands should return the packages’ help contents:
jupyterhub -h
configurable-http-proxy -h
Start the Hub server#
To start the Hub server, run the command:
jupyterhub
Visit http://localhost:8000 in your browser, and sign in with your Unix credentials.
To allow multiple users to sign in to the Hub server, you must start jupyterhub as a privileged user, such as root:
sudo jupyterhub
The wiki describes how to run the server as a less privileged user. This requires additional configuration of the system.
Installation Basics#
Platform support#
JupyterHub is supported on Linux/Unix based systems. To use JupyterHub, you need a Unix server (typically Linux) running somewhere that is accessible to your team on the network. The JupyterHub server can be on an internal network at your organization, or it can run on the public internet (in which case, take care with the Hub’s security).
JupyterHub officially does not support Windows. You may be able to use JupyterHub on Windows if you use a Spawner and Authenticator that work on Windows, but the JupyterHub defaults will not. Bugs reported on Windows will not be accepted, and the test suite will not run on Windows. Small patches that fix minor Windows compatibility issues (such as basic installation) may be accepted, however. For Windows-based systems, we would recommend running JupyterHub in a docker container or Linux VM.
Additional Reference: Tornado’s documentation on Windows platform support
Planning your installation#
Prior to beginning installation, it’s helpful to consider some of the following:
deployment system (bare metal, Docker)
Authentication (PAM, OAuth, etc.)
Spawner of single-user notebook servers (Docker, Batch, etc.)
Services (nbgrader, etc.)
JupyterHub database (default SQLite; traditional RDBMS such as PostgreSQL, MySQL, or other databases supported by SQLAlchemy)
Folders and File Locations#
It is recommended to put all of the files used by JupyterHub into standard UNIX filesystem locations.
/srv/jupyterhub for all security and runtime files
/etc/jupyterhub for all configuration files
/var/log for log files
Install JupyterHub with Docker#
The JupyterHub docker image is the fastest way to set up JupyterHub in your local development environment.
Note
This quay.io/jupyterhub/jupyterhub
docker image is only an image for running
the Hub service itself. It does not provide the other Jupyter components,
such as Notebook installation, which are needed by the single-user servers.
To run the single-user servers, which may be on the same system as the Hub or
not, JupyterLab or Jupyter Notebook must be installed.
Important
We strongly recommend that you follow the Zero to JupyterHub tutorial to install JupyterHub.
Prerequisites#
You should have Docker installed on a Linux/Unix based system.
Run the Docker Image#
To pull the latest JupyterHub image and start the jupyterhub container, run this command in your terminal:
docker run -d -p 8000:8000 --name jupyterhub quay.io/jupyterhub/jupyterhub jupyterhub
This command exposes the JupyterHub container on port 8000. Navigate to http://localhost:8000 in a web browser to access the JupyterHub console.
You can stop and resume the container by running docker stop and docker start respectively:
# find the container id
docker ps
# stop the running container
docker stop <container-id>
# resume the paused container
docker start <container-id>
If you want to run Docker on a computer that has a public IP, then you must secure it with SSL, either by adding SSL options to your Docker configuration or by using an SSL-enabled proxy.
Mounting volumes enables you to persist and store the data generated by the docker container, even when you stop the container. The persistent data can be stored on the host system, outside the container.
Create System Users#
Spawn a root shell in your docker container by running this command in the terminal:
docker exec -it jupyterhub bash
Create user accounts from this shell (for example, with the adduser command); the created accounts will be used for authentication in JupyterHub’s default configuration.
Getting Started#
This section covers how to configure and customize JupyterHub for your needs. It contains information about authentication, networking, security, and other topics that are relevant to individuals or organizations deploying their own JupyterHub.
Configuration Basics#
This section contains basic information about configuring settings for a JupyterHub deployment. The Technical Reference documentation provides additional details.
This section will help you learn how to:
generate a default configuration file,
jupyterhub_config.py
start with a specific configuration file
configure JupyterHub using command line options
find information and examples for some common deployments
Generate a default config file#
On startup, JupyterHub will look by default for a configuration file,
jupyterhub_config.py
, in the current working directory.
To generate a default config file, jupyterhub_config.py
:
jupyterhub --generate-config
This default jupyterhub_config.py
file contains comments and guidance for all
configuration variables and their default values. We recommend storing
configuration files in the standard UNIX filesystem location, i.e.
/etc/jupyterhub
.
Start with a specific config file#
You can load a specific config file and start JupyterHub using:
jupyterhub -f /path/to/jupyterhub_config.py
If you have stored your configuration file in the recommended UNIX filesystem
location, /etc/jupyterhub
, the following command will start JupyterHub using
the configuration file:
jupyterhub -f /etc/jupyterhub/jupyterhub_config.py
The IPython documentation provides additional information on the config system that Jupyter uses.
Configure using command line options#
To display all command line options that are available for configuration run the following command:
jupyterhub --help-all
Configuration using command line options is applied when launching JupyterHub. For example, to start JupyterHub on 10.0.1.2:443 with https, you would enter:
jupyterhub --ip 10.0.1.2 --port 443 --ssl-key my_ssl.key --ssl-cert my_ssl.cert
All configurable options may technically be set on the command line, though some are inconvenient to type. To set a particular configuration parameter, c.Class.trait, use the command line option --Class.trait when starting JupyterHub. For example, to configure the c.Spawner.notebook_dir trait from the command line, use the --Spawner.notebook_dir option:
jupyterhub --Spawner.notebook_dir='~/assignments'
Configure for various deployment environments#
The default authentication and process spawning mechanisms can be replaced, and specific authenticators and spawners can be set in the configuration file. This enables JupyterHub to be used with a variety of authentication methods or process control and deployment environments. Some examples, meant as illustrations, are:
Using GitHub OAuth instead of PAM with OAuthenticator
Spawning single-user servers with Docker, using the DockerSpawner
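As a sketch of what such configuration fragments can look like in jupyterhub_config.py (the class paths follow the OAuthenticator and DockerSpawner projects; check their documentation for the options your versions require, and note the callback URL here is a placeholder):

```python
# jupyterhub_config.py -- illustrative fragment, not a complete deployment

# authenticate with GitHub OAuth instead of PAM (requires the oauthenticator package)
c.JupyterHub.authenticator_class = 'oauthenticator.github.GitHubOAuthenticator'
c.GitHubOAuthenticator.oauth_callback_url = 'https://hub.example.org/hub/oauth_callback'

# spawn single-user servers in Docker containers (requires the dockerspawner package)
c.JupyterHub.spawner_class = 'dockerspawner.DockerSpawner'
```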
Run the proxy separately#
This is not strictly necessary, but useful in many cases. If you use a custom proxy (e.g. Traefik), this is also not needed.
Connections to user servers go through the proxy, and not the hub itself. If the proxy stays running when the hub restarts (for maintenance, re-configuration, etc.), then user connections are not interrupted. For simplicity, by default the hub starts the proxy automatically, so if the hub restarts, the proxy restarts, and user connections are interrupted. It is easy to run the proxy separately, for information see the separate proxy page.
Networking basics#
This section will help you with basic proxy and network configuration to:
set the proxy’s IP address and port
set the proxy’s REST API URL
configure the Hub if the Proxy or Spawners are remote or isolated
set the hub_connect_ip which services will use to communicate with the hub
Set the Proxy’s IP address and port#
The Proxy’s main IP address setting determines where JupyterHub is available to users.
By default, JupyterHub is configured to be available on all network interfaces ('') on port 8000. Note: use of '*' is discouraged for IP configuration; instead, use of '0.0.0.0' is preferred.
Changing the Proxy’s main IP address and port can be done with the following JupyterHub command line options:
jupyterhub --ip=192.168.1.2 --port=443
Or by placing the following lines in a configuration file,
jupyterhub_config.py
:
c.JupyterHub.ip = '192.168.1.2'
c.JupyterHub.port = 443
Port 443 is used in the examples since 443 is the default port for SSL/HTTPS.
Configuring only the main IP and port of JupyterHub should be sufficient for most deployments of JupyterHub. However, more customized scenarios may need additional networking details to be configured.
Note that c.JupyterHub.ip
and c.JupyterHub.port
are single values,
not tuples or lists – JupyterHub listens to only a single IP address and
port.
Set the Proxy’s REST API communication URL (optional)#
By default, the proxy’s REST API listens on port 8081 of localhost only.
The Hub service talks to the proxy via a REST API on a secondary port.
The REST API URL (hostname and port) can be configured separately and override the default settings.
Set api_url#
The URL to access the API, c.ConfigurableHTTPProxy.api_url, is configurable.
An example entry to set the proxy’s API URL in jupyterhub_config.py
is:
c.ConfigurableHTTPProxy.api_url = 'http://10.0.1.4:5432'
proxy_api_ip and proxy_api_port (Deprecated in 0.8)#
If running the Proxy separate from the Hub, configure the REST API communication
IP address and port by adding this to the jupyterhub_config.py
file:
# ideally a private network address
c.JupyterHub.proxy_api_ip = '10.0.1.4'
c.JupyterHub.proxy_api_port = 5432
We recommend using the proxy’s api_url
setting instead of the deprecated
settings, proxy_api_ip
and proxy_api_port
.
Configure the Hub if the Proxy or Spawners are remote or isolated#
The Hub service listens only on localhost (port 8081) by default.
The Hub needs to be accessible from both the proxy and all Spawners.
When spawning local servers, an IP address setting of localhost is fine.
If either the Proxy or (more likely) the Spawners will be remote or isolated in containers, the Hub must listen on an IP that is accessible.
c.JupyterHub.hub_ip = '10.0.1.4'
c.JupyterHub.hub_port = 54321
Added in 0.8: The c.JupyterHub.hub_connect_ip
setting is the IP address or
hostname that other services should use to connect to the Hub. A common
configuration for, e.g. docker, is:
c.JupyterHub.hub_ip = '0.0.0.0' # listen on all interfaces
c.JupyterHub.hub_connect_ip = '10.0.1.4' # IP as seen on the docker network. Can also be a hostname.
Adjusting the hub’s URL#
The hub will most commonly be running on a hostname of its own. If it
is not – for example, if the hub is being reverse-proxied and being
exposed at a URL such as https://proxy.example.org/jupyter/
– then
you will need to tell JupyterHub the base URL of the service. In such
a case, it is both necessary and sufficient to set
c.JupyterHub.base_url = '/jupyter/'
in the configuration.
Security settings#
Important
You should not run JupyterHub without SSL encryption on a public network.
Security is the most important aspect of configuring JupyterHub. Three configuration settings are the main aspects of security configuration:
SSL encryption (to enable HTTPS)
Cookie secret (a key for encrypting browser cookies)
Proxy authentication token (used for the Hub and other services to authenticate to the Proxy)
The Hub hashes all secrets (e.g. auth tokens) before storing them in its database. A loss of control over read-access to the database should have minimal impact on your deployment. If your database has been compromised, it is still a good idea to revoke existing tokens.
Enabling SSL encryption#
Since JupyterHub includes authentication and allows arbitrary code execution, you should not run it without SSL (HTTPS).
Using an SSL certificate#
This will require you to obtain an official, trusted SSL certificate or create a
self-signed certificate. Once you have obtained and installed a key and
certificate, you need to specify their locations in the jupyterhub_config.py
configuration file as follows:
c.JupyterHub.ssl_key = '/path/to/my.key'
c.JupyterHub.ssl_cert = '/path/to/my.cert'
Some cert files also contain the key, in which case only the cert is needed. It is important that these files be put in a secure location on your server, where they are not readable by regular users.
If you are using a chain certificate, see also chained certificate for SSL in the JupyterHub Troubleshooting FAQ.
Using letsencrypt#
It is also possible to use letsencrypt to obtain
a free, trusted SSL certificate. If you run letsencrypt using the default
options, the needed configuration is (replace mydomain.tld
by your fully
qualified domain name):
c.JupyterHub.ssl_key = '/etc/letsencrypt/live/{mydomain.tld}/privkey.pem'
c.JupyterHub.ssl_cert = '/etc/letsencrypt/live/{mydomain.tld}/fullchain.pem'
If the fully qualified domain name (FQDN) is example.com
, the following
would be the needed configuration:
c.JupyterHub.ssl_key = '/etc/letsencrypt/live/example.com/privkey.pem'
c.JupyterHub.ssl_cert = '/etc/letsencrypt/live/example.com/fullchain.pem'
If SSL termination happens outside of the Hub#
In certain cases, for example, if the hub is running behind a reverse proxy, and SSL termination is being provided by NGINX, it is reasonable to run the hub without SSL.
To achieve this, remove c.JupyterHub.ssl_key
and c.JupyterHub.ssl_cert
from your configuration (setting them to None
or an empty string does not
have the same effect, and will result in an error).
Proxy authentication token#
The Hub authenticates its requests to the Proxy using a secret token that
the Hub and Proxy agree upon. Note that this applies to the default
ConfigurableHTTPProxy
implementation. Not all proxy implementations
use an auth token.
The value of this token should be a random string (for example, generated by
openssl rand -hex 32
). You can store it in the configuration file or an
environment variable.
Generating and storing the token in the configuration file#
You can set the value in the configuration file, jupyterhub_config.py
:
c.ConfigurableHTTPProxy.api_token = 'abc123...' # any random string
Generating and storing as an environment variable#
You can pass this value of the proxy authentication token to the Hub and Proxy
using the CONFIGPROXY_AUTH_TOKEN
environment variable:
export CONFIGPROXY_AUTH_TOKEN=$(openssl rand -hex 32)
This environment variable needs to be visible to the Hub and Proxy.
Default if token is not set#
If you do not set the Proxy authentication token, the Hub will generate a random key itself. This means that any time you restart the Hub, you must also restart the Proxy. If the proxy is a subprocess of the Hub, this should happen automatically (this is the default configuration).
Authentication and User Basics#
The default Authenticator uses PAM (Pluggable Authentication Module) to authenticate system users with their usernames and passwords. With the default Authenticator, any user with an account and password on the system will be allowed to login.
Deciding who is allowed#
In the base Authenticator, there are 3 configuration options for granting users access to your Hub:
allow_all grants any user who can successfully authenticate access to the Hub
allowed_users defines a set of users who can access the Hub
allow_existing_users enables managing users via the JupyterHub API or admin page
These options should apply to all Authenticators. Your chosen Authenticator may add additional configuration options to admit users, such as team membership, course enrollment, etc.
Important
You should always specify at least one allow configuration if you want people to be able to access your Hub! In most cases, this looks like:
c.Authenticator.allow_all = True
# or
c.Authenticator.allowed_users = {"name", ...}
Changed in version 5.0: If no allow config is specified, then by default nobody will have access to your Hub.
Prior to 5.0, the opposite was true; effectively allow_all = True
if no other allow config was specified.
You can restrict which users are allowed to login with a set,
Authenticator.allowed_users
:
c.Authenticator.allowed_users = {'mal', 'zoe', 'inara', 'kaylee'}
# c.Authenticator.allow_all = False
c.Authenticator.allow_existing_users = False
Users in the allowed_users
set are added to the Hub database when the Hub is started.
Changed in version 5.0: Authenticator.allow_all
and Authenticator.allow_existing_users
are new in JupyterHub 5.0
to enable explicit configuration of previously implicit behavior.
Prior to 5.0, allow_all
was implicitly True if allowed_users
was empty.
Starting with 5.0, to allow all authenticated users by default,
allow_all
must be explicitly set to True.
By default, allow_existing_users
is True when allowed_users
is not empty,
to ensure backward-compatibility.
To make the allowed_users
set restrictive,
set allow_existing_users = False
.
One Time Passwords (request_otp)#
By setting request_otp to true, the login screen will show an additional password input field to accept an OTP:
c.Authenticator.request_otp = True
By default, the prompt label is OTP:
, but this can be changed by setting otp_prompt
:
c.Authenticator.otp_prompt = 'Google Authenticator:'
Configure admins (admin_users)#
Note
As of JupyterHub 2.0, the full permissions of admin_users
should not be required.
Instead, it is best to assign roles to users or groups
with only the scopes they require.
Admin users of JupyterHub, admin_users
, can add and remove users from
the user allowed_users
set. admin_users
can take actions on other users’
behalf, such as stopping and restarting their servers.
A set of initial admin users, admin_users, can be configured as follows:
c.Authenticator.admin_users = {'mal', 'zoe'}
Users in the admin set are automatically added to the user allowed_users
set,
if they are not already present.
Each Authenticator may have different ways of determining whether a user is an
administrator. By default, JupyterHub uses the PAMAuthenticator which provides the
admin_groups
option and can set administrator status based on a user
group. For example, we can let any user in the wheel
group be an admin:
c.PAMAuthenticator.admin_groups = {'wheel'}
Give some users access to other users’ notebook servers#
The access:servers
scope can be granted to users to give them permission to visit other users’ servers.
For example, to give members of the teachers
group access to the servers of members of the students
group:
c.JupyterHub.load_roles = [
{
"name": "teachers",
"scopes": [
"admin-ui",
"list:users",
"access:servers!group=students",
],
"groups": ["teachers"],
}
]
By default, only the deprecated admin
role has global access
permissions.
As a courtesy, you should make sure your users know if admin access is enabled.
Add or remove users from the Hub#
Added in version 5.0: c.Authenticator.allow_existing_users
is added in 5.0 and True by default if any allowed_users
are specified.
Prior to 5.0, this behavior was not optional.
Users can be added to and removed from the Hub via the admin panel or the REST API.
To enable this behavior, set:
c.Authenticator.allow_existing_users = True
When a user is added, the user will be
automatically added to the allowed_users
set and database.
If allow_existing_users
is True, restarting the Hub will not require manually updating the allowed_users
set in your config file,
as the users will be loaded from the database.
If allow_existing_users
is False, users not granted access by configuration such as allowed_users
will not be permitted to login,
even if they are present in the database.
After starting the Hub once, it is not sufficient to remove a user from the allowed users set in your config file. You must also remove the user from the Hub’s database, either by deleting the user via JupyterHub’s admin page, or by clearing the jupyterhub.sqlite database and starting fresh.
Use LocalAuthenticator to create system users#
The LocalAuthenticator
is a special kind of Authenticator that has
the ability to manage users on the local system. When you try to add a
new user to the Hub, a LocalAuthenticator
will check if the user
already exists. If you set the configuration value, create_system_users
,
to True
in the configuration file, the LocalAuthenticator
has
the ability to add users to the system. The setting in the config
file is:
c.LocalAuthenticator.create_system_users = True
Adding a user to the Hub that doesn’t already exist on the system will
result in the Hub creating that user via the system adduser
command
line tool. This option is typically used on hosted deployments of
JupyterHub to avoid the need to manually create all your users before
launching the service. This approach is not recommended when running
JupyterHub in situations where JupyterHub users map directly onto the
system’s UNIX users.
Use OAuthenticator to support OAuth with popular service providers#
JupyterHub’s OAuthenticator currently supports the following popular services:
A generic implementation, which you can use for OAuth authentication with any provider, is also available.
Use DummyAuthenticator for testing#
The DummyAuthenticator
is a simple Authenticator that
allows for any username or password unless a global password has been set. If
set, it will allow for any username as long as the correct password is provided.
To set a global password, add this to the config file:
c.DummyAuthenticator.password = "some_password"
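For a throwaway local test, the global password is typically paired with selecting the authenticator itself; a minimal sketch (the 'dummy' shortcut name is available in recent JupyterHub versions):

```python
# jupyterhub_config.py -- for local testing only; never use DummyAuthenticator in production
c.JupyterHub.authenticator_class = 'dummy'
c.DummyAuthenticator.password = "some_password"
```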
External services#
When working with JupyterHub, a Service is defined as a process that interacts with the Hub’s REST API. A Service may perform a specific action or task. For example, shutting down individuals’ single user notebook servers that have been idle for some time is a good example of a task that could be automated by a Service. Let’s look at how the jupyterhub_idle_culler script can be used as a Service.
Real-world example to cull idle servers#
JupyterHub has a REST API that can be used by external services. This document will:
explain some basic information about API tokens
clarify that API tokens can be used to authenticate to single-user servers as of version 0.8.0
show how the jupyterhub_idle_culler script can be:
used in a Hub-managed service
run as a standalone script
Both examples for jupyterhub_idle_culler
will communicate tasks to the
Hub via the REST API.
API Token basics#
Step 1: Generate an API token#
To run such an external service, an API token must be created and provided to the service.
As of version 0.6.0, the preferred way of doing this is to first generate an API token:
openssl rand -hex 32
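If openssl is not available, an equivalent token can be generated with Python's standard library secrets module (a drop-in alternative, not a requirement):

```python
import secrets

# 32 random bytes rendered as 64 hex characters,
# equivalent to `openssl rand -hex 32`
token = secrets.token_hex(32)
print(token)
```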
In version 0.8.0, a token request page for generating an API token is available from the JupyterHub user interface:
Step 2: Pass environment variable with token to the Hub#
In the case of cull_idle_servers
, it is passed as the environment
variable called JUPYTERHUB_API_TOKEN
.
Step 3: Use API tokens for services and tasks that require external access#
While API tokens are often associated with a specific user, API tokens
can be used by services that require external access for activities
that may not correspond to a specific human, e.g. adding users during
setup for a tutorial or workshop. Add a service and its API token to the
JupyterHub configuration file, jupyterhub_config.py
:
c.JupyterHub.services = [
{'name': 'adding-users', 'api_token': 'super-secret-token'},
]
Step 4: Restart JupyterHub#
Upon restarting JupyterHub, you should see a message like below in the logs:
Adding API token for <username>
Authenticating to single-user servers using API token#
In JupyterHub 0.7, there is no mechanism for token authentication to single-user servers, and only cookies can be used for authentication. 0.8 supports using JupyterHub API tokens to authenticate to single-user servers.
How to configure the idle culler to run as a Hub-Managed Service#
Step 1: Install the idle culler:#
pip install jupyterhub-idle-culler
Step 2: In jupyterhub_config.py, add the following dictionary for the idle-culler Service to the c.JupyterHub.services list:#
import sys

c.JupyterHub.services = [
{
'name': 'idle-culler',
'command': [sys.executable, '-m', 'jupyterhub_idle_culler', '--timeout=3600'],
}
]
c.JupyterHub.load_roles = [
{
"name": "list-and-cull", # name the role
"services": [
"idle-culler", # assign the service to this role
],
"scopes": [
# declare what permissions the service should have
"list:users", # list users
"read:users:activity", # read user last-activity
"admin:servers", # start/stop servers
],
}
]
where:
command
indicates that the Service will be launched as a subprocess, managed by the Hub.
Changed in version 2.0: Prior to 2.0, the idle-culler required ‘admin’ permissions. It now needs the scopes:
list:users
to access the user list endpointread:users:activity
to read activity infoadmin:servers
to start/stop servers
How to run cull-idle manually as a standalone script#
You can run the script manually by providing it with an API token; it will then authenticate to the Hub through the REST API.
This will run the idle culler service manually. It can be run as a standalone
script anywhere with access to the Hub, and will periodically check for idle
servers and shut them down via the Hub’s REST API. In order to shutdown the
servers, the token given to cull-idle
must have permission to list users
and admin their servers.
Generate an API token and store it in the JUPYTERHUB_API_TOKEN
environment
variable. Run jupyterhub_idle_culler
manually.
export JUPYTERHUB_API_TOKEN='token'
python -m jupyterhub_idle_culler [--timeout=900] [--url=http://127.0.0.1:8081/hub/api]
Spawners and single-user notebook servers#
A Spawner starts each single-user notebook server. Since the single-user server is an instance of jupyter notebook, an entire separate multi-process application, many aspects of that server can be configured, and there are many ways to express that configuration.
At the JupyterHub level, you can set some values on the Spawner. The simplest of these is Spawner.notebook_dir, which lets you set the root directory for a user’s server. This root notebook directory is the highest-level directory users will be able to access in the notebook dashboard. In this example, the root notebook directory is set to ~/notebooks, where ~ is expanded to the user’s home directory.
c.Spawner.notebook_dir = '~/notebooks'
You can also specify extra command line arguments to the notebook server with:
c.Spawner.args = ['--debug', '--profile=PHYS131']
This could be used to set the user’s default page for the single-user server:
c.Spawner.args = ['--NotebookApp.default_url=/notebooks/Welcome.ipynb']
Since the single-user server extends the notebook server application,
it still loads configuration from the jupyter_notebook_config.py
config file.
Each user may have one of these files in $HOME/.jupyter/
.
Jupyter also supports loading system-wide config files from /etc/jupyter/
,
which is the place to put configuration that you want to affect all of your users.
Working with the JupyterHub API#
JupyterHub’s functionalities can be accessed using its API. In this section, we cover how to use the JupyterHub API to achieve specific goals, for example, starting servers.
Starting servers with the JupyterHub API#
Sometimes, when working with applications such as BinderHub, it may be necessary to launch Jupyter-based services on behalf of your users. Doing so can be achieved through JupyterHub’s REST API, which allows one to launch and manage servers on behalf of users through API calls instead of the JupyterHub UI. This way, you can take advantage of other user/launch/lifecycle patterns that are not natively supported by the JupyterHub UI, all without the need to develop the server management features of JupyterHub Spawners and/or Authenticators.
This tutorial goes through working with the JupyterHub API to manage servers for users. In particular, it covers how to check server status, start and stop servers, wait for them to be ready, and communicate with them.
At the end, we also provide sample Python code that can be used to implement these steps.
Checking server status#
First, request information about a particular user using a GET request:
GET /hub/api/users/:username
The response you get will include a servers field, which is a dictionary, as shown in this JSON-formatted response:
Required scope: read:servers
{
  "admin": false,
  "groups": [],
  "pending": null,
  "server": null,
  "name": "test-1",
  "kind": "user",
  "last_activity": "2021-08-03T18:12:46.026411Z",
  "created": "2021-08-03T18:09:59.767600Z",
  "roles": ["user"],
  "servers": {}
}
Many JupyterHub deployments only use a 'default' server, represented as an empty string '' for a name. Inspecting the servers field can yield one of two results. First, it can be empty, as in the sample JSON response above; in that case, the user has no running servers.
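As a sketch, fetching the user model and testing whether servers is empty might look like this with the requests library (hub_url and token come from your deployment; the helper names are hypothetical):

```python
import requests

def get_user_model(hub_url, token, user):
    """Fetch a user model from the JupyterHub REST API (requires read:servers)."""
    r = requests.get(
        f"{hub_url}/hub/api/users/{user}",
        headers={"Authorization": f"token {token}"},
    )
    r.raise_for_status()
    return r.json()

def has_running_server(user_model):
    """True if the user model has at least one entry in its servers dict."""
    return bool(user_model.get("servers"))
```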
However, should the user have running servers, then the returned dict should contain various information, as shown in this response:
"servers": {
  "": {
    "name": "",
    "last_activity": "2021-08-03T18:48:35.934000Z",
    "started": "2021-08-03T18:48:29.093885Z",
    "pending": null,
    "ready": true,
    "url": "/user/test-1/",
    "user_options": {},
    "progress_url": "/hub/api/users/test-1/server/progress"
  }
}
Key properties of a server:
- name: the server's name. Always the same as the key in servers.
- ready: boolean. If true, the server can be expected to respond to requests at url.
- pending: null or a string indicating a transitional state (such as start or stop). Will always be null if ready is true, or a string if false.
- url: the server's URL path (e.g. /user/:name/:servername/) where the server can be accessed if ready is true.
- progress_url: the API URL path (starting with /hub/api) where the progress API can be used to wait for the server to be ready.
- last_activity: ISO8601 timestamp indicating when activity was last observed on the server.
- started: ISO8601 timestamp indicating when the server was last started.
The two responses above are from a user with no servers and another with one ready server. The sample below is a response likely to be received when one requests a server launch while the server is not yet ready:
"servers": {
  "": {
    "name": "",
    "last_activity": "2021-08-03T18:48:29.093885Z",
    "started": "2021-08-03T18:48:29.093885Z",
    "pending": "spawn",
    "ready": false,
    "url": "/user/test-1/",
    "user_options": {},
    "progress_url": "/hub/api/users/test-1/server/progress"
  }
}
Note that ready is false and pending has the value spawn, meaning that the server is not ready and attempting to access it may not work, as it is still in the process of spawning. We'll get more into this below in waiting for a server.
Starting servers#
To start a server, make this API request:
POST /hub/api/users/:username/servers/[:servername]
Required scope: servers
Assuming the request was valid, there are two possible responses:
- 201 Created: the launch completed and the server is ready and available at the server's URL immediately.
- 202 Accepted: the more likely response; the server has begun launching but is not immediately ready. The server has pending: 'spawn' at this point, and you should wait for it to start.
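The launch request above can be sketched with the requests library (hub_url and token are assumptions for your deployment; the helper names server_api_url and request_launch are hypothetical, not part of JupyterHub):

```python
import requests

def server_api_url(hub_url, user, server_name=""):
    """Build the named-server API URL ('' selects the default server)."""
    return f"{hub_url}/hub/api/users/{user}/servers/{server_name}"

def request_launch(hub_url, token, user, server_name=""):
    """POST a launch request; returns 201 (ready now) or 202 (still spawning)."""
    r = requests.post(
        server_api_url(hub_url, user, server_name),
        headers={"Authorization": f"token {token}"},
    )
    r.raise_for_status()
    return r.status_code
```

A 202 return value means you should move on to the waiting step described next.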
Waiting for a server to start#
After receiving a 202 Accepted response, you have to wait for the server to start.
Two approaches can be applied to establish when the server is ready:
Polling the server model#
The simplest way to check if a server is ready is to programmatically query the server model until two conditions are true:
the server name is contained in the servers response, and
servers['servername']['ready'] is true.
The Python code snippet below can be used to check if a server is ready:
import requests

def server_ready(hub_url, user, token, server_name=""):
    r = requests.get(
        f"{hub_url}/hub/api/users/{user}/servers/{server_name}",
        headers={"Authorization": f"token {token}"},
    )
    r.raise_for_status()
    user_model = r.json()
    servers = user_model.get("servers", {})
    if server_name not in servers:
        return False
    server = servers[server_name]
    if server['ready']:
        print(f"Server {user}/{server_name} ready at {server['url']}")
        return True
    else:
        print(f"Server {user}/{server_name} not ready, pending {server['pending']}")
        return False
You can keep making this check until ready is true.
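A simple wrapper for repeating that check with a timeout might look like this (wait_until_ready and its parameters are hypothetical names; pass it any zero-argument callable, such as a lambda closing over your hub URL, user, and token):

```python
import time

def wait_until_ready(check, timeout=300, interval=2):
    """Call `check()` until it returns True or `timeout` seconds elapse.

    `check` should be a zero-argument callable returning a boolean,
    e.g. lambda: server_ready(hub_url, user, token).
    """
    deadline = time.monotonic() + timeout
    while True:
        if check():
            return True
        if time.monotonic() >= deadline:
            raise TimeoutError("server did not become ready in time")
        time.sleep(interval)
```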
Using the progress API#
The most efficient way to wait for a server to start is by using the progress API. The progress URL is available in the server model under progress_url and has the form /hub/api/users/:user/servers/:servername/progress. The default server's progress can be accessed at :user/servers//progress or :user/server/progress, as demonstrated in the following GET request:
GET /hub/api/users/:user/servers/:servername/progress
Required scope: read:servers
The progress API is an example of an EventStream API. Messages are streamed and delivered in the form:
data: {"progress": 10, "message": "...", ...}
where the content after data: is a JSON-serialized dictionary. Lines that do not start with data: should be ignored.
Progress events have the form:
{
  "progress": 0-100,
  "message": "",
  "ready": True,  # or False
}
- progress: integer, 0-100.
- message: string message describing progress stages.
- ready: present and true only for the last event, when the server is ready.
- url: only present if ready is true; will be the server's URL.
The progress API can be used even with fully ready servers. If the server is ready, there will only be one event, which will look like:
{
  "progress": 100,
  "ready": true,
  "message": "Server ready at /user/test-1/",
  "html_message": "Server ready at <a href=\"/user/test-1/\">/user/test-1/</a>",
  "url": "/user/test-1/"
}
where ready and url are the same as in the server model, and ready will always be true.
A significant advantage of the progress API is that it shows the status of the server through a stream of messages. Below is an example of a typical complete stream from the API:
data: {"progress": 0, "message": "Server requested"}
data: {"progress": 50, "message": "Spawning server..."}
data: {"progress": 100, "ready": true, "message": "Server ready at /user/test-user/", "html_message": "Server ready at <a href=\"/user/test-user/\">/user/test-user/</a>", "url": "/user/test-user/"}
Here is a Python example for consuming an event stream:
import json

def event_stream(session, url):
    """Generator yielding events from a JSON event stream

    For use with the server progress API
    """
    r = session.get(url, stream=True)
    r.raise_for_status()
    for line in r.iter_lines():
        line = line.decode('utf8', 'replace')
        # event lines all start with `data:`
        # all other lines should be ignored (they will be empty)
        if line.startswith('data:'):
            yield json.loads(line.split(':', 1)[1])
Stopping servers#
Servers can be stopped with a DELETE request:
DELETE /hub/api/users/:user/servers/[:servername]
Required scope: servers
Similar to when starting a server, issuing the DELETE request above might not stop the server immediately. Instead, the DELETE request has two possible response codes:
- 204 Deleted: the delete completed and the server is fully stopped. It will now be absent from the user's servers model.
- 202 Accepted: your request was accepted but is not yet completely processed. The server has pending: 'stop' at this point.
There is no progress API for checking when a server actually stops. The only way to wait for a server to stop is to poll it and wait for the server to disappear from the user's servers model.
This Python code snippet can be used to stop a server and then wait for the process to complete:
import logging
import time

log = logging.getLogger(__name__)

def stop_server(session, hub_url, user, server_name=""):
    """Stop a server via the JupyterHub API

    Returns when the server has finished stopping
    """
    # step 1: get user status
    user_url = f"{hub_url}/hub/api/users/{user}"
    server_url = f"{user_url}/servers/{server_name}"
    log_name = f"{user}/{server_name}".rstrip("/")
    log.info(f"Stopping server {log_name}")
    r = session.delete(server_url)
    if r.status_code == 404:
        log.info(f"Server {log_name} already stopped")
        return
    r.raise_for_status()
    if r.status_code == 204:
        log.info(f"Server {log_name} stopped")
        return
    # else: 202, stop requested, but not complete
    # wait for stop to finish
    log.info(f"Server {log_name} stopping...")
    # wait for server to be done stopping
    while True:
        r = session.get(user_url)
        r.raise_for_status()
        user_model = r.json()
        if server_name not in user_model.get("servers", {}):
            log.info(f"Server {log_name} stopped")
            return
        server = user_model["servers"][server_name]
        if not server['pending']:
            raise ValueError(f"Waiting for {log_name}, but no longer pending.")
        log.info(f"Server {log_name} pending: {server['pending']}")
        # wait to poll again
        time.sleep(1)
Communicating with servers#
JupyterHub tokens with the access:servers scope can be used to communicate with the servers themselves. The tokens can be the same as those you used to launch your service.
Note
Access scopes are new in JupyterHub 2.0. To access servers in JupyterHub 1.x, a token must be owned by the same user as the server, or be an admin token if admin_access is enabled.
The URL returned from a server model is the URL path suffix, e.g. /user/:name/, to append to the JupyterHub base URL. The returned URL is of the form {hub_url}{server_url}, where hub_url would be http://127.0.0.1:8000 by default and server_url is /user/myname. When combined, the two give a full URL of http://127.0.0.1:8000/user/myname.
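Joining the two pieces and making an authenticated request to the running server might be sketched like this (full_server_url and check_server_status are hypothetical helper names; /api/status is an endpoint of the Jupyter server's own REST API):

```python
import requests

def full_server_url(hub_url, server_url):
    """Combine the hub base URL with the server's URL path suffix."""
    return f"{hub_url.rstrip('/')}{server_url}"

def check_server_status(hub_url, server_url, token):
    """GET the server's /api/status endpoint using an access:servers token."""
    base = full_server_url(hub_url, server_url).rstrip("/")
    r = requests.get(
        f"{base}/api/status",
        headers={"Authorization": f"token {token}"},
    )
    r.raise_for_status()
    return r.json()
```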
Python example#
The JupyterHub repo includes a complete example in examples/server-api that ties all these steps together.
In summary, the processes involved in managing servers on behalf of users are:
Get user information from /hub/api/users/:name.
The server model includes a ready state to tell you if it's ready.
If it's not ready, you can follow up with progress_url to wait for it.
If it is ready, you can use the url field to link directly to the running server.
The example below demonstrates starting and stopping servers via the JupyterHub API, including waiting for them to start via the progress API and waiting for them to stop by polling the user model.
import json
import logging
import time

log = logging.getLogger(__name__)

def event_stream(session, url):
    """Generator yielding events from a JSON event stream

    For use with the server progress API
    """
    r = session.get(url, stream=True)
    r.raise_for_status()
    for line in r.iter_lines():
        line = line.decode('utf8', 'replace')
        # event lines all start with `data:`
        # all other lines should be ignored (they will be empty)
        if line.startswith('data:'):
            yield json.loads(line.split(':', 1)[1])

def start_server(session, hub_url, user, server_name=""):
    """Start a server for a jupyterhub user

    Returns the full URL for accessing the server
    """
    user_url = f"{hub_url}/hub/api/users/{user}"
    log_name = f"{user}/{server_name}".rstrip("/")
    # step 1: get user status
    r = session.get(user_url)
    r.raise_for_status()
    user_model = r.json()
    # if server is not 'active', request launch
    if server_name not in user_model.get('servers', {}):
        log.info(f"Starting server {log_name}")
        r = session.post(f"{user_url}/servers/{server_name}")
        r.raise_for_status()
        if r.status_code == 201:
            log.info(f"Server {log_name} is launched and ready")
        elif r.status_code == 202:
            log.info(f"Server {log_name} is launching...")
        else:
            log.warning(f"Unexpected status: {r.status_code}")
        r = session.get(user_url)
        r.raise_for_status()
        user_model = r.json()
    # report server status
    server = user_model['servers'][server_name]
    if server['pending']:
        status = f"pending {server['pending']}"
    elif server['ready']:
        status = "ready"
    else:
        # shouldn't be possible!
        raise ValueError(f"Unexpected server state: {server}")
    log.info(f"Server {log_name} is {status}")
    # wait for server to be ready using progress API
    progress_url = user_model['servers'][server_name]['progress_url']
    for event in event_stream(session, f"{hub_url}{progress_url}"):
        log.info(f"Progress {event['progress']}%: {event['message']}")
        if event.get("ready"):
            server_url = event['url']
            break
    else:
        # server never ready
        raise ValueError(f"{log_name} never started!")
    # at this point, we know the server is ready and waiting to receive requests
    # return the full URL where the server can be accessed
    return f"{hub_url}{server_url}"

def stop_server(session, hub_url, user, server_name=""):
    """Stop a server via the JupyterHub API

    Returns when the server has finished stopping
    """
    # step 1: get user status
    user_url = f"{hub_url}/hub/api/users/{user}"
    server_url = f"{user_url}/servers/{server_name}"
    log_name = f"{user}/{server_name}".rstrip("/")
    log.info(f"Stopping server {log_name}")
    r = session.delete(server_url)
    if r.status_code == 404:
        log.info(f"Server {log_name} already stopped")
        return
    r.raise_for_status()
    if r.status_code == 204:
        log.info(f"Server {log_name} stopped")
        return
    # else: 202, stop requested, but not complete
    # wait for stop to finish
    log.info(f"Server {log_name} stopping...")
    # wait for server to be done stopping
    while True:
        r = session.get(user_url)
        r.raise_for_status()
        user_model = r.json()
        if server_name not in user_model.get("servers", {}):
            log.info(f"Server {log_name} stopped")
            return
        server = user_model["servers"][server_name]
        if not server['pending']:
            raise ValueError(f"Waiting for {log_name}, but no longer pending.")
        log.info(f"Server {log_name} pending: {server['pending']}")
        # wait to poll again
        time.sleep(1)
Configuration#
Further tutorials on configuring JupyterHub for specific tasks.
Real-time collaboration without impersonation#
Note
It is recommended to use at least JupyterLab 3.6 with JupyterHub >= 3.1.1 for this.
Note
Starting with JupyterLab >= 4.0, installing the jupyter-collaboration package in your single-user environment enables collaborative mode, instead of passing the --collaborative flag at runtime.
JupyterLab has support for real-time collaboration (RTC), where multiple users work with the same Jupyter server and see each other's edits. Unlike other collaborative-editing environments, Jupyter includes code execution, so granting someone access to your server also means granting them access to run code as you. That's a pretty big difference, and may not be acceptable to your users (or sysadmins!).
One strategy for this is to have the concept of “collaboration accounts”, which function as users themselves and represent the collaboration instead of any individual human. So instead of running code as you, anyone with access to the collaboration can run code as the collaboration.
Goals#
Our goal is to:
preserve default security, where nobody has access to each other’s servers
allow adding and removing users to collaborations without restarting JupyterHub
enable different behavior for collaboration, such as JupyterLab’s “collaborative mode”. This could also include mounting project-specific data sources, etc.
Key points to consider:
Roles are how we grant permission to users or groups.
A user can be in many groups, and both users and groups can have many roles.
Users and groups cannot have their role assignments changed without restarting JupyterHub.
Users can have their group assignments change at any time, and Authenticators can even delegate group membership to your identity provider, such as GitHub teams, etc.
A collaboration account should:
not be a real human user
be able to launch a server (this may mean a real system user or not, depending on the Spawner)
launch a server with a different configuration than other users
And users with access to collaborations should be able to:
start and stop servers for collaboration users
access collaboration servers
see what other users are using the server, according to their JupyterHub identity
Initial setup#
First, we are going to define our collaborations and their initial membership. We do this in a yaml file, but it could come from some other source, such as your identity provider or another data source:
projects:
  vox:
    members:
      - vex
      - vax
      - pike
  mighty:
    members:
      - fjord
      - beau
      - jester
This is a small data structure where the keys of projects are the names of the collaboration groups, and the members are a list of real users who should have access to these servers.
Next, we prepare to define the roles and groups:
c.JupyterHub.load_roles = []
c.JupyterHub.load_groups = {
    # collaborative accounts get added to this group
    # so it's easy to see which accounts are collaboration accounts
    "collaborative": [],
}
where we create the collaborative group, which will contain only the collaboration accounts themselves.
Creating collaboration accounts#
Next, we are going to iterate through our collaborations, create users, and assign permissions. We are going to:
create a JupyterHub user for each collaboration
assign the collaboration user to the collaboration group
create a role granting access to the collaboration user’s account
create a group for each collaboration
assign the group to the role, so it has access to the account
assign members of the project to the collaboration group, so they have access to the project.
for project_name, project in project_config["projects"].items():
    # get the members of the project
    members = project.get("members", [])
    print(f"Adding project {project_name} with members {members}")
    # add them to a group for the project
    c.JupyterHub.load_groups[project_name] = members
    # define a new user for the collaboration
    collab_user = f"{project_name}-collab"
    # add the collab user to the 'collaborative' group
    # so we can identify it as a collab account
    c.JupyterHub.load_groups["collaborative"].append(collab_user)
    # finally, grant members of the project collaboration group
    # access to the collab user's server,
    # and the admin UI so they can start/stop the server
    c.JupyterHub.load_roles.append(
        {
            "name": f"collab-access-{project_name}",
            "scopes": [
                f"access:servers!user={collab_user}",
                f"admin:servers!user={collab_user}",
                "admin-ui",
                f"list:users!user={collab_user}",
            ],
            "groups": [project_name],
        }
    )
The members step could be skipped if group membership is managed by the authenticator, or handled via the admin UI later, in which case we only need to handle group creation and role assignment.
This configuration code runs when jupyterhub starts up, and as noted above, users and groups cannot have their role assignments change without restarting JupyterHub. If new collaboration groups are created (within configuration, via the admin page, or via the Authenticator), the hub will need to be restarted in order for it to load roles for those new groups.
Distinguishing collaborative servers#
Finally, we want to enable RTC on the collaborative user servers, and only on those servers, which we do via a pre_spawn_hook that checks for membership in the collaborative group:
def pre_spawn_hook(spawner):
    group_names = {group.name for group in spawner.user.groups}
    if "collaborative" in group_names:
        spawner.log.info(f"Enabling RTC for user {spawner.user.name}")
        spawner.args.append("--LabApp.collaborative=True")

c.Spawner.pre_spawn_hook = pre_spawn_hook
This is also where we would put other collaboration customization, such as mounting data sets, any other collective credentials, etc.
Permissions#
What permissions did we need?
access:servers!user={collab_user} is the main one - this is what grants us access to the running server. But it doesn't grant us access to start or stop it.
admin:servers!user={collab_user} grants us access to start and stop the collaboration user's servers.
admin-ui and list:users!user={collab_user} allow users who are members of collaboration accounts access to the admin UI, but only with the ability to see the collaboration accounts they have access to, not any other users.
The admin-ui and list:users permissions are not strictly required, but they provide a shortcut to a UI for listing and accessing collaboration servers, which users will probably want for convenient access.
The only built-in UI JupyterHub has to view other users' servers is the admin page. Users can have very limited access to the admin page, allowing them to take the few actions they are permitted without needing any elevated permissions, so that's the quickest way to give them the buttons they need for this, but it may not be how you want to do it in the long run. If you provide the necessary links (e.g. https://yourhub.example/hub/spawn/vox-collab/) on some other page, these permissions are not necessary.
How-to guides#
The How-to guides provide more in-depth details than the tutorials. They are recommended for those who are already familiar with JupyterHub and have a specific goal. The guides help answer the question "How do I …?" based on a particular topic.
How-to#
The How-to guides provide practical step-by-step details to help you achieve a particular goal. They are useful when you are trying to get something done, though they require you to understand and adapt the steps to your specific use case.
Deploying JupyterHub in “API only mode”#
As a service for deploying and managing Jupyter servers for users, JupyterHub exposes this functionality primarily via a REST API. For convenience, JupyterHub also ships with a basic web UI built using that REST API. The basic web UI enables users to click a button to quickly start and stop their servers, and it lets admins perform some basic user and server management tasks.
The REST API has always provided additional functionality beyond what is available in the basic web UI. Similarly, we avoid implementing UI functionality that is not also available via the API. With JupyterHub 2.0, the basic web UI will always be composed using the REST API. In other words, no UI pages should rely on information not available via the REST API. Previously, some admin UI functionality, such as paginated requests, could only be achieved via admin pages.
Limited UI customization via templates#
The JupyterHub UI is customizable via extensible HTML templates, but this has some limited scope to what can be customized. Adding some content and messages to existing pages is well supported, but changing the page flow and what pages are available are beyond the scope of what is customizable.
Rich UI customization with REST API based apps#
Increasingly, JupyterHub is used purely as an API for managing Jupyter servers for other Jupyter-based applications that might want to present a different user experience. If you want a fully customized user experience, you can now disable the Hub UI and use your own pages together with the JupyterHub REST API to build your own web application to serve your users, relying on the Hub only as an API for managing users and servers.
One example of such an application is BinderHub, which powers https://mybinder.org, and motivates many of these changes.
BinderHub is distinct from a traditional JupyterHub deployment because it uses temporary users created for each launch. Instead of presenting a login page, users are presented with a form to specify what environment they would like to launch:
When a launch is requested:
an image is built, if necessary
a temporary user is created,
a server is launched for that user, and
when running, users are redirected to an already running server with an auth token in the URL
after the session is over, the user is deleted
This means that a lot of JupyterHub’s UI flow doesn’t make sense:
there is no way for users to login
the human user doesn't map onto a JupyterHub User in a meaningful way
when a server isn't running, there isn't a 'restart your server' action available, because the user has been deleted
users do not have any access to any Hub functionality, so presenting pages for those features would be confusing
BinderHub is one of the motivating use cases for JupyterHub supporting being used only via its API. We’ll use BinderHub here as an example of various configuration options.
Disabling Hub UI#
c.JupyterHub.hub_routespec is a configuration option to specify which URL prefix should be routed to the Hub. The default is /, which means that the Hub will receive all requests not already specified to be routed somewhere else.
There are three values that are most logical for hub_routespec:
/ - this is the default, and used in most deployments. It is also the only option prior to JupyterHub 1.4.
/hub/ - this serves only Hub pages, both UI and API.
/hub/api - this serves only the Hub API, so all Hub UI is disabled, aside from the OAuth confirmation page, if used.
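For an API-only deployment, a minimal jupyterhub_config.py sketch using the options above might look like this (the landing-page URL in extra_routes is a hypothetical example):

```python
# jupyterhub_config.py
# Serve only the Hub API; all Hub UI pages are disabled.
c.JupyterHub.hub_routespec = "/hub/api"

# Optional: route "/" to your own page so requests to not-running
# servers don't hit the proxy's default 404.
# (The target URL here is a hypothetical placeholder.)
c.Proxy.extra_routes = {
    "/": "http://my-landing-page.internal/",
}
```

Both options require JupyterHub 1.4 or later, as noted below.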
If you choose a hub routespec other than /, the main JupyterHub feature you will lose is the automatic handling of requests for /user/:username when the requested server is not running. JupyterHub's handling of this request shows a page telling you that the server is not running, with a button to launch it again:
If you set hub_routespec to something other than /, it is likely that you also want to register another destination for / to handle requests to not-running servers. If you don't, you will see a default 404 page from the proxy:
For mybinder.org, the default “start my server” page doesn’t make sense, because when a server is gone, there is no restart action. Instead, we provide hints about how to get back to a link to start a new server:
To achieve this, mybinder.org registers a route for / that goes to a custom endpoint that runs nginx and serves only this static HTML error page.
This is set with
c.Proxy.extra_routes = {
    "/": "http://custom-404-endpoint/",
}
You may want to use an alternate behavior, such as redirecting to a landing page, or taking some other action based on the requested page.
If you use c.JupyterHub.hub_routespec = "/hub/", then all the Hub pages will be available, and only this default-page-404 issue will come up. If you use c.JupyterHub.hub_routespec = "/hub/api/", then only the Hub API will be available, and all UI will be up to you.
mybinder.org takes this last option,
because none of the Hub UI pages really make sense.
Binder users don’t have any reason to know or care that JupyterHub happens
to be an implementation detail of how their environment is managed.
Seeing Hub error pages and messages in that situation is more likely to be confusing than helpful.
Added in version 1.4: c.JupyterHub.hub_routespec and c.Proxy.extra_routes are new in JupyterHub 1.4.
Writing a custom Proxy implementation#
JupyterHub 0.8 introduced the ability to write a custom implementation of the proxy. This enables deployments with different needs than the default proxy, configurable-http-proxy (CHP). CHP is a single-process nodejs proxy that the Hub manages by default as a subprocess (it can be run externally, as well, and typically is in production deployments).
The upside to CHP, and why we use it by default, is that it's easy to install and run (if you have nodejs, you are set!). The downsides are that it's a single process and does not support any persistence of the routing table.
So if the proxy process dies, your whole JupyterHub instance is inaccessible until the Hub notices, restarts the proxy, and restores the routing table. For deployments that want to avoid such a single point of failure, or leverage existing proxy infrastructure in their chosen deployment (such as Kubernetes ingress objects), the Proxy API provides a way to do that.
In general, for a proxy to be usable by JupyterHub, it must:
support websockets without prior knowledge of the URL where websockets may occur
support trie-based routing (i.e. allow different routes on /foo and /foo/bar, routed based on specificity)
not cause existing connections to drop when a route is added or removed
Optionally, if the JupyterHub deployment is to use host-based routing, the Proxy must additionally support routing based on the Host of the request.
Subclassing Proxy#
To start, any Proxy implementation should subclass the base Proxy class, as is done with custom Spawners and Authenticators.
from jupyterhub.proxy import Proxy

class MyProxy(Proxy):
    """My Proxy implementation"""

    ...
Starting and stopping the proxy#
If your proxy should be launched when the Hub starts, you must define how to start and stop your proxy:
class MyProxy(Proxy):
    ...

    async def start(self):
        """Start the proxy"""

    async def stop(self):
        """Stop the proxy"""
These methods may be coroutines. c.Proxy.should_start is a configurable flag that determines whether the Hub should call these methods when the Hub itself starts and stops.
Encryption#
When using internal_ssl to encrypt traffic behind the proxy, at minimum, your Proxy will need client ssl certificates which the Hub must be made aware of. These can be generated with the command jupyterhub --generate-certs, which will write them to the internal_certs_location in folders named proxy_api and proxy_client. Alternatively, these can be provided to the hub via the jupyterhub_config.py file by providing a dict of named paths to the external_authorities option. The hub will include all certificates provided in that dict in the trust bundle utilized by all internal components.
Purely external proxies#
Probably most custom proxies will be externally managed, such as Kubernetes ingress-based implementations. In this case, you do not need to define start and stop. To disable the methods, you can define should_start = False at the class level:
class MyProxy(Proxy):
    should_start = False
Routes#
At its most basic, a Proxy implementation defines a mechanism to add, remove, and retrieve routes. A proxy that implements these three methods is complete. Each of these methods may be a coroutine.
Definition: routespec
A routespec, which will appear in these methods, is a string describing a route to be proxied, such as /user/name/. A routespec will:
always end with /
always start with / if it is a path-based route, e.g. /proxy/path/
precede the leading / with a host for host-based routing, e.g. host.tld/proxy/path/
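These rules can be expressed as a small validity check; valid_routespec is a hypothetical helper for illustration, not part of the Proxy API:

```python
def valid_routespec(routespec):
    """Check the routespec rules: ends with /, and is either a
    path-based route (starts with /) or a host-based route
    (a host name preceding the leading /)."""
    if not routespec.endswith("/"):
        return False  # a routespec must always end with /
    if routespec.startswith("/"):
        return True  # path-based route, e.g. /proxy/path/
    # otherwise it must be host-based: a non-empty host, then a /
    host, slash, _path = routespec.partition("/")
    return bool(host) and bool(slash)
```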
Adding a route#
When adding a route, JupyterHub may pass a JSON-serializable dict as a data argument that should be attached to the proxy route. When that route is retrieved, the data argument should be returned as well. If your proxy implementation doesn't support storing data attached to routes, then your Python wrapper may have to handle storing the data piece itself, e.g. in a simple file or database.
async def add_route(self, routespec, target, data):
    """Proxy `routespec` to `target`.

    Store `data` associated with the routespec
    for retrieval later.
    """
Adding a route for a user looks like this:
await proxy.add_route('/user/pgeorgiou/', 'http://127.0.0.1:1227',
                      {'user': 'pgeorgiou'})
Removing routes#
delete_route() is given a routespec to delete. If there is no such route, delete_route should still succeed, but a warning may be issued.
async def delete_route(self, routespec):
"""Delete the route"""
Retrieving routes#
For retrieval, you only need to implement a single method that retrieves all routes. The return value for this function should be a dictionary, keyed by routespec, of dicts whose keys are the same three arguments passed to add_route (routespec, target, data):
async def get_all_routes(self):
    """Return all routes, keyed by routespec"""

{
    '/proxy/path/': {
        'routespec': '/proxy/path/',
        'target': 'http://...',
        'data': {},
    },
}
Note on activity tracking#
JupyterHub can track activity of users, for use in services such as culling idle servers. As of JupyterHub 0.8, this activity tracking is the responsibility of the proxy. If your proxy implementation can track activity to endpoints, it may add a last_activity key to the data of routes retrieved in .get_all_routes(). If present, the value of last_activity should be an ISO8601 UTC date string:
{
    '/user/pgeorgiou/': {
        'routespec': '/user/pgeorgiou/',
        'target': 'http://127.0.0.1:1227',
        'data': {
            'user': 'pgeorgiou',
            'last_activity': '2017-10-03T10:33:49.570Z',
        },
    },
}
If the proxy does not track activity, then only activity to the Hub itself is tracked, and services such as cull-idle will not work.
Now that notebook-5.0 tracks activity internally, we can retrieve activity information from the single-user servers instead, removing the need to track activity in the proxy. But this is not yet implemented in JupyterHub 0.8.0.
Registering custom Proxies via entry points#
As of JupyterHub 1.0, custom proxy implementations can register themselves via
the jupyterhub.proxies
entry point metadata.
To do this, in your setup.py
add:
setup(
    ...
    entry_points={
        'jupyterhub.proxies': [
            'mything = mypackage:MyProxy',
        ],
    },
)
If you have added this metadata to your package, admins can select your proxy with the configuration:
c.JupyterHub.proxy_class = 'mything'
instead of the full
c.JupyterHub.proxy_class = 'mypackage:MyProxy'
as previously required.
Additionally, configurable attributes for your proxy will
appear in jupyterhub help output and auto-generated configuration files
via jupyterhub --generate-config
.
Index of proxies#
A list of the proxies that are currently available for JupyterHub (that we know about).
jupyterhub/configurable-http-proxy
The default proxy, which uses node-http-proxy
jupyterhub/traefik-proxy
The proxy which configures the traefik proxy server for JupyterHub
AbdealiJK/configurable-http-proxy
A pure Python implementation of configurable-http-proxy
Using JupyterHub’s REST API#
This section will give you information on:
What you can do with the API
How to create an API token
Assigning permissions to a token
Updating to admin services
Making an API request programmatically using the requests library
Paginating API requests
Enabling users to spawn multiple named-servers via the API
Learn more about JupyterHub’s API
Before we discuss JupyterHub’s REST API, you can learn about REST APIs here. A REST API provides a standard way for users to get and send information to the Hub.
What you can do with the API#
Using the JupyterHub REST API, you can perform actions on the Hub, such as:
Checking which users are active
Adding or removing users
Adding or removing services
Stopping or starting single user notebook servers
Authenticating services
Communicating with an individual Jupyter server’s REST API
Create an API token#
To send requests using the JupyterHub API, you must pass an API token with the request.
While JupyterHub is running, any JupyterHub user can request a token via the token
page.
This is accessible via a token
link in the top nav bar from the JupyterHub home page,
or at the URL /hub/token
.

JupyterHub’s API token page#

JupyterHub’s token page after successfully requesting a token.#
Register API tokens via configuration#
Sometimes, you’ll want to pre-generate a token for access to JupyterHub, typically for use by external services, so that both JupyterHub and the service have access to the same value.
First, you need to generate a good random secret. A good way of generating an API token is by running:
openssl rand -hex 32
This openssl
command generates a random token that can be added to the JupyterHub configuration in jupyterhub_config.py
.
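If you prefer to stay in Python, the standard library’s secrets module produces an equivalent token (an alternative to the openssl command, not a requirement):

```python
import secrets

# 32 bytes of randomness, hex-encoded: same shape as `openssl rand -hex 32`
token = secrets.token_hex(32)
print(token)
```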
For external services, this would be registered with JupyterHub via configuration:
c.JupyterHub.services = [
    {
        "name": "my-service",
        "api_token": the_secret_value,
    },
]
At this point, requests authenticated with the token will be associated with the service my-service.
Note
You can also load additional tokens for users via the JupyterHub.api_tokens
configuration.
However, this option has been deprecated since the introduction of services.
Assigning permissions to a token#
Prior to JupyterHub 2.0, there were two levels of permissions:
user, and
admin
where a token would always have full permissions to do whatever its owner could do.
In JupyterHub 2.0,
specific permissions are now defined as ‘scopes’,
and can be assigned both at the user/service level,
and at the individual token level.
The previous behavior is represented by the scope inherit
,
and is still the default behavior for requesting a token if limited permissions are not specified.
This allows e.g. a user with full admin permissions to request a token with limited permissions.
In JupyterHub 5.0, you can specify scopes for a token when requesting it via the /hub/token page as a space-separated list.
In JupyterHub 3.0 and later, you can also request tokens with limited scopes via the JupyterHub API (provided you already have a token!):
import json
from urllib.parse import quote

import requests


def request_token(
    username, *, api_token, scopes=None, expires_in=0, hub_url="http://127.0.0.1:8081"
):
    """Request a new token for a user"""
    request_body = {}
    if expires_in:
        request_body["expires_in"] = expires_in
    if scopes:
        request_body["scopes"] = scopes
    url = hub_url.rstrip("/") + f"/hub/api/users/{quote(username)}/tokens"
    r = requests.post(
        url,
        data=json.dumps(request_body),
        headers={"Authorization": f"token {api_token}"},
    )
    if r.status_code >= 400:
        # extract error message for nicer error messages
        r.reason = r.json().get("message", r.text)
    r.raise_for_status()
    # response is a dict and will include the token itself in the 'token' field,
    # as well as other fields about the token
    return r.json()


request_token("myusername", scopes=["list:users"], api_token="abc123")
Updating to admin services#
Note
The api_tokens
configuration has been softly deprecated since the introduction of services.
We have no plans to remove it,
but deployments are encouraged to use service configuration instead.
If you have been using api_tokens to create an admin user and a token for that user to perform some automations, then the services mechanism may be a better fit. Suppose you have the following configuration:
c.JupyterHub.admin_users = {"service-admin"}
c.JupyterHub.api_tokens = {
    "secret-token": "service-admin",
}
This can be updated to create a service, with the following configuration:
c.JupyterHub.services = [
    {
        # give the token a name
        "name": "service-admin",
        "api_token": "secret-token",
        # "admin": True,  # if using JupyterHub 1.x
    },
]

# roles were introduced in JupyterHub 2.0
# prior to 2.0, only "admin": True or False was available
c.JupyterHub.load_roles = [
    {
        "name": "service-role",
        "scopes": [
            # specify the permissions the token should have
            "admin:users",
        ],
        "services": [
            # assign the service the above permissions
            "service-admin",
        ],
    }
]
The token will have the permissions listed in the role (see scopes for a list of available permissions), but there will no longer be a user account created to house it. The main noticeable difference between a user and a service is that there will be no notebook server associated with the account and the service will not show up in the various user list pages and APIs.
Make an API request#
To authenticate your requests, pass the API token in the request’s Authorization header.
Use requests#
Using the popular Python requests library, an API GET request is made to /users, and the request sends an API token for authorization. The response contains information about the users. Here’s example code to make an API request for the users of a JupyterHub deployment:
import requests

api_url = 'http://127.0.0.1:8081/hub/api'

r = requests.get(api_url + '/users',
    headers={
        'Authorization': f'token {token}',
    },
)
r.raise_for_status()
users = r.json()
This example provides a slightly more complicated request (to /groups/formgrade-data301/users), yet the process is very similar:
import requests

api_url = 'http://127.0.0.1:8081/hub/api'

data = {'name': 'mygroup', 'users': ['user1', 'user2']}

r = requests.post(api_url + '/groups/formgrade-data301/users',
    headers={
        'Authorization': f'token {token}',
    },
    json=data,
)
r.raise_for_status()
r.json()
The same API token can also authorize access to the Jupyter Notebook REST API
provided by notebook servers managed by JupyterHub if it has the necessary access:servers
scope.
Paginating API requests#
Added in version 2.0.
Pagination is available through the offset
and limit
query parameters on
list endpoints, which can be used to return ideally sized windows of results.
Here’s example code demonstrating pagination on the GET /users
endpoint to fetch the first 20 records.
import os
import requests

api_url = 'http://127.0.0.1:8081/hub/api'

r = requests.get(
    api_url + '/users?offset=0&limit=20',
    headers={
        "Accept": "application/jupyterhub-pagination+json",
        "Authorization": f"token {token}",
    },
)
r.raise_for_status()
r.json()
For backward-compatibility, the default structure of list responses is unchanged. However, this lacks pagination information (e.g. is there a next page), so if you have enough users that they won’t fit in the first response, it is a good idea to opt-in to the new paginated list format. There is a new schema for list responses which include pagination information. You can request this by including the header:
Accept: application/jupyterhub-pagination+json
with your request, in which case a response will look like:
{
  "items": [
    {
      "name": "username",
      "kind": "user",
      ...
    },
  ],
  "_pagination": {
    "offset": 0,
    "limit": 20,
    "total": 50,
    "next": {
      "offset": 20,
      "limit": 20,
      "url": "http://127.0.0.1:8081/hub/api/users?limit=20&offset=20"
    }
  }
}
where the list results (same as pre-2.0) will be in items, and pagination info will be in _pagination. The next field will include the offset, limit, and url for requesting the next page. next will be null if there is no next page.
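The loop over next links can be sketched as follows. Here fetch_page is a hypothetical callable mapping a URL to a parsed paginated response (for example, a small wrapper around requests.get(...).json() with your token), so the paging logic itself stays self-contained:

```python
def fetch_all_items(fetch_page, first_url):
    """Collect items across all pages by following `_pagination.next.url`.

    `fetch_page` is any callable mapping a URL to a parsed paginated
    response, e.g. a wrapper around an authenticated requests.get().
    """
    items = []
    url = first_url
    while url:
        page = fetch_page(url)
        items.extend(page["items"])
        next_info = page["_pagination"]["next"]
        # next is null (None in Python) on the last page
        url = next_info["url"] if next_info else None
    return items
```

Because the fetcher is passed in, the same function works whether you page through users, groups, or proxy routes.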
Pagination is governed by two configuration options:
JupyterHub.api_page_default_limit - the page size, if limit is unspecified in the request and the new pagination API is requested (default: 50)
JupyterHub.api_page_max_limit - the maximum page size a request can ask for (default: 200)
Pagination is enabled on the GET /users
, GET /groups
, and GET /proxy
REST endpoints.
Enabling users to spawn multiple named-servers via the API#
Support for multiple servers per user was introduced in JupyterHub version 0.8. Prior to that, each user could only launch a single default server via the API like this:
curl -X POST -H "Authorization: token <token>" "http://127.0.0.1:8081/hub/api/users/<user>/server"
With the named-server functionality, it’s now possible to launch more than one specifically named server for a given user. This could be used, for instance, to launch each server based on a different image.
First you must enable named-servers by including the following setting in the jupyterhub_config.py
file.
c.JupyterHub.allow_named_servers = True
If you are using the zero-to-jupyterhub-k8s set-up to run JupyterHub,
then instead of editing the jupyterhub_config.py
file directly, you could pass
the following as part of the config.yaml
file, as per the tutorial:
hub:
  extraConfig: |
    c.JupyterHub.allow_named_servers = True
With that setting in place, a new named-server is activated like this:
POST /api/users/:username/servers/:servername
e.g.
curl -X POST -H "Authorization: token <token>" "http://127.0.0.1:8081/hub/api/users/<user>/servers/<serverA>"
curl -X POST -H "Authorization: token <token>" "http://127.0.0.1:8081/hub/api/users/<user>/servers/<serverB>"
The same servers can be stopped by substituting DELETE
for POST
above.
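The same start/stop calls can be made from Python. This sketch mirrors the curl commands above; the hub URL and token values are placeholders for your own deployment:

```python
import requests

hub_api = "http://127.0.0.1:8081/hub/api"   # assumed Hub API URL
token = "abc123"                            # placeholder API token

def named_server_url(user, server_name):
    """Build the endpoint for one of a user's named servers."""
    return f"{hub_api}/users/{user}/servers/{server_name}"

def start_named_server(user, server_name):
    # POST starts the named server
    r = requests.post(named_server_url(user, server_name),
                      headers={"Authorization": f"token {token}"})
    r.raise_for_status()

def stop_named_server(user, server_name):
    # DELETE stops the same server
    r = requests.delete(named_server_url(user, server_name),
                        headers={"Authorization": f"token {token}"})
    r.raise_for_status()
```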
Some caveats for using named-servers#
For named-servers via the API to work, the spawner used to spawn these servers will need to be able to handle the case of multiple servers per user and ensure uniqueness of names, particularly if servers are spawned via docker containers or kubernetes pods.
Learn more about the API#
You can see the full JupyterHub REST API for more details.
Running proxy separately from the hub#
Background#
The thing which users directly connect to is the proxy, which by default is
configurable-http-proxy
. The proxy either redirects users to the
hub (for login and managing servers), or to their own single-user
servers. Thus, as long as the proxy stays running, access to existing
servers continues, even if the hub itself restarts or goes down.
When you first configure the hub, you may not even realize this because the proxy is automatically managed by the hub. This is great for getting started, and for most use-cases, although every time you restart the hub, all user connections are also restarted. However, it is also simple to run the proxy as a service separate from the hub, so that you are free to reconfigure the hub while only interrupting users who are waiting for their notebook server to start.
The default JupyterHub proxy is configurable-http-proxy. If you are using a different proxy, such as Traefik, these instructions are probably not relevant to you.
Configuration options#
c.JupyterHub.cleanup_servers = False
should be set, which tells the
hub to not stop servers when the hub restarts (this is useful even if
you don’t run the proxy separately).
c.ConfigurableHTTPProxy.should_start = False
should be set, which
tells the hub that the proxy should not be started (because you start
it yourself).
c.ConfigurableHTTPProxy.auth_token = "CONFIGPROXY_AUTH_TOKEN"
should be set to a
token for authenticating communication with the proxy.
c.ConfigurableHTTPProxy.api_url = 'http://localhost:8001'
should be
set to the URL which the hub uses to connect to the proxy’s API.
Proxy configuration#
You need to configure a service to start the proxy. An example command line for this is:
$ configurable-http-proxy --ip=127.0.0.1 --port=8000 --api-ip=127.0.0.1 --api-port=8001 --default-target=http://localhost:8081 --error-target=http://localhost:8081/hub/error
(Details of how to do this are out of the scope of this tutorial; for example, it might be a systemd service, or run within another docker container.) The proxy has no configuration files; all configuration is via the command line and environment variables.
--api-ip and --api-port (which tell the proxy where to listen) should match the hub’s ConfigurableHTTPProxy.api_url.
--ip, --port, and other options configure the user connections to the proxy.
--default-target and --error-target should point to the hub, and are used when users first navigate to the proxy.
You must define the environment variable CONFIGPROXY_AUTH_TOKEN
to
match the token given to c.ConfigurableHTTPProxy.auth_token
.
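A sketch of the hub side of this setup in jupyterhub_config.py, reading the shared token from the environment (this assumes you export CONFIGPROXY_AUTH_TOKEN both where the proxy starts and where the hub runs):

```python
# jupyterhub_config.py (sketch): hub configured to use an externally run proxy
import os

c.JupyterHub.cleanup_servers = False              # keep servers running across hub restarts
c.ConfigurableHTTPProxy.should_start = False      # the proxy is started separately
c.ConfigurableHTTPProxy.api_url = 'http://localhost:8001'
# same token the proxy was started with (CONFIGPROXY_AUTH_TOKEN)
c.ConfigurableHTTPProxy.auth_token = os.environ['CONFIGPROXY_AUTH_TOKEN']
```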
You should check the configurable-http-proxy options to see what other options are needed, for example, SSL options. Note that when the hub starts the proxy, these options are set through the hub’s configuration; when you run the proxy yourself, you need to pass them to the proxy directly.
Docker image#
You can use the jupyterhub/configurable-http-proxy Docker image to run the proxy.
See also#
Working with templates and UI#
The pages of the JupyterHub application are generated from Jinja templates. These allow the header, for example, to be defined once and incorporated into all pages. By providing your own template(s), you can have complete control over JupyterHub’s appearance.
Custom Templates#
JupyterHub will look for custom templates in all paths included in the
JupyterHub.template_paths
configuration option, falling back on these
default templates
if no custom template(s) with specified name(s) are found. This fallback
behavior is new in version 0.9; previous versions searched only the paths
explicitly included in template_paths
. You may override as many
or as few templates as you desire.
Extending Templates#
Jinja provides a mechanism to extend templates.
A base template can define block(s) within itself that child templates can fill in or supply content to. The
JupyterHub default templates
make extensive use of blocks, thus allowing you to customize parts of the
interface easily.
In general, a child template can extend a base template, page.html
, by beginning with:
{% extends "page.html" %}
This works, unless you are trying to extend the default template for the same
file name. Starting in version 0.9, you may refer to the base file with a
templates/
prefix. Thus, if you are writing a custom page.html
, start the
file with this block:
{% extends "templates/page.html" %}
By defining block
s with the same name as in the base template, child templates
can replace those sections with custom content. The content from the base
template can be included in the child template with the {{ super() }}
directive.
Example#
To add an additional message to the spawn-pending page, below the existing
text about the server starting up, place the content below in a file named
spawn_pending.html
. This directory must also be included in the
JupyterHub.template_paths
configuration option.
{% extends "templates/spawn_pending.html" %} {% block message %} {{ super() }}
<p>Patience is a virtue.</p>
{% endblock %}
Page Announcements#
To add announcements to be displayed on a page, you have two options:
Use configuration variables
Extend the templates
Announcement Configuration Variables#
If you set the configuration variable JupyterHub.template_vars = {'announcement': 'some_text'}, the given some_text will be placed at the top of all pages. The variables announcement_login, announcement_spawn, announcement_home, and announcement_logout are more specific and only show on their respective pages (overriding the global announcement variable).
Note that changing these variables requires a restart, unlike direct
template extension.
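For instance, the variable-based approach might look like this in jupyterhub_config.py (a sketch; the message strings are placeholders):

```python
# jupyterhub_config.py (sketch): announcements via template variables.
# 'announcement' shows on all pages; 'announcement_login' overrides it
# on the login page only.
c.JupyterHub.template_vars = {
    'announcement': 'Maintenance window this Saturday 06:00-08:00 UTC',
    'announcement_login': 'Sign in with your organization account',
}
```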
Alternatively, you can get the same effect by extending templates, which allows you
to update the messages without restarting. Set
c.JupyterHub.template_paths
as mentioned above, and then create a
template (for example, login.html
) with:
{% extends "templates/login.html" %} {% set announcement = 'some message' %}
Extending page.html
puts the message on all pages, but note that
extending page.html
takes precedence over an extension of a specific
page (unlike the variable-based approach above).
Upgrading JupyterHub#
JupyterHub offers easy upgrade pathways between minor versions. This document describes how to do these upgrades.
If you are using a JupyterHub distribution, you should consult the distribution’s documentation on how to upgrade. This documentation is for those who have set up their JupyterHub without using a distribution.
This documentation is lengthy because it is quite detailed. Most likely, upgrading JupyterHub is painless, quick, and causes minimal user interruption.
The steps are discussed in detail, so if you get stuck at any step you can always refer to this guide.
Read the Changelog#
The changelog contains information on what has changed with the new JupyterHub release and any deprecation warnings. Read these notes to familiarize yourself with the coming changes. There might be new releases of the authenticators & spawners you use, so read the changelogs for those too!
Notify your users#
If you use the default configuration where configurable-http-proxy
is managed by JupyterHub, your users will see service disruption during
the upgrade process. You should notify them, and pick a time to do the
upgrade where they will be least disrupted.
If you use a different proxy or run configurable-http-proxy
independent of JupyterHub, your users will be able to continue using notebook
servers they had already launched, but will not be able to launch new servers or sign in.
Backup database & config#
Before doing an upgrade, it is critical to back up:
Your JupyterHub database (SQLite by default, or MySQL / Postgres if you used those). If you use SQLite (the default), you should backup the jupyterhub.sqlite file.
Your jupyterhub_config.py file.
Your users’ home directories. This is unlikely to be affected directly by a JupyterHub upgrade, but we recommend a backup since user data is critical.
Shut down JupyterHub#
Shut down the JupyterHub process. This would vary depending on how you
have set up JupyterHub to run. It is most likely using a process
supervisor of some sort (systemd
or supervisord
or even docker
).
Use the supervisor-specific command to stop the JupyterHub process.
Upgrade JupyterHub packages#
There are two environments where the jupyterhub
package is installed:
The hub environment: where the JupyterHub server process runs. This is started with the jupyterhub command, and is what people generally think of as JupyterHub.
The notebook user environments: where the user notebook servers are launched from, and is probably custom to your own installation. This could be just one environment (different from the hub environment) that is shared by all users, one environment per user, or the same environment as the hub environment. The hub launches the jupyterhub-singleuser command in this environment, which in turn starts the notebook server.
You need to make sure the version of the jupyterhub
package matches
in both these environments. If you installed jupyterhub
with pip,
you can upgrade it with:
python3 -m pip install --upgrade jupyterhub==<version>
Where <version>
is the version of JupyterHub you are upgrading to.
If you used conda
to install jupyterhub
, you should upgrade it
with:
conda install -c conda-forge jupyterhub==<version>
You should also check for new releases of the authenticator & spawner you are using. You might wish to upgrade those packages, too, along with JupyterHub or upgrade them separately.
Upgrade JupyterHub database#
Once new packages are installed, you need to upgrade the JupyterHub
database. From the hub environment, in the same directory as your
jupyterhub_config.py
file, you should run:
jupyterhub upgrade-db
This should find the location of your database, and run the necessary upgrades for it.
SQLite database disadvantages#
SQLite has some disadvantages when it comes to upgrading JupyterHub. These are:
upgrade-db may not work, and you may need to delete your database and start with a fresh one.
downgrade-db will not work if you want to rollback to an earlier version, so backup the jupyterhub.sqlite file before upgrading.
What happens if I delete my database?#
Losing the Hub database is often not a big deal. Information that resides only in the Hub database includes:
active login tokens (user cookies, service tokens)
users added via JupyterHub UI, instead of config files
info about running servers
If the following conditions are true, you should be fine clearing the Hub database and starting over:
users are specified in the config file, or log in via an external authentication provider (Google, GitHub, LDAP, etc.)
user servers are stopped during the upgrade
you don’t mind causing users to log in again after the upgrade
Start JupyterHub#
Once the database upgrade is completed, start the jupyterhub
process again.
Log in and start the server to make sure things work as expected.
Check the logs for any errors or deprecation warnings. You might have to update your
jupyterhub_config.py
file to deal with any deprecated options.
Congratulations, your JupyterHub has been upgraded!
Interpreting common log messages#
When debugging errors and outages, looking at the logs emitted by JupyterHub is very helpful. This document intends to describe some common log messages, what they mean and what are the most common causes that generated them, as well as some possible ways to fix them.
Failing suspected API request to not-running server#
Example#
Your logs might be littered with lines that look scary
[W 2022-03-10 17:25:19.774 JupyterHub base:1349] Failing suspected API request to not-running server: /hub/user/<user-name>/api/metrics/v1
Cause#
This likely means that the user’s server has stopped running but they still have a browser tab open. For example, you might have 3 tabs open and you shut the server down via one. Another possible reason could be that you closed your laptop and the server was culled for inactivity, then you reopened the laptop! However, the client-side code (JupyterLab, Classic Notebook, etc.) doesn’t know that the server has shut down and continues to make some API requests.
JupyterHub’s architecture means that the proxy routes all requests that don’t go to a running user server to the hub process itself. The hub process then explicitly returns a failure response, so the client knows that the server is not running anymore. This is used by JupyterLab to inform the user that the server is not running anymore, and provide an option to restart it.
Most commonly, you’ll see this in reference to the /api/metrics/v1
URL, used by jupyter-resource-usage.
Actions you can take#
This log message is benign, and there is usually no action for you to take.
JupyterHub Singleuser Version mismatch#
Example#
jupyterhub version 1.5.0 != jupyterhub-singleuser version 1.3.0. This could cause failure to authenticate and result in redirect loops!
Cause#
JupyterHub requires the jupyterhub
python package installed inside the image or
environment, the user server starts in. This message indicates that the version of
the jupyterhub
package installed inside the user image or environment is not
the same as the JupyterHub server’s version itself. This is not necessarily always a
problem - some version drift is mostly acceptable, and the only two known cases of
breakage are across the 0.7 and 2.0 version releases. In those cases, issues pop
up immediately after upgrading your version of JupyterHub, so always check the JupyterHub changelog before upgrading. The primary problems this could cause are:
Infinite redirect loops after the user server starts
Missing expected environment variables in the user server once it starts
Failure for the started user server to authenticate with the JupyterHub server - note that this is not the same as user authentication failing!
However, for the most part, unless you are seeing these specific issues, the log
message should be counted as a warning to get the jupyterhub
package versions
aligned, rather than as an indicator of an existing problem.
Actions you can take#
Upgrade the version of the jupyterhub
package in your user environment or image
so that it matches the version of JupyterHub running your JupyterHub server! If you
are using the zero-to-jupyterhub helm chart, you can find the appropriate
version of the jupyterhub
package to install in your user image here
Configuration#
The following guides provide examples, including configuration files and tips, for the following:
Configuring user environments#
To deploy JupyterHub means you are providing Jupyter notebook environments for multiple users. Often, this includes a desire to configure the user environment in a custom way.
Since the jupyterhub-singleuser
server extends the standard Jupyter notebook
server, most configuration and documentation that applies to Jupyter Notebook
applies to the single-user environments. Configuration of user environments
typically does not occur through JupyterHub itself, but rather through system-wide
configuration of Jupyter, which is inherited by jupyterhub-singleuser
.
Tip: When searching for configuration tips for JupyterHub user environments, you might want to remove JupyterHub from your search because there are a lot more people out there configuring Jupyter than JupyterHub and the configuration is the same.
This section will focus on user environments, which includes the following:
Installing packages#
To make packages available to users, you will typically install packages system-wide or in a shared environment.
This installation location should always be the same environment in which
jupyterhub-singleuser itself is installed, and must be readable and
executable by your users. If you want your users to be able to install additional
packages, the installation location must also be writable by your users.
If you are using a standard Python installation on your system, use the following command:
sudo python3 -m pip install numpy
to install the numpy package in the default Python 3 environment on your system
(typically /usr/local
).
You may also use conda to install packages. If you do, you should make sure that the conda environment has appropriate permissions for users to be able to run Python code in the env. The env must be readable and executable by all users. Additionally, it must be writable if you want users to install additional packages.
Configuring Jupyter and IPython#
Jupyter and IPython have their own configuration systems.
As a JupyterHub administrator, you will typically want to install and configure environments for all JupyterHub users. For example, let’s say you wish for each student in a class to have the same user environment configuration.
Jupyter and IPython support “system-wide” locations for configuration, which is the logical place to put global configuration that you want to affect all users. It’s generally more efficient to configure user environments “system-wide”, and it’s a good practice to avoid creating files in the users’ home directories. The typical locations for these config files are:
system-wide in
/etc/{jupyter|ipython}
env-wide (environment wide) in
{sys.prefix}/etc/{jupyter|ipython}
.
Jupyter environment configuration priority#
When Jupyter runs in an environment (conda or virtualenv), it prefers to load configuration from the environment over each user’s own configuration (e.g. in ~/.jupyter
).
This may cause issues if you use a shared conda environment or virtualenv for users, because e.g. jupyterlab may try to write information like workspaces or settings to the environment instead of the user’s own directory.
This could fail with something like Permission denied: $PREFIX/etc/jupyter/lab
.
To avoid this issue, set JUPYTER_PREFER_ENV_PATH=0
in the user environment:
c.Spawner.environment.update(
    {
        "JUPYTER_PREFER_ENV_PATH": "0",
    }
)
which tells Jupyter to prefer user configuration paths (e.g. in ~/.jupyter
) to configuration set in the environment.
Example: Enable an extension system-wide#
For example, to enable the cython
IPython extension for all of your users, create the file /etc/ipython/ipython_config.py
:
c.InteractiveShellApp.extensions.append("cython")
Example: Enable a Jupyter notebook configuration setting for all users#
Note
These examples configure the Jupyter ServerApp, which is used by JupyterLab, the default in JupyterHub 2.0.
If you are using the classic Jupyter Notebook server, the same things should work, with the following substitutions:
Search for jupyter_server_config, and replace with jupyter_notebook_config
Search for ServerApp, and replace with NotebookApp
To enable Jupyter notebook’s internal idle-shutdown behavior (requires notebook ≥ 5.4), set the following in the /etc/jupyter/jupyter_server_config.py
file:
# shutdown the server after no activity for an hour
c.ServerApp.shutdown_no_activity_timeout = 60 * 60
# shutdown kernels after no activity for 20 minutes
c.MappingKernelManager.cull_idle_timeout = 20 * 60
# check for idle kernels every two minutes
c.MappingKernelManager.cull_interval = 2 * 60
Installing kernelspecs#
You may have multiple Jupyter kernels installed and want to make sure that they are available to all of your users. This means installing kernelspecs either system-wide (e.g. in /usr/local/) or in the sys.prefix
of JupyterHub
itself.
Jupyter kernelspec installation is system-wide by default, but some kernels may default to installing kernelspecs in your home directory. These will need to be moved system-wide to ensure that they are accessible.
To see where your kernelspecs are, you can use the following command:
jupyter kernelspec list
Example: Installing kernels system-wide#
Let’s assume that I have Python 2 and Python 3 environments that I want to make sure are available. I can install their specs system-wide (in /usr/local) using the following commands:
/path/to/python3 -m ipykernel install --prefix=/usr/local
/path/to/python2 -m ipykernel install --prefix=/usr/local
Multi-user hosts vs. Containers#
There are two broad categories of user environments that depend on what Spawner you choose:
Multi-user hosts (shared system)
Container-based
How you configure user environments for each category can differ a bit depending on what Spawner you are using.
The first category is a shared system (multi-user host) where
each user has a JupyterHub account, a home directory as well as being
a real system user. In this example, shared configuration and installation
must be in a ‘system-wide’ location, such as /etc/
, or /usr/local
or a custom prefix such as /opt/conda
.
When JupyterHub uses container-based Spawners (e.g. KubeSpawner or DockerSpawner), the ‘system-wide’ environment is really the container image used for users.
In both cases, you want to avoid putting configuration in user home directories because users can change those configuration settings. Also, home directories typically persist once they are created, thereby making it difficult for admins to update later.
Named servers#
By default, in a JupyterHub deployment, each user has one server only.
JupyterHub can, however, have multiple servers per user. This is mostly useful in deployments where users can configure the environment in which their server will start (e.g. resource requests on an HPC cluster), so that a given user can have multiple configurations running at the same time, without having to stop and restart their own server.
To allow named servers, include this code snippet in your config file:
c.JupyterHub.allow_named_servers = True
Named servers were implemented in the REST API in JupyterHub 0.8, and JupyterHub 1.0 introduces UI for managing named servers via the user home page:
as well as the admin page:
Named servers can be accessed, created, started, stopped, and deleted from these pages. Activity tracking is now per server as well.
To limit the number of named servers per user to a constant value, include this code snippet in your config file:
c.JupyterHub.named_server_limit_per_user = 5
Alternatively, to use a callable/awaitable based on the handler object, include this code snippet in your config file:
def named_server_limit_per_user_fn(handler):
user = handler.current_user
if user and user.admin:
return 0
return 5
c.JupyterHub.named_server_limit_per_user = named_server_limit_per_user_fn
This can be useful for quota service implementations. The example above limits the number of named servers for non-admin users only.
If named_server_limit_per_user is set to 0, no limit is enforced.
When using named servers, Spawners may need additional configuration to take the servername into account. While KubeSpawner takes the servername into account by default in pod_name_template, other Spawners may not. Check the documentation for the specific Spawner to see how single-user servers are named; for example, in DockerSpawner this involves modifying the name_template setting to include servername, e.g. "{prefix}-{username}-{servername}".
Switching back to the classic notebook#
By default, the single-user server launches JupyterLab, which is based on Jupyter Server.
This is the default server when running JupyterHub ≥ 2.0.
To switch to using the legacy Jupyter Notebook server (notebook < 7.0), you can set the JUPYTERHUB_SINGLEUSER_APP environment variable (in the single-user environment) to:
export JUPYTERHUB_SINGLEUSER_APP='notebook.notebookapp.NotebookApp'
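If you manage the single-user environment from the Hub side instead of from the user image, the same variable can be injected for every spawned server via the standard Spawner.environment setting; a sketch for jupyterhub_config.py:

```python
# jupyterhub_config.py
# inject JUPYTERHUB_SINGLEUSER_APP into every spawned single-user server,
# equivalent to exporting it in the single-user environment
c.Spawner.environment = {
    'JUPYTERHUB_SINGLEUSER_APP': 'notebook.notebookapp.NotebookApp',
}
```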
Note
JUPYTERHUB_SINGLEUSER_APP='notebook.notebookapp.NotebookApp'
is only valid for notebook < 7. notebook v7 is based on jupyter-server,
and the default jupyter-server application must be used.
Selecting the new notebook UI is no longer a matter of selecting the server app to launch, but only the default URL for users to visit. To use notebook v7 with JupyterHub, leave the default singleuser app config alone (or specify JUPYTERHUB_SINGLEUSER_APP=jupyter-server) and set the default URL for user servers:
c.Spawner.default_url = '/tree/'
Changed in version 2.0: JupyterLab, which is based on Jupyter Server, is now the default single-user UI if available, replacing the legacy Jupyter Notebook server. JupyterHub prior to 2.0 launched the legacy notebook server (jupyter notebook), and JupyterLab could be selected by specifying the following:
# jupyterhub_config.py
c.Spawner.cmd = ["jupyter-labhub"]
Alternatively, for an otherwise customized Jupyter Server app, set the environment variable using the following command:
export JUPYTERHUB_SINGLEUSER_APP='jupyter_server.serverapp.ServerApp'
Configure GitHub OAuth#
In this example, we show a configuration file for a fairly standard JupyterHub deployment with the following assumptions:
Running JupyterHub on a single cloud server
Using SSL on the standard HTTPS port 443
Using GitHub OAuth (using OAuthenticator) for login
Using the default spawner (to configure other spawners, uncomment and edit spawner_class as well as follow the instructions for your desired spawner)
Users exist locally on the server
Users’ notebooks to be served from ~/assignments to allow users to browse for notebooks within other users’ home directories
You want the landing page for each user to be a Welcome.ipynb notebook in their assignments directory
All runtime files are put into /srv/jupyterhub and log files in /var/log
The jupyterhub_config.py file would have these settings:
# jupyterhub_config.py file
c = get_config()
import os
pjoin = os.path.join
runtime_dir = '/srv/jupyterhub'
ssl_dir = pjoin(runtime_dir, 'ssl')
if not os.path.exists(ssl_dir):
os.makedirs(ssl_dir)
# Allows multiple single-server per user
c.JupyterHub.allow_named_servers = True
# https on :443
c.JupyterHub.port = 443
c.JupyterHub.ssl_key = pjoin(ssl_dir, 'ssl.key')
c.JupyterHub.ssl_cert = pjoin(ssl_dir, 'ssl.cert')
# put the JupyterHub cookie secret and state db
# in /srv/jupyterhub
c.JupyterHub.cookie_secret_file = pjoin(runtime_dir, 'cookie_secret')
c.JupyterHub.db_url = pjoin(runtime_dir, 'jupyterhub.sqlite')
# or `--db=/path/to/jupyterhub.sqlite` on the command-line
# use GitHub OAuthenticator for local users
c.JupyterHub.authenticator_class = 'oauthenticator.LocalGitHubOAuthenticator'
c.GitHubOAuthenticator.oauth_callback_url = os.environ['OAUTH_CALLBACK_URL']
# create system users that don't exist yet
c.LocalAuthenticator.create_system_users = True
# specify users and admin
c.Authenticator.allowed_users = {'rgbkrk', 'minrk', 'jhamrick'}
c.Authenticator.admin_users = {'jhamrick', 'rgbkrk'}
# uses the default spawner
# To use a different spawner, uncomment `spawner_class` and set to desired
# spawner (e.g. SudoSpawner). Follow instructions for desired spawner
# configuration.
# c.JupyterHub.spawner_class = 'sudospawner.SudoSpawner'
# start single-user notebook servers in ~/assignments,
# with ~/assignments/Welcome.ipynb as the default landing page
# this config could also be put in
# /etc/jupyter/jupyter_notebook_config.py
c.Spawner.notebook_dir = '~/assignments'
c.Spawner.args = ['--NotebookApp.default_url=/notebooks/Welcome.ipynb']
Using the GitHub Authenticator requires a few additional environment variables to be set prior to launching JupyterHub:
export GITHUB_CLIENT_ID=github_id
export GITHUB_CLIENT_SECRET=github_secret
export OAUTH_CALLBACK_URL=https://example.com/hub/oauth_callback
export CONFIGPROXY_AUTH_TOKEN=super-secret
# append log output to log file /var/log/jupyterhub.log
jupyterhub -f /etc/jupyterhub/jupyterhub_config.py &>> /var/log/jupyterhub.log
Visit the GitHub OAuthenticator reference to see the full list of options for configuring GitHub OAuth with JupyterHub.
Using a reverse proxy#
In the following example, we show configuration files for a JupyterHub server running locally on port 8000 but accessible from the outside on the standard SSL port 443. This could be useful if the JupyterHub server machine is also hosting other domains or content on 443. The goal in this example is to satisfy the following:
JupyterHub is running on a server, accessed only via HUB.DOMAIN.TLD:443
On the same machine, NO_HUB.DOMAIN.TLD strictly serves different content, also on port 443
nginx or apache is used as the public access point (which means that only nginx/apache will bind to 443)
After testing, the server in question should be able to score at least an A on the Qualys SSL Labs SSL Server Test
Let’s start out with the needed JupyterHub configuration in jupyterhub_config.py:
# Force the proxy to only listen to connections to 127.0.0.1 (on port 8000)
c.JupyterHub.bind_url = 'http://127.0.0.1:8000'
(For JupyterHub < 0.9 use c.JupyterHub.ip = '127.0.0.1'.)
For high-quality SSL configuration, we also generate Diffie-Hellman parameters. This can take a few minutes:
openssl dhparam -out /etc/ssl/certs/dhparam.pem 4096
Nginx#
This nginx config file is fairly standard fare except for the two location blocks within the main section for HUB.DOMAIN.TLD. To create a new site for jupyterhub in your nginx config, make a new file in sites-enabled, e.g. /etc/nginx/sites-enabled/jupyterhub.conf:
# Top-level HTTP config for WebSocket headers
# If Upgrade is defined, Connection = upgrade
# If Upgrade is empty, Connection = close
map $http_upgrade $connection_upgrade {
default upgrade;
'' close;
}
# HTTP server to redirect all 80 traffic to SSL/HTTPS
server {
listen 80;
server_name HUB.DOMAIN.TLD;
# Redirect the request to HTTPS
return 302 https://$host$request_uri;
}
# HTTPS server to handle JupyterHub
server {
listen 443 ssl;
server_name HUB.DOMAIN.TLD;
ssl_certificate /etc/letsencrypt/live/HUB.DOMAIN.TLD/fullchain.pem;
ssl_certificate_key /etc/letsencrypt/live/HUB.DOMAIN.TLD/privkey.pem;
ssl_protocols TLSv1 TLSv1.1 TLSv1.2;
ssl_prefer_server_ciphers on;
ssl_dhparam /etc/ssl/certs/dhparam.pem;
ssl_ciphers 'ECDHE-RSA-AES128-GCM-SHA256:ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES256-GCM-SHA384:ECDHE-ECDSA-AES256-GCM-SHA384:DHE-RSA-AES128-GCM-SHA256:DHE-DSS-AES128-GCM-SHA256:kEDH+AESGCM:ECDHE-RSA-AES128-SHA256:ECDHE-ECDSA-AES128-SHA256:ECDHE-RSA-AES128-SHA:ECDHE-ECDSA-AES128-SHA:ECDHE-RSA-AES256-SHA384:ECDHE-ECDSA-AES256-SHA384:ECDHE-RSA-AES256-SHA:ECDHE-ECDSA-AES256-SHA:DHE-RSA-AES128-SHA256:DHE-RSA-AES128-SHA:DHE-DSS-AES128-SHA256:DHE-RSA-AES256-SHA256:DHE-DSS-AES256-SHA:DHE-RSA-AES256-SHA:AES128-GCM-SHA256:AES256-GCM-SHA384:AES128-SHA256:AES256-SHA256:AES128-SHA:AES256-SHA:AES:CAMELLIA:DES-CBC3-SHA:!aNULL:!eNULL:!EXPORT:!DES:!RC4:!MD5:!PSK:!aECDH:!EDH-DSS-DES-CBC3-SHA:!EDH-RSA-DES-CBC3-SHA:!KRB5-DES-CBC3-SHA';
ssl_session_timeout 1d;
ssl_session_cache shared:SSL:50m;
ssl_stapling on;
ssl_stapling_verify on;
add_header Strict-Transport-Security max-age=15768000;
# Managing literal requests to the JupyterHub frontend
location / {
proxy_pass http://127.0.0.1:8000;
proxy_set_header X-Real-IP $remote_addr;
proxy_set_header Host $http_host;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
# websocket headers
proxy_http_version 1.1;
proxy_set_header Upgrade $http_upgrade;
proxy_set_header Connection $connection_upgrade;
proxy_set_header X-Scheme $scheme;
proxy_buffering off;
}
# Managing requests to verify letsencrypt host
location ~ /.well-known {
allow all;
}
}
If nginx is not running on port 443, substitute $http_host for $host on the lines setting the Host header.
nginx will now be the front-facing element of JupyterHub on 443, which means it is also free to bind other servers, like NO_HUB.DOMAIN.TLD, to the same port on the same machine and network interface. In fact, one can use the same server blocks as above for NO_HUB and simply add a line for the root directory of the site as well as the applicable location call:
server {
listen 80;
server_name NO_HUB.DOMAIN.TLD;
# Redirect the request to HTTPS
return 302 https://$host$request_uri;
}
server {
listen 443 ssl;
# INSERT OTHER SSL PARAMETERS HERE AS ABOVE
# SSL cert may differ
# Set the appropriate root directory
root /var/www/html;
# Set URI handling
location / {
try_files $uri $uri/ =404;
}
# Managing requests to verify letsencrypt host
location ~ /.well-known {
allow all;
}
}
Now restart nginx, restart JupyterHub, and enjoy accessing https://HUB.DOMAIN.TLD while serving other content securely on https://NO_HUB.DOMAIN.TLD.
SELinux permissions for Nginx#
On distributions with SELinux enabled (e.g. Fedora), one may encounter permission errors when the Nginx service is started.
We need to allow Nginx to perform network relay and connect to the JupyterHub port. The following commands do that:
semanage port -a -t http_port_t -p tcp 8000
setsebool -P httpd_can_network_relay 1
setsebool -P httpd_can_network_connect 1
Replace 8000 with the port the JupyterHub server is running from.
Apache#
As with Nginx above, you can use Apache as the reverse proxy. First, we need to enable the Apache modules that we are going to use:
a2enmod ssl rewrite proxy headers proxy_http proxy_wstunnel
Our Apache configuration is equivalent to the Nginx configuration above:
Redirect HTTP to HTTPS
Good SSL Configuration
Support for WebSocket on any proxied URL
JupyterHub is running locally at http://127.0.0.1:8000
# Redirect HTTP to HTTPS
Listen 80
<VirtualHost HUB.DOMAIN.TLD:80>
ServerName HUB.DOMAIN.TLD
Redirect / https://HUB.DOMAIN.TLD/
</VirtualHost>
Listen 443
<VirtualHost HUB.DOMAIN.TLD:443>
ServerName HUB.DOMAIN.TLD
# Enable HTTP/2, if available
Protocols h2 http/1.1
# HTTP Strict Transport Security (mod_headers is required) (63072000 seconds)
Header always set Strict-Transport-Security "max-age=63072000"
# Configure SSL
SSLEngine on
SSLCertificateFile /etc/letsencrypt/live/HUB.DOMAIN.TLD/fullchain.pem
SSLCertificateKeyFile /etc/letsencrypt/live/HUB.DOMAIN.TLD/privkey.pem
SSLOpenSSLConfCmd DHParameters /etc/ssl/certs/dhparam.pem
# Intermediate configuration from SSL-config.mozilla.org (2022-03-03)
# Please note, that this configuration might be outdated - please update it accordingly using https://ssl-config.mozilla.org/
SSLProtocol all -SSLv3 -TLSv1 -TLSv1.1
SSLCipherSuite ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES128-GCM-SHA256:ECDHE-ECDSA-AES256-GCM-SHA384:ECDHE-RSA-AES256-GCM-SHA384:ECDHE-ECDSA-CHACHA20-POLY1305:ECDHE-RSA-CHACHA20-POLY1305:DHE-RSA-AES128-GCM-SHA256:DHE-RSA-AES256-GCM-SHA384
SSLHonorCipherOrder off
SSLSessionTickets off
# Use RewriteEngine to handle WebSocket connection upgrades
RewriteEngine On
RewriteCond %{HTTP:Connection} Upgrade [NC]
RewriteCond %{HTTP:Upgrade} websocket [NC]
RewriteRule /(.*) ws://127.0.0.1:8000/$1 [P,L]
<Location "/">
# preserve Host header to avoid cross-origin problems
ProxyPreserveHost on
# proxy to JupyterHub
ProxyPass http://127.0.0.1:8000/
ProxyPassReverse http://127.0.0.1:8000/
RequestHeader set "X-Forwarded-Proto" expr=%{REQUEST_SCHEME}
</Location>
</VirtualHost>
If you need to run JupyterHub under /jhub/ or another location, use the configurations below:
JupyterHub running locally at http://127.0.0.1:8000/jhub/ or other location
httpd.conf amendments:
RewriteRule /jhub/(.*) ws://127.0.0.1:8000/jhub/$1 [P,L]
RewriteRule /jhub/(.*) http://127.0.0.1:8000/jhub/$1 [P,L]
ProxyPass /jhub/ http://127.0.0.1:8000/jhub/
ProxyPassReverse /jhub/ http://127.0.0.1:8000/jhub/
jupyterhub_config.py amendments:
# The public facing URL of the whole JupyterHub application.
# This is the address on which the proxy will bind. Sets protocol, IP, base_url
c.JupyterHub.bind_url = 'http://127.0.0.1:8000/jhub/'
Run JupyterHub without root privileges using sudo#
Note: Setting up sudo permissions involves many pieces of system configuration. It is quite easy to get wrong and very difficult to debug. Only do this if you are very sure you must.
Overview#
There are many Authenticators and Spawners available for JupyterHub. Some, such as DockerSpawner or OAuthenticator, do not need any elevated permissions. This document describes how to get the full default behavior of JupyterHub while running notebook servers as real system users on a shared system, without running the Hub itself as root.
Since JupyterHub needs to spawn processes as other users, the simplest way is to run it as root, spawning user servers with setuid. But this isn’t especially safe, because you have a process running on the public web as root.
A more prudent way to run the server while preserving functionality is to create a dedicated user with sudo access restricted to launching and monitoring single-user servers.
Create a user#
To do this, first create a user that will run the Hub:
sudo useradd rhea
This user shouldn’t have a login shell or password (possible with -r).
Set up sudospawner#
Next, you will need sudospawner to enable monitoring the single-user servers with sudo:
sudo python3 -m pip install sudospawner
Now we have to configure sudo to allow the Hub user (rhea) to launch the sudospawner script on behalf of our hub users (here zoe and wash). We want to confine these permissions to only what we really need.
Edit /etc/sudoers#
To do this we add to /etc/sudoers (use visudo for safe editing of sudoers):
specify the list of users JUPYTER_USERS for whom rhea can spawn servers
set the command JUPYTER_CMD that rhea can execute on behalf of users
give rhea permission to run JUPYTER_CMD on behalf of JUPYTER_USERS without entering a password
For example:
# comma-separated list of users that can spawn single-user servers
# this should include all of your Hub users
Runas_Alias JUPYTER_USERS = rhea, zoe, wash
# the command(s) the Hub can run on behalf of the above users without needing a password
# the exact path may differ, depending on how sudospawner was installed
Cmnd_Alias JUPYTER_CMD = /usr/local/bin/sudospawner
# actually give the Hub user permission to run the above command on behalf
# of the above users without prompting for a password
rhea ALL=(JUPYTER_USERS) NOPASSWD:JUPYTER_CMD
It might be useful to modify secure_path to add commands in path (search for secure_path in the sudo docs).
As an alternative to adding every user to the /etc/sudoers file, you can use a group in the last line above, instead of JUPYTER_USERS:
rhea ALL=(%jupyterhub) NOPASSWD:JUPYTER_CMD
If the jupyterhub group exists, there will be no need to edit /etc/sudoers again. A new user will gain access to the application when added to the group:
$ adduser -G jupyterhub newuser
Test sudo setup#
Test that the new user doesn’t need to enter a password to run the sudospawner command.
This should prompt for your password to switch to rhea, but not prompt for any password for the second switch. It should show some help output about logging options:
$ sudo -u rhea sudo -n -u $USER /usr/local/bin/sudospawner --help
Usage: /usr/local/bin/sudospawner [OPTIONS]
Options:
--help show this help information
...
And this should fail:
$ sudo -u rhea sudo -n -u $USER echo 'fail'
sudo: a password is required
Enable PAM for non-root#
By default, PAM authentication is used by JupyterHub. To use PAM, the process may need to be able to read the shadow password database.
Shadow group (Linux)#
Note: On Fedora-based distributions there is no clear way to configure the PAM database to allow sufficient access for authenticating with the target user’s password from JupyterHub. As a workaround, we recommend using an alternative authentication method.
$ ls -l /etc/shadow
-rw-r----- 1 root shadow 2197 Jul 21 13:41 shadow
If there’s already a shadow group, you are set. If its permissions are more like:
$ ls -l /etc/shadow
-rw------- 1 root wheel 2197 Jul 21 13:41 shadow
Then you may want to add a shadow group, and make the shadow file group-readable:
$ sudo groupadd shadow
$ sudo chgrp shadow /etc/shadow
$ sudo chmod g+r /etc/shadow
We want our new user to be able to read the shadow passwords, so add it to the shadow group:
$ sudo usermod -a -G shadow rhea
If you want jupyterhub to serve pages on a restricted port (such as port 80 for HTTP), then you will need to give node permission to do so:
sudo setcap 'cap_net_bind_service=+ep' /usr/bin/node
However, you may want to further understand the consequences of this. (Further reading)
You may also be interested in limiting the amount of CPU any process can use on your server. cpulimit is a useful tool that is available for many Linux distributions’ packaging systems. This can be used to keep any user’s process from using too many CPU cycles. You can configure it according to these instructions.
Shadow group (FreeBSD)#
NOTE: This has not been tested on FreeBSD and may not work as expected on the FreeBSD platform. Do not use in production without verifying that it works properly!
$ ls -l /etc/spwd.db /etc/master.passwd
-rw------- 1 root wheel 2516 Aug 22 13:35 /etc/master.passwd
-rw------- 1 root wheel 40960 Aug 22 13:35 /etc/spwd.db
Add a shadow group if there isn’t one, and make the shadow file group-readable:
$ sudo pw group add shadow
$ sudo chgrp shadow /etc/spwd.db
$ sudo chmod g+r /etc/spwd.db
$ sudo chgrp shadow /etc/master.passwd
$ sudo chmod g+r /etc/master.passwd
We want our new user to be able to read the shadow passwords, so add it to the shadow group:
$ sudo pw user mod rhea -G shadow
Test that PAM works#
We can verify that PAM is working, with:
$ sudo -u rhea python3 -c "import pamela, getpass; print(pamela.authenticate('$USER', getpass.getpass()))"
Password: [enter your unix password]
Make a directory for JupyterHub#
JupyterHub stores its state in a database, so it needs write access to a directory. The simplest way to deal with this is to make a directory owned by your Hub user, and use that as the CWD when launching the server.
$ sudo mkdir /etc/jupyterhub
$ sudo chown rhea /etc/jupyterhub
Start jupyterhub#
Finally, start the server as our newly configured user, rhea:
$ cd /etc/jupyterhub
$ sudo -u rhea jupyterhub --JupyterHub.spawner_class=sudospawner.SudoSpawner
And try logging in.
Troubleshooting: SELinux#
If you still get a generic Permission denied error (PermissionError), it’s possible SELinux is blocking you. Here’s how you can make a module to resolve this.
First, put this in a file named sudo_exec_selinux.te:
module sudo_exec_selinux 1.1;
require {
type unconfined_t;
type sudo_exec_t;
class file { read entrypoint };
}
#============= unconfined_t ==============
allow unconfined_t sudo_exec_t:file entrypoint;
Then run all of these commands as root:
$ checkmodule -M -m -o sudo_exec_selinux.mod sudo_exec_selinux.te
$ semodule_package -o sudo_exec_selinux.pp -m sudo_exec_selinux.mod
$ semodule -i sudo_exec_selinux.pp
Troubleshooting: PAM session errors#
If PAM authentication doesn’t work and you see errors for login:session-auth, or similar, consider updating to a more recent version of JupyterHub and disabling the opening of PAM sessions with c.PAMAuthenticator.open_sessions=False.
Explanation#
Explanation documentation provides big-picture descriptions of how JupyterHub works and how it can be used and configured. This section is intended for those seeking to expand their understanding of JupyterHub.
Capacity planning#
General capacity planning advice for JupyterHub is hard to give, because it depends almost entirely on what your users are doing, and what JupyterHub users do varies wildly in terms of resource consumption.
There is no single answer to “I have X users, what resources do I need?” or “How many users can I support with this machine?”
Here are three typical Jupyter use patterns that require vastly different resources:
Learning: negligible resources because computation is mostly idle, e.g. students learning programming for the first time
Production code: very intense, sustained load, e.g. training machine learning models
Bursting: mostly idle, but needs a lot of resources for short periods of time (interactive research often looks like this)
But just because there’s no single answer doesn’t mean we can’t help. So we have gathered here some useful information to help you make your decisions about what resources you need based on how your users work, including the relative invariants in terms of resources that JupyterHub itself needs.
JupyterHub infrastructure#
JupyterHub consists of a few components that are always running. These take up very little resources, especially relative to the resources consumed by users when you have more than a few.
As an example, an instance of mybinder.org (running JupyterHub 1.5.0), running with typically ~100-150 users has:
| Component | CPU (mean/peak) | Memory (mean/peak) |
|---|---|---|
| Hub | 4% / 13% | 230 MB / 260 MB |
| Proxy | 6% / 13% | 47 MB / 65 MB |
So it would be pretty generous to allocate ~25% of one CPU core and ~500MB of RAM to overall JupyterHub infrastructure.
The rest is going to be up to your users. Per-user overhead from JupyterHub is typically negligible up to at least a few hundred concurrent active users.
JupyterHub component resource usage for mybinder.org.#
Factors to consider#
Static vs elastic resources#
A big factor in planning resources is: how much does it cost to change your mind? If you are using a single shared machine with local storage, migrating to a new one because it turns out your users don’t fit might be very costly. You will have to get a new machine, set it up, and maybe even migrate user data.
On the other hand, if you are using ephemeral resources, such as node pools in Kubernetes, changing resource types costs close to nothing because nodes can automatically be added or removed as needed.
Take that cost into account when you are picking how much memory or CPU to allocate to users.
Static resources (like the-littlest-jupyterhub) provide more stable, predictable costs, while elastic resources (like zero-to-jupyterhub) tend to provide lower overall costs (especially when deployed with monitoring that allows cost optimizations over time) at the price of less predictability.
Limit vs Request for resources#
Many scheduling tools like Kubernetes have two separate ways of allocating resources to users. A Request or Reservation describes how much resources are set aside for each user. Often, this doesn’t have any practical effect other than deciding when a given machine is considered ‘full’. If you are using expandable resources like an autoscaling Kubernetes cluster, a new node must be launched and added to the pool if you ‘request’ more resources than fit on currently running nodes (a cluster scale-up event). If you are running on a single VM, this describes how many users you can run at the same time, full stop.
A Limit, on the other hand, enforces a limit to how much resources any given user can consume. For more information on what happens when users try to exceed their limits, see Oversubscribed CPU is okay, running out of memory is bad.
In the strictest, safest case, you can have these two numbers be the same. That means that each user is limited to fit within the resources allocated to it. This avoids oversubscription of resources (allowing use of more than you have available), at the expense (in a literal, this-costs-money sense) of reserving lots of usually-idle capacity.
However, you often find that a small fraction of users use more resources than others. In this case you may give users limits that go beyond the amount of resources requested. This is called oversubscribing the resources available to users.
Having a gap between the request and the limit means you can fit a number of typical users on a node (based on the request), but still limit how much a runaway user can gobble up for themselves.
Oversubscribed CPU is okay, running out of memory is bad#
An important consideration when assigning resources to users is: What happens when users need more than I’ve given them?
A good summary to keep in mind:
When tasks don’t get enough CPU, things are slow. When they don’t get enough memory, things are broken.
This means it’s very important that users have enough memory, but much less important that they always have exclusive access to all the CPU they can use.
This relates to Limits and Requests, because these are the consequences of your limits and/or requests not matching what users actually try to use.
A table of mismatched resource allocation situations and their consequences:
| issue | consequence |
|---|---|
| Requests too high | Unnecessarily high cost and/or low capacity |
| CPU limit too low | Poor performance experienced by users |
| CPU oversubscribed (too-low request + too-high limit) | Poor performance across the system; may crash, if severe |
| Memory limit too low | Servers killed by the Out-of-Memory Killer (OOM); lost work for users |
| Memory oversubscribed (too-low request + too-high limit) | System memory exhaustion: all kinds of hangs, crashes, and weird errors. Very bad. |
Note that the ‘oversubscribed’ problem case is where the request is lower than typical usage, meaning that the total reserved resources isn’t enough for the total actual consumption. This doesn’t mean that all your users exceed the request, just that the limit gives enough room for the average user to exceed the request.
All of these considerations are important per node. Larger nodes means more users per node, and therefore more users to average over. It also means more chances for multiple outliers on the same node.
Example case for oversubscribing memory#
Take, for example, this system and sampling of user behavior:
System memory = 8G
memory request = 1G, limit = 3G
typical ‘heavy’ user: 2G
typical ‘light’ user: 0.5G
This will assign 8 users to those 8G of RAM (remember: only requests are used for deciding when a machine is ‘full’). As long as those 8 users’ total actual usage is under 8G, everything is fine. But the limit allows a total of 24G to be used, which would be a mess if everyone used their full limit. But not everyone uses the full limit, which is the point!
This pattern is fine if 1/8 of your users are ‘heavy’, because typical usage will be ~0.7G and your total usage will be ~5.5G (1 × 2 + 7 × 0.5 = 5.5). But if 50% of your users are ‘heavy’, you have a problem, because that means your users will be trying to use 10G (4 × 2 + 4 × 0.5 = 10), which you don’t have.
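The arithmetic in this example can be sketched as a short calculation (the numbers are taken from the list above):

```python
def total_usage(n_heavy, n_light, heavy_gb=2.0, light_gb=0.5):
    """Total actual memory use for a mix of heavy and light users."""
    return n_heavy * heavy_gb + n_light * light_gb

node_ram_gb = 8
request_gb = 1.0
# only requests decide when a node is considered 'full'
users_per_node = int(node_ram_gb / request_gb)

# 1/8 heavy users: total stays comfortably under the node's 8G
print(total_usage(1, 7))   # 5.5
# 50% heavy users: oversubscription bites
print(total_usage(4, 4))   # 10.0
```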
You can make guesses at these numbers, but the only real way to get them is to measure (see Measuring user resource consumption).
CPU:memory ratio#
Most of the time, you’ll find that only one resource is the limiting factor for your users. Most often it’s memory, but for certain tasks, it could be CPU (or even GPUs).
Many cloud deployments have just one or a few fixed ratios of cpu to memory (e.g. ‘general purpose’, ‘high memory’, and ‘high cpu’). Setting your secondary resource allocation according to this ratio after selecting the more important limit results in a balanced resource allocation.
For instance, some of Google Cloud’s ratios are:
| node type | GB RAM / CPU core |
|---|---|
| n2-highmem | 8 |
| n2-standard | 4 |
| n2-highcpu | 1 |
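Matching the secondary resource to the node’s ratio is simple arithmetic; a minimal sketch, assuming the ratios from the table above:

```python
# GB of RAM per CPU core for some Google Cloud node families (from the table above)
GB_PER_CORE = {"n2-highmem": 8, "n2-standard": 4, "n2-highcpu": 1}

def balanced_memory_gb(cpu_limit, node_type):
    """Memory allocation that matches the node's CPU:memory ratio."""
    return cpu_limit * GB_PER_CORE[node_type]

# if CPU is your limiting resource at 1 core per user on n2-standard nodes,
# the matching memory allocation is 4G:
print(balanced_memory_gb(1, "n2-standard"))  # 4
```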
Idleness#
Jupyter being an interactive tool means people tend to spend a lot more time reading and thinking than actually running resource-intensive code. This significantly affects how much cpu resources a typical active user needs, but often does not significantly affect the memory.
Ways to think about this:
More idle users means unused CPU. This generally means setting your CPU limit higher than your CPU request.
What do your users do when they are running code? Is it typically single-threaded local computation in a notebook? If so, there’s little reason to set a limit higher than 1 CPU core.
Do typical computations take a long time, or just a few seconds? Longer typical computations means it’s more likely for users to be trying to use the CPU at the same moment, suggesting a higher request.
Even with idle users, parallel computation adds up quickly - one user fully loading 4 cores and 3 using almost nothing still averages to more than a full CPU core per user.
Long-running intense computations suggest higher requests.
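The averaging claim above can be checked directly; the ~0.05 cores per nearly-idle user is an assumed figure for illustration:

```python
# one of four users saturates 4 cores; the other three are nearly idle
# (assumed ~0.05 cores each)
usage = [4.0, 0.05, 0.05, 0.05]
mean = sum(usage) / len(usage)
print(mean > 1.0)  # True: still more than a full core per user on average
```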
Again, using mybinder.org as an example: we run around 100 users on 8-core nodes, and still see fairly low overall CPU usage on each user node. The limit here is actually Kubernetes’ pods per node, not memory or CPU. This is likely an extreme case, as many Binder users come from clicking links on webpages without any actual intention of running code.
mybinder.org node CPU usage is low with 50-150 users sharing just 8 cores#
Concurrent users and culling idle servers#
Related to Idleness above, all of these resource consumption figures and limits are calculated based on concurrently active users, not total users. You might have 10,000 users of your JupyterHub deployment, but only 100 of them running at any given time. That 100 is the main number you need to use for your capacity planning. JupyterHub costs scale very little based on the number of total users, up to a point.
There are two important definitions for active user:
Are they actually there (i.e. a human interacting with Jupyter, or running code that might be unattended)?
Is their server running (this is where resource reservations and limits are actually applied)
Connecting those two definitions (how long are servers running if their humans aren’t using them) is an important area of deployment configuration, usually implemented via the JupyterHub idle culler service.
There are a lot of considerations when it comes to culling idle servers, and the right choices will depend on your deployment:
How much does it save me to shut down user servers? (e.g. keeping an elastic cluster small, or keeping a fixed-size deployment available to active users)
How much does it cost my users to have their servers shut down? (e.g. lost work if shutdown prematurely)
How easy do I want it to be for users to keep their servers running? (e.g. Do they want to run unattended simulations overnight? Do you want them to?)
Like many other things in this guide, there are many correct answers leading to different configuration choices. For more detail on culling configuration and considerations, consult the JupyterHub idle culler documentation.
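As a sketch, the idle culler is typically registered as a JupyterHub service in jupyterhub_config.py; the one-hour timeout here is an illustrative choice, not a recommendation:

```python
# jupyterhub_config.py -- requires `pip install jupyterhub-idle-culler`
c.JupyterHub.load_roles = [
    {
        "name": "jupyterhub-idle-culler-role",
        "scopes": ["list:users", "read:users:activity", "read:servers", "delete:servers"],
        "services": ["jupyterhub-idle-culler-service"],
    }
]
c.JupyterHub.services = [
    {
        "name": "jupyterhub-idle-culler-service",
        # cull servers idle for more than one hour (3600 s, illustrative)
        "command": ["python3", "-m", "jupyterhub_idle_culler", "--timeout=3600"],
    }
]
```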
More tips#
Start strict and generous, then measure#
A good general tip is to give your users as many resources as you can afford and think they might use. Then, use resource usage metrics (e.g. from Prometheus) to analyze what your users actually need, and tune accordingly. Remember: Limits affect your user experience and stability. Requests mostly affect your costs.
For example, a sensible starting point (lacking any other information) might be:
request:
cpu: 0.5
mem: 2G
limit:
cpu: 1
mem: 2G
(more memory if significant computations are likely - machine learning models, data analysis, etc.)
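For Kubernetes-based deployments, that generic starting point maps onto KubeSpawner traits (a sketch, assuming KubeSpawner is your configured spawner):

```python
# jupyterhub_config.py -- sketch assuming KubeSpawner
c.KubeSpawner.cpu_guarantee = 0.5   # "request" in Kubernetes terms
c.KubeSpawner.cpu_limit = 1
c.KubeSpawner.mem_guarantee = "2G"  # "request"
c.KubeSpawner.mem_limit = "2G"
```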
Some actions you might take based on what you observe:
If you see out-of-memory killer events, increase the limit (or talk to your users!)
If you see typical memory well below your limit, reduce the request (but not the limit)
If nobody uses that much memory, reduce your limit
If CPU is your limiting scheduling factor and your CPUs are mostly idle, reduce the cpu request (maybe even to 0!).
If CPU usage continues to be low, increase the limit to 2 or 4 to allow bursts of parallel execution.
Measuring user resource consumption#
It is highly recommended to deploy monitoring services such as Prometheus and Grafana to get a view of your users’ resource usage. This is the only way to truly know what your users need.
JupyterHub has some experimental Grafana dashboards you can use as a starting point to keep an eye on your resource usage. Here are some sample charts (again from mybinder.org), showing >90% of users using less than 10% CPU and 200MB of RAM, but a few outliers near the limit of 1 CPU and 2GB of RAM. This is the kind of information you can use to tune your requests and limits.
Measuring costs#
Measuring costs may be as important as measuring your users' activity. If you are using a cloud provider, you can often use cost thresholds and quotas to instruct them to notify you if your costs are too high, e.g. "Have AWS send me an email if I hit X spending trajectory on week 3 of the month." You can then use this information to tune your resources based on what you can afford. You can combine this with user resource consumption to figure out if you have a problem, e.g. "my users really do need X resources, but I can only afford to give them 80% of X." This information may prove useful when asking your budget-approving folks for more funds.
Additional resources#
There are lots of other resources for cost and capacity planning that may be specific to JupyterHub and/or your cloud provider.
Here are some useful links to other resources:
Zero to JupyterHub documentation
Cloud platform cost calculators
The Hub’s Database#
JupyterHub uses a database to store information about users, services, and other data needed for operating the Hub. This is the state of the Hub.
Why does JupyterHub have a database?#
JupyterHub is a stateful application (more on that 'state' later). Updating JupyterHub's configuration or upgrading the version of JupyterHub requires restarting the JupyterHub process to apply the changes. We want to minimize the disruption caused by restarting the Hub process, so it can be a mundane, frequent, routine activity. Storing state information outside the process for later retrieval is necessary for this, and is one of the main things databases are for.
A lot of the operations in JupyterHub are also relationships, which is exactly what SQL databases are great at. For example:
Given an API token, what user is making the request?
Which users don’t have running servers?
Which servers belong to user X?
Which users have not been active in the last 24 hours?
Finally, a database allows us to have more information stored without needing it all loaded in memory, e.g. supporting a large number (several thousands) of inactive users.
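Those relational lookups can be sketched with a tiny in-memory SQLite example. The schema below is a deliberately simplified, hypothetical one for illustration; the real JupyterHub schema differs:

```python
import sqlite3

# Simplified, hypothetical tables -- not JupyterHub's actual schema.
db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE servers (id INTEGER PRIMARY KEY, user_id INTEGER REFERENCES users(id));
""")
db.execute("INSERT INTO users (id, name) VALUES (1, 'ada'), (2, 'grace')")
db.execute("INSERT INTO servers (id, user_id) VALUES (10, 1)")

# "Which users don't have running servers?"
rows = db.execute("""
    SELECT name FROM users
    LEFT JOIN servers ON servers.user_id = users.id
    WHERE servers.id IS NULL
""").fetchall()
print(rows)  # [('grace',)]
```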
What’s in the database?#
The short answer of what’s in the JupyterHub database is “everything.” JupyterHub’s state lives in the database. That is, everything JupyterHub needs to be aware of to function that doesn’t come from the configuration files, such as
users, roles, role assignments
state and URLs of running servers
Hashed API tokens
Short-lived state related to OAuth flow
Timestamps for when users, tokens, and servers were last used
What’s not in the database#
Not quite all of JupyterHub’s state is in the database. This mostly involves transient state, such as the ‘pending’ transitions of Spawners (starting, stopping, etc.). Anything not in the database must be reconstructed on Hub restart, and the only sources of information to do that are the database and JupyterHub configuration file(s).
How does JupyterHub use the database?#
JupyterHub makes some unusual choices in how it connects to the database. These choices represent trade-offs favoring single-process simplicity and performance at the expense of horizontal scalability (multiple Hub instances).
We often say that the Hub ‘owns’ the database. This ownership means that we assume the Hub is the only process that will talk to the database. This assumption enables us to make several caching optimizations that dramatically improve JupyterHub’s performance (i.e. data written recently to the database can be read from memory instead of fetched again from the database) that would not work if multiple processes could be interacting with the database at the same time.
Database operations are also synchronous, so while JupyterHub is waiting on a database operation, it cannot respond to other requests.
This allows us to avoid complex locking mechanisms, because transaction races can only occur during an await, so we only need to make sure we've completed any given transaction before the next await in a given request.
Note
We are slowly working to remove these assumptions, and moving to a more traditional db session per-request pattern. This will enable multiple Hub instances and enable scaling JupyterHub, but will significantly reduce the number of active users a single Hub instance can serve.
Database performance in a typical request#
Most authenticated requests to JupyterHub involve a few database transactions:
look up the authenticated user (e.g. look up token by hash, then resolve owner and permissions)
record activity
perform any relevant changes involved in processing the request (e.g. create the records for a running server when starting one)
This means that the database is involved in almost every request, but only in quite small, simple queries, e.g.:
lookup one token by hash
lookup one user by name
list tokens or servers for one user (typically 1-10)
etc.
The database as a limiting factor#
As a result of the above transactions in most requests, database performance is the leading factor in JupyterHub’s baseline requests-per-second performance, but that cost does not scale significantly with the number of users, active or otherwise. However, the database is rarely a limiting factor in JupyterHub performance in a practical sense, because the main thing JupyterHub does is start, stop, and monitor whole servers, which take far more time than any small database transaction, no matter how many records you have or how slow your database is (within reason). Additionally, there is usually very little load on the database itself.
By far the most taxing activity on the database is the ‘list all users’ endpoint, primarily used by the idle-culling service. Database-based optimizations have been added to make even these operations feasible for large numbers of users:
State filtering on GET /hub/api/users?state=active, which limits the number of results in the query to only the relevant subset (added in JupyterHub 1.3), rather than all users.
Pagination of all list endpoints, allowing the request of a large number of resources to be more fairly balanced with other Hub activities across multiple requests (added in 2.0).
Note
It's important to note when discussing performance and limiting factors that all of this only applies to requests to /hub/... paths.
The Hub and its database are not involved in most requests to single-user servers (/user/...), which is by design, and largely motivated by the fact that the Hub itself doesn't need to be fast, because its operations are infrequent and large.
Database backends#
JupyterHub supports a variety of database backends via SQLAlchemy. The default is sqlite, which works great for many cases, but you should be able to use many backends supported by SQLAlchemy. Usually, this will mean PostgreSQL or MySQL, both of which are officially supported and well tested with JupyterHub, but others may work as well. See SQLAlchemy’s docs for how to connect to different database backends. Doing so generally involves:
installing a Python package that provides a client implementation, and
setting JupyterHub.db_url to connect to your database with the specified implementation
Default backend: SQLite#
The default database backend for JupyterHub is SQLite. We have chosen SQLite as JupyterHub’s default because it’s simple (the ‘database’ is a single file) and ubiquitous (it is in the Python standard library). It works very well for testing, small deployments, and workshops.
For production systems, SQLite has some disadvantages when used with JupyterHub:
upgrade-db may not always work, and you may need to start with a fresh database
downgrade-db will not work if you want to rollback to an earlier version, so backup the jupyterhub.sqlite file before upgrading (JupyterHub automatically creates a date-stamped backup file when upgrading sqlite)
The sqlite documentation provides a helpful page about when to use SQLite and where traditional RDBMS may be a better choice.
Picking your database backend (PostgreSQL, MySQL)#
When running a long-term deployment or a production system, we recommend using a full-fledged relational database, such as PostgreSQL or MySQL, that supports the SQL ALTER TABLE statement, which is used in some database upgrade steps.
In general, you select your database backend with JupyterHub.db_url, and can further configure it (usually not necessary) with JupyterHub.db_kwargs.
Notes and Tips#
SQLite#
The SQLite database should not be used on NFS. SQLite uses reader/writer locks to control access to the database, and this locking mechanism might not work correctly if the database file is kept on an NFS filesystem, because fcntl() file locking is broken on many NFS implementations. Therefore, you should avoid putting SQLite database files on NFS, since NFS does not handle multiple processes that might try to access the file at the same time well.
PostgreSQL#
We recommend using PostgreSQL for production if you are unsure whether to use MySQL or PostgreSQL or if you do not have a strong preference. There is additional configuration required for MySQL that is not needed for PostgreSQL.
For example, to connect to a PostgreSQL database with psycopg2:
install psycopg2: pip install psycopg2 (or psycopg2-binary to avoid compilation, which is not recommended for production)
set authentication via environment variables PGUSER and PGPASSWORD
configure JupyterHub.db_url: c.JupyterHub.db_url = "postgresql+psycopg2://my-postgres-server:5432/my-db-name"
MySQL / MariaDB#
You should probably use the pymysql or mysqlclient sqlalchemy provider, or another backend recommended by sqlalchemy.
You also need to set pool_recycle to some value (typically 60-300; JupyterHub defaults to 60), which depends on your MySQL setup. This is necessary since MySQL kills connections server-side if they've been idle for a while, and the connection from the hub will be idle for longer than most connections. This behavior will lead to frustrating 'the connection has gone away' errors from sqlalchemy if pool_recycle is not set.
If you use utf8mb4 collation with MySQL earlier than 5.7.7 or MariaDB earlier than 10.2.1, you may get an "1709, Index column size too large" error. To fix this you need to set innodb_large_prefix to enabled and innodb_file_format to Barracuda to allow for the index sizes jupyterhub uses. row_format will be set to DYNAMIC as long as those options are set correctly. Later versions of MariaDB and MySQL set these values by default, have a default DYNAMIC row_format, and pose no trouble to users.
For example, to connect to a mysql database with mysqlclient:
install mysqlclient:
pip install mysqlclient
configure
JupyterHub.db_url
:c.JupyterHub.db_url = "mysql+mysqldb://myuser:mypassword@my-sql-server:3306/my-db-name"
Security Overview#
The Security Overview section helps you learn about:
the design of JupyterHub with respect to web security
the semi-trusted user
the available mitigations to protect untrusted users from each other
the value of periodic security audits
This overview also helps you obtain a deeper understanding of how JupyterHub works.
Semi-trusted and untrusted users#
JupyterHub is designed to be a simple multi-user server for modestly sized groups of semi-trusted users. While the design reflects serving semi-trusted users, JupyterHub can also be suitable for serving untrusted users, but is not suitable for untrusted users in its default configuration.
As a result, using JupyterHub with untrusted users means more work by the administrator, since much care is required to secure a Hub, with extra caution on protecting users from each other.
One aspect of JupyterHub’s design simplicity for semi-trusted users is that the Hub and single-user servers are placed in a single domain, behind a proxy. If the Hub is serving untrusted users, many of the web’s cross-site protections are not applied between single-user servers and the Hub, or between single-user servers and each other, since browsers see the whole thing (proxy, Hub, and single user servers) as a single website (i.e. single domain).
Protect users from each other#
To protect users from each other, a user must never be able to write arbitrary HTML and serve it to another user on the Hub’s domain. This is prevented by JupyterHub’s authentication setup because only the owner of a given single-user notebook server is allowed to view user-authored pages served by the given single-user notebook server.
To protect all users from each other, JupyterHub administrators must ensure that:
A user does not have permission to modify their single-user notebook server, including:
the installation of new packages in the Python environment that runs their single-user server;
the creation of new files in any PATH directory that precedes the directory containing jupyterhub-singleuser (if the PATH is used to resolve the single-user executable instead of using an absolute path);
the modification of environment variables (e.g. PATH, PYTHONPATH) for their single-user server;
the modification of the configuration of the notebook server (the ~/.jupyter or JUPYTER_CONFIG_DIR directory);
unrestricted selection of the base environment (e.g. the image used in container-based Spawners).
If any additional services are run on the same domain as the Hub, the services must never display user-authored HTML that is neither sanitized nor sandboxed to any user that lacks authentication as the author of a file.
Mitigate security issues#
JupyterHub provides several configuration options to mitigate security issues, including:
Enable user subdomains#
JupyterHub provides the ability to run single-user servers on their own domains. This means the cross-origin protections between servers have the desired effect, and user servers and the Hub are protected from each other.
Subdomains are the only way to reliably isolate user servers from each other.
To enable subdomains, set:
c.JupyterHub.subdomain_host = "https://jupyter.example.org"
When subdomains are enabled, each user's single-user server will be at e.g. https://username.jupyter.example.org.
This also requires all user subdomains to point to the same address,
which is most easily accomplished with wildcard DNS, where a single A record points to your server and a wildcard CNAME record points to your A record:
A jupyter.example.org 192.168.1.123
CNAME *.jupyter.example.org jupyter.example.org
Since this spreads the service across multiple domains, you will likely need wildcard SSL as well, matching *.jupyter.example.org.
Unfortunately, for many institutional domains, wildcard DNS and SSL may not be available.
We also strongly encourage serving JupyterHub and user content on a domain that is not a subdomain of any sensitive content. For reasoning, see GitHub’s discussion of moving user content to github.io from *.github.com.
If you do plan to serve untrusted users, enabling subdomains is highly encouraged, as it resolves many security issues that are difficult or impossible to avoid when JupyterHub is on a single domain.
Important
JupyterHub makes no guarantees about protecting users from each other unless subdomains are enabled.
If you want to protect users from each other, you must enable per-user domains.
Disable user config#
If subdomains are unavailable or undesirable, JupyterHub provides a configuration option, Spawner.disable_user_config = True, which can be set to prevent user-owned configuration files from being loaded. After implementing this option, PATH and package installation are the other things the admin must enforce.
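As a minimal sketch in the Hub's configuration file:

```python
# jupyterhub_config.py
# Prevent single-user servers from loading user-owned config files
c.Spawner.disable_user_config = True
```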
Prevent spawners from evaluating shell configuration files#
For most Spawners, PATH is not something users can influence, but it's important that the Spawner does not evaluate shell configuration files prior to launching the server.
Isolate packages in a read-only environment#
The user must not have permission to install packages into the environment where the single-user server runs. On a shared system, package isolation is most easily handled by running the single-user server in a root-owned virtualenv with system-site-packages disabled. The same principle extends to the images used by container-based deployments: if users can select the images in which their servers run, they can disable all security for their own servers.
It is important to note that the control over the environment is only required for the single-user server, and not the environment(s) in which the users’ kernel(s) may run. Installing additional packages in the kernel environment does not pose additional risk to the web application’s security.
Encrypt internal connections with SSL/TLS#
By default, all communications within JupyterHub (between the proxy, Hub, and single-user notebook servers) are performed unencrypted. Setting the internal_ssl flag in jupyterhub_config.py secures these routes. Turning this feature on does require that the enabled Spawner can use the certificates generated by the Hub (the default LocalProcessSpawner can, for instance).
It is also important to note that this encryption does not yet cover the zmq tcp sockets between the notebook client and kernel. While users cannot submit arbitrary commands to another user's kernel, they can bind to these sockets and listen. When serving untrusted users, this eavesdropping can be mitigated by setting KernelManager.transport to ipc, which applies standard Unix permissions to the communication sockets, thereby restricting communication to the socket owner. The internal_ssl option will eventually extend to securing the tcp sockets as well.
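A sketch of both settings; note that the kernel transport belongs in the single-user server's Jupyter configuration, not the Hub's:

```python
# jupyterhub_config.py
# Encrypt Hub <-> proxy <-> single-user server traffic with internal SSL
c.JupyterHub.internal_ssl = True
```

```python
# jupyter_server_config.py (in the single-user environment)
# Use Unix-permission-protected IPC sockets instead of TCP for kernels
c.KernelManager.transport = "ipc"
```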
Mitigating same-origin deployments#
While per-user domains are required for robust protection of users from each other,
you can mitigate many (but not all) cross-user issues.
First, it is critical that users cannot modify their server environments, as described above.
Second, it is important that users do not have the access:servers permission for any server other than their own.
If users can access each others’ servers, additional security measures must be enabled, some of which come with distinct user-experience costs.
Without the Same-Origin Policy (SOP) protecting user servers from each other, each user server is considered a trusted origin for requests to every other user server (and the Hub itself). Servers cannot meaningfully distinguish requests originating from other user servers, because SOP implies a great deal of trust, removing many restrictions applied to cross-origin requests.
That means pages served from each user server can:
arbitrarily modify the path in the Referer
make fully authorized requests with cookies
access full page contents served from the hub or other servers via popups
JupyterHub uses distinct XSRF tokens stored in cookies on each server path to attempt to limit requests across user servers. This has limitations: not all requests are protected by these XSRF tokens, and unless additional measures are taken, the XSRF tokens from other user prefixes may be retrieved.
For example, the Content-Security-Policy header must prohibit popups and iframes from the same origin. The following Content-Security-Policy rules are insecure and readily enable users to access each others' servers:
frame-ancestors 'self'
frame-ancestors '*'
sandbox allow-popups
Ideally, pages should use the strictest Content-Security-Policy: sandbox available, but this is not feasible in general for JupyterLab pages, which need at least sandbox allow-same-origin allow-scripts to work.
The default Content-Security-Policy for single-user servers is frame-ancestors 'none', which prohibits iframe embedding, but not pop-ups.
A more secure Content-Security-Policy that has some costs to user experience is:
frame-ancestors 'none'; sandbox allow-same-origin allow-scripts
allow-popups is not disabled by default because disabling it breaks legitimate functionality, like "Open this in a new tab", and the "JupyterHub Control Panel" menu item.
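One way to apply such a header is through the single-user server's Tornado settings (a sketch; the exact policy string should be adapted to your deployment's needs):

```python
# jupyter_server_config.py (single-user environment) -- sketch
c.ServerApp.tornado_settings = {
    "headers": {
        "Content-Security-Policy": (
            "frame-ancestors 'none'; sandbox allow-same-origin allow-scripts"
        )
    }
}
```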
To reiterate, the right way to avoid these issues is to enable per-user domains, where none of these concerns come up.
Note: even this level of protection requires administrators maintaining full control over the user server environment. If users can modify their server environment, these methods are ineffective, as users can readily disable them.
Forced-login#
Jupyter servers can share links with ?token=... in the URL.
JupyterHub prior to 5.0 will accept this request and persist the token for future requests.
This is useful for enabling admins to create ‘fully authenticated’ links bypassing login.
However, it also means users can share their own links that will log other users into their own servers,
enabling them to serve each other notebooks and other arbitrary HTML, depending on server configuration.
Added in version 4.1: Setting the environment variable JUPYTERHUB_ALLOW_TOKEN_IN_URL=0 in the single-user environment can opt out of accepting token auth in URL parameters.
Added in version 5.0: Accepting tokens in URLs is disabled by default, and the JUPYTERHUB_ALLOW_TOKEN_IN_URL=1 environment variable must be set to allow token auth in URL parameters.
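On JupyterHub 4.1, a sketch of opting out for all users via the Spawner's environment (on 5.0+ this is already the default behavior):

```python
# jupyterhub_config.py -- sketch for JupyterHub 4.1
# Inject the opt-out variable into every single-user server's environment
c.Spawner.environment = {"JUPYTERHUB_ALLOW_TOKEN_IN_URL": "0"}
```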
Security audits#
We recommend that you do periodic reviews of your deployment’s security. It’s good practice to keep JupyterHub, configurable-http-proxy, and nodejs versions up to date.
A handy website for testing your deployment is Qualys' SSL analyzer tool.
Vulnerability reporting#
If you believe you have found a security vulnerability in JupyterHub, or any Jupyter project, please report it to security@ipython.org. If you prefer to encrypt your security reports, you can use this PGP public key.
JupyterHub and OAuth#
JupyterHub uses OAuth 2 as an internal mechanism for authenticating users. As such, JupyterHub itself always functions as an OAuth provider. You can find out more about what that means below.
Additionally, JupyterHub is often deployed with OAuthenticator, where an external identity provider, such as GitHub or KeyCloak, is used to authenticate users. When this is the case, there are two nested OAuth flows: an internal OAuth flow where JupyterHub is the provider, and an external OAuth flow, where JupyterHub is the client.
This means that when you are using JupyterHub, there is always at least one and often two layers of OAuth involved in a user logging in and accessing their server.
The following points are noteworthy:
Single-user servers never need to communicate with or be aware of the upstream provider configured in your Authenticator. As far as the servers are concerned, only JupyterHub is an OAuth provider, and how users authenticate with the Hub itself is irrelevant.
When interacting with a single-user server, there are almost always two tokens: first, a token issued to the server itself to communicate with the Hub API, and second, a per-user token in the browser to represent the completed login process and authorized permissions. More on this later.
Key OAuth terms#
Here are some key definitions to keep in mind when we are talking about OAuth. You can also read more in detail here.
provider: The entity responsible for managing identity and authorization; always a web server. JupyterHub is always an OAuth provider for JupyterHub’s components. When OAuthenticator is used, an external service, such as GitHub or KeyCloak, is also an OAuth provider.
client: An entity that requests OAuth tokens on a user’s behalf; generally a web server of some kind. OAuth clients are services that delegate authentication and/or authorization to an OAuth provider. JupyterHub services or single-user servers are OAuth clients of the JupyterHub provider. When OAuthenticator is used, JupyterHub is itself also an OAuth client for the external OAuth provider, e.g. GitHub.
browser: A user’s web browser, which makes requests and stores things like cookies.
token: The secret value used to represent a user’s authorization. This is the final product of the OAuth process.
code: A short-lived temporary secret that the client exchanges for a token at the conclusion of OAuth, in what’s generally called the “OAuth callback handler.”
One oauth flow#
OAuth flow is what we call the sequence of HTTP requests involved in authenticating a user and issuing a token, ultimately used for authorizing access to a service or single-user server.
A single OAuth flow typically goes like this:
OAuth request and redirect#
A browser makes an HTTP request to an OAuth client.
There are no credentials, so the client redirects the browser to an “authorize” page on the OAuth provider with some extra information:
the OAuth client ID of the client itself.
the redirect URI to be redirected back to after completion.
the scopes requested, which the user should be presented with to confirm. This is the “X would like to be able to Y on your behalf. Allow this?” page you see on all the “Login with …” pages around the Internet.
During this authorize step, the browser must be authenticated with the provider. This is often already stored in a cookie, but if not the provider webapp must begin its own authentication process before serving the authorization page. This may even begin another OAuth flow!
After the user tells the provider that they want to proceed with the authorization, the provider records this authorization in a short-lived record called an OAuth code.
Finally, the oauth provider redirects the browser back to the oauth client’s “redirect URI” (or “OAuth callback URI”), with the OAuth code in a URL parameter.
That marks the end of the requests made between the browser and the provider.
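The authorize redirect described above can be sketched by constructing the URL a client sends the browser to; every value here is an illustrative assumption, not a real client registration:

```python
from urllib.parse import urlencode

# Illustrative values; real client IDs and redirect URIs come from
# the client's registration with the provider.
params = {
    "client_id": "service-xyz",
    "redirect_uri": "https://hub.example.org/services/xyz/oauth_callback",
    "response_type": "code",
    "state": "opaque-state-123",
}
authorize_url = "https://hub.example.org/hub/api/oauth2/authorize?" + urlencode(params)
print(authorize_url)
```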
State after redirect#
At this point:
The browser is authenticated with the provider.
The user’s authorized permissions are recorded in an OAuth code.
The provider knows that the permissions requested by the OAuth client have been granted, but the client doesn’t know this yet.
All the requests so far have been made directly by the browser. No requests have originated from the client or provider.
OAuth Client Handles Callback Request#
At this stage, we get to finish the OAuth process. Let’s dig into what the OAuth client does when it handles the OAuth callback request.
The OAuth client receives the code and makes an API request to the provider to exchange the code for a real token. This is the first direct request between the OAuth client and the provider.
Once the token is retrieved, the client usually makes a second API request to the provider to retrieve information about the owner of the token (the user). This is the step where behavior diverges for different OAuth providers. Up to this point, all OAuth providers are the same, following the OAuth specification. However, OAuth does not define a standard for issuing tokens in exchange for information about their owner or permissions (OpenID Connect does that), so this step may be different for each OAuth provider.
Finally, the OAuth client stores its own record that the user is authorized in a cookie. This could be the token itself, or any other appropriate representation of successful authentication.
Now that credentials have been established, the browser can be redirected to the original URL where it started, to try the request again. If the client wasn’t able to keep track of the original URL all this time (not always easy!), you might end up back at a default landing page instead of where you started the login process. This is frustrating!
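The code-for-token exchange in the callback handler can be sketched as building the POST body sent to the provider's token endpoint. This is a generic OAuth 2 sketch with hypothetical names, not JupyterHub's actual implementation:

```python
from urllib.parse import urlencode

def build_token_request_body(client_id, client_secret, code, redirect_uri):
    # POST body for the provider's token endpoint
    # (sent as application/x-www-form-urlencoded)
    return urlencode({
        "grant_type": "authorization_code",
        "client_id": client_id,
        "client_secret": client_secret,
        "code": code,
        "redirect_uri": redirect_uri,
    })

body = build_token_request_body(
    "service-xyz", "super-secret", "abc123",
    "https://hub.example.org/services/xyz/oauth_callback",
)
print(body)
```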
😮💨 phew.
So that’s one OAuth process.
Full sequence of OAuth in JupyterHub#
Let’s go through the above OAuth process in JupyterHub, with specific examples of each HTTP request and what information it contains. For bonus points, we are using the double-OAuth example of JupyterHub configured with GitHubOAuthenticator.
To disambiguate, we will call the OAuth process where JupyterHub is the provider “internal OAuth,” and the one with JupyterHub as a client “external OAuth.”
Our starting point:
a user's single-user server is running. Let's call the user danez.
JupyterHub is running with GitHub as an OAuth provider (this means two full instances of OAuth).
Danez has a fresh browser session with no cookies yet.
First request:
browser->single-user server running JupyterLab or Jupyter Classic
GET /user/danez/notebooks/mynotebook.ipynb
no credentials, so single-user server (as an OAuth client) starts internal OAuth process with JupyterHub (the provider)
response: 302 redirect -> /hub/api/oauth2/authorize with:
client-id=jupyterhub-user-danez
redirect-uri=/user/danez/oauth_callback (we'll come back later!)
Second request, following redirect:
browser->JupyterHub
GET /hub/api/oauth2/authorize
no credentials, so JupyterHub starts external OAuth process with GitHub
response: 302 redirect -> https://github.com/login/oauth/authorize with:
client-id=jupyterhub-client-uuid
redirect-uri=/hub/oauth_callback (we'll come back later!)
(pause) This is where JupyterHub configuration comes into play. Recall, in this case JupyterHub is using:
c.JupyterHub.authenticator_class = 'github'
That means authenticating a request to the Hub itself starts a second, external OAuth process with GitHub as a provider. This external OAuth process is optional, though. If you were using the default username+password PAMAuthenticator, this redirect would have been to /hub/login instead, to present the user with a login form.
Third request, following redirect:
browser->GitHub
GET https://github.com/login/oauth/authorize
Here, GitHub prompts for login and asks for confirmation of authorization
(more redirects if you aren’t logged in to GitHub yet, but ultimately back to this /authorize
URL).
After successful authorization
(either by looking up a pre-existing authorization,
or recording it via form submission)
GitHub issues an OAuth code and redirects to /hub/oauth_callback?code=github-code
Next request:
browser->JupyterHub
GET /hub/oauth_callback?code=github-code
Inside the callback handler, JupyterHub makes two API requests:
The first:
JupyterHub->GitHub
POST https://github.com/login/oauth/access_token
request made with OAuth code from URL parameter
response includes an access token
The second:
JupyterHub->GitHub
GET https://api.github.com/user
request made with access token in the Authorization header
response is the user model, including username, email, etc.
Now the external OAuth callback request completes with:
set a cookie on the /hub/ path, recording JupyterHub authentication so we don’t need to do external OAuth with GitHub again for a while
redirect -> /hub/api/oauth2/authorize
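The two API requests above can be sketched with Python’s standard library. This is a minimal illustration, not JupyterHub’s implementation; the endpoint URLs are GitHub’s documented OAuth endpoints, and the client credentials are placeholders:

```python
import urllib.parse
import urllib.request


def build_token_request(code, client_id, client_secret):
    """The first request: exchange the OAuth code for an access token."""
    data = urllib.parse.urlencode(
        {"client_id": client_id, "client_secret": client_secret, "code": code}
    ).encode()
    return urllib.request.Request(
        "https://github.com/login/oauth/access_token",
        data=data,
        headers={"Accept": "application/json"},  # ask GitHub for a JSON response
        method="POST",
    )


def build_user_request(access_token):
    """The second request: fetch the user model with the access token."""
    return urllib.request.Request(
        "https://api.github.com/user",
        headers={"Authorization": f"Bearer {access_token}"},
    )
```

Sending these with urllib.request.urlopen would return the access token and the user model (username, email, etc.), respectively.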
🎉 At this point, we have completed our first OAuth flow! 🎉
Now, we get our first repeated request:
browser->JupyterHub
GET /hub/api/oauth2/authorize
this time with credentials, so JupyterHub either
serves the internal authorization confirmation page, or
automatically accepts authorization (shortcut taken when a user is visiting their own server)
redirect ->
/user/danez/oauth_callback?code=jupyterhub-code
Here, we start the same OAuth callback process as before, but at Danez’s single-user server for the internal OAuth.
browser->single-user server
GET /user/danez/oauth_callback
(in handler)
Inside the internal OAuth callback handler, Danez’s server makes two API requests to JupyterHub:
The first:
single-user server->JupyterHub
POST /hub/api/oauth2/token
request made with OAuth code from URL parameter
response includes an API token
The second:
single-user server->JupyterHub
GET /hub/api/user
request made with token in the Authorization header
response is the user model, including username, groups, etc.
Finally, completing GET /user/danez/oauth_callback:
response sets cookie, storing encrypted access token
finally redirects back to the original
/user/danez/notebooks/mynotebook.ipynb
Final request:
browser -> single-user server
GET /user/danez/notebooks/mynotebook.ipynb
encrypted jupyterhub token in cookie
To authenticate this request, the single token stored in the encrypted cookie is passed to the Hub for verification:
single-user server -> Hub
GET /hub/api/user
browser’s token in Authorization header
response: user model with name, groups, etc.
If the user model matches who should be allowed (e.g. Danez), then the request is allowed. See Scopes in JupyterHub for how JupyterHub uses scopes to determine authorized access to servers and services.
the end
Token caches and expiry#
Because tokens represent information from an external source, they can become ‘stale,’ i.e. the information they represent may no longer be accurate. For example: a user’s GitHub account may no longer be authorized to use JupyterHub; that change should ultimately propagate to revoking access and forcing a new login.
To handle this, OAuth tokens and the various places they are stored can expire, which should have the same effect as no credentials, and trigger the authorization process again.
In JupyterHub’s internal OAuth, we have these layers of information that can go stale:
The OAuth client has a cache of Hub responses for tokens, so it doesn’t need to make API requests to the Hub for every request it receives. This cache has an expiry of five minutes by default, and is governed by the configuration HubAuth.cache_max_age in the single-user server.
The internal OAuth token is stored in a cookie, which has its own expiry (default: 14 days), governed by JupyterHub.cookie_max_age_days.
The internal OAuth token itself can also expire, which is by default the same as the cookie expiry, since it makes sense for the token itself and the place it is stored to expire at the same time. This is governed by JupyterHub.cookie_max_age_days first, or can be overridden by JupyterHub.oauth_token_expires_in.
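These three expiry layers map to configuration like the following sketch (the values are illustrative, not recommendations; note that HubAuth.cache_max_age is configuration for the single-user server, not the Hub):

```python
# in jupyterhub_config.py (Hub side); illustrative values
c.JupyterHub.cookie_max_age_days = 14       # cookie expiry (the default)
c.JupyterHub.oauth_token_expires_in = 3600  # optional: expire tokens after 1 hour

# in the single-user server's configuration (not jupyterhub_config.py):
# c.HubAuth.cache_max_age = 300             # token-response cache (default: 5 minutes)
```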
That’s all for internal auth storage, but the information from the external authentication provider (could be PAM or GitHub OAuth, etc.) can also expire. Authenticator configuration governs when JupyterHub needs to ask again, triggering the external login process anew before letting a user proceed.
The jupyterhub-hub-login cookie stores that a browser is authenticated with the Hub. This expires according to JupyterHub.cookie_max_age_days configuration, with a default of 14 days. The jupyterhub-hub-login cookie is encrypted with JupyterHub.cookie_secret configuration.
Authenticator.refresh_user() is a method to refresh a user’s auth info. By default, it does nothing, but it can return an updated user model if a user’s information has changed, or force a full login process again if needed.
Authenticator.auth_refresh_age configuration governs how often refresh_user() will be called to check if a user must log in again (default: 300 seconds).
Authenticator.refresh_pre_spawn configuration governs whether refresh_user() should be called prior to spawning a server, to force fresh auth info when a server is launched (default: False). This can be useful when Authenticators pass access tokens to spawner environments, to ensure they aren’t getting a stale token that’s about to expire.
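The refresh_user() hook above can be sketched as follows. This uses a stand-in base class instead of the real jupyterhub.auth.Authenticator, and the expires_at auth-state field is an assumption for illustration:

```python
import time


class StubAuthenticator:
    """Stand-in for jupyterhub.auth.Authenticator (illustrative only)."""

    auth_refresh_age = 300  # seconds between refresh_user() checks

    async def refresh_user(self, user, handler=None):
        return True  # default: auth info is still valid, nothing to do


class ExpiringTokenAuthenticator(StubAuthenticator):
    """Sketch: force a fresh login when a stored upstream token has expired."""

    async def refresh_user(self, user, handler=None):
        auth_state = await user.get_auth_state()
        if not auth_state:
            return True  # nothing stored to check
        if auth_state.get("expires_at", 0) < time.time():
            return False  # stale: force the full login process again
        # still valid: return an (optionally updated) user model
        return {"name": user.name, "auth_state": auth_state}
```

Returning True means the auth info is still fresh, a dict updates the user model, and False forces the full login process again.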
So what happens when these things expire or get stale?
If the HubAuth token response cache expires, when a request is made with a token, the Hub is asked for the latest information about the token. This usually has no visible effect, since it is just refreshing a cache. If it turns out that the token itself has expired or been revoked, the request will be denied.
If the token has expired, but is still in the cookie: when the token response cache expires, the next time the server asks the Hub about the token, no user will be identified and the internal OAuth process begins again.
If the token cookie expires, the next browser request will be made with no credentials, and the internal OAuth process will begin again. This will usually take the form of a transparent redirect that browsers won’t notice. However, if this occurs on an API request in a long-lived page visit such as a JupyterLab session, the API request may fail and require a page refresh to get renewed credentials.
If the JupyterHub cookie expires, the next time the browser makes a request to the Hub, the Hub’s authorization process must begin again (e.g. login with GitHub). Hub cookie expiry on its own does not mean that a user can no longer access their single-user server!
If credentials from the upstream provider (e.g. GitHub) become stale or outdated, these will not be refreshed until/unless
refresh_user
is called andrefresh_user()
on the given Authenticator is implemented to perform such a check. At this point, few Authenticators implementrefresh_user
to support this feature. If your Authenticator does not or cannot implementrefresh_user
, the only way to force a check is to reset theJupyterHub.cookie_secret
encryption key, which invalidates thejupyterhub-hub-login
cookie for all users.
Logging out#
Logging out of JupyterHub means clearing and revoking many of these credentials:
The jupyterhub-hub-login cookie is revoked, meaning the next request to the Hub itself will require a new login.
The token stored in the jupyterhub-user-username cookie for the single-user server will be revoked, based on its association with jupyterhub-session-id, but the cookie itself cannot be cleared at this point.
The shared jupyterhub-session-id is cleared, which ensures that the HubAuth token response cache will not be used, and the next request with the expired token will ask the Hub, which will inform the single-user server that the token has expired.
Extra bits#
A tale of two tokens#
TODO: discuss API token issued to server at startup ($JUPYTERHUB_API_TOKEN) and OAuth-issued token in the cookie, and some details of how JupyterLab currently deals with that. They are different, and JupyterLab should be making requests using the token from the cookie, not the token from the server, but that is not currently the case.
Redirect loops#
In general, an authenticated web endpoint has this behavior, based on the authentication/authorization state of the browser:
If authorized, allow the request to happen
If authenticated (I know who you are) but not authorized (you are not allowed), fail with a 403 permission denied error
If not authenticated, start a redirect process to establish authorization, which should end in a redirect back to the original URL to try again. This is why problems in authentication result in redirect loops! If the second request fails to detect the authentication that should have been established during the redirect, it will start the authentication redirect process over again, and keep redirecting in a loop until the browser balks.
The JupyterHub single-user server#
When a user logs into JupyterHub, they get a ‘server’, which we usually call the single-user server, because it’s a server that’s meant for a single JupyterHub user. Each JupyterHub user gets a different one (or more than one!).
A single-user server is a process running somewhere that is:
accessible over http[s],
authenticated via JupyterHub using OAuth 2.0,
started by a Spawner, and
‘owned’ by a single JupyterHub user
The single-user server command#
The Spawner’s default single-user server startup command, jupyterhub-singleuser
, launches jupyter-server
, the same program used when you run jupyter lab
on your laptop.
(It can also launch the legacy jupyter-notebook server.)
That’s why JupyterHub looks familiar to folks who are already using Jupyter at home or elsewhere.
It’s the same!
jupyterhub-singleuser
customizes that program to change (approximately) one thing: authenticate requests with JupyterHub.
Single-user server authentication#
Implementation-wise, JupyterHub single-user servers are a special-case of Services
and as such use the same (OAuth) authentication mechanism (more on OAuth in JupyterHub at JupyterHub and OAuth).
This is primarily implemented in the HubOAuth
class.
This code resides in jupyterhub.singleuser
subpackage of JupyterHub.
The main task of this code is to:
resolve a JupyterHub token to a JupyterHub user (authenticate)
check permissions (access:servers) for the token to make sure the request should be allowed (authorize)
if not authorized, begin the OAuth process with a redirect to the Hub
after login, store OAuth tokens in a cookie only used by this single-user server
implement logout to clear the cookie
Most of this is implemented in the HubOAuth
class. jupyterhub.singleuser
is responsible for adapting the base Jupyter Server to use HubOAuth for these tasks.
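The authorize step can be illustrated with a simplified check against the user model the Hub returns. This is a hypothetical helper, not JupyterHub’s actual scope resolution, which handles many more scope and filter forms:

```python
def check_access(user_model, username, servername=""):
    """Simplified check: do the model's scopes grant access to a server?

    Real JupyterHub scope resolution handles many more scope/filter forms.
    """
    scopes = set(user_model.get("scopes", []))
    server = f"{username}/{servername}".rstrip("/")
    allowed = {
        "access:servers",                        # all servers
        f"access:servers!user={username}",       # all of this user's servers
        f"access:servers!server={server}",       # this specific server
    }
    return bool(scopes & allowed)
```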
JupyterHub authentication extension#
By default, jupyter-server
uses its own cookie to authenticate.
If that cookie is not present, the server redirects you to a login page and asks you to enter a password or token.
Jupyter Server 2.0 introduces two new APIs for customizing authentication: the IdentityProvider and the Authorizer. More information can be found in the Jupyter Server documentation.
JupyterHub implements these APIs in jupyterhub.singleuser.extension
.
The IdentityProvider is responsible for authenticating requests. In JupyterHub, that means extracting OAuth tokens from the request and resolving them to a JupyterHub user.
The Authorizer is a separate API for authorizing actions on particular resources.
Because the JupyterHub IdentityProvider only allows authenticating users who already have the necessary access:servers
permission to access the server, the default Authorizer only contains a redundant check for this same permission, and ignores the resource inputs.
However, specifying a custom Authorizer allows for granular permissions, such as read-only access to subsets of a shared server.
JupyterHub authentication via subclass#
Prior to Jupyter Server 2 (i.e. Jupyter Server 1.x or the legacy jupyter-notebook
server), JupyterHub authentication is applied via subclass.
Originally a subclass of NotebookApp
,
this approach works with both jupyter-server
and jupyter-notebook
.
Instead of using the extension mechanisms above,
the server application is subclassed. This worked well in the jupyter-notebook
days,
but doesn’t fit well with Jupyter Server’s extension-based architecture.
Selecting jupyterhub-singleuser implementation#
Using the JupyterHub singleuser-server extension is the default behavior with JupyterHub 4 and Jupyter Server 2; otherwise, the subclass approach is taken.
You can opt-out of the extension by setting the environment variable JUPYTERHUB_SINGLEUSER_EXTENSION=0
:
c.Spawner.environment.update(
    {
        "JUPYTERHUB_SINGLEUSER_EXTENSION": "0",
    }
)
The subclass approach will also be taken if you’ve opted to use the classic notebook server with:
JUPYTERHUB_SINGLEUSER_APP=notebook
which was introduced in JupyterHub 2.
Other customizations#
jupyterhub-singleuser
makes other small customizations to how the single-user server behaves:
logs activity on the single-user server, used in idle-culling
disables some features that don’t make sense in JupyterHub (trash, retrying ports)
loads options such as URLs and SSL configuration from the environment
customizes logging for consistency with JupyterHub logs
Running a single-user server that’s not jupyterhub-singleuser
#
By default, jupyterhub-singleuser
is the same jupyter-server
used by JupyterLab, Jupyter notebook (>= 7), etc.
But technically, all JupyterHub cares about is that it is:
an http server at the prescribed URL, accessible from the Hub and proxy, and
authenticated via OAuth with the Hub (it doesn’t even have to do this, if you want to do your own authentication, as is done in BinderHub)
which means that you can customize JupyterHub to launch any web application that meets these criteria, by following the specifications in Services.
Most of the time, though, it’s easier to use jupyter-server-proxy if you want to launch additional web applications in JupyterHub.
JupyterHub RBAC#
Role Based Access Control (RBAC) in JupyterHub serves to provide fine-grained control of access to JupyterHub’s API resources.
RBAC is new in JupyterHub 2.0.
Motivation#
The JupyterHub API requires authorization to access its endpoints. This ensures that an arbitrary user, or even an unauthenticated third party, is not allowed to perform such actions. For instance, the behaviour prior to the adoption of RBAC was that creating or deleting users required admin rights.
The prior system was functional, but lacked flexibility. If your Hub serves a number of users in different groups, you might want to delegate permissions to other users or automate certain processes. Prior to RBAC, appointing a ‘group-only admin’ or a bot that culls idle servers required granting full admin rights to all actions. This poses a risk of the user or service intentionally or unintentionally accessing and modifying any data within the Hub, and violates the principle of least privilege.
To remedy situations like this, JupyterHub is transitioning to an RBAC system. By equipping users, groups and services with roles that supply them with a collection of permissions (scopes), administrators are able to fine-tune which parties are granted access to which resources.
Definitions#
Scopes are specific permissions used to evaluate API requests. For example: the API endpoint users/servers
, which enables starting or stopping user servers, is guarded by the scope servers
.
Scopes are not directly assigned to requesters. Rather, when a client performs an API call, their access will be evaluated based on their assigned roles.
Roles are collections of scopes that specify the level of what a client is allowed to do. For example, a group administrator may be granted permission to control the servers of group members, but not to create, modify or delete group members themselves. Within the RBAC framework, this is achieved by assigning a role to the administrator that covers exactly those privileges.
Technical Overview#
Roles#
JupyterHub provides four roles that are available by default:
Default roles
the user role provides a default user scope, self, that grants access to the user’s own resources.
the admin role contains all available scopes and grants full rights to all actions. This role cannot be edited.
the token role provides a default token scope, inherit, that resolves to the same permissions as the owner of the token has.
the server role allows for posting activity of “itself” only.
These roles cannot be deleted.
We call these ‘default’ roles because they are available by default and have a default collection of scopes. However, you can define the scopes associated with each role (excluding the admin role) to suit your needs, as seen below.
The user
, admin
, and token
roles, by default, all preserve the permissions prior to Role-based Access Control (RBAC).
Only the server
role is changed from pre-2.0, to reduce its permissions to activity-only
instead of the default of a full access token.
Additional custom roles can also be defined (see Defining Roles). Roles can be assigned to the following entities:
Users
Services
Groups
An entity can have zero, one, or multiple roles, and there are no restrictions on which roles can be assigned to which entity. Roles can be added to or removed from entities at any time.
Users
When a new user gets created, they are assigned their default role, user
. Additionally, if the user is created with admin privileges (via c.Authenticator.admin_users
in jupyterhub_config.py
or admin: true via the API), they will also be granted the admin role. If an existing user’s admin status changes via the API or jupyterhub_config.py, their default role will be updated accordingly (after the next startup, in the latter case).
Services
Services do not have a default role. Services without roles have no access to the guarded API endpoints, so most services will require assignment of a role in order to function.
Groups
A group does not require any role, and has no roles by default. If a user is a member of a group, they automatically inherit any of the group’s permissions (see Resolving roles and scopes for more details). This is useful for assigning a set of common permissions to several users.
Tokens
A token’s permissions are evaluated based on their owning entity. Since a token is always issued for a user or service, it can never have more permissions than its owner. If no specific scopes are requested for a new token, the token is assigned the scopes of the token
role.
Defining Roles#
Roles can be defined or modified in the configuration file as a list of dictionaries. An example:
# in jupyterhub_config.py
c.JupyterHub.load_roles = [
    {
        'name': 'server-rights',
        'description': 'Allows parties to start and stop user servers',
        'scopes': ['servers'],
        'users': ['alice', 'bob'],
        'services': ['idle-culler'],
        'groups': ['admin-group'],
    }
]
The role server-rights
now allows the starting and stopping of servers by any of the following:
the users alice and bob
the service idle-culler
any member of the admin-group.
Attention
Tokens cannot be assigned roles through role definition but may be assigned specific roles when requested via API (see Requesting API token with specific scopes).
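Requesting a token with specific scopes happens via the REST API’s POST /users/{name}/tokens endpoint (JupyterHub 3.0+ accepts a scopes list in the request body). A sketch using only the standard library, with placeholder URL and credentials:

```python
import json
import urllib.request


def build_scoped_token_request(hub_api_url, username, api_token, scopes):
    """Build a POST asking the Hub for a new token limited to `scopes`."""
    body = json.dumps({"scopes": scopes, "note": "scoped token"}).encode()
    return urllib.request.Request(
        f"{hub_api_url}/users/{username}/tokens",
        data=body,
        headers={
            "Authorization": f"token {api_token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```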
Another example:
# in jupyterhub_config.py
c.JupyterHub.load_roles = [
    {
        'description': 'Read-only user models',
        'name': 'reader',
        'scopes': ['read:users'],
        'services': ['external'],
        'users': ['maria', 'joe'],
    }
]
The role reader
allows users maria
and joe
and service external
to read (but not modify) any user’s model.
Requirements
In a role definition, the name
field is required, while all other fields are optional.
Role names must:
be 3-255 characters long
use ASCII lowercase letters, numbers, and the ‘unreserved’ URL punctuation -_.~
start with a letter
end with a letter or number.
The users, services, and groups fields only accept objects that already exist in the database or are defined previously in the file.
It is not possible to implicitly add a new user to the database by defining a new role.
If no scopes are defined for a new role, JupyterHub will raise a warning. Providing non-existing scopes will result in an error.
If a role with a certain name already exists in the database, its definition and scopes will be overwritten. This holds true for all roles except the admin role, which cannot be overwritten; an error will be raised if you try to do so. The permissions of all role bearers present in the definition will change accordingly.
Overriding Default Roles#
Role definitions can include those of the “default” roles listed above (admin excluded),
if the default scopes associated with those roles do not suit your deployment.
For example, to specify what permissions the $JUPYTERHUB_API_TOKEN issued to all single-user servers
has,
define the server
role.
To restore the JupyterHub 1.x behavior of servers being able to do anything their owners can do,
use the scope inherit
(for ‘inheriting’ the owner’s permissions):
c.JupyterHub.load_roles = [
    {
        'name': 'server',
        'scopes': ['inherit'],
    }
]
or, better yet, identify the specific scopes you want server environments to have access to.
If you don’t want to get too detailed,
one option is the self
scope,
which will have no effect on non-admin users,
but will restrict the token issued to admin user servers to only have access to their own resources,
instead of being able to take actions on behalf of all other users.
c.JupyterHub.load_roles = [
    {
        'name': 'server',
        'scopes': ['self'],
    }
]
Removing Roles#
Only the entities present in the role definition in jupyterhub_config.py remain the role bearers. If a user, service, or group is removed from the role definition, they will lose the role on the next startup.
Once a role is loaded, it remains in the database until it is removed from jupyterhub_config.py and the Hub is restarted. At that point, all previously defined role bearers will lose the role and its associated permissions. Default roles, even if previously redefined through the config file and then removed, will not be deleted from the database.
Scopes in JupyterHub#
A scope has a syntax-based design that reveals which resources it provides access to. Resources are objects with a type, associated data, relationships to other resources, and a set of methods that operate on them (see RESTful API documentation for more information).
<resource>
in the RBAC scope design refers to the resource name in the JupyterHub’s API endpoints in most cases. For instance, <resource>
equal to users
corresponds to JupyterHub’s API endpoints beginning with /users.
Scope conventions#
<resource>
The top-level <resource> scopes, such as users or groups, grant read, write, and list permissions to the resource itself as well as its sub-resources. For example, the scope users:activity is included in the scope users.
read:<resource>
Limits permissions to read-only operations on single resources.
list:<resource>
Read-only access to listing endpoints. Use read:<resource>:<subresource> to control which fields are returned.
admin:<resource>
Grants additional permissions such as create/delete on the corresponding resource, in addition to read and write permissions.
access:<resource>
Grants access permissions to the <resource> via API or browser.
<resource>:<subresource>
The vertically filtered scopes provide access to a subset of the information granted by the <resource> scope. E.g., the scope users:activity only provides permission to post user activity.
<resource>!<object>=<objectname>
Horizontal filtering is implemented by the !<object>=<objectname> scope structure. A resource (or sub-resource) can be filtered based on user, server, group, or service name. For instance, <resource>!user=charlie limits access to only return resources of user charlie.
Only one filter per scope is allowed, but filters for the same scope have an additive effect; a larger filter can be used by supplying the scope multiple times with different filters.
By adding a scope to an existing role, all role bearers will gain the associated permissions.
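These conventions can be made concrete with a small parser. This is a hypothetical helper for illustration, not JupyterHub’s internal scope representation:

```python
def parse_scope(scope):
    """Split a scope like 'read:users:groups!user=charlie' into parts.

    Returns (prefix, resource_path, filter) where prefix is one of
    None, 'read', 'list', 'admin', 'access', and filter is an
    (object, name) tuple or None. Illustrative only.
    """
    filt = None
    if "!" in scope:
        scope, _, raw = scope.partition("!")
        obj, _, name = raw.partition("=")
        filt = (obj, name or None)  # bare '!user' is a self-referencing filter
    parts = scope.split(":")
    prefix = parts[0] if parts[0] in {"read", "list", "admin", "access"} else None
    resource = ":".join(parts[1:] if prefix else parts)
    return prefix, resource, filt
```

For example, parse_scope('read:users!user=charlie') yields ('read', 'users', ('user', 'charlie')).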
Metascopes#
Metascopes do not follow the general scope syntax. Instead, a metascope resolves to a set of scopes, which can refer to different resources, based on their owning entity. In JupyterHub, there are currently two metascopes:
the default user scope self, and
the default token scope inherit.
Default user scope#
Access to the user’s own resources and subresources is covered by metascope self
. This metascope includes the user’s model, activity, servers and tokens. For example, self
for a user named “gerard” includes:
users!user=gerard, where the users scope provides access to the full user model and activity. The filter restricts this access to the user’s own resources.
servers!user=gerard, which grants the user access to their own servers without being able to create/delete any.
tokens!user=gerard, which allows the user to access, request, and delete their own tokens.
access:servers!user=gerard, which allows the user to access their own servers via API or browser.
The self
scope is only valid for user entities. In other cases (e.g., for services) it resolves to an empty set of scopes.
Default token scope#
The token metascope inherit
causes the token to have the same permissions as the token’s owner. For example, if a token owner has roles containing the scopes read:groups
and read:users
, the inherit
scope resolves to the set of scopes {read:groups, read:users}
.
If the token owner has the default user role, the inherit scope resolves to self, which will subsequently be expanded to include all the user-specific scopes (or an empty set in the case of services).
If the token owner is a member of any group with roles, the group scopes will also be included in resolving the inherit
scope.
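A rough sketch of that resolution (a hypothetical helper; JupyterHub’s real resolution also expands roles, subscopes, and group membership):

```python
def resolve_token_scopes(requested, owner_scopes):
    """Resolve a token's requested scopes against its owner's scopes.

    'inherit' expands to everything the owner has; otherwise the token
    holds at most the intersection with the owner's scopes, since a
    token can never have more permissions than its owner.
    """
    owner = set(owner_scopes)
    if requested is None or "inherit" in requested:
        return owner
    return set(requested) & owner
```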
Horizontal filtering#
Horizontal filtering, also called resource filtering, is the concept of reducing the payload of an API call to cover only the subset of resources that the client’s scopes provide access to.
Requested resources are filtered based on the filter of the corresponding scope. For instance, if a service requests a user list (guarded with scope read:users) with a role that only contains the scopes read:users!user=hannah and read:users!user=ivan, the returned list of user models will be the intersection of all users and the collection {hannah, ivan}. In case this intersection is empty, the API call returns an HTTP 404 error, regardless of whether any users exist outside of the client’s scope filter collection.
In case a user resource is being accessed, any scopes with group filters will be expanded to filters for each user in those groups.
Self-referencing filters#
There are some ‘shortcut’ filters, which can be applied to all scopes, that filter based on the entities associated with the request.
The !user filter is a special horizontal filter that strictly refers to the “owner only” scopes, where the owner is a user entity. The filter resolves internally into !user=<ownerusername>, ensuring that only the owner’s resources may be accessed through the associated scopes.
For example, the server
role assigned by default to server tokens contains access:servers!user
and users:activity!user
scopes. This allows the token to access and post activity of only the servers owned by the token owner.
Added in version 3.0: !service
and !server
filters.
In addition to !user
, tokens may have filters !service
or !server
, which expand similarly to !service=servicename
and !server=servername
.
This only applies to tokens issued via the OAuth flow.
In these cases, the name is the issuing entity (a service or single-user server),
so that access can be restricted to the issuing service,
e.g. access:servers!server
would grant access only to the server that requested the token.
These filters can be applied to any scope.
Vertical filtering#
Vertical filtering, also called attribute filtering, is the concept of reducing the payload of an API call to cover only the attributes of the resources that the client’s scopes provide access to. This occurs when the client’s scopes are subscopes of the API endpoint that is called.
For instance, if a client requests a user list with the only scope being read:users:groups
, the returned list of user models will contain only a list of groups per user.
If the client has multiple subscopes, the call returns the union of the data the client has access to.
The payload of an API call can be filtered both horizontally and vertically simultaneously. For instance, performing an API call to the endpoint /users/
with the scope users:name!user=juliette
returns a payload of [{name: 'juliette'}]
(provided that this name is present in the database).
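Both kinds of filtering can be mimicked with a short function that reduces a list of user models, a deliberate simplification of what JupyterHub does server-side (hypothetical helper):

```python
def filter_user_models(models, allowed_names, allowed_fields):
    """Apply horizontal and vertical filtering to a list of user models.

    allowed_names: user names the client's scope filters permit (None = all);
    allowed_fields: attributes the client's subscopes permit.
    """
    filtered = []
    for model in models:
        # horizontal filter: drop resources outside the client's filters
        if allowed_names is not None and model["name"] not in allowed_names:
            continue
        # vertical filter: drop attributes outside the client's subscopes
        filtered.append({k: v for k, v in model.items() if k in allowed_fields})
    return filtered
```

With the scope users:name!user=juliette this reduces to filter_user_models(models, {'juliette'}, {'name'}), mirroring the [{name: 'juliette'}] payload in the example above.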
Available scopes#
The table below lists all available scopes and illustrates their hierarchy. Indented scopes indicate subscopes of the scope(s) above them.
There are four exceptions to the general scope conventions:
read:users:name is a subscope of both read:users and read:servers. The read:servers scope requires access to the user name (server owner) because named servers are distinguished internally in the form !server=username/servername.
read:users:activity is a subscope of both read:users and users:activity. Posting activity via users:activity, which is not included in the users scope, needs to check the last valid activity of the user.
read:roles:users is a subscope of both read:roles and admin:users. Admin privileges to the users resource include the information about user roles.
read:roles:groups is a subscope of both read:roles and admin:groups. Similar to read:roles:users above.
Table 1. Available scopes and their hierarchy
Scope | Grants permission to:
---|---
(no scope) | Identify the owner of the requesting entity.
self | The user’s own resources (metascope for users, resolves to (no_scope) for services)
inherit | Everything that the token-owning entity can access (metascope for tokens)
admin-ui | Access the admin page. Permission to take actions via the admin page granted separately.
admin:users | Read, write, create and delete users and their authentication state, not including their servers or tokens.
  admin:auth_state | Read a user’s authentication state.
  users | Read and write permissions to user models (excluding servers, tokens and authentication state).
    read:users | Read user models (excluding servers, tokens and authentication state).
      read:users:name | Read names of users.
      read:users:groups | Read users’ group membership.
      read:users:activity | Read time of last user activity.
  list:users | List users, including at least their names.
    read:users:name | Read names of users.
  users:activity | Update time of last user activity.
    read:users:activity | Read time of last user activity.
  read:roles:users | Read user role assignments.
  delete:users | Delete users.
read:roles | Read role assignments.
  read:roles:users | Read user role assignments.
  read:roles:services | Read service role assignments.
  read:roles:groups | Read group role assignments.
admin:servers | Read, start, stop, create and delete user servers and their state.
  admin:server_state | Read and write users’ server state.
  servers | Start and stop user servers.
    read:servers | Read users’ names and their server models (excluding the server state).
      read:users:name | Read names of users.
  delete:servers | Stop and delete users’ servers.
tokens | Read, write, create and delete user tokens.
  read:tokens | Read user tokens.
admin:groups | Read and write group information, create and delete groups.
  groups | Read and write group information, including adding/removing users to/from groups.
    read:groups | Read group models.
      read:groups:name | Read group names.
  list:groups | List groups, including at least their names.
    read:groups:name | Read group names.
  read:roles:groups | Read group role assignments.
  delete:groups | Delete groups.
admin:services | Create, read, update, delete services, not including services defined from config files.
  list:services | List services, including at least their names.
    read:services:name | Read service names.
  read:services | Read service models.
    read:services:name | Read service names.
  read:roles:services | Read service role assignments.
read:hub | Read detailed information about the Hub.
access:services | Access services via API or browser.
shares | Manage access to shared servers.
access:servers | Access user servers via API or browser.
read:shares | Read information about shared access to servers.
users:shares | Read and revoke a user’s access to shared servers.
  read:users:shares | Read servers shared with a user.
groups:shares | Read and revoke a group’s access to shared servers.
  read:groups:shares | Read servers shared with a group.
proxy | Read information about the proxy’s routing table, sync the Hub with the proxy and notify the Hub about a new proxy.
shutdown | Shutdown the hub.
read:metrics | Read prometheus metrics.
Added in version 3.0: The admin-ui scope is added to explicitly grant access to the admin page, rather than combining admin:users and admin:servers permissions.
This means a deployment can enable the admin page with only a subset of functionality enabled.
Note that access to the admin UI and the actions taken via it are granted separately.
For example, it generally doesn’t make sense to grant admin-ui without at least list:users for at least some subset of users.
For example:
c.JupyterHub.load_roles = [
    {
        "name": "instructor-data8",
        "scopes": [
            # access to the admin page
            "admin-ui",
            # list users in the class group
            "list:users!group=students-data8",
            # start/stop servers for users in the class
            "admin:servers!group=students-data8",
            # access servers for users in the class
            "access:servers!group=students-data8",
        ],
        "groups": ["instructors-data8"],
    }
]
will grant instructors in the data8 course permission to:
view the admin UI
see students in the class (but not all users)
start/stop/access servers for users in the class
but not permission to administer the users themselves (e.g. change their permissions, etc.)
Caution
Note that only horizontal filtering can be added to scopes to customize them.
The metascopes self and all, as well as the <resource>, <resource>:<subresource>, read:<resource>, admin:<resource>, and access:<resource> scopes, are predefined and cannot be changed otherwise.
Access scopes#
An access scope is used to govern access to a JupyterHub service or a user’s single-user server. This means making API requests, or visiting via a browser using OAuth. Without the appropriate access scope, a user or token should not be permitted to make requests of the service.
When you attempt to access a service or server authenticated with JupyterHub, it will begin the oauth flow for issuing a token that can be used to access the service.
If the user does not have the access scope for the relevant service or server, JupyterHub will not permit the oauth process to complete.
If oauth completes, the token will have at least the access scope for the service.
For minimal permissions, this is the only scope granted to tokens issued during oauth by default,
but can be expanded via Spawner.oauth_client_allowed_scopes
or a service’s oauth_client_allowed_scopes
configuration.
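For instance, a deployment could let tokens issued during oauth for a user’s own server carry a read-only activity scope in addition to the access scope. A minimal configuration sketch; the scope choices here are illustrative, not defaults:

```python
# in jupyterhub_config.py (sketch; adjust scopes to your deployment)
# Tokens issued during oauth for single-user servers will carry these
# scopes (still limited to what the token owner actually holds).
c.Spawner.oauth_client_allowed_scopes = [
    "access:servers!server",  # the access scope for the issuing server
    "read:users:activity!user",  # read-only access to the user's own activity
]
```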
If a given service or single-user server can be governed by a single boolean “yes, you can use this service” or “no, you can’t,” or limiting via other existing scopes, access scopes are enough to manage access to the service. But you can also further control granular access to servers or services with custom scopes, to limit access to particular APIs within the service, e.g. read-only access.
Example access scopes#
Some example access scopes for services:

- access:services: access to all services
- access:services!service=somename: access to the service named somename

and for user servers:

- access:servers: access to all user servers
- access:servers!user: access to all of a user’s own servers (never in resolved scopes, but may be used in configuration)
- access:servers!user=name: access to all of name’s servers
- access:servers!group=groupname: access to all servers owned by a user in the group groupname
- access:servers!server: access to only the issuing server (only relevant when applied to oauth tokens associated with a particular server, e.g. via the Spawner.oauth_client_allowed_scopes configuration)
- access:servers!server=username/: access to only username’s default server
Custom scopes#
Added in version 3.0.
JupyterHub 3.0 introduces support for custom scopes. Services that use JupyterHub for authentication and want to implement their own granular access may define additional custom scopes and assign them to users with JupyterHub roles.
Custom scope names must start with custom: and contain only lowercase ascii letters, numbers, hyphens, underscores, colons, and asterisks (-_:*).
The part after custom: must start with a letter or number.
Scopes may not end with a hyphen or colon.
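These naming rules can be expressed as a small check. The helper below is a hypothetical illustration of the rules as stated; it is not JupyterHub’s actual validator:

```python
import re

# Hypothetical helper illustrating the naming rules above;
# JupyterHub performs its own validation internally.
_CUSTOM_SCOPE = re.compile(r"custom:[a-z0-9][a-z0-9_\-:*]*\Z")


def looks_like_valid_custom_scope(name: str) -> bool:
    """Check a candidate custom scope name against the documented rules."""
    if not _CUSTOM_SCOPE.fullmatch(name):
        return False
    # scopes may not end with a hyphen or colon
    return not name.endswith(("-", ":"))
```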
The only strict requirement is that a custom scope definition must have a description.
It may also have subscopes if you are defining multiple scopes that have a natural hierarchy.
For example:
c.JupyterHub.custom_scopes = {
    "custom:myservice:read": {
        "description": "read-only access to myservice",
    },
    "custom:myservice:write": {
        "description": "write access to myservice",
        # write permission implies read permission
        "subscopes": [
            "custom:myservice:read",
        ],
    },
}

c.JupyterHub.load_roles = [
    # graders have read-only access to the service
    {
        "name": "service-user",
        "groups": ["graders"],
        "scopes": [
            "custom:myservice:read",
            "access:services!service=myservice",
        ],
    },
    # instructors have read and write access to the service
    {
        "name": "service-admin",
        "groups": ["instructors"],
        "scopes": [
            "custom:myservice:write",
            "access:services!service=myservice",
        ],
    },
]
In the above configuration, two scopes are defined:

- custom:myservice:read grants read-only access to the service
- custom:myservice:write grants write access to the service (write access implies read access via the subscope)

These custom scopes are assigned to two groups via roles:

- users in the group graders are granted read access to the service
- users in the group instructors are granted read and write access to the service
- both are granted access to the service via access:services!service=myservice

When the service completes OAuth, it will retrieve the user model from /hub/api/user.
This model includes a scopes field, which is the list of authorized scopes for the request and can be used to check permissions, e.g.:
import functools


def require_scope(scope):
    """decorator to require a scope to perform an action"""

    def wrapper(func):
        @functools.wraps(func)
        def wrapped_func(request):
            user = fetch_hub_api_user(request.token)
            if scope not in user["scopes"]:
                raise HTTP403(f"Requires scope {scope}")
            else:
                return func()

        return wrapped_func

    return wrapper


@require_scope("custom:myservice:read")
async def read_something(request):
    ...


@require_scope("custom:myservice:write")
async def write_something(request):
    ...
If you use HubOAuthenticated
, this check is performed automatically
against the .hub_scopes
attribute of each Handler
(the default is populated from $JUPYTERHUB_OAUTH_ACCESS_SCOPES
and usually access:services!service=myservice
).
Changed in version 3.0: The JUPYTERHUB_OAUTH_SCOPES environment variable is deprecated and renamed to JUPYTERHUB_OAUTH_ACCESS_SCOPES, to avoid ambiguity with JUPYTERHUB_OAUTH_CLIENT_ALLOWED_SCOPES
from tornado import web

from jupyterhub.services.auth import HubOAuthenticated


class MyHandler(HubOAuthenticated, BaseHandler):
    hub_scopes = ["custom:myservice:read"]

    @web.authenticated
    def get(self):
        ...
Existing scope filters (!user=
, etc.) may be applied to custom scopes.
Custom scope filters are NOT supported.
Warning
JupyterHub allows you to define custom scopes, but it does not enforce that your services apply them.
For example, if you enable read-only access to servers via a custom JupyterHub Authorizer (as seen in the read-only example),
it is the administrator’s responsibility to enforce that it is applied.
If you allow users to launch servers without that custom Authorizer,
read-only permissions will not be enforced, and the default behavior of unrestricted access via the access:servers
scope will be applied.
Scopes and APIs#
The scopes are also listed in the JupyterHub REST API documentation. Each API endpoint has a list of scopes which can be used to access the API; if no scopes are listed, the API is not authenticated and can be accessed without any permissions (i.e., no scopes).
Listed scopes by each API endpoint reflect the “lowest” permissions required to gain any access to the corresponding API.
For example, posting a user’s activity (POST /users/:name/activity) requires the users:activity scope.
If the request holds the scope users, access will be granted, as the required scope is a subscope of users.
If, on the other hand, read:users:activity is the only scope held, the request will be denied.
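The subscope rule in this example can be sketched with a toy expansion table. This is a hypothetical subset of the real scope hierarchy, for illustration only:

```python
# Toy expansion table: each scope maps to itself plus its subscopes.
# Hypothetical subset of the real hierarchy, for illustration only.
SUBSCOPES = {
    "users": {
        "users",
        "users:activity",
        "read:users",
        "read:users:name",
        "read:users:activity",
    },
}


def expand(held):
    """Expand a set of held scopes into all scopes they imply."""
    expanded = set()
    for scope in held:
        expanded |= SUBSCOPES.get(scope, {scope})
    return expanded


def has_access(held, required):
    """Access is granted if the required scope is among the expanded held scopes."""
    return required in expand(held)
```

With this table, holding users grants access to an endpoint requiring users:activity, while holding only read:users:activity does not, matching the example above.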
Use Cases#
To determine which scopes a role should have, one can follow these steps:
Determine which actions the role holder should and should not have access to
Match the actions against the JupyterHub’s APIs
Check which scopes are required to access the APIs
Combine scopes and subscopes if applicable
Customize the scopes with filters if needed
Define the role with required scopes and assign to users/services/groups/tokens
Below, different use cases illustrate how to use the RBAC framework.
Service to cull idle servers#
Finding and shutting down idle servers can save a lot of computational resources. We can make use of jupyterhub-idle-culler to manage this for us. Below follows a short tutorial on how to add a cull-idle service in the RBAC system.
Install the cull-idle server script with pip install jupyterhub-idle-culler.

Define a new service idle-culler and a new role for this service:

# in jupyterhub_config.py
c.JupyterHub.services = [
    {
        "name": "idle-culler",
        "command": [
            sys.executable,
            "-m",
            "jupyterhub_idle_culler",
            "--timeout=3600",
        ],
    }
]

c.JupyterHub.load_roles = [
    {
        "name": "idle-culler",
        "description": "Culls idle servers",
        "scopes": ["read:users:name", "read:users:activity", "servers"],
        "services": ["idle-culler"],
    }
]

Important
Note that in the RBAC system the admin field in the idle-culler service definition is omitted. Instead, the idle-culler role provides the service with only the permissions it needs.

If the optional actions of deleting the idle servers and/or removing inactive users are desired, change the following scopes in the idle-culler role definition:

- servers to admin:servers for deleting servers
- read:users:name, read:users:activity to admin:users for deleting users

Restart JupyterHub to complete the process.
API launcher#
A service capable of creating/removing users and launching multiple servers should have access to:
POST and DELETE /users
POST and DELETE /users/:name/server or /users/:name/servers/:server_name
Creating/deleting servers
The scopes required to access the API endpoints:
admin:users
servers
admin:servers
From the above, the role definition is:
# in jupyterhub_config.py
c.JupyterHub.load_roles = [
{
"name": "api-launcher",
"description": "Manages servers",
"scopes": ["admin:users", "admin:servers"],
"services": [<service_name>]
}
]
If needed, the scopes can be modified to limit the permissions to e.g. a particular group with !group=groupname
filter.
Group admin roles#
Roles can be used to specify different group member privileges.
For example, a group of students class-A
may have a role allowing all group members to access information about their group. Teacher johan
, who is a student of class-A
but a teacher of another group of students class-B
, can have additional role permitting him to access information about class-B
students as well as start/stop their servers.
The roles can then be defined as follows:
# in jupyterhub_config.py
c.JupyterHub.load_groups = {
'class-A': ['johan', 'student1', 'student2'],
'class-B': ['student3', 'student4']
}
c.JupyterHub.load_roles = [
{
'name': 'class-A-student',
'description': 'Grants access to information about the group',
'scopes': ['read:groups!group=class-A'],
'groups': ['class-A']
},
{
'name': 'class-B-student',
'description': 'Grants access to information about the group',
'scopes': ['read:groups!group=class-B'],
'groups': ['class-B']
},
{
'name': 'teacher',
'description': 'Allows for accessing information about teacher group members and starting/stopping their servers',
'scopes': [ 'read:users!group=class-B', 'servers!group=class-B'],
'users': ['johan']
}
]
In the above example, johan
has privileges inherited from class-A-student
role and the teacher
role on top of those.
Note
The scope filters (!group=
) limit the privileges only to the particular groups. johan
can access the servers and information of class-B
group members only.
Technical Implementation#
Roles are stored in the database, where they are associated with users, services, and groups. Roles can be added or modified as explained in the Defining Roles section. Users, services, groups, and tokens can gain, change, and lose roles. This is currently achieved via jupyterhub_config.py
(see Defining Roles) and will be made available via API in the future. The latter will allow for changing a user’s role, and thereby its permissions, without the need to restart JupyterHub.
Roles and scopes utilities can be found in roles.py
and scopes.py
modules. Scope variables take on five different formats that are reflected throughout the utilities via specific nomenclature:
Scope variable nomenclature
scopes
List of scopes that may contain abbreviations (used in role definitions). E.g., ["users:activity!user", "self"].

expanded scopes
Set of fully expanded scopes without abbreviations (i.e., resolved metascopes, filters, and subscopes). E.g., {"users:activity!user=charlie", "read:users:activity!user=charlie"}.

parsed scopes
Dictionary representation of expanded scopes. E.g., {"users:activity": {"user": ["charlie"]}, "read:users:activity": {"user": ["charlie"]}}.

intersection
Set of expanded scopes as the intersection of 2 expanded scope sets.

identify scopes
Set of expanded scopes needed for identity (whoami) endpoints.
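The relationship between expanded and parsed scopes can be illustrated with a short sketch. This is a simplified stand-in for the real scopes.py utilities, handling only bare scopes and single !key=value filters:

```python
def parse_scopes(expanded_scopes):
    """Convert expanded scopes into the parsed (dictionary) representation.

    Simplified sketch: only handles scopes of the form "scope!key=value"
    or bare "scope"; the real implementation in scopes.py does more.
    """
    parsed = {}
    for scope in expanded_scopes:
        base, _, filt = scope.partition("!")
        entry = parsed.setdefault(base, {})
        if filt:
            key, _, value = filt.partition("=")
            entry.setdefault(key, []).append(value)
    return parsed
```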
Resolving roles and scopes#
Resolving roles involves determining which roles a user, service, or group has, extracting the list of scopes from each role and combining them into a single set of scopes.
Resolving scopes involves expanding scopes into all their possible subscopes (expanded scopes), parsing them into the format used for access evaluation (parsed scopes) and, if applicable, comparing two sets of scopes (intersection). All procedures take into account the scope hierarchy, vertical and horizontal filtering, limiting or elevated permissions (read:<resource>
or admin:<resource>
, respectively), and metascopes.
Roles and scopes are resolved on several occasions, for example when requesting an API token with specific scopes or when making an API request. The following sections provide more details.
Requesting API token with specific scopes#
Changed in version 3.0: API tokens have scopes instead of roles, so that their permissions cannot be updated.
You may still request roles for a token, but those roles will be evaluated to the corresponding scopes immediately.
Prior to 3.0, tokens stored roles, which meant their scopes were resolved on each request.
API tokens grant access to JupyterHub’s APIs. The RBAC framework allows for requesting tokens with specific permissions.
RBAC is involved in several stages of the OAuth token flow.
When requesting a token via the tokens API (/users/:name/tokens) or the token page (/hub/token), if no scopes are requested, the token is issued with the permissions stored on the default token role (provided the requester is allowed to create the token). OAuth tokens are also requested via the OAuth flow.
If the token is requested with any scopes, the permissions of the requesting entity are checked against the requested permissions to ensure the token would not grant its owner additional privileges.
If a token has any scopes that its owner does not possess at the time of making the API request, those scopes are removed. The API request is resolved without additional errors using the scope intersection; the Hub logs a warning in this case (see Figure 2).
Resolving a token’s scope (yellow box in Figure 1) corresponds to resolving all the roles of the token’s owner (including the roles associated with their groups) and the token’s own scopes into a set of scopes. The two sets are compared (Resolve the scopes box in orange in Figure 1), taking into account the scope hierarchy. If the token’s scopes are a subset of the token owner’s scopes, the token is issued with the requested scopes; if not, JupyterHub will raise an error.
Figure 1 below illustrates the steps involved. The orange rectangles highlight where in the process the roles and scopes are resolved.

Figure 1. Resolving roles and scopes during API token request#
Making an API request#
With the RBAC framework, each authenticated JupyterHub API request is guarded by a scope decorator that specifies which scopes are required in order to gain the access to the API.
When an API request is made, the requesting API token’s scopes are again intersected with its owner’s (yellow box in Figure 2) to ensure that the token does not grant more permissions than its owner has at the request time (e.g., due to changing/losing roles).
If the owner’s roles do not include some scopes of the token, only the intersection of the token’s and owner’s scopes will be used. For example, using a token with scope users
whose owner’s role scope is read:users:name
will result in only the read:users:name
scope being passed on. In the case of no intersection, an empty set of scopes will be used.
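The intersection step can be sketched as follows, using a toy expansion table (a hypothetical subset of the real hierarchy) so that subscopes are taken into account:

```python
# Toy expansion table: "users" implies its read subscopes.
# Hypothetical subset of the real hierarchy, for illustration only.
SUBSCOPES = {
    "users": {"users", "read:users", "read:users:name", "read:users:activity"},
}


def expand(scopes):
    """Expand a set of scopes into all scopes they imply."""
    expanded = set()
    for scope in scopes:
        expanded |= SUBSCOPES.get(scope, {scope})
    return expanded


def request_time_scopes(token_scopes, owner_scopes):
    """Scopes actually used for a request: token and owner scopes intersected after expansion."""
    return expand(token_scopes) & expand(owner_scopes)
```

Here a token with scope users whose owner holds only read:users:name resolves to just read:users:name, mirroring the example above.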
The passed scopes are compared to the scopes required to access the API as follows:
if the API scopes are present within the set of passed scopes, the access is granted and the API returns its “full” response
if that is not the case, another check is utilized to determine if subscopes of the required API scopes can be found in the passed scope set:

- if found, the RBAC framework employs the filtering procedures to refine the API response to access only resource attributes corresponding to the passed scopes. For example, providing a scope read:users:activity!group=class-C for the GET /users API will return a list of user models from group class-C containing only the last_activity attribute for each user model
- if not found, the access to API is denied
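The effect of the group-filtered scope on the response can be sketched like this. This is a hypothetical helper showing the shape of the filtering, not JupyterHub’s actual implementation:

```python
def filter_activity_by_group(user_models, allowed_groups):
    """Return only users in the allowed groups, reduced to name + last_activity.

    Hypothetical sketch of the effect of read:users:activity!group=...
    on GET /users; not JupyterHub's actual implementation.
    """
    return [
        {"name": u["name"], "last_activity": u["last_activity"]}
        for u in user_models
        if set(u.get("groups", [])) & set(allowed_groups)
    ]
```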
Figure 2 illustrates this process highlighting the steps where the role and scope resolutions as well as filtering occur in orange.

Figure 2. Resolving roles and scopes when an API request is made#
Upgrading JupyterHub with RBAC framework#
The RBAC framework requires a different database setup than any previous JupyterHub version due to eliminating the distinction between OAuth and API tokens (see OAuth vs API tokens for more details). This requires merging the two previously separate database tables into one. By doing so, all existing tokens created before the upgrade no longer comply with the new database version and must be replaced.
This is achieved by the Hub deleting all existing tokens during the database upgrade and recreating the tokens loaded via the jupyterhub_config.py
file with updated structure. However, any manually issued or stored tokens are not recreated automatically and must be manually re-issued after the upgrade.
No other database records are affected.
Upgrade steps#
All running servers must be stopped before proceeding with the upgrade.
To upgrade the Hub, follow the Upgrading JupyterHub instructions.
Attention
We advise against defining any new roles in the jupyterhub_config.py file right after the upgrade is completed and JupyterHub is restarted for the first time. This preserves the ‘current’ state of the Hub. You can define and assign new roles on any following startup.
After restarting the Hub, re-issue all tokens that were previously issued manually (i.e., not through the jupyterhub_config.py file).
When the JupyterHub is restarted for the first time after the upgrade, all users, services and tokens stored in the database or re-loaded through the configuration file will be assigned their default role. Any newly added entities after that will be assigned their default role only if no other specific role is requested for them.
Changing the permissions after the upgrade#
Once all the upgrade steps above are completed, the RBAC framework will be available for utilization. You can define new roles, modify default roles (apart from admin
) and assign them to entities as described in the Defining Roles section.
We recommend the following procedure to start with RBAC:
Identify which admin users and services you would like to grant only the permissions they need through the new roles.
Strip these users and services of their admin status via API or UI. This will change their roles from admin to user.

Note
Stripping entities of their roles is currently available only via jupyterhub_config.py (see Removing Roles).

Define new roles that you would like to start using with appropriate scopes and assign them to these entities in jupyterhub_config.py.

Restart JupyterHub for the new roles to take effect.
OAuth vs API tokens#
Before RBAC#
Previous JupyterHub versions utilized two types of tokens: OAuth tokens and API tokens.
OAuth token is issued by the Hub to a single-user server when the user logs in. The token is stored in the browser cookie and is used to identify the user who owns the server during the OAuth flow. This token by default expires when the cookie reaches its expiry time of 2 weeks (or after 1 hour in JupyterHub versions < 1.3.0).
API token is issued by the Hub to a single-user server when launched and is used to communicate with the Hub’s APIs such as posting activity or completing the OAuth flow. This token has no expiry by default.
API tokens can also be issued to users via API (/hub/token or POST /users/:username/tokens) and services via jupyterhub_config.py
to perform API requests.
With RBAC#
The RBAC framework allows for granting tokens different levels of permissions via scopes attached to roles. The ‘only identify’ purpose of the separate OAuth tokens is no longer required. API tokens can be used for every action, including the login and authentication, for which an API token with no role (i.e., no scope in Available scopes) is used.
OAuth tokens are therefore dropped from the Hub upgraded with the RBAC framework.
Reference#
The Reference section provides technical information about JupyterHub, such as monitoring the state of your installation and working with JupyterHub’s API modules and classes.
Reference#
Reference documentation provides technical descriptions of JupyterHub and how it works. This section is divided into two broad subsections:
Technical reference.
API reference.
Technical Overview#
The Technical Overview section gives you a high-level view of:
JupyterHub’s major Subsystems: Hub, Proxy, Single-User Notebook Server
how the subsystems interact
the process from JupyterHub access to user login
JupyterHub’s default behavior
customizing JupyterHub
The goal of this section is to share a deeper technical understanding of JupyterHub and how it works.
The Major Subsystems: Hub, Proxy, Single-User Notebook Server#
JupyterHub is a set of processes that together provide a single-user Jupyter
Notebook server for each person in a group. Three subsystems are started
by the jupyterhub
command line program:
Hub (Python/Tornado): manages user accounts, authentication, and coordinates Single User Notebook Servers using a Spawner.
Proxy: the public-facing part of JupyterHub that uses a dynamic proxy to route HTTP requests to the Hub and Single User Notebook Servers. The configurable http proxy (node-http-proxy) is the default proxy.
Single-User Notebook Server (Python/Tornado): a dedicated, single-user, Jupyter Notebook server is started for each user on the system when the user logs in. The object that starts the single-user notebook servers is called a Spawner.
How the Subsystems Interact#
Users access JupyterHub through a web browser, by going to the IP address or the domain name of the server.
The basic principles of operation are:
The Hub spawns the proxy (in the default JupyterHub configuration)
The proxy forwards all requests to the Hub by default
The Hub handles login and spawns single-user notebook servers on demand
The Hub configures the proxy to forward URL prefixes to single-user notebook servers
The proxy is the only process that listens on a public interface. The Hub sits
behind the proxy at /hub
. Single-user servers sit behind the proxy at
/user/[username]
.
Different authenticators control access to JupyterHub. The default one (PAM) uses the user accounts on the server where JupyterHub is running. If you use this, you will need to create a user account on the system for each user on your team. However, using other authenticators you can allow users to sign in with e.g. a GitHub account, or with any single-sign-on system your organization has.
Next, spawners control how JupyterHub starts the individual notebook server for each user. The default spawner will start a notebook server on the same machine running under their system username. The other main option is to start each server in a separate container, often using Docker.
The Process from JupyterHub Access to User Login#
When a user accesses JupyterHub, the following events take place:
Login data is handed to the Authenticator instance for validation
The Authenticator returns the username if the login information is valid
A single-user notebook server instance is spawned for the logged-in user
When the single-user notebook server starts, the proxy is notified to forward requests made to /user/[username]/* to the single-user notebook server
A cookie is set on /hub/, containing an encrypted token. (Prior to version 0.8, a cookie for /user/[username] was used too.)
The browser is redirected to /user/[username], and the request is handled by the single-user notebook server.
How does the single-user server identify the user with the Hub via OAuth?
On request, the single-user server checks a cookie
If no cookie is set, the single-user server redirects to the Hub for verification via OAuth
After verification at the Hub, the browser is redirected back to the single-user server
The token is verified and stored in a cookie
If no user is identified, the browser is redirected back to
/hub/login
Default Behavior#
By default, the Proxy listens on all public interfaces on port 8000. Thus you can reach JupyterHub through either:
http://localhost:8000
or any other public IP or domain pointing to your system.
In their default configuration, the other services, the Hub and Single-User Notebook Servers, all communicate with each other on localhost only.
By default, starting JupyterHub will write two files to disk in the current working directory:
jupyterhub.sqlite is the SQLite database containing all of the state of the Hub. This file allows the Hub to remember which users are running and where, as well as storing other information enabling you to restart parts of JupyterHub separately. It is important to note that this database contains no sensitive information other than Hub usernames.

jupyterhub_cookie_secret is the encryption key used for securing cookies. This file needs to persist so that a Hub server restart will avoid invalidating cookies. Conversely, deleting this file and restarting the server effectively invalidates all login cookies. The cookie secret file is discussed in the Cookie Secret section of the Security Settings document.
The location of these files can be specified via configuration settings. It is
recommended that these files be stored in standard UNIX filesystem locations,
such as /etc/jupyterhub
for all configuration files and /srv/jupyterhub
for
all security and runtime files.
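For example, the database and cookie secret locations can be set in jupyterhub_config.py. The paths below follow the suggested locations above; adjust them to your deployment:

```python
# in jupyterhub_config.py
# Store runtime/security files under /srv/jupyterhub (suggested location)
c.JupyterHub.db_url = "sqlite:////srv/jupyterhub/jupyterhub.sqlite"
c.JupyterHub.cookie_secret_file = "/srv/jupyterhub/jupyterhub_cookie_secret"
```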
Customizing JupyterHub#
There are two basic extension points for JupyterHub:
How users are authenticated by Authenticators
How users’ single-user notebook server processes are started by Spawners
Each is governed by a customizable class, and JupyterHub ships with basic defaults for each.
To enable custom authentication and/or spawning, subclass Authenticator
or
Spawner
, and override the relevant methods.
Authenticators#
The Authenticator
is the mechanism for authorizing users to use the
Hub and single user notebook servers.
The default PAM Authenticator#
JupyterHub ships with the default PAM-based Authenticator, for logging in with local user accounts via a username and password.
The OAuthenticator#
Some login mechanisms, such as OAuth, don’t map onto username and password authentication, and instead use tokens. When using these mechanisms, you can override the login handlers.
You can see an example implementation of an Authenticator that uses GitHub OAuth at OAuthenticator.
JupyterHub’s OAuthenticator currently supports the following popular services:
Auth0
Bitbucket
CILogon
GitHub
GitLab
Globus
Google
MediaWiki
OpenShift
A generic implementation, which you can use for OAuth authentication with any provider, is also available.
The Dummy Authenticator#
When testing, it may be helpful to use the
DummyAuthenticator
. This allows for any username and
password unless a global password has been set. Once set, any username will
still be accepted but the correct password will need to be provided.
Added in version 5.0: The DummyAuthenticator’s default allow_all
is True,
unlike most other Authenticators.
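For example, to use the DummyAuthenticator with a shared global password (the password value here is illustrative):

```python
# in jupyterhub_config.py
c.JupyterHub.authenticator_class = "dummy"
# any username is accepted, but this password is required
c.DummyAuthenticator.password = "a-shared-test-password"
```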
Additional Authenticators#
Additional authenticators can be found on GitHub by searching for topic:jupyterhub topic:authenticator.
Technical Overview of Authentication#
How the Base Authenticator works#
The base authenticator uses simple username and password authentication.
The base Authenticator has one central method:
Authenticator.authenticate#
This method is passed the Tornado RequestHandler
and the POST data
from JupyterHub’s login form. Unless the login form has been customized,
data
will have two keys:
username
password
If authentication is successful the authenticate
method must return either:
the username (non-empty str) of the authenticated user
or a dictionary with fields:
- name: the username
- admin: optional, a boolean indicating whether the user is an admin. In most cases it is better to use fine-grained RBAC permissions instead of giving users full admin privileges.
- auth_state: optional, a dictionary of auth state that will be persisted
- groups: optional, a list of JupyterHub group memberships
Otherwise, it must return None
.
Writing an Authenticator that looks up passwords in a dictionary requires only overriding this one method:
from secrets import compare_digest
from traitlets import Dict
from jupyterhub.auth import Authenticator
class DictionaryAuthenticator(Authenticator):
    passwords = Dict(
        config=True,
        help="""dict of username:password for authentication""",
    )

    async def authenticate(self, handler, data):
        username = data["username"]
        password = data["password"]
        check_password = self.passwords.get(username, "")
        # always call compare_digest, for timing attacks
        if compare_digest(check_password, password) and username in self.passwords:
            return username
        else:
            return None
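Assuming the class above is importable from a package named mypackage (a hypothetical name), it could then be enabled with:

```python
# in jupyterhub_config.py
c.JupyterHub.authenticator_class = "mypackage:DictionaryAuthenticator"
c.DictionaryAuthenticator.passwords = {
    "alice": "alice-password",  # illustrative credentials only
}
```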
Normalize usernames#
Since the Authenticator and Spawner both use the same username,
sometimes you want to transform the name coming from the authentication service
(e.g. turning email addresses into local system usernames) before adding them to the Hub service.
Authenticators can define normalize_username
, which takes a username.
The default normalization is to cast names to lowercase.
For simple mappings, a configurable dict Authenticator.username_map
is used to turn one name into another:
c.Authenticator.username_map = {
    'service-name': 'localname',
}
When using PAMAuthenticator
, you can set
c.PAMAuthenticator.pam_normalize_username = True
, which will
normalize usernames using PAM (basically round-tripping them: username
to uid to username), which is useful in case you use some external
service that allows multiple usernames mapping to the same user (such
as ActiveDirectory, yes, this really happens). When
pam_normalize_username
is on, usernames are not normalized to
lowercase.
Validate usernames#
In most cases, there is a very limited set of acceptable usernames.
Authenticators can define validate_username(username)
,
which should return True for a valid username and False for an invalid one.
The primary effect this has is improving error messages during user creation.
The default behavior is to use configurable Authenticator.username_pattern
,
which is a regular expression string for validation.
To only allow usernames that start with ‘w’:
c.Authenticator.username_pattern = r'w.*'
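For stricter rules than a single pattern config, validate_username can be overridden directly. A sketch, assuming a hypothetical site policy (3-32 characters, starting with a lowercase letter):

```python
import re

# hypothetical site policy, not a JupyterHub default
_USERNAME_RE = re.compile(r"^[a-z][a-z0-9_-]{2,31}$")


def validate_username(username):
    """Return True for an acceptable username, False otherwise."""
    return bool(_USERNAME_RE.fullmatch(username))
```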
How to write a custom authenticator#
You can use custom Authenticator subclasses to enable authentication via other mechanisms. One such example is using GitHub OAuth.
Because the username is passed from the Authenticator to the Spawner,
a custom Authenticator and Spawner are often used together.
For example, the Authenticator methods, Authenticator.pre_spawn_start()
and Authenticator.post_spawn_stop()
, are hooks that can be used to do
auth-related startup (e.g. opening PAM sessions) and cleanup
(e.g. closing PAM sessions).
Registering custom Authenticators via entry points#
As of JupyterHub 1.0, custom authenticators can register themselves via
the jupyterhub.authenticators
entry point metadata.
To do this, in your setup.py
add:
setup(
    ...
    entry_points={
        'jupyterhub.authenticators': [
            'myservice = mypackage:MyAuthenticator',
        ],
    },
)
If you have added this metadata to your package, admins can select your authenticator with the configuration:
c.JupyterHub.authenticator_class = 'myservice'
instead of the full
c.JupyterHub.authenticator_class = 'mypackage:MyAuthenticator'
previously required.
Additionally, configurable attributes for your authenticator will
appear in jupyterhub help output and auto-generated configuration files
via jupyterhub --generate-config
.
Allowing access#
When dealing with logging in, there are generally two separate steps:
- authentication: identifying who is trying to log in, and
- authorization: deciding whether an authenticated user is allowed to access your JupyterHub
Authenticator.authenticate()
is responsible for authenticating users.
It is perfectly fine in the simplest cases for Authenticator.authenticate
to be responsible for authentication and authorization,
in which case authenticate
may return None
if the user is not authorized.
However, Authenticators also have two methods, check_allowed()
and check_blocked_users()
, which are called after successful authentication to further check if the user is allowed.
If check_blocked_users()
returns False, authorization stops and the user is not allowed.
If Authenticator.allow_all
is True OR check_allowed()
returns True, authorization proceeds.
Added in version 5.0: Authenticator.allow_all
and Authenticator.allow_existing_users
are new in JupyterHub 5.0.
By default, allow_all
is False,
which is a change from pre-5.0, where allow_all
was implicitly True if allowed_users
was empty.
Overriding check_allowed#
Changed in version 5.0: check_allowed()
is not called if allow_all
is True.
Changed in version 5.0: Starting with 5.0, check_allowed()
should NOT return True if no allow config
is specified (allow_all
should be used instead).
The base implementation of check_allowed() checks:
if username is in the allowed_users set, return True
else, return False
Changed in version 5.0: Prior to 5.0, this would also return True if allowed_users
was empty.
For clarity, this is no longer the case. A new allow_all
property (default False) has been added which is checked before calling check_allowed
.
If allow_all
is True, this takes priority over check_allowed
, which will be ignored.
If your Authenticator subclass similarly returns True when no allow config is defined,
this is fully backward compatible for your users, but means allow_all = False
has no real effect.
You can make your Authenticator forward-compatible with JupyterHub 5 by defining allow_all
as a boolean config trait on your class:
class MyAuthenticator(Authenticator):
    # backport allow_all from JupyterHub 5
    allow_all = Bool(False, config=True)

    def check_allowed(self, username, authentication):
        if self.allow_all:
            # replaces previous "if no auth config"
            return True
        ...
If an Authenticator defines additional sources of allow
configuration,
such as membership in a group or other information,
it should override check_allowed
to account for this.
Note
allow_
configuration should generally be additive,
i.e. if access is granted by any allow configuration,
a user should be authorized.
JupyterHub recommends that Authenticators applying restrictive configuration should use names like block_
or require_
,
and check this during check_blocked_users
or authenticate
, not check_allowed
.
In general, an Authenticator’s skeleton should look like:
class MyAuthenticator(Authenticator):
    # backport allow_all for compatibility with JupyterHub < 5
    allow_all = Bool(False, config=True)

    require_something = List(config=True)
    allowed_something = Set()

    def authenticate(self, data, handler):
        ...
        if success:
            return {"name": username, "auth_state": {...}}
        else:
            return None

    def check_blocked_users(self, username, authentication=None):
        """Apply _restrictive_ configuration"""
        if self.require_something and not has_something(username, self.require_something):
            return False
        # repeat for each restriction
        if restriction_defined and restriction_not_met:
            return False
        return super().check_blocked_users(username, authentication)

    def check_allowed(self, username, authentication=None):
        """Apply _permissive_ configuration

        Only called if check_blocked_users returns True
        AND allow_all is False
        """
        if self.allow_all:
            # check here to backport allow_all behavior
            # from JupyterHub 5
            # this branch will never be taken with jupyterhub >=5
            return True
        if self.allowed_something and user_has_something(username):
            return True
        # repeat for each allow
        if allow_config and allow_met:
            return True
        # should always have this at the end
        if self.allowed_users and username in self.allowed_users:
            return True
        # do not call super!
        # super().check_allowed is not safe with JupyterHub < 5.0,
        # as it will return True if allowed_users is empty
        return False
Key points:
allow_all is backported from JupyterHub 5, for consistent behavior in all versions of JupyterHub (optional)
restrictive configuration is checked in check_blocked_users
if any restriction is not met, check_blocked_users returns False
permissive configuration is checked in check_allowed
if any allow condition is met, check_allowed returns True
So the logical expression for a user being authorized should look like:
if ALL restrictions are met AND ANY admissions are met: user is authorized
Custom error messages#
Any of these authentication and authorization methods may raise a web.HTTPError exception
from tornado import web
raise web.HTTPError(403, "informative message")
if you want to show a more informative login failure message rather than the generic one.
Authentication state#
JupyterHub 0.8 adds the ability to persist state related to authentication,
such as auth-related tokens.
If such state should be persisted, .authenticate()
should return a dictionary of the form:
{
'name': username,
'auth_state': {
'key': 'value',
}
}
where username
is the username that has been authenticated,
and auth_state
is any JSON-serializable dictionary.
Because auth_state
may contain sensitive information,
it is encrypted before being stored in the database.
To store auth_state, two conditions must be met:
persisting auth state must be enabled explicitly via configuration:
c.Authenticator.enable_auth_state = True
encryption must be enabled by the presence of the JUPYTERHUB_CRYPT_KEY environment variable, which should be a hex-encoded 32-byte key. For example:
export JUPYTERHUB_CRYPT_KEY=$(openssl rand -hex 32)
JupyterHub uses Fernet to encrypt auth_state.
To facilitate key-rotation, JUPYTERHUB_CRYPT_KEY
may be a semicolon-separated list of encryption keys.
If there are multiple keys present, the first key is always used to persist any new auth_state.
Using auth_state#
Typically, if auth_state
is persisted it is desirable to affect the Spawner environment in some way.
This may mean defining environment variables, placing certificates in the user's home directory, etc.
The Authenticator.pre_spawn_start()
method can be used to pass information from authenticator state
to Spawner environment:
class MyAuthenticator(Authenticator):
    async def authenticate(self, handler, data=None):
        username = await identify_user(handler, data)
        upstream_token = await token_for_user(username)
        return {
            'name': username,
            'auth_state': {
                'upstream_token': upstream_token,
            },
        }

    async def pre_spawn_start(self, user, spawner):
        """Pass upstream_token to spawner via environment variable"""
        auth_state = await user.get_auth_state()
        if not auth_state:
            # auth_state not enabled
            return
        spawner.environment['UPSTREAM_TOKEN'] = auth_state['upstream_token']
Note that environment variable names and values are always strings, so passing multiple values means setting multiple environment variables or serializing more complex data into a single variable, e.g. as a JSON string.
Auth state can also be used to configure the spawner via config, without subclassing, by setting c.Spawner.auth_state_hook. This function will be called with (spawner, auth_state), only when auth_state is defined. For example (for KubeSpawner):
def auth_state_hook(spawner, auth_state):
    spawner.volumes = auth_state['user_volumes']
    spawner.mounts = auth_state['user_mounts']

c.Spawner.auth_state_hook = auth_state_hook
Authenticator-managed group membership#
Added in version 2.2.
Some identity providers may have their own concept of group membership that you would like to preserve in JupyterHub.
This is now possible with Authenticator.manage_groups
.
You can set the config:
c.Authenticator.manage_groups = True
to enable this behavior.
The default is False for Authenticators that ship with JupyterHub,
but may be True for custom Authenticators.
Check your Authenticator’s documentation for manage_groups
support.
If True, Authenticator.authenticate()
and Authenticator.refresh_user()
may include a field groups
which is a list of group names the user should be a member of:
Membership will be added for any group in the list
Membership in any groups not in the list will be revoked
Any groups not already present in the database will be created
If None is returned, no changes are made to the user's group membership
If authenticator-managed groups are enabled, all group-management via the API is disabled, and roles cannot be specified with the load_groups traitlet.
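As a sketch, an Authenticator using manage_groups might report membership like this. The base class and the identity-provider lookups are stubbed out so the snippet is self-contained; real code would subclass jupyterhub.auth.Authenticator and call a real backend:

```python
class GroupReportingAuthenticator:
    """Stand-in for a jupyterhub.auth.Authenticator subclass."""

    manage_groups = True

    async def authenticate(self, handler, data):
        username = await self._check_credentials(data)
        if username is None:
            return None
        return {
            "name": username,
            # honored only when manage_groups is True;
            # groups not in this list are revoked for the user
            "groups": await self._lookup_groups(username),
        }

    async def _check_credentials(self, data):
        # hypothetical identity-provider call
        return data.get("username") or None

    async def _lookup_groups(self, username):
        # hypothetical group lookup
        return ["students"]
```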
Authenticator-managed roles#
Added in version 5.0.
Some identity providers may have their own concept of role membership that you would like to preserve in JupyterHub.
This is now possible with Authenticator.manage_roles
.
You can set the config:
c.Authenticator.manage_roles = True
to enable this behavior.
The default is False for Authenticators that ship with JupyterHub,
but may be True for custom Authenticators.
Check your Authenticator’s documentation for manage_roles
support.
If True, Authenticator.authenticate()
and Authenticator.refresh_user()
may include a field roles
which is a list of roles that the user should be assigned:
User will be assigned each role in the list
User will be revoked roles not in the list (but they may still retain the role privileges if they inherit the role from their group)
Any roles not already present in the database will be created
Attributes of the roles (description, scopes, groups, users, and services) will be updated if given
If None is returned, no changes are made to the user's roles
If authenticator-managed roles are enabled, all role-management via the API is disabled, and roles cannot be assigned to groups or users via the load_roles traitlet (roles can still be created via load_roles or assigned to services).
When an authenticator manages roles, the initial roles and role assignments
can be loaded from role specifications returned by the Authenticator.load_managed_roles()
method.
The authenticator-managed roles and role assignments will be deleted after restart if:
Authenticator.reset_managed_roles_on_startup is set to True, and
the roles and role assignments are not included in the initial set of roles returned by the Authenticator.load_managed_roles() method.
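A sketch of what load_managed_roles() might return. In real code this is a method on your Authenticator; it is shown here as a plain function, and the role name, description, and assignments are illustrative values, not defaults:

```python
def load_managed_roles():
    """Initial role specifications for an authenticator managing roles."""
    return [
        {
            "name": "class-instructor",            # hypothetical role
            "description": "Grade and inspect class servers",
            "scopes": ["list:users", "admin-ui"],  # JupyterHub RBAC scopes
            "users": ["instructor-a"],             # hypothetical assignment
        },
    ]
```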
pre_spawn_start and post_spawn_stop hooks#
Authenticators use two hooks, Authenticator.pre_spawn_start() and Authenticator.post_spawn_stop(), to pass additional state information between the authenticator and a spawner. These hooks are typically used for auth-related startup (i.e. opening a PAM session) and auth-related cleanup (i.e. closing a PAM session).
JupyterHub as an OAuth provider#
Beginning with version 0.8, JupyterHub is an OAuth provider.
Spawners#
A Spawner starts each single-user notebook server. The Spawner represents an abstract interface to a process, and a custom Spawner needs to be able to take three actions:
start a process
poll whether a process is still running
stop a process
Examples#
Additional Spawners can be installed from separate packages. Some examples include:
DockerSpawner for spawning user servers in Docker containers
dockerspawner.DockerSpawner for spawning identical Docker containers for each user
dockerspawner.SystemUserSpawner for spawning Docker containers with an environment and home directory for each user
both DockerSpawner and SystemUserSpawner also work with Docker Swarm for launching containers on remote machines
SudoSpawner enables JupyterHub to run without being root, by spawning an intermediate process via sudo
BatchSpawner for spawning remote servers using batch systems
YarnSpawner for spawning notebook servers in YARN containers on a Hadoop cluster
SSHSpawner to spawn notebooks on a remote server using SSH
KubeSpawner to spawn notebook servers on a Kubernetes cluster.
NomadSpawner to spawn a notebook server as a Nomad job inside HashiCorp’s Nomad cluster
Spawner control methods#
Spawner.start#
Spawner.start
should start a single-user server for a single user.
Information about the user can be retrieved from self.user
,
an object encapsulating the user’s name, authentication, and server info.
The return value of Spawner.start
should be the (ip, port)
of the running server,
or a full URL as a string.
Most Spawner.start
functions will look similar to this example:
async def start(self):
    self.ip = '127.0.0.1'
    self.port = random_port()
    # get environment variables,
    # several of which are required for configuring the single-user server
    env = self.get_env()
    cmd = []
    # get jupyterhub command to run,
    # typically ['jupyterhub-singleuser']
    cmd.extend(self.cmd)
    cmd.extend(self.get_args())
    await self._actually_start_server_somehow(cmd, env)
    # url may not match self.ip:self.port, but it could!
    url = self._get_connectable_url()
    return url
When Spawner.start
returns, the single-user server process should actually be running,
not just requested. JupyterHub can handle Spawner.start
being very slow
(such as PBS-style batch queues, or instantiating whole AWS instances)
via relaxing the Spawner.start_timeout
config value.
Note on IPs and ports#
Spawner.ip
and Spawner.port
attributes set the bind URL,
which the single-user server should listen on
(passed to the single-user process via the JUPYTERHUB_SERVICE_URL
environment variable).
The return value is the IP and port (or full URL) the Hub should connect to.
These are not necessarily the same, and usually won’t be in any Spawner that works with remote resources or containers.
The default for Spawner.ip and Spawner.port is 127.0.0.1:{random},
which is appropriate for Spawners that launch local processes,
where everything is on localhost and each server needs its own port.
For remote or container Spawners, it will often make sense to use a different value,
such as ip = '0.0.0.0'
and a fixed port, e.g. 8888
.
The defaults can be changed in the class,
preserving configuration with traitlets:
from traitlets import default

from jupyterhub.spawner import Spawner


class MySpawner(Spawner):
    @default("ip")
    def _default_ip(self):
        return '0.0.0.0'

    @default("port")
    def _default_port(self):
        return 8888

    async def start(self):
        env = self.get_env()
        cmd = []
        # get jupyterhub command to run,
        # typically ['jupyterhub-singleuser']
        cmd.extend(self.cmd)
        cmd.extend(self.get_args())
        remote_server_info = await self._actually_start_server_somehow(cmd, env)
        url = self.get_public_url_from(remote_server_info)
        return url
Exception handling#
When Spawner.start
raises an Exception, a message can be passed on to the user via the exception using a .jupyterhub_html_message
or .jupyterhub_message
attribute.
When the Exception has a .jupyterhub_html_message
attribute, it will be rendered as HTML to the user.
Alternatively .jupyterhub_message
is rendered as unformatted text.
If neither attribute is present, the Exception will be shown to the user as unformatted text.
Spawner.poll#
Spawner.poll
checks if the spawner is still running.
It should return None if it is still running, and an integer exit status otherwise.
In the case of local processes, Spawner.poll
uses os.kill(PID, 0)
to check if the local process is still running. On Windows, it uses psutil.pid_exists
.
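The local-process check described above can be sketched as follows. This is a simplification: the real Spawner.poll is a coroutine method, and it recovers the actual exit status rather than assuming 0:

```python
import os


def poll_local_process(pid):
    """Return None if pid appears to be running, else an exit status."""
    if not pid:
        return 0  # never started, or already cleaned up
    try:
        # signal 0 probes for existence without affecting the process
        os.kill(pid, 0)
    except ProcessLookupError:
        return 0  # process has exited; real code reports the real status
    except PermissionError:
        return None  # exists, but owned by another user
    return None  # still running
```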
Spawner.stop#
Spawner.stop
should stop the process. It must be a tornado coroutine, which should return when the process has finished exiting.
Spawner state#
JupyterHub should be able to stop and restart without tearing down single-user notebook servers. To do this task, a Spawner may need to persist some information that can be restored later. A JSON-able dictionary of state can be used to store persisted information.
Unlike start, stop, and poll methods, the state methods must not be coroutines.
In the case of single processes, the Spawner state is only the process ID of the server:
def get_state(self):
    """get the current state"""
    state = super().get_state()
    if self.pid:
        state['pid'] = self.pid
    return state

def load_state(self, state):
    """load state from the database"""
    super().load_state(state)
    if 'pid' in state:
        self.pid = state['pid']

def clear_state(self):
    """clear any state (called after shutdown)"""
    super().clear_state()
    self.pid = 0
Spawner options form#
(new in 0.4)
Some deployments may want to offer options to users to influence how their servers are started. This may include cluster-based deployments, where users specify what resources should be available, or docker-based deployments where users can select from a list of base images.
This feature is enabled by setting Spawner.options_form
, which is an HTML form snippet
inserted unmodified into the spawn form.
If the Spawner.options_form
is defined, when a user tries to start their server, they will be directed to a form page, like this:
If Spawner.options_form
is undefined, the user’s server is spawned directly, and no spawn page is rendered.
See this example for a form that allows custom CLI args for the local spawner.
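For illustration, a minimal options form snippet; the field names, labels, and option values are arbitrary examples:

```python
# Hypothetical form with an integer, a text field, and a multi-select,
# producing formdata keys 'integer', 'text', and 'select'.
options_form = """
<label for="integer">Number of CPUs</label>
<input name="integer" type="number" value="1">

<label for="text">Notes</label>
<input name="text" type="text">

<label for="select">Extras</label>
<select name="select" multiple>
  <option value="a">Extra A</option>
  <option value="b">Extra B</option>
</select>
"""

# in jupyterhub_config.py:
# c.Spawner.options_form = options_form
```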
Spawner.options_from_form#
Options from this form will always be a dictionary of lists of strings, e.g.:
{
    'integer': ['5'],
    'text': ['some text'],
    'select': ['a', 'b'],
}
When formdata
arrives, it is passed through Spawner.options_from_form(formdata)
,
which is a method to turn the form data into the correct structure.
This method must return a dictionary, and is meant to interpret the lists-of-strings into the correct types. For example, the options_from_form
for the above form would look like:
def options_from_form(self, formdata):
    options = {}
    options['integer'] = int(formdata['integer'][0])  # single integer value
    options['text'] = formdata['text'][0]             # single string value
    options['select'] = formdata['select']            # list already correct
    options['notinform'] = 'extra info'               # not in the form at all
    return options
which would return:
{
    'integer': 5,
    'text': 'some text',
    'select': ['a', 'b'],
    'notinform': 'extra info',
}
When Spawner.start
is called, this dictionary is accessible as self.user_options
.
Writing a custom spawner#
If you are interested in building a custom spawner, you can read this tutorial.
Registering custom Spawners via entry points#
As of JupyterHub 1.0, custom Spawners can register themselves via
the jupyterhub.spawners
entry point metadata.
To do this, in your setup.py
add:
setup(
    ...
    entry_points={
        'jupyterhub.spawners': [
            'myservice = mypackage:MySpawner',
        ],
    },
)
If you have added this metadata to your package, users can select your spawner with the configuration:
c.JupyterHub.spawner_class = 'myservice'
instead of the full
c.JupyterHub.spawner_class = 'mypackage:MySpawner'
previously required.
Additionally, configurable attributes for your spawner will
appear in jupyterhub help output and auto-generated configuration files
via jupyterhub --generate-config
.
Environment variables and command-line arguments#
Spawners mainly do one thing: launch a command in an environment.
The command-line is constructed from user configuration:
Spawner.cmd (default: ['jupyterhub-singleuser'])
Spawner.args (CLI args to pass to the cmd, default: empty)
where the configuration:
c.Spawner.cmd = ["my-singleuser-wrapper"]
c.Spawner.args = ["--debug", "--flag"]
would result in spawning the command:
my-singleuser-wrapper --debug --flag
The Spawner.get_args()
method is how Spawner.args
is accessed,
and can be used by Spawners to customize/extend user-provided arguments.
Prior to 2.0, JupyterHub unconditionally added certain options if specified to the command-line,
such as --ip={Spawner.ip}
and --port={Spawner.port}
.
These have now all been moved to environment variables,
and from JupyterHub 2.0,
the command-line launched by JupyterHub is fully specified by overridable configuration Spawner.cmd + Spawner.args
.
Most process configuration is passed via environment variables.
Additional variables can be specified via the Spawner.environment
configuration.
The process environment is returned by Spawner.get_env
, which specifies the following environment variables:
JUPYTERHUB_SERVICE_URL - the bind URL where the server should launch its HTTP server (http://127.0.0.1:12345). This includes Spawner.ip and Spawner.port (new in 2.0; prior to 2.0, IP and port were passed on the command-line, and only if specified)
JUPYTERHUB_SERVICE_PREFIX - the URL prefix the service will run on (e.g. /user/name/)
JUPYTERHUB_USER - the JupyterHub user's username
JUPYTERHUB_SERVER_NAME - the server's name, if using named servers (default server has an empty name)
JUPYTERHUB_API_URL - the full URL for the JupyterHub API (e.g. http://127.0.0.1:8001/hub/api)
JUPYTERHUB_BASE_URL - the base URL of the whole jupyterhub deployment, i.e. the bit before hub/ or user/, as set by c.JupyterHub.base_url (default: /)
JUPYTERHUB_API_TOKEN - the API token the server can use to make requests to the Hub. This is also the OAuth client secret.
JUPYTERHUB_CLIENT_ID - the OAuth client ID for authenticating visitors.
JUPYTERHUB_OAUTH_CALLBACK_URL - the callback URL to use in OAuth, typically /user/:name/oauth_callback
JUPYTERHUB_OAUTH_ACCESS_SCOPES - the scopes required to access the server (called JUPYTERHUB_OAUTH_SCOPES prior to 3.0)
JUPYTERHUB_OAUTH_CLIENT_ALLOWED_SCOPES - the scopes the service is allowed to request. If no scopes are requested explicitly, these scopes will be requested.
JUPYTERHUB_PUBLIC_URL - the public URL of the server, e.g. https://jupyterhub.example.org/user/name/. Empty if no public URL is specified (default). Will be available if subdomains are configured.
JUPYTERHUB_PUBLIC_HUB_URL - the public URL of JupyterHub as a whole, e.g. https://jupyterhub.example.org/. Empty if no public URL is specified (default). Will be available if subdomains are configured.
Optional environment variables, depending on configuration:
JUPYTERHUB_SSL_[KEYFILE|CERTFILE|CLIENT_CA] - SSL configuration, when internal_ssl is enabled
JUPYTERHUB_ROOT_DIR - the root directory of the server (notebook directory), when Spawner.notebook_dir is defined (new in 2.0)
JUPYTERHUB_DEFAULT_URL - the default URL for the server (for redirects from /user/:name/), if Spawner.default_url is defined (new in 2.0, previously passed via CLI)
JUPYTERHUB_DEBUG=1 - generic debug flag, sets maximum log level when Spawner.debug is True (new in 2.0, previously passed via CLI)
JUPYTERHUB_DISABLE_USER_CONFIG=1 - disable loading user config, when Spawner.disable_user_config is True (new in 2.0, previously passed via CLI)
JUPYTERHUB_[MEM|CPU]_[LIMIT|GUARANTEE] - the values of CPU and memory limits and guarantees. These are not expected to be enforced by the process, but are made available as a hint, e.g. for resource monitoring extensions.
Spawners, resource limits, and guarantees (Optional)#
Some spawners of the single-user notebook servers allow setting limits or guarantees on resources, such as CPU and memory. To provide a consistent experience for sysadmins and users, we provide a standard way to set and discover these resource limits and guarantees.
For the limits and guarantees to be useful, the spawner must implement
support for them. For example, LocalProcessSpawner
, the default
spawner, does not support limits and guarantees. One of the spawners
that supports limits and guarantees is the
systemdspawner
.
Memory Limits & Guarantees#
c.Spawner.mem_limit
: A limit specifies the maximum amount of memory
that may be allocated, though there is no promise that the maximum amount will
be available. In supported spawners, you can set c.Spawner.mem_limit
to
limit the total amount of memory that a single-user notebook server can
allocate. Attempting to use more memory than this limit will cause errors. The
single-user notebook server can discover its own memory limit by looking at
the environment variable MEM_LIMIT
, which is specified in absolute bytes.
c.Spawner.mem_guarantee: Sometimes, a guarantee of a minimum amount of memory is desirable. In this case, you can set c.Spawner.mem_guarantee to provide a guarantee that at minimum this much memory will always be available for the single-user notebook server to use. The environment variable MEM_GUARANTEE will also be set in the single-user notebook server.
The spawner’s underlying system or cluster is responsible for enforcing these
limits and providing these guarantees. If these values are set to None
, no
limits or guarantees are provided, and no environment values are set.
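Inside the single-user server (or an extension running there), the hint can be read back with a small helper; get_mem_limit_bytes is a hypothetical name, not a JupyterHub API:

```python
import os


def get_mem_limit_bytes(default=None):
    """Return the MEM_LIMIT hint in bytes, or `default` if no limit is set."""
    value = os.environ.get("MEM_LIMIT")
    if value is None:
        return default
    return int(value)
```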
CPU Limits & Guarantees#
c.Spawner.cpu_limit
: In supported spawners, you can set
c.Spawner.cpu_limit
to limit the total number of cpu-cores that a
single-user notebook server can use. These can be fractional - 0.5
means 50%
of one CPU core, 4.0
is 4 CPU-cores, etc. This value is also set in the
single-user notebook server’s environment variable CPU_LIMIT
. The limit does
not claim that you will be able to use all the CPU up to your limit as other
higher priority applications might be taking up CPU.
c.Spawner.cpu_guarantee
: You can set c.Spawner.cpu_guarantee
to provide a
guarantee for CPU usage. The environment variable CPU_GUARANTEE
will be set
in the single-user notebook server when a guarantee is being provided.
The spawner’s underlying system or cluster is responsible for enforcing these
limits and providing these guarantees. If these values are set to None
, no
limits or guarantees are provided, and no environment values are set.
Encryption#
Communication between the Proxy
, Hub
, and Notebook
can be secured by
turning on internal_ssl
in jupyterhub_config.py
. For a custom spawner to
utilize these certs, there are two methods of interest on the base Spawner
class: .create_certs
and .move_certs
.
The first method, .create_certs
will sign a key-cert pair using an internally
trusted authority for notebooks. During this process, .create_certs
can
apply ip
and dns
name information to the cert via an alt_names
kwarg
.
This is used for certificate authentication (verification). Without proper
verification, the Notebook
will be unable to communicate with the Hub
and
vice versa when internal_ssl
is enabled. For example, given a deployment using the DockerSpawner, which will start containers with ips from the docker subnet pool, the DockerSpawner would need to instead choose a container ip prior to starting and pass that to .create_certs.
In general though, this method will not need to be changed and the default
ip
/dns
(localhost) info will suffice.
When .create_certs
is run, it will create the certificates in a default,
central location specified by c.JupyterHub.internal_certs_location
. For
Spawners
that need access to these certs elsewhere (i.e. on another host
altogether), the .move_certs
method can be overridden to move the certs
appropriately. Again, using DockerSpawner
as an example, this would entail
moving certs to a directory that will get mounted into the container this
spawner starts.
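The copying step such a .move_certs override performs can be sketched as follows. This is a simplification: the real method is an async method on the Spawner, and a real override must also handle file ownership, permissions, and container mounts:

```python
import shutil
from pathlib import Path


def move_certs(paths, dest_dir):
    """Copy the internal-ssl files into a directory the server can see.

    `paths` maps names like 'keyfile'/'certfile'/'cafile' to source paths;
    returns the same mapping with the updated locations.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    moved = {}
    for name, src in paths.items():
        target = dest / Path(src).name
        shutil.copy(src, target)
        moved[name] = str(target)
    return moved
```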
Configuration Reference#
Important
Make sure the version of JupyterHub for this documentation matches your installation version, as the output of this command may change between versions.
JupyterHub configuration#
As explained in the Configuration Basics
section, the jupyterhub_config.py
can be automatically generated via
jupyterhub --generate-config
Most of this information is available in a nicer format in:
The following contains the output of that command for reference.
# Configuration file for jupyterhub.
c = get_config() #noqa
#------------------------------------------------------------------------------
# Application(SingletonConfigurable) configuration
#------------------------------------------------------------------------------
## This is an application.
## The date format used by logging formatters for %(asctime)s
# Default: '%Y-%m-%d %H:%M:%S'
# c.Application.log_datefmt = '%Y-%m-%d %H:%M:%S'
## The Logging format template
# Default: '[%(name)s]%(highlevel)s %(message)s'
# c.Application.log_format = '[%(name)s]%(highlevel)s %(message)s'
## Set the log level by value or name.
# Choices: any of [0, 10, 20, 30, 40, 50, 'DEBUG', 'INFO', 'WARN', 'ERROR', 'CRITICAL']
# Default: 30
# c.Application.log_level = 30
## Configure additional log handlers.
#
# The default stderr logs handler is configured by the log_level, log_datefmt
# and log_format settings.
#
# This configuration can be used to configure additional handlers (e.g. to
# output the log to a file) or for finer control over the default handlers.
#
# If provided this should be a logging configuration dictionary, for more
# information see:
# https://docs.python.org/3/library/logging.config.html#logging-config-
# dictschema
#
# This dictionary is merged with the base logging configuration which defines
# the following:
#
# * A logging formatter intended for interactive use called
# ``console``.
# * A logging handler that writes to stderr called
# ``console`` which uses the formatter ``console``.
# * A logger with the name of this application set to ``DEBUG``
# level.
#
# This example adds a new handler that writes to a file:
#
# .. code-block:: python
#
# c.Application.logging_config = {
# "handlers": {
# "file": {
# "class": "logging.FileHandler",
# "level": "DEBUG",
# "filename": "<path/to/file>",
# }
# },
# "loggers": {
# "<application-name>": {
# "level": "DEBUG",
# # NOTE: if you don't list the default "console"
# # handler here then it will be disabled
# "handlers": ["console", "file"],
# },
# },
# }
# Default: {}
# c.Application.logging_config = {}
## Instead of starting the Application, dump configuration to stdout
# Default: False
# c.Application.show_config = False
## Instead of starting the Application, dump configuration to stdout (as JSON)
# Default: False
# c.Application.show_config_json = False
#------------------------------------------------------------------------------
# JupyterHub(Application) configuration
#------------------------------------------------------------------------------
## An Application for starting a Multi-User Jupyter Notebook server.
## Maximum number of concurrent servers that can be active at a time.
#
# Setting this can limit the total resources your users can consume.
#
# An active server is any server that's not fully stopped. It is considered
# active from the time it has been requested until the time that it has
# completely stopped.
#
# If this many user servers are active, users will not be able to launch new
# servers until a server is shutdown. Spawn requests will be rejected with a 429
# error asking them to try again.
#
# If set to 0, no limit is enforced.
# Default: 0
# c.JupyterHub.active_server_limit = 0
## Duration (in seconds) to determine the number of active users.
# Default: 1800
# c.JupyterHub.active_user_window = 1800
## Resolution (in seconds) for updating activity
#
# If activity is registered that is less than activity_resolution seconds more
# recent than the current value, the new value will be ignored.
#
# This avoids too many writes to the Hub database.
# Default: 30
# c.JupyterHub.activity_resolution = 30
## DEPRECATED since version 2.0.0.
#
# The default admin role has full permissions, use custom RBAC scopes instead to
# create restricted administrator roles.
# https://jupyterhub.readthedocs.io/en/stable/rbac/index.html
# Default: False
# c.JupyterHub.admin_access = False
## DEPRECATED since version 0.7.2, use Authenticator.admin_users instead.
# Default: set()
# c.JupyterHub.admin_users = set()
## Allow named single-user servers per user
# Default: False
# c.JupyterHub.allow_named_servers = False
## Answer yes to any questions (e.g. confirm overwrite)
# Default: False
# c.JupyterHub.answer_yes = False
## The default number of records returned by a paginated endpoint
# Default: 50
# c.JupyterHub.api_page_default_limit = 50
## The maximum number of records that can be returned at once
# Default: 200
# c.JupyterHub.api_page_max_limit = 200
## PENDING DEPRECATION: consider using services
#
# Dict of token:username to be loaded into the database.
#
# Allows ahead-of-time generation of API tokens for use by externally managed services,
# which authenticate as JupyterHub users.
#
# Consider using services for general services that talk to the
# JupyterHub API.
# Default: {}
# c.JupyterHub.api_tokens = {}
## Authentication for prometheus metrics
# Default: True
# c.JupyterHub.authenticate_prometheus = True
## Class for authenticating users.
#
# This should be a subclass of :class:`jupyterhub.auth.Authenticator`
#
# with an :meth:`authenticate` method that:
#
# - is a coroutine (asyncio or tornado)
# - returns username on success, None on failure
# - takes two arguments: (handler, data),
# where `handler` is the calling web.RequestHandler,
# and `data` is the POST form data from the login page.
#
# .. versionchanged:: 1.0
# authenticators may be registered via entry points,
# e.g. `c.JupyterHub.authenticator_class = 'pam'`
#
# Currently installed:
# - default: jupyterhub.auth.PAMAuthenticator
# - dummy: jupyterhub.auth.DummyAuthenticator
# - null: jupyterhub.auth.NullAuthenticator
# - pam: jupyterhub.auth.PAMAuthenticator
# Default: 'jupyterhub.auth.PAMAuthenticator'
# c.JupyterHub.authenticator_class = 'jupyterhub.auth.PAMAuthenticator'
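As a sketch of the required interface, a minimal custom authenticator might look like the following (the class name and the `team` allow-list are illustrative; this assumes `jupyterhub` is importable where the config file is loaded):

```python
# Illustrative custom authenticator; the allow-list is hypothetical.
from jupyterhub.auth import Authenticator


class TeamAuthenticator(Authenticator):
    """Allow logins only for a fixed set of usernames."""

    team = {'alice', 'bob'}

    async def authenticate(self, handler, data):
        # `data` is the POST form data from the login page
        username = data.get('username', '')
        if username in self.team:
            return username  # success: return the username
        return None  # failure


# c.JupyterHub.authenticator_class = TeamAuthenticator
```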
## The base URL of the entire application.
#
# Add this to the beginning of all JupyterHub URLs.
# Use base_url to run JupyterHub within an existing website.
# Default: '/'
# c.JupyterHub.base_url = '/'
## The public facing URL of the whole JupyterHub application.
#
# This is the address on which the proxy will bind.
# Sets protocol, ip, base_url
# Default: 'http://:8000'
# c.JupyterHub.bind_url = 'http://:8000'
## Whether to shutdown the proxy when the Hub shuts down.
#
# Disable if you want to be able to teardown the Hub while leaving the
# proxy running.
#
# Only valid if the proxy was started by the Hub process.
#
# If both this and cleanup_servers are False, sending SIGINT to the Hub will
# only shutdown the Hub, leaving everything else running.
#
# The Hub should be able to resume from database state.
# Default: True
# c.JupyterHub.cleanup_proxy = True
## Whether to shutdown single-user servers when the Hub shuts down.
#
# Disable if you want to be able to teardown the Hub while leaving the
# single-user servers running.
#
# If both this and cleanup_proxy are False, sending SIGINT to the Hub will
# only shutdown the Hub, leaving everything else running.
#
# The Hub should be able to resume from database state.
# Default: True
# c.JupyterHub.cleanup_servers = True
## Maximum number of concurrent users that can be spawning at a time.
#
# Spawning lots of servers at the same time can cause performance problems for
# the Hub or the underlying spawning system. Set this limit to prevent bursts of
# logins from attempting to spawn too many servers at the same time.
#
# This does not limit the number of total running servers. See
# active_server_limit for that.
#
# If more than this many users attempt to spawn at a time, their requests will
# be rejected with a 429 error asking them to try again. Users will have to wait
# for some of the spawning services to finish starting before they can start
# their own.
#
# If set to 0, no limit is enforced.
# Default: 100
# c.JupyterHub.concurrent_spawn_limit = 100
## The config file to load
# Default: 'jupyterhub_config.py'
# c.JupyterHub.config_file = 'jupyterhub_config.py'
## DEPRECATED: does nothing
# Default: False
# c.JupyterHub.confirm_no_ssl = False
## Enable `__Host-` prefix on authentication cookies.
#
# The `__Host-` prefix on JupyterHub cookies provides further
# protection against cookie tossing when untrusted servers
# may control subdomains of your jupyterhub deployment.
#
# _However_, it also requires that cookies be set on the path `/`,
# which means they are shared by all JupyterHub components,
# so a compromised server component will have access to _all_ JupyterHub-related
# cookies of the visiting browser.
# It is recommended to only combine `__Host-` cookies with per-user domains.
#
# .. versionadded:: 4.1
# Default: False
# c.JupyterHub.cookie_host_prefix_enabled = False
## Number of days for a login cookie to be valid.
# Default is two weeks.
# Default: 14
# c.JupyterHub.cookie_max_age_days = 14
## The cookie secret to use to encrypt cookies.
#
# Loaded from the JPY_COOKIE_SECRET env variable by default.
#
# Should be exactly 256 bits (32 bytes).
# Default: traitlets.Undefined
# c.JupyterHub.cookie_secret = traitlets.Undefined
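As one sketch, a config file could generate the secret with the standard library (in practice you would persist it, e.g. via `cookie_secret_file`, so it survives restarts; otherwise all cookies are invalidated on each restart):

```python
import secrets

# 32 random bytes == 256 bits, the exact size JupyterHub expects
c.JupyterHub.cookie_secret = secrets.token_bytes(32)
```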
## File in which to store the cookie secret.
# Default: 'jupyterhub_cookie_secret'
# c.JupyterHub.cookie_secret_file = 'jupyterhub_cookie_secret'
## Custom scopes to define.
#
# For use when defining custom roles,
# to grant users granular permissions
#
# All custom scopes must have a description,
# and must start with the prefix `custom:`.
#
# For example::
#
# custom_scopes = {
# "custom:jupyter_server:read": {
# "description": "read-only access to a single-user server",
# },
# }
# Default: {}
# c.JupyterHub.custom_scopes = {}
## The location of jupyterhub data files (e.g. /usr/local/share/jupyterhub)
# Default: '$HOME/checkouts/readthedocs.org/user_builds/jupyterhub/envs/latest/share/jupyterhub'
# c.JupyterHub.data_files_path = '/home/docs/checkouts/readthedocs.org/user_builds/jupyterhub/envs/latest/share/jupyterhub'
## Include any kwargs to pass to the database connection.
# See sqlalchemy.create_engine for details.
# Default: {}
# c.JupyterHub.db_kwargs = {}
## url for the database. e.g. `sqlite:///jupyterhub.sqlite`
# Default: 'sqlite:///jupyterhub.sqlite'
# c.JupyterHub.db_url = 'sqlite:///jupyterhub.sqlite'
## log all database transactions. This has A LOT of output
# Default: False
# c.JupyterHub.debug_db = False
## DEPRECATED since version 0.8: Use ConfigurableHTTPProxy.debug
# Default: False
# c.JupyterHub.debug_proxy = False
## If named servers are enabled, default name of server to spawn or open when no
# server is specified, e.g. by user-redirect.
#
# Note: This has no effect if named servers are not enabled, and does _not_
# change the existence or behavior of the default server named `''` (the empty
# string). This only affects which named server is launched when no server is
# specified, e.g. by links to `/hub/user-redirect/lab/tree/mynotebook.ipynb`.
# Default: ''
# c.JupyterHub.default_server_name = ''
## The default URL for users when they arrive (e.g. when user directs to "/")
#
# By default, redirects users to their own server.
#
# Can be a Unicode string (e.g. '/hub/home') or a callable based on the handler
# object:
#
# ::
#
# def default_url_fn(handler):
# user = handler.current_user
# if user and user.admin:
# return '/hub/admin'
# return '/hub/home'
#
# c.JupyterHub.default_url = default_url_fn
# Default: traitlets.Undefined
# c.JupyterHub.default_url = traitlets.Undefined
## Dict authority:dict(files). Specify the key, cert, and/or
# ca file for an authority. This is useful for externally managed
# proxies that wish to use internal_ssl.
#
# The files dict has this format (you must specify at least a cert)::
#
# {
# 'key': '/path/to/key.key',
# 'cert': '/path/to/cert.crt',
# 'ca': '/path/to/ca.crt'
# }
#
# The authorities you can override: 'hub-ca', 'notebooks-ca',
# 'proxy-api-ca', 'proxy-client-ca', and 'services-ca'.
#
# Use with internal_ssl
# Default: {}
# c.JupyterHub.external_ssl_authorities = {}
## DEPRECATED.
#
# If you need to register additional HTTP endpoints please use services instead.
# Default: []
# c.JupyterHub.extra_handlers = []
## DEPRECATED: use output redirection instead, e.g.
#
# jupyterhub &>> /var/log/jupyterhub.log
# Default: ''
# c.JupyterHub.extra_log_file = ''
## Extra log handlers to set on JupyterHub logger
# Default: []
# c.JupyterHub.extra_log_handlers = []
## Alternate header to use as the Host (e.g., X-Forwarded-Host)
# when determining whether a request is cross-origin
#
# This may be useful when JupyterHub is running behind a proxy that rewrites
# the Host header.
# Default: ''
# c.JupyterHub.forwarded_host_header = ''
## Generate certs used for internal ssl
# Default: False
# c.JupyterHub.generate_certs = False
## Generate default config file
# Default: False
# c.JupyterHub.generate_config = False
## The URL on which the Hub will listen. This is a private URL for internal
# communication. Typically set in combination with hub_connect_url. If a unix
# socket, hub_connect_url **must** also be set.
#
# For example:
#
# "http://127.0.0.1:8081"
# "unix+http://%2Fsrv%2Fjupyterhub%2Fjupyterhub.sock"
#
# .. versionadded:: 0.9
# Default: ''
# c.JupyterHub.hub_bind_url = ''
## The ip or hostname for proxies and spawners to use
# for connecting to the Hub.
#
# Use when the bind address (`hub_ip`) is 0.0.0.0, :: or otherwise different
# from the connect address.
#
# Default: when `hub_ip` is 0.0.0.0 or ::, use `socket.gethostname()`,
# otherwise use `hub_ip`.
#
# Note: Some spawners or proxy implementations might not support hostnames. Check your
# spawner or proxy documentation to see if they have extra requirements.
#
# .. versionadded:: 0.8
# Default: ''
# c.JupyterHub.hub_connect_ip = ''
## DEPRECATED
#
# Use hub_connect_url
#
# .. versionadded:: 0.8
#
# .. deprecated:: 0.9
# Use hub_connect_url
# Default: 0
# c.JupyterHub.hub_connect_port = 0
## The URL for connecting to the Hub. Spawners, services, and the proxy will use
# this URL to talk to the Hub.
#
# Only needs to be specified if the default hub URL is not connectable (e.g.
# using a unix+http:// bind url).
#
# .. seealso::
# JupyterHub.hub_connect_ip
# JupyterHub.hub_bind_url
#
# .. versionadded:: 0.9
# Default: ''
# c.JupyterHub.hub_connect_url = ''
## The ip address for the Hub process to *bind* to.
#
# By default, the hub listens on localhost only. This address must be accessible from
# the proxy and user servers. You may need to set this to a public ip or '' for all
# interfaces if the proxy or user servers are in containers or on a different host.
#
# See `hub_connect_ip` for cases where the bind and connect address should differ,
# or `hub_bind_url` for setting the full bind URL.
# Default: '127.0.0.1'
# c.JupyterHub.hub_ip = '127.0.0.1'
## The internal port for the Hub process.
#
# This is the internal port of the hub itself. It should never be accessed directly.
# See JupyterHub.port for the public port to use when accessing jupyterhub.
# It is rare that this port should be set except in cases of port conflict.
#
# See also `hub_ip` for the ip and `hub_bind_url` for setting the full
# bind URL.
# Default: 8081
# c.JupyterHub.hub_port = 8081
## The routing prefix for the Hub itself.
#
# Override to send only a subset of traffic to the Hub. Default is to use the
# Hub as the default route for all requests.
#
# This is necessary for normal jupyterhub operation, as the Hub must receive
# requests for e.g. `/user/:name` when the user's server is not running.
#
# However, some deployments using only the JupyterHub API may want to handle
# these events themselves, in which case they can register their own default
# target with the proxy and set e.g. `hub_routespec = /hub/` to serve only the
# hub's own pages, or even `/hub/api/` for api-only operation.
#
# Note: hub_routespec must include the base_url, if any.
#
# .. versionadded:: 1.4
# Default: '/'
# c.JupyterHub.hub_routespec = '/'
## Trigger implicit spawns after this many seconds.
#
# When a user visits a URL for a server that's not running,
# they are shown a page indicating that the requested server
# is not running with a button to spawn the server.
#
# Setting this to a positive value will redirect the user
# after this many seconds, effectively clicking the spawn button
# for them and beginning the spawn process automatically.
#
# Warning: this can result in errors and surprising behavior
# when sharing access URLs to actual servers,
# since the wrong server is likely to be started.
# Default: 0
# c.JupyterHub.implicit_spawn_seconds = 0
## Timeout (in seconds) to wait for spawners to initialize
#
# Checking if spawners are healthy can take a long time if many spawners are
# active at hub start time.
#
# If it takes longer than this timeout to check, init_spawner will be left to
# complete in the background and the http server is allowed to start.
#
# A timeout of -1 means wait forever, which can mean a slow startup of the Hub
# but ensures that the Hub is fully consistent by the time it starts responding
# to requests. This matches the behavior of jupyterhub 1.0.
#
# .. versionadded: 1.1.0
# Default: 10
# c.JupyterHub.init_spawners_timeout = 10
## The location to store certificates automatically created by
# JupyterHub.
#
# Use with internal_ssl
# Default: 'internal-ssl'
# c.JupyterHub.internal_certs_location = 'internal-ssl'
## Enable SSL for all internal communication
#
# This enables end-to-end encryption between all JupyterHub components.
# JupyterHub will automatically create the necessary certificate
# authority and sign notebook certificates as they're created.
# Default: False
# c.JupyterHub.internal_ssl = False
## The public facing ip of the whole JupyterHub application
# (specifically referred to as the proxy).
#
# This is the address on which the proxy will listen. The default is to
# listen on all interfaces. This is the only address through which JupyterHub
# should be accessed by users.
# Default: ''
# c.JupyterHub.ip = ''
## Supply extra arguments that will be passed to Jinja environment.
# Default: {}
# c.JupyterHub.jinja_environment_options = {}
## Interval (in seconds) at which to update last-activity timestamps.
# Default: 300
# c.JupyterHub.last_activity_interval = 300
## Dict of `{'group': {'users':['usernames'], 'properties': {}}}` to load at
# startup.
#
# Example::
#
# c.JupyterHub.load_groups = {
# 'groupname': {
# 'users': ['usernames'],
# 'properties': {'key': 'value'},
# },
# }
#
# This strictly *adds* groups and users to groups. Properties, if defined,
# replace all existing properties.
#
# Loading one set of groups, then starting JupyterHub again with a different set
# will not remove users or groups from previous launches. That must be done
# through the API.
#
# .. versionchanged:: 3.2
# Changed format of group from list of usernames to dict
# Default: {}
# c.JupyterHub.load_groups = {}
## List of predefined role dictionaries to load at startup.
#
# For instance::
#
# load_roles = [
# {
# 'name': 'teacher',
# 'description': "Access to users' information and group membership",
# 'scopes': ['users', 'groups'],
# 'users': ['cyclops', 'gandalf'],
# 'services': [],
# 'groups': []
# }
# ]
#
# All keys apart from 'name' are optional.
# See all the available scopes in the JupyterHub REST API documentation.
#
# Default roles are defined in roles.py.
# Default: []
# c.JupyterHub.load_roles = []
## The date format used by logging formatters for %(asctime)s
# See also: Application.log_datefmt
# c.JupyterHub.log_datefmt = '%Y-%m-%d %H:%M:%S'
## The Logging format template
# See also: Application.log_format
# c.JupyterHub.log_format = '[%(name)s]%(highlevel)s %(message)s'
## Set the log level by value or name.
# See also: Application.log_level
# c.JupyterHub.log_level = 30
##
# See also: Application.logging_config
# c.JupyterHub.logging_config = {}
## Specify path to a logo image to override the Jupyter logo in the banner.
# Default: ''
# c.JupyterHub.logo_file = ''
## Maximum number of concurrent named servers that can be created by a user at a
# time.
#
# Setting this can limit the total resources a user can consume.
#
# If set to 0, no limit is enforced.
#
# Can be an integer or a callable/awaitable based on the handler object:
#
# ::
#
# def named_server_limit_per_user_fn(handler):
# user = handler.current_user
# if user and user.admin:
# return 0
# return 5
#
# c.JupyterHub.named_server_limit_per_user = named_server_limit_per_user_fn
# Default: 0
# c.JupyterHub.named_server_limit_per_user = 0
## Expiry (in seconds) of OAuth access tokens.
#
# The default is to expire when the cookie storing them expires,
# according to `cookie_max_age_days` config.
#
# These are the tokens stored in cookies when you visit
# a single-user server or service.
# When they expire, you must re-authenticate with the Hub,
# even if your Hub authentication is still valid.
# If your Hub authentication is valid,
# logging in may be a transparent redirect as you refresh the page.
#
# This does not affect JupyterHub API tokens in general,
# which do not expire by default.
# Only tokens issued during the oauth flow
# accessing services and single-user servers are affected.
#
# .. versionadded:: 1.4
# OAuth token expires_in was not previously configurable.
# .. versionchanged:: 1.4
# Default now uses cookie_max_age_days so that oauth tokens
# which are generally stored in cookies,
# expire when the cookies storing them expire.
# Previously, it was one hour.
# Default: 0
# c.JupyterHub.oauth_token_expires_in = 0
## File to write PID
# Useful for daemonizing JupyterHub.
# Default: ''
# c.JupyterHub.pid_file = ''
## The public facing port of the proxy.
#
# This is the port on which the proxy will listen.
# This is the only port through which JupyterHub
# should be accessed by users.
# Default: 8000
# c.JupyterHub.port = 8000
## DEPRECATED since version 0.8 : Use ConfigurableHTTPProxy.api_url
# Default: ''
# c.JupyterHub.proxy_api_ip = ''
## DEPRECATED since version 0.8 : Use ConfigurableHTTPProxy.api_url
# Default: 0
# c.JupyterHub.proxy_api_port = 0
## DEPRECATED since version 0.8: Use ConfigurableHTTPProxy.auth_token
# Default: ''
# c.JupyterHub.proxy_auth_token = ''
## DEPRECATED since version 0.8: Use ConfigurableHTTPProxy.check_running_interval
# Default: 5
# c.JupyterHub.proxy_check_interval = 5
## The class to use for configuring the JupyterHub proxy.
#
# Should be a subclass of :class:`jupyterhub.proxy.Proxy`.
#
# .. versionchanged:: 1.0
# proxies may be registered via entry points,
# e.g. `c.JupyterHub.proxy_class = 'traefik'`
#
# Currently installed:
# - configurable-http-proxy: jupyterhub.proxy.ConfigurableHTTPProxy
# - default: jupyterhub.proxy.ConfigurableHTTPProxy
# Default: 'jupyterhub.proxy.ConfigurableHTTPProxy'
# c.JupyterHub.proxy_class = 'jupyterhub.proxy.ConfigurableHTTPProxy'
## DEPRECATED since version 0.8. Use ConfigurableHTTPProxy.command
# Default: []
# c.JupyterHub.proxy_cmd = []
## Set the public URL of JupyterHub
#
# This will skip any detection of URL and protocol from requests,
# which isn't always correct when JupyterHub is behind
# multiple layers of proxies, etc.
# Usually the failure is detecting http when it's really https.
#
# Should include the full, public URL of JupyterHub,
# including the public-facing base_url prefix
# (i.e. it should include a trailing slash), e.g.
# https://jupyterhub.example.org/prefix/
# Default: ''
# c.JupyterHub.public_url = ''
## Recreate all certificates used within JupyterHub on restart.
#
# Note: enabling this feature requires restarting all notebook servers.
#
# Use with internal_ssl
# Default: False
# c.JupyterHub.recreate_internal_certs = False
## Redirect user to server (if running), instead of control panel.
# Default: True
# c.JupyterHub.redirect_to_server = True
## Purge and reset the database.
# Default: False
# c.JupyterHub.reset_db = False
## Interval (in seconds) at which to check connectivity of services with web
# endpoints.
# Default: 60
# c.JupyterHub.service_check_interval = 60
## Dict of token:servicename to be loaded into the database.
#
# Allows ahead-of-time generation of API tokens for use by externally
# managed services.
# Default: {}
# c.JupyterHub.service_tokens = {}
## List of service specification dictionaries.
#
# A service can either be managed by the Hub (given a `command` to run) or
# externally managed (registered with a `url` and `api_token`).
#
# For instance::
#
# services = [
# {
# 'name': 'cull_idle',
# 'command': ['/path/to/cull_idle_servers.py'],
# },
# {
# 'name': 'formgrader',
# 'url': 'http://127.0.0.1:1234',
# 'api_token': 'super-secret',
# 'environment': {},
# }
# ]
# Default: []
# c.JupyterHub.services = []
## Instead of starting the Application, dump configuration to stdout
# See also: Application.show_config
# c.JupyterHub.show_config = False
## Instead of starting the Application, dump configuration to stdout (as JSON)
# See also: Application.show_config_json
# c.JupyterHub.show_config_json = False
## Shuts down all user servers on logout
# Default: False
# c.JupyterHub.shutdown_on_logout = False
## The class to use for spawning single-user servers.
#
# Should be a subclass of :class:`jupyterhub.spawner.Spawner`.
#
# .. versionchanged:: 1.0
# spawners may be registered via entry points,
# e.g. `c.JupyterHub.spawner_class = 'localprocess'`
#
# Currently installed:
# - default: jupyterhub.spawner.LocalProcessSpawner
# - localprocess: jupyterhub.spawner.LocalProcessSpawner
# - simple: jupyterhub.spawner.SimpleLocalProcessSpawner
# Default: 'jupyterhub.spawner.LocalProcessSpawner'
# c.JupyterHub.spawner_class = 'jupyterhub.spawner.LocalProcessSpawner'
## Path to SSL certificate file for the public facing interface of the proxy
#
# When setting this, you should also set ssl_key
# Default: ''
# c.JupyterHub.ssl_cert = ''
## Path to SSL key file for the public facing interface of the proxy
#
# When setting this, you should also set ssl_cert
# Default: ''
# c.JupyterHub.ssl_key = ''
## Host to send statsd metrics to. An empty string (the default) disables sending
# metrics.
# Default: ''
# c.JupyterHub.statsd_host = ''
## Port on which to send statsd metrics about the hub
# Default: 8125
# c.JupyterHub.statsd_port = 8125
## Prefix to use for all metrics sent by jupyterhub to statsd
# Default: 'jupyterhub'
# c.JupyterHub.statsd_prefix = 'jupyterhub'
## Hook for constructing subdomains for users and services. Only used when
# `JupyterHub.subdomain_host` is set.
#
# There are two predefined hooks, which can be selected by name:
#
# - 'legacy' (deprecated)
# - 'idna' (default; more robust, with no change for _most_ usernames)
#
# Otherwise, should be a function which must not be async. A custom
# subdomain_hook should have the signature:
#
# def subdomain_hook(name, domain, kind) -> str:
# ...
#
# and should return a unique, valid domain name for all usernames.
#
# - `name` is the original name, which may need escaping to be safe as a
#   domain name label
# - `domain` is the domain of the Hub itself
# - `kind` will be one of 'user' or 'service'
#
# JupyterHub itself puts very little limit on usernames to accommodate a wide
# variety of Authenticators, but your identity provider is likely much more
# strict, allowing you to make assumptions about the name.
#
# The default behavior is to have all services on a single `services.{domain}`
# subdomain, and each user on `{username}.{domain}`. This is the 'legacy'
# scheme, and doesn't work for all usernames.
#
# The 'idna' scheme is a new scheme that should produce a valid domain name for
# any user, using IDNA encoding for unicode usernames, and a truncate-and-hash
# approach for any usernames that can't be easily encoded into a domain
# component.
#
# .. versionadded:: 5.0
# Default: 'idna'
# c.JupyterHub.subdomain_hook = 'idna'
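A minimal custom hook might look like the following sketch (the escaping scheme here is illustrative, not the built-in 'idna' implementation):

```python
import hashlib
import re


def subdomain_hook(name, domain, kind):
    """Return a unique, DNS-safe subdomain for a user or service."""
    # Keep only characters that are valid in a DNS label
    label = re.sub(r'[^a-z0-9-]', '-', name.lower()).strip('-') or 'x'
    if len(label) > 20:
        # Truncate and disambiguate with a short hash of the original name
        digest = hashlib.sha256(name.encode()).hexdigest()[:8]
        label = f'{label[:20]}-{digest}'
    if kind == 'service':
        return f'{label}.services.{domain}'
    return f'{label}.{domain}'


# c.JupyterHub.subdomain_hook = subdomain_hook
```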
## Run single-user servers on subdomains of this host.
#
# This should be the full `https://hub.domain.tld[:port]`.
#
# Provides additional cross-site protections for javascript served by
# single-user servers.
#
# Requires `<username>.hub.domain.tld` to resolve to the same host as
# `hub.domain.tld`.
#
# In general, this is most easily achieved with wildcard DNS.
#
# When using SSL (i.e. always) this also requires a wildcard SSL
# certificate.
# Default: ''
# c.JupyterHub.subdomain_host = ''
## Paths to search for jinja templates, before using the default templates.
# Default: []
# c.JupyterHub.template_paths = []
## Extra variables to be passed into jinja templates.
#
# Values in dict may contain callable objects.
# If value is callable, the current user is passed as argument.
#
# Example::
#
# def callable_value(user):
# # user is generated by handlers.base.get_current_user
# with open("/tmp/file.txt", "r") as f:
# ret = f.read()
# ret = ret.replace("<username>", user.name)
# return ret
#
# c.JupyterHub.template_vars = {
# "key1": "value1",
# "key2": callable_value,
# }
# Default: {}
# c.JupyterHub.template_vars = {}
## Extra settings overrides to pass to the tornado application.
# Default: {}
# c.JupyterHub.tornado_settings = {}
## Trust user-provided tokens (via JupyterHub.service_tokens)
# to have good entropy.
#
# If you are not inserting additional tokens via configuration file,
# this flag has no effect.
#
# In JupyterHub 0.8, internally generated tokens do not
# pass through additional hashing because the hashing is costly
# and does not increase the entropy of already-good UUIDs.
#
# User-provided tokens, on the other hand, are not trusted to have good entropy by default,
# and are passed through many rounds of hashing to stretch the entropy of the key
# (i.e. user-provided tokens are treated as passwords instead of random keys).
# These keys are more costly to check.
#
# If your inserted tokens are generated by a good-quality mechanism,
# e.g. `openssl rand -hex 32`, then you can set this flag to True
# to reduce the cost of checking authentication tokens.
# Default: False
# c.JupyterHub.trust_user_provided_tokens = False
## Names to include in the subject alternative name.
#
# These names will be used for server name verification. This is useful
# if JupyterHub is being run behind a reverse proxy or services using ssl
# are on different hosts.
#
# Use with internal_ssl
# Default: []
# c.JupyterHub.trusted_alt_names = []
## Downstream proxy IP addresses to trust.
#
# This sets the list of IP addresses that are trusted and skipped when processing
# the `X-Forwarded-For` header. For example, if an external proxy is used for TLS
# termination, its IP address should be added to this list to ensure the correct
# client IP addresses are recorded in the logs instead of the proxy server's IP
# address.
# Default: []
# c.JupyterHub.trusted_downstream_ips = []
## Upgrade the database automatically on start.
#
# Only safe if database is regularly backed up.
# Only SQLite databases will be backed up to a local file automatically.
# Default: False
# c.JupyterHub.upgrade_db = False
## Return 503 rather than 424 when request comes in for a non-running server.
#
# Prior to JupyterHub 2.0, we returned a 503 when any request came in for a user
# server that was currently not running. By default, JupyterHub 2.0 will return
# a 424 - this makes operational metric dashboards more useful.
#
# JupyterLab < 3.2 expected the 503 to know if the user server is no longer
# running, and prompted the user to start their server. Set this config to true
# to retain the old behavior, so JupyterLab < 3.2 can continue to show the
# appropriate UI when the user server is stopped.
#
# This option will be removed in a future release.
# Default: False
# c.JupyterHub.use_legacy_stopped_server_status_code = False
## Callable to affect behavior of /user-redirect/
#
# Receives 4 parameters:
#
# 1. path - URL path that was provided after /user-redirect/
# 2. request - A Tornado HTTPServerRequest representing the current request.
# 3. user - The currently authenticated user.
# 4. base_url - The base_url of the current hub, for relative redirects
#
# It should return the new URL to redirect to, or None to preserve current
# behavior.
# Default: None
# c.JupyterHub.user_redirect_hook = None
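For instance, a hook could rewrite legacy notebook links to JupyterLab paths (a sketch; the `notebooks/` path convention is an assumption about your links, not JupyterHub behavior):

```python
def user_redirect_hook(path, request, user, base_url):
    """Send /user-redirect/notebooks/... links to /lab/tree/ instead."""
    if path.startswith('notebooks/'):
        relative = path[len('notebooks/'):]
        return f'{base_url}user/{user.name}/lab/tree/{relative}'
    return None  # preserve the default redirect behavior


# c.JupyterHub.user_redirect_hook = user_redirect_hook
```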
#------------------------------------------------------------------------------
# Spawner(LoggingConfigurable) configuration
#------------------------------------------------------------------------------
## Base class for spawning single-user notebook servers.
#
# Subclass this, and override the following methods:
#
# - load_state
# - get_state
# - start
# - stop
# - poll
#
# As JupyterHub supports multiple users, an instance of the Spawner subclass
# is created for each user. If there are 20 JupyterHub users, there will be 20
# instances of the subclass.
## Extra arguments to be passed to the single-user server.
#
# Some spawners allow shell-style expansion here, allowing you to use
# environment variables here. Most, including the default, do not. Consult the
# documentation for your spawner to verify!
# Default: []
# c.Spawner.args = []
## An optional hook function that you can implement to pass `auth_state` to the
# spawner after it has been initialized but before it starts. The `auth_state`
# dictionary may be set by the `.authenticate()` method of the authenticator.
# This hook enables you to pass some or all of that information to your spawner.
#
# Example::
#
# def userdata_hook(spawner, auth_state):
# spawner.userdata = auth_state["userdata"]
#
# c.Spawner.auth_state_hook = userdata_hook
# Default: None
# c.Spawner.auth_state_hook = None
## The command used for starting the single-user server.
#
# Provide either a string or a list containing the path to the startup script
# command. Extra arguments, other than this path, should be provided via `args`.
#
# This is usually set if you want to start the single-user server in a different
# python environment (with virtualenv/conda) than JupyterHub itself.
#
# Some spawners allow shell-style expansion here, allowing you to use
# environment variables. Most, including the default, do not. Consult the
# documentation for your spawner to verify!
# Default: ['jupyterhub-singleuser']
# c.Spawner.cmd = ['jupyterhub-singleuser']
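For example, to launch single-user servers from a dedicated environment (the path and flag below are illustrative):

```python
# Use the jupyterhub-singleuser entry point from a separate conda env
c.Spawner.cmd = ['/opt/conda/envs/notebook/bin/jupyterhub-singleuser']
# Extra flags belong in `args`, not in `cmd`
c.Spawner.args = ['--ServerApp.log_level=INFO']
```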
## Maximum number of consecutive failures to allow before shutting down
# JupyterHub.
#
# This helps JupyterHub recover from a certain class of problem preventing
# launch in contexts where the Hub is automatically restarted (e.g. systemd,
# docker, kubernetes).
#
# A limit of 0 means no limit and consecutive failures will not be tracked.
# Default: 0
# c.Spawner.consecutive_failure_limit = 0
## Minimum number of cpu-cores a single-user notebook server is guaranteed to
# have available.
#
# If this value is set to 0.5, the server is guaranteed 50% of one CPU. If this
# value is set to 2, the server is guaranteed 2 full CPUs.
#
# **This is a configuration setting. Your spawner must implement support for the
# limit to work.** The default spawner, `LocalProcessSpawner`, does **not**
# implement this support. A custom spawner **must** add support for this setting
# for it to be enforced.
# Default: None
# c.Spawner.cpu_guarantee = None
## Maximum number of cpu-cores a single-user notebook server is allowed to use.
#
# If this value is set to 0.5, allows use of 50% of one CPU. If this value is
# set to 2, allows use of up to 2 CPUs.
#
# The single-user notebook server will never be scheduled by the kernel to use
# more cpu-cores than this. There is no guarantee that it can access this many
# cpu-cores.
#
# **This is a configuration setting. Your spawner must implement support for the
# limit to work.** The default spawner, `LocalProcessSpawner`, does **not**
# implement this support. A custom spawner **must** add support for this setting
# for it to be enforced.
# Default: None
# c.Spawner.cpu_limit = None
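For a spawner that implements both settings (e.g. container-based spawners; the default `LocalProcessSpawner` does not), a sketch:

```python
# jupyterhub_config.py -- only enforced if your Spawner implements it.
c.Spawner.cpu_guarantee = 0.5  # reserve at least half of one CPU core
c.Spawner.cpu_limit = 2.0      # never scheduled on more than two CPU cores
```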
## Enable debug-logging of the single-user server
# Default: False
# c.Spawner.debug = False
## The URL the single-user server should start in.
#
# `{username}` will be expanded to the user's username
#
# Example uses:
#
# - You can set `notebook_dir` to `/` and `default_url` to `/tree/home/{username}` to allow people to
# navigate the whole filesystem from their notebook server, but still start in their home directory.
# - Start with `/notebooks` instead of `/tree` if `default_url` points to a notebook instead of a directory.
# - You can set this to `/lab` to have JupyterLab start by default, rather than Jupyter Notebook.
# Default: ''
# c.Spawner.default_url = ''
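The first example use above can be sketched as a config fragment (`c` is provided by JupyterHub when loading the config file):

```python
# Let users browse the whole filesystem from their server,
# but start them in their own home directory.
c.Spawner.notebook_dir = '/'
c.Spawner.default_url = '/tree/home/{username}'
```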
## Disable per-user configuration of single-user servers.
#
# When starting the user's single-user server, any config file found in the
# user's $HOME directory will be ignored.
#
# Note: a user could circumvent this if the user modifies their Python
# environment, such as when they have their own conda environments / virtualenvs
# / containers.
# Default: False
# c.Spawner.disable_user_config = False
## List of environment variables for the single-user server to inherit from the
# JupyterHub process.
#
# This list is used to ensure that sensitive information in the JupyterHub
# process's environment (such as `CONFIGPROXY_AUTH_TOKEN`) is not passed to the
# single-user server's process.
# Default: ['PATH', 'PYTHONPATH', 'CONDA_ROOT', 'CONDA_DEFAULT_ENV', 'VIRTUAL_ENV', 'LANG', 'LC_ALL', 'JUPYTERHUB_SINGLEUSER_APP']
# c.Spawner.env_keep = ['PATH', 'PYTHONPATH', 'CONDA_ROOT', 'CONDA_DEFAULT_ENV', 'VIRTUAL_ENV', 'LANG', 'LC_ALL', 'JUPYTERHUB_SINGLEUSER_APP']
## Extra environment variables to set for the single-user server's process.
#
# Environment variables that end up in the single-user server's process come from 3 sources:
# - This `environment` configurable
# - The JupyterHub process' environment variables that are listed in `env_keep`
# - Variables to establish contact between the single-user notebook and the hub (such as JUPYTERHUB_API_TOKEN)
#
# The `environment` configurable should be set by JupyterHub administrators to
# add installation specific environment variables. It is a dict where the key is
# the name of the environment variable, and the value can be a string or a
# callable. If it is a callable, it will be called with one parameter (the
# spawner instance), and should return a string fairly quickly (no blocking
# operations please!).
#
# Note that the spawner class' interface is not guaranteed to be exactly same
# across upgrades, so if you are using the callable take care to verify it
# continues to work after upgrades!
#
# .. versionchanged:: 1.2
# environment from this configuration has highest priority,
# allowing override of 'default' env variables,
# such as JUPYTERHUB_API_URL.
# Default: {}
# c.Spawner.environment = {}
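A sketch combining the points above: one plain string value and one callable resolved at spawn time. The variable names and helper are illustrative, not part of JupyterHub's API:

```python
# jupyterhub_config.py -- `c` is provided by JupyterHub when this file loads.

def mem_limit_hint(spawner):
    # Called with the spawner instance; must return a string quickly
    # (no blocking operations).
    return str(spawner.mem_limit or "unset")

c.Spawner.environment = {
    "DEPLOYMENT": "staging",           # plain string value
    "MEM_LIMIT_HINT": mem_limit_hint,  # callable, evaluated per spawn
}
```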
## Timeout (in seconds) before giving up on a spawned HTTP server
#
# Once a server has successfully been spawned, this is the amount of time we
# wait before assuming that the server is unable to accept connections.
# Default: 30
# c.Spawner.http_timeout = 30
## The URL the single-user server should use to connect to the Hub.
#
# If the Hub URL set in your JupyterHub config is not reachable from spawned
# notebooks, you can set a different URL with this config.
#
# Leave as None if you don't need to change the URL.
# Default: None
# c.Spawner.hub_connect_url = None
## The IP address (or hostname) the single-user server should listen on.
#
# Usually either '127.0.0.1' (default) or '0.0.0.0'.
#
# The JupyterHub proxy implementation should be able to send packets to this
# interface.
#
# Subclasses which launch remotely or in containers should override the default
# to '0.0.0.0'.
#
# .. versionchanged:: 2.0
# Default changed to '127.0.0.1', from ''.
# In most cases, this does not result in a change in behavior,
# as '' was interpreted as 'unspecified',
# which used the subprocesses' own default, itself usually '127.0.0.1'.
# Default: '127.0.0.1'
# c.Spawner.ip = '127.0.0.1'
## Minimum number of bytes a single-user notebook server is guaranteed to have
# available.
#
# Allows the following suffixes:
# - K -> Kilobytes
# - M -> Megabytes
# - G -> Gigabytes
# - T -> Terabytes
#
# **This is a configuration setting. Your spawner must implement support for the
# limit to work.** The default spawner, `LocalProcessSpawner`, does **not**
# implement this support. A custom spawner **must** add support for this setting
# for it to be enforced.
# Default: None
# c.Spawner.mem_guarantee = None
## Maximum number of bytes a single-user notebook server is allowed to use.
#
# Allows the following suffixes:
# - K -> Kilobytes
# - M -> Megabytes
# - G -> Gigabytes
# - T -> Terabytes
#
# If the single user server tries to allocate more memory than this, it will
# fail. There is no guarantee that the single-user notebook server will be able
# to allocate this much memory - only that it can not allocate more than this.
#
# **This is a configuration setting. Your spawner must implement support for the
# limit to work.** The default spawner, `LocalProcessSpawner`, does **not**
# implement this support. A custom spawner **must** add support for this setting
# for it to be enforced.
# Default: None
# c.Spawner.mem_limit = None
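With the suffixes above, a sketch for a spawner that implements memory enforcement (the default `LocalProcessSpawner` does not):

```python
# jupyterhub_config.py -- only enforced if your Spawner implements it.
c.Spawner.mem_guarantee = '512M'  # reserve at least 512 Megabytes
c.Spawner.mem_limit = '2G'        # allocations beyond 2 Gigabytes fail
```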
## Path to the notebook directory for the single-user server.
#
# The user sees a file listing of this directory when the notebook interface is
# started. The current interface does not easily allow browsing beyond the
# subdirectories in this directory's tree.
#
# `~` will be expanded to the home directory of the user, and {username} will be
# replaced with the name of the user.
#
# Note that this does *not* prevent users from accessing files outside of this
# path! They can do so with many other means.
# Default: ''
# c.Spawner.notebook_dir = ''
## Allowed scopes for oauth tokens issued by this server's oauth client.
#
# This sets the maximum and default scopes
# assigned to oauth tokens issued by a single-user server's
# oauth client (i.e. tokens stored in browsers after authenticating with the server),
# defining what actions the server can take on behalf of logged-in users.
#
# Default is an empty list, meaning minimal permissions to identify users,
# no actions can be taken on their behalf.
#
# If callable, will be called with the Spawner as a single argument.
# Callables may be async.
# Default: traitlets.Undefined
# c.Spawner.oauth_client_allowed_scopes = traitlets.Undefined
## Allowed roles for oauth tokens.
#
# Deprecated in 3.0: use oauth_client_allowed_scopes
#
# This sets the maximum and default roles
# assigned to oauth tokens issued by a single-user server's
# oauth client (i.e. tokens stored in browsers after authenticating with the server),
# defining what actions the server can take on behalf of logged-in users.
#
# Default is an empty list, meaning minimal permissions to identify users,
# no actions can be taken on their behalf.
# Default: traitlets.Undefined
# c.Spawner.oauth_roles = traitlets.Undefined
## An HTML form for options a user can specify on launching their server.
#
# The surrounding `<form>` element and the submit button are already provided.
#
# For example:
#
# .. code:: html
#
#     Set your key:
#     <input name="key" value="default_key">
#     <br>
#     Choose a letter:
#     <select name="letter" multiple="true">
#       <option value="A">The letter A</option>
#       <option value="B">The letter B</option>
#     </select>
#
# The data from this form submission will be passed on to your spawner in
# `self.user_options`
#
# Instead of a form snippet string, this could also be a callable that takes as
# one parameter the current spawner instance and returns a string. The callable
# will be called asynchronously if it returns a future, rather than a str. Note
# that the interface of the spawner class is not deemed stable across versions,
# so using this functionality might cause your JupyterHub upgrades to break.
# Default: traitlets.Undefined
# c.Spawner.options_form = traitlets.Undefined
## Interpret HTTP form data
#
# Form data will always arrive as a dict of lists of strings. Override this
# function to understand single-values, numbers, etc.
#
# This should coerce form data into the structure expected by self.user_options,
# which must be a dict, and should be JSON-serializable, though it can contain
# bytes in addition to standard JSON data types.
#
# This method should not have any side effects. Any handling of `user_options`
# should be done in `.start()` to ensure consistent behavior across servers
# spawned via the API and form submission page.
#
# Instances will receive this data on self.user_options, after passing through
# this function, prior to `Spawner.start`.
#
# .. versionchanged:: 1.0
# user_options are persisted in the JupyterHub database to be reused
# on subsequent spawns if no options are given.
# user_options is serialized to JSON as part of this persistence
# (with additional support for bytes in case of uploaded file data),
# and any non-bytes non-jsonable values will be replaced with None
# if the user_options are re-used.
# Default: traitlets.Undefined
# c.Spawner.options_from_form = traitlets.Undefined
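As a sketch of the coercion described above, assuming the illustrative `key`/`letter` fields from the `options_form` example earlier in this file:

```python
def options_from_form(formdata):
    """Coerce dict-of-lists form data into the structure for user_options.

    Form data always arrives as {str: [str, ...]}; user_options should be
    a JSON-serializable dict.
    """
    options = {}
    # A multi-select keeps its full list of values.
    options['letters'] = formdata.get('letter', [])
    # A single text input is unwrapped to its one value.
    options['key'] = formdata.get('key', ['default_key'])[0]
    return options

# In jupyterhub_config.py (`c` is provided by JupyterHub):
# c.Spawner.options_from_form = options_from_form
```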
## Interval (in seconds) on which to poll the spawner for single-user server's
# status.
#
# At every poll interval, each spawner's `.poll` method is called, which checks
# if the single-user server is still running. If it isn't running, then
# JupyterHub modifies its own state accordingly and removes appropriate routes
# from the configurable proxy.
# Default: 30
# c.Spawner.poll_interval = 30
## Jitter fraction for poll_interval.
#
# Avoids alignment of poll calls for many Spawners, e.g. when restarting
# JupyterHub, which restarts all polls for running Spawners.
#
# `poll_jitter=0` means no jitter, 0.1 means 10%, etc.
# Default: 0.1
# c.Spawner.poll_jitter = 0.1
## The port for single-user servers to listen on.
#
# Defaults to `0`, which uses a randomly allocated port number each time.
#
# If set to a non-zero value, all Spawners will use the same port, which only
# makes sense if each server is on a different address, e.g. in containers.
#
# New in version 0.7.
# Default: 0
# c.Spawner.port = 0
## An optional hook function that you can implement to do work after the spawner
# stops.
#
# This can be set independent of any concrete spawner implementation.
# Default: None
# c.Spawner.post_stop_hook = None
## An optional hook function that you can implement to do some bootstrapping work
# before the spawner starts. For example, create a directory for your user or
# load initial content.
#
# This can be set independent of any concrete spawner implementation.
#
# This may be a coroutine.
#
# Example::
#
#     def my_hook(spawner):
#         username = spawner.user.name
#         spawner.environment["GREETING"] = f"Hello {username}"
#
#     c.Spawner.pre_spawn_hook = my_hook
# Default: None
# c.Spawner.pre_spawn_hook = None
## An optional hook function that you can implement to modify the ready event,
# which will be shown to the user on the spawn progress page when their server
# is ready.
#
# This can be set independent of any concrete spawner implementation.
#
# This may be a coroutine.
#
# Example::
#
#     async def my_ready_hook(spawner, ready_event):
#         ready_event["html_message"] = f"Server {spawner.name} is ready for {spawner.user.name}"
#         return ready_event
#
#     c.Spawner.progress_ready_hook = my_ready_hook
# Default: None
# c.Spawner.progress_ready_hook = None
## The list of scopes to request for $JUPYTERHUB_API_TOKEN
#
# If not specified, the scopes in the `server` role will be used
# (unchanged from pre-4.0).
#
# If callable, will be called with the Spawner instance as its sole argument
# (JupyterHub user available as spawner.user).
#
# JUPYTERHUB_API_TOKEN will be assigned the _subset_ of these scopes
# that are held by the user (as in oauth_client_allowed_scopes).
#
# .. versionadded:: 4.0
# Default: traitlets.Undefined
# c.Spawner.server_token_scopes = traitlets.Undefined
## List of SSL alt names
#
# May be set in config if all spawners should have the same value(s),
# or set at runtime by Spawners that know their names.
# Default: []
# c.Spawner.ssl_alt_names = []
## Whether to include `DNS:localhost`, `IP:127.0.0.1` in alt names
# Default: True
# c.Spawner.ssl_alt_names_include_local = True
## Timeout (in seconds) before giving up on starting the single-user server.
#
# This is the timeout for start to return, not the timeout for the server to
# respond. Callers of spawner.start will assume that startup has failed if it
# takes longer than this. start should return when the server process is started
# and its location is known.
# Default: 60
# c.Spawner.start_timeout = 60
#------------------------------------------------------------------------------
# Authenticator(LoggingConfigurable) configuration
#------------------------------------------------------------------------------
## Base class for implementing an authentication provider for JupyterHub
## Set of users that will have admin rights on this JupyterHub.
#
# Note: As of JupyterHub 2.0, full admin rights should not be required, and more
# precise permissions can be managed via roles.
#
# Admin users have extra privileges:
# - Use the admin panel to see list of users logged in
# - Add / remove users in some authenticators
# - Restart / halt the hub
# - Start / stop users' single-user servers
# - Can access each individual users' single-user server (if configured)
#
# Admin access should be treated the same way root access is.
#
# Defaults to an empty set, in which case no user has admin access.
# Default: set()
# c.Authenticator.admin_users = set()
## Allow every user who can successfully authenticate to access JupyterHub.
#
# False by default, which means for most Authenticators, _some_ allow-related
# configuration is required to allow users to log in.
#
# Authenticator subclasses may override the default with e.g.::
#
#     @default("allow_all")
#     def _default_allow_all(self):
#         # if _any_ auth config (depends on the Authenticator)
#         if self.allowed_users or self.allowed_groups or self.allow_existing_users:
#             return False
#         else:
#             return True
#
# .. versionadded:: 5.0
#
# .. versionchanged:: 5.0
# Prior to 5.0, `allow_all` wasn't defined on its own,
# and was instead implicitly True when no allow config was provided,
# i.e. `allowed_users` unspecified or empty on the base Authenticator class.
#
# To preserve pre-5.0 behavior,
# set `allow_all = True` if you have no other allow configuration.
# Default: False
# c.Authenticator.allow_all = False
## Allow existing users to login.
#
# Defaults to True if `allowed_users` is set for historical reasons, and False
# otherwise.
#
# With this enabled, all users present in the JupyterHub database are allowed to
# login. This means that any user who has _previously_ been allowed to login
# by any means will continue to be allowed until the user is deleted via the
# /hub/admin page or REST API.
#
# .. warning::
#
# Before enabling this you should review the existing users in the
# JupyterHub admin panel at `/hub/admin`. You may find users there
# because they were previously declared in config such as
# `allowed_users`, or were otherwise allowed to sign in.
#
# .. warning::
#
# When this is enabled and you wish to remove access for one or more
# users previously allowed, you must make sure that they
# are removed from the jupyterhub database. This can be tricky to do
# if you stop allowing an externally managed group of users for example.
#
# With this enabled, JupyterHub admin users can visit `/hub/admin` or use
# JupyterHub's REST API to add and remove users to manage who can login.
#
# .. versionadded:: 5.0
# Default: False
# c.Authenticator.allow_existing_users = False
## Set of usernames that are allowed to log in.
#
# Use this to limit which authenticated users may login. Default behavior: only
# users in this set are allowed.
#
# If empty, does not perform any restriction, in which case any authenticated
# user is allowed.
#
# Authenticators may extend :meth:`.Authenticator.check_allowed` to combine
# `allowed_users` with other configuration to either expand or restrict access.
#
# .. versionchanged:: 1.2
# `Authenticator.whitelist` renamed to `allowed_users`
# Default: set()
# c.Authenticator.allowed_users = set()
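A minimal allow configuration combining the settings above (usernames are illustrative):

```python
# jupyterhub_config.py -- `c` is provided by JupyterHub when this file loads.
c.Authenticator.allowed_users = {'alice', 'bob'}
c.Authenticator.admin_users = {'alice'}
# Alternatively, to restore the pre-5.0 "anyone who authenticates" behavior:
# c.Authenticator.allow_all = True
```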
## Is there any allow config?
#
# Used to show a warning if it looks like nobody can access the Hub,
# which can happen when upgrading to JupyterHub 5,
# now that `allow_all` defaults to False.
#
# Deployments can set this explicitly to True to suppress
# the "No allow config found" warning.
#
# Will be True if any config tagged with `.tag(allow_config=True)`,
# or whose name starts with `allow`, is truthy.
#
# .. versionadded:: 5.0
# Default: False
# c.Authenticator.any_allow_config = False
## The max age (in seconds) of authentication info
# before forcing a refresh of user auth info.
#
# Refreshing auth info allows, e.g. requesting/re-validating auth
# tokens.
#
# See :meth:`.refresh_user` for what happens when user auth info is refreshed
# (nothing by default).
# Default: 300
# c.Authenticator.auth_refresh_age = 300
## Automatically begin the login process
#
# rather than starting with a "Login with..." link at `/hub/login`
#
# To work, `.login_url()` must give a URL other than the default `/hub/login`,
# such as an oauth handler or another automatic login handler,
# registered with `.get_handlers()`.
#
# .. versionadded:: 0.8
# Default: False
# c.Authenticator.auto_login = False
## Automatically begin login process for OAuth2 authorization requests
#
# When another application is using JupyterHub as OAuth2 provider, it sends
# users to `/hub/api/oauth2/authorize`. If the user isn't logged in already, and
# auto_login is not set, the user will be dumped on the hub's home page, without
# any context on what to do next.
#
# Setting this to true will automatically redirect users to login if they aren't
# logged in *only* on the `/hub/api/oauth2/authorize` endpoint.
#
# .. versionadded:: 1.5
# Default: False
# c.Authenticator.auto_login_oauth2_authorize = False
## Set of usernames that are not allowed to log in.
#
# Use this with supported authenticators to block specific users from logging
# in. This is an additional block list that further restricts users, beyond
# whatever restrictions the authenticator has in place.
#
# If empty, does not perform any additional restriction.
#
# .. versionadded:: 0.9
#
# .. versionchanged:: 1.2
# `Authenticator.blacklist` renamed to `blocked_users`
# Default: set()
# c.Authenticator.blocked_users = set()
## Delete any users from the database that do not pass validation
#
# When JupyterHub starts, `.add_user` will be called
# on each user in the database to verify that all users are still valid.
#
# If `delete_invalid_users` is True,
# any users that do not pass validation will be deleted from the database.
# Use this if users might be deleted from an external system,
# such as local user accounts.
#
# If False (default), invalid users remain in the Hub's database
# and a warning will be issued.
# This is the default to avoid data loss due to config changes.
# Default: False
# c.Authenticator.delete_invalid_users = False
## Enable persisting auth_state (if available).
#
# auth_state will be encrypted and stored in the Hub's database.
# This can include things like authentication tokens, etc.
# to be passed to Spawners as environment variables.
#
# Encrypting auth_state requires the cryptography package.
#
# Additionally, the JUPYTERHUB_CRYPT_KEY environment variable must
# contain one (or more, separated by ;) 32B encryption keys.
# These can be either base64 or hex-encoded.
#
# If encryption is unavailable, auth_state cannot be persisted.
#
# New in JupyterHub 0.8
# Default: False
# c.Authenticator.enable_auth_state = False
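The key format described above can be satisfied with Python's standard library; a minimal sketch for generating a suitable key (the shell export shown in comments is one common way to supply it):

```python
import secrets

def generate_crypt_key():
    """Return a hex-encoded 32-byte key suitable for JUPYTERHUB_CRYPT_KEY."""
    return secrets.token_hex(32)  # 64 hex characters encode 32 bytes

if __name__ == "__main__":
    print(generate_crypt_key())
    # Supply it to the Hub's environment before starting, e.g.:
    #   export JUPYTERHUB_CRYPT_KEY=<generated key>
    # then set `c.Authenticator.enable_auth_state = True` in jupyterhub_config.py.
```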
## Let authenticator manage user groups
#
# If True, Authenticator.authenticate and/or .refresh_user
# may return a list of group names in the 'groups' field,
# which will be assigned to the user.
#
# All group-assignment APIs are disabled if this is True.
# Default: False
# c.Authenticator.manage_groups = False
## Let authenticator manage roles
#
# If True, Authenticator.authenticate and/or .refresh_user
# may return a list of roles in the 'roles' field,
# which will be added to the database.
#
# When enabled, all role management will be handled by the
# authenticator; in particular, assignment of roles via
# `JupyterHub.load_roles` traitlet will not be possible.
#
# .. versionadded:: 5.0
# Default: False
# c.Authenticator.manage_roles = False
## The prompt string for the extra OTP (One Time Password) field.
#
# .. versionadded:: 5.0
# Default: 'OTP:'
# c.Authenticator.otp_prompt = 'OTP:'
## An optional hook function that you can implement to do some bootstrapping work
# during authentication. For example, loading user account details from an
# external system.
#
# This function is called after the user has passed all authentication checks
# and is ready to successfully authenticate. It must return the auth_model
# dict regardless of changes to it. The hook is called with 3 positional
# arguments: `(authenticator, handler, auth_model)`.
#
# This may be a coroutine.
#
# .. versionadded: 1.0
#
# Example::
#
#     import os
#     import pwd
#
#     def my_hook(authenticator, handler, auth_model):
#         user_data = pwd.getpwnam(auth_model['name'])
#         spawn_data = {
#             'pw_data': user_data,
#             'gid_list': os.getgrouplist(auth_model['name'], user_data.pw_gid)
#         }
#
#         if auth_model['auth_state'] is None:
#             auth_model['auth_state'] = {}
#         auth_model['auth_state']['spawn_data'] = spawn_data
#
#         return auth_model
#
#     c.Authenticator.post_auth_hook = my_hook
# Default: None
# c.Authenticator.post_auth_hook = None
## Force refresh of auth prior to spawn.
#
# This forces :meth:`.refresh_user` to be called prior to launching
# a server, to ensure that auth state is up-to-date.
#
# This can be important when e.g. auth tokens that may have expired
# are passed to the spawner via environment variables from auth_state.
#
# If refresh_user cannot refresh the user auth data,
# launch will fail until the user logs in again.
# Default: False
# c.Authenticator.refresh_pre_spawn = False
## Prompt for OTP (One Time Password) in the login form.
#
# .. versionadded:: 5.0
# Default: False
# c.Authenticator.request_otp = False
## Reset managed roles to result of `load_managed_roles()` on startup.
#
# If True:
# - stale managed roles will be removed,
# - stale assignments to managed roles will be removed.
#
# Any role not present in `load_managed_roles()` will be considered
# 'stale'.
#
# The 'stale' status for role assignments is also determined from
# `load_managed_roles()` result:
#
# - user role assignments status will depend on whether the `users` key
# is defined or not:
#
# * if a list is defined under the `users` key and the user is not listed, then the user role assignment will be considered 'stale',
# * if the `users` key is not provided, the user role assignment will be preserved;
# - service and group role assignments will be considered 'stale':
#
# * if not included in the `services` and `groups` list,
# * if the `services` and `groups` keys are not provided.
#
# .. versionadded:: 5.0
# Default: False
# c.Authenticator.reset_managed_roles_on_startup = False
## Dictionary mapping authenticator usernames to JupyterHub users.
#
# Primarily used to normalize OAuth user names to local users.
# Default: {}
# c.Authenticator.username_map = {}
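For example, to normalize provider-reported usernames (e.g. email-style OAuth names) to local users; the mappings are illustrative:

```python
# jupyterhub_config.py -- `c` is provided by JupyterHub when this file loads.
c.Authenticator.username_map = {
    'alice@example.com': 'alice',
    'bob@example.com': 'bob',
}
```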
## Regular expression pattern that all valid usernames must match.
#
# If a username does not match the pattern specified here, authentication will
# not be attempted.
#
# If not set, allow any username.
# Default: ''
# c.Authenticator.username_pattern = ''
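For instance, to only attempt authentication for lowercase alphanumeric usernames (an illustrative pattern, not a JupyterHub default):

```python
# jupyterhub_config.py -- `c` is provided by JupyterHub when this file loads.
c.Authenticator.username_pattern = r'^[a-z][a-z0-9]*$'
```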
## Deprecated, use `Authenticator.allowed_users`
# Default: set()
# c.Authenticator.whitelist = set()
#------------------------------------------------------------------------------
# CryptKeeper(SingletonConfigurable) configuration
#------------------------------------------------------------------------------
## Encapsulate encryption configuration
#
# Use via the encryption_config singleton below.
# Default: []
# c.CryptKeeper.keys = []
## The number of threads to allocate for encryption
# Default: 2
# c.CryptKeeper.n_threads = 2
JupyterHub help command output#
This section contains the output of the command `jupyterhub --help-all`.
Start a multi-user Jupyter Notebook server
Spawns a configurable-http-proxy and multi-user Hub,
which authenticates users and spawns single-user Notebook servers
on behalf of users.
Subcommands
===========
Subcommands are launched as `jupyterhub cmd [args]`. For information on using
subcommand 'cmd', do: `jupyterhub cmd -h`.
token
Generate an API token for a user
upgrade-db
Upgrade your JupyterHub state database to the current version.
Options
=======
The options below are convenience aliases to configurable class-options,
as listed in the "Equivalent to" description-line of the aliases.
To see all configurable class-options for some <cmd>, use:
<cmd> --help-all
--debug
set log level to logging.DEBUG (maximize logging output)
Equivalent to: [--Application.log_level=10]
--show-config
Show the application's configuration (human-readable format)
Equivalent to: [--Application.show_config=True]
--show-config-json
Show the application's configuration (json format)
Equivalent to: [--Application.show_config_json=True]
--generate-config
generate default config file
Equivalent to: [--JupyterHub.generate_config=True]
--generate-certs
generate certificates used for internal ssl
Equivalent to: [--JupyterHub.generate_certs=True]
--no-db
disable persisting state database to disk
Equivalent to: [--JupyterHub.db_url=sqlite:///:memory:]
--upgrade-db
Automatically upgrade the database if needed on startup.
Only safe if the database has been backed up.
Only SQLite database files will be backed up automatically.
Equivalent to: [--JupyterHub.upgrade_db=True]
--no-ssl
[DEPRECATED in 0.7: does nothing]
Equivalent to: [--JupyterHub.confirm_no_ssl=True]
--base-url=<URLPrefix>
The base URL of the entire application.
Add this to the beginning of all JupyterHub URLs.
Use base_url to run JupyterHub within an existing website.
Default: '/'
Equivalent to: [--JupyterHub.base_url]
-y=<Bool>
Answer yes to any questions (e.g. confirm overwrite)
Default: False
Equivalent to: [--JupyterHub.answer_yes]
--ssl-key=<Unicode>
Path to SSL key file for the public facing interface of the proxy
When setting this, you should also set ssl_cert
Default: ''
Equivalent to: [--JupyterHub.ssl_key]
--ssl-cert=<Unicode>
Path to SSL certificate file for the public facing interface of the proxy
When setting this, you should also set ssl_key
Default: ''
Equivalent to: [--JupyterHub.ssl_cert]
--url=<Unicode>
The public facing URL of the whole JupyterHub application.
This is the address on which the proxy will bind.
Sets protocol, ip, base_url
Default: 'http://:8000'
Equivalent to: [--JupyterHub.bind_url]
--ip=<Unicode>
The public facing ip of the whole JupyterHub application
(specifically referred to as the proxy).
This is the address on which the proxy will listen. The default is to
listen on all interfaces. This is the only address through which JupyterHub
should be accessed by users.
Default: ''
Equivalent to: [--JupyterHub.ip]
--port=<Int>
The public facing port of the proxy.
This is the port on which the proxy will listen.
This is the only port through which JupyterHub
should be accessed by users.
Default: 8000
Equivalent to: [--JupyterHub.port]
--pid-file=<Unicode>
File to write PID
Useful for daemonizing JupyterHub.
Default: ''
Equivalent to: [--JupyterHub.pid_file]
--log-file=<Unicode>
DEPRECATED: use output redirection instead, e.g.
jupyterhub &>> /var/log/jupyterhub.log
Default: ''
Equivalent to: [--JupyterHub.extra_log_file]
--log-level=<Enum>
Set the log level by value or name.
Choices: any of [0, 10, 20, 30, 40, 50, 'DEBUG', 'INFO', 'WARN', 'ERROR', 'CRITICAL']
Default: 30
Equivalent to: [--Application.log_level]
-f=<Unicode>
The config file to load
Default: 'jupyterhub_config.py'
Equivalent to: [--JupyterHub.config_file]
--config=<Unicode>
The config file to load
Default: 'jupyterhub_config.py'
Equivalent to: [--JupyterHub.config_file]
--db=<Unicode>
url for the database. e.g. `sqlite:///jupyterhub.sqlite`
Default: 'sqlite:///jupyterhub.sqlite'
Equivalent to: [--JupyterHub.db_url]
Class options
=============
The command-line option below sets the respective configurable class-parameter:
--Class.parameter=value
This line is evaluated in Python, so simple expressions are allowed.
For instance, to set `C.a=[0,1,2]`, you may type this:
--C.a='range(3)'
Application(SingletonConfigurable) options
------------------------------------------
--Application.log_datefmt=<Unicode>
The date format used by logging formatters for %(asctime)s
Default: '%Y-%m-%d %H:%M:%S'
--Application.log_format=<Unicode>
The Logging format template
Default: '[%(name)s]%(highlevel)s %(message)s'
--Application.log_level=<Enum>
Set the log level by value or name.
Choices: any of [0, 10, 20, 30, 40, 50, 'DEBUG', 'INFO', 'WARN', 'ERROR', 'CRITICAL']
Default: 30
--Application.logging_config=<key-1>=<value-1>...
Configure additional log handlers.
The default stderr logs handler is configured by the log_level, log_datefmt
and log_format settings.
This configuration can be used to configure additional handlers (e.g. to
output the log to a file) or for finer control over the default handlers.
If provided this should be a logging configuration dictionary; for more
information see:
https://docs.python.org/3/library/logging.config.html#logging-config-dictschema
This dictionary is merged with the base logging configuration which defines
the following:
* A logging formatter intended for interactive use called ``console``.
* A logging handler that writes to stderr called ``console`` which uses the
  formatter ``console``.
* A logger with the name of this application set to ``DEBUG`` level.
This example adds a new handler that writes to a file:
.. code-block:: python

    c.Application.logging_config = {
        "handlers": {
            "file": {
                "class": "logging.FileHandler",
                "level": "DEBUG",
                "filename": "<path/to/file>",
            }
        },
        "loggers": {
            "<application-name>": {
                "level": "DEBUG",
                # NOTE: if you don't list the default "console"
                # handler here then it will be disabled
                "handlers": ["console", "file"],
            },
        },
    }
Default: {}
--Application.show_config=<Bool>
Instead of starting the Application, dump configuration to stdout
Default: False
--Application.show_config_json=<Bool>
Instead of starting the Application, dump configuration to stdout (as JSON)
Default: False
JupyterHub(Application) options
-------------------------------
--JupyterHub.active_server_limit=<Int>
Maximum number of concurrent servers that can be active at a time.
Setting this can limit the total resources your users can consume.
An active server is any server that's not fully stopped. It is considered
active from the time it has been requested until the time that it has
completely stopped.
If this many user servers are active, users will not be able to launch new
servers until a server is shutdown. Spawn requests will be rejected with a
429 error asking them to try again.
If set to 0, no limit is enforced.
Default: 0
--JupyterHub.active_user_window=<Int>
Duration (in seconds) to determine the number of active users.
Default: 1800
--JupyterHub.activity_resolution=<Int>
Resolution (in seconds) for updating activity
If activity is registered that is less than activity_resolution seconds more
recent than the current value, the new value will be ignored.
This avoids too many writes to the Hub database.
Default: 30
--JupyterHub.admin_access=<Bool>
DEPRECATED since version 2.0.0.
The default admin role has full permissions, use custom RBAC scopes instead to
create restricted administrator roles.
https://jupyterhub.readthedocs.io/en/stable/rbac/index.html
Default: False
--JupyterHub.admin_users=<set-item-1>...
DEPRECATED since version 0.7.2, use Authenticator.admin_users instead.
Default: set()
--JupyterHub.allow_named_servers=<Bool>
Allow named single-user servers per user
Default: False
--JupyterHub.answer_yes=<Bool>
Answer yes to any questions (e.g. confirm overwrite)
Default: False
--JupyterHub.api_page_default_limit=<Int>
The default number of records returned by a paginated endpoint
Default: 50
--JupyterHub.api_page_max_limit=<Int>
The maximum number of records that can be returned at once
Default: 200
--JupyterHub.api_tokens=<key-1>=<value-1>...
PENDING DEPRECATION: consider using services
Dict of token:username to be loaded into the database.
Allows ahead-of-time generation of API tokens for use by externally managed services,
which authenticate as JupyterHub users.
Consider using services for general services that talk to the
JupyterHub API.
Default: {}
--JupyterHub.authenticate_prometheus=<Bool>
Authentication for prometheus metrics
Default: True
--JupyterHub.authenticator_class=<EntryPointType>
Class for authenticating users.
This should be a subclass of :class:`jupyterhub.auth.Authenticator`
with an :meth:`authenticate` method that:
- is a coroutine (asyncio or tornado)
- returns username on success, None on failure
- takes two arguments: (handler, data),
where `handler` is the calling web.RequestHandler,
and `data` is the POST form data from the login page.
.. versionchanged:: 1.0
authenticators may be registered via entry points,
e.g. `c.JupyterHub.authenticator_class = 'pam'`
Currently installed:
- default: jupyterhub.auth.PAMAuthenticator
- dummy: jupyterhub.auth.DummyAuthenticator
- null: jupyterhub.auth.NullAuthenticator
- pam: jupyterhub.auth.PAMAuthenticator
Default: 'jupyterhub.auth.PAMAuthenticator'
--JupyterHub.base_url=<URLPrefix>
The base URL of the entire application.
Add this to the beginning of all JupyterHub URLs.
Use base_url to run JupyterHub within an existing website.
Default: '/'
--JupyterHub.bind_url=<Unicode>
The public facing URL of the whole JupyterHub application.
This is the address on which the proxy will bind.
Sets protocol, ip, base_url
Default: 'http://:8000'
--JupyterHub.cleanup_proxy=<Bool>
Whether to shutdown the proxy when the Hub shuts down.
Disable if you want to be able to teardown the Hub while leaving the
proxy running.
Only valid if the proxy was started by the Hub process.
If both this and cleanup_servers are False, sending SIGINT to the Hub will
only shutdown the Hub, leaving everything else running.
The Hub should be able to resume from database state.
Default: True
--JupyterHub.cleanup_servers=<Bool>
Whether to shutdown single-user servers when the Hub shuts down.
Disable if you want to be able to teardown the Hub while leaving the
single-user servers running.
If both this and cleanup_proxy are False, sending SIGINT to the Hub will
only shutdown the Hub, leaving everything else running.
The Hub should be able to resume from database state.
Default: True
--JupyterHub.concurrent_spawn_limit=<Int>
Maximum number of concurrent users that can be spawning at a time.
Spawning lots of servers at the same time can cause performance problems for
the Hub or the underlying spawning system. Set this limit to prevent bursts
of logins from attempting to spawn too many servers at the same time.
This does not limit the number of total running servers. See
active_server_limit for that.
If more than this many users attempt to spawn at a time, their requests will
be rejected with a 429 error asking them to try again. Users will have to
wait for some of the spawning services to finish starting before they can
start their own.
If set to 0, no limit is enforced.
Default: 100
--JupyterHub.config_file=<Unicode>
The config file to load
Default: 'jupyterhub_config.py'
--JupyterHub.confirm_no_ssl=<Bool>
DEPRECATED: does nothing
Default: False
--JupyterHub.cookie_host_prefix_enabled=<Bool>
Enable `__Host-` prefix on authentication cookies.
The `__Host-` prefix on JupyterHub cookies provides further
protection against cookie tossing when untrusted servers
may control subdomains of your jupyterhub deployment.
_However_, it also requires that cookies be set on the path `/`,
which means they are shared by all JupyterHub components,
so a compromised server component will have access to _all_ JupyterHub-related
cookies of the visiting browser.
It is recommended to only combine `__Host-` cookies with per-user domains.
.. versionadded:: 4.1
Default: False
--JupyterHub.cookie_max_age_days=<Float>
Number of days for a login cookie to be valid.
Default is two weeks.
Default: 14
--JupyterHub.cookie_secret=<Union>
The cookie secret to use to encrypt cookies.
Loaded from the JPY_COOKIE_SECRET env variable by default.
Should be exactly 256 bits (32 bytes).
Default: traitlets.Undefined
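As a sketch (one way among several), a suitable 256-bit secret can be generated with Python's `secrets` module and assigned in `jupyterhub_config.py`:

```python
import secrets

# Generate a 256-bit (32-byte) cookie secret, as required above.
cookie_secret = secrets.token_bytes(32)

# In jupyterhub_config.py one could then set:
# c.JupyterHub.cookie_secret = cookie_secret
print(len(cookie_secret))  # 32
```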
--JupyterHub.cookie_secret_file=<Unicode>
File in which to store the cookie secret.
Default: 'jupyterhub_cookie_secret'
--JupyterHub.custom_scopes=<key-1>=<value-1>...
Custom scopes to define.
For use when defining custom roles,
to grant users granular permissions
All custom scopes must have a description,
and must start with the prefix `custom:`.
For example::
custom_scopes = {
"custom:jupyter_server:read": {
"description": "read-only access to a single-user server",
},
}
Default: {}
--JupyterHub.data_files_path=<Unicode>
The location of jupyterhub data files (e.g. /usr/local/share/jupyterhub)
Default: '$HOME/checkouts/readthedocs.org/user_builds/jupyterhub/...
--JupyterHub.db_kwargs=<key-1>=<value-1>...
Include any kwargs to pass to the database connection.
See sqlalchemy.create_engine for details.
Default: {}
--JupyterHub.db_url=<Unicode>
url for the database. e.g. `sqlite:///jupyterhub.sqlite`
Default: 'sqlite:///jupyterhub.sqlite'
--JupyterHub.debug_db=<Bool>
Log all database transactions. This produces A LOT of output.
Default: False
--JupyterHub.debug_proxy=<Bool>
DEPRECATED since version 0.8: Use ConfigurableHTTPProxy.debug
Default: False
--JupyterHub.default_server_name=<Unicode>
If named servers are enabled, default name of server to spawn or open when
no server is specified, e.g. by user-redirect.
Note: This has no effect if named servers are not enabled, and does _not_
change the existence or behavior of the default server named `''` (the empty
string). This only affects which named server is launched when no server is
specified, e.g. by links to `/hub/user-redirect/lab/tree/mynotebook.ipynb`.
Default: ''
--JupyterHub.default_url=<Union>
The default URL for users when they arrive (e.g. when user directs to "/")
By default, redirects users to their own server.
Can be a Unicode string (e.g. '/hub/home') or a callable based on the
handler object:
::
def default_url_fn(handler):
user = handler.current_user
if user and user.admin:
return '/hub/admin'
return '/hub/home'
c.JupyterHub.default_url = default_url_fn
Default: traitlets.Undefined
--JupyterHub.external_ssl_authorities=<key-1>=<value-1>...
Dict authority:dict(files). Specify the key, cert, and/or
ca file for an authority. This is useful for externally managed
proxies that wish to use internal_ssl.
The files dict has this format (you must specify at least a cert)::
{
'key': '/path/to/key.key',
'cert': '/path/to/cert.crt',
'ca': '/path/to/ca.crt'
}
The authorities you can override: 'hub-ca', 'notebooks-ca',
'proxy-api-ca', 'proxy-client-ca', and 'services-ca'.
Use with internal_ssl
Default: {}
--JupyterHub.extra_handlers=<list-item-1>...
DEPRECATED.
If you need to register additional HTTP endpoints please use services
instead.
Default: []
--JupyterHub.extra_log_file=<Unicode>
DEPRECATED: use output redirection instead, e.g.
jupyterhub &>> /var/log/jupyterhub.log
Default: ''
--JupyterHub.extra_log_handlers=<list-item-1>...
Extra log handlers to set on JupyterHub logger
Default: []
--JupyterHub.forwarded_host_header=<Unicode>
Alternate header to use as the Host (e.g., X-Forwarded-Host)
when determining whether a request is cross-origin
This may be useful when JupyterHub is running behind a proxy that rewrites
the Host header.
Default: ''
--JupyterHub.generate_certs=<Bool>
Generate certs used for internal ssl
Default: False
--JupyterHub.generate_config=<Bool>
Generate default config file
Default: False
--JupyterHub.hub_bind_url=<Unicode>
The URL on which the Hub will listen. This is a private URL for internal
communication. Typically set in combination with hub_connect_url. If a unix
socket, hub_connect_url **must** also be set.
For example:
"http://127.0.0.1:8081"
"unix+http://%2Fsrv%2Fjupyterhub%2Fjupyterhub.sock"
.. versionadded:: 0.9
Default: ''
--JupyterHub.hub_connect_ip=<Unicode>
The ip or hostname for proxies and spawners to use
for connecting to the Hub.
Use when the bind address (`hub_ip`) is 0.0.0.0, :: or otherwise different
from the connect address.
Default: when `hub_ip` is 0.0.0.0 or ::, use `socket.gethostname()`,
otherwise use `hub_ip`.
Note: Some spawners or proxy implementations might not support hostnames. Check your
spawner or proxy documentation to see if they have extra requirements.
.. versionadded:: 0.8
Default: ''
--JupyterHub.hub_connect_port=<Int>
DEPRECATED
Use hub_connect_url
.. versionadded:: 0.8
.. deprecated:: 0.9
Use hub_connect_url
Default: 0
--JupyterHub.hub_connect_url=<Unicode>
The URL for connecting to the Hub. Spawners, services, and the proxy will
use this URL to talk to the Hub.
Only needs to be specified if the default hub URL is not connectable (e.g.
using a unix+http:// bind url).
.. seealso::
JupyterHub.hub_connect_ip
JupyterHub.hub_bind_url
.. versionadded:: 0.9
Default: ''
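As an illustrative sketch (the socket path and hostname are hypothetical), binding the Hub to a unix socket requires pairing the two options:

```python
# jupyterhub_config.py fragment (hypothetical path and hostname)
# Bind the Hub to a unix socket; the socket path is URL-escaped.
c.JupyterHub.hub_bind_url = "unix+http://%2Fsrv%2Fjupyterhub%2Fjupyterhub.sock"
# A unix socket is not connectable as a URL, so hub_connect_url must be set.
c.JupyterHub.hub_connect_url = "http://hub.internal.example.test:8081"
```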
--JupyterHub.hub_ip=<Unicode>
The ip address for the Hub process to *bind* to.
By default, the hub listens on localhost only. This address must be accessible from
the proxy and user servers. You may need to set this to a public ip or '' for all
interfaces if the proxy or user servers are in containers or on a different host.
See `hub_connect_ip` for cases where the bind and connect address should differ,
or `hub_bind_url` for setting the full bind URL.
Default: '127.0.0.1'
--JupyterHub.hub_port=<Int>
The internal port for the Hub process.
This is the internal port of the hub itself. It should never be accessed directly.
See JupyterHub.port for the public port to use when accessing jupyterhub.
It is rare that this port should be set except in cases of port conflict.
See also `hub_ip` for the ip and `hub_bind_url` for setting the full
bind URL.
Default: 8081
--JupyterHub.hub_routespec=<Unicode>
The routing prefix for the Hub itself.
Override to send only a subset of traffic to the Hub. Default is to use the
Hub as the default route for all requests.
This is necessary for normal jupyterhub operation, as the Hub must receive
requests for e.g. `/user/:name` when the user's server is not running.
However, some deployments using only the JupyterHub API may want to handle
these events themselves, in which case they can register their own default
target with the proxy and set e.g. `hub_routespec = /hub/` to serve only the
hub's own pages, or even `/hub/api/` for api-only operation.
Note: hub_routespec must include the base_url, if any.
.. versionadded:: 1.4
Default: '/'
--JupyterHub.implicit_spawn_seconds=<Float>
Trigger implicit spawns after this many seconds.
When a user visits a URL for a server that's not running,
they are shown a page indicating that the requested server
is not running with a button to spawn the server.
Setting this to a positive value will redirect the user
after this many seconds, effectively clicking this button
automatically for the users,
automatically beginning the spawn process.
Warning: this can result in errors and surprising behavior
when sharing access URLs to actual servers,
since the wrong server is likely to be started.
Default: 0
--JupyterHub.init_spawners_timeout=<Int>
Timeout (in seconds) to wait for spawners to initialize
Checking if spawners are healthy can take a long time if many spawners are
active at hub start time.
If it takes longer than this timeout to check, init_spawner will be left to
complete in the background and the http server is allowed to start.
A timeout of -1 means wait forever, which can mean a slow startup of the Hub
but ensures that the Hub is fully consistent by the time it starts
responding to requests. This matches the behavior of jupyterhub 1.0.
.. versionadded: 1.1.0
Default: 10
--JupyterHub.internal_certs_location=<Unicode>
The location to store certificates automatically created by
JupyterHub.
Use with internal_ssl
Default: 'internal-ssl'
--JupyterHub.internal_ssl=<Bool>
Enable SSL for all internal communication
This enables end-to-end encryption between all JupyterHub components.
JupyterHub will automatically create the necessary certificate
authority and sign notebook certificates as they're created.
Default: False
--JupyterHub.ip=<Unicode>
The public facing ip of the whole JupyterHub application
(specifically referred to as the proxy).
This is the address on which the proxy will listen. The default is to
listen on all interfaces. This is the only address through which JupyterHub
should be accessed by users.
Default: ''
--JupyterHub.jinja_environment_options=<key-1>=<value-1>...
Supply extra arguments that will be passed to Jinja environment.
Default: {}
--JupyterHub.last_activity_interval=<Int>
Interval (in seconds) at which to update last-activity timestamps.
Default: 300
--JupyterHub.load_groups=<key-1>=<value-1>...
Dict of `{'group': {'users': ['usernames'], 'properties': {}}}` to load at
startup.
Example::
c.JupyterHub.load_groups = {
'groupname': {
'users': ['usernames'],
'properties': {'key': 'value'},
},
}
This strictly *adds* groups and users to groups. Properties, if defined,
replace all existing properties.
Loading one set of groups, then starting JupyterHub again with a different
set will not remove users or groups from previous launches. That must be
done through the API.
.. versionchanged:: 3.2
Changed format of group from list of usernames to dict
Default: {}
--JupyterHub.load_roles=<list-item-1>...
List of predefined role dictionaries to load at startup.
For instance::
load_roles = [
{
'name': 'teacher',
'description': "Access to users' information and group membership",
'scopes': ['users', 'groups'],
'users': ['cyclops', 'gandalf'],
'services': [],
'groups': []
}
]
All keys apart from 'name' are optional.
See all the available scopes in the JupyterHub REST API documentation.
Default roles are defined in roles.py.
Default: []
--JupyterHub.log_datefmt=<Unicode>
The date format used by logging formatters for %(asctime)s
Default: '%Y-%m-%d %H:%M:%S'
--JupyterHub.log_format=<Unicode>
The Logging format template
Default: '[%(name)s]%(highlevel)s %(message)s'
--JupyterHub.log_level=<Enum>
Set the log level by value or name.
Choices: any of [0, 10, 20, 30, 40, 50, 'DEBUG', 'INFO', 'WARN', 'ERROR', 'CRITICAL']
Default: 30
--JupyterHub.logging_config=<key-1>=<value-1>...
Configure additional log handlers.
The default stderr logs handler is configured by the log_level, log_datefmt
and log_format settings.
This configuration can be used to configure additional handlers (e.g. to
output the log to a file) or for finer control over the default handlers.
If provided this should be a logging configuration dictionary, for more
information see:
https://docs.python.org/3/library/logging.config.html#logging-config-dictschema
This dictionary is merged with the base logging configuration which defines
the following:
* A logging formatter intended for interactive use called
``console``.
* A logging handler that writes to stderr called
``console`` which uses the formatter ``console``.
* A logger with the name of this application set to ``DEBUG``
level.
This example adds a new handler that writes to a file:
.. code-block:: python
c.Application.logging_config = {
"handlers": {
"file": {
"class": "logging.FileHandler",
"level": "DEBUG",
"filename": "<path/to/file>",
}
},
"loggers": {
"<application-name>": {
"level": "DEBUG",
# NOTE: if you don't list the default "console"
# handler here then it will be disabled
"handlers": ["console", "file"],
},
},
}
Default: {}
--JupyterHub.logo_file=<Unicode>
Specify path to a logo image to override the Jupyter logo in the banner.
Default: ''
--JupyterHub.named_server_limit_per_user=<Union>
Maximum number of concurrent named servers that can be created by a user at
a time.
Setting this can limit the total resources a user can consume.
If set to 0, no limit is enforced.
Can be an integer or a callable/awaitable based on the handler object:
::
def named_server_limit_per_user_fn(handler):
user = handler.current_user
if user and user.admin:
return 0
return 5
c.JupyterHub.named_server_limit_per_user = named_server_limit_per_user_fn
Default: 0
--JupyterHub.oauth_token_expires_in=<Int>
Expiry (in seconds) of OAuth access tokens.
The default is to expire when the cookie storing them expires,
according to `cookie_max_age_days` config.
These are the tokens stored in cookies when you visit
a single-user server or service.
When they expire, you must re-authenticate with the Hub,
even if your Hub authentication is still valid.
If your Hub authentication is valid,
logging in may be a transparent redirect as you refresh the page.
This does not affect JupyterHub API tokens in general,
which do not expire by default.
Only tokens issued during the oauth flow
accessing services and single-user servers are affected.
.. versionadded:: 1.4
OAuth token expires_in was not previously configurable.
.. versionchanged:: 1.4
Default now uses cookie_max_age_days so that oauth tokens
which are generally stored in cookies,
expire when the cookies storing them expire.
Previously, it was one hour.
Default: 0
--JupyterHub.pid_file=<Unicode>
File to write PID
Useful for daemonizing JupyterHub.
Default: ''
--JupyterHub.port=<Int>
The public facing port of the proxy.
This is the port on which the proxy will listen.
This is the only port through which JupyterHub
should be accessed by users.
Default: 8000
--JupyterHub.proxy_api_ip=<Unicode>
DEPRECATED since version 0.8 : Use ConfigurableHTTPProxy.api_url
Default: ''
--JupyterHub.proxy_api_port=<Int>
DEPRECATED since version 0.8 : Use ConfigurableHTTPProxy.api_url
Default: 0
--JupyterHub.proxy_auth_token=<Unicode>
DEPRECATED since version 0.8: Use ConfigurableHTTPProxy.auth_token
Default: ''
--JupyterHub.proxy_check_interval=<Int>
DEPRECATED since version 0.8: Use
ConfigurableHTTPProxy.check_running_interval
Default: 5
--JupyterHub.proxy_class=<EntryPointType>
The class to use for configuring the JupyterHub proxy.
Should be a subclass of :class:`jupyterhub.proxy.Proxy`.
.. versionchanged:: 1.0
proxies may be registered via entry points,
e.g. `c.JupyterHub.proxy_class = 'traefik'`
Currently installed:
- configurable-http-proxy: jupyterhub.proxy.ConfigurableHTTPProxy
- default: jupyterhub.proxy.ConfigurableHTTPProxy
Default: 'jupyterhub.proxy.ConfigurableHTTPProxy'
--JupyterHub.proxy_cmd=<command-item-1>...
DEPRECATED since version 0.8. Use ConfigurableHTTPProxy.command
Default: []
--JupyterHub.public_url=<Unicode>
Set the public URL of JupyterHub
This will skip any detection of URL and protocol from requests,
which isn't always correct when JupyterHub is behind
multiple layers of proxies, etc.
Usually the failure is detecting http when it's really https.
Should include the full, public URL of JupyterHub,
including the public-facing base_url prefix
(i.e. it should include a trailing slash), e.g.
https://jupyterhub.example.org/prefix/
Default: ''
--JupyterHub.recreate_internal_certs=<Bool>
Recreate all certificates used within JupyterHub on restart.
Note: enabling this feature requires restarting all notebook
servers.
Use with internal_ssl
Default: False
--JupyterHub.redirect_to_server=<Bool>
Redirect user to server (if running), instead of control panel.
Default: True
--JupyterHub.reset_db=<Bool>
Purge and reset the database.
Default: False
--JupyterHub.service_check_interval=<Int>
Interval (in seconds) at which to check connectivity of services with web
endpoints.
Default: 60
--JupyterHub.service_tokens=<key-1>=<value-1>...
Dict of token:servicename to be loaded into the database.
Allows ahead-of-time generation of API tokens for use by externally
managed services.
Default: {}
--JupyterHub.services=<list-item-1>...
List of service specification dictionaries.
For instance::
services = [
{
'name': 'cull_idle',
'command': ['/path/to/cull_idle_servers.py'],
},
{
'name': 'formgrader',
'url': 'http://127.0.0.1:1234',
'api_token': 'super-secret',
'environment': {},
}
]
Default: []
--JupyterHub.show_config=<Bool>
Instead of starting the Application, dump configuration to stdout
Default: False
--JupyterHub.show_config_json=<Bool>
Instead of starting the Application, dump configuration to stdout (as JSON)
Default: False
--JupyterHub.shutdown_on_logout=<Bool>
Shuts down all user servers on logout
Default: False
--JupyterHub.spawner_class=<EntryPointType>
The class to use for spawning single-user servers.
Should be a subclass of :class:`jupyterhub.spawner.Spawner`.
.. versionchanged:: 1.0
spawners may be registered via entry points,
e.g. `c.JupyterHub.spawner_class = 'localprocess'`
Currently installed:
- default: jupyterhub.spawner.LocalProcessSpawner
- localprocess: jupyterhub.spawner.LocalProcessSpawner
- simple: jupyterhub.spawner.SimpleLocalProcessSpawner
Default: 'jupyterhub.spawner.LocalProcessSpawner'
--JupyterHub.ssl_cert=<Unicode>
Path to SSL certificate file for the public facing interface of the proxy
When setting this, you should also set ssl_key
Default: ''
--JupyterHub.ssl_key=<Unicode>
Path to SSL key file for the public facing interface of the proxy
When setting this, you should also set ssl_cert
Default: ''
--JupyterHub.statsd_host=<Unicode>
Host to send statsd metrics to. An empty string (the default) disables
sending metrics.
Default: ''
--JupyterHub.statsd_port=<Int>
Port on which to send statsd metrics about the hub
Default: 8125
--JupyterHub.statsd_prefix=<Unicode>
Prefix to use for all metrics sent by jupyterhub to statsd
Default: 'jupyterhub'
--JupyterHub.subdomain_hook=<Union>
Hook for constructing subdomains for users and services. Only used when
`JupyterHub.subdomain_host` is set.
There are two predefined hooks, which can be selected by name:
- 'legacy' (deprecated)
- 'idna' (default; more robust, no change for _most_ usernames)
Otherwise, should be a function which must not be async. A custom
subdomain_hook should have the signature:
def subdomain_hook(name, domain, kind) -> str:
...
and should return a unique, valid domain name for all usernames.
- `name` is the original name, which may need escaping to be safe as a domain name label
- `domain` is the domain of the Hub itself
- `kind` will be one of 'user' or 'service'
JupyterHub itself puts very little limit on usernames to accommodate a wide
variety of Authenticators, but your identity provider is likely much more
strict, allowing you to make assumptions about the name.
The default behavior is to have all services on a single `services.{domain}`
subdomain, and each user on `{username}.{domain}`. This is the 'legacy'
scheme, and doesn't work for all usernames.
The 'idna' scheme is a new scheme that should produce a valid domain name
for any user, using IDNA encoding for unicode usernames, and a truncate-and-
hash approach for any usernames that can't be easily encoded into a domain
component.
.. versionadded:: 5.0
Default: 'idna'
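To make the signature concrete, here is a minimal custom hook sketch (the escaping logic is purely illustrative and is not the built-in 'idna' scheme):

```python
import re

def subdomain_hook(name, domain, kind):
    # Reduce the name to a DNS-safe label (illustrative escaping only;
    # a real hook must guarantee uniqueness for all allowed usernames).
    label = re.sub(r"[^a-z0-9-]", "-", name.lower()).strip("-") or "user"
    if kind == "service":
        return f"{label}.services.{domain}"
    return f"{label}.{domain}"

# In jupyterhub_config.py:
# c.JupyterHub.subdomain_hook = subdomain_hook
print(subdomain_hook("Alice_B", "hub.example.org", "user"))
# -> alice-b.hub.example.org
```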
--JupyterHub.subdomain_host=<Unicode>
Run single-user servers on subdomains of this host.
This should be the full `https://hub.domain.tld[:port]`.
Provides additional cross-site protections for javascript served by
single-user servers.
Requires `<username>.hub.domain.tld` to resolve to the same host as
`hub.domain.tld`.
In general, this is most easily achieved with wildcard DNS.
When using SSL (i.e. always) this also requires a wildcard SSL
certificate.
Default: ''
--JupyterHub.template_paths=<list-item-1>...
Paths to search for jinja templates, before using the default templates.
Default: []
--JupyterHub.template_vars=<key-1>=<value-1>...
Extra variables to be passed into jinja templates.
Values in dict may contain callable objects.
If value is callable, the current user is passed as argument.
Example::
def callable_value(user):
# user is generated by handlers.base.get_current_user
with open("/tmp/file.txt", "r") as f:
ret = f.read()
ret = ret.replace("<username>", user.name)
return ret
c.JupyterHub.template_vars = {
"key1": "value1",
"key2": callable_value,
}
Default: {}
--JupyterHub.tornado_settings=<key-1>=<value-1>...
Extra settings overrides to pass to the tornado application.
Default: {}
--JupyterHub.trust_user_provided_tokens=<Bool>
Trust user-provided tokens (via JupyterHub.service_tokens)
to have good entropy.
If you are not inserting additional tokens via configuration file,
this flag has no effect.
In JupyterHub 0.8, internally generated tokens do not
pass through additional hashing because the hashing is costly
and does not increase the entropy of already-good UUIDs.
User-provided tokens, on the other hand, are not trusted to have good entropy by default,
and are passed through many rounds of hashing to stretch the entropy of the key
(i.e. user-provided tokens are treated as passwords instead of random keys).
These keys are more costly to check.
If your inserted tokens are generated by a good-quality mechanism,
e.g. `openssl rand -hex 32`, then you can set this flag to True
to reduce the cost of checking authentication tokens.
Default: False
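A sketch of generating such a high-entropy token in Python, equivalent in strength to the `openssl rand -hex 32` example above (the service name below is hypothetical):

```python
import secrets

# 32 random bytes rendered as 64 hex characters: a high-entropy token
# that is safe to combine with trust_user_provided_tokens = True.
token = secrets.token_hex(32)

# In jupyterhub_config.py:
# c.JupyterHub.service_tokens = {token: "my-external-service"}
# c.JupyterHub.trust_user_provided_tokens = True
print(len(token))  # 64
```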
--JupyterHub.trusted_alt_names=<list-item-1>...
Names to include in the subject alternative name.
These names will be used for server name verification. This is useful
if JupyterHub is being run behind a reverse proxy or services using ssl
are on different hosts.
Use with internal_ssl
Default: []
--JupyterHub.trusted_downstream_ips=<list-item-1>...
Downstream proxy IP addresses to trust.
This sets the list of IP addresses that are trusted and skipped when processing
the `X-Forwarded-For` header. For example, if an external proxy is used for TLS
termination, its IP address should be added to this list to ensure the correct
client IP addresses are recorded in the logs instead of the proxy server's IP
address.
Default: []
--JupyterHub.upgrade_db=<Bool>
Upgrade the database automatically on start.
Only safe if database is regularly backed up.
Only SQLite databases will be backed up to a local file automatically.
Default: False
--JupyterHub.use_legacy_stopped_server_status_code=<Bool>
Return 503 rather than 424 when request comes in for a non-running server.
Prior to JupyterHub 2.0, we returned a 503 when any request came in for a
user server that was currently not running. By default, JupyterHub 2.0 will
return a 424 - this makes operational metric dashboards more useful.
JupyterLab < 3.2 expected the 503 to know if the user server is no longer
running, and prompted the user to start their server. Set this config to
true to retain the old behavior, so JupyterLab < 3.2 can continue to show
the appropriate UI when the user server is stopped.
This option will be removed in a future release.
Default: False
--JupyterHub.user_redirect_hook=<Callable>
Callable to affect behavior of /user-redirect/
Receives 4 parameters:
1. path - URL path that was provided after /user-redirect/
2. request - A Tornado HTTPServerRequest representing the current request.
3. user - The currently authenticated user.
4. base_url - The base_url of the current hub, for relative redirects
It should return the new URL to redirect to, or None to preserve current
behavior.
Default: None
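For illustration, a hook matching that signature might look like this (the `reports/` routing rule is hypothetical):

```python
def user_redirect_hook(path, request, user, base_url):
    # Route a hypothetical path prefix to a fixed page; everything else
    # returns None to preserve the default /user-redirect/ behavior.
    if path.startswith("reports/"):
        return f"{base_url}hub/home"
    return None

# In jupyterhub_config.py:
# c.JupyterHub.user_redirect_hook = user_redirect_hook
print(user_redirect_hook("reports/q1", None, None, "/"))  # /hub/home
```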
Spawner(LoggingConfigurable) options
------------------------------------
--Spawner.args=<list-item-1>...
Extra arguments to be passed to the single-user server.
Some spawners allow shell-style expansion here, allowing you to use
environment variables here. Most, including the default, do not. Consult the
documentation for your spawner to verify!
Default: []
--Spawner.auth_state_hook=<Any>
An optional hook function that you can implement to pass `auth_state` to the
spawner after it has been initialized but before it starts. The `auth_state`
dictionary may be set by the `.authenticate()` method of the authenticator.
This hook enables you to pass some or all of that information to your
spawner.
Example::
def userdata_hook(spawner, auth_state):
spawner.userdata = auth_state["userdata"]
c.Spawner.auth_state_hook = userdata_hook
Default: None
--Spawner.cmd=<command-item-1>...
The command used for starting the single-user server.
Provide either a string or a list containing the path to the startup script
command. Extra arguments, other than this path, should be provided via
`args`.
This is usually set if you want to start the single-user server in a
different python environment (with virtualenv/conda) than JupyterHub itself.
Some spawners allow shell-style expansion here, allowing you to use
environment variables. Most, including the default, do not. Consult the
documentation for your spawner to verify!
Default: ['jupyterhub-singleuser']
--Spawner.consecutive_failure_limit=<Int>
Maximum number of consecutive failures to allow before shutting down
JupyterHub.
This helps JupyterHub recover from a certain class of problem preventing
launch in contexts where the Hub is automatically restarted (e.g. systemd,
docker, kubernetes).
A limit of 0 means no limit and consecutive failures will not be tracked.
Default: 0
--Spawner.cpu_guarantee=<Float>
Minimum number of cpu-cores a single-user notebook server is guaranteed to
have available.
If this value is set to 0.5, the server is guaranteed 50% of one CPU. If this
value is set to 2, the server is guaranteed up to 2 CPUs.
**This is a configuration setting. Your spawner must implement support for
the limit to work.** The default spawner, `LocalProcessSpawner`, does
**not** implement this support. A custom spawner **must** add support for
this setting for it to be enforced.
Default: None
--Spawner.cpu_limit=<Float>
Maximum number of cpu-cores a single-user notebook server is allowed to use.
If this value is set to 0.5, allows use of 50% of one CPU. If this value is
set to 2, allows use of up to 2 CPUs.
The single-user notebook server will never be scheduled by the kernel to use
more cpu-cores than this. There is no guarantee that it can access this many
cpu-cores.
**This is a configuration setting. Your spawner must implement support for
the limit to work.** The default spawner, `LocalProcessSpawner`, does
**not** implement this support. A custom spawner **must** add support for
this setting for it to be enforced.
Default: None
--Spawner.debug=<Bool>
Enable debug-logging of the single-user server
Default: False
--Spawner.default_url=<Unicode>
The URL the single-user server should start in.
`{username}` will be expanded to the user's username
Example uses:
- You can set `notebook_dir` to `/` and `default_url` to `/tree/home/{username}` to allow people to
navigate the whole filesystem from their notebook server, but still start in their home directory.
- Start with `/notebooks` instead of `/tree` if `default_url` points to a notebook instead of a directory.
- You can set this to `/lab` to have JupyterLab start by default, rather than Jupyter Notebook.
Default: ''
--Spawner.disable_user_config=<Bool>
Disable per-user configuration of single-user servers.
When starting the user's single-user server, any config file found in the
user's $HOME directory will be ignored.
Note: a user could circumvent this if the user modifies their Python
environment, such as when they have their own conda environments /
virtualenvs / containers.
Default: False
--Spawner.env_keep=<list-item-1>...
List of environment variables for the single-user server to inherit from the
JupyterHub process.
This list is used to ensure that sensitive information in the JupyterHub
process's environment (such as `CONFIGPROXY_AUTH_TOKEN`) is not passed to
the single-user server's process.
Default: ['PATH', 'PYTHONPATH', 'CONDA_ROOT', 'CONDA_DEFAULT_ENV', 'VI...
--Spawner.environment=<key-1>=<value-1>...
Extra environment variables to set for the single-user server's process.
Environment variables that end up in the single-user server's process come from 3 sources:
- This `environment` configurable
- The JupyterHub process' environment variables that are listed in `env_keep`
- Variables to establish contact between the single-user notebook and the hub (such as JUPYTERHUB_API_TOKEN)
The `environment` configurable should be set by JupyterHub administrators to
add installation specific environment variables. It is a dict where the key
is the name of the environment variable, and the value can be a string or a
callable. If it is a callable, it will be called with one parameter (the
spawner instance), and should return a string fairly quickly (no blocking
operations please!).
Note that the spawner class' interface is not guaranteed to be exactly same
across upgrades, so if you are using the callable take care to verify it
continues to work after upgrades!
.. versionchanged:: 1.2
environment from this configuration has highest priority,
allowing override of 'default' env variables,
such as JUPYTERHUB_API_URL.
Default: {}
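As a sketch, a static value and a per-user callable can be mixed in the same dict (config fragment; `c` is supplied by JupyterHub, the variable names are hypothetical, and the callable receives the spawner instance):

```python
# jupyterhub_config.py -- `c` is supplied by JupyterHub at load time
c.Spawner.environment = {
    # plain string value (hypothetical variable name)
    'ORG_NAME': 'example-org',
    # callable: called with the spawner instance, must return a
    # string quickly (no blocking operations)
    'USER_WORKDIR': lambda spawner: f'/data/{spawner.user.name}',
}
```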
--Spawner.http_timeout=<Int>
Timeout (in seconds) before giving up on a spawned HTTP server
Once a server has successfully been spawned, this is the amount of time we
wait before assuming that the server is unable to accept connections.
Default: 30
--Spawner.hub_connect_url=<Unicode>
The URL the single-user server should use to connect to the Hub.
If the Hub URL set in your JupyterHub config is not reachable from spawned
notebooks, you can set a different URL with this config.
Leave as None if you don't need to change the URL.
Default: None
--Spawner.ip=<Unicode>
The IP address (or hostname) the single-user server should listen on.
Usually either '127.0.0.1' (default) or '0.0.0.0'.
The JupyterHub proxy implementation should be able to send packets to this
interface.
Subclasses which launch remotely or in containers should override the
default to '0.0.0.0'.
.. versionchanged:: 2.0
Default changed to '127.0.0.1', from ''.
In most cases, this does not result in a change in behavior,
as '' was interpreted as 'unspecified',
which used the subprocesses' own default, itself usually '127.0.0.1'.
Default: '127.0.0.1'
--Spawner.mem_guarantee=<ByteSpecification>
Minimum number of bytes a single-user notebook server is guaranteed to have
available.
Allows the following suffixes:
- K -> Kilobytes
- M -> Megabytes
- G -> Gigabytes
- T -> Terabytes
**This is a configuration setting. Your spawner must implement support for
the limit to work.** The default spawner, `LocalProcessSpawner`, does
**not** implement this support. A custom spawner **must** add support for
this setting for it to be enforced.
Default: None
--Spawner.mem_limit=<ByteSpecification>
Maximum number of bytes a single-user notebook server is allowed to use.
Allows the following suffixes:
- K -> Kilobytes
- M -> Megabytes
- G -> Gigabytes
- T -> Terabytes
If the single user server tries to allocate more memory than this, it will
fail. There is no guarantee that the single-user notebook server will be
able to allocate this much memory - only that it can not allocate more than
this.
**This is a configuration setting. Your spawner must implement support for
the limit to work.** The default spawner, `LocalProcessSpawner`, does
**not** implement this support. A custom spawner **must** add support for
this setting for it to be enforced.
Default: None
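The suffix scheme can be illustrated with a small parser. This is a hypothetical sketch assuming binary (1024-based) multipliers, for illustration only; it is not JupyterHub's own ByteSpecification implementation:

```python
# Hypothetical parser for byte specifications like '512M' or '4G'.
# Illustrates the K/M/G/T suffix scheme described above.
SUFFIXES = {'K': 1024, 'M': 1024**2, 'G': 1024**3, 'T': 1024**4}

def parse_bytes(spec):
    """Return the number of bytes described by a string like '512M'."""
    spec = str(spec).strip()
    if spec[-1].upper() in SUFFIXES:
        return int(float(spec[:-1]) * SUFFIXES[spec[-1].upper()])
    # a bare number means raw bytes
    return int(spec)
```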
--Spawner.notebook_dir=<Unicode>
Path to the notebook directory for the single-user server.
The user sees a file listing of this directory when the notebook interface
is started. The current interface does not easily allow browsing beyond the
subdirectories in this directory's tree.
`~` will be expanded to the home directory of the user, and {username} will
be replaced with the name of the user.
Note that this does *not* prevent users from accessing files outside of this
path! They can do so with many other means.
Default: ''
--Spawner.oauth_client_allowed_scopes=<Union>
Allowed scopes for oauth tokens issued by this server's oauth client.
This sets the maximum and default scopes
assigned to oauth tokens issued by a single-user server's
oauth client (i.e. tokens stored in browsers after authenticating with the server),
defining what actions the server can take on behalf of logged-in users.
Default is an empty list, meaning minimal permissions to identify users,
no actions can be taken on their behalf.
If callable, will be called with the Spawner as a single argument.
Callables may be async.
Default: traitlets.Undefined
--Spawner.oauth_roles=<Union>
Allowed roles for oauth tokens.
Deprecated in 3.0: use oauth_client_allowed_scopes
This sets the maximum and default roles
assigned to oauth tokens issued by a single-user server's
oauth client (i.e. tokens stored in browsers after authenticating with the server),
defining what actions the server can take on behalf of logged-in users.
Default is an empty list, meaning minimal permissions to identify users,
no actions can be taken on their behalf.
Default: traitlets.Undefined
--Spawner.options_form=<Union>
An HTML form for options a user can specify on launching their server.
The surrounding `<form>` element and the submit button are already provided.
For example:
.. code:: html
    Set your key:
    <input name="key" value="default_key">
    <br>
    Choose a letter:
    <select name="letter" multiple="true">
      <option value="A">The letter A</option>
      <option value="B">The letter B</option>
    </select>
The data from this form submission will be passed on to your spawner in
`self.user_options`
Instead of a form snippet string, this could also be a callable that takes
as one parameter the current spawner instance and returns a string. The
callable will be called asynchronously if it returns a future, rather than a
str. Note that the interface of the spawner class is not deemed stable
across versions, so using this functionality might cause your JupyterHub
upgrades to break.
Default: traitlets.Undefined
--Spawner.options_from_form=<Callable>
Interpret HTTP form data
Form data will always arrive as a dict of lists of strings. Override this
function to understand single-values, numbers, etc.
This should coerce form data into the structure expected by
self.user_options, which must be a dict, and should be JSON-serializeable,
though it can contain bytes in addition to standard JSON data types.
This method should not have any side effects. Any handling of `user_options`
should be done in `.start()` to ensure consistent behavior across servers
spawned via the API and form submission page.
Instances will receive this data on self.user_options, after passing through
this function, prior to `Spawner.start`.
.. versionchanged:: 1.0
user_options are persisted in the JupyterHub database to be reused
on subsequent spawns if no options are given.
user_options is serialized to JSON as part of this persistence
(with additional support for bytes in case of uploaded file data),
and any non-bytes non-jsonable values will be replaced with None
if the user_options are re-used.
Default: traitlets.Undefined
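A sketch of such a coercion, reusing the `key` and `letter` field names from the `options_form` example above (the override itself is hypothetical):

```python
# Hypothetical options_from_form override: coerce the dict-of-lists
# that HTTP form submission produces into a plain, JSON-able dict.
def options_from_form(formdata):
    options = {}
    # single-valued field: keep the first (only) string
    options['key'] = formdata.get('key', [''])[0]
    # multi-valued <select>: keep the whole list
    options['letter'] = formdata.get('letter', [])
    return options
```

In a deployment this would be assigned as `c.Spawner.options_from_form = options_from_form` in `jupyterhub_config.py`.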
--Spawner.poll_interval=<Int>
Interval (in seconds) on which to poll the spawner for single-user server's
status.
At every poll interval, each spawner's `.poll` method is called, which
checks if the single-user server is still running. If it isn't running, then
JupyterHub modifies its own state accordingly and removes appropriate routes
from the configurable proxy.
Default: 30
--Spawner.poll_jitter=<Float>
Jitter fraction for poll_interval.
Avoids alignment of poll calls for many Spawners, e.g. when restarting
JupyterHub, which restarts all polls for running Spawners.
`poll_jitter=0` means no jitter, 0.1 means 10%, etc.
Default: 0.1
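The effect of the jitter fraction can be sketched as follows; this is a hypothetical illustration of the idea, not JupyterHub's actual scheduling code:

```python
import random

def jittered_interval(poll_interval, poll_jitter):
    """Return a poll interval randomized by +/- poll_jitter fraction.

    Hypothetical sketch: with poll_interval=30 and poll_jitter=0.1,
    results spread across roughly 27..33 seconds, so many Spawners
    restarted at once do not all poll in lockstep.
    """
    factor = 1 + random.uniform(-poll_jitter, poll_jitter)
    return poll_interval * factor
```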
--Spawner.port=<Int>
The port for single-user servers to listen on.
Defaults to `0`, which uses a randomly allocated port number each time.
If set to a non-zero value, all Spawners will use the same port, which only
makes sense if each server is on a different address, e.g. in containers.
New in version 0.7.
Default: 0
--Spawner.post_stop_hook=<Any>
An optional hook function that you can implement to do work after the
spawner stops.
This can be set independent of any concrete spawner implementation.
Default: None
--Spawner.pre_spawn_hook=<Any>
An optional hook function that you can implement to do some bootstrapping
work before the spawner starts. For example, create a directory for your
user or load initial content.
This can be set independent of any concrete spawner implementation.
This may be a coroutine.
Example::
    def my_hook(spawner):
        username = spawner.user.name
        spawner.environment["GREETING"] = f"Hello {username}"

    c.Spawner.pre_spawn_hook = my_hook
Default: None
--Spawner.progress_ready_hook=<Any>
An optional hook function that you can implement to modify the ready event,
which will be shown to the user on the spawn progress page when their server
is ready.
This can be set independent of any concrete spawner implementation.
This may be a coroutine.
Example::
    async def my_ready_hook(spawner, ready_event):
        ready_event["html_message"] = f"Server {spawner.name} is ready for {spawner.user.name}"
        return ready_event

    c.Spawner.progress_ready_hook = my_ready_hook
Default: None
--Spawner.server_token_scopes=<Union>
The list of scopes to request for $JUPYTERHUB_API_TOKEN
If not specified, the scopes in the `server` role will be used
(unchanged from pre-4.0).
If callable, will be called with the Spawner instance as its sole argument
(JupyterHub user available as spawner.user).
JUPYTERHUB_API_TOKEN will be assigned the _subset_ of these scopes
that are held by the user (as in oauth_client_allowed_scopes).
.. versionadded:: 4.0
Default: traitlets.Undefined
--Spawner.ssl_alt_names=<list-item-1>...
List of SSL alt names
May be set in config if all spawners should have the same value(s),
or set at runtime by Spawners that know their names.
Default: []
--Spawner.ssl_alt_names_include_local=<Bool>
Whether to include `DNS:localhost`, `IP:127.0.0.1` in alt names
Default: True
--Spawner.start_timeout=<Int>
Timeout (in seconds) before giving up on starting of single-user server.
This is the timeout for start to return, not the timeout for the server to
respond. Callers of spawner.start will assume that startup has failed if it
takes longer than this. start should return when the server process is
started and its location is known.
Default: 60
Authenticator(LoggingConfigurable) options
------------------------------------------
--Authenticator.admin_users=<set-item-1>...
Set of users that will have admin rights on this JupyterHub.
Note: As of JupyterHub 2.0, full admin rights should not be required, and
more precise permissions can be managed via roles.
Admin users have extra privileges:
- Use the admin panel to see list of users logged in
- Add / remove users in some authenticators
- Restart / halt the hub
- Start / stop users' single-user servers
- Can access each individual users' single-user server (if configured)
Admin access should be treated the same way root access is.
Defaults to an empty set, in which case no user has admin access.
Default: set()
--Authenticator.allow_all=<Bool>
Allow every user who can successfully authenticate to access JupyterHub.
False by default, which means for most Authenticators, _some_ allow-related
configuration is required to allow users to log in.
Authenticator subclasses may override the default with e.g.::
    @default("allow_all")
    def _default_allow_all(self):
        # if _any_ auth config (depends on the Authenticator)
        if self.allowed_users or self.allowed_groups or self.allow_existing_users:
            return False
        else:
            return True
.. versionadded:: 5.0
.. versionchanged:: 5.0
Prior to 5.0, `allow_all` wasn't defined on its own,
and was instead implicitly True when no allow config was provided,
i.e. `allowed_users` unspecified or empty on the base Authenticator class.
To preserve pre-5.0 behavior,
set `allow_all = True` if you have no other allow configuration.
Default: False
--Authenticator.allow_existing_users=<Bool>
Allow existing users to login.
Defaults to True if `allowed_users` is set for historical reasons, and False
otherwise.
With this enabled, all users present in the JupyterHub database are allowed
to login. This means that any user who has _previously_ been allowed to
login via any means will continue to be allowed until the user is deleted
via the /hub/admin page or REST API.
.. warning::
Before enabling this you should review the existing users in the
JupyterHub admin panel at `/hub/admin`. You may find users existing
there because they have previously been declared in config such as
`allowed_users` or allowed to sign in.
.. warning::
When this is enabled and you wish to remove access for one or more
users previously allowed, you must make sure that they
are removed from the jupyterhub database. This can be tricky to do
if you stop allowing an externally managed group of users for example.
With this enabled, JupyterHub admin users can visit `/hub/admin` or use
JupyterHub's REST API to add and remove users to manage who can login.
.. versionadded:: 5.0
Default: False
--Authenticator.allowed_users=<set-item-1>...
Set of usernames that are allowed to log in.
Use this to limit which authenticated users may login. Default behavior:
only users in this set are allowed.
If empty, does not perform any restriction, in which case any authenticated
user is allowed.
Authenticators may extend :meth:`.Authenticator.check_allowed` to combine
`allowed_users` with other configuration to either expand or restrict
access.
.. versionchanged:: 1.2
`Authenticator.whitelist` renamed to `allowed_users`
Default: set()
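A minimal config fragment (the usernames are placeholders; `c` is supplied by JupyterHub):

```python
# jupyterhub_config.py -- `c` is supplied by JupyterHub at load time
# hypothetical usernames, for illustration only
c.Authenticator.allowed_users = {'alice', 'bob'}

# With JupyterHub >= 5.0, to instead allow everyone who can
# successfully authenticate:
# c.Authenticator.allow_all = True
```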
--Authenticator.any_allow_config=<Bool>
Is there any allow config?
Used to show a warning if it looks like nobody can access the Hub,
which can happen when upgrading to JupyterHub 5,
now that `allow_all` defaults to False.
Deployments can set this explicitly to True to suppress
the "No allow config found" warning.
Will be True if any config tagged with `.tag(allow_config=True)`
or whose name starts with `allow` is truthy.
.. versionadded:: 5.0
Default: False
--Authenticator.auth_refresh_age=<Int>
The max age (in seconds) of authentication info
before forcing a refresh of user auth info.
Refreshing auth info allows, e.g. requesting/re-validating auth
tokens.
See :meth:`.refresh_user` for what happens when user auth info is refreshed
(nothing by default).
Default: 300
--Authenticator.auto_login=<Bool>
Automatically begin the login process
rather than starting with a "Login with..." link at `/hub/login`
To work, `.login_url()` must give a URL other than the default `/hub/login`,
such as an oauth handler or another automatic login handler,
registered with `.get_handlers()`.
.. versionadded:: 0.8
Default: False
--Authenticator.auto_login_oauth2_authorize=<Bool>
Automatically begin login process for OAuth2 authorization requests
When another application is using JupyterHub as OAuth2 provider, it sends
users to `/hub/api/oauth2/authorize`. If the user isn't logged in already,
and auto_login is not set, the user will be dumped on the hub's home page,
without any context on what to do next.
Setting this to true will automatically redirect users to login if they
aren't logged in *only* on the `/hub/api/oauth2/authorize` endpoint.
.. versionadded:: 1.5
Default: False
--Authenticator.blocked_users=<set-item-1>...
Set of usernames that are not allowed to log in.
Use this with supported authenticators to block specific users from logging
in. This is an additional block list that further restricts users, beyond
whatever restrictions the authenticator has in place.
If empty, does not perform any additional restriction.
.. versionadded: 0.9
.. versionchanged:: 1.2
`Authenticator.blacklist` renamed to `blocked_users`
Default: set()
--Authenticator.delete_invalid_users=<Bool>
Delete any users from the database that do not pass validation
When JupyterHub starts, `.add_user` will be called
on each user in the database to verify that all users are still valid.
If `delete_invalid_users` is True,
any users that do not pass validation will be deleted from the database.
Use this if users might be deleted from an external system,
such as local user accounts.
If False (default), invalid users remain in the Hub's database
and a warning will be issued.
This is the default to avoid data loss due to config changes.
Default: False
--Authenticator.enable_auth_state=<Bool>
Enable persisting auth_state (if available).
auth_state will be encrypted and stored in the Hub's database.
This can include things like authentication tokens, etc.
to be passed to Spawners as environment variables.
Encrypting auth_state requires the cryptography package.
Additionally, the JUPYTERHUB_CRYPT_KEY environment variable must
contain one (or more, separated by ;) 32B encryption keys.
These can be either base64 or hex-encoded.
If encryption is unavailable, auth_state cannot be persisted.
New in JupyterHub 0.8
Default: False
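One way to produce a suitable hex-encoded 32-byte key is with Python's standard library (a sketch; any cryptographically secure source of 32 random bytes works):

```python
import secrets

# 32 random bytes, hex-encoded -> 64 hex characters,
# suitable as a value for the JUPYTERHUB_CRYPT_KEY environment variable
key = secrets.token_hex(32)
print(key)
```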
--Authenticator.manage_groups=<Bool>
Let authenticator manage user groups
If True, Authenticator.authenticate and/or .refresh_user
may return a list of group names in the 'groups' field,
which will be assigned to the user.
All group-assignment APIs are disabled if this is True.
Default: False
--Authenticator.manage_roles=<Bool>
Let authenticator manage roles
If True, Authenticator.authenticate and/or .refresh_user
may return a list of roles in the 'roles' field,
which will be added to the database.
When enabled, all role management will be handled by the
authenticator; in particular, assignment of roles via
`JupyterHub.load_roles` traitlet will not be possible.
.. versionadded:: 5.0
Default: False
--Authenticator.otp_prompt=<Any>
The prompt string for the extra OTP (One Time Password) field.
.. versionadded:: 5.0
Default: 'OTP:'
--Authenticator.post_auth_hook=<Any>
An optional hook function that you can implement to do some bootstrapping
work during authentication. For example, loading user account details from
an external system.
This function is called after the user has passed all authentication checks
and is ready to successfully authenticate. This function must return the
auth_model dict regardless of changes to it. The hook is called with 3
positional arguments: `(authenticator, handler, auth_model)`.
This may be a coroutine.
.. versionadded: 1.0
Example::
    import os
    import pwd

    def my_hook(authenticator, handler, auth_model):
        user_data = pwd.getpwnam(auth_model['name'])
        spawn_data = {
            'pw_data': user_data,
            'gid_list': os.getgrouplist(auth_model['name'], user_data.pw_gid),
        }

        if auth_model['auth_state'] is None:
            auth_model['auth_state'] = {}
        auth_model['auth_state']['spawn_data'] = spawn_data

        return auth_model

    c.Authenticator.post_auth_hook = my_hook
Default: None
--Authenticator.refresh_pre_spawn=<Bool>
Force refresh of auth prior to spawn.
This forces :meth:`.refresh_user` to be called prior to launching
a server, to ensure that auth state is up-to-date.
This can be important when e.g. auth tokens that may have expired
are passed to the spawner via environment variables from auth_state.
If refresh_user cannot refresh the user auth data,
launch will fail until the user logs in again.
Default: False
--Authenticator.request_otp=<Bool>
Prompt for OTP (One Time Password) in the login form.
.. versionadded:: 5.0
Default: False
--Authenticator.reset_managed_roles_on_startup=<Bool>
Reset managed roles to result of `load_managed_roles()` on startup.
If True:
- stale managed roles will be removed,
- stale assignments to managed roles will be removed.
Any role not present in `load_managed_roles()` will be considered
'stale'.
The 'stale' status for role assignments is also determined from
`load_managed_roles()` result:
- user role assignments status will depend on whether the `users`
key is defined or not:
* if a list is defined under the `users` key and the user is not listed, then the user role assignment will be considered 'stale',
* if the `users` key is not provided, the user role assignment will be preserved;
- service and group role assignments will be considered 'stale':
* if not included in the `services` and `groups` list,
* if the `services` and `groups` keys are not provided.
.. versionadded:: 5.0
Default: False
--Authenticator.username_map=<key-1>=<value-1>...
Dictionary mapping authenticator usernames to JupyterHub users.
Primarily used to normalize OAuth user names to local users.
Default: {}
--Authenticator.username_pattern=<Unicode>
Regular expression pattern that all valid usernames must match.
If a username does not match the pattern specified here, authentication will
not be attempted.
If not set, allow any username.
Default: ''
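The pattern is matched against the whole username. As an illustration (the pattern itself is hypothetical), a rule allowing only 3-16 lowercase alphanumeric characters could be checked like this:

```python
import re

# hypothetical pattern: 3-16 lowercase letters or digits
pattern = re.compile(r'[a-z0-9]{3,16}')

def username_ok(name):
    # the full username must match the pattern
    return pattern.fullmatch(name) is not None
```

The corresponding config would be `c.Authenticator.username_pattern = r'[a-z0-9]{3,16}'`.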
--Authenticator.whitelist=<set-item-1>...
Deprecated, use `Authenticator.allowed_users`
Default: set()
CryptKeeper(SingletonConfigurable) options
------------------------------------------
--CryptKeeper.keys=<list-item-1>...
Default: []
--CryptKeeper.n_threads=<Int>
The number of threads to allocate for encryption
Default: 2
Examples
--------
generate default config file:
jupyterhub --generate-config -f /etc/jupyterhub/jupyterhub_config.py
spawn the server on 10.0.1.2:443 with https:
jupyterhub --ip 10.0.1.2 --port 443 --ssl-key my_ssl.key --ssl-cert my_ssl.cert
Services#
Definition of a Service#
When working with JupyterHub, a Service is defined as something (usually a process) that can interact with the Hub’s REST API. A Service may perform a specific action or task. For example, the following tasks can each be a unique Service:
shutting down individuals’ single user notebook servers that have been idle for some time
an additional web application which uses the Hub as an OAuth provider to authenticate and authorize user access
a script run once in a while, which performs any API action
automating requests to running user servers, such as activity data collection
Two key features help differentiate Services:
Is the Service managed by JupyterHub?
Does the Service have a web server that should be added to the proxy’s table?
Currently, these characteristics distinguish two types of Services:
A Hub-Managed Service which is managed by JupyterHub
An Externally-Managed Service which runs its own web server and communicates operation instructions via the Hub’s API.
Properties of a Service#
A Service may have the following properties:
name: str
- the name of the service

url: str (default - None)
- The URL where the service should be running (from the proxy's perspective). Typically a localhost URL for Hub-managed services. If a url is specified, the service will be added to the proxy at `/services/:name`.

api_token: str (default - None)
- For Externally-Managed Services, you need to specify an API token to perform API requests to the Hub. For Hub-managed services, this token is generated at startup, and available via `$JUPYTERHUB_API_TOKEN`. For OAuth services, this is the client secret.

display: bool (default - True)
- When set to true, display a link to the service's URL under the 'Services' dropdown in users' hub home page. Only has an effect if `url` is also specified.

oauth_no_confirm: bool (default - False)
- When set to true, skip the OAuth confirmation page when users access this service. By default, when users authenticate with a service using JupyterHub, they are prompted to confirm that they want to grant that service access to their credentials. Skipping the confirmation page is useful for admin-managed services that are considered part of the Hub and shouldn't need extra prompts for login.

oauth_client_id: str (default - 'service-$name')
- This never needs to be set, but you can specify a service's OAuth client id. It must start with `service-`.

oauth_redirect_uri: str (default - '/services/:name/oauth_redirect')
- Set the OAuth redirect URI. Required if the redirect URI differs from the default or the service is not to be added to the proxy at `/services/:name` (i.e. `url` is not set, but there is still a public web service using OAuth).
If a service is also to be managed by the Hub, it has a few extra options:
command: (str/Popen list)
- Command for JupyterHub to spawn the service. Only use this if the service should be a subprocess. If command is not specified, the Service is assumed to be managed externally. If a command is specified for launching the Service, the Service will be started and managed by the Hub.

environment: dict
- additional environment variables for the Service.

user: str
- the name of a system user to manage the Service. If unspecified, run as the same user as the Hub.
Hub-Managed Services#
A Hub-Managed Service is started by the Hub, and the Hub is responsible for the Service’s operation. A Hub-Managed Service can only be a local subprocess of the Hub. The Hub will take care of starting the process and restart the service if the service stops.
While Hub-Managed Services share some similarities with single-user server Spawners, there are no plans for Hub-Managed Services to support the same spawning abstractions as a Spawner.
If you wish to run a Service in a Docker container or other deployment environments, the Service can be registered as an Externally-Managed Service, as described below.
Launching a Hub-Managed Service#
A Hub-Managed Service is characterized by its specified `command` for launching the Service. For example, a 'cull idle' notebook server task configured as a Hub-Managed Service would include:
the Service name,
permissions to see when users are active, and to stop servers,
the `command` to launch the Service, which will cull idle servers after a timeout interval.
This example would be configured as follows in `jupyterhub_config.py`:
import sys

c.JupyterHub.load_roles = [
    {
        "name": "idle-culler",
        "scopes": [
            "read:users:activity",  # read user last_activity
            "servers",  # start and stop servers
            # "admin:users",  # needed if culling idle users as well
        ],
    }
]

c.JupyterHub.services = [
    {
        'name': 'idle-culler',
        'command': [sys.executable, '-m', 'jupyterhub_idle_culler', '--timeout=3600'],
    }
]
A Hub-Managed Service may also be configured with additional optional parameters, which describe the environment needed to start the Service process:
environment: dict
- additional environment variables for the Service.

user: str
- name of the user to run the server if different from the Hub. Requires the Hub to be run as root.

cwd: path
- directory in which to run the Service, if different from the Hub directory.
The Hub will pass the following environment variables to launch the Service:
JUPYTERHUB_SERVICE_NAME: The name of the service
JUPYTERHUB_API_TOKEN: API token assigned to the service
JUPYTERHUB_API_URL: URL for the JupyterHub API (default, http://127.0.0.1:8080/hub/api)
JUPYTERHUB_BASE_URL: Base URL of the Hub (https://mydomain[:port]/)
JUPYTERHUB_SERVICE_PREFIX: URL path prefix of this service (/services/:service-name/)
JUPYTERHUB_SERVICE_URL: Local URL where the service is expected to be listening.
Only for proxied web services.
JUPYTERHUB_OAUTH_SCOPES: JSON-serialized list of scopes to use for allowing access to the service
(deprecated in 3.0, use JUPYTERHUB_OAUTH_ACCESS_SCOPES).
JUPYTERHUB_OAUTH_ACCESS_SCOPES: JSON-serialized list of scopes to use for allowing access to the service (new in 3.0).
JUPYTERHUB_OAUTH_CLIENT_ALLOWED_SCOPES: JSON-serialized list of scopes that can be requested by the oauth client on behalf of users (new in 3.0).
JUPYTERHUB_PUBLIC_URL: the public URL of the service,
e.g. `https://jupyterhub.example.org/services/name/`.
Empty if no public URL is specified (default).
Will be available if subdomains are configured.
JUPYTERHUB_PUBLIC_HUB_URL: the public URL of JupyterHub as a whole,
e.g. `https://jupyterhub.example.org/`.
Empty if no public URL is specified (default).
Will be available if subdomains are configured.
For the previous ‘cull idle’ Service example, these environment variables would be passed to the Service when the Hub starts the ‘cull idle’ Service:
JUPYTERHUB_SERVICE_NAME: 'idle-culler'
JUPYTERHUB_API_TOKEN: API token assigned to the service
JUPYTERHUB_API_URL: http://127.0.0.1:8080/hub/api
JUPYTERHUB_BASE_URL: https://mydomain[:port]
JUPYTERHUB_SERVICE_PREFIX: /services/idle-culler/
See the GitHub repo for additional information about the jupyterhub_idle_culler.
Externally-Managed Services#
You may prefer to use your own service management tools, such as Docker or systemd, to manage a JupyterHub Service. These Externally-Managed Services, unlike Hub-Managed Services, are not subprocesses of the Hub. You must tell JupyterHub which API token the Externally-Managed Service is using to perform its API requests. Each Externally-Managed Service will need a unique API token, because the Hub authenticates each API request and the API token is used to identify the originating Service or user.
A configuration example of an Externally-Managed Service running its own web server is:
c.JupyterHub.services = [
    {
        'name': 'my-web-service',
        'url': 'https://10.0.1.1:1984',
        # any secret >8 characters; you'll use api_token to
        # authenticate api requests to the hub from your service
        'api_token': 'super-secret',
    }
]
In this case, the `url` field will be passed along to the Service as `JUPYTERHUB_SERVICE_URL`.
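An externally-managed service authenticates its Hub API requests by sending its token in an Authorization header. A standard-library sketch (the Hub URL and token are placeholders; actually sending the request requires a running Hub):

```python
import urllib.request

def hub_api_request(hub_api_url, api_token, path):
    """Build an authenticated request to the JupyterHub REST API."""
    return urllib.request.Request(
        f"{hub_api_url.rstrip('/')}/{path.lstrip('/')}",
        # token auth: "Authorization: token <api_token>"
        headers={"Authorization": f"token {api_token}"},
    )

# placeholder URL and token; pass req to urllib.request.urlopen(req)
# against a live Hub
req = hub_api_request("http://127.0.0.1:8080/hub/api", "super-secret", "/users")
```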
Service credentials#
A service has direct access to the Hub API via its `api_token`.
Exactly what actions the service can take are governed by the service's role assignments:
c.JupyterHub.services = [
    {
        "name": "user-lister",
        "command": ["python3", "/path/to/user-lister"],
    }
]

c.JupyterHub.load_roles = [
    {
        "name": "list-users",
        "scopes": ["list:users", "read:users"],
        "services": ["user-lister"],
    }
]
When a service has a configured URL or an explicit `oauth_client_id` or `oauth_redirect_uri`, it can operate as an OAuth client.
When a user visits an oauth-authenticated service,
completion of authentication results in issuing an oauth token.
This token is:
owned by the authenticated user
associated with the oauth client of the service
governed by the service's `oauth_client_allowed_scopes` configuration
This token enables the service to act on behalf of the user.
When an oauth-authenticated service makes a request to the Hub (or another Hub-authenticated service), it has two credentials available to authenticate the request:
the service's own `api_token`, which acts as the service and is governed by the service's own role assignments
the user's oauth token issued to the service during the oauth flow, which acts as the user
Choosing which one to use depends on “who” should be considered taking the action represented by the request.
A service's own permissions govern how it can act without any involvement of a user.
The service's `oauth_client_allowed_scopes` configuration allows individual users to delegate permission for the service to act on their behalf.
This allows services to have little to no permissions of their own,
while still letting users take actions via the service using their own credentials.
An example of such a service would be a web application for instructors, presenting a dashboard of actions which can be taken for students in their courses. The service would need no permission to do anything with the JupyterHub API on its own, but it could employ the user’s oauth credentials to list users, manage student servers, etc.
This service might look like:
c.JupyterHub.services = [
    {
        "name": "grader-dashboard",
        "command": ["python3", "/path/to/grader-dashboard"],
        "url": "http://127.0.0.1:12345",
        "oauth_client_allowed_scopes": [
            "list:users",
            "read:users",
        ],
    }
]

c.JupyterHub.load_roles = [
    {
        "name": "grader",
        "scopes": [
            "list:users!group=class-a",
            "read:users!group=class-a",
            "servers!group=class-a",
            "access:servers!group=class-a",
            "access:services",
        ],
        "groups": ["graders"],
    }
]
In this example, the grader-dashboard
service does not have permission to take any actions with the Hub API on its own because it has not been assigned any role.
But when a grader accesses the service,
the dashboard will have a token with permission to list and read information about any users that the grader can access.
The dashboard will not have permission to do additional things as the grader.
The dashboard will be able to:
list users in class A (list:users!group=class-a)
read information about users in class A (read:users!group=class-a)
The dashboard will not be able to:
start, stop, or access user servers (servers, access:servers), even though the grader has this permission (it’s not in oauth_client_allowed_scopes)
take any action without the grader granting permission via oauth
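The choice between the two credentials can be made concrete with a short sketch. Both are sent in the same Authorization header; the token values below are placeholders for illustration:

```python
# Sketch: the two credentials available to an oauth-authenticated service.
# Both are sent the same way; only "who" the request acts as differs.
import os

def auth_header(token):
    """Authorization header JupyterHub expects for any API token."""
    return {"Authorization": f"token {token}"}

# Acting as the service itself (JUPYTERHUB_API_TOKEN is set by the Hub
# for managed services; governed by the service's own roles):
service_headers = auth_header(os.environ.get("JUPYTERHUB_API_TOKEN", "service-token"))

# Acting as the grader (the token issued during the oauth flow;
# governed by oauth_client_allowed_scopes):
grader_headers = auth_header("grader-oauth-token")
```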
Adding or removing services at runtime#
Only externally-managed services can be added at runtime by using JupyterHub’s REST API.
Add a new service#
To add a new service, send a POST request to this endpoint
POST /hub/api/services/:servicename
Required scope: admin:services
Payload: The payload should contain the definition of the service to be created. The endpoint supports the same properties as externally-managed services defined in the config file.
Possible responses
201 Created: The service and related objects are created (and started, in the case of a Hub-managed one) successfully.
400 Bad Request: The payload is invalid or JupyterHub can not create the service.
409 Conflict: A service with the same name already exists.
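Such a request might be composed as in the following sketch, using only the Python standard library (the Hub URL, token, and service definition are hypothetical values for illustration):

```python
import json
import urllib.request

def build_add_service_request(hub_url, api_token, name, service):
    """Compose (but do not send) the POST request that creates a service.
    The token must carry the admin:services scope."""
    return urllib.request.Request(
        f"{hub_url}/hub/api/services/{name}",
        method="POST",
        data=json.dumps(service).encode(),
        headers={
            "Authorization": f"token {api_token}",
            "Content-Type": "application/json",
        },
    )

# Hypothetical Hub URL, token, and externally-managed service definition:
req = build_add_service_request(
    "http://127.0.0.1:8081",
    "admin-token-placeholder",
    "user-lister",
    {"url": "http://127.0.0.1:10101", "api_token": "service-token-placeholder"},
)
# urllib.request.urlopen(req) would submit it to a running Hub
```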
Remove an existing service#
To remove an existing service, send a DELETE request to this endpoint
DELETE /hub/api/services/:servicename
Required scope: admin:services
Payload: None
Possible responses
200 OK: The service and related objects are removed (and stopped, in the case of a Hub-managed one) successfully.
400 Bad Request: JupyterHub can not remove the service.
404 Not Found: The requested service does not exist.
405 Not Allowed: The requested service was created from the config file; it can not be removed at runtime.
Writing your own Services#
When writing your own services, you have a few decisions to make (in addition to what your service does!):
Does my service need a public URL?
Do I want JupyterHub to start/stop the service?
Does my service need to authenticate users?
When a Service is managed by JupyterHub, the Hub will pass the necessary information to the Service via the environment variables described above. A flexible Service, whether managed by the Hub or not, can make use of these same environment variables.
When you run a service that has a URL, it will be accessible under a
/services/
prefix, such as https://myhub.horse/services/my-service/
. For
your service to route proxied requests properly, it must take
JUPYTERHUB_SERVICE_PREFIX
into account when routing requests. For example, a
web service would normally serve its root handler at '/'
, but the proxied
service would need to serve JUPYTERHUB_SERVICE_PREFIX
.
Note that JUPYTERHUB_SERVICE_PREFIX
will contain a trailing slash. This must
be taken into consideration when creating the service routes. If you include an
extra slash you might get unexpected behavior. For example, if your service has a
/foo
endpoint, the route would be JUPYTERHUB_SERVICE_PREFIX + foo
, and
/foo/bar
would be JUPYTERHUB_SERVICE_PREFIX + foo/bar
.
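The trailing-slash rule can be sketched in a few lines (the fallback prefix value here is an assumption for illustration):

```python
import os

# JUPYTERHUB_SERVICE_PREFIX always ends in a slash, so route suffixes must
# not begin with one. The fallback value is only for illustration.
prefix = os.environ.get("JUPYTERHUB_SERVICE_PREFIX", "/services/my-service/")

routes = {
    "root": prefix,               # what an unproxied service would serve at '/'
    "foo": prefix + "foo",        # NOT prefix + "/foo", which would double the slash
    "foo_bar": prefix + "foo/bar",
}
```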
Hub Authentication and Services#
JupyterHub provides some utilities for using the Hub’s authentication mechanism to govern access to your service.
Requests to all JupyterHub services are made with OAuth tokens.
These can either be requests with a token in the Authorization
header,
or url parameter ?token=...
,
or browser requests which must complete the OAuth authorization code flow,
which results in a token that should be persisted for future requests
(persistence is up to the service,
but an encrypted cookie confined to the service path is appropriate,
and provided by default).
Changed in version 2.0: The shared jupyterhub-services
cookie is removed.
OAuth must be used to authenticate browser requests with services.
JupyterHub includes a reference implementation of Hub authentication that can be used by services. You may go beyond this reference implementation and create custom hub-authenticating clients and services. We describe the process below.
The reference, or base, implementation is the HubAuth
class,
which implements the API requests to the Hub that resolve a token to a User model.
There are two levels of authentication with the Hub:
HubAuth - the most basic authentication, for services that should only accept API requests authorized with a token.
HubOAuth - for services that should use oauth to authenticate with the Hub. This should be used for any service that serves pages that will be visited with a browser.
To use HubAuth, you must set the .api_token
instance variable. This can be
done via the HubAuth constructor, direct assignment to a HubAuth object, or via the
JUPYTERHUB_API_TOKEN
environment variable. A number of the examples in the
root of the jupyterhub git repository set the JUPYTERHUB_API_TOKEN
variable
so consider having a look at those for further reading
(cull-idle,
external-oauth,
service-notebook
and service-whoami).
Most of the authentication logic is found in the
HubAuth.user_for_token()
method,
which makes a request to the Hub and returns:
None, if no user could be identified, or
a dict of the following form:
{
    "name": "username",
    "groups": ["list", "of", "groups"],
    "scopes": [
        "access:servers!server=username/",
    ],
}
You are then free to use the returned user information to take appropriate action.
HubAuth also caches the Hub’s response for a number of seconds,
configurable by the cookie_cache_max_age
setting (default: five minutes).
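The caching behavior can be pictured with a simplified stand-in (this is an illustration, not HubAuth’s actual implementation):

```python
import time

class TokenCache:
    """Simplified stand-in for HubAuth's response caching: resolved user
    models are reused for cache_max_age seconds before a fresh request
    to the Hub is required (cookie_cache_max_age defaults to 300)."""

    def __init__(self, cache_max_age=300):
        self.cache_max_age = cache_max_age
        self._cache = {}

    def get(self, token):
        entry = self._cache.get(token)
        if entry is None:
            return None
        user, stored_at = entry
        if time.monotonic() - stored_at > self.cache_max_age:
            del self._cache[token]  # expired: caller should re-ask the Hub
            return None
        return user

    def set(self, token, user):
        self._cache[token] = (user, time.monotonic())

cache = TokenCache()
cache.set("abc123", {"name": "inara"})
```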
If your service would like to make further requests on behalf of users,
it should use the token issued by this OAuth process.
If you are using tornado,
you can access the token authenticating the current request with HubAuth.get_token()
.
Changed in version 2.2: HubAuth.get_token()
adds support for retrieving
tokens stored in tornado cookies after the completion of OAuth.
Previously, it only retrieved tokens from URL parameters or the Authorization header.
Passing get_token(handler, in_cookie=False)
preserves this behavior.
Flask Example#
For example, suppose you have a Flask service that returns information about a user.
JupyterHub’s HubAuth class can be used to authenticate requests to the Flask
service. See the service-whoami-flask
example in the
JupyterHub GitHub repo
for more details.
Authenticating tornado services with JupyterHub#
Since most Jupyter services are written with tornado,
we include a mixin class, HubOAuthenticated
,
for quickly authenticating your own tornado services with JupyterHub.
Tornado’s authenticated()
decorator calls a Handler’s get_current_user()
method to identify the user. Mixing in HubAuthenticated
defines
get_current_user()
to use HubAuth. If you want to configure the HubAuth
instance beyond the default, you’ll want to define an initialize()
method,
such as:
class MyHandler(HubOAuthenticated, web.RequestHandler):
def initialize(self, hub_auth):
self.hub_auth = hub_auth
@web.authenticated
def get(self):
...
The HubAuth class will automatically load the desired configuration from the Service environment variables.
Changed in version 2.0: Access scopes are used to govern access to services.
Prior to 2.0,
sets of users and groups could be used to grant access
by defining .hub_groups
or .hub_users
on the authenticated handler.
These are ignored if the 2.0 .hub_scopes
is defined.
See also
Implementing your own Authentication with JupyterHub#
If you don’t want to use the reference implementation (e.g. you find the implementation a poor fit for your Flask app), you can implement authentication via the Hub yourself. JupyterHub is a standard OAuth2 provider, so you can use any OAuth 2 client implementation appropriate for your toolkit. See the FastAPI example for an example of using JupyterHub as an OAuth provider with FastAPI, without using any code imported from JupyterHub.
On completion of OAuth, you will have an access token for JupyterHub, which can be used to identify the user and the permissions (scopes) the user has authorized for your service.
You will only get to this stage if the user has the required access:services!service=$service-name
scope.
To retrieve the user model for the token, make a request to GET /hub/api/user
with the token in the Authorization header.
We recommend looking at the HubOAuth
class implementation for reference,
and taking note of the following process:
retrieve the token from the request
make an API request GET /hub/api/user, with the token in the Authorization header
For example, with requests:
r = requests.get(
    "http://127.0.0.1:8081/hub/api/user",
    headers={
        'Authorization': f'token {api_token}',
    },
)
r.raise_for_status()
user = r.json()
On success, the reply will be a JSON model describing the user:
{
    "name": "inara",
    # groups may be omitted, depending on permissions
    "groups": ["serenity", "guild"],
    # scopes is new in JupyterHub 2.0
    "scopes": [
        "access:services",
        "read:users:name",
        "read:users!user=inara",
        "..."
    ]
}
The scopes
field can be used to manage access.
Note: a user must have access to a service in order to complete oauth with that service in the first place.
Individual permissions may be revoked at any later point without revoking the token,
in which case the scopes
field in this model should be checked on each access.
The default required scopes for access are available from hub_auth.oauth_scopes
or $JUPYTERHUB_OAUTH_ACCESS_SCOPES
.
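A per-request check of the scopes field might look like the following sketch (the service name and user model are illustrative):

```python
def has_access(user, service_name):
    """Check whether a user model's scopes authorize access to a service.
    A sketch of the per-request check; at runtime the exact required
    scopes are available from $JUPYTERHUB_OAUTH_ACCESS_SCOPES."""
    required = {
        "access:services",  # access to all services
        f"access:services!service={service_name}",  # this service only
    }
    return bool(required & set(user.get("scopes", [])))

user = {
    "name": "inara",
    "scopes": ["access:services", "read:users:name", "read:users!user=inara"],
}
has_access(user, "grader-dashboard")  # → True
```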
An example of using an Externally-Managed Service and authentication is in the nbviewer README section on securing the notebook viewer, and an example of its configuration is found here. nbviewer can also be run as a Hub-Managed Service, as described in the same README section on securing the notebook viewer.
JupyterHub URL scheme#
This document describes how JupyterHub routes requests.
This does not include the REST API URLs.
In general, all URLs can be prefixed with c.JupyterHub.base_url
to
run the whole JupyterHub application on a prefix.
All authenticated handlers redirect to /hub/login
to log in users
before redirecting back to the originating page.
The redirect preserves all query parameters.
/
#
The top-level request is always a simple redirect to /hub/
,
to be handled by the default JupyterHub handler.
In general, all requests to /anything
that do not start with /hub/
but are routed to the Hub, will be redirected to /hub/anything
before being handled by the Hub.
/hub/
#
This is an authenticated URL.
This handler redirects users to the default URL of the application,
which defaults to the user’s default server.
That is, the handler redirects to /hub/spawn
if the user’s server is not running,
or to the server itself (/user/:name
) if the server is running.
This default URL behavior can be customized in two ways:
First, to redirect users to the JupyterHub home page (/hub/home
)
instead of spawning their server,
set redirect_to_server
to False:
c.JupyterHub.redirect_to_server = False
This might be useful if you have a Hub where you expect users to be managing multiple server configurations but automatic spawning is not desirable.
Second, you can customize the landing page to be any page you like, such as a custom service you have deployed, e.g. with course information:
c.JupyterHub.default_url = '/services/my-landing-service'
/hub/home
#
By default, the Hub home page has just one or two buttons for starting and stopping the user’s server.
If named servers are enabled, there will be some additional tools for management of the named servers.
Version added: 1.0 named server UI is new in 1.0.
/hub/login
#
This is the JupyterHub login page. If you have a form-based username+password login, such as the default PAMAuthenticator, this page will render the login form.
If login is handled by an external service, e.g. with OAuth, this page will have a button, declaring “Log in with …” which users can click to log in with the chosen service.
If you want to skip the user interaction and initiate login via the button, you can set:
c.Authenticator.auto_login = True
This can be useful when the user is “already logged in” via some mechanism.
However, a handshake via redirects
is necessary to complete the authentication with JupyterHub.
/hub/logout
#
Visiting /hub/logout
clears cookies from the current browser.
Note that logging out does not stop a user’s server(s) by default.
If you would like to shut down user servers on logout, you can enable this behavior with:
c.JupyterHub.shutdown_on_logout = True
Be careful with this setting because logging out one browser does not mean the user is no longer actively using their server from another machine.
/user/:username[/:servername]
#
If a user’s server is running, this URL is handled by the user’s given server, not by the Hub. The username is the first part, and if using named servers, the server name is the second part.
If the user’s server is not running, this will be redirected to /hub/user/:username/...
/hub/user/:username[/:servername]
#
This URL indicates a request for a user server that is not running
(because /user/...
would have been handled by the notebook server
if the specified server were running).
Handling this URL depends on two conditions: whether the requested user is found as a match, and the state of the requested user’s notebook server. For example:
the server is not active
a. the user matches
b. the user doesn’t match
the server is ready
the server is pending, but not ready
If the server is pending spawn,
the browser will be redirected to /hub/spawn-pending/:username/:servername
to see a progress page while waiting for the server to be ready.
If the server is not active at all,
a page will be served with a link to /hub/spawn/:username/:servername
.
Following that link will launch the requested server.
The HTTP status will be 503 in this case because a request has been made for a server that is not running.
If the server is ready, it is assumed that the proxy has not yet registered the route.
Some checks are performed and a delay is added before redirecting back to /user/:username/:servername/...
.
If something is really wrong, this can result in a redirect loop.
Visiting this page will never result in triggering the spawn of servers without additional user action (i.e. clicking the link on the page).
Version changed: 1.0
Prior to 1.0, this URL itself was responsible for spawning servers. If the spawn was pending, the URL rendered the progress page; if the server was running, it redirected there. This was useful because it ensured that requested servers were restarted after they stopped. However, it could also be harmful because unused servers would be restarted continuously if, for example, an idle JupyterLab frontend making constant polling requests was left open and pointed at it.
Special handling of API requests#
Requests to /user/:username[/:servername]/api/...
are assumed to be
from applications connected to stopped servers.
These requests fail with a 503
status code and an informative JSON error message
that indicates how to spawn the server.
This is meant to help applications such as JupyterLab,
that are connected to a server that has stopped.
Version changed: 1.0
JupyterHub version 0.9 failed these API requests with status 404
,
but version 1.0 uses 503.
/user-redirect/...
#
The /user-redirect/...
URL is for sharing a URL that will redirect a user
to a path on their own default server.
This is useful when different users have the same file at the same URL on their servers,
and you want a single link to give to any user that will open that file on their server.
e.g. a link to /user-redirect/notebooks/Index.ipynb
will send user hortense
to /user/hortense/notebooks/Index.ipynb
DO NOT share links to your own server with other users. This will not work in general, unless you grant those users access to your server.
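The redirect’s effect can be sketched as a small function (an illustration, not JupyterHub’s implementation):

```python
def resolve_user_redirect(path, username):
    """Map a /user-redirect/... path onto the visiting user's own default
    server. Sketch of the redirect's effect only."""
    prefix = "/user-redirect/"
    if not path.startswith(prefix):
        raise ValueError(f"not a user-redirect URL: {path}")
    return f"/user/{username}/{path[len(prefix):]}"

resolve_user_redirect("/user-redirect/notebooks/Index.ipynb", "hortense")
# → "/user/hortense/notebooks/Index.ipynb"
```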
Contributions welcome: The JupyterLab “shareable link” should share this link when run with JupyterHub, but it does not. See jupyterlab-hub where this should probably be done and this issue in JupyterLab that is intended to make it possible.
Spawning#
/hub/spawn[/:username[/:servername]]
#
Requesting /hub/spawn
will spawn the default server for the current user.
If the username
and optionally servername
are specified,
then the specified server for the specified user will be spawned.
Once spawn has been requested,
the browser is redirected to /hub/spawn-pending/...
.
If Spawner.options_form
is used,
this will render a form,
and a POST request will trigger the actual spawn and redirect.
Version added: 1.0
1.0 adds the ability to specify username
and servername
.
Prior to 1.0, only /hub/spawn
was recognized for the default server.
Version changed: 1.0
Prior to 1.0, this page redirected back to /hub/user/:username
,
which was responsible for triggering spawn and rendering progress, etc.
/hub/spawn-pending[/:username[/:servername]]
#
Version added: 1.0 this URL is new in JupyterHub 1.0.
This page renders the progress view for the given spawn request.
Once the server is ready,
the browser is redirected to the running server at /user/:username/:servername/...
.
If this page is requested at any time after the specified server is ready, the browser will be redirected to the running server.
Requesting this page will never trigger any side effects.
If the server is not running (e.g. because the spawn has failed),
the spawn failure message (if applicable) will be displayed,
and the page will show a link back to /hub/spawn/...
.
/hub/token
#
On this page, users can manage their JupyterHub API tokens. They can revoke access and request new tokens for writing scripts against the JupyterHub REST API.
/hub/admin
#
Administrators can take various administrative actions from this page:
add/remove users
grant admin privileges
start/stop user servers
shutdown JupyterHub itself
Event logging and telemetry#
JupyterHub can be configured to record structured events from a running server using Jupyter’s Telemetry System. The types of events that JupyterHub emits are defined by JSON schemas listed at the bottom of this page.
How to emit events#
Event logging is handled by JupyterHub’s EventLog
object. This leverages Python’s standard logging library to emit, filter, and collect event data.
To begin recording events, you’ll need to set two configurations:
handlers
: tells the EventLog where to route your events. This trait is a list of Python logging handlers that route events to the event log file.
allowed_schemas
: tells the EventLog which events should be recorded. No events are emitted by default; all recorded events must be listed here.
Here’s a basic example:
import logging
c.EventLog.handlers = [
logging.FileHandler('event.log'),
]
c.EventLog.allowed_schemas = [
'hub.jupyter.org/server-action'
]
The output is a file, "event.log"
, with events recorded as JSON data.
Event schemas#
JupyterHub server events#
hub.jupyter.org/server-action
Record actions on user servers made via JupyterHub. JupyterHub can perform various actions on user servers via direct interaction from users, or via the API. This event is recorded whenever either of those happen.
type: object
Properties:
action (enum: start, stop) - Action performed by JupyterHub. This is a required field.
username (string) - Name of the user whose server this action was performed on. This is the normalized name used by JupyterHub itself, which is derived from the authentication provider used but might not be the same as used in the authentication provider.
servername (string) - Name of the server this action was performed on. JupyterHub supports each user having multiple servers with arbitrary names, and this field specifies the name of the server. The ‘default’ server is denoted by the empty string.
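Putting the fields together, a recorded server-action event carries data shaped like this illustrative payload:

```python
# Illustrative payload for a hub.jupyter.org/server-action event:
# 'action' must be one of the enum values, 'username' is the normalized
# JupyterHub username, and 'servername' is '' for the default server.
event = {
    "action": "start",
    "username": "inara",
    "servername": "",
}

assert event["action"] in ("start", "stop")
```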
Monitoring#
This section covers details on monitoring the state of your JupyterHub installation.
JupyterHub exposes a /metrics
endpoint that returns text describing its current
operational state in a format Prometheus understands.
Prometheus is a separate open source tool that can be configured to repeatedly poll
JupyterHub’s /metrics
endpoint to parse and save its current state.
By doing so, Prometheus can describe JupyterHub’s evolving state over time. This evolving state can then be queried through Prometheus, which exposes its underlying storage to those allowed to access it, and presented in dashboards by a tool like Grafana.
List of Prometheus Metrics#
Type | Name | Description
---|---|---
gauge | jupyterhub_active_users | Number of users who were active in the given time period
histogram | jupyterhub_check_routes_duration_seconds | Time taken to validate all routes in proxy
histogram | jupyterhub_event_loop_interval_seconds | Distribution of measured event loop intervals
histogram | jupyterhub_hub_startup_duration_seconds | Time taken for Hub to start
histogram | jupyterhub_init_spawners_duration_seconds | Time taken for spawners to initialize
histogram | jupyterhub_proxy_add_duration_seconds | Duration for adding user routes to proxy
histogram | jupyterhub_proxy_delete_duration_seconds | Duration for deleting user routes from proxy
histogram | jupyterhub_proxy_poll_duration_seconds | Duration for polling all routes from proxy
histogram | jupyterhub_request_duration_seconds | Request duration for all HTTP requests
gauge | jupyterhub_running_servers | The number of user servers currently running
histogram | jupyterhub_server_poll_duration_seconds | Time taken to poll if server is running
histogram | jupyterhub_server_spawn_duration_seconds | Time taken for server spawning operation
histogram | jupyterhub_server_stop_seconds | Time taken for server stopping operation
gauge | jupyterhub_total_users | Total number of users
Customizing the metrics prefix#
JupyterHub metrics all have a jupyterhub_
prefix.
As of JupyterHub 5.0, this can be overridden with the $JUPYTERHUB_METRICS_PREFIX
environment variable
in the Hub’s environment.
For example,
export JUPYTERHUB_METRICS_PREFIX=jupyterhub_prod
would result in the metric jupyterhub_prod_active_users
, etc.
A Gallery of JupyterHub Deployments#
A JupyterHub Community Resource
We’ve compiled this list of JupyterHub deployments to help the community see the breadth and growth of JupyterHub’s use in education, research, and high performance computing.
Please submit pull requests to update information or to add new institutions or uses.
Academic Institutions, Research Labs, and Supercomputer Centers#
University of California Berkeley#
University of California Davis#
Although not technically a JupyterHub deployment, this tutorial setup may be helpful to others in the Jupyter community.
Thank you C. Titus Brown for sharing this with the Software Carpentry mailing list.
* I started a big Amazon machine;
* I installed Docker and built a custom image containing my software of
interest;
* I ran multiple containers, one connected to port 8000, one on 8001,
etc. and gave each student a different port;
* students could connect in and use the Terminal program in Jupyter to
execute commands, and could upload/download files via the Jupyter
console interface;
* in theory I could have used notebooks too, but for this I didn’t have
need.
I am aware that JupyterHub can probably do all of this including manage
the containers, but I’m still a bit shy of diving into that; this was
fairly straightforward, gave me disposable containers that were isolated
for each individual student, and worked almost flawlessly. Should be
easy to do with RStudio too.
Cal Poly San Luis Obispo#
jupyterhub-deploy-teaching based on work by Brian Granger for Cal Poly’s Data Science 301 Course
CERN#
CERN, also known as the European Organization for Nuclear Research, is a world-renowned scientific research centre and the home of the Large Hadron Collider (LHC).
Within CERN, there are two noteworthy JupyterHub deployments in operation:
SWAN, which stands for Service for Web based Analysis, serves as an interactive data analysis platform primarily utilized at CERN.
VRE, which stands for Virtual Research Environment, is an analysis platform developed within the EOSC Project to cater to the needs of scientific communities involved in European projects.
Chameleon#
Chameleon is a NSF-funded configurable experimental environment for large-scale computer science systems research with bare metal reconfigurability. Chameleon users utilize JupyterHub to document and reproduce their complex CISE and networking experiments.
Shared JupyterHub: provides a common “workbench” environment for any Chameleon user.
Trovi: a sharing portal of experiments, tutorials, and examples, which users can launch as dedicated isolated environments on Chameleon’s JupyterHub.
Clemson University#
Advanced Computing
University of Colorado Boulder#
(CU Research Computing) CURC
-
Slurm job dispatched on Crestone compute cluster
log troubleshooting
Profiles in IPython Clusters tab
-
ETH Zurich#
ETH Zurich, (Federal Institute of Technology Zurich), is a public research university in Zürich, Switzerland, with focus on science, technology, engineering, and mathematics, although its 16 departments span a variety of disciplines and subjects.
The Educational Development and Technology unit provides JupyterHub exclusively for teaching and learning, integrated into the learning management system Moodle. Each course gets its own individually configured JupyterHub environment, deployed on an on-premises Kubernetes cluster.
ETH JupyterHub for teaching and learning
George Washington University#
JupyterHub with university single-sign-on. Deployed early 2017.
HTCondor#
University of Illinois#
https://datascience.business.illinois.edu (currently down; checked 10/26/22)
IllustrisTNG Simulation Project#
MIT and Lincoln Labs#
https://supercloud.mit.edu/
Michigan State University#
University of Minnesota#
University of Missouri#
https://dsa.missouri.edu/faq/
Paderborn University#
-
nbgraderutils: Use JupyterHub + nbgrader + iJava kernel for online Java exercises. Used in lecture Statistical Natural Language Processing.
Penn State University#
Press release: “New open-source web apps available for students and faculty”
University of California San Diego#
San Diego Supercomputer Center - Andrea Zonca
Educational Technology Services - Paul Jamason
TACC University of Texas#
Texas A&M#
Kristen Thyng - Oceanography
Elucidata#
What’s new in Jupyter Notebooks @Elucidata:
Service Providers#
AWS#
Google Cloud Platform#
Everware#
Everware: reproducible and reusable science powered by JupyterHub and Docker. Like nbviewer, but executable. CERN, Geneva (website)
Microsoft Azure#
Rackspace Carina#
https://getcarina.com/blog/learning-how-to-whale/
https://carolynvanslyck.com/talk/carina/jupyterhub/#/ (but carolynvanslyck is currently down; checked 10/26/22)
Hadoop#
Sirepo#
Sirepo is an online Computer-Aided Engineering gateway that contains a JupyterHub instance. Sirepo is provided at no cost for community use, but users must request login access.
Miscellaneous#
https://medium.com/@ybarraud/setting-up-jupyterhub-with-sudospawner-and-anaconda-844628c0dbee#.rm3yt87e1
https://www.laketide.com/building-your-lab-part-3/
https://estrellita.hatenablog.com/entry/2015/07/31/083202
https://www.walkingrandomly.com/?p=5734
https://wrdrd.com/docs/consulting/education-technology
https://bitbucket.org/jackhale/fenics-jupyter
Changelog#
For detailed changes from the prior release, click on the version number, and
its link will bring up a GitHub listing of changes. Use git log
on the
command line for details.
Versioning#
JupyterHub follows Intended Effort Versioning (EffVer) for versioning, where the version number is meant to indicate the amount of effort required to upgrade to the new version.
Factors contributing to major version bumps in JupyterHub include:
Database schema changes that require migrations and are hard to roll back
Increasing the minimum required Python version
Large new features
Breaking changes likely to affect users
Unreleased#
4.1#
4.1.5 - 2024-04-04#
Bugs fixed#
singleuser mixin: include check_xsrf_cookie in overrides #4771 (@minrk, @consideRatio)
Contributors to this release#
The following people contributed discussions, new ideas, code and documentation contributions, and review. See our definition of contributors.
(GitHub contributors page for this release)
@consideRatio (activity) | @manics (activity) | @minrk (activity)
4.1.4 - 2024-03-30#
Bugs fixed#
avoid xsrf check on navigate GET requests #4759 (@minrk, @consideRatio)
Contributors to this release#
The following people contributed discussions, new ideas, code and documentation contributions, and review. See our definition of contributors.
4.1.3 - 2024-03-26#
Bugs fixed#
respect jupyter-server disable_check_xsrf setting #4753 (@minrk, @consideRatio)
Contributors to this release#
The following people contributed discussions, new ideas, code and documentation contributions, and review. See our definition of contributors.
4.1.2 - 2024-03-25#
4.1.2 fixes a regression in 4.1.0 affecting named servers.
Bugs fixed#
rework handling of multiple xsrf tokens #4750 (@minrk, @consideRatio)
Contributors to this release#
The following people contributed discussions, new ideas, code and documentation contributions, and review. See our definition of contributors.
4.1.1 - 2024-03-23#
4.1.1 fixes a compatibility regression in 4.1.0 for some extensions, particularly jupyter-server-proxy.
Bugs fixed#
allow subclasses to override xsrf check #4745 (@minrk, @consideRatio)
Contributors to this release#
The following people contributed discussions, new ideas, code and documentation contributions, and review. See our definition of contributors.
4.1.0 - 2024-03-20#
JupyterHub 4.1 is a security release, fixing CVE-2024-28233. All JupyterHub deployments are encouraged to upgrade, especially those with other user content on peer domains to JupyterHub.
As always, JupyterHub deployments are especially encouraged to enable per-user domains if protecting users from each other is a concern.
For more information on securely deploying JupyterHub, see the web security documentation.
Enhancements made#
Bugs fixed#
Backport PR #4733 on branch 4.x (Catch ValueError while waiting for server to be reachable) #4734 (@minrk)
Backport PR #4679 on branch 4.x (Unescape jinja username) #4705 (@minrk)
Backport PR #4630: avoid setting unused oauth state cookies on API requests #4697 (@minrk)
Backport PR #4632: simplify, avoid errors in parsing accept headers #4696 (@minrk)
Backport PR #4677 on branch 4.x (Improve validation, docs for token.expires_in) #4692 (@minrk)
Backport PR #4570 on branch 4.x (fix mutation of frozenset in scope intersection) #4691 (@minrk)
Backport PR #4562 on branch 4.x (Use
user.stop
to cleanup spawners that stopped while Hub was down) #4690 (@minrk)
Backport PR #4542 on branch 4.x (Fix include_stopped_servers in paginated next_url) #4689 (@minrk)
Backport PR #4651 on branch 4.x (avoid attempting to patch removed IPythonHandler with notebook v7) #4688 (@minrk)
Backport PR #4560 on branch 4.x (singleuser extension: persist token from ?token=… url in cookie) #4687 (@minrk)
Maintenance and upkeep improvements#
Contributors to this release#
The following people contributed discussions, new ideas, code and documentation contributions, and review. See our definition of contributors.
(GitHub contributors page for this release)
@Achele (activity) | @akashthedeveloper (activity) | @balajialg (activity) | @BhavyaT-135 (activity) | @blink1073 (activity) | @consideRatio (activity) | @fcollonval (activity) | @I-Am-D-B (activity) | @jakirkham (activity) | @ktaletsk (activity) | @kzgrzendek (activity) | @lumberbot-app (activity) | @manics (activity) | @mbiette (activity) | @minrk (activity) | @rcthomas (activity) | @ryanlovett (activity) | @sgaist (activity) | @shubham0473 (activity) | @Temidayo32 (activity) | @willingc (activity) | @yuvipanda (activity)
4.0#
4.0.2 - 2023-08-10#
Enhancements made#
avoid counting failed requests to not-running servers as ‘activity’ #4491 (@minrk, @consideRatio)
improve permission-denied errors for various cases #4489 (@minrk, @consideRatio)
Bugs fixed#
set root_dir when using singleuser extension #4503 (@minrk, @consideRatio, @manics)
Allow setting custom log_function in tornado_settings in SingleUserServer #4475 (@grios-stratio, @minrk)
Documentation improvements#
Contributors to this release#
The following people contributed discussions, new ideas, code and documentation contributions, and review. See our definition of contributors.
(GitHub contributors page for this release)
@agelosnm (activity) | @consideRatio (activity) | @diocas (activity) | @grios-stratio (activity) | @jhgoebbert (activity) | @jtpio (activity) | @kosmonavtus (activity) | @kreuzert (activity) | @manics (activity) | @martinRenou (activity) | @minrk (activity) | @opoplawski (activity) | @Ph0tonic (activity) | @sgaist (activity) | @trungleduc (activity) | @yuvipanda (activity)
4.0.1 - 2023-06-08#
Enhancements made#
Bugs fixed#
Abort informatively on unrecognized CLI options #4467 (@minrk, @consideRatio)
Add xsrf to custom_html template context #4464 (@opoplawski, @minrk)
preserve CLI > env priority config in jupyterhub-singleuser extension #4451 (@minrk, @consideRatio, @timeu, @rcthomas)
Maintenance and upkeep improvements#
Fix link to collaboration accounts doc in example #4448 (@minrk)
Update jsx dependencies as much as possible #4443 (@manics, @minrk, @consideRatio)
Remove unused admin JS code #4438 (@yuvipanda, @minrk)
Finish migrating browser tests from selenium to playwright #4435 (@mouse1203, @minrk, @consideRatio)
Migrate some tests from selenium to playwright #4431 (@mouse1203, @minrk)
Begin setup of playwright tests #4420 (@mouse1203, @minrk, @manics)
Documentation improvements#
‘servers’ should be a dict of dicts, not a list of dicts in rest-api.yml #4458 (@tfmark, @minrk)
Config reference: link to nicer(?) API docs first #4456 (@manics, @minrk, @consideRatio)
Add CERN to Gallery of JupyterHub Deployments #4454 (@goseind, @minrk, @consideRatio)
Fix “Thanks” typo. #4441 (@ryanlovett, @minrk)
add HUNT into research institutions #4432 (@matuskosut, @minrk, @manics)
docs: fix missing redirects for api to reference/api #4429 (@consideRatio, @minrk, @manics)
Fix some public URL links within the docs #4427 (@minrk, @consideRatio)
add upgrade note for 4.0 to changelog #4426 (@minrk, @consideRatio)
Contributors to this release#
The following people contributed discussions, new ideas, code and documentation contributions, and review. See our definition of contributors.
(GitHub contributors page for this release)
@consideRatio (activity) | @diocas (activity) | @echarles (activity) | @goseind (activity) | @hsadia538 (activity) | @mahamtariq58 (activity) | @manics (activity) | @matuskosut (activity) | @minrk (activity) | @mouse1203 (activity) | @opoplawski (activity) | @rcthomas (activity) | @ryanlovett (activity) | @tfmark (activity) | @timeu (activity) | @yuvipanda (activity)
4.0.0 - 2023-04-20#
4.0 is a major release, but a small one.
Upgrade note
Upgrading from 3.1 to 4.0 should require no additional action beyond running jupyterhub --upgrade-db to upgrade the database schema after upgrading the package version. It is otherwise a regular jupyterhub upgrade.
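For reference, a minimal upgrade sequence might look like the following. This is a sketch, not official guidance: the backup command assumes the default SQLite database in the current directory, and your install method (pip, conda, container image) may differ.

```shell
# Back up the database before upgrading (assumes the default SQLite backend)
cp jupyterhub.sqlite jupyterhub.sqlite.bak

# Upgrade the package, then the database schema
python3 -m pip install --upgrade 'jupyterhub>=4.0,<5'
jupyterhub --upgrade-db
```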
There are three major changes that should be invisible to most users:
Groups can now have ‘properties’, editable via the admin page, which can be used by Spawners for their operations. This requires a db schema upgrade, so remember to backup and upgrade your database!
Often-problematic header-based checks for cross-site requests have been replaced with more standard use of XSRF tokens. Most folks shouldn’t notice this change, but if “Blocking Cross Origin API request” has been giving you headaches, this should be much improved.
Improved support for Jupyter Server 2.0 by reimplementing jupyterhub-singleuser as a standard server extension. This mode is used by default with Jupyter Server >=2.0. Again, this should be an implementation detail to most, but it’s a big change under the hood. If you have issues, please let us know; you can opt out by setting JUPYTERHUB_SINGLEUSER_EXTENSION=0 in your single-user environment.
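If you want to apply the opt-out for all users at once, one way (a sketch, using the standard Spawner.environment config option) is to set the variable from jupyterhub_config.py:

```python
# jupyterhub_config.py -- sketch: disable the singleuser server-extension
# mode for all spawned servers; merge with any existing environment config
c.Spawner.environment = {"JUPYTERHUB_SINGLEUSER_EXTENSION": "0"}
```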
In addition to these, thanks to contributions from this year’s Outreachy interns, we have reorganized the documentation according to Diátaxis, improved accessibility of JupyterHub pages, and improved testing.
API and Breaking Changes#
Use XSRF tokens for cross-site checks #4032 (@minrk, @consideRatio, @julietKiloRomeo)
New features added#
add Spawner.server_token_scopes config #4400 (@minrk, @consideRatio)
Make singleuser server-extension default #4354 (@minrk, @consideRatio)
singleuser auth as server extension #3888 (@minrk, @consideRatio)
Dynamic table for changing customizable properties of groups #3651 (@vladfreeze, @minrk, @naatebarber, @manics)
Enhancements made#
admin page: improve display of long lists (groups, etc.) #4417 (@manics, @minrk, @ryanlovett)
add a few more buckets for server_spawn_duration_seconds #4352 (@shaneknapp, @yuvipanda)
Standardize styling on input fields by moving common form CSS to page.less #4294 (@minrk, @consideRatio)
Bugs fixed#
make sure named server URLs include trailing slash #4402 (@minrk, @manics)
fix inclusion of singleuser/templates/page.html in wheel #4387 (@consideRatio, @minrk)
exponential_backoff: preserve jitter when max_wait is reached #4383 (@minrk, @manics)
admin panel: fix condition for start/stop buttons on user servers #4365 (@minrk, @consideRatio)
avoid logging error when browsers send invalid cookies #4356 (@minrk, @manics, @consideRatio)
test and fix deprecated load_groups list #4299 (@minrk, @manics)
Fix formatting of load_groups help string #4295 (@minrk, @consideRatio)
Fix skipped heading level across pages #4290 (@bl-aire, @minrk)
Fix reoccurring accessibility issues in JupyterHub’s pages #4274 (@bl-aire, @minrk)
Remove remnants of unused jupyterhub-services cookie #4258 (@minrk, @consideRatio)
Maintenance and upkeep improvements#
add remaining redirects for docs reorg #4423 (@minrk, @consideRatio)
Disable dev traitlets #4419 (@manics, @consideRatio)
dependabot: rename to .yaml #4409 (@consideRatio)
dependabot: fix syntax error of not using quotes for ##:## #4408 (@consideRatio)
dependabot: monthly updates of github actions #4403 (@consideRatio, @minrk)
Refresh 4.0 changelog #4396 (@minrk, @consideRatio)
Reduce size of jupyterhub image #4394 (@alekseyolg, @minrk)
Selenium: updating test_oauth_page #4393 (@mouse1203, @minrk)
avoid warning on engine_connect listener #4392 (@minrk, @consideRatio, @manics)
remove pin from singleuser #4379 (@minrk, @consideRatio, @mathbunnyru)
simplify some async fixtures #4332 (@minrk, @GeorgianaElena, @Sheila-nk)
Selenium: adding new cases that covered Admin UI page #4328 (@mouse1203, @minrk)
also ignore sqlite backups #4327 (@minrk, @consideRatio)
pre-commit: bump isort #4325 (@minrk, @consideRatio)
Remove no longer relevant notice in readme #4324 (@consideRatio, @minrk)
Fix the oauthenticator docs api links #4304 (@GeorgianaElena, @minrk)
Selenium testing: adding new case covered the authorisation page #4298 (@mouse1203, @minrk)
Refactored selenium tests for improved readability #4278 (@mouse1203, @minrk, @consideRatio)
remove deprecated import of pipes.quote #4273 (@minrk, @manics)
only run testing config on localhost #4271 (@minrk, @manics)
pre-commit: autoupdate monthly #4268 (@minrk, @consideRatio, @GeorgianaElena, @betatim)
remove unnecessary actions for firefox/geckodriver #4264 (@minrk, @manics)
docs: refresh Makefile/make.bat #4256 (@consideRatio, @minrk)
Test docs, links on CI #4251 (@minrk, @consideRatio)
maint: fix detail when removing support for py36 #4248 (@consideRatio, @manics)
more selenium test cases #4207 (@mouse1203, @minrk)
Documentation improvements#
Fix variable spelling. #4398 (@ryanlovett, @manics)
Remove bracket around link text without address #4416 (@crazytan, @minrk)
add some more detail and examples to database doc #4399 (@minrk, @consideRatio)
Add emphasis about role loading and hub restarts. #4390 (@ryanlovett, @minrk, @consideRatio)
reduce nested hierarchy in docs organization #4377 (@alwasega, @minrk)
changelog for 4.0 beta #4375 (@minrk, @consideRatio)
add collaboration accounts tutorial #4373 (@minrk, @fperez, @ryanlovett)
Updated the top-level index file #4368 (@alwasega, @sgibson91)
JupyterHub sphinx theme #4363 (@minrk, @choldgraf)
add singleuser explanation doc #4357 (@minrk, @consideRatio)
Updates to the documentation Contribution section #4355 (@alwasega, @sgibson91, @minrk)
Restructured references section of the docs #4343 (@alwasega, @minrk, @sgibson91)
Document use of pytest-asyncio in JupyterHub test suite #4341 (@Sheila-nk, @minrk, @alwasega)
Moved Explanation/Background files #4340 (@alwasega, @sgibson91, @minrk)
Moved last set of Tutorials #4338 (@alwasega, @sgibson91)
fix a couple ref links in changelog #4334 (@minrk, @consideRatio)
Added rediraffe using auto redirect builder #4331 (@alwasega, @sgibson91, @GeorgianaElena, @consideRatio, @manics)
Backport PR #4316 on branch 3.x (changelog for 3.1.1) #4318 (@minrk)
changelog for 3.1.1 #4316 (@minrk, @consideRatio)
Moved second half of HowTo documentation #4314 (@alwasega, @sgibson91, @minrk)
Moved first half of HowTo documentation #4311 (@alwasega, @sgibson91, @minrk)
number agreement in authenticators-users-basics #4309 (@TaofeeqatDev, @minrk)
transferred docs to the FAQ folder #4307 (@alwasega, @sgibson91, @minrk, @GeorgianaElena)
Added docs to the folder #4305 (@alwasega, @sgibson91, @minrk)
Created folders to house the restructured documentation #4301 (@alwasega, @sgibson91, @minrk, @manics, @GeorgianaElena)
expand database docs #4292 (@minrk, @GeorgianaElena, @ajpower, @consideRatio, @manics, @sgibson91)
added note on Spawner.name_template setting #4288 (@stevejpurves, @minrk)
Document JUPYTER_PREFER_ENV_PATH=0 for shared user environments #4269 (@minrk, @manics)
set max depth on api/index toctree #4259 (@minrk, @consideRatio)
fix bracket typo in capacity figures #4250 (@minrk, @consideRatio)
convert remaining rst files to myst #4249 (@minrk, @consideRatio)
doc: fix formatting of spawner env-vars #4245 (@manics, @consideRatio, @minrk)
Contributors to this release#
The following people contributed discussions, new ideas, code and documentation contributions, and review. See our definition of contributors.
(GitHub contributors page for this release)
@3coins (activity) | @ajcollett (activity) | @ajpower (activity) | @alekseyolg (activity) | @alwasega (activity) | @betatim (activity) | @bl-aire (activity) | @choldgraf (activity) | @consideRatio (activity) | @crazytan (activity) | @dependabot (activity) | @fperez (activity) | @GeorgianaElena (activity) | @julietKiloRomeo (activity) | @ktaletsk (activity) | @manics (activity) | @mathbunnyru (activity) | @meeseeksdev (activity) | @meeseeksmachine (activity) | @minrk (activity) | @mouse1203 (activity) | @naatebarber (activity) | @pnasrat (activity) | @pre-commit-ci (activity) | @ryanlovett (activity) | @sgibson91 (activity) | @shaneknapp (activity) | @Sheila-nk (activity) | @stevejpurves (activity) | @TaofeeqatDev (activity) | @vladfreeze (activity) | @yuvipanda (activity)
3.1#
3.1.1 - 2023-01-27#
3.1.1 has only tiny bugfixes, enabling compatibility with the latest sqlalchemy 2.0 release, and fixing some metadata files that were not being included in wheel installs.
Bugs fixed#
3.1.0 - 2022-12-05#
3.1.0 is a small release, fixing various bugs and introducing some small new features and metrics. See more details below.
Thanks to many Outreachy applicants for significantly improving our documentation! This release fixes a problem in the jupyterhub/jupyterhub docker image, where the admin page could be empty.
New features added#
Add active users prometheus metrics #4214 (@yuvipanda, @consideRatio, @manics, @minrk)
Allow named_server_limit_per_user type to be callable #4053 (@danilopeixoto, @minrk, @GeorgianaElena, @dietmarw)
Bugs fixed#
make current_user available to handle_logout hook #4230 (@minrk, @yuvipanda)
Fully resolve requested scopes in oauth #4063 (@minrk, @consideRatio)
Fix crash when removing scopes attribute from an existing role #4045 (@miwig, @minrk)
Pass launch_instance args on correctly. #4039 (@hjoliver, @minrk, @timeu)
Fix Dockerfile yarn JSX build #4034 (@manics, @consideRatio, @utkarshgupta137)
Maintenance and upkeep improvements#
ci: update database image versions #4233 (@minrk, @consideRatio)
selenium: update next_url after waiting for it to change #4225 (@minrk, @consideRatio)
maint: add test to extras_require, remove greenlet workaround, test final py311, misc cleanup #4223 (@consideRatio, @minrk, @manics)
pre-commit: add autoflake and make flake8 checks stricter #4219 (@consideRatio, @minrk, @manics)
flake8: cleanup unused/redundant config #4216 (@consideRatio, @minrk)
ci: use non-deprecated codecov uploader #4187 (@consideRatio, @minrk)
jsx: remove unused useState #4173 (@liliyao2022, @minrk)
Deleted unused failRegexEvent #4172 (@liliyao2022, @minrk)
set stacklevel for oauth_scopes deprecation warning #4064 (@minrk, @consideRatio)
setup.py: require npm, check that NPM CSS JSX commands succeed #4035 (@manics, @minrk)
Add browser-based tests with Selenium #4026 (@mouse1203, @minrk, @consideRatio)
Documentation improvements#
clarify docstrings for default_server_name #4240 (@minrk, @manics, @behrmann)
changelog for 1.5.1 #4234 (@minrk, @consideRatio, @mriedem)
docs: refresh conf.py, add opengraph and rediraffe extensions #4227 (@consideRatio, @choldgraf, @minrk)
docs: sphinx config cleanup, removing epub build, fix build warnings #4222 (@consideRatio, @minrk, @choldgraf)
grammar in security-basics.rst #4210 (@ArafatAbdussalam, @minrk)
avoid contraction in setup.rst #4209 (@ArafatAbdussalam, @minrk)
improved the grammatical structure #4208 (@ArafatAbdussalam, @yuvipanda)
highlight “what is our actual goal” in faq #4186 (@emmanuella194, @minrk)
Edits to institutional FAQ #4185 (@lumenCodes, @minrk)
typos in example readme #4171 (@liliyao2022, @minrk)
Updated deployment gallery links #4170 (@KaluBuikem, @minrk)
typo in contributing doc #4169 (@emmanuella194, @minrk)
clarify CHP downsides in proxy doc #4168 (@lumenCodes, @minrk)
Proofread services.md #4167 (@lumenCodes, @minrk)
Welcome first-time contributors to the forum #4166 (@lumenCodes, @minrk)
Typo in institutional-faq #4162 (@liliyao2022, @minrk)
Improve text in proxy docs #4161 (@Christiandike, @minrk)
revise sudo config documentation #4160 (@Christiandike, @minrk)
Typo in roadmap.md #4159 (@liliyao2022, @minrk)
fixes in jsx/README.md #4158 (@liliyao2022, @minrk)
Update dockerfile README #4156 (@liliyao2022, @minrk)
Add link to OAuth 2 #4155 (@Christiandike, @minrk)
Consistent capitalization of Authenticator #4153 (@Teniola-theDev, @minrk)
Link back to rbac from use-cases #4152 (@emmanuella194, @minrk)
grammar improvements in quickstart.md #4150 (@liliyao2022, @minrk)
update/spawners.md #4148 (@Christiandike, @sgibson91)
oauth doc: added punctuations and capitalized words where necessary #4147 (@Teebarh, @minrk)
Update to the spawner basic file #4146 (@lumenCodes, @minrk)
Added text to documentation for more readability #4145 (@softkeldozy, @minrk)
add link to rbac index from implementation #4140 (@Joel-Ando, @minrk)
highlight note about the docker image scope #4139 (@Joel-Ando, @minrk)
Update wording in web security docs #4136 (@yamakat, @minrk)
update websecurity.md #4135 (@Christiandike, @sgibson91, @minrk)
I capitalized cli and added y to jupterhub #4133 (@Teniola-theDev, @sgibson91)
update spawners-basics.md #4132 (@ToobaJamal, @minrk)
refine text in template docs. #4130 (@lumenCodes, @minrk)
reorder REST API doc #4129 (@lumenCodes, @minrk)
update troubleshooting.md docs #4127 (@ArafatAbdussalam, @minrk, @consideRatio)
Improve clarity in troubleshooting doc #4126 (@PoorvajaRayas, @minrk)
Add concrete steps to services-basics #4124 (@AdrianaHelga, @minrk)
Formatting changes to tests.rst #4119 (@PoorvajaRayas, @minrk)
Capitalization typo in troubleshooting.md #4118 (@falyne, @minrk)
Fix duplicate statement in upgrading doc #4116 (@Eshy10, @minrk)
Issue #41 - Documentation reviewed, made concise and beginners friendly, added a JupyterHub image #4114 (@alexanderchosen, @sgibson91)
proofread upgrading docs #4113 (@EstherChristopher, @minrk)
Reviewed the documentation #4109 (@EstherChristopher, @minrk)
Edited and restructured server-api file #4106 (@alwasega, @minrk)
Proofread and Improve security-basics.rst #4098 (@NPDebs, @minrk)
Modifications to URLs docs #4097 (@Mackenzie-OO7, @minrk)
Link reference to github oauth config with jupyter #4096 (@Christiandike, @GeorgianaElena)
Update config-proxy.md #4095 (@Goodiec, @minrk, @ryanlovett)
Fixed typos and added punctuations #4094 (@Busayo-ojo, @minrk, @GeorgianaElena)
Fix typo #4093 (@Achele, @GeorgianaElena)
modified announcement README and config.py #4092 (@Temidayo32, @sgibson91)
Update config-user-env.md #4091 (@Goodiec, @minrk, @GeorgianaElena)
Restructured Community communication channels file #4090 (@alwasega, @minrk)
Update tech-implementation.md #4089 (@Christiandike, @GeorgianaElena, @minrk, @KaluBuikem, @Ginohmk)
Grammatical/link fixes in upgrading doc #4088 (@ToobaJamal, @minrk, @Mackenzie-OO7)
Updated log message docs #4086 (@ArafatAbdussalam, @minrk)
Update troubleshooting.md #4085 (@Goodiec, @minrk, @GeorgianaElena)
Refine text in documentation index #4084 (@melissakirabo, @minrk)
updated websecurity.md #4083 (@ArafatAbdussalam, @minrk)
Migrate community channels to markdown, update text #4081 (@Christiandike, @GeorgianaElena, @minrk)
Improve the documentation about log messages #4079 (@mahamtariq58, @GeorgianaElena)
Update index.rst #4074 (@ToobaJamal, @minrk, @GeorgianaElena)
improved the quickstart docker guide #4073 (@ikeadeoyin, @minrk)
[doc] templates: updated obsolete links and made wordings clearer #4072 (@ruqayaahh, @minrk)
Modifications to Technical Overview Docs #4070 (@Mackenzie-OO7, @GeorgianaElena, @minrk)
Modifications is testing docs #4069 (@chicken-biryani, @GeorgianaElena, @manics, @minrk)
Update setup.rst #4068 (@Goodiec, @GeorgianaElena, @minrk)
fixed some typos and also added links #41 #4067 (@Uzor13, @GeorgianaElena, @manics, @chicken-biryani)
Modification in community channels docs #4065 (@chicken-biryani, @minrk, @manics)
Add note about building the docs from Windows #4061 (@Temidayo32, @consideRatio)
mention c.Spawner.auth_state_hook in Authenticator auth state docs #4046 (@Neeraj-Natu, @minrk)
Add draft capacity planning doc #4008 (@minrk, @choldgraf)
Some suggestions from reading through the docs #2641 (@ericdill, @minrk, @manics, @willingc)
API and Breaking Changes#
deprecate JupyterHub.extra_handlers #4236 (@minrk, @GeorgianaElena)
Contributors to this release#
(GitHub contributors page for this release)
@Achele | @AdrianaHelga | @alexanderchosen | @alwasega | @ArafatAbdussalam | @behrmann | @bl-aire | @Busayo-ojo | @chicken-biryani | @choldgraf | @Christiandike | @consideRatio | @danilopeixoto | @dependabot | @dietmarw | @Emenyi95 | @emmanuella194 | @ericdill | @Eshy10 | @EstherChristopher | @falyne | @GeorgianaElena | @Ginohmk | @github-actions | @Goodiec | @hjoliver | @hsadia538 | @ikeadeoyin | @iLynette | @Joel-Ando | @KaluBuikem | @kamzzy | @liliyao2022 | @lumenCodes | @Mackenzie-OO7 | @mahamtariq58 | @manics | @melissakirabo | @minrk | @miwig | @mouse1203 | @mriedem | @Neeraj-Natu | @NPDebs | @PoorvajaRayas | @pre-commit-ci | @ruqayaahh | @ryanlovett | @sgibson91 | @softkeldozy | @Teebarh | @Temidayo32 | @Teniola-theDev | @timeu | @ToobaJamal | @utkarshgupta137 | @Uzor13 | @willingc | @yamakat | @yuvipanda | @zeelyha
3.0#
3.0.0 - 2022-09-08#
3.0 is a major upgrade, but a small one.
It qualifies as a major upgrade because of two changes:
It includes a database schema change (jupyterhub --upgrade-db). The schema change should not be disruptive, but we’ve decided that any schema change qualifies as a major version upgrade.
We’ve dropped support for Python 3.6, which reached End-of-Life in 2021. If you are using at least Python 3.7, this change should have no effect.
The schema change itself is small and should not be disruptive, but downgrading after a db migration is always harder than upgrading, which makes rolling back the update more likely to be problematic.
Changes in RBAC#
The biggest changes in 3.0 relate to JupyterHub RBAC, which also means they shouldn’t affect most users. The users most affected will be JupyterHub admins using JupyterHub roles extensively to define user permissions.
After testing 2.0 in the wild, we learned that we had used roles in a few places that should have been scopes. Specifically, OAuth tokens now have scopes instead of roles (and token-issuing oauth clients now have allowed_scopes instead of allowed_roles).
The consequences should be fairly transparent to users, but anyone who ran into the restrictions of roles in the oauth process should find scopes easier to work with. We tried not to break anything here, so any prior use of roles will still work with a deprecation warning, but the role will be resolved immediately at token-issue time, rather than every time the token is used.
This especially came up testing the new Custom scopes feature. Authors of JupyterHub-authenticated services can now extend JupyterHub’s RBAC functionality to define their own scopes, and assign them to users and groups via roles. This can be used to e.g. limit student/grader/instructor permissions in a grading service, or grant instructors read-only access to their students’ single-user servers starting with upcoming Jupyter Server 2.0.
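As a sketch of what custom scopes enable (the scope names, role name, and group here are illustrative, not taken from any shipped service), a grading service might declare and assign its own scopes like this:

```python
# jupyterhub_config.py -- illustrative custom scopes for a grading service.
# Custom scope names must be prefixed with "custom:".
c.JupyterHub.custom_scopes = {
    "custom:grades:write": {"description": "Enter grades"},
    "custom:grades:read": {"description": "View grades"},
}

# Assign the custom scopes to users via a role (group name is hypothetical)
c.JupyterHub.load_roles = [
    {
        "name": "grader",
        "scopes": ["custom:grades:read", "custom:grades:write"],
        "groups": ["graders"],
    },
]
```

The grading service can then check for these scopes on requests it authenticates through JupyterHub, without JupyterHub itself needing to know what “grades” are.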
Further extending granular control of permissions, we have added !service and !server filters for scopes (Self-referencing filters), like we had for !user.
Access to the admin UI is now governed by a dedicated admin-ui scope, rather than combined admin:servers and admin:users in 2.0. More info in Available scopes.
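For example, a role combining the new admin-ui scope with a service-filtered access scope might look like the following sketch (the role name, service name, and group are made up for illustration):

```python
# jupyterhub_config.py -- illustrative role using the 3.0 admin-ui scope
# and a !service= filter on an access scope
c.JupyterHub.load_roles = [
    {
        "name": "support-staff",
        "scopes": [
            "admin-ui",  # may view the admin page
            "access:services!service=grading",  # access only the 'grading' service
        ],
        "groups": ["support"],
    },
]
```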
More highlights#
The admin UI can now show more detailed info about users and their servers in a drop-down details table:
Several bugfixes and improvements in the new admin UI.
Direct access to the Hub’s database is deprecated. We intend to change the database connection lifecycle in the future to enable scalability and high-availability (HA), and limiting where connections and transactions can occur is an important part of making that possible.
Lots more bugfixes and error-handling improvements.
New features added#
include stopped servers in user model #3909 (@minrk, @consideRatio)
allow HubAuth to be async #3883 (@minrk, @consideRatio, @sgibson91)
add ‘admin-ui’ scope for access to the admin ui #3878 (@minrk, @GeorgianaElena, @manics)
store scopes on oauth clients, too #3877 (@minrk, @consideRatio, @manics)
!service and !server filters #3851 (@minrk, @consideRatio)
allow user-defined custom scopes #3713 (@minrk, @consideRatio, @manics)
Enhancements made#
Integrate Pagination API into Admin JSX #4002 (@naatebarber, @minrk)
add correct autocomplete fields for login form #3958 (@minrk, @consideRatio)
Tokens have scopes instead of roles #3833 (@minrk, @consideRatio)
Bugs fixed#
Use correct expiration labels in drop-down menu on token page. #4022 (@possiblyMikeB, @consideRatio)
avoid database error on repeated group name in sync_groups #4019 (@minrk, @manics)
reset offset to 0 on name filter change #4018 (@minrk, @consideRatio)
admin: avoid redundant client-side username validation in edit-user #4016 (@minrk, @consideRatio)
restore trimming of username input #4011 (@minrk, @consideRatio)
nbclassic extension name has been renamed #3971 (@minrk, @consideRatio)
Fix disabling of individual page template announcements #3969 (@consideRatio, @manics, @minrk)
validate proxy.extra_routes #3967 (@minrk, @consideRatio)
FreeBSD, missing -n for pw useradd #3953 (@silenius, @minrk, @manics)
admin: Hub is responsible for username validation #3936 (@minrk, @consideRatio, @NarekA, @yuvipanda)
admin: Fix spawn page link for default server #3935 (@minrk, @consideRatio, @benz0li)
let errors raised in an auth_state_hook halt spawn #3908 (@minrk, @consideRatio)
Maintenance and upkeep improvements#
Test 3.11 #4013 (@minrk, @consideRatio)
Avoid IOLoop.current in singleuser mixins #3992 (@minrk, @consideRatio)
Increase stacklevel for decorated warnings #3978 (@minrk, @consideRatio)
Bump Dockerfile base image to 22.04 #3975 (@minrk, @consideRatio, @manics)
Avoid deprecated ‘IOLoop.current’ method #3974 (@minrk, @consideRatio, @manics)
switch to importlib_metadata for entrypoints #3937 (@minrk, @consideRatio)
pages.py: Remove unreachable code #3921 (@manics, @minrk, @consideRatio)
Use isort for import formatting #3852 (@minrk, @consideRatio, @choldgraf, @yuvipanda)
Documentation improvements#
document oauth_no_confirm in services #4012 (@minrk, @consideRatio)
Remove outdated cookie-secret note in security docs #3997 (@minrk, @consideRatio)
jupyter troubleshooting ➡️ jupyter troubleshoot #3903 (@manics, @minrk, @consideRatio)
admin_access no longer works as it is overridden by RBAC scopes #3899 (@manics, @minrk)
Document the ‘display’ attribute of services #3895 (@yuvipanda, @minrk, @sgibson91)
remove apache NE flag as it prevents opening folders and renaming fil… #3891 (@bbrauns, @minrk)
API and Breaking Changes#
Contributors to this release#
(GitHub contributors page for this release)
@ajcollett | @bbrauns | @benz0li | @betatim | @blink1073 | @brospars | @Carreau | @choldgraf | @cmd-ntrf | @code-review-doctor | @consideRatio | @cqzlxl | @dependabot | @fabianbaier | @GeorgianaElena | @github-actions | @hansen-m | @huage1994 | @jbaksta | @jgwerner | @jhermann | @johnkpark | @jwclark | @maluhoss | @manics | @mathematicalmichael | @meeseeksdev | @minrk | @mriedem | @naatebarber | @NarekA | @naveensrinivasan | @nicorikken | @nsshah1288 | @panruipr | @paulkerry1 | @possiblyMikeB | @pre-commit-ci | @rcthomas | @robnagler | @rpwagner | @ryogesh | @sgibson91 | @silenius | @SonakshiGrover | @superfive666 | @tharwan | @vpavlin | @willingc | @ykazakov | @yuvipanda | @zoltan-fedor
2.3#
2.3.1 - 2022-06-06#
This release includes a selection of bugfixes.
Bugs fixed#
use equality to filter token prefixes #3910 (@minrk, @yuvipanda)
ensure custom template is loaded with jupyter-server notebook extension #3919 (@minrk, @yuvipanda)
set default_url via config #3918 (@minrk, @yuvipanda)
Force add existing certificates #3906 (@fabianbaier, @minrk)
admin: make user-info table selectable #3889 (@johnkpark, @minrk, @naatebarber, @NarekA)
ensure _import_error is set when JUPYTERHUB_SINGLEUSER_APP is unavailable #3837 (@minrk, @consideRatio)
Contributors to this release#
(GitHub contributors page for this release)
@bbrauns | @betatim | @blink1073 | @brospars | @Carreau | @choldgraf | @consideRatio | @fabianbaier | @GeorgianaElena | @github-actions | @hansen-m | @jbaksta | @jgwerner | @jhermann | @johnkpark | @maluhoss | @manics | @mathematicalmichael | @meeseeksdev | @minrk | @mriedem | @naatebarber | @NarekA | @nicorikken | @nsshah1288 | @panruipr | @paulkerry1 | @rcthomas | @robnagler | @ryogesh | @sgibson91 | @SonakshiGrover | @tharwan | @vpavlin | @welcome | @willingc | @yuvipanda | @zoltan-fedor
2.3.0 - 2022-05-06#
Enhancements made#
Bugs fixed#
don’t confuse :// in next_url query params for a redirect hostname #3876 (@minrk, @GeorgianaElena)
Search bar disabled on admin dashboard #3863 (@NarekA, @minrk)
Do not store Spawner.ip/port on spawner.server during get_env #3859 (@minrk, @manics, @consideRatio)
ensure _import_error is set when JUPYTERHUB_SINGLEUSER_APP is unavailable #3837 (@minrk, @consideRatio)
Maintenance and upkeep improvements#
Use log.exception when logging exceptions #3882 (@yuvipanda, @minrk, @sgibson91)
Missing f prefix on f-strings fix #3874 (@code-review-doctor, @minrk, @consideRatio)
adopt pytest-asyncio asyncio_mode=’auto’ #3841 (@minrk, @consideRatio, @manics)
remove lingering reference to distutils #3835 (@minrk, @consideRatio)
Documentation improvements#
Fix typo in REST API link in README.md #3862 (@cmd-ntrf, @consideRatio)
The word used is duplicated in upgrade.md #3849 (@huage1994, @consideRatio)
Some typos in docs #3843 (@minrk, @consideRatio)
Document version mismatch log message #3839 (@yuvipanda, @consideRatio, @minrk)
Contributors to this release#
(GitHub contributors page for this release)
@choldgraf | @cmd-ntrf | @code-review-doctor | @consideRatio | @dependabot | @GeorgianaElena | @github-actions | @huage1994 | @johnkpark | @jwclark | @manics | @minrk | @NarekA | @pre-commit-ci | @sgibson91 | @ykazakov | @yuvipanda
2.2#
2.2.2 2022-03-14#
2.2.2 fixes a small regression in 2.2.1.
Bugs fixed#
Fix failure to update admin-react.js by re-compiling from our source #3825 (@NarekA, @consideRatio, @minrk, @manics)
Continuous integration improvements#
ci: standalone jsx workflow and verify compiled asset matches source code #3826 (@consideRatio, @NarekA)
Contributors to this release#
(GitHub contributors page for this release)
@consideRatio | @manics | @minrk | @NarekA
2.2.1 2022-03-11#
2.2.1 fixes a few small regressions in 2.2.0.
Bugs fixed#
Fix clearing cookie with custom xsrf cookie options #3823 (@minrk, @consideRatio)
Fix admin dashboard table sorting #3822 (@NarekA, @minrk, @consideRatio)
Maintenance and upkeep improvements#
allow Spawner.server to be mocked without underlying orm_spawner #3819 (@minrk, @yuvipanda, @consideRatio)
Documentation#
Add some docs on common log messages #3820 (@yuvipanda, @choldgraf, @consideRatio)
Contributors to this release#
(GitHub contributors page for this release)
@choldgraf | @consideRatio | @minrk | @NarekA | @yuvipanda
2.2.0 2022-03-07#
JupyterHub 2.2.0 is a small release. The main new feature is the ability of Authenticators to manage group membership, e.g. when the identity provider has its own concept of groups that should be preserved in JupyterHub.
The links to access user servers from the admin page have been restored.
New features added#
Enhancements made#
Add user token to JupyterLab PageConfig #3809 (@minrk, @manics, @consideRatio)
show insecure-login-warning for all authenticators #3793 (@satra, @minrk)
short-circuit token permission check if token and owner share role #3792 (@minrk, @consideRatio)
Named server support, access links in admin page #3790 (@NarekA, @minrk, @ykazakov, @manics)
Bugs fixed#
Keep Spawner.server in sync with underlying orm_spawner.server #3810 (@minrk, @manics, @GeorgianaElena, @consideRatio)
Replace failed spawners when starting new launch #3802 (@minrk, @consideRatio)
Log proxy’s public_url only when started by JupyterHub #3781 (@cqzlxl, @consideRatio, @minrk)
Documentation improvements#
Apache2 Documentation: Updates Reverse Proxy Configuration (TLS/SSL, Protocols, Headers) #3813 (@rzo1, @minrk)
Update example to not reference an undefined scope #3812 (@ktaletsk, @minrk)
Apache: set X-Forwarded-Proto header #3808 (@manics, @consideRatio, @rzo1, @tobi45)
idle-culler example config missing closing bracket #3803 (@tmtabor, @consideRatio)
Behavior Changes#
Stop opening PAM sessions by default #3787 (@minrk, @consideRatio)
Contributors to this release#
(GitHub contributors page for this release)
@blink1073 | @clkao | @consideRatio | @cqzlxl | @dependabot | @dtaniwaki | @fcollonval | @GeorgianaElena | @github-actions | @kshitija08 | @ktaletsk | @manics | @minrk | @NarekA | @pre-commit-ci | @rajat404 | @rcthomas | @ryogesh | @rzo1 | @satra | @thomafred | @tmtabor | @tobi45 | @ykazakov
2.1#
2.1.1 2022-01-25#
2.1.1 is a tiny bugfix release, fixing an issue where admins did not receive the new read:metrics permission.
Bugs fixed#
add missing read:metrics scope to admin role #3778 (@minrk, @consideRatio)
Contributors to this release#
2.1.0 2022-01-21#
2.1.0 is a small bugfix release, resolving regressions in 2.0 and further refinements.
In particular, the authenticated prometheus metrics endpoint did not work in 2.0 because it lacked a scope.
To access the authenticated metrics endpoint with a token,
upgrade to 2.1 and make sure the token/owner has the read:metrics
scope.
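For example, a jupyterhub_config.py fragment along these lines grants the new scope to a metrics-scraping service (a sketch: the service name, role name, and token are placeholders):

```python
# jupyterhub_config.py -- sketch; service name and token are placeholders
c.JupyterHub.services = [
    {"name": "metrics-scraper", "api_token": "REPLACE-WITH-GENERATED-TOKEN"},
]
c.JupyterHub.load_roles = [
    {
        "name": "metrics-reader",
        "scopes": ["read:metrics"],
        "services": ["metrics-scraper"],
    },
]
```

The service can then fetch the metrics endpoint with an Authorization: token header carrying its api_token.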
Custom error messages for failed spawns are now handled more consistently on the spawn-progress API and the spawn-failed HTML page.
Previously, spawn-progress did not relay the custom message provided by exception.jupyterhub_message; now it does, and full HTML messages in exception.jupyterhub_html_message can be displayed in both contexts.
The long-deprecated, inconsistent behavior when users visited a URL for another user’s server,
where they could sometimes be redirected back to their own server,
has been removed in favor of consistent behavior based on the user’s permissions.
To share a URL that will take any user to their own server, use https://my.hub/hub/user-redirect/path/...
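As a sketch of how such a share link is composed (the helper name is made up; the pattern follows the URL above):

```python
# Hypothetical helper composing a /hub/user-redirect/ share link.
# Any logged-in user who follows it lands on the same path on their own server.
def user_redirect_url(hub_base: str, path: str) -> str:
    return f"{hub_base.rstrip('/')}/hub/user-redirect/{path.lstrip('/')}"
```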
Enhancements made#
relay custom messages in exception.jupyterhub_message in progress API #3764 (@minrk)
Add the capability to inform a connection to Alembic Migration Script #3762 (@DougTrajano)
Bugs fixed#
Maintenance and upkeep improvements#
Documentation improvements#
Contributors to this release#
(GitHub contributors page for this release)
@consideRatio | @dependabot | @DougTrajano | @IgorBerman | @minrk | @twalcari | @welcome
2.0#
2.0.2 2022-01-10#
2.0.2 fixes a regression in 2.0.1 causing false positives rejecting valid requests as cross-origin, mostly when JupyterHub is behind additional proxies.
Bugs fixed#
Maintenance and upkeep improvements#
Documentation improvements#
DOCS: Update theme configuration #3754 (@choldgraf)
DOC: Add note about allowed_users not being set #3748 (@choldgraf)
Contributors to this release#
(GitHub contributors page for this release)
@choldgraf | @consideRatio | @github-actions | @jakob-keller | @manics | @meeseeksmachine | @minrk | @pre-commit-ci | @welcome
2.0.1#
2.0.1 is a bugfix release, with some additional small improvements, especially in the new RBAC handling and admin page.
Several issues are fixed where users might not have the default ‘user’ role as expected.
Enhancements made#
Bugs fixed#
initialize new admin users with default roles #3735 (@minrk)
Fix error message about Authenticator.pre_spawn_start #3716 (@minrk)
admin: Pass Base Url #3715 (@naatebarber)
Grant role after user creation during config load #3714 (@a3626a)
Avoid clearing user role membership when defining custom user scopes #3708 (@minrk)
cors: handle mismatched implicit/explicit ports in host header #3701 (@minrk)
Maintenance and upkeep improvements#
Documentation improvements#
Contributors to this release#
(GitHub contributors page for this release)
@a3626a | @betatim | @consideRatio | @github-actions | @kylewm | @manics | @minrk | @naatebarber | @pre-commit-ci | @sgaist | @welcome
2.0.0#
JupyterHub 2.0 is a big release!
The most significant change is the addition of roles and scopes to the JupyterHub permissions model, allowing more fine-grained access control. Read more about it in the docs.
In particular, the ‘admin’ level of permissions should not be needed anymore,
and you can now grant users and services only the permissions they need, not more.
We encourage you to review permissions, especially for any service or user with admin: true, and consider assigning only the necessary roles and scopes.
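For instance, instead of admin: true, a role granting only user-management permissions might look like this (a sketch; the role name and username are hypothetical):

```python
# jupyterhub_config.py -- sketch; role name and username are hypothetical
c.JupyterHub.load_roles = [
    {
        "name": "user-admin",
        "scopes": ["admin:users", "list:users"],  # manage and list users, nothing more
        "users": ["ops-user"],
    },
]
```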
JupyterHub 2.0 requires an update to the database schema, so make sure to read the upgrade documentation and back up your database before upgrading.
stop all servers before upgrading
Upgrading JupyterHub to 2.0 revokes all tokens issued before the upgrade, which means that single-user servers started before the upgrade will become inaccessible after the upgrade until they have been stopped and started again. To avoid this, it is best to shut down all servers prior to the upgrade.
Other major changes that may require updates to your deployment, depending on what features you use:
List endpoints now support pagination, and have a max page size, which means API consumers must be updated to make paginated requests if you have a lot of users and/or groups.
JupyterHub has stopped specifying any command-line options to spawners by default. Previously, --ip and --port could be specified on the command line. From 2.0 forward, JupyterHub will only communicate options to Spawners via environment variables, and the command to be launched is configured exclusively via Spawner.cmd and Spawner.args.
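A sketch of the corresponding configuration (the values are illustrative; the default cmd already launches jupyterhub-singleuser):

```python
# jupyterhub_config.py -- illustrative values only
c.Spawner.cmd = ["jupyterhub-singleuser"]  # command to launch
c.Spawner.args = ["--debug"]               # extra CLI args; no longer --ip/--port
```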
Other new features:
new Admin page, written in React. With RBAC, it should now be fully possible to implement a custom admin panel as a service via the REST API.
JupyterLab is the default UI for single-user servers, if available in the user environment. See more info in the docs about switching back to the classic notebook, if you are not ready to switch to JupyterLab.
NullAuthenticator is now bundled with JupyterHub, so you no longer need to install the nullauthenticator package to disable login; you can set c.JupyterHub.authenticator_class = 'null'.
Support for the jupyterhub --show-config option to see your current JupyterHub configuration.
Add expiration date dropdown to Token page
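Disabling login with the bundled authenticator is then a one-line config (sketch):

```python
# jupyterhub_config.py -- disable login entirely (NullAuthenticator, bundled in 2.0)
c.JupyterHub.authenticator_class = "null"
```

Running jupyterhub --show-config afterwards prints the configuration actually in effect.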
and major bug fixes:
Improve database rollback recovery on broken connections
and other changes:
Requests to a not-running server (e.g. visiting /user/someuser/) will return an HTTP 424 error instead of 503, making it easier to monitor for real deployment problems. JupyterLab in the user environment should be at least version 3.1.16 to recognize this error code as a stopped server. You can temporarily opt in to the older behavior (e.g. if older JupyterLab is required) by setting c.JupyterHub.use_legacy_stopped_server_status_code = True.
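Opting back in to the old status code is a single setting (as named above):

```python
# jupyterhub_config.py -- temporarily restore 503 for stopped servers
c.JupyterHub.use_legacy_stopped_server_status_code = True
```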
Plus lots of little fixes along the way.
2.0.0 - 2021-12-01#
New features added#
support inherited --show-config flags from base Application #3559 (@minrk)
Add expiration date dropdown to Token page #3552 (@dolfinus)
Support auto login when used as a OAuth2 provider #3488 (@yuvipanda)
Make JupyterHub Admin page into a React app #3398 (@naatebarber)
Stop specifying --ip and --port on the command-line #3381 (@minrk)
Enhancements made#
Fail suspected API requests with 424, not 503 #3636 (@yuvipanda)
Reduce logging verbosity of ‘checking routes’ #3604 (@yuvipanda)
Remove a couple every-request debug statements #3582 (@minrk)
Validate Content-Type Header for api POST requests #3575 (@VaishnaviHire)
Improved Grammar for the Documentation #3572 (@eruditehassan)
Bugs fixed#
Forward-port fixes from 1.5.0 security release #3679 (@minrk)
raise 404 on admin attempt to spawn nonexistent user #3653 (@minrk)
new user token returns 200 instead of 201 #3646 (@joegasewicz)
Added base_url to path for jupyterhub-session-id cookie #3625 (@albertmichaelj)
Fix wrong name of auth_state_hook in the exception log #3569 (@dolfinus)
Stop injecting statsd parameters into the configurable HTTP proxy #3568 (@paccorsi)
explicit DB rollback for 500 errors #3566 (@nsshah1288)
Avoid zombie processes in case of using LocalProcessSpawner #3543 (@dolfinus)
Fix regression where external services api_token became required #3531 (@consideRatio)
Fix allow_all check when only allow_admin is set #3526 (@dolfinus)
Bug: save_bearer_token (provider.py) passes a float value to the expires_at field (int) #3484 (@weisdd)
Maintenance and upkeep improvements#
build jupyterhub/singleuser along with other images #3690 (@minrk)
Forward-port fixes from 1.5.0 security release #3679 (@minrk)
verify that successful login assigns default role #3674 (@minrk)
use v2 of jupyterhub/action-major-minor-tag-calculator #3672 (@minrk)
clarify some log messages during role assignment #3663 (@minrk)
Rename ‘all’ metascope to more descriptive ‘inherit’ #3661 (@minrk)
minor refinement of excessive scopes error message #3660 (@minrk)
deprecate instead of remove @admin_only auth decorator #3659 (@minrk)
Add pyupgrade --py36-plus to pre-commit config #3586 (@consideRatio)
pyupgrade: run pyupgrade --py36-plus and black on all but tests #3585 (@consideRatio)
pyupgrade: run pyupgrade --py36-plus and black on jupyterhub/tests #3584 (@consideRatio)
remove very old backward-compat for LocalProcess subclasses #3558 (@minrk)
release docker workflow: ‘branchRegex: ^\w[\w-.]*$’ #3509 (@manics)
exclude dependabot push events from release workflow #3505 (@minrk)
Documentation improvements#
docs: fix typo in proxy config example #3657 (@edgarcosta)
server-api example typo: trim space in token file #3626 (@minrk)
[doc] add example specifying scopes for a default role #3581 (@minrk)
Add detailed doc for starting/waiting for servers via api #3565 (@minrk)
doc: Mention a list of known proxies available #3546 (@AbdealiJK)
Update changelog for 1.4.2 in main branch #3539 (@consideRatio)
Retrospectively update changelog for 1.4.1 in main branch #3537 (@consideRatio)
Add research study participation notice to readme #3506 (@sgibson91)
Fix typo #3494 (@davidbrochart)
Add Chameleon to JupyterHub deployment gallery #3482 (@diurnalist)
Contributors to this release#
(GitHub contributors page for this release)
@0mar | @AbdealiJK | @albertmichaelj | @betatim | @bollwyvl | @choldgraf | @consideRatio | @cslocum | @danlester | @davidbrochart | @dependabot | @diurnalist | @dolfinus | @echarles | @edgarcosta | @ellisonbg | @eruditehassan | @icankeep | @IvanaH8 | @joegasewicz | @manics | @meeseeksmachine | @minrk | @mriedem | @naatebarber | @nsshah1288 | @octavd | @OrnithOrtion | @paccorsi | @panruipr | @pre-commit-ci | @rpwagner | @sgibson91 | @support | @twalcari | @VaishnaviHire | @warwing | @weisdd | @welcome | @willingc | @ykazakov | @yuvipanda
1.5#
JupyterHub 1.5 is a security release, fixing a vulnerability ghsa-cw7p-q79f-m2v7 where JupyterLab users with multiple tabs open could fail to log out completely, leaving their browser with valid credentials until they log out again.
A few fully backward-compatible features have been backported from 2.0.
1.5.1 2022-12-05#
This is a patch release, improving db resiliency when certain errors occur, without requiring a jupyterhub restart.
Merged PRs#
Backport db rollback fixes to 1.x #4076 (@mriedem, @minrk, @nsshah1288)
Contributors to this release#
1.5.0 2021-11-04#
New features added#
Backport #3636 to 1.4.x (opt-in support for JupyterHub.use_legacy_stopped_server_status_code) #3639 (@yuvipanda)
Backport PR #3552 on branch 1.4.x (Add expiration date dropdown to Token page) #3580 (@meeseeksmachine)
Backport PR #3488 on branch 1.4.x (Support auto login when used as a OAuth2 provider) #3579 (@meeseeksmachine)
Maintenance and upkeep improvements#
Documentation improvements#
Contributors to this release#
(GitHub contributors page for this release)
@choldgraf | @consideRatio | @manics | @meeseeksmachine | @minrk | @support | @welcome | @yuvipanda
1.4#
JupyterHub 1.4 is a small release, with several enhancements, bug fixes, and new configuration options.
There are no database schema changes requiring migration from 1.3 to 1.4.
1.4 is also the first version to start publishing docker images for arm64.
In particular, OAuth tokens stored in user cookies,
used for accessing single-user servers and hub-authenticated services,
have changed their expiration from one hour to the expiry of the cookie
in which they are stored (default: two weeks).
This is now also configurable via JupyterHub.oauth_token_expires_in.
The result is that it should be much less likely for auth tokens stored in cookies to expire during the lifetime of a server.
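The expiry can be pinned explicitly if needed (the value below is an arbitrary example, in seconds):

```python
# jupyterhub_config.py -- arbitrary example value, in seconds
c.JupyterHub.oauth_token_expires_in = 24 * 60 * 60  # one day
```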
1.4.2 2021-06-15#
1.4.2 is a small bugfix release for 1.4.
Bugs fixed#
Fix regression where external services api_token became required #3531 (@consideRatio)
Bug: save_bearer_token (provider.py) passes a float value to the expires_at field (int) #3484 (@weisdd)
Maintenance and upkeep improvements#
Documentation improvements#
Fix typo #3494 (@davidbrochart)
Contributors to this release#
(GitHub contributors page for this release)
@consideRatio | @davidbrochart | @icankeep | @minrk | @weisdd
1.4.1 2021-05-12#
1.4.1 is a small bugfix release for 1.4.
Enhancements made#
Bugs fixed#
Maintenance and upkeep improvements#
ci: fix typo in environment variable #3457 (@consideRatio)
avoid re-using asyncio.Locks across event loops #3456 (@minrk)
ci: github workflow security, pin action to sha etc #3436 (@consideRatio)
Documentation improvements#
Fix documentation #3452 (@davidbrochart)
Contributors to this release#
(GitHub contributors page for this release)
@0mar | @betatim | @consideRatio | @danlester | @davidbrochart | @IvanaH8 | @manics | @minrk | @naatebarber | @OrnithOrtion | @support | @welcome
1.4.0 2021-04-19#
New features added#
Support Proxy.extra_routes #3430 (@yuvipanda)
login-template: Add a “login_container” block inside the div-container. #3422 (@olifre)
Allow customization of service menu via templates #3345 (@stv0g)
Add Spawner.delete_forever #3337 (@nsshah1288)
Allow to set spawner-specific hub connect URL #3326 (@dtaniwaki)
Make Authenticator Custom HTML Flexible #3315 (@dtaniwaki)
Enhancements made#
Log the exception raised in Spawner.post_stop_hook instead of raising it #3418 (@jiajunjie)
Don’t delete all oauth clients on startup #3407 (@yuvipanda)
Use ‘secrets’ module to generate secrets #3394 (@yuvipanda)
Allow cookie_secret to be set to a hexadecimal string #3343 (@consideRatio)
Clear tornado xsrf cookie on logout #3341 (@dtaniwaki)
always log slow requests at least at info-level #3338 (@minrk)
Bugs fixed#
Maintenance and upkeep improvements#
alpine dockerfile: avoid compilation by getting some deps from apk #3386 (@minrk)
Fix sqlachemy.interfaces.PoolListener deprecation for tests #3383 (@IvanaH8)
Update pre-commit hooks versions #3362 (@consideRatio)
move get_custom_html to base Authenticator class #3359 (@minrk)
[TST] Do not implicitly create users in auth_header #3344 (@minrk)
ci: github actions, allow for manual test runs and fix badge in readme #3324 (@consideRatio)
Documentation improvements#
Fix link to jupyterhub/jupyterhub-the-hard-way #3417 (@manics)
Added Azure AD as a supported authenticator. #3401 (@maxshowarth)
Fix the help related to the proxy check #3332 (@jiajunjie)
Mention Jupyter Server as optional single-user backend in documentation #3329 (@Zsailer)
Fix mixup in comment regarding the sync parameter #3325 (@andrewisplinghoff)
docs: fix simple typo, funciton -> function #3314 (@timgates42)
Contributors to this release#
(GitHub contributors page for this release)
@00Kai0 | @8rV1n | @akhilputhiry | @alexal | @analytically | @andreamazzoni | @andrewisplinghoff | @BertR | @betatim | @bitnik | @bollwyvl | @carluri | @Carreau | @consideRatio | @davidedelvento | @dhirschfeld | @dmpe | @dsblank | @dtaniwaki | @echarles | @elgalu | @eran-pinhas | @gaebor | @GeorgianaElena | @gsemet | @gweis | @hynek2001 | @ianabc | @ibre5041 | @IvanaH8 | @jhegedus42 | @jhermann | @jiajunjie | @jtlz2 | @kafonek | @katsar0v | @kinow | @krinsman | @laurensdv | @lits789 | @m-alekseev | @mabbasi90 | @manics | @manniche | @maxshowarth | @mdivk | @meeseeksmachine | @minrk | @mogthesprog | @mriedem | @nsshah1288 | @olifre | @PandaWhoCodes | @pawsaw | @phozzy | @playermanny2 | @rabsr | @randy3k | @rawrgulmuffins | @rcthomas | @rebeca-maia | @rebenkoy | @rkdarst | @robnagler | @ronaldpetty | @ryanlovett | @ryogesh | @sbailey-auro | @sigurdurb | @SivaAccionLabs | @sougou | @stv0g | @sudi007 | @support | @tathagata | @timgates42 | @trallard | @vlizanae | @welcome | @whitespaceninja | @whlteXbread | @willingc | @yuvipanda | @Zsailer
1.3#
JupyterHub 1.3 is a small feature release. Highlights include:
Require Python >=3.6 (jupyterhub 1.2 is the last release to support 3.5)
Add a ?state= filter for getting the user list, allowing much quicker responses when retrieving a small fraction of users. state can be active, inactive, or ready.
Prometheus metrics now include a jupyterhub_ prefix, so deployments may need to update their grafana charts to match.
Page templates can now be async!
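A sketch of how an API consumer might build such a filtered request URL (the helper and base URL are made up; the valid states are those listed above):

```python
from urllib.parse import urlencode

VALID_STATES = {"active", "inactive", "ready"}

def user_list_url(hub_api: str, state: str) -> str:
    """Build the REST API URL for a filtered user list (hypothetical helper)."""
    if state not in VALID_STATES:
        raise ValueError(f"state must be one of {sorted(VALID_STATES)}")
    return f"{hub_api.rstrip('/')}/users?{urlencode({'state': state})}"
```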
1.3.0#
Enhancements made#
Bugs fixed#
Maintenance and upkeep improvements#
Documentation improvements#
Fix curl in jupyter announcements #3286 (@Sangarshanan)
Update services.md #3267 (@slemonide)
[Docs] Fix https reverse proxy redirect issues #3244 (@mhwasil)
Remove the extra parenthesis in service.md #3303 (@Sangarshanan)
Contributors to this release#
(GitHub contributors page for this release)
@0mar | @agp8x | @alexweav | @belfhi | @betatim | @cbanek | @cmd-ntrf | @coffeebenzene | @consideRatio | @danlester | @fcollonval | @GeorgianaElena | @ianabc | @IvanaH8 | @manics | @meeseeksmachine | @mhwasil | @minrk | @mriedem | @mxjeff | @olifre | @rcthomas | @rgbkrk | @rkdarst | @Sangarshanan | @slemonide | @support | @tlvu | @welcome | @yuvipanda
1.2#
1.2.2 2020-11-27#
Enhancements made#
Bugs fixed#
Maintenance and upkeep improvements#
Environment marker on pamela #3255 (@fcollonval)
Migrate from travis to GitHub actions #3246 (@consideRatio)
Documentation improvements#
Contributors to this release#
(GitHub contributors page for this release)
@alexweav | @belfhi | @betatim | @cmd-ntrf | @consideRatio | @danlester | @fcollonval | @GeorgianaElena | @ianabc | @IvanaH8 | @manics | @meeseeksmachine | @minrk | @mriedem | @olifre | @rcthomas | @rgbkrk | @rkdarst | @slemonide | @support | @welcome | @yuvipanda
1.2.1 2020-10-30#
Bugs fixed#
Contributors to this release#
1.2.0 2020-10-29#
JupyterHub 1.2 is an incremental release with lots of small improvements. It is unlikely that users will have to change much to upgrade, but lots of new things are possible and/or better!
There are no database schema changes requiring migration from 1.1 to 1.2.
Highlights:
Deprecate black/whitelist configuration fields in favor of more inclusive blocked/allowed language. For example:
c.Authenticator.allowed_users = {'user', ...}
More configuration of page templates and service display
Pagination of the admin page improving performance with large numbers of users
Improved control of user redirect
Support for jupyter-server-based single-user servers, such as Voilà and latest JupyterLab.
Lots more improvements to documentation, HTML pages, and customizations
Enhancements made#
Make api_request to CHP’s REST API more reliable #3223 (@consideRatio)
Add a footer block + wrap the admin footer in this block #3136 (@pabepadu)
Allow JupyterHub.default_url to be a callable #3133 (@danlester)
Allow head requests for the health endpoint #3131 (@rkevin-arch)
Hide hamburger button menu in mobile/responsive mode and fix other minor issues #3103 (@kinow)
build jupyterhub/jupyterhub-demo image on docker hub #3083 (@minrk)
Add JupyterHub Demo docker image #3059 (@GeorgianaElena)
Warn if both bind_url and ip/port/base_url are set #3057 (@GeorgianaElena)
UI Feedback on Submit #3028 (@possiblyMikeB)
Support kubespawner running on a IPv6 only cluster #3020 (@stv0g)
Spawn with options passed in query arguments to /spawn #3013 (@twalcari)
SpawnHandler POST with user form options displays the spawn-pending page #2978 (@danlester)
Start named servers by pressing the Enter key #2960 (@jtpio)
Keep the URL fragments after spawning an application #2952 (@kinow)
make init_spawners check O(running servers) not O(total users) #2936 (@minrk)
Add favicon to the base page template #2930 (@JohnPaton)
Add support for Jupyter Server #2601 (@yuvipanda)
Bugs fixed#
Fix #2284 must be sent from authorization page #3219 (@elgalu)
avoid specifying default_value=None in Command traits #3208 (@minrk)
Prevent OverflowErrors in exponential_backoff() #3204 (@kreuzert)
update prometheus metrics for server spawn when it fails with exception #3150 (@yhal-nesi)
jupyterhub/utils: Load system default CA certificates in make_ssl_context #3140 (@chancez)
admin page sorts on spawner last_activity instead of user last_activity #3137 (@lydian)
Fix the services dropdown on the admin page #3132 (@pabepadu)
Don’t log a warning when slow_spawn_timeout is disabled #3127 (@mriedem)
app.py: Work around incompatibility between Tornado 6 and asyncio proactor event loop in python 3.8 on Windows #3123 (@alexweav)
jupyterhub/user: clear spawner state after post_stop_hook #3121 (@rkdarst)
fix for stopping named server deleting default server and tests #3109 (@kxiao-fn)
Hide hamburger button menu in mobile/responsive mode and fix other minor issues #3103 (@kinow)
Rename Authenticator.white/blacklist to allowed/blocked #3090 (@minrk)
Include the query string parameters when redirecting to a new URL #3089 (@kinow)
Make delete_invalid_users configurable #3087 (@fcollonval)
Ensure client dependencies build before wheel #3082 (@diurnalist)
make Spawner.environment config highest priority #3081 (@minrk)
Changing start my server button link to spawn url once server is stopped #3042 (@rabsr)
Fix CSS on admin page version listing #3035 (@vilhelmen)
Fix --generate-config bug when specifying a filename #2907 (@consideRatio)
Handle the protocol when ssl is enabled and log the right URL #2773 (@kinow)
Maintenance and upkeep improvements#
Update travis-ci badge in README.md #3232 (@consideRatio)
Upgraded Jquery dep #3174 (@AngelOnFira)
Don’t allow ‘python:3.8 + master dependencies’ to fail #3157 (@manics)
Update Dockerfile to ubuntu:focal (Python 3.8) #3156 (@manics)
Get error description from error key vs error_description key #3147 (@jgwerner)
Log slow_stop_timeout when hit like slow_spawn_timeout #3111 (@mriedem)
Allow python:3.8 + master dependencies to fail #3079 (@manics)
synchronize implementation of expiring values #3072 (@minrk)
More consistent behavior for UserDict.get and key in UserDict #3071 (@minrk)
Use the issue templates from the central repo #3056 (@GeorgianaElena)
Log successful /health requests as debug level #3047 (@consideRatio)
Fix broken test due to BeautifulSoup 4.9.0 behavior change #3025 (@twalcari)
Use pip instead of conda for building the docs on RTD #3010 (@GeorgianaElena)
Avoid redundant logging of jupyterhub version mismatches #2971 (@mriedem)
preserve auth type when logging obfuscated auth header #2953 (@minrk)
make spawner:server relationship explicitly one to one #2944 (@minrk)
Add what we need with some margin to Dockerfile’s build stage #2905 (@consideRatio)
Documentation improvements#
[docs] Remove duplicate line in changelog for 1.1.0 #3207 (@kinow)
changelog for 1.2.0b1 #3192 (@consideRatio)
Add SELinux configuration for nginx #3185 (@rainwoodman)
Mention the PAM pitfall on fedora. #3184 (@rainwoodman)
Added extra documentation for endpoint /users/{name}/servers/{server_name}. #3159 (@synchronizing)
docs: please docs linter (move_cert docstring) #3151 (@consideRatio)
Needed NoEsacpe (NE) option for apache #3143 (@basvandervlies)
Document external service api_tokens better #3142 (@snickell)
Remove idle culler example #3114 (@yuvipanda)
docs: unsqueeze logo, remove unused CSS and templates #3107 (@consideRatio)
Replace zonca/remotespawner with NERSC/sshspawner #3086 (@manics)
Remove already done named servers from roadmap #3084 (@elgalu)
proxy settings might cause authentication errors #3078 (@gatoniel)
document upgrading from api_tokens to services config #3055 (@minrk)
[Docs] Disable proxy_buffering when using nginx reverse proxy #3048 (@mhwasil)
Fix docs CI test failure: duplicate object description #3021 (@rkdarst)
Update issue templates #3001 (@GeorgianaElena)
updating docs theme #2995 (@choldgraf)
Docs: Fixed grammar on landing page #2950 (@alexdriedger)
docs: use metachannel for faster environment solve #2943 (@minrk)
[doc] Add more docs about Cookies used for authentication in JupyterHub #2940 (@kinow)
[doc] Use fixed commit plus line number in github link #2939 (@kinow)
[doc] Fix link to SSL encryption from troubleshooting page #2938 (@kinow)
rest api: fix schema for remove parameter in rest api #2917 (@minrk)
Several fixes to the doc #2904 (@reneluria)
Contributors to this release#
(GitHub contributors page for this release)
@0nebody | @1kastner | @ahkui | @alexdriedger | @alexweav | @AlJohri | @Analect | @analytically | @aneagoe | @AngelOnFira | @barrachri | @basvandervlies | @betatim | @bigbosst | @blink1073 | @Cadair | @Carreau | @cbjuan | @ceocoder | @chancez | @choldgraf | @Chrisjw42 | @cmd-ntrf | @consideRatio | @danlester | @diurnalist | @Dmitry1987 | @dsblank | @dylex | @echarles | @elgalu | @fcollonval | @gatoniel | @GeorgianaElena | @hnykda | @itssimon | @jgwerner | @JohnPaton | @joshmeek | @jtpio | @kinow | @kreuzert | @kxiao-fn | @lesiano | @limimiking | @lydian | @mabbasi90 | @maluhoss | @manics | @matteoipri | @mbmilligan | @meeseeksmachine | @mhwasil | @minrk | @mriedem | @nscozzaro | @pabepadu | @possiblyMikeB | @psyvision | @rabsr | @rainwoodman | @rajat404 | @rcthomas | @reneluria | @rgbkrk | @rkdarst | @rkevin-arch | @romainx | @ryanlovett | @ryogesh | @sdague | @snickell | @SonakshiGrover | @ssanderson | @stefanvangastel | @steinad | @stephen-a2z | @stevegore | @stv0g | @subgero | @sudi007 | @summerswallow | @support | @synchronizing | @thuvh | @tritemio | @twalcari | @vchandvankar | @vilhelmen | @vlizanae | @weimin | @welcome | @willingc | @xlotlu | @yhal-nesi | @ynnelson | @yuvipanda | @zonca | @Zsailer
1.1#
1.1.0 2020-01-17#
1.1 is a release with lots of accumulated fixes and improvements, especially in performance, metrics, and customization. There are no database changes in 1.1, so no database upgrade is required when upgrading from 1.0 to 1.1.
Of particular interest to deployments with automatic health checking and/or large numbers of users is that the slow startup time introduced in 1.0 by additional spawner validation can now be mitigated by JupyterHub.init_spawners_timeout, allowing the Hub to become responsive before the spawners may have finished validating.
Several new Prometheus metrics are added (and others fixed!) to measure sources of common performance issues, such as proxy interactions and startup.
1.1 also begins adoption of the Jupyter telemetry project in JupyterHub, See The Jupyter Telemetry docs for more info. The only events so far are starting and stopping servers, but more will be added in future releases.
There are many more fixes and improvements listed below. Thanks to everyone who has contributed to this release!
New#
LocalProcessSpawner should work on windows by using psutil.pid_exists #2882 (@ociule)
trigger auth_state_hook prior to options form, add auth_state to template namespace #2881 (@minrk)
Added guide ‘install jupyterlab the hard way’ #2110 #2842 (@mangecoeur)
Add prometheus metric to measure hub startup time #2799 (@rajat404)
JupyterHub.user_redirect_hook is added to allow admins to customize /user-redirect/ behavior #2790 (@yuvipanda)
Add prometheus metric to measure proxy route poll times #2798 (@rajat404)
PROXY_DELETE_DURATION_SECONDS prometheus metric is added, to measure proxy route deletion times #2788 (@rajat404)
Service.oauth_no_confirm is added; it is useful for admin-managed services that are considered part of the Hub and shouldn't need to prompt the user for access #2767 (@minrk)
JupyterHub.default_server_name is added to make the default server be a named server with the provided name #2735 (@krinsman)
JupyterHub.init_spawners_timeout is introduced to combat slow startups on large JupyterHub deployments #2721 (@minrk)
The configuration uids for local authenticators is added to consistently assign users UNIX ids between installations #2687 (@rgerkin)
JupyterHub.activity_resolution is introduced with a default value of 30s, improving performance by not updating the database with user activity too often #2605 (@minrk)
HubAuth's SSL configuration can now be set through environment variables #2588 (@cmd-ntrf)
Expose spawner.user_options in REST API. #2755 (@danielballan)
Instrument JupyterHub to record events with jupyter_telemetry [Part II] #2698 (@Zsailer)
Make announcements visible without custom HTML #2570 (@consideRatio)
Display server version on admin page #2776 (@vilhelmen)
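Several of the options above are plain config traits; a jupyterhub_config.py sketch with illustrative (not recommended) values:

```python
# jupyterhub_config.py -- illustrative values only
c.JupyterHub.init_spawners_timeout = 10      # seconds before the Hub stops waiting on spawner checks
c.JupyterHub.activity_resolution = 30        # seconds; don't write user activity to the db more often
c.JupyterHub.default_server_name = "work"    # hypothetical named-server name
c.LocalAuthenticator.uids = {"alice": 2001}  # hypothetical user-to-UNIX-id mapping
```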
Fixes#
Bugfix: pam_normalize_username didn’t return username #2876 (@rkdarst)
Fix an issue occurring with the default spawner and internal_ssl enabled #2785 (@rpwagner)
Fix named servers to not be spawnable unless activated #2772 (@bitnik)
JupyterHub now awaits proxy availability before accepting web requests #2750 (@minrk)
Fix a no longer valid assumption that MySQL and MariaDB need to have innodb_file_format and innodb_large_prefix configured #2712 (@chicocvenancio)
Login/Logout button now updates to Login on logout #2705 (@aar0nTw)
Fix handling of exceptions within pre_spawn_start hooks #2684 (@GeorgianaElena)
Fix an issue where a user could end up spawning a default server instead of a named server as intended #2682 (@rcthomas)
/hub/admin now redirects to login if unauthenticated #2670 (@GeorgianaElena)
Fix spawning of users with names containing characters that need to be escaped #2648 (@nicorikken)
Fix TOTAL_USERS prometheus metric #2637 (@GeorgianaElena)
Fix RUNNING_SERVERS prometheus metric #2629 (@GeorgianaElena)
Fix faulty redirects to 404 that could occur with the use of named servers #2594 (@vilhelmen)
JupyterHub API spec is now a valid OpenAPI spec #2590 (@sbrunk)
Use of --help or --version previously could output unrelated errors #2584 (@minrk)
Escape usernames in the frontend #2640 (@nicorikken)
Maintenance#
Optimize CI jobs and default to bionic #2897 (@consideRatio)
Fixup .travis.yml #2868 (@consideRatio)
Update README’s badges #2867 (@consideRatio)
Dockerfile: add build-essential to builder image #2866 (@rkdarst)
remove redundant pip package list in docs environment.yml #2838 (@minrk)
updating to pandas docs theme #2820 (@choldgraf)
Adding institutional faq #2800 (@choldgraf)
Add inline comment to test #2826 (@consideRatio)
Raise error on missing specified config #2824 (@consideRatio)
chore: Update python versions in travis matrix #2811 (@jgwerner)
chore: Bump package versions used in pre-commit config #2810 (@jgwerner)
adding docs preview to circleci #2803 (@choldgraf)
cull_idle_servers.py: rebind max_age and inactive_limit locally #2794 (@rkdarst)
Fix deprecation warnings #2789 (@tirkarthi)
Log proxy class #2783 (@GeorgianaElena)
Log JupyterHub version on startup #2752 (@consideRatio)
Reduce verbosity for “Failing suspected API request to not-running server” (new) #2751 (@rkdarst)
Add missing package for json schema doc build #2744 (@willingc)
Remove tornado deprecated/unnecessary AsyncIOMainLoop().install() call #2740 (@kinow)
Remove duplicate hub and authenticator traitlets from Spawner #2736 (@eslavich)
Add New Server: change redirecting to relative to home page in js #2714 (@bitnik)
Create a warning when creating a service implicitly from service_tokens #2704 (@katsar0v)
Add Jupyter community link #2696 (@mattjshannon)
Fix failing travis tests #2695 (@GeorgianaElena)
Documentation update: hint for using services instead of service tokens. #2679 (@katsar0v)
Replace header logo: jupyter -> jupyterhub #2672 (@consideRatio)
Update flask hub authentication services example in doc #2658 (@cmd-ntrf)
close <div class="container"> tag in home.html #2649 (@bitnik)
Some theme updates; no double NEXT/PREV buttons. #2647 (@Carreau)
fix typos on technical reference documentation #2646 (@ilee38)
corrected docker network create instructions in dockerfiles README #2632 (@bartolone)
Fixed docs and testing code to use refactored SimpleLocalProcessSpawner #2631 (@danlester)
Update doc: do not suggest depricated config key #2626 (@lumbric)
cull-idle: Include a hint on how to add custom culling logic #2613 (@rkdarst)
Replace existing redirect code by Tornado’s addslash decorator #2609 (@kinow)
Hide Stop My Server red button after server stopped. #2577 (@aar0nTw)
typo #2564 (@julienchastang)
Update to simplify the language related to spawner options #2558 (@NikeNano)
Adding the use case of the Elucidata: How Jupyter Notebook is used in… #2548 (@IamViditAgarwal)
Dict rewritten as literal #2546 (@remyleone)
1.0#
1.0.0 2019-05-03#
JupyterHub 1.0 is a major milestone for JupyterHub. Huge thanks to the many people who have contributed to this release, whether it was through discussion, testing, documentation, or development.
Major new features#
Support TLS encryption and authentication of all internal communication. Spawners must implement a .move_certs method to make certificates available to the notebook server if it is not local to the Hub.
There is now full UI support for managing named servers. With named servers, each JupyterHub user may have access to more than one named server. For example, a professor may access a server named research and another named teaching.
Authenticators can now expire and refresh authentication data by implementing Authenticator.refresh_user(user). This allows things like OAuth data and access tokens to be refreshed. When used together with Authenticator.refresh_pre_spawn = True, auth refresh can be forced prior to spawn, allowing the Authenticator to require that authentication data is fresh immediately before the user’s server is launched.
New features#
allow custom spawners, authenticators, and proxies to register themselves via ‘entry points’, enabling more convenient configuration such as:
c.JupyterHub.authenticator_class = 'github'
c.JupyterHub.spawner_class = 'docker'
c.JupyterHub.proxy_class = 'traefik_etcd'
Spawners are passed the tornado Handler object that requested their spawn (as self.handler), so they can do things like make decisions based on query arguments in the request.
SimpleSpawner and DummyAuthenticator, which are useful for testing, have been merged into JupyterHub itself:
# For testing purposes only. Should not be used in production.
c.JupyterHub.authenticator_class = 'dummy'
c.JupyterHub.spawner_class = 'simple'
These classes are not appropriate for production use, only testing.
Add health check endpoint at /hub/health
Several prometheus metrics have been added (thanks to Outreachy applicants!)
A new API for registering user activity. To prepare for the addition of alternate proxy implementations, responsibility for tracking activity is taken away from the proxy and moved to the notebook server (which already has activity tracking features). Activity is now tracked by pushing it to the Hub from user servers instead of polling the proxy API.
Dynamic options_form callables may now return an empty string which will result in no options form being rendered.
Spawner.user_options is persisted to the database to be re-used, so that a server spawned once via the form can be re-spawned via the API with the same options.
Added c.PAMAuthenticator.pam_normalize_username option for round-tripping usernames through PAM to retrieve the normalized form.
Added c.JupyterHub.named_server_limit_per_user configuration to limit the number of named servers each user can have. The default is 0, for no limit.
API requests to HubAuthenticated services (e.g. single-user servers) may pass a token in the Authorization header, matching authentication with the Hub API itself.
Added Authenticator.is_admin(handler, authentication) method and Authenticator.admin_groups configuration for automatically determining that a member of a group should be considered an admin.
New c.Authenticator.post_auth_hook configuration that can be any callable of the form async def hook(authenticator, handler, authentication=None). This hook may transform the return value of Authenticator.authenticate() and return a new authentication dictionary, e.g. specifying admin privileges, group membership, or custom allowed/blocked logic. This hook is called after existing normalization and allowed-username checking.
Spawner.options_from_form may now be async
Added JupyterHub.shutdown_on_logout option to trigger shutdown of a user’s servers when they log out.
When Spawner.start raises an Exception, a message can be passed on to the user if the exception has a .jupyterhub_message attribute.
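The post_auth_hook described above can be sketched as follows. This is an illustrative example, not JupyterHub's own code; the ADMIN_USERS set is a made-up name used only for the sketch.

```python
import asyncio

# Illustrative: usernames that should be granted admin after authentication.
ADMIN_USERS = {"alice"}

async def post_auth_hook(authenticator, handler, authentication=None):
    # `authentication` is the dict returned by Authenticator.authenticate();
    # the hook may modify it and must return it (or a new dict).
    if authentication and authentication.get("name") in ADMIN_USERS:
        authentication["admin"] = True
    return authentication

# In jupyterhub_config.py (sketch):
# c.Authenticator.post_auth_hook = post_auth_hook
```

Because the hook runs after normalization and allowed-username checking, it sees the final username in the authentication dict.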
Changes#
Authentication methods such as check_whitelist should now take an additional authentication argument that will be a dictionary (default: None) of authentication data, as returned by Authenticator.authenticate():
def check_whitelist(self, username, authentication=None): ...
authentication should have a default value of None for backward-compatibility with jupyterhub < 1.0.
Prometheus metrics page is now authenticated. Any authenticated user may see the prometheus metrics. To disable prometheus authentication, set JupyterHub.authenticate_prometheus = False.
Visits to /user/:name no longer trigger an implicit launch of the user’s server. Instead, a page is shown indicating that the server is not running with a link to request the spawn.
API requests to /user/:name for a not-running server will have status 503 instead of 404.
OAuth includes a confirmation page when attempting to visit another user’s server, so that users can choose to cancel authentication with the single-user server. Confirmation is still skipped when accessing your own server.
Fixed#
Various fixes to improve Windows compatibility (default Authenticator and Spawner still do not support Windows, but other Spawners may)
Fixed compatibility with Oracle db
Fewer redirects following a visit to the default / url
Error when progress is requested before progress is ready
Error when API requests are made to a not-running server without authentication
Avoid logging database password on connect if password is specified in JupyterHub.db_url.
Development changes#
There have been several changes to the development process that shouldn’t
generally affect users of JupyterHub, but may affect contributors.
In general, see CONTRIBUTING.md
for contribution info or ask if you have questions.
JupyterHub has adopted black as a code autoformatter and pre-commit as a tool for automatically running code formatting on commit. This is meant to make it easier to contribute to JupyterHub, so let us know if it’s having the opposite effect.
JupyterHub has switched its test suite to using pytest-asyncio from pytest-tornado.
OAuth is now implemented internally using oauthlib instead of python-oauth2. This should have no effect on behavior.
0.9#
0.9.6 2019-04-01#
JupyterHub 0.9.6 is a security release.
Fixes an Open Redirect vulnerability (CVE-2019-10255).
JupyterHub 0.9.5 included a partial fix for this issue.
0.9.4 2018-09-24#
JupyterHub 0.9.4 is a small bugfix release.
Fixes an issue that required all running user servers to be restarted when performing an upgrade from 0.8 to 0.9.
Fixes content-type for API endpoints back to application/json. It was text/html in 0.9.0-0.9.3.
0.9.3 2018-09-12#
JupyterHub 0.9.3 contains small bugfixes and improvements
Fix token page and model handling of expires_at. This field was missing from the REST API model for tokens and could cause the token page to not render.
Add keep-alive to progress event stream to avoid proxies dropping the connection due to inactivity
Documentation and example improvements
Disable quit button when using notebook 5.6
Prototype new feature (may change prior to 1.0): pass requesting Handler to Spawners during start, accessible as self.handler
0.9.2 2018-08-10#
JupyterHub 0.9.2 contains small bugfixes and improvements.
Documentation and example improvements
Add Spawner.consecutive_failure_limit config for aborting the Hub if too many spawns fail in a row.
Fix for handling SIGTERM when run with asyncio (tornado 5)
Windows compatibility fixes
0.9.1 2018-07-04#
JupyterHub 0.9.1 contains a number of small bugfixes on top of 0.9.
Use a PID file for the proxy to decrease the likelihood that a leftover proxy process will prevent JupyterHub from restarting
c.LocalProcessSpawner.shell_cmd is now configurable
API requests to stopped servers (requests to the hub for /user/:name/api/...) fail with 404 rather than triggering a restart of the server
Compatibility fix for notebook 5.6.0 which will introduce further security checks for local connections
Managed services always use localhost to talk to the Hub if the Hub is listening on all interfaces
When using a URL prefix, the Hub route will be JupyterHub.base_url instead of unconditionally /
additional fixes and improvements
0.9.0 2018-06-15#
JupyterHub 0.9 is a major upgrade of JupyterHub. There are several changes to the database schema, so make sure to backup your database and run:
jupyterhub upgrade-db
after upgrading jupyterhub.
The biggest change for 0.9 is the switch to asyncio coroutines everywhere instead of tornado coroutines. Custom Spawners and Authenticators are still free to use tornado coroutines for async methods, as they will continue to work. As part of this upgrade, JupyterHub 0.9 drops support for Python < 3.5 and tornado < 5.0.
Changed#
Require Python >= 3.5
Require tornado >= 5.0
Use asyncio coroutines throughout
Set status 409 for conflicting actions instead of 400, e.g. creating users or groups that already exist.
timestamps in REST API continue to be UTC, but now include ‘Z’ suffix to identify them as such.
REST API User model always includes servers dict, not just when named servers are enabled.
server info is no longer available to oauth identification endpoints, only user info and group membership.
User.last_activity may be None if a user has not been seen, rather than starting with the user creation time which is now separately stored as User.created.
static resources are now found in $PREFIX/share/jupyterhub instead of share/jupyter/hub for improved consistency.
Deprecate .extra_log_file config. Use pipe redirection instead: jupyterhub &>> /var/log/jupyterhub.log
Add JupyterHub.bind_url config for setting the full bind URL of the proxy. Sets ip, port, base_url all at once.
Add JupyterHub.hub_bind_url for setting the full host+port of the Hub. hub_bind_url supports unix domain sockets, e.g. unix+http://%2Fsrv%2Fjupyterhub.sock
Deprecate JupyterHub.hub_connect_port config in favor of JupyterHub.hub_connect_url. hub_connect_ip is not deprecated and can still be used in the common case where only the ip address of the hub differs from the bind ip.
Added#
Spawners can define a .progress method which should be an async generator. The generator should yield events of the form:
{
  "message": "some-state-message",
  "progress": 50,
}
These messages will be shown with a progress bar on the spawn-pending page. The async_generator package can be used to make async generators compatible with Python 3.5.
track activity of individual API tokens
new REST API for managing API tokens at /hub/api/user/tokens[/token-id]
allow viewing/revoking tokens via token page
User creation time is available in the REST API as User.created
Server start time is stored as Server.started
Spawner.start may return a URL for connecting to a notebook instead of (ip, port). This enables Spawners to launch servers that set up their own HTTPS.
Optimize database performance by disabling sqlalchemy expire_on_commit by default.
Add python -m jupyterhub.dbutil shell entrypoint for quickly launching an IPython session connected to your JupyterHub database.
Include User.auth_state in user model on single-user REST endpoints for admins only.
Include Server.state in server model on REST endpoints for admins only.
Add Authenticator.blacklist for blocking users instead of allowing.
Pass c.JupyterHub.tornado_settings['cookie_options'] down to Spawners so that cookie options (e.g. expires_days) can be set globally for the whole application.
SIGINFO (ctrl-t) handler showing the current status of all running threads, coroutines, and CPU/memory/FD consumption.
Add async Spawner.get_options_form alternative to .options_form, so it can be a coroutine.
Add JupyterHub.redirect_to_server config to govern whether users should be sent to their server on login or the JupyterHub home page.
html page templates can be more easily customized and extended.
Allow registering external OAuth clients for using the Hub as an OAuth provider.
Add basic prometheus metrics at /hub/metrics endpoint.
Add session-id cookie, enabling immediate revocation of login tokens.
Authenticators may specify that users are admins by specifying the admin key when returning the user model as a dict.
Added “Start All” button to admin page for launching all user servers at once.
Services have an info field which is a dictionary. This is accessible via the REST API.
JupyterHub.extra_handlers allows defining additional tornado RequestHandlers attached to the Hub.
API tokens may now expire. Expiry is available in the REST model as expires_at, and settable when creating API tokens by specifying expires_in.
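The .progress async generator introduced in this release can be sketched as follows; the messages and percentages are placeholders, and on a real Spawner this would be a method rather than a free function:

```python
import asyncio

# Sketch of a Spawner.progress async generator yielding events
# consumed by the spawn-pending page's progress bar.
async def progress():
    yield {"message": "Server requested", "progress": 10}
    yield {"message": "Spawning server...", "progress": 50}
    yield {"message": "Server ready", "progress": 100}

async def collect_events():
    # Consume the generator the way the progress endpoint would.
    return [event async for event in progress()]
```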
Fixed#
Remove green from theme to improve accessibility
Fix error when proxy deletion fails due to route already being deleted
clear ?redirects from URL on successful launch
disable send2trash by default, which is rarely desirable for jupyterhub
Put PAM calls in a thread so they don’t block the main application in cases where PAM is slow (e.g. LDAP).
Remove implicit spawn from login handler, instead relying on subsequent request for /user/:name to trigger spawn.
Fixed several inconsistencies for initial redirects, depending on whether server is running or not and whether the user is logged in or not.
Admin requests for /user/:name (when admin-access is enabled) launch the right server if it’s not running instead of redirecting to their own.
Major performance improvement starting up JupyterHub with many users, especially when most are inactive.
Various fixes in race conditions and performance improvements with the default proxy.
Fixes for CORS headers
Stop setting .form-control on spawner form inputs unconditionally.
Better recovery from database errors and database connection issues without having to restart the Hub.
Fix handling of ~ character in usernames.
Fix jupyterhub startup when getpass.getuser() would fail, e.g. due to missing entry in passwd file in containers.
0.8#
0.8.1 2017-11-07#
JupyterHub 0.8.1 is a collection of bugfixes and small improvements on 0.8.
Added#
Run tornado with AsyncIO by default
Add jupyterhub --upgrade-db flag for automatically upgrading the database as part of startup. This is useful for cases where manually running jupyterhub upgrade-db as a separate step is unwieldy.
Avoid creating backups of the database when no changes are to be made by jupyterhub upgrade-db.
Fixed#
Add some further validation to usernames - / is not allowed in usernames.
Fix empty logout page when using auto_login
Fix autofill of username field in default login form.
Fix listing of users on the admin page who have not yet started their server.
Fix ever-growing traceback when re-raising Exceptions from spawn failures.
Remove use of deprecated bower for javascript client dependencies.
0.8.0 2017-10-03#
JupyterHub 0.8 is a big release!
Perhaps the biggest change is the use of OAuth to negotiate authentication between the Hub and single-user services. Due to this change, it is important that the single-user server and Hub are both running the same version of JupyterHub. If you are using containers (e.g. via DockerSpawner or KubeSpawner), this means upgrading jupyterhub in your user images at the same time as the Hub. In most cases, a pip install jupyterhub==version in your Dockerfile is sufficient.
Added#
JupyterHub now defines a Proxy API for custom proxy implementations other than the default. The defaults are unchanged, but configuration of the proxy is now done on the ConfigurableHTTPProxy class instead of the top-level JupyterHub. TODO: docs for writing a custom proxy.
Single-user servers and services (anything that uses HubAuth) can now accept token-authenticated requests via the Authentication header.
Authenticators can now store state in the Hub’s database. To do so, the authenticate method should return a dict of the form
{ 'username': 'name', 'state': {} }
This data will be encrypted and requires the JUPYTERHUB_CRYPT_KEY environment variable to be set and the Authenticator.enable_auth_state flag to be True. If these are not set, auth_state returned by the Authenticator will not be stored.
There is preliminary support for multiple (named) servers per user in the REST API. Named servers can be created via API requests, but there is currently no UI for managing them.
Add LocalProcessSpawner.popen_kwargs and LocalProcessSpawner.shell_cmd for customizing how user server processes are launched.
Add Authenticator.auto_login flag for skipping the “Login with…” page explicitly.
Add JupyterHub.hub_connect_ip configuration for the ip that should be used when connecting to the Hub. This is promoting (and deprecating) DockerSpawner.hub_ip_connect for use by all Spawners.
Add Spawner.pre_spawn_hook(spawner) hook for customizing pre-spawn events.
Add JupyterHub.active_server_limit and JupyterHub.concurrent_spawn_limit for limiting the total number of running user servers and the number of pending spawns, respectively.
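The auth-state return shape described above can be sketched with a toy authenticate(); the credential check and the state contents are placeholders, not real logic:

```python
import asyncio

# Toy authenticate() returning auth state for the Hub to encrypt and store
# (storage requires JUPYTERHUB_CRYPT_KEY to be set and
# Authenticator.enable_auth_state = True).
async def authenticate(handler, data):
    if data.get("password") == "open-sesame":  # placeholder credential check
        return {
            "username": data["username"],
            "state": {"upstream_token": "tok"},  # placeholder auth state
        }
    return None  # authentication failed
```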
Changed#
more arguments to spawners are now passed via environment variables (.get_env()) rather than CLI arguments (.get_args())
internally generated tokens no longer get extra hash rounds, significantly speeding up authentication. The hash rounds were deemed unnecessary because the tokens were already generated with high entropy.
JUPYTERHUB_API_TOKEN env is available at all times, rather than being removed during single-user start. The token is now accessible to kernel processes, enabling user kernels to make authenticated API requests to Hub-authenticated services.
Cookie secrets should be 32B hex instead of large base64 secrets.
pycurl is used by default, if available.
Fixed#
So many things fixed!
Collisions are checked when users are renamed
Fix bug where OAuth authenticators could not logout users due to being redirected right back through the login process.
If there are errors loading your config files, JupyterHub will refuse to start with an informative error. Previously, the bad config would be ignored and JupyterHub would launch with default configuration.
Raise 403 error on unauthorized user rather than redirect to login, which could cause redirect loop.
Set httponly on cookies because it’s prudent.
Improve support for MySQL as the database backend
Many race conditions and performance problems under heavy load have been fixed.
Fix alembic tagging of database schema versions.
Removed#
End support for Python 3.3
0.7#
0.7.2 - 2017-01-09#
Added#
Support service environment variables and defaults in jupyterhub-singleuser for easier deployment of notebook servers as a Service.
Add --group parameter for deploying jupyterhub-singleuser as a Service with group authentication.
Include URL parameters when redirecting through /user-redirect/
Fixed#
Fix group authentication for HubAuthenticated services
0.7.1 - 2017-01-02#
Added#
Spawner.will_resume for signaling that a single-user server is paused instead of stopped. This is needed for cases like DockerSpawner.remove_containers = False, where the first API token is re-used for subsequent spawns.
Warning on startup about single-character usernames, caused by common set('string') typo in config.
Fixed#
Removed spurious warning about empty next_url, which is AOK.
0.7.0 - 2016-12-2#
Added#
Implement Services API #705
Add /api/ and /api/info endpoints #675
Add documentation for JupyterLab, pySpark configuration, troubleshooting, and more.
Add logging of error if adding users already in database. #689
Add HubAuth class for authenticating with JupyterHub. This class can be used by any application, even outside tornado.
Add user groups.
Add /hub/user-redirect/... URL for redirecting users to a file on their own server.
Changed#
Fixed#
Fix docker repository location #719
Fix swagger spec conformance and timestamp type in API spec
Various redirect-loop-causing bugs have been fixed.
Removed#
0.6#
0.6.1 - 2016-05-04#
Bugfixes on 0.6:
statsd is an optional dependency, only needed if in use
Notice more quickly when servers have crashed
Better error pages for proxy errors
Add Stop All button to admin panel for stopping all servers at once
0.6.0 - 2016-04-25#
JupyterHub has moved to a new jupyterhub namespace on GitHub and Docker. What was jupyter/jupyterhub is now jupyterhub/jupyterhub, etc.
The jupyterhub/jupyterhub image on DockerHub no longer loads the jupyterhub_config.py in an ONBUILD step. A new jupyterhub/jupyterhub-onbuild image does this
Add statsd support, via c.JupyterHub.statsd_{host,port,prefix}
Update to traitlets 4.1 @default, @observe APIs for traits
Allow disabling PAM sessions via c.PAMAuthenticator.open_sessions = False. This may be needed on SELinux-enabled systems, where our PAM session logic often does not work properly
Add Spawner.environment configurable, for defining extra environment variables to load for single-user servers
JupyterHub API tokens can be pregenerated and loaded via JupyterHub.api_tokens, a dict of token: username.
JupyterHub API tokens can be requested via the REST API, with a POST request to /api/authorizations/token. This can only be used if the Authenticator has a username and password.
Various fixes for user URLs and redirects
0.5 - 2016-03-07#
Single-user server must be run with Jupyter Notebook ≥ 4.0
Require --no-ssl confirmation to allow the Hub to be run without SSL (e.g. behind SSL termination in nginx)
Add lengths to text fields for MySQL support
Add Spawner.disable_user_config for preventing user-owned configuration from modifying single-user servers.
Fixes for MySQL support.
Add ability to run each user’s server on its own subdomain. Requires wildcard DNS and wildcard SSL to be feasible. Enable subdomains by setting JupyterHub.subdomain_host = 'https://jupyterhub.domain.tld[:port]'.
Use 127.0.0.1 for local communication instead of localhost, avoiding issues with DNS on some systems.
Fix race that could add users to proxy prematurely if spawning is slow.
0.4#
0.4.1 - 2016-02-03#
Fix removal of /login page in 0.4.0, breaking some OAuth providers.
0.4.0 - 2016-02-01#
Add Spawner.user_options_form for specifying an HTML form to present to users, allowing users to influence the spawning of their own servers.
Add Authenticator.pre_spawn_start and Authenticator.post_spawn_stop hooks, so that Authenticators can do setup or teardown (e.g. passing credentials to Spawner, mounting data sources, etc.). These methods are typically used with custom Authenticator+Spawner pairs.
0.4 will be the last JupyterHub release where single-user servers running IPython 3 are supported instead of Notebook ≥ 4.0.
0.3 - 2015-11-04#
No longer make the user starting the Hub an admin
start PAM sessions on login
hooks for Authenticators to fire before spawners start and after they stop, allowing deeper interaction between Spawner/Authenticator pairs.
login redirect fixes
0.2 - 2015-07-12#
Based on standalone traitlets instead of IPython.utils.traitlets
multiple users in admin panel
Fixes for usernames that require escaping
0.1 - 2015-03-07#
First preview release
JupyterHub REST API#
NOTE: The contents of this markdown file are not used,
this page is entirely generated from _templates/redoc.html
and _static/rest-api.yml
REST API methods can be linked by their operationId in rest-api.yml,
prefixed with rest-api-
, e.g.
you can [GET /api/users](rest-api-get-users)
JupyterHub API Reference#
- Date:
Apr 18, 2024
- Release:
5.0.0.dev
JupyterHub also provides a REST API for administration of the Hub and users. The documentation on Using JupyterHub’s REST API provides information on:
what you can do with the API
creating an API token
adding API tokens to the config files
making an API request programmatically using the requests library
learning more about JupyterHub’s API
Application configuration#
Module: jupyterhub.app
#
The multi-user notebook application
JupyterHub
#
- class jupyterhub.app.JupyterHub(**kwargs: Any)#
An Application for starting a Multi-User Jupyter Notebook server.
- active_server_limit c.JupyterHub.active_server_limit = Int(0)#
Maximum number of concurrent servers that can be active at a time.
Setting this can limit the total resources your users can consume.
An active server is any server that’s not fully stopped. It is considered active from the time it has been requested until the time that it has completely stopped.
If this many user servers are active, users will not be able to launch new servers until a server is shutdown. Spawn requests will be rejected with a 429 error asking them to try again.
If set to 0, no limit is enforced.
- active_user_window c.JupyterHub.active_user_window = Int(1800)#
Duration (in seconds) to determine the number of active users.
- activity_resolution c.JupyterHub.activity_resolution = Int(30)#
Resolution (in seconds) for updating activity
If activity is registered that is less than activity_resolution seconds more recent than the current value, the new value will be ignored.
This avoids too many writes to the Hub database.
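A minimal sketch of this thresholding (illustrative, not JupyterHub's implementation):

```python
# Ignore an activity update unless it is at least `resolution` seconds
# newer than the currently recorded timestamp, limiting database writes.
def should_record(current_ts, new_ts, resolution=30):
    return (new_ts - current_ts) >= resolution
```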
- admin_access c.JupyterHub.admin_access = Bool(False)#
DEPRECATED since version 2.0.0.
The default admin role has full permissions, use custom RBAC scopes instead to create restricted administrator roles. https://jupyterhub.readthedocs.io/en/stable/rbac/index.html
- admin_users c.JupyterHub.admin_users = Set()#
DEPRECATED since version 0.7.2, use Authenticator.admin_users instead.
- allow_named_servers c.JupyterHub.allow_named_servers = Bool(False)#
Allow named single-user servers per user
- answer_yes c.JupyterHub.answer_yes = Bool(False)#
Answer yes to any questions (e.g. confirm overwrite)
- api_page_default_limit c.JupyterHub.api_page_default_limit = Int(50)#
The default amount of records returned by a paginated endpoint
- api_page_max_limit c.JupyterHub.api_page_max_limit = Int(200)#
The maximum amount of records that can be returned at once
- api_tokens c.JupyterHub.api_tokens = Dict()#
PENDING DEPRECATION: consider using services
Dict of token:username to be loaded into the database.
Allows ahead-of-time generation of API tokens for use by externally managed services, which authenticate as JupyterHub users.
Consider using services for general services that talk to the JupyterHub API.
- authenticate_prometheus c.JupyterHub.authenticate_prometheus = Bool(True)#
Authentication for prometheus metrics
- authenticator_class c.JupyterHub.authenticator_class = EntryPointType(<class 'jupyterhub.auth.PAMAuthenticator'>)#
Class for authenticating users.
This should be a subclass of jupyterhub.auth.Authenticator with an authenticate() method that:
is a coroutine (asyncio or tornado)
returns username on success, None on failure
takes two arguments: (handler, data), where handler is the calling web.RequestHandler, and data is the POST form data from the login page.
Changed in version 1.0: authenticators may be registered via entry points, e.g.
c.JupyterHub.authenticator_class = 'pam'
- Currently installed:
default: jupyterhub.auth.PAMAuthenticator
dummy: jupyterhub.auth.DummyAuthenticator
null: jupyterhub.auth.NullAuthenticator
pam: jupyterhub.auth.PAMAuthenticator
- base_url c.JupyterHub.base_url = URLPrefix('/')#
The base URL of the entire application.
Add this to the beginning of all JupyterHub URLs. Use base_url to run JupyterHub within an existing website.
- bind_url c.JupyterHub.bind_url = Unicode('http://:8000')#
The public facing URL of the whole JupyterHub application.
This is the address on which the proxy will bind. Sets protocol, ip, base_url
- cleanup_proxy c.JupyterHub.cleanup_proxy = Bool(True)#
Whether to shutdown the proxy when the Hub shuts down.
Disable if you want to be able to teardown the Hub while leaving the proxy running.
Only valid if the proxy was started by the Hub process.
If both this and cleanup_servers are False, sending SIGINT to the Hub will only shutdown the Hub, leaving everything else running.
The Hub should be able to resume from database state.
- cleanup_servers c.JupyterHub.cleanup_servers = Bool(True)#
Whether to shutdown single-user servers when the Hub shuts down.
Disable if you want to be able to teardown the Hub while leaving the single-user servers running.
If both this and cleanup_proxy are False, sending SIGINT to the Hub will only shutdown the Hub, leaving everything else running.
The Hub should be able to resume from database state.
- concurrent_spawn_limit c.JupyterHub.concurrent_spawn_limit = Int(100)#
Maximum number of concurrent users that can be spawning at a time.
Spawning lots of servers at the same time can cause performance problems for the Hub or the underlying spawning system. Set this limit to prevent bursts of logins from attempting to spawn too many servers at the same time.
This does not limit the number of total running servers. See active_server_limit for that.
If more than this many users attempt to spawn at a time, their requests will be rejected with a 429 error asking them to try again. Users will have to wait for some of the spawning services to finish starting before they can start their own.
If set to 0, no limit is enforced.
- config_file c.JupyterHub.config_file = Unicode('jupyterhub_config.py')#
The config file to load
- confirm_no_ssl c.JupyterHub.confirm_no_ssl = Bool(False)#
DEPRECATED: does nothing
- cookie_host_prefix_enabled c.JupyterHub.cookie_host_prefix_enabled = Bool(False)#
Enable __Host- prefix on authentication cookies.
The __Host- prefix on JupyterHub cookies provides further protection against cookie tossing when untrusted servers may control subdomains of your jupyterhub deployment.
However, it also requires that cookies be set on the path /, which means they are shared by all JupyterHub components, so a compromised server component will have access to all JupyterHub-related cookies of the visiting browser. It is recommended to only combine __Host- cookies with per-user domains.
Added in version 4.1.
- cookie_max_age_days c.JupyterHub.cookie_max_age_days = Float(14)#
Number of days for a login cookie to be valid. Default is two weeks.
- cookie_secret c.JupyterHub.cookie_secret = Union()#
The cookie secret to use to encrypt cookies.
Loaded from the JPY_COOKIE_SECRET env variable by default.
Should be exactly 256 bits (32 bytes).
- cookie_secret_file c.JupyterHub.cookie_secret_file = Unicode('jupyterhub_cookie_secret')#
File in which to store the cookie secret.
- custom_scopes c.JupyterHub.custom_scopes = Dict()#
Custom scopes to define.
For use when defining custom roles, to grant users granular permissions
All custom scopes must have a description, and must start with the prefix custom:.
For example:
custom_scopes = {
    "custom:jupyter_server:read": {
        "description": "read-only access to a single-user server",
    },
}
- data_files_path c.JupyterHub.data_files_path = Unicode('/home/docs/checkouts/readthedocs.org/user_builds/jupyterhub/envs/latest/share/jupyterhub')#
The location of jupyterhub data files (e.g. /usr/local/share/jupyterhub)
- db_kwargs c.JupyterHub.db_kwargs = Dict()#
Include any kwargs to pass to the database connection. See sqlalchemy.create_engine for details.
- db_url c.JupyterHub.db_url = Unicode('sqlite:///jupyterhub.sqlite')#
url for the database. e.g. sqlite:///jupyterhub.sqlite
- debug_db c.JupyterHub.debug_db = Bool(False)#
log all database transactions. This has A LOT of output
- debug_proxy c.JupyterHub.debug_proxy = Bool(False)#
DEPRECATED since version 0.8: Use ConfigurableHTTPProxy.debug
- default_server_name c.JupyterHub.default_server_name = Unicode('')#
If named servers are enabled, default name of server to spawn or open when no server is specified, e.g. by user-redirect.
Note: This has no effect if named servers are not enabled, and does not change the existence or behavior of the default server named '' (the empty string). This only affects which named server is launched when no server is specified, e.g. by links to /hub/user-redirect/lab/tree/mynotebook.ipynb.
- default_url c.JupyterHub.default_url = Union()#
The default URL for users when they arrive (e.g. when user directs to “/”)
By default, redirects users to their own server.
Can be a Unicode string (e.g. ‘/hub/home’) or a callable based on the handler object:
def default_url_fn(handler):
    user = handler.current_user
    if user and user.admin:
        return '/hub/admin'
    return '/hub/home'

c.JupyterHub.default_url = default_url_fn
- external_ssl_authorities c.JupyterHub.external_ssl_authorities = Dict()#
Dict authority:dict(files). Specify the key, cert, and/or ca file for an authority. This is useful for externally managed proxies that wish to use internal_ssl.
The files dict has this format (you must specify at least a cert):
{
    'key': '/path/to/key.key',
    'cert': '/path/to/cert.crt',
    'ca': '/path/to/ca.crt',
}
The authorities you can override: ‘hub-ca’, ‘notebooks-ca’, ‘proxy-api-ca’, ‘proxy-client-ca’, and ‘services-ca’.
Use with internal_ssl
- extra_handlers c.JupyterHub.extra_handlers = List()#
DEPRECATED.
If you need to register additional HTTP endpoints please use services instead.
- extra_log_file c.JupyterHub.extra_log_file = Unicode('')#
DEPRECATED: use output redirection instead, e.g.
jupyterhub &>> /var/log/jupyterhub.log
- extra_log_handlers c.JupyterHub.extra_log_handlers = List()#
Extra log handlers to set on JupyterHub logger
- forwarded_host_header c.JupyterHub.forwarded_host_header = Unicode('')#
Alternate header to use as the Host (e.g., X-Forwarded-Host) when determining whether a request is cross-origin
This may be useful when JupyterHub is running behind a proxy that rewrites the Host header.
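For example, if a front-end proxy rewrites Host but preserves the original value in X-Forwarded-Host (a common reverse-proxy setup), a minimal sketch:

```python
# jupyterhub_config.py -- use X-Forwarded-Host for cross-origin checks
# when the proxy in front of JupyterHub rewrites the Host header
c.JupyterHub.forwarded_host_header = 'X-Forwarded-Host'
```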
- generate_certs c.JupyterHub.generate_certs = Bool(False)#
Generate certs used for internal ssl
- generate_config c.JupyterHub.generate_config = Bool(False)#
Generate default config file
- hub_bind_url c.JupyterHub.hub_bind_url = Unicode('')#
The URL on which the Hub will listen. This is a private URL for internal communication. Typically set in combination with hub_connect_url. If a unix socket, hub_connect_url must also be set.
For example:
“http://127.0.0.1:8081”
“unix+http://%2Fsrv%2Fjupyterhub%2Fjupyterhub.sock”
Added in version 0.9.
- hub_connect_ip c.JupyterHub.hub_connect_ip = Unicode('')#
The ip or hostname for proxies and spawners to use for connecting to the Hub.
Use when the bind address (hub_ip) is 0.0.0.0, :: or otherwise different from the connect address.
Default: when hub_ip is 0.0.0.0 or ::, use socket.gethostname(), otherwise use hub_ip.
Note: Some spawners or proxy implementations might not support hostnames. Check your spawner or proxy documentation to see if they have extra requirements.
Added in version 0.8.
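A sketch of a containerized deployment where the bind and connect addresses differ (the hostname 'jupyterhub' is an assumed container or DNS name):

```python
# jupyterhub_config.py -- Hub binds on all interfaces, but spawners and
# the proxy reach it via a resolvable hostname (hypothetical name)
c.JupyterHub.hub_ip = '0.0.0.0'
c.JupyterHub.hub_connect_ip = 'jupyterhub'
```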
- hub_connect_port c.JupyterHub.hub_connect_port = Int(0)#
DEPRECATED
Use hub_connect_url
Added in version 0.8.
Deprecated since version 0.9: Use hub_connect_url
- hub_connect_url c.JupyterHub.hub_connect_url = Unicode('')#
The URL for connecting to the Hub. Spawners, services, and the proxy will use this URL to talk to the Hub.
Only needs to be specified if the default hub URL is not connectable (e.g. using a unix+http:// bind url).
See also: JupyterHub.hub_connect_ip, JupyterHub.hub_bind_url
Added in version 0.9.
- hub_ip c.JupyterHub.hub_ip = Unicode('127.0.0.1')#
The ip address for the Hub process to bind to.
By default, the hub listens on localhost only. This address must be accessible from the proxy and user servers. You may need to set this to a public ip or ‘’ for all interfaces if the proxy or user servers are in containers or on a different host.
See hub_connect_ip for cases where the bind and connect address should differ, or hub_bind_url for setting the full bind URL.
- hub_port c.JupyterHub.hub_port = Int(8081)#
The internal port for the Hub process.
This is the internal port of the hub itself. It should never be accessed directly. See JupyterHub.port for the public port to use when accessing jupyterhub. It is rare that this port should be set except in cases of port conflict.
See also hub_ip for the ip and hub_bind_url for setting the full bind URL.
- hub_routespec c.JupyterHub.hub_routespec = Unicode('/')#
The routing prefix for the Hub itself.
Override to send only a subset of traffic to the Hub. Default is to use the Hub as the default route for all requests.
This is necessary for normal jupyterhub operation, as the Hub must receive requests for e.g. /user/:name when the user’s server is not running.
However, some deployments using only the JupyterHub API may want to handle these events themselves, in which case they can register their own default target with the proxy and set e.g. hub_routespec = /hub/ to serve only the hub’s own pages, or even /hub/api/ for api-only operation.
Note: hub_routespec must include the base_url, if any.
Added in version 1.4.
- implicit_spawn_seconds c.JupyterHub.implicit_spawn_seconds = Float(0)#
Trigger implicit spawns after this many seconds.
When a user visits a URL for a server that’s not running, they are shown a page indicating that the requested server is not running with a button to spawn the server.
Setting this to a positive value will redirect the user after this many seconds, effectively clicking the spawn button for them and beginning the spawn process automatically.
Warning: this can result in errors and surprising behavior when sharing access URLs to actual servers, since the wrong server is likely to be started.
- init_spawners_timeout c.JupyterHub.init_spawners_timeout = Int(10)#
Timeout (in seconds) to wait for spawners to initialize
Checking if spawners are healthy can take a long time if many spawners are active at hub start time.
If it takes longer than this timeout to check, init_spawner will be left to complete in the background and the http server is allowed to start.
A timeout of -1 means wait forever, which can mean a slow startup of the Hub but ensures that the Hub is fully consistent by the time it starts responding to requests. This matches the behavior of jupyterhub 1.0.
- internal_certs_location c.JupyterHub.internal_certs_location = Unicode('internal-ssl')#
The location to store certificates automatically created by JupyterHub.
Use with internal_ssl
- internal_ssl c.JupyterHub.internal_ssl = Bool(False)#
Enable SSL for all internal communication
This enables end-to-end encryption between all JupyterHub components. JupyterHub will automatically create the necessary certificate authority and sign notebook certificates as they’re created.
- ip c.JupyterHub.ip = Unicode('')#
The public facing ip of the whole JupyterHub application (specifically referred to as the proxy).
This is the address on which the proxy will listen. The default is to listen on all interfaces. This is the only address through which JupyterHub should be accessed by users.
- jinja_environment_options c.JupyterHub.jinja_environment_options = Dict()#
Supply extra arguments that will be passed to Jinja environment.
- last_activity_interval c.JupyterHub.last_activity_interval = Int(300)#
Interval (in seconds) at which to update last-activity timestamps.
- load_groups c.JupyterHub.load_groups = Dict()#
Dict of {'group': {'users': ['usernames'], 'properties': {}}} to load at startup.
Example:

c.JupyterHub.load_groups = {
    'groupname': {
        'users': ['usernames'],
        'properties': {'key': 'value'},
    },
}
This strictly adds groups and users to groups. Properties, if defined, replace all existing properties.
Loading one set of groups, then starting JupyterHub again with a different set will not remove users or groups from previous launches. That must be done through the API.
Changed in version 3.2: Changed format of group from list of usernames to dict
- load_roles c.JupyterHub.load_roles = List()#
List of predefined role dictionaries to load at startup.
For instance:
load_roles = [
    {
        'name': 'teacher',
        'description': "Access to users' information and group membership",
        'scopes': ['users', 'groups'],
        'users': ['cyclops', 'gandalf'],
        'services': [],
        'groups': [],
    }
]
All keys apart from ‘name’ are optional. See all the available scopes in the JupyterHub REST API documentation.
Default roles are defined in roles.py.
- log_datefmt c.JupyterHub.log_datefmt = Unicode('%Y-%m-%d %H:%M:%S')#
The date format used by logging formatters for %(asctime)s
- log_format c.JupyterHub.log_format = Unicode('[%(name)s]%(highlevel)s %(message)s')#
The Logging format template
- log_level c.JupyterHub.log_level = Enum(30)#
Set the log level by value or name.
- logging_config c.JupyterHub.logging_config = Dict()#
Configure additional log handlers.
The default stderr logs handler is configured by the log_level, log_datefmt and log_format settings.
This configuration can be used to configure additional handlers (e.g. to output the log to a file) or for finer control over the default handlers.
If provided this should be a logging configuration dictionary, for more information see: https://docs.python.org/3/library/logging.config.html#logging-config-dictschema
This dictionary is merged with the base logging configuration which defines the following:
A logging formatter intended for interactive use called console.
A logging handler that writes to stderr called console, which uses the formatter console.
A logger with the name of this application set to DEBUG level.
This example adds a new handler that writes to a file:
c.Application.logging_config = {
    "handlers": {
        "file": {
            "class": "logging.FileHandler",
            "level": "DEBUG",
            "filename": "<path/to/file>",
        }
    },
    "loggers": {
        "<application-name>": {
            "level": "DEBUG",
            # NOTE: if you don't list the default "console"
            # handler here then it will be disabled
            "handlers": ["console", "file"],
        },
    },
}
- logo_file c.JupyterHub.logo_file = Unicode('')#
Specify path to a logo image to override the Jupyter logo in the banner.
- named_server_limit_per_user c.JupyterHub.named_server_limit_per_user = Union(0)#
Maximum number of concurrent named servers that can be created by a user at a time.
Setting this can limit the total resources a user can consume.
If set to 0, no limit is enforced.
Can be an integer or a callable/awaitable based on the handler object:
def named_server_limit_per_user_fn(handler):
    user = handler.current_user
    if user and user.admin:
        return 0
    return 5

c.JupyterHub.named_server_limit_per_user = named_server_limit_per_user_fn
- oauth_token_expires_in c.JupyterHub.oauth_token_expires_in = Int(0)#
Expiry (in seconds) of OAuth access tokens.
The default is to expire when the cookie storing them expires, according to cookie_max_age_days config.
These are the tokens stored in cookies when you visit a single-user server or service. When they expire, you must re-authenticate with the Hub, even if your Hub authentication is still valid. If your Hub authentication is valid, logging in may be a transparent redirect as you refresh the page.
This does not affect JupyterHub API tokens in general, which do not expire by default. Only tokens issued during the oauth flow accessing services and single-user servers are affected.
Added in version 1.4: OAuth token expires_in was not previously configurable.
Changed in version 1.4: Default now uses cookie_max_age_days so that oauth tokens which are generally stored in cookies, expire when the cookies storing them expire. Previously, it was one hour.
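For example, to make oauth access tokens expire after one hour instead of following cookie expiry:

```python
# jupyterhub_config.py -- oauth tokens expire after 3600 seconds (1 hour)
c.JupyterHub.oauth_token_expires_in = 3600
```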
- pid_file c.JupyterHub.pid_file = Unicode('')#
File to write the PID. Useful for daemonizing JupyterHub.
- port c.JupyterHub.port = Int(8000)#
The public facing port of the proxy.
This is the port on which the proxy will listen. This is the only port through which JupyterHub should be accessed by users.
- proxy_api_ip c.JupyterHub.proxy_api_ip = Unicode('')#
DEPRECATED since version 0.8 : Use ConfigurableHTTPProxy.api_url
- proxy_api_port c.JupyterHub.proxy_api_port = Int(0)#
DEPRECATED since version 0.8 : Use ConfigurableHTTPProxy.api_url
- proxy_auth_token c.JupyterHub.proxy_auth_token = Unicode('')#
DEPRECATED since version 0.8: Use ConfigurableHTTPProxy.auth_token
- proxy_check_interval c.JupyterHub.proxy_check_interval = Int(5)#
DEPRECATED since version 0.8: Use ConfigurableHTTPProxy.check_running_interval
- proxy_class c.JupyterHub.proxy_class = EntryPointType(<class 'jupyterhub.proxy.ConfigurableHTTPProxy'>)#
The class to use for configuring the JupyterHub proxy.
Should be a subclass of jupyterhub.proxy.Proxy.
Changed in version 1.0: proxies may be registered via entry points, e.g. c.JupyterHub.proxy_class = 'traefik'
- Currently installed:
configurable-http-proxy: jupyterhub.proxy.ConfigurableHTTPProxy
default: jupyterhub.proxy.ConfigurableHTTPProxy
- proxy_cmd c.JupyterHub.proxy_cmd = Command()#
DEPRECATED since version 0.8. Use ConfigurableHTTPProxy.command
- public_url c.JupyterHub.public_url = Unicode('')#
Set the public URL of JupyterHub
This will skip any detection of URL and protocol from requests, which isn’t always correct when JupyterHub is behind multiple layers of proxies, etc. Usually the failure is detecting http when it’s really https.
Should include the full, public URL of JupyterHub, including the public-facing base_url prefix (i.e. it should include a trailing slash), e.g. https://jupyterhub.example.org/prefix/
- recreate_internal_certs c.JupyterHub.recreate_internal_certs = Bool(False)#
Recreate all certificates used within JupyterHub on restart.
Note: enabling this feature requires restarting all notebook servers.
Use with internal_ssl
- redirect_to_server c.JupyterHub.redirect_to_server = Bool(True)#
Redirect user to server (if running), instead of control panel.
- reset_db c.JupyterHub.reset_db = Bool(False)#
Purge and reset the database.
- service_check_interval c.JupyterHub.service_check_interval = Int(60)#
Interval (in seconds) at which to check connectivity of services with web endpoints.
- service_tokens c.JupyterHub.service_tokens = Dict()#
Dict of token:servicename to be loaded into the database.
Allows ahead-of-time generation of API tokens for use by externally managed services.
- services c.JupyterHub.services = List()#
List of service specification dictionaries.
Each service needs at least a name.
For instance:

services = [
    {
        'name': 'cull_idle',
        'command': ['/path/to/cull_idle_servers.py'],
    },
    {
        'name': 'formgrader',
        'url': 'http://127.0.0.1:1234',
        'api_token': 'super-secret',
        'environment': {},
    },
]
- show_config c.JupyterHub.show_config = Bool(False)#
Instead of starting the Application, dump configuration to stdout
- show_config_json c.JupyterHub.show_config_json = Bool(False)#
Instead of starting the Application, dump configuration to stdout (as JSON)
- shutdown_on_logout c.JupyterHub.shutdown_on_logout = Bool(False)#
Shuts down all user servers on logout
- spawner_class c.JupyterHub.spawner_class = EntryPointType(<class 'jupyterhub.spawner.LocalProcessSpawner'>)#
The class to use for spawning single-user servers.
Should be a subclass of jupyterhub.spawner.Spawner.
Changed in version 1.0: spawners may be registered via entry points, e.g. c.JupyterHub.spawner_class = 'localprocess'
- Currently installed:
default: jupyterhub.spawner.LocalProcessSpawner
localprocess: jupyterhub.spawner.LocalProcessSpawner
simple: jupyterhub.spawner.SimpleLocalProcessSpawner
- ssl_cert c.JupyterHub.ssl_cert = Unicode('')#
Path to SSL certificate file for the public facing interface of the proxy
When setting this, you should also set ssl_key
- ssl_key c.JupyterHub.ssl_key = Unicode('')#
Path to SSL key file for the public facing interface of the proxy
When setting this, you should also set ssl_cert
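The two options are set together; the paths below are placeholders:

```python
# jupyterhub_config.py -- TLS for the proxy's public interface
# (both file paths are hypothetical)
c.JupyterHub.ssl_cert = '/etc/jupyterhub/ssl/hub.example.org.crt'
c.JupyterHub.ssl_key = '/etc/jupyterhub/ssl/hub.example.org.key'
```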
- statsd_host c.JupyterHub.statsd_host = Unicode('')#
Host to send statsd metrics to. An empty string (the default) disables sending metrics.
- statsd_port c.JupyterHub.statsd_port = Int(8125)#
Port on which to send statsd metrics about the hub
- statsd_prefix c.JupyterHub.statsd_prefix = Unicode('jupyterhub')#
Prefix to use for all metrics sent by jupyterhub to statsd
- subdomain_hook c.JupyterHub.subdomain_hook = Union('idna')#
Hook for constructing subdomains for users and services. Only used when JupyterHub.subdomain_host is set.
There are two predefined hooks, which can be selected by name:
‘legacy’ (deprecated)
‘idna’ (default, more robust. No change for _most_ usernames)
Otherwise, should be a function which must not be async. A custom subdomain_hook should have the signature:

def subdomain_hook(name, domain, kind) -> str:
    ...

and should return a unique, valid domain name for all usernames.
name is the original name, which may need escaping to be safe as a domain name label
domain is the domain of the Hub itself
kind will be one of ‘user’ or ‘service’
JupyterHub itself puts very little limit on usernames to accommodate a wide variety of Authenticators, but your identity provider is likely much more strict, allowing you to make assumptions about the name.
The default behavior is to have all services on a single services.{domain} subdomain, and each user on {username}.{domain}. This is the ‘legacy’ scheme, and doesn’t work for all usernames.
The ‘idna’ scheme is a new scheme that should produce a valid domain name for any user, using IDNA encoding for unicode usernames, and a truncate-and-hash approach for any usernames that can’t be easily encoded into a domain component.
Added in version 5.0.
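A minimal custom hook could look like the sketch below. The 'u-'/'svc-' prefixes are arbitrary illustrative choices, and the sketch assumes usernames are already safe as DNS labels (which the built-in 'idna' hook does not assume):

```python
# Hypothetical subdomain hook: users get "u-<name>.<domain>",
# services get "svc-<name>.<domain>". Must be a plain (non-async) function.
def subdomain_hook(name, domain, kind):
    prefix = "u" if kind == "user" else "svc"
    return f"{prefix}-{name}.{domain}"

# in jupyterhub_config.py:
# c.JupyterHub.subdomain_hook = subdomain_hook

print(subdomain_hook("alice", "hub.example.org", "user"))  # u-alice.hub.example.org
```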
- subdomain_host c.JupyterHub.subdomain_host = Unicode('')#
Run single-user servers on subdomains of this host.
This should be the full https://hub.domain.tld[:port].
Provides additional cross-site protections for javascript served by single-user servers.
Requires <username>.hub.domain.tld to resolve to the same host as hub.domain.tld.
.In general, this is most easily achieved with wildcard DNS.
When using SSL (i.e. always) this also requires a wildcard SSL certificate.
- template_paths c.JupyterHub.template_paths = List()#
Paths to search for jinja templates, before using the default templates.
- template_vars c.JupyterHub.template_vars = Dict()#
Extra variables to be passed into jinja templates.
Values in dict may contain callable objects. If value is callable, the current user is passed as argument.
Example:
def callable_value(user):
    # user is generated by handlers.base.get_current_user
    with open("/tmp/file.txt", "r") as f:
        ret = f.read()
    ret = ret.replace("<username>", user.name)
    return ret

c.JupyterHub.template_vars = {
    "key1": "value1",
    "key2": callable_value,
}
- tornado_settings c.JupyterHub.tornado_settings = Dict()#
Extra settings overrides to pass to the tornado application.
- trust_user_provided_tokens c.JupyterHub.trust_user_provided_tokens = Bool(False)#
Trust user-provided tokens (via JupyterHub.service_tokens) to have good entropy.
If you are not inserting additional tokens via configuration file, this flag has no effect.
In JupyterHub 0.8, internally generated tokens do not pass through additional hashing because the hashing is costly and does not increase the entropy of already-good UUIDs.
User-provided tokens, on the other hand, are not trusted to have good entropy by default, and are passed through many rounds of hashing to stretch the entropy of the key (i.e. user-provided tokens are treated as passwords instead of random keys). These keys are more costly to check.
If your inserted tokens are generated by a good-quality mechanism, e.g. openssl rand -hex 32, then you can set this flag to True to reduce the cost of checking authentication tokens.
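A sketch of generating such a token with Python's standard library (equivalent entropy to openssl rand -hex 32); the service name is hypothetical:

```python
import secrets

# 32 random bytes, hex-encoded -> a 64-character token
token = secrets.token_hex(32)

# in jupyterhub_config.py:
# c.JupyterHub.service_tokens = {token: 'my-external-service'}
# c.JupyterHub.trust_user_provided_tokens = True
```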
- trusted_alt_names c.JupyterHub.trusted_alt_names = List()#
Names to include in the subject alternative name.
These names will be used for server name verification. This is useful if JupyterHub is being run behind a reverse proxy or services using ssl are on different hosts.
Use with internal_ssl
- trusted_downstream_ips c.JupyterHub.trusted_downstream_ips = List()#
Downstream proxy IP addresses to trust.
This sets the list of IP addresses that are trusted and skipped when processing the X-Forwarded-For header. For example, if an external proxy is used for TLS termination, its IP address should be added to this list to ensure the correct client IP addresses are recorded in the logs instead of the proxy server’s IP address.
- upgrade_db c.JupyterHub.upgrade_db = Bool(False)#
Upgrade the database automatically on start.
Only safe if database is regularly backed up. Only SQLite databases will be backed up to a local file automatically.
- use_legacy_stopped_server_status_code c.JupyterHub.use_legacy_stopped_server_status_code = Bool(False)#
Return 503 rather than 424 when request comes in for a non-running server.
Prior to JupyterHub 2.0, we returned a 503 when any request came in for a user server that was currently not running. By default, JupyterHub 2.0 will return a 424 - this makes operational metric dashboards more useful.
JupyterLab < 3.2 expected the 503 to know if the user server is no longer running, and prompted the user to start their server. Set this config to true to retain the old behavior, so JupyterLab < 3.2 can continue to show the appropriate UI when the user server is stopped.
This option will be removed in a future release.
- user_redirect_hook c.JupyterHub.user_redirect_hook = Callable(None)#
Callable to affect behavior of /user-redirect/
Receives 4 parameters:
1. path - URL path that was provided after /user-redirect/
2. request - A Tornado HTTPServerRequest representing the current request.
3. user - The currently authenticated user.
4. base_url - The base_url of the current hub, for relative redirects
It should return the new URL to redirect to, or None to preserve current behavior.
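As a sketch of the hook's shape (the routing choice is hypothetical: admins' /user-redirect/ links open in a named server called "admin", everyone else keeps the default behavior):

```python
# Hypothetical user_redirect_hook; returning None preserves the
# default /user-redirect/ behavior.
def user_redirect_hook(path, request, user, base_url):
    if user is not None and getattr(user, "admin", False):
        # named-server URL layout: <base_url>user/<name>/<server_name>/<path>
        return f"{base_url}user/{user.name}/admin/{path}"
    return None

# in jupyterhub_config.py:
# c.JupyterHub.user_redirect_hook = user_redirect_hook
```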
Authenticators#
Module: jupyterhub.auth#
Base Authenticator class and the default PAM Authenticator
Authenticator#
- class jupyterhub.auth.Authenticator(**kwargs: Any)#
Base class for implementing an authentication provider for JupyterHub
- add_user(user)#
Hook called when a user is added to JupyterHub
- This is called:
When a user first authenticates, _after_ all allow and block checks have passed
When the hub restarts, for all users in the database (i.e. users previously allowed)
When a user is added to the database, either via configuration or REST API
This method may be a coroutine.
By default, this adds the user to the allowed_users set if allow_existing_users is true.
Subclasses may do more extensive things, such as creating actual system users, but they should call super to ensure the allowed_users set is updated.
Note that this should be idempotent, since it is called whenever the hub restarts for all users.
Changed in version 5.0: Now adds users to the allowed_users set if allow_all is False and allow_existing_users is True, instead of if allowed_users is not empty.
- Parameters:
user (User) – The User wrapper object
- admin_users c.Authenticator.admin_users = Set()#
Set of users that will have admin rights on this JupyterHub.
Note: As of JupyterHub 2.0, full admin rights should not be required, and more precise permissions can be managed via roles.
- Admin users have extra privileges:
Use the admin panel to see the list of logged-in users
Add / remove users in some authenticators
Restart / halt the hub
Start / stop users’ single-user servers
Access each individual user’s single-user server (if configured)
Admin access should be treated the same way root access is.
Defaults to an empty set, in which case no user has admin access.
- allow_all c.Authenticator.allow_all = Bool(False)#
Allow every user who can successfully authenticate to access JupyterHub.
False by default, which means for most Authenticators, _some_ allow-related configuration is required to allow users to log in.
Authenticator subclasses may override the default with e.g.:
@default("allow_all")
def _default_allow_all(self):
    # if _any_ auth config (depends on the Authenticator)
    if self.allowed_users or self.allowed_groups or self.allow_existing_users:
        return False
    else:
        return True
Added in version 5.0.
Changed in version 5.0: Prior to 5.0, allow_all wasn’t defined on its own, and was instead implicitly True when no allow config was provided, i.e. allowed_users unspecified or empty on the base Authenticator class.
To preserve pre-5.0 behavior, set allow_all = True if you have no other allow configuration.
- allow_existing_users c.Authenticator.allow_existing_users = Bool(False)#
Allow existing users to login.
Defaults to True if allowed_users is set for historical reasons, and False otherwise.
With this enabled, all users present in the JupyterHub database are allowed to login. This has the effect that any user who has _previously_ been allowed to login via any means will continue to be allowed until the user is deleted via the /hub/admin page or REST API.
Warning
Before enabling this you should review the existing users in the JupyterHub admin panel at /hub/admin. You may find users existing there because they have previously been declared in config such as allowed_users or allowed to sign in.
Warning
When this is enabled and you wish to remove access for one or more users previously allowed, you must make sure that they are removed from the jupyterhub database. This can be tricky to do if you stop allowing an externally managed group of users for example.
With this enabled, JupyterHub admin users can visit /hub/admin or use JupyterHub’s REST API to add and remove users to manage who can login.
Added in version 5.0.
- allowed_users c.Authenticator.allowed_users = Set()#
Set of usernames that are allowed to log in.
Use this to limit which authenticated users may login. Default behavior: only users in this set are allowed.
If empty, does not perform any restriction, in which case any authenticated user is allowed.
Authenticators may extend Authenticator.check_allowed() to combine allowed_users with other configuration to either expand or restrict access.
Changed in version 1.2: Authenticator.whitelist renamed to allowed_users
- any_allow_config c.Authenticator.any_allow_config = Bool(False)#
Is there any allow config?
Used to show a warning if it looks like nobody can access the Hub, which can happen when upgrading to JupyterHub 5, now that allow_all defaults to False.
Deployments can set this explicitly to True to suppress the “No allow config found” warning.
Will be True if any config tagged with .tag(allow_config=True) or whose name starts with allow is truthy.
Added in version 5.0.
- auth_refresh_age c.Authenticator.auth_refresh_age = Int(300)#
The max age (in seconds) of authentication info before forcing a refresh of user auth info.
Refreshing auth info allows, e.g., requesting or re-validating auth tokens.
See refresh_user() for what happens when user auth info is refreshed (nothing by default).
- async authenticate(handler, data)#
Authenticate a user with login form data
This must be a coroutine.
It must return the username on successful authentication, and return None on failed authentication.
Subclasses can also raise a web.HTTPError(403, message) in order to halt the authentication process and customize the error message that will be shown to the user. This error may be raised anywhere in the authentication process (authenticate, check_allowed, check_blocked_users).
Checking allowed_users/blocked_users is handled separately by the caller.
Changed in version 0.8: Allow authenticate to return a dict containing auth_state.
- Parameters:
handler (tornado.web.RequestHandler) – the current request handler
data (dict) – The formdata of the login form. The default form has ‘username’ and ‘password’ fields.
- Returns:
The username of the authenticated user, or None if Authentication failed.
The Authenticator may return a dict instead, which MUST have a key name holding the username, and MAY have additional keys:
auth_state, a dictionary of auth state that will be persisted
admin, the admin setting value for the user
groups, the list of group names the user should be a member of, if Authenticator.manage_groups is True. groups MUST always be present if manage_groups is enabled.
- Return type:
- Raises:
web.HTTPError(403) – Raising errors directly allows customizing the message shown to the user.
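The core contract can be sketched with a standalone coroutine (not a real Authenticator subclass; the in-memory credential table is purely illustrative — a real authenticator would check PAM, OAuth, LDAP, etc., and never store plaintext passwords):

```python
import asyncio

# Illustrative credential table -- for the sketch only
_USERS = {"ada": "correct horse battery staple"}

async def authenticate(handler, data):
    """Return the username on success, None on failure."""
    username = data.get("username", "")
    if _USERS.get(username) == data.get("password"):
        # could also return a dict, e.g.
        # {"name": username, "auth_state": {"token": "..."}}
        return username
    return None
```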
- auto_login c.Authenticator.auto_login = Bool(False)#
Automatically begin the login process
rather than starting with a “Login with…” link at /hub/login.
To work, .login_url() must give a URL other than the default /hub/login, such as an oauth handler or another automatic login handler, registered with .get_handlers().
Added in version 0.8.
- auto_login_oauth2_authorize c.Authenticator.auto_login_oauth2_authorize = Bool(False)#
Automatically begin login process for OAuth2 authorization requests
When another application is using JupyterHub as OAuth2 provider, it sends users to /hub/api/oauth2/authorize. If the user isn’t logged in already, and auto_login is not set, the user will be dumped on the hub’s home page, without any context on what to do next.
Setting this to true will automatically redirect users to login if they aren’t logged in, but only on the /hub/api/oauth2/authorize endpoint.
Added in version 1.5.
- blocked_users c.Authenticator.blocked_users = Set()#
Set of usernames that are not allowed to log in.
Use this with supported authenticators to block specific users from logging in. This is an additional block list that further restricts users, beyond whatever restrictions the authenticator has in place.
If empty, does not perform any additional restriction.
Changed in version 1.2: Authenticator.blacklist renamed to blocked_users
- check_allow_config()#
Log a warning if no allow config can be found.
Could get a false positive if _only_ unrecognized allow config is used. Authenticators can apply .tag(allow_config=True) to label this config to make sure it is found.
Subclasses can override to perform additional checks and warn about likely authenticator configuration problems.
Added in version 5.0.
- check_allowed(username, authentication=None)#
Check if a username is allowed to authenticate based on configuration
Return True if username is allowed, False otherwise.
No allowed_users set means any username is allowed.
Names are normalized before being checked against the allowed set.
Changed in version 1.0: Signature updated to accept authentication data and any future changes
Changed in version 1.2: Renamed check_whitelist to check_allowed
- Parameters:
- Returns:
Whether the user is allowed
- Return type:
allowed (bool)
- Raises:
web.HTTPError(403) – Raising HTTPErrors directly allows customizing the message shown to the user.
- check_blocked_users(username, authentication=None)#
Check if a username is blocked from authenticating, based on the Authenticator.blocked_users configuration
Return True if username is allowed, False otherwise. No block list means any username is allowed.
Names are normalized before being checked against the block list.
Changed in version 1.0: Signature updated to accept authentication data as second argument
Changed in version 1.2: Renamed check_blacklist to check_blocked_users
- Parameters:
- Returns:
Whether the user is allowed
- Return type:
allowed (bool)
- Raises:
web.HTTPError(403, message) – Raising HTTPErrors directly allows customizing the message shown to the user.
- custom_html Unicode('')#
HTML form to be overridden by authenticators if they want a custom authentication form.
Defaults to an empty string, which shows the default username/password form.
- delete_invalid_users c.Authenticator.delete_invalid_users = Bool(False)#
Delete any users from the database that do not pass validation
When JupyterHub starts, .add_user will be called on each user in the database to verify that all users are still valid.
If delete_invalid_users is True, any users that do not pass validation will be deleted from the database. Use this if users might be deleted from an external system, such as local user accounts.
If False (default), invalid users remain in the Hub’s database and a warning will be issued. This is the default to avoid data loss due to config changes.
- delete_user(user)#
Hook called when a user is deleted
Removes the user from the allowed_users set. Subclasses should call super to ensure the allowed_users set is updated.
- Parameters:
user (User) – The User wrapper object
- enable_auth_state c.Authenticator.enable_auth_state = Bool(False)#
Enable persisting auth_state (if available).
auth_state will be encrypted and stored in the Hub’s database. This can include things like authentication tokens, etc. to be passed to Spawners as environment variables.
Encrypting auth_state requires the cryptography package.
Additionally, the JUPYTERHUB_CRYPT_KEY environment variable must contain one (or more, separated by ;) 32B encryption keys. These can be either base64 or hex-encoded.
If encryption is unavailable, auth_state cannot be persisted.
New in JupyterHub 0.8
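A sketch of generating a suitable key with Python's standard library (32 bytes, hex-encoded):

```python
import secrets

# 32-byte hex-encoded key for the JUPYTERHUB_CRYPT_KEY environment variable
crypt_key = secrets.token_hex(32)
print(crypt_key)
# then, before starting the Hub:
#   export JUPYTERHUB_CRYPT_KEY=<the printed value>
```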
- async get_authenticated_user(handler, data)#
Authenticate the user who is attempting to log in
Returns user dict if successful, None otherwise.
This calls authenticate, which should be overridden in subclasses, normalizes the username if any normalization should be done, and then validates the name in the allowed set.
This is the outer API for authenticating a user. Subclasses should not override this method.
- The various stages can be overridden separately:
authenticate turns formdata into a username
normalize_username normalizes the username
check_blocked_users checks against the blocked usernames
allow_all is checked
check_allowed checks against the allowed usernames
is_admin checks if a user is an admin
Changed in version 0.8: return dict instead of username
- get_custom_html(base_url)#
Get custom HTML for the authenticator.
- get_handlers(app)#
Return any custom handlers the authenticator needs to register
Used in conjunction with login_url and logout_url.
- Parameters:
app (JupyterHub Application) – the application object, in case it needs to be accessed for info.
- Returns:
list of ('/url', Handler) tuples passed to tornado. The Hub prefix is added to any URLs.
- Return type:
handlers (list)
- is_admin(handler, authentication)#
Authentication helper to determine a user’s admin status.
- Parameters:
handler (tornado.web.RequestHandler) – the current request handler
authentication – The authentication dict generated by authenticate.
- Returns:
The admin status of the user, or None if it could not be determined or should not change.
- Return type:
admin_status (Bool or None)
- async load_managed_roles()#
Load roles managed by authenticator.
Returns a list of predefined role dictionaries to load at startup, following the same format as JupyterHub.load_roles.
Added in version 5.0.
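As a sketch of the expected return format (role names, scopes, and users here are illustrative; a real implementation would subclass jupyterhub.auth.Authenticator), an override returning one predefined role might look like:

```python
import asyncio

# Sketch only: stands in for a subclass of jupyterhub.auth.Authenticator
class SketchAuthenticator:
    async def load_managed_roles(self):
        # Same entry format as the JupyterHub.load_roles traitlet
        return [
            {
                "name": "class-instructor",  # role name (illustrative)
                "scopes": ["admin:users"],   # permissions granted by the role
                "users": ["alice"],          # users assigned the role
            }
        ]

roles = asyncio.run(SketchAuthenticator().load_managed_roles())
```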
- login_service Unicode('')#
Name of the login service that this authenticator is using to authenticate users.
Example: GitHub, MediaWiki, Google, etc.
Setting this value replaces the login form with a “Login with <login_service>” button.
Any authenticator that redirects to an external service (e.g. using OAuth) should set this.
- login_url(base_url)#
Override this when registering a custom login handler
Generally used by authenticators that do not use simple form-based authentication.
The subclass overriding this is responsible for making sure there is a handler available to handle the URL returned from this method, using the get_handlers method.
- logout_url(base_url)#
Override when registering a custom logout handler
The subclass overriding this is responsible for making sure there is a handler available to handle the URL returned from this method, using the get_handlers method.
- manage_groups c.Authenticator.manage_groups = Bool(False)#
Let authenticator manage user groups
If True, Authenticator.authenticate and/or .refresh_user may return a list of group names in the ‘groups’ field, which will be assigned to the user.
All group-assignment APIs are disabled if this is True.
- manage_roles c.Authenticator.manage_roles = Bool(False)#
Let authenticator manage roles
If True, Authenticator.authenticate and/or .refresh_user may return a list of roles in the ‘roles’ field, which will be added to the database.
When enabled, all role management will be handled by the authenticator; in particular, assignment of roles via the JupyterHub.load_roles traitlet will not be possible.
Added in version 5.0.
- normalize_username(username)#
Normalize the given username and return it
Override in subclasses if usernames need different normalization rules.
The default attempts to lowercase the username and apply username_map if it is set.
- otp_prompt c.Authenticator.otp_prompt = Any('OTP:')#
The prompt string for the extra OTP (One Time Password) field.
Added in version 5.0.
- post_auth_hook c.Authenticator.post_auth_hook = Any(None)#
An optional hook function that you can implement to do some bootstrapping work during authentication. For example, loading user account details from an external system.
This function is called after the user has passed all authentication checks and is ready to successfully authenticate. This function must return the auth_model dict regardless of changes to it. The hook is called with 3 positional arguments: (authenticator, handler, auth_model).
This may be a coroutine.
Example:
import os
import pwd

def my_hook(authenticator, handler, auth_model):
    user_data = pwd.getpwnam(auth_model['name'])
    spawn_data = {
        'pw_data': user_data,
        'gid_list': os.getgrouplist(auth_model['name'], user_data.pw_gid),
    }
    if auth_model['auth_state'] is None:
        auth_model['auth_state'] = {}
    auth_model['auth_state']['spawn_data'] = spawn_data
    return auth_model

c.Authenticator.post_auth_hook = my_hook
- post_spawn_stop(user, spawner)#
Hook called after stopping a user container
Can be used to do auth-related cleanup, e.g. closing PAM sessions.
- pre_spawn_start(user, spawner)#
Hook called before spawning a user’s server
Can be used to do auth-related startup, e.g. opening PAM sessions.
- refresh_pre_spawn c.Authenticator.refresh_pre_spawn = Bool(False)#
Force refresh of auth prior to spawn.
This forces refresh_user() to be called prior to launching a server, to ensure that auth state is up-to-date.
This can be important when e.g. auth tokens that may have expired are passed to the spawner via environment variables from auth_state.
If refresh_user cannot refresh the user auth data, launch will fail until the user logs in again.
- async refresh_user(user, handler=None)#
Refresh auth data for a given user
Allows refreshing or invalidating auth data.
Only override if your authenticator needs to refresh its data about users once in a while.
- Parameters:
user (User) – the user to refresh
handler (tornado.web.RequestHandler or None) – the current request handler
- Returns:
Return True if auth data for the user is up-to-date and no updates are required.
Return False if the user’s auth data has expired, and they should be required to login again.
Return a dict of auth data if some values should be updated. This dict should have the same structure as that returned by authenticate() when it returns a dict. Any fields present will refresh the value for the user. Any fields not present will be left unchanged. This can include updating .admin or .auth_state fields.
- Return type:
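The three possible return values described above can be illustrated with a small standalone helper (a sketch of the contract only; refresh_decision, expires_at, and refresh_token are hypothetical names, not JupyterHub API):

```python
import time

def refresh_decision(auth_state, now=None):
    """Sketch of the refresh_user contract: True = auth data up-to-date,
    False = require login again, dict = partial update of auth data."""
    now = now if now is not None else time.time()
    if auth_state.get("expires_at", 0) > now:
        return True  # auth data still valid, nothing to do
    if not auth_state.get("refresh_token"):
        return False  # expired and unrecoverable: force re-login
    # Pretend we exchanged the refresh token for a new access token
    auth_state["expires_at"] = now + 3600
    return {"auth_state": auth_state}  # fields present refresh; absent ones are unchanged
```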
- request_otp c.Authenticator.request_otp = Bool(False)#
Prompt for OTP (One Time Password) in the login form.
Added in version 5.0.
- reset_managed_roles_on_startup c.Authenticator.reset_managed_roles_on_startup = Bool(False)#
Reset managed roles to result of load_managed_roles() on startup.
- If True:
stale managed roles will be removed,
stale assignments to managed roles will be removed.
Any role not present in load_managed_roles() will be considered 'stale'.
The 'stale' status for role assignments is also determined from the load_managed_roles() result:
user role assignment status will depend on whether the users key is defined or not:
if a list is defined under the users key and the user is not listed, then the user role assignment will be considered 'stale',
if the users key is not provided, the user role assignment will be preserved;
service and group role assignments will be considered 'stale':
if not included in the services and groups list,
if the services and groups keys are not provided.
Added in version 5.0.
- async run_post_auth_hook(handler, auth_model)#
Run the post_auth_hook if defined
- Parameters:
handler (tornado.web.RequestHandler) – the current request handler
auth_model (dict) – User authentication data dictionary. Contains the username (‘name’), admin status (‘admin’), and auth state dictionary (‘auth_state’).
- Returns:
The hook must always return the auth_model dict
- Return type:
auth_model (dict)
- username_map c.Authenticator.username_map = Dict()#
Dictionary mapping authenticator usernames to JupyterHub users.
Primarily used to normalize OAuth user names to local users.
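For example, a deployment whose OAuth provider reports usernames differently from the local accounts might map them in jupyterhub_config.py (a sketch; the mapping keys and values are illustrative):

```python
# jupyterhub_config.py (sketch)
c.Authenticator.username_map = {
    "Alice.Smith@example.com": "asmith",
    "Bob.Jones@example.com": "bjones",
}
```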
- username_pattern c.Authenticator.username_pattern = Unicode('')#
Regular expression pattern that all valid usernames must match.
If a username does not match the pattern specified here, authentication will not be attempted.
If not set, allow any username.
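For instance, a deployment could restrict usernames to short lowercase identifiers (the pattern here is an illustrative choice, not a JupyterHub default):

```python
import re

# What a deployment might set via:
#   c.Authenticator.username_pattern = r'^[a-z][a-z0-9_]{2,15}$'
pattern = re.compile(r'^[a-z][a-z0-9_]{2,15}$')

ok = bool(pattern.fullmatch("alice"))    # matches: lowercase, 5 chars
bad = bool(pattern.fullmatch("Alice!"))  # rejected: uppercase and punctuation
```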
- username_regex Any(None)#
Compiled regex kept in sync with username_pattern
- validate_username(username)#
Validate a normalized username
Return True if username is valid, False otherwise.
- whitelist c.Authenticator.whitelist = Set()#
Deprecated, use Authenticator.allowed_users
LocalAuthenticator#
- class jupyterhub.auth.LocalAuthenticator(**kwargs: Any)#
Base class for Authenticators that work with local Linux/UNIX users
Checks for local users, and can attempt to create them if they don't exist.
- add_system_user(user)#
Create a new local UNIX user on the system.
Tested to work on FreeBSD and Linux, at least.
- async add_user(user)#
Hook called whenever a new user is added
If self.create_system_users is True, an attempt will be made to create the user on the system if it doesn't already exist.
- add_user_cmd c.LocalAuthenticator.add_user_cmd = Command()#
The command to use for creating users as a list of strings
For each element in the list, the string USERNAME will be replaced with the user’s username. The username will also be appended as the final argument.
For Linux, the default value is:
['adduser', '-q', '--gecos', '""', '--disabled-password']
To specify a custom home directory, set this to:
['adduser', '-q', '--gecos', '""', '--home', '/customhome/USERNAME', '--disabled-password']
This will run the command:
adduser -q --gecos "" --home /customhome/river --disabled-password river
when the user ‘river’ is created.
- admin_users c.LocalAuthenticator.admin_users = Set()#
Set of users that will have admin rights on this JupyterHub.
Note: As of JupyterHub 2.0, full admin rights should not be required, and more precise permissions can be managed via roles.
- Admin users have extra privileges:
Use the admin panel to see the list of users logged in
Add / remove users in some authenticators
Restart / halt the hub
Start / stop users' single-user servers
Access each individual user's single-user server (if configured)
Admin access should be treated the same way root access is.
Defaults to an empty set, in which case no user has admin access.
- allow_all c.LocalAuthenticator.allow_all = Bool(False)#
Allow every user who can successfully authenticate to access JupyterHub.
False by default, which means for most Authenticators, _some_ allow-related configuration is required to allow users to log in.
Authenticator subclasses may override the default with e.g.:
@default("allow_all")
def _default_allow_all(self):
    # if _any_ auth config (depends on the Authenticator)
    if self.allowed_users or self.allowed_groups or self.allow_existing_users:
        return False
    else:
        return True
Added in version 5.0.
Changed in version 5.0: Prior to 5.0, allow_all wasn't defined on its own, and was instead implicitly True when no allow config was provided, i.e. allowed_users unspecified or empty on the base Authenticator class.
To preserve pre-5.0 behavior, set allow_all = True if you have no other allow configuration.
- allow_existing_users c.LocalAuthenticator.allow_existing_users = Bool(False)#
Allow existing users to login.
Defaults to True if allowed_users is set for historical reasons, and False otherwise.
With this enabled, all users present in the JupyterHub database are allowed to login. This has the effect that any user who has _previously_ been allowed to login via any means will continue to be allowed until the user is deleted via the /hub/admin page or REST API.
Warning
Before enabling this you should review the existing users in the JupyterHub admin panel at /hub/admin. You may find users existing there because they have previously been declared in config such as allowed_users or allowed to sign in.
Warning
When this is enabled and you wish to remove access for one or more users previously allowed, you must make sure that they are removed from the jupyterhub database. This can be tricky to do if you stop allowing an externally managed group of users, for example.
With this enabled, JupyterHub admin users can visit /hub/admin or use JupyterHub's REST API to add and remove users to manage who can login.
Added in version 5.0.
- allowed_groups c.LocalAuthenticator.allowed_groups = Set()#
Allow login from all users in these UNIX groups.
Changed in version 5.0: allowed_groups may be specified together with allowed_users, to grant access by group OR name.
- allowed_users c.LocalAuthenticator.allowed_users = Set()#
Set of usernames that are allowed to log in.
Use this to limit which authenticated users may login. Default behavior: only users in this set are allowed.
If empty, does not perform any restriction, in which case any authenticated user is allowed.
Authenticators may extend Authenticator.check_allowed() to combine allowed_users with other configuration to either expand or restrict access.
Changed in version 1.2: Authenticator.whitelist renamed to allowed_users
- any_allow_config c.LocalAuthenticator.any_allow_config = Bool(False)#
Is there any allow config?
Used to show a warning if it looks like nobody can access the Hub, which can happen when upgrading to JupyterHub 5, now that allow_all defaults to False.
Deployments can set this explicitly to True to suppress the "No allow config found" warning.
Will be True if any config tagged with .tag(allow_config=True) or starting with allow is truthy.
Added in version 5.0.
- auth_refresh_age c.LocalAuthenticator.auth_refresh_age = Int(300)#
The max age (in seconds) of authentication info before forcing a refresh of user auth info.
Refreshing auth info allows, e.g. requesting/re-validating auth tokens.
See refresh_user() for what happens when user auth info is refreshed (nothing by default).
- auto_login c.LocalAuthenticator.auto_login = Bool(False)#
Automatically begin the login process
rather than starting with a "Login with…" link at /hub/login.
To work, .login_url() must give a URL other than the default /hub/login, such as an oauth handler or another automatic login handler, registered with .get_handlers().
Added in version 0.8.
- auto_login_oauth2_authorize c.LocalAuthenticator.auto_login_oauth2_authorize = Bool(False)#
Automatically begin login process for OAuth2 authorization requests
When another application is using JupyterHub as OAuth2 provider, it sends users to /hub/api/oauth2/authorize. If the user isn't logged in already, and auto_login is not set, the user will be dumped on the hub's home page, without any context on what to do next.
Setting this to true will automatically redirect users to login if they aren't logged in, only on the /hub/api/oauth2/authorize endpoint.
Added in version 1.5.
- blocked_users c.LocalAuthenticator.blocked_users = Set()#
Set of usernames that are not allowed to log in.
Use this with supported authenticators to specify which users are not allowed to log in. This is an additional block list that further restricts users, beyond whatever restrictions the authenticator has in place.
If empty, does not perform any additional restriction.
Changed in version 1.2: Authenticator.blacklist renamed to blocked_users
- check_allowed(username, authentication=None)#
Check if a username is allowed to authenticate based on configuration
Return True if username is allowed, False otherwise.
No allowed_users set means any username is allowed.
Names are normalized before being checked against the allowed set.
Changed in version 1.0: Signature updated to accept authentication data and any future changes
Changed in version 1.2: Renamed check_whitelist to check_allowed
- Parameters:
- Returns:
Whether the user is allowed
- Return type:
allowed (bool)
- Raises:
web.HTTPError(403) – Raising HTTPErrors directly allows customizing the message shown to the user.
- check_allowed_groups(username, authentication=None)#
If allowed_groups is configured, check if authenticating user is part of group.
- create_system_users c.LocalAuthenticator.create_system_users = Bool(False)#
If set to True, will attempt to create local system users if they do not exist already.
Supports Linux and BSD variants only.
- delete_invalid_users c.LocalAuthenticator.delete_invalid_users = Bool(False)#
Delete any users from the database that do not pass validation
When JupyterHub starts, .add_user will be called on each user in the database to verify that all users are still valid.
If delete_invalid_users is True, any users that do not pass validation will be deleted from the database. Use this if users might be deleted from an external system, such as local user accounts.
If False (default), invalid users remain in the Hub's database and a warning will be issued. This is the default to avoid data loss due to config changes.
- enable_auth_state c.LocalAuthenticator.enable_auth_state = Bool(False)#
Enable persisting auth_state (if available).
auth_state will be encrypted and stored in the Hub’s database. This can include things like authentication tokens, etc. to be passed to Spawners as environment variables.
Encrypting auth_state requires the cryptography package.
Additionally, the JUPYTERHUB_CRYPT_KEY environment variable must contain one (or more, separated by ;) 32B encryption keys. These can be either base64 or hex-encoded.
If encryption is unavailable, auth_state cannot be persisted.
New in JupyterHub 0.8
- group_whitelist c.LocalAuthenticator.group_whitelist = Set()#
DEPRECATED: use allowed_groups
- manage_groups c.LocalAuthenticator.manage_groups = Bool(False)#
Let authenticator manage user groups
If True, Authenticator.authenticate and/or .refresh_user may return a list of group names in the ‘groups’ field, which will be assigned to the user.
All group-assignment APIs are disabled if this is True.
- manage_roles c.LocalAuthenticator.manage_roles = Bool(False)#
Let authenticator manage roles
If True, Authenticator.authenticate and/or .refresh_user may return a list of roles in the ‘roles’ field, which will be added to the database.
When enabled, all role management will be handled by the authenticator; in particular, assignment of roles via the JupyterHub.load_roles traitlet will not be possible.
Added in version 5.0.
- otp_prompt c.LocalAuthenticator.otp_prompt = Any('OTP:')#
The prompt string for the extra OTP (One Time Password) field.
Added in version 5.0.
- post_auth_hook c.LocalAuthenticator.post_auth_hook = Any(None)#
An optional hook function that you can implement to do some bootstrapping work during authentication. For example, loading user account details from an external system.
This function is called after the user has passed all authentication checks and is ready to successfully authenticate. This function must return the auth_model dict regardless of changes to it. The hook is called with 3 positional arguments: (authenticator, handler, auth_model).
This may be a coroutine.
Example:
import os
import pwd

def my_hook(authenticator, handler, auth_model):
    user_data = pwd.getpwnam(auth_model['name'])
    spawn_data = {
        'pw_data': user_data,
        'gid_list': os.getgrouplist(auth_model['name'], user_data.pw_gid),
    }
    if auth_model['auth_state'] is None:
        auth_model['auth_state'] = {}
    auth_model['auth_state']['spawn_data'] = spawn_data
    return auth_model

c.Authenticator.post_auth_hook = my_hook
- refresh_pre_spawn c.LocalAuthenticator.refresh_pre_spawn = Bool(False)#
Force refresh of auth prior to spawn.
This forces refresh_user() to be called prior to launching a server, to ensure that auth state is up-to-date.
This can be important when e.g. auth tokens that may have expired are passed to the spawner via environment variables from auth_state.
If refresh_user cannot refresh the user auth data, launch will fail until the user logs in again.
- request_otp c.LocalAuthenticator.request_otp = Bool(False)#
Prompt for OTP (One Time Password) in the login form.
Added in version 5.0.
- reset_managed_roles_on_startup c.LocalAuthenticator.reset_managed_roles_on_startup = Bool(False)#
Reset managed roles to result of load_managed_roles() on startup.
- If True:
stale managed roles will be removed,
stale assignments to managed roles will be removed.
Any role not present in load_managed_roles() will be considered 'stale'.
The 'stale' status for role assignments is also determined from the load_managed_roles() result:
user role assignment status will depend on whether the users key is defined or not:
if a list is defined under the users key and the user is not listed, then the user role assignment will be considered 'stale',
if the users key is not provided, the user role assignment will be preserved;
service and group role assignments will be considered 'stale':
if not included in the services and groups list,
if the services and groups keys are not provided.
Added in version 5.0.
- system_user_exists(user)#
Check if the user exists on the system
- uids c.LocalAuthenticator.uids = Dict()#
Dictionary of uids to use at user creation time. This helps ensure that users created from the database get the same uid each time they are created in temporary deployments or containers.
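For example, to pin uids for users the Hub creates (the usernames and uid values here are illustrative):

```python
# jupyterhub_config.py (sketch)
c.LocalAuthenticator.create_system_users = True
c.LocalAuthenticator.uids = {"alice": 1001, "bob": 1002}
```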
- username_map c.LocalAuthenticator.username_map = Dict()#
Dictionary mapping authenticator usernames to JupyterHub users.
Primarily used to normalize OAuth user names to local users.
- username_pattern c.LocalAuthenticator.username_pattern = Unicode('')#
Regular expression pattern that all valid usernames must match.
If a username does not match the pattern specified here, authentication will not be attempted.
If not set, allow any username.
- whitelist c.LocalAuthenticator.whitelist = Set()#
Deprecated, use Authenticator.allowed_users
PAMAuthenticator#
- class jupyterhub.auth.PAMAuthenticator(**kwargs: Any)#
Authenticate local UNIX users with PAM
- add_user_cmd c.PAMAuthenticator.add_user_cmd = Command()#
The command to use for creating users as a list of strings
For each element in the list, the string USERNAME will be replaced with the user’s username. The username will also be appended as the final argument.
For Linux, the default value is:
['adduser', '-q', '--gecos', '""', '--disabled-password']
To specify a custom home directory, set this to:
['adduser', '-q', '--gecos', '""', '--home', '/customhome/USERNAME', '--disabled-password']
This will run the command:
adduser -q --gecos "" --home /customhome/river --disabled-password river
when the user ‘river’ is created.
- admin_groups c.PAMAuthenticator.admin_groups = Set()#
Authoritative list of user groups that determine admin access. Users not in these groups can still be granted admin status through admin_users.
allowed/blocked rules still apply.
Note: As of JupyterHub 2.0, full admin rights should not be required, and more precise permissions can be managed via roles.
- admin_users c.PAMAuthenticator.admin_users = Set()#
Set of users that will have admin rights on this JupyterHub.
Note: As of JupyterHub 2.0, full admin rights should not be required, and more precise permissions can be managed via roles.
- Admin users have extra privileges:
Use the admin panel to see the list of users logged in
Add / remove users in some authenticators
Restart / halt the hub
Start / stop users' single-user servers
Access each individual user's single-user server (if configured)
Admin access should be treated the same way root access is.
Defaults to an empty set, in which case no user has admin access.
- allow_all c.PAMAuthenticator.allow_all = Bool(False)#
Allow every user who can successfully authenticate to access JupyterHub.
False by default, which means for most Authenticators, _some_ allow-related configuration is required to allow users to log in.
Authenticator subclasses may override the default with e.g.:
@default("allow_all")
def _default_allow_all(self):
    # if _any_ auth config (depends on the Authenticator)
    if self.allowed_users or self.allowed_groups or self.allow_existing_users:
        return False
    else:
        return True
Added in version 5.0.
Changed in version 5.0: Prior to 5.0, allow_all wasn't defined on its own, and was instead implicitly True when no allow config was provided, i.e. allowed_users unspecified or empty on the base Authenticator class.
To preserve pre-5.0 behavior, set allow_all = True if you have no other allow configuration.
- allow_existing_users c.PAMAuthenticator.allow_existing_users = Bool(False)#
Allow existing users to login.
Defaults to True if allowed_users is set for historical reasons, and False otherwise.
With this enabled, all users present in the JupyterHub database are allowed to login. This has the effect that any user who has _previously_ been allowed to login via any means will continue to be allowed until the user is deleted via the /hub/admin page or REST API.
Warning
Before enabling this you should review the existing users in the JupyterHub admin panel at /hub/admin. You may find users existing there because they have previously been declared in config such as allowed_users or allowed to sign in.
Warning
When this is enabled and you wish to remove access for one or more users previously allowed, you must make sure that they are removed from the jupyterhub database. This can be tricky to do if you stop allowing an externally managed group of users, for example.
With this enabled, JupyterHub admin users can visit /hub/admin or use JupyterHub's REST API to add and remove users to manage who can login.
Added in version 5.0.
- allowed_groups c.PAMAuthenticator.allowed_groups = Set()#
Allow login from all users in these UNIX groups.
Changed in version 5.0: allowed_groups may be specified together with allowed_users, to grant access by group OR name.
- allowed_users c.PAMAuthenticator.allowed_users = Set()#
Set of usernames that are allowed to log in.
Use this to limit which authenticated users may login. Default behavior: only users in this set are allowed.
If empty, does not perform any restriction, in which case any authenticated user is allowed.
Authenticators may extend Authenticator.check_allowed() to combine allowed_users with other configuration to either expand or restrict access.
Changed in version 1.2: Authenticator.whitelist renamed to allowed_users
- any_allow_config c.PAMAuthenticator.any_allow_config = Bool(False)#
Is there any allow config?
Used to show a warning if it looks like nobody can access the Hub, which can happen when upgrading to JupyterHub 5, now that allow_all defaults to False.
Deployments can set this explicitly to True to suppress the "No allow config found" warning.
Will be True if any config tagged with .tag(allow_config=True) or starting with allow is truthy.
Added in version 5.0.
- auth_refresh_age c.PAMAuthenticator.auth_refresh_age = Int(300)#
The max age (in seconds) of authentication info before forcing a refresh of user auth info.
Refreshing auth info allows, e.g. requesting/re-validating auth tokens.
See refresh_user() for what happens when user auth info is refreshed (nothing by default).
- auto_login c.PAMAuthenticator.auto_login = Bool(False)#
Automatically begin the login process
rather than starting with a "Login with…" link at /hub/login.
To work, .login_url() must give a URL other than the default /hub/login, such as an oauth handler or another automatic login handler, registered with .get_handlers().
Added in version 0.8.
- auto_login_oauth2_authorize c.PAMAuthenticator.auto_login_oauth2_authorize = Bool(False)#
Automatically begin login process for OAuth2 authorization requests
When another application is using JupyterHub as OAuth2 provider, it sends users to /hub/api/oauth2/authorize. If the user isn't logged in already, and auto_login is not set, the user will be dumped on the hub's home page, without any context on what to do next.
Setting this to true will automatically redirect users to login if they aren't logged in, only on the /hub/api/oauth2/authorize endpoint.
Added in version 1.5.
- blocked_users c.PAMAuthenticator.blocked_users = Set()#
Set of usernames that are not allowed to log in.
Use this with supported authenticators to specify which users are not allowed to log in. This is an additional block list that further restricts users, beyond whatever restrictions the authenticator has in place.
If empty, does not perform any additional restriction.
Changed in version 1.2: Authenticator.blacklist renamed to blocked_users
- check_account c.PAMAuthenticator.check_account = Bool(True)#
Whether to check the user’s account status via PAM during authentication.
The PAM account stack performs non-authentication based account management. It is typically used to restrict/permit access to a service and this step is needed to access the host’s user access control.
Disabling this can be dangerous as authenticated but unauthorized users may be granted access and, therefore, arbitrary execution on the system.
- create_system_users c.PAMAuthenticator.create_system_users = Bool(False)#
If set to True, will attempt to create local system users if they do not exist already.
Supports Linux and BSD variants only.
- delete_invalid_users c.PAMAuthenticator.delete_invalid_users = Bool(False)#
Delete any users from the database that do not pass validation
When JupyterHub starts, .add_user will be called on each user in the database to verify that all users are still valid.
If delete_invalid_users is True, any users that do not pass validation will be deleted from the database. Use this if users might be deleted from an external system, such as local user accounts.
If False (default), invalid users remain in the Hub's database and a warning will be issued. This is the default to avoid data loss due to config changes.
- enable_auth_state c.PAMAuthenticator.enable_auth_state = Bool(False)#
Enable persisting auth_state (if available).
auth_state will be encrypted and stored in the Hub’s database. This can include things like authentication tokens, etc. to be passed to Spawners as environment variables.
Encrypting auth_state requires the cryptography package.
Additionally, the JUPYTERHUB_CRYPT_KEY environment variable must contain one (or more, separated by ;) 32B encryption keys. These can be either base64 or hex-encoded.
If encryption is unavailable, auth_state cannot be persisted.
New in JupyterHub 0.8
- encoding c.PAMAuthenticator.encoding = Unicode('utf8')#
The text encoding to use when communicating with PAM
- group_whitelist c.PAMAuthenticator.group_whitelist = Set()#
DEPRECATED: use allowed_groups
- manage_groups c.PAMAuthenticator.manage_groups = Bool(False)#
Let authenticator manage user groups
If True, Authenticator.authenticate and/or .refresh_user may return a list of group names in the ‘groups’ field, which will be assigned to the user.
All group-assignment APIs are disabled if this is True.
- manage_roles c.PAMAuthenticator.manage_roles = Bool(False)#
Let authenticator manage roles
If True, Authenticator.authenticate and/or .refresh_user may return a list of roles in the ‘roles’ field, which will be added to the database.
When enabled, all role management will be handled by the authenticator; in particular, assignment of roles via the JupyterHub.load_roles traitlet will not be possible.
Added in version 5.0.
- open_sessions c.PAMAuthenticator.open_sessions = Bool(False)#
Whether to open a new PAM session when spawners are started.
This may trigger things like mounting shared filesystems, loading credentials, etc. depending on system configuration.
The lifecycle of PAM sessions is not correct, so many PAM session configurations will not work.
If any errors are encountered when opening/closing PAM sessions, this is automatically set to False.
Changed in version 2.2: Due to longstanding problems in the session lifecycle, this is now disabled by default. You may opt-in to opening sessions by setting this to True.
- otp_prompt c.PAMAuthenticator.otp_prompt = Any('OTP:')#
The prompt string for the extra OTP (One Time Password) field.
Added in version 5.0.
- pam_normalize_username c.PAMAuthenticator.pam_normalize_username = Bool(False)#
Round-trip the username via PAM lookups to make sure it is unique
PAM can accept multiple usernames that map to the same user, for example DOMAIN\username in some cases. To prevent this, convert username into uid, then back to username, to normalize.
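The round-trip described above can be sketched with the standard pwd module (a simplified illustration; the real implementation normalizes via PAM lookups, and normalize_via_system is a hypothetical helper name):

```python
import pwd

def normalize_via_system(username):
    """Name -> uid -> name round-trip, so aliases for the same
    account collapse to one canonical username."""
    uid = pwd.getpwnam(username).pw_uid
    return pwd.getpwuid(uid).pw_name
```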
- post_auth_hook c.PAMAuthenticator.post_auth_hook = Any(None)#
An optional hook function that you can implement to do some bootstrapping work during authentication. For example, loading user account details from an external system.
This function is called after the user has passed all authentication checks and is ready to successfully authenticate. This function must return the auth_model dict regardless of changes to it. The hook is called with 3 positional arguments: (authenticator, handler, auth_model).
This may be a coroutine.
Example:
import os
import pwd

def my_hook(authenticator, handler, auth_model):
    user_data = pwd.getpwnam(auth_model['name'])
    spawn_data = {
        'pw_data': user_data,
        'gid_list': os.getgrouplist(auth_model['name'], user_data.pw_gid),
    }
    if auth_model['auth_state'] is None:
        auth_model['auth_state'] = {}
    auth_model['auth_state']['spawn_data'] = spawn_data
    return auth_model

c.Authenticator.post_auth_hook = my_hook
- refresh_pre_spawn c.PAMAuthenticator.refresh_pre_spawn = Bool(False)#
Force refresh of auth prior to spawn.
This forces refresh_user() to be called prior to launching a server, to ensure that auth state is up-to-date.
This can be important when e.g. auth tokens that may have expired are passed to the spawner via environment variables from auth_state.
If refresh_user cannot refresh the user auth data, launch will fail until the user logs in again.
- request_otp c.PAMAuthenticator.request_otp = Bool(False)#
Prompt for OTP (One Time Password) in the login form.
Added in version 5.0.
- reset_managed_roles_on_startup c.PAMAuthenticator.reset_managed_roles_on_startup = Bool(False)#
Reset managed roles to the result of load_managed_roles() on startup.
If True:
stale managed roles will be removed,
stale assignments to managed roles will be removed.
Any role not present in load_managed_roles() will be considered ‘stale’.
The ‘stale’ status for role assignments is also determined from the load_managed_roles() result:
user role assignments depend on whether the users key is defined or not:
if a list is defined under the users key and the user is not listed, the user role assignment will be considered ‘stale’;
if the users key is not provided, the user role assignment will be preserved.
service and group role assignments will be considered ‘stale’:
if not included in the services and groups lists,
or if the services and groups keys are not provided.
Added in version 5.0.
- service c.PAMAuthenticator.service = Unicode('login')#
The name of the PAM service to use for authentication
- uids c.PAMAuthenticator.uids = Dict()#
Dictionary of uids to use at user creation time. This helps ensure that users created from the database get the same uid each time they are created in temporary deployments or containers.
- username_map c.PAMAuthenticator.username_map = Dict()#
Dictionary mapping authenticator usernames to JupyterHub users.
Primarily used to normalize OAuth user names to local users.
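A minimal jupyterhub_config.py sketch of this mapping; the external identities and local usernames below are hypothetical:

```python
# jupyterhub_config.py -- illustrative values only
c.Authenticator.username_map = {
    "alice@example.com": "alice",  # map an OAuth identity to a local user
    "DOMAIN\\bob": "bob",          # strip a domain prefix
}
```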
- username_pattern c.PAMAuthenticator.username_pattern = Unicode('')#
Regular expression pattern that all valid usernames must match.
If a username does not match the pattern specified here, authentication will not be attempted.
If not set, allow any username.
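A sketch of the kind of check this enables; the pattern below is an illustrative choice, not a JupyterHub default:

```python
import re

# Illustrative pattern: lowercase alphanumeric usernames, 3-16 characters.
username_pattern = r"^[a-z][a-z0-9]{2,15}$"

def matches(name: str) -> bool:
    """Mimic the check: authentication is only attempted for matching names."""
    return re.fullmatch(username_pattern, name) is not None

print(matches("alice"))  # True
print(matches("Bob!"))   # False
```

In config, this would be set as `c.Authenticator.username_pattern = r"^[a-z][a-z0-9]{2,15}$"`.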
- whitelist c.PAMAuthenticator.whitelist = Set()#
Deprecated, use Authenticator.allowed_users
DummyAuthenticator
#
- class jupyterhub.auth.DummyAuthenticator(**kwargs: Any)#
Dummy Authenticator for testing
By default, any username + password is allowed. If a non-empty password is set, any username will be allowed if it logs in with that password.
Added in version 1.0.
Added in version 5.0: allow_all defaults to True, preserving default behavior.
- admin_users c.DummyAuthenticator.admin_users = Set()#
Set of users that will have admin rights on this JupyterHub.
Note: As of JupyterHub 2.0, full admin rights should not be required, and more precise permissions can be managed via roles.
- Admin users have extra privileges:
Use the admin panel to see the list of users logged in
Add / remove users in some authenticators
Restart / halt the hub
Start / stop users’ single-user servers
Access each individual user’s single-user server (if configured)
Admin access should be treated the same way root access is.
Defaults to an empty set, in which case no user has admin access.
- allow_all c.DummyAuthenticator.allow_all = Bool(False)#
Allow every user who can successfully authenticate to access JupyterHub.
False by default, which means for most Authenticators, _some_ allow-related configuration is required to allow users to log in.
Authenticator subclasses may override the default with e.g.:
@default("allow_all")
def _default_allow_all(self):
    # if _any_ auth config (depends on the Authenticator)
    if self.allowed_users or self.allowed_groups or self.allow_existing_users:
        return False
    else:
        return True
Added in version 5.0.
Changed in version 5.0: Prior to 5.0, allow_all wasn’t defined on its own, and was instead implicitly True when no allow config was provided, i.e. allowed_users unspecified or empty on the base Authenticator class.
To preserve pre-5.0 behavior, set allow_all = True if you have no other allow configuration.
- allow_existing_users c.DummyAuthenticator.allow_existing_users = Bool(False)#
Allow existing users to login.
Defaults to True if allowed_users is set, for historical reasons, and False otherwise.
With this enabled, all users present in the JupyterHub database are allowed to login. This has the effect that any user who has _previously_ been allowed to login via any means will continue to be allowed until the user is deleted via the /hub/admin page or REST API.
Warning
Before enabling this you should review the existing users in the JupyterHub admin panel at /hub/admin. You may find users existing there because they have previously been declared in config such as allowed_users or allowed to sign in.
Warning
When this is enabled and you wish to remove access for one or more users previously allowed, you must make sure that they are removed from the jupyterhub database. This can be tricky to do if you stop allowing an externally managed group of users, for example.
With this enabled, JupyterHub admin users can visit /hub/admin or use JupyterHub’s REST API to add and remove users to manage who can login.
Added in version 5.0.
- allowed_users c.DummyAuthenticator.allowed_users = Set()#
Set of usernames that are allowed to log in.
Use this to limit which authenticated users may login. Default behavior: only users in this set are allowed.
If empty, does not perform any restriction, in which case any authenticated user is allowed.
Authenticators may extend Authenticator.check_allowed() to combine allowed_users with other configuration to either expand or restrict access.
Changed in version 1.2: Authenticator.whitelist renamed to allowed_users
- any_allow_config c.DummyAuthenticator.any_allow_config = Bool(False)#
Is there any allow config?
Used to show a warning if it looks like nobody can access the Hub, which can happen when upgrading to JupyterHub 5, now that allow_all defaults to False.
Deployments can set this explicitly to True to suppress the “No allow config found” warning.
Will be True if any config tagged with .tag(allow_config=True) or starting with allow is truthy.
Added in version 5.0.
- auth_refresh_age c.DummyAuthenticator.auth_refresh_age = Int(300)#
The max age (in seconds) of authentication info before forcing a refresh of user auth info.
Refreshing auth info allows, e.g., requesting or re-validating auth tokens.
See refresh_user() for what happens when user auth info is refreshed (nothing by default).
- auto_login c.DummyAuthenticator.auto_login = Bool(False)#
Automatically begin the login process rather than starting with a “Login with…” link at /hub/login.
To work, .login_url() must give a URL other than the default /hub/login, such as an oauth handler or another automatic login handler, registered with .get_handlers().
Added in version 0.8.
- auto_login_oauth2_authorize c.DummyAuthenticator.auto_login_oauth2_authorize = Bool(False)#
Automatically begin login process for OAuth2 authorization requests
When another application is using JupyterHub as an OAuth2 provider, it sends users to /hub/api/oauth2/authorize. If the user isn’t logged in already, and auto_login is not set, the user will be dumped on the hub’s home page, without any context on what to do next.
Setting this to true will automatically redirect users to login if they aren’t logged in, but only on the /hub/api/oauth2/authorize endpoint.
Added in version 1.5.
- blocked_users c.DummyAuthenticator.blocked_users = Set()#
Set of usernames that are not allowed to log in.
Use this with supported authenticators to block specific users from logging in. This is an additional block list that further restricts users, beyond whatever restrictions the authenticator has in place.
If empty, does not perform any additional restriction.
Changed in version 1.2: Authenticator.blacklist renamed to blocked_users
- delete_invalid_users c.DummyAuthenticator.delete_invalid_users = Bool(False)#
Delete any users from the database that do not pass validation
When JupyterHub starts, .add_user will be called on each user in the database to verify that all users are still valid.
If delete_invalid_users is True, any users that do not pass validation will be deleted from the database. Use this if users might be deleted from an external system, such as local user accounts.
If False (default), invalid users remain in the Hub’s database and a warning will be issued. This is the default to avoid data loss due to config changes.
- enable_auth_state c.DummyAuthenticator.enable_auth_state = Bool(False)#
Enable persisting auth_state (if available).
auth_state will be encrypted and stored in the Hub’s database. This can include things like authentication tokens, etc. to be passed to Spawners as environment variables.
Encrypting auth_state requires the cryptography package.
Additionally, the JUPYTERHUB_CRYPT_KEY environment variable must contain one (or more, separated by ;) 32B encryption keys. These can be either base64 or hex-encoded.
If encryption is unavailable, auth_state cannot be persisted.
New in JupyterHub 0.8
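JUPYTERHUB_CRYPT_KEY expects 32-byte keys; a sketch of generating one with the standard library (any source of 32 random bytes, hex- or base64-encoded, works):

```python
import secrets

# Generate one 32-byte key, hex-encoded, suitable for JUPYTERHUB_CRYPT_KEY.
key = secrets.token_hex(32)  # 32 bytes -> 64 hex characters
print(key)
```

Export the printed value as the JUPYTERHUB_CRYPT_KEY environment variable before starting the Hub.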
- manage_groups c.DummyAuthenticator.manage_groups = Bool(False)#
Let authenticator manage user groups
If True, Authenticator.authenticate and/or .refresh_user may return a list of group names in the ‘groups’ field, which will be assigned to the user.
All group-assignment APIs are disabled if this is True.
- manage_roles c.DummyAuthenticator.manage_roles = Bool(False)#
Let authenticator manage roles
If True, Authenticator.authenticate and/or .refresh_user may return a list of roles in the ‘roles’ field, which will be added to the database.
When enabled, all role management will be handled by the authenticator; in particular, assignment of roles via the JupyterHub.load_roles traitlet will not be possible.
Added in version 5.0.
- otp_prompt c.DummyAuthenticator.otp_prompt = Any('OTP:')#
The prompt string for the extra OTP (One Time Password) field.
Added in version 5.0.
- password c.DummyAuthenticator.password = Unicode('')#
Set a global password for all users wanting to log in.
This allows users with any username to log in with the same static password.
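A minimal jupyterhub_config.py sketch for a testing deployment; the password value is hypothetical, and DummyAuthenticator should never be used in production:

```python
# jupyterhub_config.py -- testing only
c.JupyterHub.authenticator_class = "dummy"
c.DummyAuthenticator.password = "a-shared-test-password"  # illustrative value
```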
- post_auth_hook c.DummyAuthenticator.post_auth_hook = Any(None)#
An optional hook function that you can implement to do some bootstrapping work during authentication. For example, loading user account details from an external system.
This function is called after the user has passed all authentication checks and is ready to successfully authenticate. This function must return the auth_model dict regardless of changes to it. The hook is called with 3 positional arguments: (authenticator, handler, auth_model).
This may be a coroutine.
Example:
import os
import pwd

def my_hook(authenticator, handler, auth_model):
    user_data = pwd.getpwnam(auth_model['name'])
    spawn_data = {
        'pw_data': user_data,
        'gid_list': os.getgrouplist(auth_model['name'], user_data.pw_gid),
    }
    if auth_model['auth_state'] is None:
        auth_model['auth_state'] = {}
    auth_model['auth_state']['spawn_data'] = spawn_data
    return auth_model

c.Authenticator.post_auth_hook = my_hook
- refresh_pre_spawn c.DummyAuthenticator.refresh_pre_spawn = Bool(False)#
Force refresh of auth prior to spawn.
This forces refresh_user() to be called prior to launching a server, to ensure that auth state is up-to-date.
This can be important when e.g. auth tokens that may have expired are passed to the spawner via environment variables from auth_state.
If refresh_user cannot refresh the user auth data, launch will fail until the user logs in again.
- request_otp c.DummyAuthenticator.request_otp = Bool(False)#
Prompt for OTP (One Time Password) in the login form.
Added in version 5.0.
- reset_managed_roles_on_startup c.DummyAuthenticator.reset_managed_roles_on_startup = Bool(False)#
Reset managed roles to the result of load_managed_roles() on startup.
If True:
stale managed roles will be removed,
stale assignments to managed roles will be removed.
Any role not present in load_managed_roles() will be considered ‘stale’.
The ‘stale’ status for role assignments is also determined from the load_managed_roles() result:
user role assignments depend on whether the users key is defined or not:
if a list is defined under the users key and the user is not listed, the user role assignment will be considered ‘stale’;
if the users key is not provided, the user role assignment will be preserved.
service and group role assignments will be considered ‘stale’:
if not included in the services and groups lists,
or if the services and groups keys are not provided.
Added in version 5.0.
- username_map c.DummyAuthenticator.username_map = Dict()#
Dictionary mapping authenticator usernames to JupyterHub users.
Primarily used to normalize OAuth user names to local users.
- username_pattern c.DummyAuthenticator.username_pattern = Unicode('')#
Regular expression pattern that all valid usernames must match.
If a username does not match the pattern specified here, authentication will not be attempted.
If not set, allow any username.
- whitelist c.DummyAuthenticator.whitelist = Set()#
Deprecated, use Authenticator.allowed_users
Spawners#
Module: jupyterhub.spawner
#
Contains base Spawner class & default implementation
Spawner
#
- class jupyterhub.spawner.Spawner(**kwargs: Any)#
Base class for spawning single-user notebook servers.
Subclass this, and override the following methods:
load_state
get_state
start
stop
poll
As JupyterHub supports multiple users, an instance of the Spawner subclass is created for each user. If there are 20 JupyterHub users, there will be 20 instances of the subclass.
- args c.Spawner.args = List()#
Extra arguments to be passed to the single-user server.
Some spawners allow shell-style expansion here, allowing you to use environment variables here. Most, including the default, do not. Consult the documentation for your spawner to verify!
- auth_state_hook c.Spawner.auth_state_hook = Any(None)#
An optional hook function that you can implement to pass auth_state to the spawner after it has been initialized but before it starts. The auth_state dictionary may be set by the .authenticate() method of the authenticator. This hook enables you to pass some or all of that information to your spawner.
Example:
def userdata_hook(spawner, auth_state):
    spawner.userdata = auth_state["userdata"]

c.Spawner.auth_state_hook = userdata_hook
- cmd c.Spawner.cmd = Command()#
The command used for starting the single-user server.
Provide either a string or a list containing the path to the startup script command. Extra arguments, other than this path, should be provided via args.
This is usually set if you want to start the single-user server in a different python environment (with virtualenv/conda) than JupyterHub itself.
Some spawners allow shell-style expansion here, allowing you to use environment variables. Most, including the default, do not. Consult the documentation for your spawner to verify!
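A sketch of the usual cmd/args split in jupyterhub_config.py; the virtualenv path is hypothetical, and whether --debug is accepted depends on the single-user server:

```python
# jupyterhub_config.py -- path below is an illustrative virtualenv location
c.Spawner.cmd = ["/opt/envs/user-env/bin/jupyterhub-singleuser"]
c.Spawner.args = ["--debug"]  # extra arguments go in args, not in cmd
```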
- consecutive_failure_limit c.Spawner.consecutive_failure_limit = Int(0)#
Maximum number of consecutive failures to allow before shutting down JupyterHub.
This helps JupyterHub recover from a certain class of problem preventing launch in contexts where the Hub is automatically restarted (e.g. systemd, docker, kubernetes).
A limit of 0 means no limit and consecutive failures will not be tracked.
- cpu_guarantee c.Spawner.cpu_guarantee = Float(None)#
Minimum number of cpu-cores a single-user notebook server is guaranteed to have available.
If this value is set to 0.5, allows use of 50% of one CPU. If this value is set to 2, allows use of up to 2 CPUs.
This is a configuration setting. Your spawner must implement support for the limit to work. The default spawner, LocalProcessSpawner, does not implement this support. A custom spawner must add support for this setting for it to be enforced.
- cpu_limit c.Spawner.cpu_limit = Float(None)#
Maximum number of cpu-cores a single-user notebook server is allowed to use.
If this value is set to 0.5, allows use of 50% of one CPU. If this value is set to 2, allows use of up to 2 CPUs.
The single-user notebook server will never be scheduled by the kernel to use more cpu-cores than this. There is no guarantee that it can access this many cpu-cores.
This is a configuration setting. Your spawner must implement support for the limit to work. The default spawner, LocalProcessSpawner, does not implement this support. A custom spawner must add support for this setting for it to be enforced.
- async create_certs()#
Create and set ownership for the certs to be used for internal ssl
- Keyword Arguments:
alt_names (list) – a list of alternative names to identify the server by; see https://en.wikipedia.org/wiki/Subject_Alternative_Name
override – override the default_names with the provided alt_names
- Returns:
Path to cert files and CA
- Return type:
dict
This method creates certs for use with the singleuser notebook. It enables SSL and ensures that the notebook can perform bi-directional SSL auth with the hub (verification based on CA).
If the singleuser host has a name or ip other than localhost, an appropriate alternative name(s) must be passed for ssl verification by the hub to work. For example, for Jupyter hosts with an IP of 10.10.10.10 or DNS name of jupyter.example.com, this would be:
alt_names=[“IP:10.10.10.10”]
alt_names=[“DNS:jupyter.example.com”]
respectively. The list can contain both the IP and DNS names to refer to the host by either IP or DNS name (note the default_names below).
- debug c.Spawner.debug = Bool(False)#
Enable debug-logging of the single-user server
- default_url c.Spawner.default_url = Unicode('')#
The URL the single-user server should start in.
{username} will be expanded to the user’s username.
Example uses:
You can set notebook_dir to / and default_url to /tree/home/{username} to allow people to navigate the whole filesystem from their notebook server, but still start in their home directory.
Start with /notebooks instead of /tree if default_url points to a notebook instead of a directory.
You can set this to /lab to have JupyterLab start by default, rather than Jupyter Notebook.
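The example uses above, sketched as jupyterhub_config.py fragments (the /tree/home layout assumes per-user home directories under /home):

```python
# jupyterhub_config.py
c.Spawner.default_url = "/lab"  # start in JupyterLab by default

# Or, to browse the whole filesystem but start in the user's home directory:
# c.Spawner.notebook_dir = "/"
# c.Spawner.default_url = "/tree/home/{username}"
```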
- disable_user_config c.Spawner.disable_user_config = Bool(False)#
Disable per-user configuration of single-user servers.
When starting the user’s single-user server, any config file found in the user’s $HOME directory will be ignored.
Note: a user could circumvent this if the user modifies their Python environment, such as when they have their own conda environments / virtualenvs / containers.
- env_keep c.Spawner.env_keep = List()#
List of environment variables for the single-user server to inherit from the JupyterHub process.
This list is used to ensure that sensitive information in the JupyterHub process’s environment (such as CONFIGPROXY_AUTH_TOKEN) is not passed to the single-user server’s process.
- environment c.Spawner.environment = Dict()#
Extra environment variables to set for the single-user server’s process.
- Environment variables that end up in the single-user server’s process come from 3 sources:
This environment configurable
The JupyterHub process’s environment variables that are listed in env_keep
Variables to establish contact between the single-user notebook and the hub (such as JUPYTERHUB_API_TOKEN)
The environment configurable should be set by JupyterHub administrators to add installation-specific environment variables. It is a dict where the key is the name of the environment variable, and the value can be a string or a callable. If it is a callable, it will be called with one parameter (the spawner instance), and should return a string fairly quickly (no blocking operations please!).
Note that the spawner class’s interface is not guaranteed to be exactly the same across upgrades, so if you are using the callable take care to verify it continues to work after upgrades!
Changed in version 1.2: environment from this configuration has highest priority, allowing override of ‘default’ env variables, such as JUPYTERHUB_API_URL.
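A sketch of both value forms in jupyterhub_config.py; the variable names and the /scratch path are illustrative:

```python
# jupyterhub_config.py -- names and paths are illustrative
def scratch_dir(spawner):
    # Called with the spawner instance at spawn time; must return a string quickly.
    return f"/scratch/{spawner.user.name}"

c.Spawner.environment = {
    "MY_ORG": "example",         # plain string value
    "SCRATCH_DIR": scratch_dir,  # callable, evaluated per spawn
}
```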
- format_string(s)#
Render a Python format string
Uses Spawner.template_namespace() to populate the format namespace.
- get_args()#
Return the arguments to be passed after self.cmd
Doesn’t expect shell expansion to happen.
Changed in version 2.0: Prior to 2.0, JupyterHub passed some options such as ip, port, and default_url to the command-line. JupyterHub 2.0 no longer builds any CLI args other than Spawner.cmd and Spawner.args. All values that come from jupyterhub itself will be passed via environment variables.
- get_env()#
Return the environment dict to use for the Spawner.
This applies things like env_keep, anything defined in Spawner.environment, and adds the API token to the env.
When overriding in subclasses, subclasses must call super().get_env(), extend the returned dict, and return it.
Use this to access the env in Spawner.start to allow extension in subclasses.
- get_state()#
Save state of spawner into database.
A black box of extra state for custom spawners. The returned value of this is passed to load_state.
Subclasses should call super().get_state(), augment the state returned from there, and return that state.
- Returns:
state – a JSONable dict of state
- Return type:
dict
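A sketch of the get_state / load_state round trip with stand-in classes (the real Spawner base is not imported; the pid field is an illustrative piece of spawner state):

```python
class BaseSpawner:
    """Stand-in for jupyterhub.spawner.Spawner, for illustration only."""

    def get_state(self):
        return {}

    def load_state(self, state):
        pass

class MySpawner(BaseSpawner):
    pid = 0  # illustrative custom state

    def get_state(self):
        # Augment the superclass state with our own JSONable fields.
        state = super().get_state()
        if self.pid:
            state["pid"] = self.pid
        return state

    def load_state(self, state):
        # Mirror image of get_state: restore our fields from the dict.
        super().load_state(state)
        self.pid = state.get("pid", 0)

s = MySpawner()
s.pid = 1234
restored = MySpawner()
restored.load_state(s.get_state())
print(restored.pid)  # 1234
```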
- http_timeout c.Spawner.http_timeout = Int(30)#
Timeout (in seconds) before giving up on a spawned HTTP server
Once a server has successfully been spawned, this is the amount of time we wait before assuming that the server is unable to accept connections.
- hub_connect_url c.Spawner.hub_connect_url = Unicode(None)#
The URL the single-user server should use to connect to the Hub.
If the Hub URL set in your JupyterHub config is not reachable from spawned notebooks, you can set a different URL with this config.
Is None if you don’t need to change the URL.
- ip c.Spawner.ip = Unicode('127.0.0.1')#
The IP address (or hostname) the single-user server should listen on.
Usually either ‘127.0.0.1’ (default) or ‘0.0.0.0’.
The JupyterHub proxy implementation should be able to send packets to this interface.
Subclasses which launch remotely or in containers should override the default to ‘0.0.0.0’.
Changed in version 2.0: Default changed to ‘127.0.0.1’, from ‘’. In most cases, this does not result in a change in behavior, as ‘’ was interpreted as ‘unspecified’, which used the subprocesses’ own default, itself usually ‘127.0.0.1’.
- mem_guarantee c.Spawner.mem_guarantee = ByteSpecification(None)#
Minimum number of bytes a single-user notebook server is guaranteed to have available.
- Allows the following suffixes:
K -> Kilobytes
M -> Megabytes
G -> Gigabytes
T -> Terabytes
This is a configuration setting. Your spawner must implement support for the limit to work. The default spawner, LocalProcessSpawner, does not implement this support. A custom spawner must add support for this setting for it to be enforced.
- mem_limit c.Spawner.mem_limit = ByteSpecification(None)#
Maximum number of bytes a single-user notebook server is allowed to use.
- Allows the following suffixes:
K -> Kilobytes
M -> Megabytes
G -> Gigabytes
T -> Terabytes
If the single user server tries to allocate more memory than this, it will fail. There is no guarantee that the single-user notebook server will be able to allocate this much memory - only that it can not allocate more than this.
This is a configuration setting. Your spawner must implement support for the limit to work. The default spawner, LocalProcessSpawner, does not implement this support. A custom spawner must add support for this setting for it to be enforced.
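The resource traits above, sketched together in jupyterhub_config.py; the values are illustrative and only take effect with a spawner that implements them:

```python
# jupyterhub_config.py -- only enforced by spawners that implement support
# (not LocalProcessSpawner); values are illustrative
c.Spawner.cpu_limit = 2          # up to 2 CPUs
c.Spawner.cpu_guarantee = 0.5    # half of one CPU guaranteed
c.Spawner.mem_limit = "1G"       # ByteSpecification suffixes: K, M, G, T
c.Spawner.mem_guarantee = "512M"
```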
- async move_certs(paths)#
Takes certificate paths and makes them available to the notebook server
- Parameters:
paths (dict) – a list of paths for key, cert, and CA. These paths will be resolvable and readable by the Hub process, but not necessarily by the notebook server.
- Returns:
a list (potentially altered) of paths for key, cert, and CA. These paths should be resolvable and readable by the notebook server to be launched.
- Return type:
dict
.move_certs is called after certs for the singleuser notebook have been created by create_certs.
By default, certs are created in a standard, central location defined by internal_certs_location. For a local, single-host deployment of JupyterHub, this should suffice. If, however, singleuser notebooks are spawned on other hosts, .move_certs should be overridden to move these files appropriately. This could mean using scp to copy them to another host, moving them to a volume mounted in a docker container, or exporting them as a secret in kubernetes.
- notebook_dir c.Spawner.notebook_dir = Unicode('')#
Path to the notebook directory for the single-user server.
The user sees a file listing of this directory when the notebook interface is started. The current interface does not easily allow browsing beyond the subdirectories in this directory’s tree.
~ will be expanded to the home directory of the user, and {username} will be replaced with the name of the user.
Note that this does not prevent users from accessing files outside of this path! They can do so with many other means.
- oauth_client_allowed_scopes c.Spawner.oauth_client_allowed_scopes = Union()#
Allowed scopes for oauth tokens issued by this server’s oauth client.
This sets the maximum and default scopes assigned to oauth tokens issued by a single-user server’s oauth client (i.e. tokens stored in browsers after authenticating with the server), defining what actions the server can take on behalf of logged-in users.
Default is an empty list, meaning minimal permissions to identify users, no actions can be taken on their behalf.
If callable, will be called with the Spawner as a single argument. Callables may be async.
- oauth_roles c.Spawner.oauth_roles = Union()#
Allowed roles for oauth tokens.
Deprecated in 3.0: use oauth_client_allowed_scopes
This sets the maximum and default roles assigned to oauth tokens issued by a single-user server’s oauth client (i.e. tokens stored in browsers after authenticating with the server), defining what actions the server can take on behalf of logged-in users.
Default is an empty list, meaning minimal permissions to identify users, no actions can be taken on their behalf.
- options_form c.Spawner.options_form = Union()#
An HTML form for options a user can specify on launching their server.
The surrounding <form> element and the submit button are already provided.
For example:
Set your key:
<input name="key" val="default_key"></input>
<br>
Choose a letter:
<select name="letter" multiple="true">
  <option value="A">The letter A</option>
  <option value="B">The letter B</option>
</select>
The data from this form submission will be passed on to your spawner in self.user_options.
Instead of a form snippet string, this could also be a callable that takes as one parameter the current spawner instance and returns a string. The callable will be called asynchronously if it returns a future, rather than a str. Note that the interface of the spawner class is not deemed stable across versions, so using this functionality might cause your JupyterHub upgrades to break.
- options_from_form c.Spawner.options_from_form = Callable()#
Interpret HTTP form data
Form data will always arrive as a dict of lists of strings. Override this function to understand single-values, numbers, etc.
This should coerce form data into the structure expected by self.user_options, which must be a dict, and should be JSON-serializeable, though it can contain bytes in addition to standard JSON data types.
This method should not have any side effects. Any handling of user_options should be done in .start() to ensure consistent behavior across servers spawned via the API and form submission page.
Instances will receive this data on self.user_options, after passing through this function, prior to Spawner.start.
Changed in version 1.0: user_options are persisted in the JupyterHub database to be reused on subsequent spawns if no options are given. user_options is serialized to JSON as part of this persistence (with additional support for bytes in case of uploaded file data), and any non-bytes non-jsonable values will be replaced with None if the user_options are re-used.
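A sketch of such a coercion function, paired with the form snippet shown under options_form (the field names key and letter come from that snippet; the output structure is an illustrative choice):

```python
def options_from_form(formdata):
    """Coerce dict-of-lists form data into a JSON-serializable user_options dict."""
    options = {}
    options["key"] = formdata["key"][0]              # single text input
    options["letters"] = formdata.get("letter", [])  # multi-select: keep the list
    return options

print(options_from_form({"key": ["default_key"], "letter": ["A", "B"]}))
# {'key': 'default_key', 'letters': ['A', 'B']}
```

In config this would be wired up as `c.Spawner.options_from_form = options_from_form`.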
- async poll()#
Check if the single-user process is running
- Returns:
None if single-user process is running. Integer exit status (0 if unknown), if it is not running.
State transitions, behavior, and return response:
If the Spawner has not been initialized (neither loaded state, nor called start), it should behave as if it is not running (status=0).
If the Spawner has not finished starting, it should behave as if it is running (status=None).
Design assumptions about when poll may be called:
On Hub launch: poll may be called before start when state is loaded on Hub launch. poll should return exit status 0 (unknown) if the Spawner has not been initialized via load_state or start.
If .start() is async: poll may be called during any yielded portions of the start process. poll should return None while start is yielded, indicating that the start process has not yet completed.
- poll_interval c.Spawner.poll_interval = Int(30)#
Interval (in seconds) on which to poll the spawner for single-user server’s status.
At every poll interval, each spawner’s .poll method is called, which checks if the single-user server is still running. If it isn’t running, then JupyterHub modifies its own state accordingly and removes appropriate routes from the configurable proxy.
- poll_jitter c.Spawner.poll_jitter = Float(0.1)#
Jitter fraction for poll_interval.
Avoids alignment of poll calls for many Spawners, e.g. when restarting JupyterHub, which restarts all polls for running Spawners.
poll_jitter=0 means no jitter, 0.1 means 10%, etc.
- port c.Spawner.port = Int(0)#
The port for single-user servers to listen on.
Defaults to 0, which uses a randomly allocated port number each time.
If set to a non-zero value, all Spawners will use the same port, which only makes sense if each server is on a different address, e.g. in containers.
New in version 0.7.
- post_stop_hook c.Spawner.post_stop_hook = Any(None)#
An optional hook function that you can implement to do work after the spawner stops.
This can be set independent of any concrete spawner implementation.
- pre_spawn_hook c.Spawner.pre_spawn_hook = Any(None)#
An optional hook function that you can implement to do some bootstrapping work before the spawner starts. For example, create a directory for your user or load initial content.
This can be set independent of any concrete spawner implementation.
This may be a coroutine.
Example:
def my_hook(spawner):
    username = spawner.user.name
    spawner.environment["GREETING"] = f"Hello {username}"

c.Spawner.pre_spawn_hook = my_hook
- progress_ready_hook c.Spawner.progress_ready_hook = Any(None)#
An optional hook function that you can implement to modify the ready event, which will be shown to the user on the spawn progress page when their server is ready.
This can be set independent of any concrete spawner implementation.
This may be a coroutine.
Example:
async def my_ready_hook(spawner, ready_event):
    ready_event["html_message"] = f"Server {spawner.name} is ready for {spawner.user.name}"
    return ready_event

c.Spawner.progress_ready_hook = my_ready_hook
- server_token_scopes c.Spawner.server_token_scopes = Union()#
The list of scopes to request for $JUPYTERHUB_API_TOKEN
If not specified, the scopes in the server role will be used (unchanged from pre-4.0).
If callable, will be called with the Spawner instance as its sole argument (the JupyterHub user is available as spawner.user).
JUPYTERHUB_API_TOKEN will be assigned the _subset_ of these scopes that are held by the user (as in oauth_client_allowed_scopes).
Added in version 4.0.
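As a sketch, the value can be a static list or a callable evaluated per spawn. The scope names below are the ones granted by the default server role; treat the exact list as an assumption to verify against your JupyterHub version:

```python
# jupyterhub_config.py -- illustrative sketch; verify scope names for your version
# Static list applied to every server's $JUPYTERHUB_API_TOKEN:
c.Spawner.server_token_scopes = [
    "users:activity!user",  # let the server report its own activity
    "access:servers!user",  # let the server access its own user's servers
]

# Or a callable, receiving the Spawner instance:
def server_token_scopes(spawner):
    return [f"users:activity!user={spawner.user.name}"]

# c.Spawner.server_token_scopes = server_token_scopes
```

Remember that the token only receives the subset of these scopes that the owning user actually holds.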
- ssl_alt_names c.Spawner.ssl_alt_names = List()#
List of SSL alt names
May be set in config if all spawners should have the same value(s), or set at runtime by Spawners that know their names.
- ssl_alt_names_include_local c.Spawner.ssl_alt_names_include_local = Bool(True)#
Whether to include DNS:localhost, IP:127.0.0.1 in alt names
- async start()#
Start the single-user server
Changed in version 0.7: Return ip, port instead of setting on self.user.server directly.
- start_timeout c.Spawner.start_timeout = Int(60)#
Timeout (in seconds) before giving up on starting of single-user server.
This is the timeout for start to return, not the timeout for the server to respond. Callers of spawner.start will assume that startup has failed if it takes longer than this. start should return when the server process is started and its location is known.
- async stop(now=False)#
Stop the single-user server
If now is False (default), shut down the server as gracefully as possible, e.g. starting with SIGINT, then SIGTERM, then SIGKILL.
If now is True, terminate the server immediately.
The coroutine should return when the single-user server process is no longer running.
Must be a coroutine.
- template_namespace()#
Return the template namespace for format-string formatting.
Currently used on default_url and notebook_dir.
Subclasses may add items to the available namespace.
The default implementation includes:
{
    'username': user.name,
    'base_url': users_base_url,
}
- Returns:
namespace for string formatting.
- Return type:
ns (dict)
LocalProcessSpawner
#
- class jupyterhub.spawner.LocalProcessSpawner(**kwargs: Any)#
A Spawner that uses subprocess.Popen to start single-user servers as local processes.
Requires local UNIX users matching the authenticated users to exist. Does not work on Windows.
This is the default spawner for JupyterHub.
Note: This spawner does not implement CPU / memory guarantees and limits.
- args c.LocalProcessSpawner.args = List()#
Extra arguments to be passed to the single-user server.
Some spawners allow shell-style expansion here, allowing you to use environment variables. Most, including the default, do not. Consult the documentation for your spawner to verify!
- auth_state_hook c.LocalProcessSpawner.auth_state_hook = Any(None)#
An optional hook function that you can implement to pass auth_state to the spawner after it has been initialized but before it starts. The auth_state dictionary may be set by the .authenticate() method of the authenticator. This hook enables you to pass some or all of that information to your spawner.
Example:
def userdata_hook(spawner, auth_state):
    spawner.userdata = auth_state["userdata"]

c.Spawner.auth_state_hook = userdata_hook
- cmd c.LocalProcessSpawner.cmd = Command()#
The command used for starting the single-user server.
Provide either a string or a list containing the path to the startup script command. Extra arguments, other than this path, should be provided via args.
This is usually set if you want to start the single-user server in a different Python environment (with virtualenv/conda) than JupyterHub itself.
Some spawners allow shell-style expansion here, allowing you to use environment variables. Most, including the default, do not. Consult the documentation for your spawner to verify!
- consecutive_failure_limit c.LocalProcessSpawner.consecutive_failure_limit = Int(0)#
Maximum number of consecutive failures to allow before shutting down JupyterHub.
This helps JupyterHub recover from a certain class of problem preventing launch in contexts where the Hub is automatically restarted (e.g. systemd, docker, kubernetes).
A limit of 0 means no limit and consecutive failures will not be tracked.
- cpu_guarantee c.LocalProcessSpawner.cpu_guarantee = Float(None)#
Minimum number of cpu-cores a single-user notebook server is guaranteed to have available.
If this value is set to 0.5, allows use of 50% of one CPU. If this value is set to 2, allows use of up to 2 CPUs.
This is a configuration setting. Your spawner must implement support for the limit to work. The default spawner, LocalProcessSpawner, does not implement this support. A custom spawner must add support for this setting for it to be enforced.
- cpu_limit c.LocalProcessSpawner.cpu_limit = Float(None)#
Maximum number of cpu-cores a single-user notebook server is allowed to use.
If this value is set to 0.5, allows use of 50% of one CPU. If this value is set to 2, allows use of up to 2 CPUs.
The single-user notebook server will never be scheduled by the kernel to use more cpu-cores than this. There is no guarantee that it can access this many cpu-cores.
This is a configuration setting. Your spawner must implement support for the limit to work. The default spawner, LocalProcessSpawner, does not implement this support. A custom spawner must add support for this setting for it to be enforced.
- debug c.LocalProcessSpawner.debug = Bool(False)#
Enable debug-logging of the single-user server
- default_url c.LocalProcessSpawner.default_url = Unicode('')#
The URL the single-user server should start in.
{username} will be expanded to the user's username.
Example uses:
You can set notebook_dir to / and default_url to /tree/home/{username} to allow people to navigate the whole filesystem from their notebook server, but still start in their home directory.
Start with /notebooks instead of /tree if default_url points to a notebook instead of a directory.
You can set this to /lab to have JupyterLab start by default, rather than Jupyter Notebook.
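The uses above can be sketched as config (the paths are assumptions about your deployment):

```python
# jupyterhub_config.py -- sketch of the examples above
c.Spawner.notebook_dir = "/"
c.Spawner.default_url = "/tree/home/{username}"  # browse the filesystem, start in home

# Or start everyone in JupyterLab by default:
# c.Spawner.default_url = "/lab"
```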
- disable_user_config c.LocalProcessSpawner.disable_user_config = Bool(False)#
Disable per-user configuration of single-user servers.
When starting the user’s single-user server, any config file found in the user’s $HOME directory will be ignored.
Note: a user could circumvent this if the user modifies their Python environment, such as when they have their own conda environments / virtualenvs / containers.
- env_keep c.LocalProcessSpawner.env_keep = List()#
List of environment variables for the single-user server to inherit from the JupyterHub process.
This list is used to ensure that sensitive information in the JupyterHub process's environment (such as CONFIGPROXY_AUTH_TOKEN) is not passed to the single-user server's process.
- environment c.LocalProcessSpawner.environment = Dict()#
Extra environment variables to set for the single-user server’s process.
Environment variables that end up in the single-user server's process come from 3 sources:
- This environment configurable
- The JupyterHub process' environment variables that are listed in env_keep
- Variables to establish contact between the single-user notebook and the hub (such as JUPYTERHUB_API_TOKEN)
The environment configurable should be set by JupyterHub administrators to add installation-specific environment variables. It is a dict where the key is the name of the environment variable, and the value can be a string or a callable. If it is a callable, it will be called with one parameter (the spawner instance), and should return a string fairly quickly (no blocking operations please!).
Note that the spawner class' interface is not guaranteed to be exactly the same across upgrades, so if you are using the callable take care to verify it continues to work after upgrades!
Changed in version 1.2: environment from this configuration has highest priority, allowing override of ‘default’ env variables, such as JUPYTERHUB_API_URL.
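A minimal sketch of both value kinds (the variable names are illustrative, not part of JupyterHub):

```python
# jupyterhub_config.py -- sketch; variable names are examples only
c.Spawner.environment = {
    # plain string: set verbatim in the single-user server's environment
    "ORG_NAME": "acme",
    # callable: called with the spawner at spawn time; must return a string quickly
    "GREETING": lambda spawner: f"Hello {spawner.user.name}",
}
```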
- http_timeout c.LocalProcessSpawner.http_timeout = Int(30)#
Timeout (in seconds) before giving up on a spawned HTTP server
Once a server has successfully been spawned, this is the amount of time we wait before assuming that the server is unable to accept connections.
- hub_connect_url c.LocalProcessSpawner.hub_connect_url = Unicode(None)#
The URL the single-user server should connect to the Hub.
If the Hub URL set in your JupyterHub config is not reachable from spawned notebooks, you can set differnt URL by this config.
Is None if you don’t need to change the URL.
- interrupt_timeout c.LocalProcessSpawner.interrupt_timeout = Int(10)#
Seconds to wait for single-user server process to halt after SIGINT.
If the process has not exited cleanly after this many seconds, a SIGTERM is sent.
- ip c.LocalProcessSpawner.ip = Unicode('127.0.0.1')#
The IP address (or hostname) the single-user server should listen on.
Usually either ‘127.0.0.1’ (default) or ‘0.0.0.0’.
The JupyterHub proxy implementation should be able to send packets to this interface.
Subclasses which launch remotely or in containers should override the default to ‘0.0.0.0’.
Changed in version 2.0: Default changed to ‘127.0.0.1’, from ‘’. In most cases, this does not result in a change in behavior, as ‘’ was interpreted as ‘unspecified’, which used the subprocesses’ own default, itself usually ‘127.0.0.1’.
- kill_timeout c.LocalProcessSpawner.kill_timeout = Int(5)#
Seconds to wait for process to halt after SIGKILL before giving up.
If the process does not exit cleanly after this many seconds of SIGKILL, it becomes a zombie process. The hub process will log a warning and then give up.
- mem_guarantee c.LocalProcessSpawner.mem_guarantee = ByteSpecification(None)#
Minimum number of bytes a single-user notebook server is guaranteed to have available.
- Allows the following suffixes:
K -> Kilobytes
M -> Megabytes
G -> Gigabytes
T -> Terabytes
This is a configuration setting. Your spawner must implement support for the limit to work. The default spawner, LocalProcessSpawner, does not implement this support. A custom spawner must add support for this setting for it to be enforced.
- mem_limit c.LocalProcessSpawner.mem_limit = ByteSpecification(None)#
Maximum number of bytes a single-user notebook server is allowed to use.
- Allows the following suffixes:
K -> Kilobytes
M -> Megabytes
G -> Gigabytes
T -> Terabytes
If the single user server tries to allocate more memory than this, it will fail. There is no guarantee that the single-user notebook server will be able to allocate this much memory - only that it can not allocate more than this.
This is a configuration setting. Your spawner must implement support for the limit to work. The default spawner, LocalProcessSpawner, does not implement this support. A custom spawner must add support for this setting for it to be enforced.
- notebook_dir c.LocalProcessSpawner.notebook_dir = Unicode('')#
Path to the notebook directory for the single-user server.
The user sees a file listing of this directory when the notebook interface is started. The current interface does not easily allow browsing beyond the subdirectories in this directory’s tree.
~ will be expanded to the home directory of the user, and {username} will be replaced with the name of the user.
Note that this does not prevent users from accessing files outside of this path! They can do so with many other means.
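For example, to start each user in a notebooks subdirectory of their home (a sketch; the path is an assumption):

```python
# jupyterhub_config.py -- sketch
# ~ expands to each user's home directory at spawn time
c.LocalProcessSpawner.notebook_dir = "~/notebooks"
```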
- oauth_client_allowed_scopes c.LocalProcessSpawner.oauth_client_allowed_scopes = Union()#
Allowed scopes for oauth tokens issued by this server’s oauth client.
This sets the maximum and default scopes assigned to oauth tokens issued by a single-user server’s oauth client (i.e. tokens stored in browsers after authenticating with the server), defining what actions the server can take on behalf of logged-in users.
Default is an empty list, meaning minimal permissions to identify users; no actions can be taken on their behalf.
If callable, will be called with the Spawner as a single argument. Callables may be async.
- oauth_roles c.LocalProcessSpawner.oauth_roles = Union()#
Allowed roles for oauth tokens.
Deprecated in 3.0: use oauth_client_allowed_scopes
This sets the maximum and default roles assigned to oauth tokens issued by a single-user server’s oauth client (i.e. tokens stored in browsers after authenticating with the server), defining what actions the server can take on behalf of logged-in users.
Default is an empty list, meaning minimal permissions to identify users; no actions can be taken on their behalf.
- options_form c.LocalProcessSpawner.options_form = Union()#
An HTML form for options a user can specify on launching their server.
The surrounding <form> element and the submit button are already provided.
For example:
Set your key:
<input name="key" val="default_key"></input>
<br>
Choose a letter:
<select name="letter" multiple="true">
    <option value="A">The letter A</option>
    <option value="B">The letter B</option>
</select>
The data from this form submission will be passed on to your spawner in self.user_options.
Instead of a form snippet string, this could also be a callable that takes as one parameter the current spawner instance and returns a string. The callable will be called asynchronously if it returns a future, rather than a str. Note that the interface of the spawner class is not deemed stable across versions, so using this functionality might cause your JupyterHub upgrades to break.
- options_from_form c.LocalProcessSpawner.options_from_form = Callable()#
Interpret HTTP form data
Form data will always arrive as a dict of lists of strings. Override this function to understand single-values, numbers, etc.
This should coerce form data into the structure expected by self.user_options, which must be a dict, and should be JSON-serializable, though it can contain bytes in addition to standard JSON data types.
This method should not have any side effects. Any handling of user_options should be done in .start() to ensure consistent behavior across servers spawned via the API and form submission page.
Instances will receive this data on self.user_options, after passing through this function, prior to Spawner.start.
Changed in version 1.0: user_options are persisted in the JupyterHub database to be reused on subsequent spawns if no options are given. user_options is serialized to JSON as part of this persistence (with additional support for bytes in case of uploaded file data), and any non-bytes non-jsonable values will be replaced with None if the user_options are re-used.
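A sketch of such a coercion function, matching the hypothetical key/letter form shown under options_form (the field names are assumptions from that example):

```python
def options_from_form(formdata):
    """Coerce form data (a dict of lists of strings) into user_options.

    Must return a JSON-serializable dict; should have no side effects.
    """
    options = {}
    options["key"] = formdata["key"][0]              # single text input
    options["letters"] = formdata.get("letter", [])  # multi-select stays a list
    return options

# jupyterhub_config.py:
# c.Spawner.options_from_form = options_from_form
```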
- poll_interval c.LocalProcessSpawner.poll_interval = Int(30)#
Interval (in seconds) on which to poll the spawner for single-user server’s status.
At every poll interval, each spawner's .poll method is called, which checks if the single-user server is still running. If it isn't running, then JupyterHub modifies its own state accordingly and removes appropriate routes from the configurable proxy.
- poll_jitter c.LocalProcessSpawner.poll_jitter = Float(0.1)#
Jitter fraction for poll_interval.
Avoids alignment of poll calls for many Spawners, e.g. when restarting JupyterHub, which restarts all polls for running Spawners.
poll_jitter=0 means no jitter, 0.1 means 10%, etc.
- popen_kwargs c.LocalProcessSpawner.popen_kwargs = Dict()#
Extra keyword arguments to pass to Popen when spawning single-user servers.
For example:
popen_kwargs = dict(shell=True)
- port c.LocalProcessSpawner.port = Int(0)#
The port for single-user servers to listen on.
Defaults to 0, which uses a randomly allocated port number each time.
If set to a non-zero value, all Spawners will use the same port, which only makes sense if each server is on a different address, e.g. in containers.
New in version 0.7.
- post_stop_hook c.LocalProcessSpawner.post_stop_hook = Any(None)#
An optional hook function that you can implement to do work after the spawner stops.
This can be set independent of any concrete spawner implementation.
- pre_spawn_hook c.LocalProcessSpawner.pre_spawn_hook = Any(None)#
An optional hook function that you can implement to do some bootstrapping work before the spawner starts. For example, create a directory for your user or load initial content.
This can be set independent of any concrete spawner implementation.
This may be a coroutine.
Example:
def my_hook(spawner):
    username = spawner.user.name
    spawner.environment["GREETING"] = f"Hello {username}"

c.Spawner.pre_spawn_hook = my_hook
- progress_ready_hook c.LocalProcessSpawner.progress_ready_hook = Any(None)#
An optional hook function that you can implement to modify the ready event, which will be shown to the user on the spawn progress page when their server is ready.
This can be set independent of any concrete spawner implementation.
This may be a coroutine.
Example:
async def my_ready_hook(spawner, ready_event):
    ready_event["html_message"] = f"Server {spawner.name} is ready for {spawner.user.name}"
    return ready_event

c.Spawner.progress_ready_hook = my_ready_hook
- server_token_scopes c.LocalProcessSpawner.server_token_scopes = Union()#
The list of scopes to request for $JUPYTERHUB_API_TOKEN
If not specified, the scopes in the server role will be used (unchanged from pre-4.0).
If callable, will be called with the Spawner instance as its sole argument (the JupyterHub user is available as spawner.user).
JUPYTERHUB_API_TOKEN will be assigned the _subset_ of these scopes that are held by the user (as in oauth_client_allowed_scopes).
Added in version 4.0.
- shell_cmd c.LocalProcessSpawner.shell_cmd = Command()#
Specify a shell command to launch.
The single-user command will be appended to this list, so it should end with -c (for bash) or equivalent.
For example:
c.LocalProcessSpawner.shell_cmd = ['bash', '-l', '-c']
to launch with a bash login shell, which would set up the user’s own complete environment.
Warning
Using shell_cmd gives users control over PATH, etc., which could change what the jupyterhub-singleuser launch command does. Only use this for trusted users.
- ssl_alt_names c.LocalProcessSpawner.ssl_alt_names = List()#
List of SSL alt names
May be set in config if all spawners should have the same value(s), or set at runtime by Spawners that know their names.
- ssl_alt_names_include_local c.LocalProcessSpawner.ssl_alt_names_include_local = Bool(True)#
Whether to include DNS:localhost, IP:127.0.0.1 in alt names
- start_timeout c.LocalProcessSpawner.start_timeout = Int(60)#
Timeout (in seconds) before giving up on starting of single-user server.
This is the timeout for start to return, not the timeout for the server to respond. Callers of spawner.start will assume that startup has failed if it takes longer than this. start should return when the server process is started and its location is known.
- term_timeout c.LocalProcessSpawner.term_timeout = Int(5)#
Seconds to wait for single-user server process to halt after SIGTERM.
If the process does not exit cleanly after this many seconds of SIGTERM, a SIGKILL is sent.
Proxies#
Module: jupyterhub.proxy
#
API for JupyterHub’s proxy.
Custom proxy implementations can subclass Proxy and register in JupyterHub config:
from mymodule import MyProxy
c.JupyterHub.proxy_class = MyProxy
Route Specification:
A routespec is a URL prefix ([host]/path/), e.g. ‘host.tld/path/’ for host-based routing or ‘/path/’ for default routing.
Route paths should be normalized to always start and end with ‘/’
Proxy
#
- class jupyterhub.proxy.Proxy(**kwargs: Any)#
Base class for configurable proxies that JupyterHub can use.
A proxy implementation should subclass this and must define the following methods:
get_all_routes(): return a dictionary of all JupyterHub-related routes
add_route(): add a route
delete_route(): delete a route
In addition to these, the following method(s) may need to be implemented:
start(): start the proxy, if it should be launched by the Hub instead of externally managed. If the proxy is externally managed, it should set should_start to False.
And the following method(s) are optional, but can be provided:
get_route(): get a single route. There is a default implementation that extracts data from get_all_routes(), but implementations may choose to provide a more efficient implementation of fetching a single route.
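Put together, a minimal subclass might look like the sketch below, which stores routes in an in-memory dict in place of driving a real proxy backend (illustrative only; a real implementation must actually configure a proxy server, and this assumes jupyterhub is installed):

```python
from jupyterhub.proxy import Proxy

class DictProxy(Proxy):
    """Sketch: track routes in memory instead of a real proxy."""

    def __init__(self, **kwargs):
        super().__init__(**kwargs)
        self._routes = {}

    async def add_route(self, routespec, target, data):
        # routespec is a normalized URL prefix ([host]/path/)
        self._routes[routespec] = {
            "routespec": routespec,
            "target": target,
            "data": data,
        }

    async def delete_route(self, routespec):
        self._routes.pop(routespec, None)

    async def get_all_routes(self):
        return dict(self._routes)

# jupyterhub_config.py:
# c.JupyterHub.proxy_class = DictProxy
```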
- async add_all_services(service_dict)#
Update the proxy table from the database.
Used when loading up a new proxy.
- async add_all_users(user_dict)#
Update the proxy table from the database.
Used when loading up a new proxy.
- add_hub_route(hub)#
Add the default route for the Hub
- async add_route(routespec, target, data)#
Add a route to the proxy.
Subclasses must define this method
- Parameters:
routespec (str) – A URL prefix ([host]/path/) for which this route will be matched, e.g. host.name/path/
target (str) – A full URL that will be the target of this route.
data (dict) – A JSONable dict that will be associated with this route, and will be returned when retrieving information about this route.
Will raise an appropriate Exception if the route could not be added.
The proxy implementation should also have a way to associate the fact that a route came from JupyterHub.
- async add_service(service, client=None)#
Add a service’s server to the proxy table.
- async add_user(user, server_name='', client=None)#
Add a user’s server to the proxy table.
- async check_routes(user_dict, service_dict, routes=None)#
Check that all users are properly routed on the proxy.
- async delete_route(routespec)#
Delete a route with a given routespec if it exists.
Subclasses must define this method
- async delete_service(service, client=None)#
Remove a service’s server from the proxy table.
- async delete_user(user, server_name='')#
Remove a user’s server from the proxy table.
- extra_routes c.Proxy.extra_routes = Dict()#
Additional routes to be maintained in the proxy.
A dictionary with a route specification as key, and a URL as target. The hub will ensure this route is present in the proxy.
If the hub is running in host based mode (with JupyterHub.subdomain_host set), the routespec must have a domain component (example.com/my-url/). If the hub is not running in host based mode, the routespec must not have a domain component (/my-url/).
Helpful when the hub is running in API-only mode.
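For example (a sketch; the URL and port are assumptions):

```python
# jupyterhub_config.py -- sketch
# Default (path-based) routing: no domain component in the routespec
c.Proxy.extra_routes = {
    "/my-url/": "http://127.0.0.1:9999",
}
# With host-based routing (JupyterHub.subdomain_host set), the key would
# instead include a domain, e.g. "example.com/my-url/".
```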
- async get_all_routes()#
Fetch and return all the routes associated by JupyterHub from the proxy.
Subclasses must define this method
Should return a dictionary of routes, where the keys are routespecs and each value is a dict of the form:
{
    'routespec': the route specification ([host]/path/),
    'target': the target host URL (proto://host) for this route,
    'data': the attached data dict for this route (as specified in add_route),
}
- async get_route(routespec)#
Return the route info for a given routespec.
- Parameters:
routespec (str) – A URI that was used to add this route, e.g. host.tld/path/
- Returns:
dict with the following keys:
'routespec': The normalized route specification passed in to add_route ([host]/path/)
'target': The target host for this route (proto://host)
'data': The arbitrary data dict that was passed in by JupyterHub when adding this route.
None: if there are no routes matching the given routespec
- Return type:
result (dict)
- should_start c.Proxy.should_start = Bool(True)#
Should the Hub start the proxy
If True, the Hub will start the proxy and stop it. Set to False if the proxy is managed externally, such as by systemd, docker, or another service manager.
- start()#
Start the proxy.
Will be called during startup if should_start is True.
Subclasses must define this method if the proxy is to be started by the Hub
- stop()#
Stop the proxy.
Will be called during teardown if should_start is True.
Subclasses must define this method if the proxy is to be started by the Hub
- validate_routespec(routespec)#
Validate a routespec
Checks host value vs host-based routing.
Ensures trailing slash on path.
ConfigurableHTTPProxy
#
- class jupyterhub.proxy.ConfigurableHTTPProxy(**kwargs: Any)#
Proxy implementation for the default configurable-http-proxy.
This is the default proxy implementation for running the nodejs proxy configurable-http-proxy.
If the proxy should not be run as a subprocess of the Hub (e.g. in a separate container), set:
c.ConfigurableHTTPProxy.should_start = False
- api_url c.ConfigurableHTTPProxy.api_url = Unicode('')#
The ip (or hostname) of the proxy’s API endpoint
- auth_token c.ConfigurableHTTPProxy.auth_token = Unicode('')#
The Proxy auth token
Loaded from the CONFIGPROXY_AUTH_TOKEN env variable by default.
- check_running_interval c.ConfigurableHTTPProxy.check_running_interval = Int(5)#
Interval (in seconds) at which to check if the proxy is running.
- command c.ConfigurableHTTPProxy.command = Command()#
The command to start the proxy
- concurrency c.ConfigurableHTTPProxy.concurrency = Int(10)#
The number of requests allowed to be concurrently outstanding to the proxy
Limiting this number avoids potential timeout errors caused by sending too many requests to update the proxy at once
- debug c.ConfigurableHTTPProxy.debug = Bool(False)#
Add debug-level logging to the Proxy.
- extra_routes c.ConfigurableHTTPProxy.extra_routes = Dict()#
Additional routes to be maintained in the proxy.
A dictionary with a route specification as key, and a URL as target. The hub will ensure this route is present in the proxy.
If the hub is running in host based mode (with JupyterHub.subdomain_host set), the routespec must have a domain component (example.com/my-url/). If the hub is not running in host based mode, the routespec must not have a domain component (/my-url/).
Helpful when the hub is running in API-only mode.
- log_level c.ConfigurableHTTPProxy.log_level = CaselessStrEnum('info')#
Proxy log level
- pid_file c.ConfigurableHTTPProxy.pid_file = Unicode('jupyterhub-proxy.pid')#
File in which to write the PID of the proxy process.
- should_start c.ConfigurableHTTPProxy.should_start = Bool(True)#
Should the Hub start the proxy
If True, the Hub will start the proxy and stop it. Set to False if the proxy is managed externally, such as by systemd, docker, or another service manager.
Users#
Module: jupyterhub.user
#
UserDict
#
- class jupyterhub.user.UserDict(db_factory, settings)#
Like defaultdict, but for users
Users can be retrieved by:
integer database id
orm.User object
username str
A User wrapper object is always returned.
This dict contains at least all active users, but not necessarily all users in the database.
Checking key in userdict returns whether an item is already in the cache, not whether it is in the database.
Changed in version 1.2: 'username' in userdict pattern is now supported
- add(orm_user)#
Add a user to the UserDict
- count_active_users()#
Count the number of user servers that are active/pending/ready
Returns dict with counts of active/pending/ready servers
- delete(key)#
Delete a user from the cache and the database
- get(key, default=None)#
Retrieve a User object if it can be found, else default
Lookup can be by User object, id, or name
Changed in version 1.2: get() accesses the database instead of just the cache by integer id, so it is equivalent to catching KeyErrors on attempted lookup.
User
#
Services#
Module: jupyterhub.services.service
#
A service is a process that talks to JupyterHub.
- Types of services:
- Managed:
managed by JupyterHub (always subprocess, no custom Spawners)
always a long-running process
managed services are restarted automatically if they exit unexpectedly
- Unmanaged:
managed by external service (docker, systemd, etc.)
do not need to be long-running processes, or processes at all
- URL: needs a route added to the proxy.
Public route will always be /services/service-name
url specified in config
if port is 0, Hub will select a port
- API access:
admin: tokens will have admin-access to the API
not admin: tokens will only have non-admin access (not much they can do other than defer to Hub for auth)
An externally managed service running on a URL:
{
'name': 'my-service',
'url': 'https://host:8888',
'admin': True,
'api_token': 'super-secret',
}
A hub-managed service with no URL:
{
'name': 'cull-idle',
'command': ['python', '/path/to/cull-idle'],
'admin': True,
}
Service
#
- class jupyterhub.services.service.Service(**kwargs: Any)#
An object wrapping a service specification for Hub API consumers.
A service has inputs:
- name: str
the name of the service
- admin: bool(False)
whether the service should have administrative privileges
- url: str (None)
The URL where the service is/should be. If specified, the service will be added to the proxy at /services/:name
- oauth_no_confirm: bool(False)
Whether this service should be allowed to complete oauth with logged-in users without prompting for confirmation.
If a service is to be managed by the Hub, it has a few extra options:
- command: (str/Popen list)
Command for JupyterHub to spawn the service. Only use this if the service should be a subprocess. If command is not specified, it is assumed to be managed by an external process.
- environment: dict
Additional environment variables for the service.
- user: str
The name of a system user to become. If unspecified, run as the same user as the Hub.
- admin Bool(False)#
Does the service need admin-access to the Hub API?
- api_token Unicode('')#
The API token to use for the service.
If unspecified, an API token will be generated for managed services.
- command Command()#
Command to spawn this service, if managed.
- cwd Unicode('')#
The working directory in which to run the service.
- environment Dict()#
Environment variables to pass to the service. Only used if the Hub is spawning the service.
- property kind#
The name of the kind of service as a string
‘managed’ for managed services
‘external’ for external services
- property managed#
Am I managed by the Hub?
- name Unicode('')#
The name of the service.
If the service has an http endpoint, it will be available at /services/:name.
- oauth_client_id Unicode('')#
OAuth client ID for this service.
You shouldn’t generally need to change this. Default:
service-<name>
- url Unicode('')#
URL of the service.
Only specify if the service runs an HTTP(S) endpoint. If managed, this will be passed to the service as the JUPYTERHUB_SERVICE_URL environment variable.
- user Unicode('')#
The user to become when launching the service.
If unspecified, run the service as the same user as the Hub.
Services Authentication#
Module: jupyterhub.services.auth
#
Authenticating services with JupyterHub.
Tokens are sent to the Hub for verification. The Hub replies with a JSON model describing the authenticated user.
This contains two levels of authentication:
HubOAuth: use OAuth 2 to authenticate browsers with the Hub. This should be used for any service that should respond to browser requests (i.e. most services).
HubAuth: token-only authentication, for a service that only needs to handle token-authenticated API requests.
The Auth classes (HubAuth, HubOAuth) can be used in any application, even outside tornado. They contain reference implementations of talking to the Hub API to resolve a token to a user.
The Authenticated classes (HubAuthenticated, HubOAuthenticated) are mixins for tornado handlers that should authenticate with the Hub.
If you are using OAuth, you will also need to register an oauth callback handler to complete the oauth process. A tornado implementation is provided in HubOAuthCallbackHandler.
HubAuth
#
- class jupyterhub.services.auth.HubAuth(**kwargs: Any)#
A class for authenticating with JupyterHub
This can be used by any application.
Use this base class only for direct, token-authenticated applications (web APIs). For applications that support direct visits from browsers, use HubOAuth to enable OAuth redirect-based authentication.
If using tornado, use via the HubAuthenticated mixin. If using manually, use the .user_for_token(token_value) method to identify the user owning a given token.
The following config must be set:
api_token (token for authenticating with JupyterHub API), fetched from the JUPYTERHUB_API_TOKEN env by default.
The following config MAY be set:
api_url: the base URL of the Hub’s internal API, fetched from JUPYTERHUB_API_URL by default.
cookie_cache_max_age: the number of seconds responses from the Hub should be cached.
login_url (the public
/hub/login
URL of the Hub).
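For a hub-managed service, the Hub sets JUPYTERHUB_API_TOKEN and JUPYTERHUB_API_URL in the service’s environment automatically, so none of this needs to be hard-coded. A minimal jupyterhub_config.py sketch; the service name, port, and command below are placeholders:

```python
# jupyterhub_config.py -- hub-managed service; the name, url, and
# command are placeholders for illustration.
c.JupyterHub.services = [
    {
        "name": "whoami",
        "url": "http://127.0.0.1:10101",
        "command": ["python3", "-m", "my_service"],  # hypothetical module
    }
]
```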
- access_scopes c.HubAuth.access_scopes = Set()#
OAuth scopes to use for allowing access.
Get from $JUPYTERHUB_OAUTH_ACCESS_SCOPES by default.
- allow_token_in_url c.HubAuth.allow_token_in_url = Bool(False)#
Allow requests to pages with ?token=… in the URL
This allows starting a user session by sharing a URL with credentials, bypassing authentication with the Hub.
If False, tokens in URLs will be ignored by the server, except on websocket requests.
Has no effect on websocket requests, which can only reliably authenticate via token in the URL, as recommended by browser Websocket implementations.
This will default to False in JupyterHub 5.
Added in version 4.1.
Changed in version 5.0: default changed to False
- allow_websocket_cookie_auth c.HubAuth.allow_websocket_cookie_auth = Bool(True)#
Allow websocket requests with only cookie for authentication
Cookie-authenticated websockets cannot be protected from other user servers unless per-user domains are used. Disabling cookie auth on websockets protects user servers from each other, but may break some user applications. Per-user domains eliminate the need to lock this down.
JupyterLab older than 4.1.2 and Notebook older than 6.5.6 / 7.1.0 will not work because they rely on cookie authentication without API or XSRF tokens.
Added in version 4.1.
- api_token c.HubAuth.api_token = Unicode('')#
API key for accessing Hub API.
Default: $JUPYTERHUB_API_TOKEN
Loaded from services configuration in jupyterhub_config. Will be auto-generated for hub-managed services.
- api_url c.HubAuth.api_url = Unicode('http://127.0.0.1:8081/hub/api')#
The base API URL of the Hub.
Typically
http://hub-ip:hub-port/hub/api
Default: $JUPYTERHUB_API_URL
- base_url c.HubAuth.base_url = Unicode('/')#
The base URL prefix of this application
e.g. /services/service-name/ or /user/name/
Default: get from JUPYTERHUB_SERVICE_PREFIX
- cache_max_age c.HubAuth.cache_max_age = Int(300)#
The maximum time (in seconds) to cache the Hub’s responses for authentication.
A larger value reduces load on the Hub and occasional response lag. A smaller value reduces propagation time of changes on the Hub (rare).
Default: 300 (five minutes)
- certfile c.HubAuth.certfile = Unicode('')#
The ssl cert to use for requests
Use with keyfile
- check_scopes(required_scopes, user)#
Check whether the user has required scope(s)
- client_ca c.HubAuth.client_ca = Unicode('')#
The ssl certificate authority to use to verify requests
Use with keyfile and certfile
- cookie_cache_max_age Int(0)#
DEPRECATED. Use cache_max_age
- cookie_host_prefix_enabled c.HubAuth.cookie_host_prefix_enabled = Bool(False)#
Enable __Host- prefix on authentication cookies.
The __Host- prefix on JupyterHub cookies provides further protection against cookie tossing when untrusted servers may control subdomains of your jupyterhub deployment. However, it also requires that cookies be set on the path /, which means they are shared by all JupyterHub components, so a compromised server component will have access to all JupyterHub-related cookies of the visiting browser. It is recommended to only combine __Host- cookies with per-user domains.
Set via $JUPYTERHUB_COOKIE_HOST_PREFIX_ENABLED
- cookie_options c.HubAuth.cookie_options = Dict()#
Additional options to pass when setting cookies.
Can include things like expires_days=None for session-expiry or secure=True if served on HTTPS and default HTTPS discovery fails (e.g. behind some proxies).
- property cookie_path#
Path prefix on which to set cookies
self.base_url, but ‘/’ when cookie_host_prefix_enabled is True
- get_session_id(handler)#
Get the jupyterhub session id
from the jupyterhub-session-id cookie.
- Parameters:
handler (tornado.web.RequestHandler) – the current request handler
- get_token(handler, in_cookie=True)#
Get the token authenticating a request
Changed in version 2.2: in_cookie added. Previously, only URL params and the header were considered. Pass in_cookie=False to preserve that behavior.
A token may be found:
in URL parameters: ?token=<token>
in the header: Authorization: token <token>
in a cookie (stored after OAuth), if in_cookie is True
- Parameters:
handler (tornado.web.RequestHandler) – the current request handler
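The lookup order get_token describes can be sketched as a hypothetical helper; this checks the URL parameter first, then the Authorization header, and omits the cookie step and the real implementation’s details:

```python
# Hypothetical sketch of the token sources get_token() checks:
# URL parameter first, then the Authorization header.
# (The cookie lookup is omitted.)
from typing import Optional
from urllib.parse import parse_qs, urlparse


def token_from_request(url: str, headers: dict) -> Optional[str]:
    # 1. in URL parameters: ?token=<token>
    qs = parse_qs(urlparse(url).query)
    if qs.get("token"):
        return qs["token"][0]
    # 2. in header: Authorization: token <token>
    auth = headers.get("Authorization", "")
    if auth.lower().startswith("token "):
        return auth[len("token "):]
    return None
```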
- get_user(handler, *, sync=True)#
Get the Hub user for a given tornado handler.
Checks cookie with the Hub to identify the current user.
Added in version 2.4: async support via the sync argument.
- Parameters:
handler (tornado.web.RequestHandler) – the current request handler
sync (bool) – whether to block for the result or return an awaitable
- Returns:
The user model, if a user is identified, None if authentication fails.
The ‘name’ field contains the user’s name.
- Return type:
user_model (dict)
- hub_host c.HubAuth.hub_host = Unicode('')#
The public host of JupyterHub
Only used if JupyterHub is spreading servers across subdomains.
- hub_prefix c.HubAuth.hub_prefix = Unicode('/hub/')#
The URL prefix for the Hub itself.
Typically /hub/ Default: $JUPYTERHUB_BASE_URL
- keyfile c.HubAuth.keyfile = Unicode('')#
The ssl key to use for requests
Use with certfile
- login_url c.HubAuth.login_url = Unicode('/hub/login')#
The login URL to use
Typically /hub/login
- user_for_cookie(encrypted_cookie, use_cache=True, session_id='')#
Deprecated and removed. Use HubOAuth to authenticate browsers.
- user_for_token(token, use_cache=True, session_id='', *, sync=True)#
Ask the Hub to identify the user for a given token.
Added in version 2.4: async support via the sync argument.
- Returns:
The user model, if a user is identified, None if authentication fails.
The ‘name’ field contains the user’s name.
- Return type:
user_model (dict)
HubOAuth
#
- class jupyterhub.services.auth.HubOAuth(**kwargs: Any)#
HubAuth using OAuth for login instead of cookies set by the Hub.
Use this class if you want users to be able to visit your service with a browser. They will be authenticated via OAuth with the Hub.
- access_scopes c.HubOAuth.access_scopes = Set()#
OAuth scopes to use for allowing access.
Get from $JUPYTERHUB_OAUTH_ACCESS_SCOPES by default.
- allow_token_in_url c.HubOAuth.allow_token_in_url = Bool(False)#
Allow requests to pages with ?token=… in the URL
This allows starting a user session by sharing a URL with credentials, bypassing authentication with the Hub.
If False, tokens in URLs will be ignored by the server, except on websocket requests.
Has no effect on websocket requests, which can only reliably authenticate via token in the URL, as recommended by browser Websocket implementations.
This will default to False in JupyterHub 5.
Added in version 4.1.
Changed in version 5.0: default changed to False
- allow_websocket_cookie_auth c.HubOAuth.allow_websocket_cookie_auth = Bool(True)#
Allow websocket requests with only cookie for authentication
Cookie-authenticated websockets cannot be protected from other user servers unless per-user domains are used. Disabling cookie auth on websockets protects user servers from each other, but may break some user applications. Per-user domains eliminate the need to lock this down.
JupyterLab older than 4.1.2 and Notebook older than 6.5.6 / 7.1.0 will not work because they rely on cookie authentication without API or XSRF tokens.
Added in version 4.1.
- api_token c.HubOAuth.api_token = Unicode('')#
API key for accessing Hub API.
Default: $JUPYTERHUB_API_TOKEN
Loaded from services configuration in jupyterhub_config. Will be auto-generated for hub-managed services.
- api_url c.HubOAuth.api_url = Unicode('http://127.0.0.1:8081/hub/api')#
The base API URL of the Hub.
Typically
http://hub-ip:hub-port/hub/api
Default: $JUPYTERHUB_API_URL
- base_url c.HubOAuth.base_url = Unicode('/')#
The base URL prefix of this application
e.g. /services/service-name/ or /user/name/
Default: get from JUPYTERHUB_SERVICE_PREFIX
- cache_max_age c.HubOAuth.cache_max_age = Int(300)#
The maximum time (in seconds) to cache the Hub’s responses for authentication.
A larger value reduces load on the Hub and occasional response lag. A smaller value reduces propagation time of changes on the Hub (rare).
Default: 300 (five minutes)
- certfile c.HubOAuth.certfile = Unicode('')#
The ssl cert to use for requests
Use with keyfile
- check_xsrf_cookie(handler)#
check_xsrf_cookie patch
Applies JupyterHub check_xsrf_cookie if not token authenticated
- clear_cookie(handler)#
Clear the OAuth cookie
- Parameters:
handler (tornado.web.RequestHandler) – the current request handler
- clear_oauth_state(state_id)#
Clear persisted oauth state
- clear_oauth_state_cookies(handler)#
Clear persisted oauth state
- client_ca c.HubOAuth.client_ca = Unicode('')#
The ssl certificate authority to use to verify requests
Use with keyfile and certfile
- cookie_host_prefix_enabled c.HubOAuth.cookie_host_prefix_enabled = Bool(False)#
Enable __Host- prefix on authentication cookies.
The __Host- prefix on JupyterHub cookies provides further protection against cookie tossing when untrusted servers may control subdomains of your jupyterhub deployment. However, it also requires that cookies be set on the path /, which means they are shared by all JupyterHub components, so a compromised server component will have access to all JupyterHub-related cookies of the visiting browser. It is recommended to only combine __Host- cookies with per-user domains.
Set via $JUPYTERHUB_COOKIE_HOST_PREFIX_ENABLED
- property cookie_name#
Use OAuth client_id for cookie name
because we don’t want to use the same cookie name across OAuth clients.
- cookie_options c.HubOAuth.cookie_options = Dict()#
Additional options to pass when setting cookies.
Can include things like expires_days=None for session-expiry or secure=True if served on HTTPS and default HTTPS discovery fails (e.g. behind some proxies).
- generate_state(next_url=None, **extra_state)#
Generate a state string, given a next_url redirect target
The state info is stored locally in self._oauth_states, and only the state id is returned for use in the oauth state field (cookie, redirect param)
- Parameters:
next_url (str) – The URL of the page to redirect to on successful login.
- Returns:
state_id – the state string to be used as a cookie value
- Return type:
str
- get_next_url(state_id='', /)#
Get the next_url for redirection, given an encoded OAuth state
- get_state_cookie_name(state_id='', /)#
Get the cookie name for oauth state, given an encoded OAuth state
Cookie name is stored in the state itself because the cookie name is randomized to deal with races between concurrent oauth sequences.
- hub_host c.HubOAuth.hub_host = Unicode('')#
The public host of JupyterHub
Only used if JupyterHub is spreading servers across subdomains.
- hub_prefix c.HubOAuth.hub_prefix = Unicode('/hub/')#
The URL prefix for the Hub itself.
Typically /hub/ Default: $JUPYTERHUB_BASE_URL
- keyfile c.HubOAuth.keyfile = Unicode('')#
The ssl key to use for requests
Use with certfile
- login_url c.HubOAuth.login_url = Unicode('/hub/login')#
The login URL to use
Typically /hub/login
- oauth_authorization_url c.HubOAuth.oauth_authorization_url = Unicode('/hub/api/oauth2/authorize')#
The URL to redirect to when starting the OAuth process
- oauth_client_id c.HubOAuth.oauth_client_id = Unicode('')#
The OAuth client ID for this application.
Use JUPYTERHUB_CLIENT_ID by default.
- oauth_redirect_uri c.HubOAuth.oauth_redirect_uri = Unicode('')#
OAuth redirect URI
Should generally be /base_url/oauth_callback
- oauth_state_max_age c.HubOAuth.oauth_state_max_age = Int(600)#
Max age (seconds) of oauth state.
Governs both oauth state cookie Max-Age, as well as the in-memory _oauth_states cache.
- oauth_token_url c.HubOAuth.oauth_token_url = Unicode('')#
The URL for requesting an OAuth token from JupyterHub
- set_cookie(handler, access_token)#
Set a cookie recording OAuth result
- set_state_cookie(handler, next_url=None)#
Generate an OAuth state and store it in a cookie
- property state_cookie_name#
The cookie name for storing OAuth state
This cookie is only live for the duration of the OAuth handshake.
HubAuthenticated
#
- class jupyterhub.services.auth.HubAuthenticated#
Mixin for tornado handlers that are authenticated with JupyterHub
A handler that mixes this in must have the following attributes/properties:
.hub_auth: A HubAuth instance
.hub_scopes: A set of JupyterHub 2.0 OAuth scopes to allow. Default comes from .hub_auth.oauth_access_scopes, which in turn is set by $JUPYTERHUB_OAUTH_ACCESS_SCOPES. Default values include:
- ‘access:services’, ‘access:services!service={service_name}’ for services
- ‘access:servers’, ‘access:servers!user={user}’, ‘access:servers!server={user}/{server_name}’ for single-user servers
If hub_scopes is not used (e.g. JupyterHub 1.x), these additional properties can be used:
.allow_admin: If True, allow any admin user. Default: False.
.hub_users: A set of usernames to allow. If left unspecified or None, username will not be checked.
.hub_groups: A set of group names to allow. If left unspecified or None, groups will not be checked.
Examples:
class MyHandler(HubAuthenticated, web.RequestHandler):
    def initialize(self, hub_auth):
        self.hub_auth = hub_auth

    @web.authenticated
    def get(self):
        ...
- property allow_all#
Property indicating that any successfully identified user or service should be allowed.
- check_hub_user(model)#
Check whether Hub-authenticated user or service should be allowed.
Returns the input if the user should be allowed, None otherwise.
Override for custom logic in authenticating users.
- get_current_user()#
Tornado’s authentication method
- Returns:
The user model, if a user is identified, None if authentication fails.
- Return type:
user_model (dict)
- get_login_url()#
Return the Hub’s login URL
- property hub_scopes#
Set of allowed scopes (use hub_auth.access_scopes by default)
HubOAuthenticated
#
- class jupyterhub.services.auth.HubOAuthenticated#
Simple subclass of HubAuthenticated using OAuth instead of old shared cookies
HubOAuthCallbackHandler
#
- class jupyterhub.services.auth.HubOAuthCallbackHandler(application: Application, request: HTTPServerRequest, **kwargs: Any)#
OAuth Callback handler
Finishes the OAuth flow, setting a cookie to record the user’s info.
Should be registered at
SERVICE_PREFIX/oauth_callback
Frequently asked questions#
Find answers to the most frequently asked questions about JupyterHub such as how to troubleshoot an issue.
FAQs#
Find answers to some of the most frequently-asked questions around JupyterHub and how it works.
Institutional FAQ#
This page contains common questions from users of JupyterHub, broken down by their roles within organizations.
For all#
Is it appropriate for adoption within a larger institutional context?#
Yes! JupyterHub has been used at-scale for large pools of users, as well as complex and high-performance computing. For example,
UC Berkeley uses JupyterHub for its Data Science Education Program courses (serving over 3,000 students).
The Pangeo project uses JupyterHub to provide access to scalable cloud computing with Dask.
JupyterHub is stable and customizable to the use-cases of large organizations.
I keep hearing about Jupyter Notebook, JupyterLab, and now JupyterHub. What’s the difference?#
Here is a quick breakdown of these three tools:
The Jupyter Notebook is a document specification (the .ipynb file) that interweaves narrative text with code cells and their outputs. It is also a graphical interface that allows users to edit these documents. There are also several other graphical interfaces that allow users to edit the .ipynb format (nteract, JupyterLab, Google Colab, Kaggle, etc.).
JupyterLab is a flexible and extensible user interface for interactive computing. It has several extensions that are tailored for using Jupyter Notebooks, as well as extensions for other parts of the data science stack.
JupyterHub is an application that manages interactive computing sessions for multiple users. It also connects users with infrastructure they wish to access. It can provide remote access to Jupyter Notebooks and JupyterLab for many people.
For management#
Briefly, what problem does JupyterHub solve for us?#
JupyterHub provides a shared platform for data science and collaboration. It allows users to utilize familiar data science workflows (such as the scientific Python stack, the R tidyverse, and Jupyter Notebooks) on institutional infrastructure. It also gives administrators some control over access to resources, security, environments, and authentication.
Is JupyterHub mature? Why should we trust it?#
Yes - the core JupyterHub application recently reached 1.0 status, and is considered stable and performant for most institutions. JupyterHub has also been deployed (along with other tools) to work on scalable infrastructure, large datasets, and high-performance computing.
Who else uses JupyterHub?#
JupyterHub is used at a variety of institutions in academia, industry, and government research labs. It is most-commonly used by two kinds of groups:
Small teams (e.g., data science teams, research labs, or collaborative projects) to provide a shared resource for interactive computing, collaboration, and analytics.
Large teams (e.g., a department, a large class, or a large group of remote users) to provide access to organizational hardware, data, and analytics environments at scale.
Here is a sample of organizations that use JupyterHub:
Universities and colleges: UC Berkeley, UC San Diego, Cal Poly SLO, Harvard University, University of Chicago, University of Oslo, University of Sheffield, Université Paris Sud, University of Versailles
Research laboratories: NASA, NCAR, NOAA, the Large Synoptic Survey Telescope, Brookhaven National Lab, Minnesota Supercomputing Institute, ALCF, CERN, Lawrence Livermore National Laboratory, HUNT
Online communities: Pangeo, Quantopian, mybinder.org, MathHub, Open Humans
Computing infrastructure providers: NERSC, San Diego Supercomputing Center, Compute Canada
Companies: Capital One, SANDVIK code, Globus
See the Gallery of JupyterHub deployments for a more complete list of JupyterHub deployments at institutions.
How does JupyterHub compare with hosted products, like Google Colaboratory, RStudio.cloud, or Anaconda Enterprise?#
JupyterHub puts you in control of your data, infrastructure, and coding environment. In addition, it is vendor neutral, which reduces lock-in to a particular vendor or service. JupyterHub provides access to interactive computing environments in the cloud (similar to each of these services). Compared with the tools above, it is more flexible, more customizable, free, and gives administrators more control over their setup and hardware.
Because JupyterHub is an open-source, community-driven tool, it can be extended and modified to fit an institution’s needs. It plays nicely with the open source data science stack, and can serve a variety of computing environments, user interfaces, and computational hardware. It can also be deployed anywhere - on enterprise cloud infrastructure, on High-Performance-Computing machines, on local hardware, or even on a single laptop, which is not possible with most other tools for shared interactive computing.
For IT#
How would I set up JupyterHub on institutional hardware?#
That depends on what kind of hardware you’ve got. JupyterHub is flexible enough to be deployed on a variety of hardware, including in-room hardware, on-prem clusters, cloud infrastructure, etc.
The most common way to set up a JupyterHub is to use a JupyterHub distribution; these are pre-configured and opinionated ways to set up a JupyterHub on particular kinds of infrastructure. The two distributions that we currently suggest are:
Zero to JupyterHub for Kubernetes is a scalable JupyterHub deployment and guide that runs on Kubernetes. Better for larger or dynamic user groups (50-10,000) or more complex compute/data needs.
The Littlest JupyterHub is a lightweight JupyterHub that runs on a single machine (in the cloud or under your desk). Better for smaller user groups (4-80) or more lightweight computational resources.
Does JupyterHub run well in the cloud?#
Yes - most deployments of JupyterHub are run via cloud infrastructure and on a variety of cloud providers. Depending on the distribution of JupyterHub that you’d like to use, you can also connect your JupyterHub deployment with a number of other cloud-native services so that users have access to other resources from their interactive computing sessions.
For example, if you use the Zero to JupyterHub for Kubernetes distribution, you’ll be able to utilize container-based workflows of other technologies such as the dask-kubernetes project for distributed computing.
The Z2JH Helm Chart also has some functionality built in for auto-scaling your cluster up and down as more resources are needed - allowing you to utilize the benefits of a flexible cloud-based deployment.
Is JupyterHub secure?#
The short answer: yes. JupyterHub as a standalone application has been battle-tested at an institutional level for several years, and makes a number of “default” security decisions that are reasonable for most users.
For security considerations in the base JupyterHub application, see the JupyterHub security page.
For security considerations when deploying JupyterHub on Kubernetes, see the JupyterHub on Kubernetes security page.
The longer answer: it depends on your deployment. Because JupyterHub is very flexible, it can be used in a variety of deployment setups. This often entails connecting your JupyterHub to other infrastructure (such as a Dask Gateway service). There are many security decisions to be made in these cases, and the security of your JupyterHub deployment will often depend on these decisions.
If you are worried about security, don’t hesitate to reach out to the JupyterHub community in the Jupyter Community Forum. This community of practice has many individuals with experience running secure JupyterHub deployments and will be very glad to help you out.
Does JupyterHub provide computing or data infrastructure?#
No - JupyterHub manages user sessions and can control computing infrastructure, but it does not provide these things itself. You are expected to run JupyterHub on your own infrastructure (local or in the cloud). Moreover, JupyterHub has no internal concept of “data”, but is designed to be able to communicate with data repositories (again, either locally or remotely) for use within interactive computing sessions.
How do I manage users?#
JupyterHub offers a few options for managing your users. Upon setting up a JupyterHub, you can choose what kind of authentication you’d like to use. For example, you can have users sign up with an institutional email address, or choose a username / password when they first log-in, or offload authentication onto another service such as an organization’s OAuth.
The users of a JupyterHub are stored locally, and can be modified manually by an administrator of the JupyterHub. Moreover, the active users on a JupyterHub can be found on the administrator’s page. This page gives you the ability to stop or restart kernels, inspect user filesystems, and even take over user sessions to assist them with debugging.
How do I manage software environments?#
A key benefit of JupyterHub is the ability for an administrator to define the environment(s) that users have access to. There are many ways to do this, depending on what kind of infrastructure you’re using for your JupyterHub.
For example, The Littlest JupyterHub runs on a single VM. In this case, the administrator defines an environment by installing packages to a shared folder that exists on the path of all users. The JupyterHub for Kubernetes deployment uses Docker images to define environments. You can create your own list of Docker images that users can select from, and can also control things like the amount of RAM available to users, or the types of machines that their sessions will use in the cloud.
How does JupyterHub manage computational resources?#
For interactive computing sessions, JupyterHub controls computational resources via a spawner. Spawners define how a new user session is created, and are customized for particular kinds of infrastructure. For example, the KubeSpawner knows how to control a Kubernetes deployment to create new pods when users log in.
For more sophisticated computational resources (like distributed computing), JupyterHub can connect with other infrastructure tools (like Dask or Spark). This allows users to control scalable or high-performance resources from within their JupyterHub sessions. The logic of how those resources are controlled is taken care of by the non-JupyterHub application.
Can JupyterHub be used with my high-performance computing resources?#
Yes - JupyterHub can provide access to many kinds of computing infrastructure. Especially when combined with other open-source schedulers such as Dask, you can manage fairly complex computing infrastructures from the interactive sessions of a JupyterHub. For example see the Dask HPC page.
How many resources do user sessions take?#
This is highly configurable by the administrator. If you wish for your users to have simple data analytics environments for prototyping and light data exploring, you can restrict their memory and CPU based on the resources that you have available. If you’d like your JupyterHub to serve as a gateway to high-performance computing or data resources, you may increase the resources available on user machines, or connect them with computing infrastructures elsewhere.
Can I customize the look and feel of a JupyterHub?#
JupyterHub provides some customization of the graphics displayed to users. The most common modification is to add custom branding to the JupyterHub login page, loading pages, and various elements that persist across all pages (such as headers).
For Technical Leads#
Will JupyterHub “just work” with our team’s interactive computing setup?#
Depending on the complexity of your setup, you’ll have different experiences with “out of the box” distributions of JupyterHub. If all of the resources you need will fit on a single VM, then The Littlest JupyterHub should get you up-and-running within a half day or so. For more complex setups, such as scalable Kubernetes clusters or access to high-performance computing and data, it will require more time and expertise with the technologies your JupyterHub will use (e.g., dev-ops knowledge with cloud computing).
In general, the base JupyterHub deployment is not the bottleneck for setup; it is connecting your JupyterHub with the various services and tools that you wish to provide to your users.
How well does JupyterHub scale? What are JupyterHub’s limitations?#
JupyterHub works well at both a small scale (e.g., a single VM or machine) as well as a high scale (e.g., a scalable Kubernetes cluster). It can be used for teams as small as 2, and for user bases as large as 10,000. The scalability of JupyterHub largely depends on the infrastructure on which it is deployed. JupyterHub has been designed to be lightweight and flexible, so you can tailor your JupyterHub deployment to your needs.
Is JupyterHub resilient? What happens when a machine goes down?#
For JupyterHubs that are deployed in a containerized environment (e.g., Kubernetes), it is possible to configure the JupyterHub to be fairly resistant to failures in the system. For example, if JupyterHub fails, then user sessions will not be affected (though new users will not be able to log in). When a JupyterHub process is restarted, it should seamlessly connect with the user database and the system will return to normal. Again, the details of your JupyterHub deployment (e.g., whether it’s deployed on a scalable cluster) will affect the resiliency of the deployment.
What interfaces does JupyterHub support?#
Out of the box, JupyterHub supports a variety of popular data science interfaces for user sessions, such as JupyterLab, Jupyter Notebooks, and RStudio. Any interface that can be served via a web address can be served with a JupyterHub (with the right setup).
Does JupyterHub make it easier for our team to collaborate?#
JupyterHub provides a standardized environment and access to shared resources for your teams. This greatly reduces the cost associated with sharing analyses and content with other team members, and makes it easier to collaborate and build off of one another’s ideas. Combined with access to high-performance computing and data, JupyterHub provides a common resource to amplify your team’s ability to prototype their analyses, scale them to larger data, and then share their results with one another.
JupyterHub also provides a computational framework to share computational narratives between different levels of an organization. For example, data scientists can share Jupyter Notebooks rendered as Voilà dashboards with those who are not familiar with programming, or create publicly-available interactive analyses to allow others to interact with your work.
Can I use JupyterHub with R/RStudio or other languages and environments?#
Yes, Jupyter is a polyglot project, and there are over 40 community-provided kernels for a variety of languages (the most common being Python, Julia, and R). You can also use a JupyterHub to provide access to other interfaces, such as RStudio, that provide their own access to a language kernel.
Troubleshooting#
When troubleshooting, you may see unexpected behaviors or receive an error message. This section provides links for identifying the cause of the problem and how to resolve it.
Behavior#
JupyterHub proxy fails to start#
If you have tried to start the JupyterHub proxy and it fails to start:
check if the JupyterHub IP configuration setting is c.JupyterHub.ip = '*'; if it is, try c.JupyterHub.ip = ''
try starting with jupyterhub --ip=0.0.0.0
Note: If this occurs on Ubuntu/Debian, check that you are using a recent version of Node. Some versions of Ubuntu/Debian come with a very old version of Node and it is necessary to update Node.
sudospawner fails to run#
If the sudospawner script is not found in the path, sudospawner will not run. To avoid this, specify sudospawner’s absolute path. For example, start jupyterhub with:
jupyterhub --SudoSpawner.sudospawner_path='/absolute/path/to/sudospawner'
or add:
c.SudoSpawner.sudospawner_path = '/absolute/path/to/sudospawner'
to the config file, jupyterhub_config.py
.
What is the default behavior when none of the lists (admin, allowed, allowed groups) are set?#
When nothing is given for these lists, there will be no admins, and all users who can authenticate on the system (i.e. all the Unix users on the server with a password) will be allowed to start a server. The allowed username set lets you limit this to a particular set of users, and admin_users lets you specify who among them may use the admin interface (not necessary, unless you need to do things like inspect other users’ servers or modify the user list at runtime).
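For example, to limit logins to a specific set of users and make one of them an admin, jupyterhub_config.py might contain the following (the usernames are placeholders):

```python
# jupyterhub_config.py -- usernames are placeholders
c.Authenticator.allowed_users = {"alice", "bob"}
c.Authenticator.admin_users = {"alice"}
```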
JupyterHub Docker container is not accessible at localhost#
Even though the command to start your Docker container exposes port 8000
(docker run -p 8000:8000 -d --name jupyterhub quay.io/jupyterhub/jupyterhub jupyterhub
),
it is possible that the IP address itself is not accessible/visible. As a result,
when you try http://localhost:8000 in your browser, you are unable to connect
even though the container is running properly. One workaround is to explicitly
tell JupyterHub to start at 0.0.0.0
which is visible to everyone. Try this
command:
docker run -p 8000:8000 -d --name jupyterhub quay.io/jupyterhub/jupyterhub jupyterhub --ip 0.0.0.0 --port 8000
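Equivalently, the --ip and --port flags can live in jupyterhub_config.py instead of the command line:

```python
# jupyterhub_config.py -- listen on all interfaces, port 8000
c.JupyterHub.ip = "0.0.0.0"
c.JupyterHub.port = 8000
```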
How can I kill ports from JupyterHub-managed services that have been orphaned?#
I started JupyterHub + nbgrader on the same host without containers. When I try to restart JupyterHub + nbgrader with this configuration, errors appear that the service accounts cannot start because the ports are being used.
How can I kill the processes that are using these ports?
Run the following command:
sudo kill -9 $(sudo lsof -t -i:<service_port>)
Where <service_port> is the port used by the nbgrader course service. This configuration is specified in jupyterhub_config.py.
Why am I getting a Spawn failed error message?#
After successfully logging in to JupyterHub with a compatible authenticator, I get a ‘Spawn failed’ error message in the browser. The JupyterHub logs contain jupyterhub KeyError: "getpwnam(): name not found: <my_user_name>".
This issue occurs when the authenticator requires a local system user to exist. In these cases, you need to use a spawner that does not require an existing system user account, such as DockerSpawner or KubeSpawner.
How can I run JupyterHub with sudo but use my current environment variables and virtualenv location?#
When launching JupyterHub with sudo jupyterhub, I get import errors and my environment variables don’t work.
When launching services with sudo, the shell won’t have the same environment variables or PATH in place. The most direct way to solve this issue is to use the full path to your Python environment and pass the environment variables explicitly. For example:
sudo MY_ENV=abc123 \
/home/foo/venv/bin/python3 \
/srv/jupyterhub/jupyterhub
Errors#
Error 500 after spawning my single-user server#
You receive a 500 error while accessing the URL /user/<your_name>/...
This is often seen when your single-user server cannot verify your user cookie
with the Hub.
There are two likely reasons for this:
The single-user server cannot connect to the Hub’s API (networking configuration problems)
The single-user server cannot authenticate its requests (invalid token)
Symptoms#
The main symptom is a failure to load any page served by the single-user
server, met with a 500 error. This is typically the first page at /user/<your_name>
after logging in or clicking “Start my server”. When a single-user notebook server
receives a request, the notebook server makes an API request to the Hub to
check if the cookie corresponds to the right user. This request is logged.
If everything is working, the response logged will be similar to this:
200 GET /hub/api/authorizations/cookie/jupyterhub-token-name/[secret] (@10.0.1.4) 6.10ms
You should see a similar 200 message, as above, in the Hub log when you first visit your single-user notebook server. If you don’t see this message in the log, it may mean that your single-user notebook server is not connecting to your Hub.
If you see 403 (forbidden) like this, it is likely a token problem:
403 GET /hub/api/authorizations/cookie/jupyterhub-token-name/[secret] (@10.0.1.4) 4.14ms
Check the logs of the single-user notebook server, which may have more detailed information on the cause.
Causes and resolutions#
Proxy settings (403 GET)#
When your whole JupyterHub sits behind an organization proxy (not a reverse proxy like NGINX as part of your setup, and not the configurable-http-proxy), the environment variables HTTP_PROXY, HTTPS_PROXY, http_proxy, and https_proxy might be set. This confuses the JupyterHub single-user servers: when connecting to the Hub for authorization, they connect via the proxy instead of directly to the Hub on localhost. The proxy may deny the request (403 GET), which leads the single-user server to conclude it has the wrong auth token. To circumvent this, add <hub_url>,<hub_ip>,localhost,127.0.0.1 to the environment variables NO_PROXY and no_proxy.
Launching Jupyter Notebooks to run as an externally managed JupyterHub service with the jupyterhub-singleuser command returns a JUPYTERHUB_API_TOKEN error#
Services allow processes to interact with JupyterHub’s REST API. Example use-cases include:
Secure Testing: provide a canonical Jupyter Notebook for testing production data to reduce the number of entry points into production systems.
Grading Assignments: provide access to shared Jupyter Notebooks that may be used for management tasks such as grading assignments.
Private Dashboards: share dashboards with certain group members.
If possible, try to run the Jupyter Notebook as an externally managed service with one of the provided jupyter/docker-stacks.
Standard JupyterHub installations include a jupyterhub-singleuser command, which is built from the jupyterhub.singleuser:main method. The jupyterhub-singleuser command is the default command when JupyterHub launches single-user Jupyter Notebooks. One of the goals of this command is to make sure the version of JupyterHub installed within the Jupyter Notebook environment coincides with the version of the JupyterHub server itself.
If you launch a Jupyter Notebook with the jupyterhub-singleuser command directly from the command line, the Jupyter Notebook won’t have access to the JUPYTERHUB_API_TOKEN and will return:
JUPYTERHUB_API_TOKEN env is required to run jupyterhub-singleuser.
Did you launch it manually?
If you plan on testing jupyterhub-singleuser independently from JupyterHub, then you can set the API token environment variable. For example, if you were to run the single-user Jupyter Notebook on the host, then:
export JUPYTERHUB_API_TOKEN=my_secret_token
jupyterhub-singleuser
With a docker container, pass in the environment variable with the run command:
docker run -d \
-p 8888:8888 \
-e JUPYTERHUB_API_TOKEN=my_secret_token \
jupyter/datascience-notebook:latest
This example demonstrates how to pass the jupyterhub-singleuser environment variables when launching a Notebook as an externally managed service.
How do I…?#
Use a chained SSL certificate#
Some certificate providers, i.e. Entrust, may provide you with a chained certificate that contains multiple files. If you are using a chained certificate you will need to concatenate the individual files by appending the chained cert and root cert to your host cert:
cat your_host.crt chain.crt root.crt > your_host-chained.crt
You would then set the ssl_key and ssl_cert in your jupyterhub_config.py file as follows:
c.JupyterHub.ssl_cert = 'your_host-chained.crt'
c.JupyterHub.ssl_key = 'your_host.key'
Example#
Your certificate provider gives you the following files: example_host.crt, Entrust_L1Kroot.txt, and Entrust_Root.txt.
Concatenate the files appending the chain cert and root cert to your host cert:
cat example_host.crt Entrust_L1Kroot.txt Entrust_Root.txt > example_host-chained.crt
You would then use the example_host-chained.crt as the value for JupyterHub’s ssl_cert. You may pass this value as a command line option when starting JupyterHub, or more conveniently set the ssl_cert variable in JupyterHub’s configuration file, jupyterhub_config.py. In jupyterhub_config.py, set:
c.JupyterHub.ssl_cert = '/path/to/example_host-chained.crt'
c.JupyterHub.ssl_key = '/path/to/example_host.key'
where ssl_cert points to example_host-chained.crt and ssl_key points to your private key.
Then restart JupyterHub.
See also Enabling SSL encryption.
Install JupyterHub without a network connection#
Both conda and pip can be used without a network connection. You can make your own repository (directory) of conda packages and/or wheels, and then install from there instead of the internet.
For instance, you can install JupyterHub with pip and configurable-http-proxy with npmbox:
python3 -m pip wheel jupyterhub
npmbox configurable-http-proxy
I want access to the whole filesystem and still default users to their home directory#
Setting the following in jupyterhub_config.py
will configure access to
the entire filesystem and set the default to the user’s home directory.
c.Spawner.notebook_dir = '/'
c.Spawner.default_url = '/home/%U' # %U will be replaced with the username
How do I increase the number of pySpark executors on YARN?#
From the command line, pySpark executors can be configured using a command similar to this one:
pyspark --total-executor-cores 2 --executor-memory 1G
Cloudera documentation for configuring spark on YARN applications provides additional information. The pySpark configuration documentation is also helpful for programmatic configuration examples.
How do I use JupyterLab’s pre-release version with JupyterHub?#
While JupyterLab is still under active development, we have had users ask about how to try out JupyterLab with JupyterHub.
You need to install and enable the JupyterLab extension system-wide, then you can change the default URL to /lab.
For instance:
python3 -m pip install jupyterlab
jupyter serverextension enable --py jupyterlab --sys-prefix
The important thing is that JupyterLab is installed and enabled in the single-user notebook server environment. For system users, this means system-wide, as indicated above. For Docker containers, it means inside the single-user docker image, etc.
In jupyterhub_config.py
, configure the Spawner to tell the single-user
notebook servers to default to JupyterLab:
c.Spawner.default_url = '/lab'
How do I set up JupyterHub for a workshop (when users are not known ahead of time)?#
Set up JupyterHub using OAuthenticator for GitHub authentication
Configure the admin list to have workshop leaders listed with administrator privileges.
Users will need a GitHub account to log in and be authenticated by the Hub.
How do I set up rotating daily logs?#
You can do this with logrotate, or pipe to logger to use Syslog instead of writing directly to a file.
For example, with this logrotate config file:
/var/log/jupyterhub.log {
copytruncate
daily
}
and run this daily by putting a script in /etc/cron.daily/:
logrotate /path/to/above-config
Or use syslog:
jupyterhub | logger -t jupyterhub
Toree integration with HDFS rack awareness script#
The Apache Toree kernel will have an issue when running with JupyterHub if the standard HDFS rack awareness script is used. This will materialize in the logs as a repeated WARN:
16/11/29 16:24:20 WARN ScriptBasedMapping: Exception running /etc/hadoop/conf/topology_script.py some.ip.address
ExitCodeException exitCode=1: File "/etc/hadoop/conf/topology_script.py", line 63
print rack
^
SyntaxError: Missing parentheses in call to 'print'
at `org.apache.hadoop.util.Shell.runCommand(Shell.java:576)`
In order to resolve this issue, there are two potential options.
Update HDFS core-site.xml so the parameter “net.topology.script.file.name” points to a custom script (e.g. /etc/hadoop/conf/custom_topology_script.py). Copy the original script and change the first line to point to a Python 2 installation (e.g. /usr/bin/python).
In spark-env.sh add a Python 2 installation to your path (e.g. export PATH=/opt/anaconda2/bin:$PATH).
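For context, the WARN above is a Python 2 script failing under a Python 3 interpreter; the offending construct and its version-agnostic form look like this (the rack label below is illustrative):

```python
# Python 2 statement syntax, a SyntaxError under Python 3:
#     print rack
# The function-call form works on both Python 2.6+ and Python 3:
rack = "/default-rack"  # illustrative rack label
print(rack)
```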
How can I view the logs for JupyterHub or the user’s Notebook servers when using the DockerSpawner?#
Use docker logs <container> where <container> is the container name defined within docker-compose.yml. For example, to view the logs of the JupyterHub container use:
docker logs hub
By default, the user’s notebook server is named jupyter-<username> where username is the user’s username within JupyterHub’s database. So if you wanted to see the logs for user foo you would use:
docker logs jupyter-foo
You can also tail logs to view them in real-time using the -f option:
docker logs -f hub
Troubleshooting commands#
The following commands provide additional detail about installed packages, versions, and system information that may be helpful when troubleshooting a JupyterHub deployment. The commands are:
System and deployment information
jupyter troubleshoot
Kernel information
jupyter kernelspec list
Debug logs when running JupyterHub
jupyterhub --debug
Contributing#
JupyterHub welcomes all contributors, whether you are new to the project or know your way around. The Contributing section provides information on how you can make your contributions.
Contributing#
We want you to contribute to JupyterHub in ways that are most exciting and useful to you. We value documentation, testing, bug reporting & code equally, and are glad to have your contributions in whatever form you wish.
Be sure to first check our Code of Conduct (reporting guidelines), which helps keep our community welcoming to as many people as possible.
This section covers information about our community, as well as ways that you can connect and get involved.
Contributors#
Project Jupyter thanks the following people for their help and contribution on JupyterHub:
adelcast
Analect
anderbubble
anikitml
ankitksharma
apetresc
athornton
barrachri
BerserkerTroll
betatim
Carreau
cfournie
charnpreetsingh
chicovenancio
cikao
ckald
cmoscardi
consideRatio
cqzlxl
CRegenschein
cwaldbieser
danielballen
danoventa
daradib
darky2004
datapolitan
dblockow-d2dcrc
DeepHorizons
DerekHeldtWerle
dhirschfeld
dietmarw
dingc3
dmartzol
DominicFollettSmith
dsblank
dtaniwaki
echarles
ellisonbg
emmanuel
evanlinde
Fokko
fperez
franga2000
GladysNalvarte
glenak1911
gweis
iamed18
jamescurtin
JamiesHQ
JasonJWilliamsNY
jbweston
jdavidheiser
jencabral
jhamrick
jkinkead
johnkpark
josephtate
jzf2101
karfai
kinuax
KrishnaPG
kroq-gar78
ksolan
mbmilligan
mgeplf
minrk
mistercrunch
Mistobaan
mpacer
mwmarkland
ndly
nthiery
nxg
ObiWahn
ozancaglayan
paccorsi
parente
PeterDaveHello
peterruppel
phill84
pjamason
prasadkatti
rafael-ladislau
rcthomas
rgbkrk
rkdarst
robnagler
rschroll
ryanlovett
sangramga
Scrypy
schon
shreddd
Siecje
smiller5678
spoorthyv
ssanderson
summerswallow
syutbai
takluyver
temogen
ThomasMChen
Thoralf Gutierrez
timfreund
TimShawver
tklever
Todd-Z-Li
toobaz
tsaeger
tschaume
vilhelmen
whitead
willingc
YannBrrd
yuvipanda
zoltan-fedor
zonca
Neeraj Natu
Community communication channels#
We use different channels of communication for different purposes. Whichever one you use will depend on what kind of communication you want to engage in.
Discourse (recommended)#
We use Discourse for online discussions and support questions. You can ask questions here if you are a first-time contributor to the JupyterHub project. Everyone in the Jupyter community is welcome to bring ideas and questions there.
We recommend that you first use our Discourse as all past and current discussions on it are archived and searchable. Thus, all discussions remain useful and accessible to the whole community.
Gitter#
We use our Gitter channel for online, real-time text chat; a place for more ephemeral discussions. When you’re not on Discourse, you can stop here to have other discussions on the fly.
GitHub Issues#
GitHub issues are used for most long-form project discussions, bug reports, and feature requests.
Issues related to a specific authenticator or spawner should be opened in the appropriate repository for the authenticator or spawner.
If you are using a specific JupyterHub distribution (such as Zero to JupyterHub on Kubernetes or The Littlest JupyterHub), you should open issues directly in their repository.
If you cannot find a repository to open your issue in, do not worry! Open the issue in the main JupyterHub repository and our community will help you figure it out.
Note
Our community is distributed across the world in various timezones, so please be patient if you do not get a response immediately!
Setting up a development install#
System requirements#
JupyterHub can only run on macOS or Linux operating systems. If you are using Windows, we recommend using VirtualBox or a similar system to run Ubuntu Linux for development.
Install Python#
JupyterHub is written in the Python programming language and requires you have at least version 3.8 installed locally. If you haven’t installed Python before, the recommended way to install it is to use Miniforge.
Install nodejs#
NodeJS 12+ is required for building some JavaScript components. configurable-http-proxy, the default proxy implementation for JupyterHub, is written in JavaScript. If you have not installed NodeJS before, we recommend installing it in the miniconda environment you set up for Python. You can do so with conda install nodejs. Many in the Jupyter community use nvm to manage node dependencies.
Install git#
JupyterHub uses Git & GitHub for development & collaboration. You need to install git to work on JupyterHub. We also recommend getting a free account on GitHub.com.
Setting up a development install#
When developing JupyterHub, you need to make changes and instantly view the results. To achieve that, a developer install is required.
Note
This guide does not attempt to dictate how development environments should be isolated, since that is a personal preference and can be achieved in many ways, for example tox, conda, docker, etc. See this forum thread for a more detailed discussion.
Clone the JupyterHub git repository to your computer.
git clone https://github.com/jupyterhub/jupyterhub
cd jupyterhub
Make sure the python you installed and the npm you installed are available to you on the command line.
python -V
This should return a version number greater than or equal to 3.8.
npm -v
This should return a version number greater than or equal to 5.0.
Install configurable-http-proxy (required to run and test the default JupyterHub configuration):
npm install -g configurable-http-proxy
If you get an error that says Error: EACCES: permission denied, you might need to prefix the command with sudo. sudo may be required to perform a system-wide install. If you do not have access to sudo, you may instead run the following commands:
npm install configurable-http-proxy
export PATH=$PATH:$(pwd)/node_modules/.bin
The second line needs to be run every time you open a new terminal.
If you are using conda you can instead run:
conda install configurable-http-proxy
Install an editable version of JupyterHub and its requirements for development and testing. This lets you edit JupyterHub code in a text editor & restart the JupyterHub process to see your code changes immediately.
python3 -m pip install --editable ".[test]"
You are now ready to start JupyterHub!
jupyterhub
You can now access JupyterHub from your browser at http://localhost:8000.
Happy developing!
Using DummyAuthenticator & SimpleLocalProcessSpawner#
To simplify testing of JupyterHub, it is helpful to use DummyAuthenticator instead of the default JupyterHub authenticator, and SimpleLocalProcessSpawner instead of the default spawner. There is a sample configuration file that does this in testing/jupyterhub_config.py. To launch JupyterHub with this configuration:
jupyterhub -f testing/jupyterhub_config.py
The test configuration enables a few things to make testing easier:
use ‘dummy’ authentication and ‘simple’ spawner
named servers are enabled
listen only on localhost
‘admin’ is an admin user, if you want to test the admin page
disable caching of static files
The default JupyterHub authenticator & spawner require your system to have user accounts for each user you want to log in to JupyterHub as.
DummyAuthenticator allows you to log in with any username & password, while SimpleLocalProcessSpawner allows you to start servers without having to create a Unix user for each JupyterHub user. Together, these make it much easier to test JupyterHub.
Tip: If you are working on parts of JupyterHub that are common to all authenticators & spawners, we recommend using both DummyAuthenticator & SimpleLocalProcessSpawner. If you are working on just authenticator-related parts, use only SimpleLocalProcessSpawner. Similarly, if you are working on just spawner-related parts, use only DummyAuthenticator.
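If you want the same behavior in your own configuration file rather than the bundled testing one, a minimal sketch is:

```python
# jupyterhub_config.py -- minimal testing setup, not for production:
# any username/password pair authenticates, and servers run as local
# processes without requiring per-user Unix accounts.
c.JupyterHub.authenticator_class = 'dummy'
c.JupyterHub.spawner_class = 'simple'
```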
Building frontend components#
The testing configuration file also disables caching of static files, which allows you to edit and rebuild these files without restarting JupyterHub.
If you are working on the admin react page, which is in the jsx
directory, you can run:
cd jsx
npm install
npm run build:watch
to continuously rebuild the admin page, requiring only a refresh of the page.
If you are working on the frontend SCSS files, you can run the same build:watch command in the top level directory of the repo:
npm install
npm run build:watch
Troubleshooting#
This section lists common ways setting up your development environment may fail, and how to fix them. Please add to the list if you encounter yet another way it can fail!
lessc not found#
If the python3 -m pip install --editable . command fails and complains about lessc being unavailable, you may need to explicitly install some additional JavaScript dependencies:
npm install
This will fetch client-side JavaScript dependencies necessary to compile CSS.
You may also need to manually update JavaScript and CSS after some development updates, with:
python3 setup.py js # fetch updated client-side js
python3 setup.py css # recompile CSS from LESS sources
python3 setup.py jsx # build React admin app
Failed to bind XXX to http://127.0.0.1:<port>/<path>#
This error can happen when there’s already an application or a service using this port.
Use the following command to find out which service is using this port.
lsof -P -i TCP:<port> -sTCP:LISTEN
If nothing shows up, it likely means the port is used by a system service that your current user cannot list. Re-run the same command with sudo:
sudo lsof -P -i TCP:<port> -sTCP:LISTEN
Depending on the result of the above commands, the simplest solution is to configure JupyterHub to use a different port for the service that is failing.
As an example, the following is a frequently seen issue:
Failed to bind hub to http://127.0.0.1:8081/hub/
Using the procedure described above, start with:
lsof -P -i TCP:8081 -sTCP:LISTEN
and if nothing shows up:
sudo lsof -P -i TCP:8081 -sTCP:LISTEN
Finally, depending on your findings, you can apply the following change and start JupyterHub again:
c.JupyterHub.hub_port = 9081 # Or any other free port
Contributing Documentation#
Documentation is often more important than code. This page helps you get set up on how to contribute to JupyterHub’s documentation.
Building documentation locally#
We use sphinx to build our documentation. It takes
our documentation source files (written in markdown or reStructuredText &
stored under the docs/source
directory) and converts them into various
formats for people to read. To make sure the documentation you write or
change renders correctly, it is good practice to test it locally.
Make sure you have successfully completed Setting up a development install.
Install the packages required to build the docs.
python3 -m pip install -r docs/requirements.txt
Build the html version of the docs. This is the most commonly used output format, so verifying it renders correctly is usually good enough.
cd docs
make html
This step will display any syntax or formatting errors in the documentation, along with the filename / line number in which they occurred. Fix them, and re-run the make html command to re-render the documentation.
View the rendered documentation by opening _build/html/index.html in a web browser.
Tip
On Windows, you can open a file from the terminal with start <path-to-file>. On macOS, you can do the same with open <path-to-file>. On Linux, you can do the same with xdg-open <path-to-file>.
After opening index.html in your browser, you can just refresh the page whenever you rebuild the docs via make html.
Documentation conventions#
This section lists various conventions we use in our documentation. This is a living document that grows over time, so feel free to add to it / change it!
Our documentation does not yet fully conform to these conventions, so help in making it so would be appreciated!
pip
invocation#
There are many ways to invoke a pip command; we recommend the following approach:
python3 -m pip
This invokes pip explicitly using the python3 binary that you are currently using. This is the recommended way to invoke pip in our documentation, since it is least likely to cause problems with python3 and pip being from different environments.
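You can see the connection from Python itself: python -m pip runs pip under sys.executable, so installed packages land in that interpreter’s environment. A small check (assuming pip is available, as it normally is):

```python
import subprocess
import sys

# "python -m pip" ties pip to this exact interpreter; sys.executable is
# the interpreter whose environment receives the packages.
result = subprocess.run(
    [sys.executable, "-m", "pip", "--version"],
    capture_output=True,
    text=True,
)
print(sys.executable)
print(result.stdout.strip())
```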
For more information on how to invoke pip commands, see the pip documentation.
Testing JupyterHub and linting code#
Unit tests help to validate that JupyterHub works the way we think it does, and continues to do so when changes occur. They also help communicate precisely what we expect our code to do.
JupyterHub uses pytest for all the tests. You can find them under the jupyterhub/tests directory in the git repository.
Running the tests#
Make sure you have completed Setting up a development install. Once you are done, you should be able to run jupyterhub from the command line and access it from your web browser. This ensures that the dev environment is properly set up for tests to run.
You can run all tests in JupyterHub:
pytest -v jupyterhub/tests
This should display progress as it runs all the tests, printing information about any test failures as they occur.
If you wish to confirm test coverage, run the tests with the --cov flag:
pytest -v --cov=jupyterhub jupyterhub/tests
You can also run tests in just a specific file:
pytest -v jupyterhub/tests/<test-file-name>
To run a specific test only, you can do:
pytest -v jupyterhub/tests/<test-file-name>::<test-name>
This runs the test with function name <test-name> defined in <test-file-name>. This is very useful when you are iteratively developing a single test.
For example, to run the test test_shutdown in the file test_api.py, you would run:
pytest -v jupyterhub/tests/test_api.py::test_shutdown
For more details, refer to the pytest usage documentation.
Test organisation#
The tests live in jupyterhub/tests and are organized roughly into:
test_api.py tests the REST API
test_pages.py tests loading the HTML pages
and other collections of tests for different components. When writing a new test, there should usually be a test of similar functionality already written and related tests should be added nearby.
The fixtures live in jupyterhub/tests/conftest.py. There are fixtures that can be used for JupyterHub components, such as:
app: an instance of JupyterHub with mocked parts
auth_state_enabled: enables persisting auth_state (like authentication tokens)
db: a sqlite in-memory DB session
io_loop: a Tornado event loop
event_loop: a new asyncio event loop
user: creates a new temporary user
admin_user: creates a new temporary admin user
single user servers:
cleanup_after: allows cleanup of single user servers between tests
mocked services:
MockServiceSpawner: a spawner that mocks services for testing with a short poll interval
mockservice: mocked service with no external service url
mockservice_url: mocked service with a url to test external services
And fixtures to add functionality or spawning behavior:
admin_access: grants admin access
no_patience: sets slow-spawning timeouts to zero
slow_spawn: enables the SlowSpawner (a spawner that takes a few seconds to start)
never_spawn: enables the NeverSpawner (a spawner that will never start)
bad_spawn: enables the BadSpawner (a spawner that fails immediately)
slow_bad_spawn: enables the SlowBadSpawner (a spawner that fails after a short delay)
Refer to the pytest fixtures documentation to learn how to use fixtures that exist already and to create new ones.
The Pytest-Asyncio Plugin#
When testing the various JupyterHub components and their various implementations, it sometimes becomes necessary to have a running instance of JupyterHub to test against.
The app
fixture mocks a JupyterHub application for use in testing by:
enabling ssl if internal certificates are available
creating an instance of MockHub using any provided configurations as arguments
initializing the mocked instance
starting the mocked instance
finally, a registered finalizer function performs a cleanup and stops the mocked instance
The JupyterHub test suite uses the pytest-asyncio plugin that handles event-loop integration in Tornado applications. This allows for the use of top-level awaits when calling async functions or fixtures during testing. All test functions and fixtures labelled as async
will run on the same event loop.
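As a minimal sketch of what this enables (the names here are illustrative, not taken from the real test suite), an async test can await coroutines directly; pytest-asyncio drives such tests on the shared event loop, which is emulated below with asyncio.run:

```python
import asyncio

async def fetch_status():
    # Stand-in for an async call to the Hub API.
    await asyncio.sleep(0)
    return 200

async def test_status_ok():
    # Under pytest-asyncio, an async test function may await
    # coroutines at the top level like this.
    assert await fetch_status() == 200

# pytest-asyncio would run this on its event loop; run it directly here.
asyncio.run(test_status_ok())
```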
Note
With the introduction of top-level awaits, the use of the io_loop fixture of the pytest-tornado plugin is no longer necessary. It was initially used to call coroutines. With the upgrades made to pytest-asyncio, this usage is now deprecated. It is now only utilized within the JupyterHub test suite to ensure complete cleanup of resources used during testing, such as open file descriptors. This is demonstrated in this pull request.
More information is provided below.
One of the general goals of the JupyterHub Pytest Plugin project is to ensure the MockHub cleanup fully closes and stops all utilized resources during testing so the use of the io_loop
fixture for teardown is not necessary. This was highlighted in this issue.
For more information on asyncio and event-loops, here are some resources:
Troubleshooting Test Failures#
All the tests are failing#
Make sure you have completed all the steps in Setting up a development install successfully, and are able to access JupyterHub from your browser at http://localhost:8000 after starting jupyterhub
in your command line.
Code formatting and linting#
JupyterHub automatically enforces code formatting. This means that pull requests with changes breaking this formatting will receive a commit from pre-commit.ci automatically.
To automatically format code locally, you can install pre-commit and register a git hook to automatically check with pre-commit before you make a commit if the formatting is okay.
pip install pre-commit
pre-commit install --install-hooks
To run pre-commit manually you would do:
# check for changes to code not yet committed
pre-commit run
# check for changes also in already committed code
pre-commit run --all-files
You may also install black integration into your text editor to format code automatically.
The JupyterHub roadmap#
This roadmap collects “next steps” for JupyterHub. It is about creating a shared understanding of the project’s vision and direction amongst the community of users, contributors, and maintainers. The goal is to communicate priorities and upcoming release plans. It is not aimed at limiting contributions to what is listed here.
Using the roadmap#
What do we mean by “next step”?#
When submitting an issue, think about what “next step” category best describes your issue:
now, concrete/actionable step that is ready for someone to start work on. These might be items that have a link to an issue or more abstract like “decrease typos and dead links in the documentation”
soon, less concrete/actionable step that is going to happen soon, discussions around the topic are coming close to an end at which point it can move into the “now” category
later, abstract ideas or tasks, need a lot of discussion or experimentation to shape the idea so that it can be executed. Can also contain concrete/actionable steps that have been postponed on purpose (these are steps that could be in “now” but the decision was taken to work on them later)
Reviewing and Updating the Roadmap#
The roadmap will get updated as time passes (next review by 1st December) based on discussions and ideas captured as issues. This means the list is not exhaustive; it only represents the “top of the stack” of ideas. It should not function as a wish list, a collection of feature requests, or a todo list. For those, please create a new issue.
The roadmap should give the reader an idea of what is happening next, what needs input and discussion before it can happen and what has been postponed.
The roadmap proper#
Project vision#
JupyterHub is a dependable tool used by humans that reduces the complexity of creating the environment in which a piece of software can be executed.
Now#
These “Now” items are considered active areas of focus for the project:
HubShare - a sharing service for use with JupyterHub.
Users should be able to:
Push a project to other users.
Get a checkout of a project from other users.
Push updates to a published project.
Pull updates from a published project.
Manage conflicts/merges by simply picking a version (ours/theirs).
Get a checkout of a project from the internet. These steps are completely different from saving notebooks/files.
Have directories that are managed by git completely separately from our stuff.
Look at pushed content that they have access to without an explicit pull.
Define and manage teams of users.
Adding/removing a user to/from a team gives/removes them access to all projects that team has access to.
Build other services, such as static HTML publishing and dashboarding on top of these things.
Soon#
These “Soon” items are under discussion. Once an item reaches the point of an actionable plan, the item will be moved to the “Now” section. Typically, these will be moved at a future review of the roadmap.
resource monitoring and management:
(prometheus?) API for resource monitoring
tracking activity on single-user servers instead of the proxy
notes and activity tracking per API token
Later#
The “Later” items are things that are at the back of the project’s mind. At this time there is no active plan for an item. The project would like to find the resources and time to discuss these ideas.
real-time collaboration
Enter into real-time collaboration mode for a project that starts a shared execution context.
Once the single-user notebook package supports realtime collaboration, implement sharing mechanism integrated into the Hub.
Reporting security issues in Jupyter or JupyterHub#
If you find a security vulnerability in Jupyter or JupyterHub, whether it is a failure of the security model described in Security Overview or a failure in implementation, please report it to security@ipython.org.
If you prefer to encrypt your security reports, you can use this PGP public key.
Questions? Suggestions?#
All questions and suggestions are welcome. Please feel free to use our Jupyter Discourse Forum to contact our team.
Looking forward to hearing from you!