Set up Azure AD authentication for JupyterHub in Kubernetes
In the world of data science and machine learning, JupyterHub has become an indispensable tool for creating and managing Jupyter notebooks in a collaborative and scalable manner. Kubernetes, on the other hand, is the go-to platform for orchestrating containerized applications. Combining these two technologies can provide a powerful environment for data scientists and researchers. In this tutorial, we will walk you through the process of setting up Azure Active Directory (Azure AD) authentication for JupyterHub running on Kubernetes.
Prerequisites
Before we dive into the setup process, make sure you have the following prerequisites in place:
- Azure Account: You will need an Azure account to create and configure Azure AD.
- Kubernetes Cluster: A Kubernetes cluster where you plan to deploy JupyterHub. You can use a managed Kubernetes service such as Azure Kubernetes Service (AKS) or Amazon Elastic Kubernetes Service (EKS).
- Helm: Helm is a package manager for Kubernetes. Install Helm if you haven’t already.
Now, let’s proceed with the setup.
Step 1: Create an Azure AD Application
- Log in to your Azure portal.
- Navigate to the Microsoft Entra ID section (Microsoft Entra ID is the current name for Azure AD).
- Select App registrations and click New registration.
- Provide a name for your application, choose the appropriate Supported account types, and specify the Redirect URI (e.g., https://your-jupyterhub-domain.com/hub/oauth_callback).
- After registration, select Certificates & secrets, open Client secrets, click New client secret, and fill in the details:
  - Description: A name for your client secret
  - Expires: How long the secret stays valid
  Once the secret is created, copy its value right away, because the portal does not show it again later; you’ll need it in Step 2.
- Select Overview and note down the Application (client) ID and the Directory (tenant) ID; you’ll need these later.
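The values collected in this step are easy to mistype when copied by hand. As a minimal sketch, the following Python checks them before they go into the configuration. The helper names `build_callback_url` and `looks_like_guid` are hypothetical (they are not part of JupyterHub or Azure), and the sketch assumes the hub is served over HTTPS at the root of your domain, as in the example Redirect URI.

```python
import uuid


def build_callback_url(domain: str) -> str:
    """Return the redirect URI for a hub served at the root of `domain`."""
    # Drop an accidental scheme or trailing slash so the result is predictable.
    if domain.startswith("https://"):
        domain = domain[len("https://"):]
    return f"https://{domain.rstrip('/')}/hub/oauth_callback"


def looks_like_guid(value: str) -> bool:
    """Application (client) IDs and Directory (tenant) IDs are GUIDs."""
    try:
        # Require the canonical hyphenated form, not just any parseable UUID.
        return str(uuid.UUID(value)) == value.lower()
    except ValueError:
        return False


print(build_callback_url("your-jupyterhub-domain.com"))
# https://your-jupyterhub-domain.com/hub/oauth_callback
```

The Redirect URI registered in Azure has to match the callback URL JupyterHub sends, so deriving both from one function is one way to keep them in step.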
Step 2: Configure Authentication in JupyterHub
- Create a values.yaml file for JupyterHub’s configuration. In this file, you will configure Azure AD as an authenticator as below:
hub:
  config:
    ################## Auth for AAD ##############
    JupyterHub:
      authenticator_class: azuread
    Authenticator:
      enable_auth_state: true
    AzureAdOAuthenticator:
      client_id: "your-application-client-id"
      client_secret: "your-application-client-secret"
      oauth_callback_url: "https://your-jupyterhub-domain.com/hub/oauth_callback"
      tenant_id: "your-application-tenant-id"
      username_claim: unique_name
      allow_all: true
    ################## Auth for AAD ##############
Replace “your-application-client-id”, “your-application-client-secret”, “your-application-tenant-id”, and “your-jupyterhub-domain.com” with the values from Step 1.
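If you would rather not edit the placeholders by hand, the substitution can be scripted. The sketch below is one possible approach, not part of the Helm chart: the function names and the `AAD_*` and `JUPYTERHUB_DOMAIN` environment variable names are assumptions made for illustration. It uses `json.dumps` to quote each value, since a JSON string is also a valid double-quoted YAML string.

```python
import json
import os

# Mirrors the configuration above; only the four placeholders vary.
VALUES_TEMPLATE = """\
hub:
  config:
    JupyterHub:
      authenticator_class: azuread
    Authenticator:
      enable_auth_state: true
    AzureAdOAuthenticator:
      client_id: {client_id}
      client_secret: {client_secret}
      oauth_callback_url: {callback_url}
      tenant_id: {tenant_id}
      username_claim: unique_name
      allow_all: true
"""


def render_values(client_id, client_secret, tenant_id, domain):
    """Fill the placeholders with quoted, escaped strings."""
    return VALUES_TEMPLATE.format(
        client_id=json.dumps(client_id),
        client_secret=json.dumps(client_secret),
        callback_url=json.dumps(f"https://{domain}/hub/oauth_callback"),
        tenant_id=json.dumps(tenant_id),
    )


def write_values_from_env(path="values.yaml"):
    """Read the (hypothetical) environment variables and write the file."""
    text = render_values(
        os.environ["AAD_CLIENT_ID"],
        os.environ["AAD_CLIENT_SECRET"],
        os.environ["AAD_TENANT_ID"],
        os.environ["JUPYTERHUB_DOMAIN"],
    )
    with open(path, "w", encoding="utf-8") as handle:
        handle.write(text)
```

Calling `write_values_from_env()` with the four variables set produces a values.yaml equivalent to the hand-edited one, and keeps the client secret out of your shell history and editor.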
Note:
- In the above config, allow_all: true is required if non-admin users should be able to log in to JupyterHub; otherwise they will get an error when they authenticate with Azure Active Directory (AAD).
- The above config was tested with JupyterHub Helm chart version 3.1.0.
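To make the role of allow_all concrete, here is a simplified Python model of the allow decision, based on my understanding of how recent OAuthenticator releases behave. The function `is_allowed` is hypothetical and written only for illustration; the real authenticator has further options (such as `allow_existing_users`) that this sketch leaves out.

```python
def is_allowed(username, *, allow_all=False, allowed_users=(), admin_users=()):
    """Simplified model: who may log in after Azure AD has authenticated them."""
    if username in admin_users:
        return True  # admins are always let in
    if allow_all:
        return True  # any successfully authenticated user is let in
    return username in allowed_users  # otherwise an explicit allow-list is needed


# With no allow rule, a regular user is rejected even though Azure AD accepted them.
print(is_allowed("alice@example.com"))                  # False
print(is_allowed("alice@example.com", allow_all=True))  # True
```

If opening the hub to every account in the tenant is too broad, listing specific users is the narrower alternative that this model suggests.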
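- Deploy JupyterHub with the updated configuration (the first command adds the JupyterHub Helm chart repository, in case you have not added it yet):
helm repo add jupyterhub https://hub.jupyter.org/helm-chart/ && helm repo update
helm install jupyterhub jupyterhub/jupyterhub -n jupyterhub --create-namespace --version 3.1.0 -f values.yaml
- Verify that the JupyterHub pods are running (the hub and proxy pods should reach the Running state):
kubectl get pods -n jupyterhub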
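Reading the pod table by eye works for a one-off check. For a scripted check, the same information is available as JSON. The sketch below is an illustration under assumptions: the function names are hypothetical, and `fetch_pods` needs kubectl on your PATH with access to the cluster.

```python
import json
import subprocess


def pods_not_ready(pod_list):
    """Return names of pods that are not Running with all containers ready."""
    problems = []
    for pod in pod_list.get("items", []):
        status = pod.get("status", {})
        containers = status.get("containerStatuses", [])
        all_ready = bool(containers) and all(c.get("ready") for c in containers)
        if status.get("phase") != "Running" or not all_ready:
            problems.append(pod["metadata"]["name"])
    return problems


def fetch_pods(namespace="jupyterhub"):
    """Run kubectl and parse its JSON output (requires cluster access)."""
    result = subprocess.run(
        ["kubectl", "get", "pods", "-n", namespace, "-o", "json"],
        check=True,
        capture_output=True,
        text=True,
    )
    return json.loads(result.stdout)
```

`pods_not_ready(fetch_pods())` should return an empty list once the deployment has settled. Note that this simple check would also report pods that have finished successfully, so treat a non-empty result as a prompt to look closer rather than as a failure.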
Step 3: Test Azure AD Authentication
- Access your JupyterHub deployment through a web browser by going to https://your-jupyterhub-domain.com
- You should see Sign in with Azure AD on the login page. Log in with your Azure AD credentials.
- Once authenticated, you should be able to create and manage Jupyter notebooks through JupyterHub.
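The browser test can be complemented by a quick scripted check that the hub hands logins over to Azure AD. This is a rough sketch under assumptions: the function names are hypothetical, the login host assumes the Azure public cloud, and it relies on the hub answering /hub/oauth_login with a redirect to the identity provider.

```python
import http.client
from urllib.parse import urlsplit

AZURE_LOGIN_HOST = "login.microsoftonline.com"  # assumption: Azure public cloud


def points_to_azure(location):
    """True if a redirect target is on the Azure AD login host."""
    return urlsplit(location or "").hostname == AZURE_LOGIN_HOST


def login_redirect(domain):
    """Request the hub's OAuth login URL without following the redirect."""
    conn = http.client.HTTPSConnection(domain, timeout=10)
    try:
        conn.request("GET", "/hub/oauth_login")
        response = conn.getresponse()
        return response.status, response.getheader("Location")
    finally:
        conn.close()
```

For example, `status, location = login_redirect("your-jupyterhub-domain.com")` should yield a redirect status, and `points_to_azure(location)` should be true if the authenticator is wired up as configured in Step 2.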
Conclusion
Setting up Azure AD authentication for JupyterHub in Kubernetes provides an added layer of security and makes it easier to manage user access to your Jupyter notebook environment. With Azure AD, you can integrate JupyterHub into your existing identity and access management systems seamlessly. This tutorial has shown you the steps to configure Azure AD authentication, but remember to follow best practices for securing your JupyterHub deployment and Azure resources.
Happy data science and happy coding!