modules/tutorials/pages/jupyterhub.adoc
+18 -14 (18 additions & 14 deletions)
Original file line number
Diff line number
Diff line change
@@ -8,11 +8,12 @@ The example notebook is used to demonstrate simple read/write interactions with
8
8
9
9
== Keycloak
10
10
11
-
Keycloak is installed using a https://github.com/stackabletech/demos/blob/feat/keycloak-jupyterhub/stacks/jupyterhub-keycloak/keycloak.yaml[Deployment] that loads realm configuration mounted as a ConfigMap.
11
+
Keycloak is installed using a https://github.com/stackabletech/demos/blob/main/stacks/jupyterhub-keycloak/keycloak.yaml[Deployment] that loads realm configuration mounted as a ConfigMap.
12
12
13
+
[#services]
13
14
=== Services
14
15
15
-
In the demo, the keycloak and jupyter hub service (proxy-public) ports are fixed e.g.
16
+
In the demo, the Keycloak and JupyterHub service (`proxy-public`) ports are fixed, e.g.
16
17
17
18
[source,yaml]
18
19
---
@@ -84,7 +85,7 @@ options:
84
85
85
86
=== Discovery
86
87
87
-
As mentioned above, keycloak writes out its endpoint information to a ConfigMap, shown in the code section below.
88
+
As mentioned above in <<services, Services>>, Keycloak writes out its endpoint information to a ConfigMap, shown in the code section below.
88
89
89
90
.Writing the ConfigMap
90
91
[%collapsible]
@@ -202,7 +203,7 @@ For the self-signed certificate to be accepted during the handshake between Jupy
202
203
203
204
=== Realm
204
205
205
-
The Keycloak https://github.com/stackabletech/demos/blob/feat/keycloak-jupyterhub/stacks/jupyterhub-keycloak/keycloak-realm-config.yaml for the demo basically contains a set of users and groups, along with a simple client definition:
206
+
The Keycloak https://github.com/stackabletech/demos/blob/main/stacks/jupyterhub-keycloak/keycloak-realm-config.yaml[realm configuration] for the demo contains a set of users and groups, along with a simple client definition:
206
207
207
208
[source,yaml]
208
209
----
@@ -218,8 +219,8 @@ The Keycloak https://github.com/stackabletech/demos/blob/feat/keycloak-jupyterhu
218
219
} ]
219
220
----
220
221
221
-
Not that the standard flow is enabled and no other OAuth-specific settings are required.
222
-
Wildcards are used for `redirectUris` and `webOrigins`, mainly for the sake of simplicity: in production environments this would typically be limited or filtered in an appropriate way.
222
+
Note that the standard flow is enabled and no other OAuth-specific settings are required.
223
+
Wildcards are used for `redirectUris` and `webOrigins`, mainly for the sake of simplicity: in production environments these would typically be limited or filtered in an appropriate way.
223
224
224
225
== JupyterHub
225
226
@@ -254,7 +255,7 @@ image::jupyterhub/sign-up.png[Create a user]
254
255
255
256
Users must either be included in an `allowed_users` list, or the property `allow_all` must be set to `true`.
256
257
The creation of new users will be checked against these settings and refused if appropriate.
257
-
If an admin_users property is defined, then associated users will see an additional tab on the JupyterHub home screen, allowing them to carry out user management actions (e.g. create user groups and assign users to them, assign users to the admin role, delete users).
258
+
If an `admin_users` property is defined, then associated users will see an additional tab on the JupyterHub home screen, allowing them to carry out certain user management actions (e.g. create user groups and assign users to them, assign users to the admin role, delete users).
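The sign-in and admin rules described above can be pictured with a small Python sketch. This is a simplified model for illustration only and not JupyterHub's actual implementation; the helper names `may_sign_in` and `sees_admin_tab` are hypothetical.

```python
def may_sign_in(username, allowed_users=frozenset(), allow_all=False):
    """Simplified model: a user is accepted if everyone is allowed,
    or if the user is explicitly listed in allowed_users."""
    return allow_all or username in allowed_users


def sees_admin_tab(username, admin_users=frozenset()):
    """Simplified model: only users listed in admin_users get the Admin tab."""
    return username in admin_users


# Hypothetical user names, for illustration only.
print(may_sign_in("alice", allowed_users={"alice"}))  # explicitly listed
print(may_sign_in("bob", allowed_users={"alice"}))    # not listed, allow_all unset
print(may_sign_in("bob", allow_all=True))             # everyone is allowed
print(sees_admin_tab("alice", admin_users={"admin"}))
```

In the demo, `allow_all` is set to `true` and the decision about who may log in is delegated to Keycloak.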
258
259
259
260
image::jupyterhub/admin-user.png[Admin tab]
260
261
@@ -297,11 +298,11 @@ This section of the JupyterHub values specifies that we are using GenericOAuthen
297
298
298
299
<1> We need to either provide a list of users using `allowed_users`, or to explicitly allow _all_ users, as done here.
299
300
We will delegate this to Keycloak so that we do not have to maintain users in two places.
300
-
<2> Each admin user will have access to an "Admin" tab on the JupyterHub UI where certain user-management actions can be carried out.
301
+
<2> Each admin user will have access to an Admin tab on the JupyterHub UI where certain user-management actions can be carried out.
301
302
<3> Define the Keycloak scope
302
303
<4> Specifies which authenticator class to use
303
304
304
-
The endpoints can be defined directly under `GenericOAuthenticator` as well, though for our purposes we will set them in a configuration script (see below).
305
+
The endpoints can be defined directly under `GenericOAuthenticator` as well, though for our purposes we will set them in a configuration script (see <<endpoints, Endpoints>> below).
305
306
306
307
=== Certificates
307
308
@@ -351,10 +352,11 @@ This can be seen below:
351
352
If the default file is not overwritten, but is mounted to a new file in the same directory, then the certificates should be updated by calling e.g. `update-ca-certificates`.
352
353
<4> Ensure Python uses the same certificate.
353
354
355
+
[#endpoints]
354
356
=== Endpoints
355
357
356
358
The Helm chart for JupyterHub allows us to augment the standard configuration with one or more scripts.
357
-
As mentioned in an earlier section, we want to define the endpoints dynamically - by making use of the ConfigMap written out by the Keycloak Deployment - and we can do this by adding a script under `extraConfig`:
359
+
As mentioned in the <<services, Services>> section above, we want to define the endpoints dynamically - by making use of the ConfigMap written out by the Keycloak Deployment - and we can do this by adding a script under `extraConfig`:
358
360
359
361
[source,yaml]
360
362
----
@@ -373,9 +375,10 @@ As mentioned in an earlier section, we want to define the endpoints dynamically
NOTE: When using Spark from within a notebook, please the `Provisos` section below.
381
+
NOTE: When using Spark from within a notebook, please see the <<provisos, Provisos>> section below.
379
382
380
383
In the same way, we can use another script to define a driver service for each user.
381
384
This is essential when using Spark from within a JupyterHub notebook so that executor pods can be spawned from the user's kernel in a user-specific way.
@@ -425,7 +428,7 @@ This script instructs JupyterHub to use `KubeSpawner` to create a service refere
425
428
426
429
=== Profiles
427
430
428
-
The `singleuser.profileList` section of the Helm chart values allows us to define notebook profiles by setting the CPU, Memory and Image combinations that can be selected. For instance, the profiles below allows to select 2/4/... CPUs, 4/8/... GB RAM and between two images.
431
+
The `singleuser.profileList` section of the Helm chart values allows us to define notebook profiles by setting the CPU, Memory and Image combinations that can be selected. For instance, the profiles below allow us to select `2/4/...` CPUs, `4/8/...` GB RAM and one of two images.
429
432
430
433
[source,yaml]
431
434
----
@@ -524,13 +527,14 @@ USER spark
524
527
====
525
528
526
529
NOTE: The example notebook in the demo will start a distributed Spark cluster, whereby the notebook acts as the driver which spawns a number of executors.
527
-
The driver uses the user-specific driver service (see above) to pass job dependencies to each executor.
530
+
The driver uses the user-specific <<driver, driver service>> to pass job dependencies to each executor.
528
531
The Spark versions of these dependencies must be the same, or else serialization errors can occur.
529
532
This is more likely in cases where Java or Scala classes do not have a specified `serialVersionUID`, in which case one will be calculated at runtime based on the contents of each class (method signatures etc.): if the contents of these class files have been changed, then the UID may differ between driver and executor.
530
533
To avoid this, care needs to be taken to use images for the notebook and the Spark job that are using a common Spark build.
531
534
532
535
== Example Notebook
533
536
537
+
[#provisos]
534
538
=== Provisos
535
539
536
540
WARNING: When running a distributed Spark cluster from within a JupyterHub notebook, the notebook acts as the driver and requests executor Pods from Kubernetes.
Copying /var/data/spark-bfed3050-5f63-441d-9799-a196d7b54ce9/spark-a03b09a7-869e-4778-ac04-fa935bbca5ab/1075326831741174390840_cache to /opt/spark/work-dir/./org.checkerframework_checker-qual-2.5.2.jar
589
593
----
590
594
591
-
Once the Spark session has been created, the notebook reads data from S3, performs a simple aggregation and re-writes it in different formats.
595
+
Once the Spark session has been created, the notebook reads data from S3, performs a simple aggregation and re-writes it in different formats. Further comments can be found in the notebook itself.
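
As a supplement to the <<endpoints, Endpoints>> section, the following Python sketch shows how the OAuth endpoint URLs could be derived once the Keycloak address is known. It is not the script used in the demo. It assumes the address has already been read (for example from the ConfigMap written by the Keycloak Deployment) and that Keycloak serves realms under `/realms/<name>` without the legacy `/auth` prefix. The function name, the example address and the realm name are hypothetical; the keys mirror the `authorize_url`, `token_url` and `userdata_url` settings of `GenericOAuthenticator`.

```python
def keycloak_oidc_endpoints(base_url: str, realm: str) -> dict[str, str]:
    """Build the OpenID Connect endpoint URLs for a Keycloak realm.

    Assumes a Keycloak layout of <base>/realms/<realm>/protocol/openid-connect.
    """
    root = f"{base_url.rstrip('/')}/realms/{realm}/protocol/openid-connect"
    return {
        "authorize_url": f"{root}/auth",
        "token_url": f"{root}/token",
        "userdata_url": f"{root}/userinfo",
    }


# Hypothetical address and realm name, for illustration only.
endpoints = keycloak_oidc_endpoints("https://keycloak.example:8443/", "demo")
for name, url in endpoints.items():
    print(f"{name}: {url}")
```

In an `extraConfig` script, values such as these would be assigned to the corresponding `c.GenericOAuthenticator` properties at start-up, so the endpoints follow whatever address Keycloak has published.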