OKDP/jupyterlab-docker

OKDP Jupyter Images

Build, test, tag, and push jupyter images

OKDP Jupyter Docker images are built from the jupyter docker-stacks source Dockerfiles. The project includes a read-only copy of the jupyter docker-stacks repository as a git-subtree subproject.

The project leverages the features provided by jupyter docker-stacks:

  • Build from the original source Dockerfiles
  • Customize the images with docker build arguments (--build-arg)
  • Run the original tests at every pipeline trigger

The project provides up-to-date JupyterLab images, especially for PySpark.

Images build workflow

Build/Test

The ci build pipeline contains 6 main reusable workflows:

  1. build-base-images-template: docker-stacks-foundation, base-notebook, minimal-notebook, scipy-notebook
  2. build-datascience-images-template: r-notebook, julia-notebook, tensorflow-notebook, pytorch-notebook
  3. build-spark-images-template: pyspark-notebook, all-spark-notebook
  4. publish: push the built images to the container registry (main branch only)
  5. auto-rerun: partially re-run jobs in case of failures (github runner issues/main branch only)
  6. ci: run ci pipeline at every contribution

build pipeline

The build is based on the version compatibility matrix.

The build-matrix section defines the component versions to build. It acts as a filter on the parent compatibility-matrix section, limiting the version combinations that are built. The build process ensures that only compatible versions are built.

For example, the following build-matrix:

build-matrix:
  python_version: ['3.9', '3.10', '3.11']
  spark_version: [3.2.4, 3.3.4, 3.4.2, 3.5.0]
  java_version: [11, 17]
  scala_version: [2.12]

builds the following version combinations, according to the compatibility-matrix section:

  • spark3.3.4-python3.10-java17-scala2.12
  • spark3.5.0-python3.11-java17-scala2.12
  • spark3.4.2-python3.11-java17-scala2.12
  • spark3.2.4-python3.9-java11-scala2.12

By default, if no filter is specified:

build-matrix:

all compatible version combinations are built.

Finally, all the images are tested against the original tests at every pipeline trigger.
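The filtering behaviour described above can be sketched in a few lines of Python. This is only an illustration: the contents of `COMPATIBILITY_MATRIX` and the names `select_builds` and `KEYS` are hypothetical, chosen so that the example filter yields the four combinations listed above. The real matrix and the real selection logic live in the project's CI configuration and may differ.

```python
from itertools import product

# Hypothetical compatibility matrix (an assumption for this sketch only).
# Each entry groups versions that are known to work together.
COMPATIBILITY_MATRIX = [
    {"python_version": ["3.9"], "spark_version": ["3.2.4"],
     "java_version": [11], "scala_version": ["2.12"]},
    {"python_version": ["3.10"], "spark_version": ["3.3.4"],
     "java_version": [17], "scala_version": ["2.12"]},
    {"python_version": ["3.11"], "spark_version": ["3.4.2", "3.5.0"],
     "java_version": [17], "scala_version": ["2.12", "2.13"]},
]

# The filter from the example above. Python versions are quoted because an
# unquoted 3.10 would be read as the number 3.1 by a YAML parser.
BUILD_MATRIX = {
    "python_version": ["3.9", "3.10", "3.11"],
    "spark_version": ["3.2.4", "3.3.4", "3.4.2", "3.5.0"],
    "java_version": [11, 17],
    "scala_version": [2.12],
}

KEYS = ("spark_version", "python_version", "java_version", "scala_version")


def select_builds(compatibility_matrix, build_matrix=None):
    """Return the compatible combinations that pass the build-matrix filter."""
    build_matrix = build_matrix or {}  # an empty filter keeps everything
    selected = []
    for entry in compatibility_matrix:
        allowed = []
        for key in KEYS:
            wanted = build_matrix.get(key)
            # Compare as strings so that 2.12 (number) matches "2.12" (text).
            wanted_set = {str(w) for w in wanted} if wanted else None
            allowed.append(
                [v for v in entry[key] if wanted_set is None or str(v) in wanted_set]
            )
        # If any component has no allowed version, product() yields nothing.
        for spark, python, java, scala in product(*allowed):
            selected.append(f"spark{spark}-python{python}-java{java}-scala{scala}")
    return selected


for combination in select_builds(COMPATIBILITY_MATRIX, BUILD_MATRIX):
    print(combination)
```

Because the filter is applied inside each compatibility entry, a combination such as Spark 3.2.4 with Java 17 is never generated, even though both values appear in the filter.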

Publishing

Development images, whose tags carry the -<GIT-BRANCH>-latest suffix (e.g. spark3.2.4-python3.9-java11-scala2.12-<GIT-BRANCH>-latest), are produced at every pipeline run, regardless of the git branch (main or not).
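As a rough sketch of how such a development tag could be assembled, the hypothetical helper `dev_tag` below appends the branch name and the `latest` suffix to a version combination. Replacing characters that are not valid in a Docker tag (such as `/` in `feature/foo`) is an assumption of this sketch; the project may treat branch names differently.

```python
import re


def dev_tag(combination: str, branch: str) -> str:
    """Build a development tag of the form <combination>-<branch>-latest."""
    # Docker tags only allow letters, digits, '_', '.' and '-'.
    # Assumption: any other character in the branch name becomes '-'.
    safe_branch = re.sub(r"[^A-Za-z0-9_.-]", "-", branch)
    return f"{combination}-{safe_branch}-latest"


print(dev_tag("spark3.2.4-python3.9-java11-scala2.12", "main"))
```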

The official images are published to the OKDP quay.io registry:

  1. At every release, and
  2. Periodically, every Monday at 05:00 GMT

Tagging

The project builds the images with long-format tags. Each tag combines the versions of several mutually compatible components.

There are multiple tag levels. Which one to use depends on the stability and reproducibility you need.

Here are some examples:

scipy-notebook:

  • python-3.11-2024-02-06
  • python-3.11.7-2024-02-06
  • python-3.11.7-hub-4.0.2-lab-4.1.0
  • python-3.11.7-hub-4.0.2-lab-4.1.0-2024-02-06

datascience-notebook:

  • python-3.9-2024-02-06
  • python-3.9.18-2024-02-06
  • python-3.9.18-hub-4.0.2-lab-4.1.0
  • python-3.9.18-hub-4.0.2-lab-4.1.0-2024-02-06
  • python-3.9.18-r-4.3.2-julia-1.10.0-2024-02-06
  • python-3.9.18-r-4.3.2-julia-1.10.0-hub-4.0.2-lab-4.1.0
  • python-3.9.18-r-4.3.2-julia-1.10.0-hub-4.0.2-lab-4.1.0-2024-02-06

pyspark-notebook:

  • spark-3.5.0-python-3.11-java-17-scala-2.12
  • spark-3.5.0-python-3.11-java-17-scala-2.12-2024-02-06
  • spark-3.5.0-python-3.11.7-java-17.0.9-scala-2.12.18-hub-4.0.2-lab-4.1.0
  • spark-3.5.0-python-3.11.7-java-17.0.9-scala-2.12.18-hub-4.0.2-lab-4.1.0-2024-02-06
  • spark-3.5.0-python-3.11.7-r-4.3.2-java-17.0.9-scala-2.12.18-hub-4.0.2-lab-4.1.0
  • spark-3.5.0-python-3.11.7-r-4.3.2-java-17.0.9-scala-2.12.18-hub-4.0.2-lab-4.1.0-2024-02-06

Please check the OKDP quay.io container registry for more images and tags.
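The tag levels shown in the scipy-notebook example can be reproduced with a short Python sketch. The helpers `long_tag` and `tag_levels` are hypothetical and only mirror the visible pattern of the examples (component name, version, optional build date). The project's actual tagging extension is based on the jupyter docker-stacks sources and may be organised differently.

```python
from datetime import date


def long_tag(versions, build_date=None):
    """Join name-version pairs in order, optionally followed by the build date."""
    tag = "-".join(f"{name}-{version}" for name, version in versions.items())
    if build_date is not None:
        tag += f"-{build_date.isoformat()}"
    return tag


def tag_levels(python, hub, lab, build_date):
    """Return the four tag levels of the scipy-notebook example."""
    minor = ".".join(python.split(".")[:2])  # "3.11.7" becomes "3.11"
    full = {"python": python, "hub": hub, "lab": lab}
    return [
        long_tag({"python": minor}, build_date),   # moves with patch releases
        long_tag({"python": python}, build_date),  # pinned Python version
        long_tag(full),                            # pinned components, no date
        long_tag(full, build_date),                # fully pinned build
    ]


for tag in tag_levels("3.11.7", "4.0.2", "4.1.0", date(2024, 2, 6)):
    print(tag)
```

The shorter a tag, the more it moves between builds; the longest form, with every version and the build date, is the one to pin when reproducibility matters.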

Running github actions

Official registry (quay.io) credentials

Create the following secrets and configuration variables when running with your own github account or organization:

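| Variable             | Type                   | Default | Description                                                                              |
|----------------------|------------------------|---------|------------------------------------------------------------------------------------------|
| REGISTRY             | Configuration variable | quay.io | Container registry                                                                       |
| REGISTRY_USERNAME    | Secret variable        |         | Container registry username                                                              |
| REGISTRY_ROBOT_TOKEN | Secret variable        |         | Container registry password or access token (Scopes: write:packages/delete:packages)     |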
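For illustration, a script running inside a workflow step could resolve these settings as sketched below. The helper `registry_settings` is hypothetical and assumes the workflow maps the secrets and the configuration variable to environment variables of the same names, since GitHub Actions does not expose them to a step automatically.

```python
import os


def registry_settings(env=None):
    """Read the registry settings, defaulting REGISTRY to quay.io."""
    env = os.environ if env is None else env
    required = ("REGISTRY_USERNAME", "REGISTRY_ROBOT_TOKEN")
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise KeyError(f"missing required secrets: {', '.join(missing)}")
    return {
        "registry": env.get("REGISTRY") or "quay.io",  # documented default
        "username": env["REGISTRY_USERNAME"],
        "token": env["REGISTRY_ROBOT_TOKEN"],
    }


# Example with placeholder values (not real credentials).
settings = registry_settings(
    {"REGISTRY_USERNAME": "example-user", "REGISTRY_ROBOT_TOKEN": "example-token"}
)
print(settings["registry"])
```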

Running locally with act

Act can be used to build and test locally.

Here is an example command:

$ act  --container-architecture linux/amd64  \
       -W .github/workflows/ci.yml \
       --env ACT_SKIP_TESTS=<true|false> \
       --secret GITHUB_TOKEN=<GITHUB_TOKEN> \
       --rm

Set the option --container-architecture linux/amd64 if you are running locally on Apple M1/M2 chips.

For more information:

$ act  --help

OKDP custom extensions

  1. Tagging extension is based on the original jupyter docker-stacks source files
  2. Patches: patch the original jupyter docker-stacks so that the tests can run
  3. Version compatibility matrix to generate all the compatible version combinations for pyspark
  4. Unit tests to test the OKDP extensions at every pipeline run