PUNCH4NFDI User Story

Software to the data: Running containers interactively in compute centres


Data can be analyzed conveniently, and in particular interactively, with Jupyter notebooks, which are therefore popular in many scientific communities. However, this way of working reaches its limits with large volumes of data: for reasons of sustainability, copying large datasets to every user is not feasible. Instead, users copy their analysis pipeline to the data centre that holds the data ("software to the data"). Analysis tools usually require additional software components (such as libraries) that are generally not installed on the computers of a data centre. Containers remedy this: the user packs the required components alongside the analysis tools into a container and transfers it to the data centre.

This article demonstrates how containers can be used "interactively" at the Jülich Supercomputing Centre, which offers a JupyterLab interface with the added option of starting the Jupyter server instance from a custom container.