2023

Deep(er) dive into container labels and annotations

During the EuroHPC Summit in Göteborg we discussed the latest developments and the goals we want to push within the EuroHPC community. It boils down to raising awareness about containers in general and synchronizing the efforts of the different entities.
I gave an introduction to the HPC Container Conformance project (HPC3) in which I dissected annotations and labels and how we might deal with those.

HPC3: Expected Container Behaviour

The last blog post introduced the HPC Container Conformance (HPC3) project - a project to provide guidance on how to build and annotate container images for HPC use cases.

For HPC3 we'll need to cover two main parts (IMHO) first:

  1. Entrypoint/Cmd relationship: How do interactive users and batch systems expect a container to behave? We need to make sure that a container works with docker run, singularity run and podman run out of the box (with the engine configuration already done).
  2. Annotation Baseline: Which annotations do we need and want? Some are mandatory and some are optional.
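
To make the two parts a bit more concrete, here is a minimal sketch of what they can look like in a Dockerfile. It is an illustration under assumptions, not the HPC3 recommendation: the base image, the label values and the `org.example.hpc.*` key are made up for this example, and only the `org.opencontainers.image.*` keys are taken from the OCI pre-defined annotation keys.

```dockerfile
# syntax=docker/dockerfile:1
# Illustrative image only; base image and all label values are placeholders.
FROM ubuntu:22.04

# OCI pre-defined annotation keys, stored here as image labels.
LABEL org.opencontainers.image.title="my-hpc-app" \
      org.opencontainers.image.version="1.0.0" \
      org.opencontainers.image.description="Illustrative HPC application image"

# Hypothetical site- or project-specific key; not defined by HPC3.
LABEL org.example.hpc.mpi.implementation="openmpi"

# Exec-form ENTRYPOINT: the fixed front of the command line.
ENTRYPOINT ["/usr/bin/env"]

# Exec-form CMD: default arguments, replaced by whatever is passed
# after the image name on the command line.
CMD ["bash"]
```

With this layout, `docker run -it <image>` ends up running `/usr/bin/env bash` and drops an interactive user into a shell, while `docker run <image> hostname` runs `/usr/bin/env hostname`, which is closer to what a batch job would do. Whether this is the right default for every engine is exactly the kind of question the Entrypoint/Cmd discussion has to settle.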

This blog post is going to set a baseline in terms of Expected Container Behavior to make sure that we can switch HPC3 conformant images of the same application and - ideally - have the job run in the same way.

The HPC Container Conformance Project

A lot of HPC sites have an unwritten understanding of how to use them:

  1. You need a user account to log in, either on the submit host or when a job launches on the compute nodes.
  2. Once logged in, your environment is set up with default apps; specific applications are available using module load from a central software share.
  3. You might even be able to install new software in your home directory.

A central software share with curated sets of scientific applications and libraries is/was a powerful concept. But with containers this falls apart to some degree... 😦

Buildkit Dockerfile Frontend Caching

Ok, where was I? In the last blog post (BuildKit Dockerfile Frontend) I scratched the surface by introducing here-docs in Dockerfiles. The original inspiration for the post - or series, as it turned out - was my attempt to cut the cristiangreco/docker-pdflatex image down by a GB or so.
I went from this:

COPY install.sh /install.sh
RUN sh /install.sh && rm /install.sh

to this:

RUN <<eot bash
  apt-get update
  apt-cache depends texlive-full \
  | grep "Depends:" \
  | grep -v "doc$" \
  | grep -E -v "texlive-(games|music)" \
  | grep -E -v "texlive-lang-(arabic|cjk|chinese|cyrillic|czechslovak|european|french|german|greek|italian|japanese|korean|other|polish|portuguese|spanish)$" \
  | cut -d ' ' -f 4 \
  | xargs apt-get install --no-install-recommends -y
  apt-get autoremove -y
eot

BuildKit Dockerfile Frontend

I am a big fan of pandoc to generate documents instead of using a binary format like MS Word. It feels more natural to have a source and compile it into a deliverable.

Dockerfile Improvement

Installing LaTeX and all the dependencies for pandoc is a hassle and the natural way of dealing with that is (of course) to create a container. I found the container cristiangreco/docker-pdflatex on the interweb.

But the image was rather big (4.4GB) and I thought:

Let's see if we can shave off a GB or so and, while we are at it, make the Dockerfile more readable.

Blog Relaunched!

In one of my older posts from 2019 I wrote:

I won't say "Long time, no post" - but...

After a long(-ish) neglect of the blog I am happy to relaunch it now! I intend to write blog posts regularly and blog about containers in HPC, monitoring, Golang and whatever else comes to mind. I migrated everything from Jekyll to mkdocs-material and had to touch every blog post. What a journey I went through over the last ten years. The first blog post is from 2012...

Let's see what the next 10 years are going to be! 🥳