BuildKit Dockerfile Frontend Caching
OK, where was I? In the last blog post (BuildKit Dockerfile Frontend) I scratched the surface by introducing here-docs in Dockerfiles. The original inspiration for the post (or series, as it turned out) was my attempt to cut the cristiangreco/docker-pdflatex image down by a GB or so.
I went from this:
to this:
RUN <<eot bash
apt-get update
apt-cache depends texlive-full \
| grep "Depends:" \
| grep -v "doc$" \
| egrep -v "texlive-(games|music)" \
| egrep -v "texlive-lang-(arabic|cjk|chinese|cyrillic|czechslovak|european|french|german|greek|italian|japanese|korean|other|polish|portuguese|spanish)$" \
| cut -d ' ' -f 4 \
| xargs apt-get install --no-install-recommends -y
apt-get autoremove
eot
Iteration Speed
The problem with the above is that the image takes forever to build. Each time I fiddled with the list of packages I wanted to exclude, the build downloaded the package lists and every package all over again (2GB).
Thus, the iteration speed was terrible.
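Part of that fiddling can happen without any download at all, because the exclusion filter is just text processing. The following is a minimal sketch, not what the image build does: `filter_texlive_deps` and `sample_output` are hypothetical helper names, and the sample lines are made up to mimic the shape of `apt-cache depends` output (two leading spaces per dependency line).

```shell
#!/bin/sh
# Hypothetical helper: reads `apt-cache depends texlive-full` output on stdin
# and prints the package names that would be handed to apt-get install.
# The filters are the same ones used in the Dockerfile above.
filter_texlive_deps() {
  grep "Depends:" \
    | grep -v "doc$" \
    | grep -Ev "texlive-(games|music)" \
    | grep -Ev "texlive-lang-(arabic|cjk|chinese|cyrillic|czechslovak|european|french|german|greek|italian|japanese|korean|other|polish|portuguese|spanish)$" \
    | cut -d ' ' -f 4
}

# Made-up sample input, only here so the sketch runs without apt.
sample_output() {
  cat <<'EOF'
texlive-full
  Depends: texlive-base
  Depends: texlive-latex-extra
  Depends: texlive-latex-extra-doc
  Depends: texlive-games
  Depends: texlive-lang-arabic
  Depends: texlive-lang-english
  Recommends: tex-gyre
EOF
}

# Print the packages that survive the filter.
sample_output | filter_texlive_deps
```

On a Debian machine the sample could be swapped for real input, for example `apt-cache depends texlive-full | filter_texlive_deps`, to eyeball the list before kicking off a build.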
RUN mounts FTW
That's where RUN mounts within the frontend are going to speed things up. First, I need to instruct apt not to throw away the packages it has already downloaded.
# syntax = docker/dockerfile:1.4
FROM debian:bullseye-20230109-slim
RUN <<eot bash
rm -f /etc/apt/apt.conf.d/docker-clean
echo 'Binary::apt::APT::Keep-Downloaded-Packages "true";' \
> /etc/apt/apt.conf.d/keep-cache
eot
RUN --mount=type=cache,target=/var/cache/apt --mount=type=cache,target=/var/lib/apt <<eot bash
apt-get update
apt-cache depends texlive-full \
| grep "Depends:" \
| grep -v "doc$" \
| egrep -v "texlive-(games|music)" \
| egrep -v "texlive-lang-(arabic|cjk|chinese|cyrillic|czechslovak|european|french|german|greek|italian|japanese|korean|other|polish|portuguese|spanish)$" \
| cut -d ' ' -f 4 \
| xargs apt-get install --no-install-recommends -y
apt-get autoremove
eot
VOLUME ["/sources"]
WORKDIR /sources
Next, I'll add two cache mounts, one for /var/cache/apt and one for /var/lib/apt, to keep the caches around between builds.
RUN --mount=type=cache,target=/var/cache/apt --mount=type=cache,target=/var/lib/apt <<eot bash
apt-get update
apt-cache depends texlive-full \
| grep "Depends:" \
| grep -v "doc$" \
| egrep -v "texlive-(games|music)" \
| egrep -v "texlive-lang-(arabic|cjk|chinese|cyrillic|czechslovak|european|french|german|greek|italian|japanese|korean|other|polish|portuguese|spanish)$" \
| cut -d ' ' -f 4 \
| xargs apt-get install --no-install-recommends -y
apt-get autoremove
eot
Awesome! Now I can run the build and maintain the cached files. No need to download 2GB of packages each time I iterate. 🤯
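One thing that can silently defeat the cache mounts is the apt configuration: if docker-clean is still in place, the downloaded archives are deleted after every install and the cache stays empty. Here is a minimal sketch of a guard for that, assuming a hypothetical helper named `apt_keeps_packages` that takes the image root as an optional argument so it can be tried against a scratch directory.

```shell
#!/bin/sh
# Hypothetical helper: succeeds only if apt under ROOT is configured to keep
# downloaded .deb files. ROOT defaults to / when no argument is given.
apt_keeps_packages() {
  akp_root="${1:-/}"
  akp_conf_dir="${akp_root%/}/etc/apt/apt.conf.d"

  # The Debian base image ships docker-clean, which removes archives after install.
  if [ -e "${akp_conf_dir}/docker-clean" ]; then
    echo "docker-clean is still present in ${akp_conf_dir}" >&2
    return 1
  fi

  # Some config file has to switch Keep-Downloaded-Packages on.
  if ! grep -qs 'Keep-Downloaded-Packages "true"' "${akp_conf_dir}"/*; then
    echo "Keep-Downloaded-Packages is not enabled in ${akp_conf_dir}" >&2
    return 1
  fi
}

# Demo against a scratch directory standing in for the image root.
scratch="$(mktemp -d)"
mkdir -p "${scratch}/etc/apt/apt.conf.d"
echo 'Binary::apt::APT::Keep-Downloaded-Packages "true";' \
  > "${scratch}/etc/apt/apt.conf.d/keep-cache"
apt_keeps_packages "${scratch}" && echo "apt will keep downloaded packages"
rm -rf "${scratch}"
```

Inside a build, the same check could run at the top of the RUN step that carries the cache mounts, so a missing config fails the build early instead of quietly re-downloading everything.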
Use in Go
This method also works wonders when you build Go images a lot: just put the Go module cache (/go/pkg/mod) into a cache mount and off you go.
# syntax = docker/dockerfile:1.4
FROM golang
WORKDIR /go/src/test
COPY <<-"eot" /go/src/test/main.go
package main
import (
"github.com/sirupsen/logrus"
)
func main() {
logrus.Println("Hello, world!")
}
eot
RUN --mount=type=cache,target=/go/pkg/mod <<eot bash
go mod init
go mod tidy
eot
ENV CGO_ENABLED=1 GOOS=linux
RUN --mount=type=cache,target=/go/pkg/mod <<eot bash
go build -o /usr/bin/test -a -ldflags '-extldflags "-static"' .
eot
RUN /usr/bin/test
Why should HPC care?
Pretty sure my readers are going to see where we as the HPC community will benefit by using this: containerized builds...
Using spack containerize is pretty cool already, but using caching between builds is hard because the Dockerfile looks totally different between builds if you change the spack.env ever so slightly.
Using --mount=type=cache is going to be a fun optimization. Need to fiddle around with that before FOSDEM next week.