Skip to main content

Cache busting when building Docker images

One of the handiest features of the docker build system is the caching system.

'docker build' tries to reuse the layers already built until something changes inside the Dockerfile. In this way, we can save several minutes when rebuilding an image if the changes happen further down the list in the Dockerfile.

Sometimes, though, we do want to invalidate the cache and ensure the next build won't use it.

To do this an option is to pass the '--no-cache' argument to 'docker build'.

When dealing with 'apt-get install' instructions though there are other tricks. I found this document on Dockerfile best practices very useful.

First of all an observation. If you have 'RUN apt-get update' as a single line of a Dockerfile, followed by the installation of a package, e.g.:

RUN apt-get update
RUN apt-get install -y nginx

then changing the list of packages and running again the build command won't trigger an 'apt-get update': that line hasn't changed so docker build reuses the cache. It might not be what you want.

To force cache invalidation for this specific case the recommendation is to use those commands in the form:

RUN apt-get update && apt-get install -y nginx

This will always install the latest version of the packages. It even has a name: "cache busting".

Another recommendation I like is to put each package on a single line, and have them in alphabetical order: this will ease visual inspection and prevent duplicates or other undesired conditions.

Of course, you can also specify exact versions for the packages as you would normally do with 'apt-get install'. That's "version pinning" and it invalidates the cache too.

You can find all this on the linked page on Dockerfile best practices; this is just my digested interpretation.

Just one more thing: a way to limit the size of a built image is to clean up the content of '/var/lib/apt/lists' in the same RUN command, e.g.:

RUN apt-get update && apt-get install -y \
    aufs-tools \
    automake \
    build-essential \
&& rm -rf /var/lib/apt/lists/* 

The command above will build an image layer that doesn't contain the apt cache.

If you had instead used this:

RUN apt-get update && apt-get install -y \
    aufs-tools \
    automake \
    build-essential
RUN rm -rf /var/lib/apt/lists/*

you would have had not only a larger layer, containing the apt cache, but also an additional layer generated by the second RUN command.


Popular posts from this blog

Troubleshooting TURN

  WebRTC applications use the ICE negotiation to discovery the best way to communicate with a remote party. I t dynamically finds a pair of candidates (IP address, port and transport, also known as “transport address”) suitable for exchanging media and data. The most important aspect of this is “dynamically”: a local and a remote transport address are found based on the network conditions at the time of establishing a session. For example, a WebRTC client that normally uses a server reflexive transport address to communicate with an SFU. when running inside the home office, may use a relay transport address over TCP when running inside an office network which limits remote UDP targets. The same configuration (defined as “iceServers” when creating an RTCPeerConnection will work in both cases, producing different outcomes.

VoIP calls encoded with SILK: from RTP to WAV

SILK is a codec defined by Skype, but can be found in many VoIP clients, like CSipSimple . It comes in different flavours (sample rates and frame sizes), from narrowband (8 KHz) to wideband (24 KHz). Since Wireshark doesn't allow you to decode an RTP stream carrying SILK frames, I was curious to find a programmatic way to do it. In fact, this has also allowed to me to earn a "tumbleweed" badge in stackoverflow . You may argue that a Wireshark plugin would be the right solution, but that's probably for another day. Initially I thought it was sufficient to read the specification for RTP payload when using SILK ; the truth is that I had to reverse engineer a solution by looking at SILK SDK's test vectors. There, I discovered that a file containing SILK audio doesn't have the file header indicated in the IETF draft ("!#SILK"), but a slightly different one ("!#SILK_V3"). More importantly, each encoded frame is not preced...

Extracting Opus from a pcap file into an audible wav

From time to time I need to verify that the audio inside a trace is as expected. Not much in terms of quality, but more often content and duration. A few years ago I wrote a small program to transform a pcap into a wav file - the codec in use was SILK. These days I'm dealing with Opus , and I have to say things are greatly simplified, in particular if you consider opus-tools , a set of utilities to handle opus files and traces. One of those tools, opusrtp , can do live captures and write the interpreted payload into a .opus file. Still, what I needed was to achieve the same result but from a pcap already existing, i.e. "offline". So I come up with a small - quite shamlessly copy&pasted - patch to opusrtc, which is now in this fork . Once you have a pcap with an RTP stream with opus (say in input.pcap ) you can retrieve the .opus equivalent (in rtpdump.opus ) with: ./opusrtp --extract input.pcap Then you can generate an audible wav file with: ./opusd...