<?xml version="1.0" encoding="utf-8"?><feed xmlns="http://www.w3.org/2005/Atom" ><generator uri="https://jekyllrb.com/" version="3.10.0">Jekyll</generator><link href="https://blog.pmhahn.de/feed.xml" rel="self" type="application/atom+xml" /><link href="https://blog.pmhahn.de/" rel="alternate" type="text/html" /><updated>2026-02-25T09:57:02+01:00</updated><id>https://blog.pmhahn.de/feed.xml</id><title type="html">Philipp Hahn</title><subtitle>Open Source Software Developer</subtitle><author><name>Philipp Hahn</name></author><entry><title type="html">Linux and the Windows NT file system</title><link href="https://blog.pmhahn.de/linux-ntfs/" rel="alternate" type="text/html" title="Linux and the Windows NT file system" /><published>2026-02-17T15:01:00+01:00</published><updated>2026-02-17T15:01:00+01:00</updated><id>https://blog.pmhahn.de/linux-ntfs</id><content type="html" xml:base="https://blog.pmhahn.de/linux-ntfs/"><![CDATA[<p>The <em>Windows New Technology File System</em> (<abbr title="New Technology File System">NTFS</abbr>) has a long history with Linux:</p>

<table>
  <thead>
    <tr>
      <th>Driver</th>
      <th>Type</th>
      <th>Based on</th>
      <th>Kernel</th>
      <th>Period</th>
      <th>Read-write</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>Original</td>
      <td>Kernel</td>
      <td>Scratch</td>
      <td>2.1.74</td>
      <td>1995-2001</td>
      <td>Read-only</td>
    </tr>
    <tr>
      <td><a href="https://flatcap.github.io/linux-ntfs/misc.html">Linux-<abbr title="New Technology File System">NTFS</abbr></a></td>
      <td>Kernel</td>
      <td>Scratch</td>
      <td>2.5.11</td>
      <td>2002-2024</td>
      <td>Read-only</td>
    </tr>
    <tr>
      <td><a href="https://en.wikipedia.org/wiki/Captive_NTFS">Captive</a></td>
      <td>FUSE</td>
      <td><code class="language-plaintext highlighter-rouge">ntfs.sys</code></td>
      <td> </td>
      <td>2003-2006</td>
      <td>Read-write</td>
    </tr>
    <tr>
      <td><a href="https://en.wikipedia.org/wiki/NTFS-3G"><abbr title="New Technology File System">NTFS</abbr>-3G</a></td>
      <td>FUSE</td>
      <td>Linux-<abbr title="New Technology File System">NTFS</abbr></td>
      <td>3.18</td>
      <td>2006-</td>
      <td>Read-write</td>
    </tr>
    <tr>
      <td><a href="https://www.kernel.org/doc/html/latest/filesystems/ntfs3.html">NTFS3</a></td>
      <td>Kernel</td>
      <td>Paragon</td>
      <td>5.15</td>
      <td>2021-2026</td>
      <td>Read-write</td>
    </tr>
    <tr>
      <td><a href="https://git.kernel.org/pub/scm/linux/kernel/git/linkinjeon/ntfs.git/log/?h=ntfs-next&amp;ref=itsfoss.com"><abbr title="New Technology File System">NTFS</abbr> Plus</a></td>
      <td>Kernel</td>
      <td>Linux-<abbr title="New Technology File System">NTFS</abbr></td>
      <td>7.0?</td>
      <td>2026-</td>
      <td>Read-write</td>
    </tr>
    <tr>
      <td><abbr title="Audio-Visuelles Marketing">AVM</abbr> <abbr title="New Technology File System">NTFS</abbr></td>
      <td>Kernel</td>
      <td><abbr title="New Technology File System">NTFS</abbr>-3G</td>
      <td> </td>
      <td>2012-</td>
      <td>Read-write</td>
    </tr>
  </tbody>
</table>

<!--more-->

<ol>
  <li>The <em>Original</em> implementation was from Martin von Löwis.</li>
  <li>Anton Altaparmakov created the 2nd implementation <em>Linux-<abbr title="New Technology File System">NTFS</abbr></em> from scratch, which replaced the original implementation.</li>
  <li><em>Captive</em> was the first user-space implementation; it took the original Windows driver <code class="language-plaintext highlighter-rouge">ntfs.sys</code> from Microsoft and ran it inside an emulated Windows kernel environment (built on ReactOS code rather than Wine).</li>
  <li><em><abbr title="New Technology File System">NTFS</abbr>-3G</em> also runs in user-space and uses FUSE to talk to the kernel.</li>
  <li>Paragon donated an open-source version of its proprietary <em>NTFS3</em> driver to the Linux kernel. It was the first read-write implementation in the kernel, but it is less well documented.</li>
  <li><em><abbr title="New Technology File System">NTFS</abbr> Plus</em> is based on the older <em>Linux-<abbr title="New Technology File System">NTFS</abbr></em> implementation, adds read-write support and updates the implementation to use modern Linux APIs. It is scheduled to replace <em>NTFS3</em> again.</li>
  <li><abbr title="Audio-Visuelles Marketing">AVM</abbr> (now FRITZ! Technology) ported <em><abbr title="New Technology File System">NTFS</abbr>-3G</em> to kernel space; this port is used in FritzOS only.</li>
</ol>

<pre><code class="language-mermaid">gantt
    title NTFS
    dateFormat  YYYY-MM-DD
    axisFormat %Y
    Original    : 1997-01-01, 2002-04-01
    Linux-NTFS  : 2002-04-01, 2024-01-01
    Captive     : 2003-01-01, 2006-01-01
    NTFS-3G     : 2006-01-01, 2030-01-01
    NTFS3       : 2021-11-01, 2026-01-01
    NTFS+       : 2026-03-01, 2030-01-01
    ANTFS       : 2012-01-01, 2030-01-01
    2.0.0       : vert, 1996-06-09, 1m
    2.2.0       : vert, 1999-01-26, 1m
    2.4.0       : vert, 2001-01-04, 1m
    2.6.0       : vert, 2003-12-18, 1m
    2.6.16      : vert, 2006-03-20, 1m
    2.6.27      : vert, 2008-10-09, 1m
    3.0         : vert, 2011-07-21, 1m
    3.8         : vert, 2013-02-18, 1m
    4.4         : vert, 2016-01-10, 1m
    4.19        : vert, 2018-10-22, 1m
    5.10        : vert, 2020-12-13, 1m
    5.15        : vert, 2021-10-31, 1m
    6.1         : vert, 2022-12-11, 1m
    6.6         : vert, 2023-10-29, 1m
    6.12        : vert, 2024-11-17, 1m
</code></pre>

<h2 id="links">Links</h2>
<ul>
  <li>Wikipedia: <a href="https://en.wikipedia.org/wiki/Linux_kernel_version_history">Linux Kernel version history</a></li>
</ul>

<!-- <https://www.cyberark.com/resources/threat-research-blog/the-linux-kernel-and-the-cursed-driver -->

<!-- *[FD]: File Daemon -->
<!-- *[FD]: File Descriptor -->
<!-- *[GPT]: Generative Pre-trained Transformer -->
<!-- *[GPT]: Global Partitioning Table -->
<!-- *[GPT]: GUID Partition Table -->]]></content><author><name>Philipp Hahn</name></author><category term="linux" /><category term="filesystem" /><summary type="html"><![CDATA[The Windows New Technology File System (NTFS) has a long history with Linux: Driver Type Based on Kernel Period Read-write Original Kernel Scratch 2.1.74 1995-2001 Read-only Linux-NTFS Kernel Scratch 2.5.11 2002-2024 Read-only Captive FUSE ntfs.sys   2003-2006 Read-write NTFS-3G FUSE Linux-NTFS 3.18 2006- Read-write NTFS3 Kernel Paragon 5.15 2021-2026 Read-write NTFS Plus Kernel Linux-NTFS 7.0? 2026- Read-write AVM NTFS Kernel NTFS-3G   2012- Read-write]]></summary></entry><entry><title type="html">Proper dependency tracking in GNU make</title><link href="https://blog.pmhahn.de/make/" rel="alternate" type="text/html" title="Proper dependency tracking in GNU make" /><published>2025-11-22T12:50:00+01:00</published><updated>2025-11-22T12:50:00+01:00</updated><id>https://blog.pmhahn.de/make</id><content type="html" xml:base="https://blog.pmhahn.de/make/"><![CDATA[<p><code class="language-plaintext highlighter-rouge">make</code> is used to build projects, e.g. compile source code into binaries.
If the project consists of multiple files, explicit dependencies must be specified so that the commands run in the correct order.</p>

<p>In addition, a <code class="language-plaintext highlighter-rouge">Makefile</code> can also track implicit dependencies:
if one file is modified, only the commands that are actually needed are re-run.
For large projects that is a big time-saver when changes are incremental.</p>

<p>But how do you do that properly (for a C project)?</p>

<!--more-->

<h2 id="the-historical-way">The historical way</h2>

<ul>
  <li><code class="language-plaintext highlighter-rouge">Makefile</code>
    <div class="language-make highlighter-rouge"><div class="highlight"><pre class="highlight"><code>  <span class="nl">main</span><span class="o">:</span> <span class="nf">main.o</span>
  <span class="nl">main.o</span><span class="o">:</span> <span class="nf">main.c</span>
</code></pre></div>    </div>
  </li>
  <li><code class="language-plaintext highlighter-rouge">main.c</code>
    <div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code>  <span class="cp">#include</span> <span class="cpf">&lt;stdio.h&gt;</span><span class="cp">
</span>  <span class="cp">#include</span> <span class="cpf">"main.h"</span><span class="cp">
</span></code></pre></div>    </div>
  </li>
  <li><code class="language-plaintext highlighter-rouge">main.h</code>
    <div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code>  <span class="cp">#include</span> <span class="cpf">&lt;stdint.h&gt;</span><span class="cp">
</span></code></pre></div>    </div>
  </li>
</ul>

<p>In the past many projects implemented this dependency tracking themselves.
They ran the pre-processor <code class="language-plaintext highlighter-rouge">cpp</code> to process all <code class="language-plaintext highlighter-rouge">#include</code> directives and then used <em>regular expressions</em> to extract the paths of all files that were read.
These dependencies were then converted into a <code class="language-plaintext highlighter-rouge">make</code> fragment that declares them:</p>
<div class="language-make highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nl">main.o</span><span class="o">:</span> <span class="nf">main.h /usr/include/stdio.h /usr/include/stdint.h</span>
</code></pre></div></div>
<p>The main <code class="language-plaintext highlighter-rouge">Makefile</code> has to include this fragment using something like <code class="language-plaintext highlighter-rouge">-include main.d</code>.</p>
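
<p>A minimal sketch of such a home-grown rule, assuming the pre-processor emits GCC-style line markers such as <code class="language-plaintext highlighter-rouge"># 1 "main.h"</code> and that no path contains spaces. The <code class="language-plaintext highlighter-rouge">sed</code> expressions are illustrative and not taken from any particular project:</p>
<div class="language-make highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Hypothetical rule: derive main.d from the line markers in the cpp output.
# $(CPP) is make's built-in variable, which defaults to "$(CC) -E".
%.d: %.c
	$(CPP) $(CPPFLAGS) $&lt; \
	| sed -n 's/^# [0-9]* "\([^&lt;"][^"]*\)".*/\1/p' \
	| sort -u \
	| sed 's|^|$*.o: |' &gt; $@
</code></pre></div></div>
<p>Each file named in a line marker becomes one <code class="language-plaintext highlighter-rouge">main.o: file</code> line; pseudo files such as <code class="language-plaintext highlighter-rouge">&lt;built-in&gt;</code> are skipped.</p>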

<p>This solution has multiple issues.</p>

<h3 id="vanishing-dependencies">Vanishing dependencies</h3>

<p>Suppose you refactor your code and remove <code class="language-plaintext highlighter-rouge">main.h</code>.
In that case the automatically generated dependencies cause a problem:
as <code class="language-plaintext highlighter-rouge">main.o</code> depends on <code class="language-plaintext highlighter-rouge">main.h</code>, which is no longer there, <code class="language-plaintext highlighter-rouge">make</code> fails because there is no rule to remake it.</p>

<p>To fix this, your dependency generation tool needs to output empty rules for all dependencies:</p>
<div class="language-make highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nl">main.h</span><span class="o">:</span>
<span class="nl">/usr/include/stdio.h</span><span class="o">:</span>
<span class="nl">/usr/include/stdint.h</span><span class="o">:</span>
</code></pre></div></div>
<p>There are three cases:</p>
<ol>
  <li>If the file still exists and was not updated (it is older than the target), this dependency triggers no remake, though others may still trigger one.</li>
  <li>If the file still exists and was updated (it is newer than the target), a rebuild of the target is triggered.</li>
  <li>If the file no longer exists, <code class="language-plaintext highlighter-rouge">make</code> invokes the <em>empty recipe</em> to remake it. That does not actually create the file, but <code class="language-plaintext highlighter-rouge">make</code> then considers it <em>newer than the target</em>, continues as in case 2, and remakes the target.</li>
</ol>

<p>Without that, every developer would have to invoke <code class="language-plaintext highlighter-rouge">make clean</code> to remove all targets and dependency files, resulting in a full rebuild:</p>
<div class="language-make highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nl">.PHONY</span><span class="o">:</span> <span class="nf">clean</span>
<span class="nl">clean</span><span class="o">:</span>
	<span class="nv">$(RM)</span> main <span class="k">*</span>.o <span class="k">*</span>.d
</code></pre></div></div>

<h3 id="maintaining-the-dependency-tool">Maintaining the dependency tool</h3>

<p>First of all, you must run the pre-processor a second time to generate the input for your dependency extraction tool.
For small projects that cost might be negligible, but for larger projects it adds up.</p>

<p>Second, you must maintain yet another tool.
While the pre-processed output is relatively easy to parse, newer compiler versions may add new features or change the output slightly, which your tool must then handle as well.</p>

<p>Third, you must make sure to invoke the pre-processor run with exactly the same arguments as the real compilation:
every <code class="language-plaintext highlighter-rouge">-Ddefine</code>, <code class="language-plaintext highlighter-rouge">-Idirectory</code>, <code class="language-plaintext highlighter-rouge">-include</code> and <code class="language-plaintext highlighter-rouge">-imacros</code> matters, as otherwise you might miss dependencies or record wrong ones.</p>

<p>You must also decide <strong>when</strong> to call your tool.
Many projects call it <strong>before</strong> the actual compilation, but that is not needed:
if the target is missing, <code class="language-plaintext highlighter-rouge">make</code> must remake it anyway.
If the target exists but you no longer have the dependency information, you must also remake the target, as you cannot rule out that a changed header introduces a significant change.</p>

<p>Generating the dependency information afterwards looks okay.
But you might get into situations, where you have stale information, for example if you interrupt <code class="language-plaintext highlighter-rouge">make</code> between the compilation and dependency-gathering steps.</p>

<p>Best would be to do it at the same time.
Luckily that is possible with <code class="language-plaintext highlighter-rouge">gcc</code> and other modern compilers like <code class="language-plaintext highlighter-rouge">clang</code>.</p>

<h2 id="the-gcc-way">The <abbr title="GNU Compiler Collection">gcc</abbr> way</h2>

<p><em>Modern</em> GCC has built-in support to <a href="https://gcc.gnu.org/onlinedocs/cpp/Invocation.html">generate dependency information</a> in <code class="language-plaintext highlighter-rouge">make</code> syntax itself:</p>
<ul>
  <li><code class="language-plaintext highlighter-rouge">-M</code> enables generating dependency information <strong>instead</strong> of compiling the file. The output is written to <abbr title="Standard Output">STDOUT</abbr> unless <code class="language-plaintext highlighter-rouge">-o</code> is used to redirect it to a file.</li>
  <li><code class="language-plaintext highlighter-rouge">-MM</code> similar to the above, but <em>system header files</em> are not mentioned.</li>
  <li><code class="language-plaintext highlighter-rouge">-MD</code> and <code class="language-plaintext highlighter-rouge">-MMD</code> are variants of <code class="language-plaintext highlighter-rouge">-M</code> and <code class="language-plaintext highlighter-rouge">-MM</code> respectively, which generate dependency information in <strong>addition</strong> to the requested action, e.g. <code class="language-plaintext highlighter-rouge">-c</code> to compile the unit.</li>
  <li><code class="language-plaintext highlighter-rouge">-MF file</code> writes the information to the given file instead of <abbr title="Standard Output">STDOUT</abbr>.</li>
  <li><code class="language-plaintext highlighter-rouge">-MP</code> adds an empty <em>phony</em> rule for each dependency to solve the <a href="#vanishing-dependencies">Vanishing dependencies</a> problem from above.</li>
  <li><code class="language-plaintext highlighter-rouge">-MT target</code> overrides the target name. By default the base name of the <em>main input file</em> is used, with its suffix replaced by <code class="language-plaintext highlighter-rouge">.o</code>.</li>
  <li><code class="language-plaintext highlighter-rouge">-MQ target</code> is a variant of the above that also quotes any <code class="language-plaintext highlighter-rouge">make</code> meta-characters, so that the name is not mangled by <code class="language-plaintext highlighter-rouge">make</code> but reaches the shell command as given.</li>
</ul>
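
<p>As an illustration, for the <code class="language-plaintext highlighter-rouge">main.c</code> from above a call such as <code class="language-plaintext highlighter-rouge">gcc -MMD -MP -MF main.d -c main.c</code> would write roughly the following into <code class="language-plaintext highlighter-rouge">main.d</code>. The exact line breaks may differ between compilers and versions:</p>
<div class="language-make highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Approximate content of the generated main.d
main.o: main.c main.h
main.h:
</code></pre></div></div>
<p>The system headers are absent because of <code class="language-plaintext highlighter-rouge">-MMD</code>, and the empty rule for <code class="language-plaintext highlighter-rouge">main.h</code> comes from <code class="language-plaintext highlighter-rouge">-MP</code>.</p>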

<p>So let’s rewrite our <code class="language-plaintext highlighter-rouge">Makefile</code> and try this:</p>
<div class="language-make highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nl">main</span><span class="o">:</span> <span class="nf">main.o</span>

<span class="nl">%.o %.d &amp;</span><span class="o">:</span> <span class="nf">%.c</span>
	<span class="nv">$(CC)</span> <span class="nv">$(CPPFLAGS)</span> <span class="nv">$(CFLAGS)</span> <span class="nt">-MMD</span> <span class="nt">-MF</span> <span class="nv">$*</span>.d <span class="nt">-MP</span> <span class="nt">-c</span> <span class="nt">-o</span> <span class="nv">$*</span>.o <span class="nv">$&lt;</span>

<span class="k">-include</span><span class="sx"> *.d</span>
</code></pre></div></div>

<ol>
  <li><code class="language-plaintext highlighter-rouge">&amp;:</code> tells <code class="language-plaintext highlighter-rouge">make</code> that the recipe generates both files at the same time (<a href="https://www.gnu.org/software/make/manual/html_node/Multiple-Targets.html">grouped targets</a>).</li>
  <li><code class="language-plaintext highlighter-rouge">-MMD</code> tells <code class="language-plaintext highlighter-rouge">gcc</code> to both compile and generate dependency information at the same time. <em>System header files</em> are excluded.</li>
  <li><code class="language-plaintext highlighter-rouge">-MF $*.d</code> tells <code class="language-plaintext highlighter-rouge">gcc</code> to write the dependency information into a file with the file name extension <code class="language-plaintext highlighter-rouge">.d</code>.</li>
  <li><code class="language-plaintext highlighter-rouge">-MP</code> tells <code class="language-plaintext highlighter-rouge">gcc</code> to generate empty phony rules for all included files, which makes the dependency information future-proof in case one of them gets deleted.</li>
  <li><code class="language-plaintext highlighter-rouge">-c -o $*.o $&lt;</code> tells <code class="language-plaintext highlighter-rouge">gcc</code> to compile the unit.</li>
  <li><code class="language-plaintext highlighter-rouge">-include *.d</code> includes the dependency information as far as it already exists.</li>
</ol>

<h3 id="first-compilation-issue">First compilation issue</h3>

<p>This does not work as expected:
<code class="language-plaintext highlighter-rouge">make</code> has a built-in mechanism to <a href="https://www.gnu.org/software/make/manual/html_node/Remaking-Makefiles.html">Remake Makefiles</a>.
All files included via <code class="language-plaintext highlighter-rouge">include</code> are considered <em>Makefiles</em> and <code class="language-plaintext highlighter-rouge">make</code> tries to update them.
If there is no file <code class="language-plaintext highlighter-rouge">*.d</code>, <code class="language-plaintext highlighter-rouge">make</code> applies our rule and will try to compile <code class="language-plaintext highlighter-rouge">*.c</code> to <code class="language-plaintext highlighter-rouge">*.d</code> :-(
(That is why the above rule already uses <code class="language-plaintext highlighter-rouge">$*.o</code> instead of <code class="language-plaintext highlighter-rouge">$@</code>, as the latter would be <code class="language-plaintext highlighter-rouge">*.d</code>, which would then be passed to both <code class="language-plaintext highlighter-rouge">-MF</code> and <code class="language-plaintext highlighter-rouge">-o</code> with catastrophic results.)</p>

<p>We can avoid this by explicitly using <code class="language-plaintext highlighter-rouge">$(wildcard )</code> to include only the existing files:</p>
<div class="language-make highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">-include</span><span class="sx"> $(wildcard *.d)</span>
</code></pre></div></div>

<h3 id="second-compilation-issue">Second compilation issue</h3>

<p>While the solution looks okay, actually it is not:
This way dependency information is optional.
If you delete all dependency files <code class="language-plaintext highlighter-rouge">*.d</code>, modify <code class="language-plaintext highlighter-rouge">main.h</code> and re-run <code class="language-plaintext highlighter-rouge">make</code>, nothing will happen.
We lost the information that <code class="language-plaintext highlighter-rouge">main.o</code> depends on <code class="language-plaintext highlighter-rouge">main.h</code>.
Therefore we must change the rule to require that the associated file <code class="language-plaintext highlighter-rouge">$*.d</code> always exists:</p>
<div class="language-make highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nl">%.o</span><span class="o">:</span> <span class="nf">%.c %.d</span>
	<span class="nv">$(CC)</span> <span class="nv">$(CPPFLAGS)</span> <span class="nv">$(CFLAGS)</span> <span class="nt">-MMD</span> <span class="nt">-MF</span> <span class="nv">$*</span>.d <span class="nt">-MP</span> <span class="nt">-c</span> <span class="nt">-o</span> <span class="nv">$@</span> <span class="nv">$&lt;</span>
<span class="nl">%.d</span><span class="o">:</span> <span class="nf">;</span>
<span class="nl">.NOTINTERMEDIATE</span><span class="o">:</span> <span class="nf">%.d</span>
</code></pre></div></div>
<ul>
  <li>The <em>empty rule</em> for <code class="language-plaintext highlighter-rouge">%.d</code> is needed for <code class="language-plaintext highlighter-rouge">make</code> to handle the case when the file is missing.
In that case we tell <code class="language-plaintext highlighter-rouge">make</code> to consider the file as <code class="language-plaintext highlighter-rouge">remade</code>, so it is newer than the target.
That remakes the target, which generates the real dependency information.</li>
  <li>
    <p>the <code class="language-plaintext highlighter-rouge">.NOTINTERMEDIATE</code> is needed as <code class="language-plaintext highlighter-rouge">%.d</code> is never mentioned as a real target.
<code class="language-plaintext highlighter-rouge">make</code> will search its <a href="https://www.gnu.org/software/make/manual/html_node/Chained-Rules.html">chain of implicit rules</a> <code class="language-plaintext highlighter-rouge">main</code> → <code class="language-plaintext highlighter-rouge">main.o</code> → <code class="language-plaintext highlighter-rouge">main.d</code> and mark it as <em>intermediate</em>.
Because of that the file is not remade and/or will be deleted if it is remade.
By marking it as non-intermediate we tell <code class="language-plaintext highlighter-rouge">make</code> to handle it as a regular file and to keep it afterwards.</p>

    <p>This is only available since <em>GNU make 4.4</em>!</p>
  </li>
</ul>

<h3 id="final-version--make-44">Final version — make 4.4</h3>

<div class="language-make highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c">#!/usr/bin/make -f
# Disable built-in rules (built-in variables such as $(RM) are still used)
</span><span class="nv">MAKEFLAGS</span> <span class="o">+=</span> <span class="nt">--no-builtin-rules</span>

<span class="nl">main</span><span class="o">:</span> <span class="nf">main.o</span>

<span class="nv">CFLAGS</span> <span class="o">:=</span> <span class="nt">-g</span>
<span class="nv">MYCFLAGS</span> <span class="o">:=</span> <span class="nt">-Wall</span> <span class="nt">-Werror</span>
<span class="nv">DEPFLAGS</span> <span class="o">=</span> <span class="nt">-MMD</span> <span class="nt">-MP</span> <span class="nt">-MF</span> <span class="nv">$*</span>.d <span class="nt">-MT</span> <span class="nv">$@</span>

<span class="nv">COMPILE.c</span> <span class="o">=</span> <span class="nv">$(CC)</span> <span class="nv">$(DEPFLAGS)</span> <span class="nv">$(CPPFLAGS)</span> <span class="nv">$(CFLAGS)</span> <span class="nv">$(MYCFLAGS)</span> <span class="nt">-c</span>

<span class="nl">%.o</span><span class="o">:</span> <span class="nf">%.c %.d</span>
	<span class="nv">$(COMPILE.c)</span> <span class="nv">$(OUTPUT_OPTION)</span> <span class="nv">$&lt;</span>
<span class="nl">%.d</span><span class="o">:</span> <span class="nf">;</span>
<span class="nl">.NOTINTERMEDIATE</span><span class="o">:</span> <span class="nf">%.d</span>
<span class="nl">%</span><span class="o">:</span> <span class="nf">%.o</span>
	<span class="nv">$(LINK.o)</span> <span class="nv">$^</span> <span class="nv">$(LOADLIBES)</span> <span class="nv">$(LDLIBS)</span> <span class="nt">-o</span> <span class="nv">$@</span>

<span class="k">-include</span><span class="sx"> $(wildcard *.d)</span>

<span class="nl">.PHONY</span><span class="o">:</span> <span class="nf">clean</span>
<span class="nl">clean</span><span class="o">:</span>
	<span class="nv">$(RM)</span> main <span class="k">*</span>.o <span class="k">*</span>.d
</code></pre></div></div>

<h3 id="final-version--make-43">Final version — make 4.3</h3>

<div class="language-make highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c">#!/usr/bin/make -f
# Disable built-in rules (built-in variables such as $(RM) are still used)
</span><span class="nv">MAKEFLAGS</span> <span class="o">+=</span> <span class="nt">--no-builtin-rules</span>

<span class="nv">SRCS</span> <span class="o">:=</span> main.c
<span class="nv">OBJS</span> <span class="o">:=</span> <span class="nv">$(SRCS:%.c=%.o)</span>
<span class="nv">DEPS</span> <span class="o">:=</span> <span class="nv">$(SRCS:%.c=%.d)</span>

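<span class="nl">main</span><span class="o">:</span> <span class="nf">$(OBJS)</span>
	<span class="nv">$(LINK.o)</span> <span class="nv">$^</span> <span class="nv">$(LOADLIBES)</span> <span class="nv">$(LDLIBS)</span> <span class="nt">-o</span> <span class="nv">$@</span>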

<span class="nv">CFLAGS</span> <span class="o">:=</span> <span class="nt">-g</span>
<span class="nv">MYCFLAGS</span> <span class="o">:=</span> <span class="nt">-Wall</span> <span class="nt">-Werror</span>
<span class="nv">DEPFLAGS</span> <span class="o">=</span> <span class="nt">-MMD</span> <span class="nt">-MP</span> <span class="nt">-MF</span> <span class="nv">$*</span>.d <span class="nt">-MT</span> <span class="nv">$@</span>

<span class="nv">COMPILE.c</span> <span class="o">=</span> <span class="nv">$(CC)</span> <span class="nv">$(DEPFLAGS)</span> <span class="nv">$(CPPFLAGS)</span> <span class="nv">$(CFLAGS)</span> <span class="nv">$(MYCFLAGS)</span> <span class="nt">-c</span>

<span class="nl">%.o</span><span class="o">:</span> <span class="nf">%.c %.d</span>
	<span class="nv">$(COMPILE.c)</span> <span class="nv">$(OUTPUT_OPTION)</span> <span class="nv">$&lt;</span>
<span class="nl">$(DEPS)</span><span class="o">:</span>

<span class="k">-include</span><span class="sx"> $(wildcard $(DEPS))</span>

<span class="nl">.PHONY</span><span class="o">:</span> <span class="nf">clean</span>
<span class="nl">clean</span><span class="o">:</span>
	<span class="nv">$(RM)</span> main <span class="nv">$(OBJS)</span> <span class="nv">$(DEPS)</span>
</code></pre></div></div>

<h2 id="the-kbuild-way">The kbuild way</h2>

<p>The Linux kernel uses its own <a href="https://docs.kernel.org/kbuild/index.html">build system</a> called <code class="language-plaintext highlighter-rouge">kbuild</code>, which is based on a bunch of <code class="language-plaintext highlighter-rouge">make</code> recipes.
It has some additional requirements:</p>
<ol>
  <li>The Linux kernel is highly configurable.
There is a huge <code class="language-plaintext highlighter-rouge">.config</code> file, which lists all options.
If that file were used as a prerequisite, all targets would be rebuilt each time a single option changed.
Therefore kbuild splits that big file into smaller chunks, so that each compilation unit depends only on the options it actually uses.</li>
  <li>The above solution does not track the <code class="language-plaintext highlighter-rouge">$(…FLAGS)</code> variables or <code class="language-plaintext highlighter-rouge">$(CC)</code>.
Changing them might require a complete rebuild to get a consistent kernel again.
Therefore kbuild also records the final command used to compile the target in the dependency information file.
On the next run the commands are compared, and the invocation may only be skipped if they match.</li>
</ol>
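
<p>The command tracking from point 2 could be sketched as follows. This is a heavily simplified illustration and not the actual kbuild macros: the names <code class="language-plaintext highlighter-rouge">cmd_cc</code>, <code class="language-plaintext highlighter-rouge">saved_cmd_…</code> and <code class="language-plaintext highlighter-rouge">need_run</code> are made up for this example, and commands containing quotes or <code class="language-plaintext highlighter-rouge">$</code> would need extra escaping:</p>
<div class="language-make highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Hypothetical simplification of the "compare the saved command" idea.
cmd_cc = $(CC) $(CPPFLAGS) $(CFLAGS) -c -o $@ $&lt;

# Words that differ between the recorded and the current command.
cmd_changed = $(filter-out $(saved_cmd_$@),$(cmd_cc))$(filter-out $(cmd_cc),$(saved_cmd_$@))
# Real prerequisites that are newer than the target (all of them if it is missing).
newer_prereqs = $(filter-out FORCE,$?)
need_run = $(strip $(newer_prereqs) $(cmd_changed))

# FORCE makes the recipe run every time; the recipe itself decides what to do.
%.o: %.c FORCE
	$(if $(need_run),$(cmd_cc) &amp;&amp; printf '%s\n' 'saved_cmd_$@ := $(cmd_cc)' &gt; $@.cmd,@:)

FORCE:

# Read back the commands recorded by earlier runs.
-include $(wildcard *.cmd)
</code></pre></div></div>
<p>Running <code class="language-plaintext highlighter-rouge">make main.o CFLAGS=-O2</code> after a plain <code class="language-plaintext highlighter-rouge">make main.o</code> then recompiles the file although no source file changed.</p>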

<p>For that, kbuild overrides most of <code class="language-plaintext highlighter-rouge">make</code>’s dependency mechanism with its own implementation:</p>
<ol>
  <li>Most targets have <code class="language-plaintext highlighter-rouge">FORCE</code> as a prerequisite, so that the recipe always runs.</li>
  <li>The recipe itself then uses some heavy macro magic to read back its dependency information from a file and compare that to the current run.
The command is only executed if any prerequisite or any relevant configuration option has changed.</li>
  <li>If a command cannot determine whether it needs to run, it runs by default but writes its output to a temporary file.
That file is then compared to the previous version.
    <ul>
      <li>if the content differs, the temporary file is renamed over the real output file.</li>
      <li>if the content did not change, the temporary file is deleted.
That way the old time stamp is preserved if no change did happen.
This is done to prevent needless downstream rebuilds.</li>
    </ul>
  </li>
</ol>
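
<p>The last point can be sketched in plain <code class="language-plaintext highlighter-rouge">make</code> as follows. The target <code class="language-plaintext highlighter-rouge">version.h</code> and the variable <code class="language-plaintext highlighter-rouge">VERSION</code> are assumptions for this example and do not come from kbuild:</p>
<div class="language-make highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Hypothetical generated header; VERSION is an assumed variable.
VERSION ?= 1.0

# Always regenerate into a temporary file, but only replace the real
# file if the content differs, so its time stamp survives otherwise.
version.h: FORCE
	printf '#define VERSION "%s"\n' '$(VERSION)' &gt; $@.tmp
	if [ -r $@ ] &amp;&amp; cmp -s $@.tmp $@; then rm -f $@.tmp; else mv -f $@.tmp $@; fi

FORCE:
</code></pre></div></div>
<p>Anything depending on <code class="language-plaintext highlighter-rouge">version.h</code> is then only rebuilt when the version really changes.</p>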

<h2 id="closing-word">Closing word</h2>

<p>Much of this was inspired by the article <a href="https://make.mad-scientist.net/papers/advanced-auto-dependency-generation/">Auto-Dependency Generation</a> by <em>Paul D. Smith</em>.
Thank you very much for writing this in the first place.
The main difference is that he uses a variable <code class="language-plaintext highlighter-rouge">$(SRCS)</code>, which explicitly lists all C source files.
That way he can <strong>explicitly</strong> name the expected <code class="language-plaintext highlighter-rouge">*.o</code> and <code class="language-plaintext highlighter-rouge">*.d</code> files, which bypasses the problem with intermediate files from my solution above.
That version also works for <code class="language-plaintext highlighter-rouge">make 4.3</code> and earlier, as <code class="language-plaintext highlighter-rouge">.NOTINTERMEDIATE</code> is only available since <code class="language-plaintext highlighter-rouge">make 4.4</code>.</p>

<!-- *[FD]: File Daemon -->
<!-- *[FD]: File Descriptor -->
<!-- *[GPT]: Generative Pre-trained Transformer -->
<!-- *[GPT]: Global Partitioning Table -->
<!-- *[GPT]: GUID Partition Table -->]]></content><author><name>Philipp Hahn</name></author><category term="linux" /><summary type="html"><![CDATA[make is used to build projects, e.g. compile source code into binaries. If the project consists of multiple files, explicit dependencies must be specified to run the command in the correct order. In addition to that Makefiles can also be used to track implicit dependencies: If one file is modified, only those commands are re-run which are needed. For large projects that can be a big time-saver if incremental changes are done. But how to do that properly (for a C project)?]]></summary></entry><entry><title type="html">Padding and alignment of C structs</title><link href="https://blog.pmhahn.de/C-padding/" rel="alternate" type="text/html" title="Padding and alignment of C structs" /><published>2025-10-01T10:46:00+02:00</published><updated>2025-10-01T10:46:00+02:00</updated><id>https://blog.pmhahn.de/C-padding</id><content type="html" xml:base="https://blog.pmhahn.de/C-padding/"><![CDATA[<p>Q: How to debug padding and alignment issues of <abbr title="Catalog">C</abbr> <code class="language-plaintext highlighter-rouge">struct</code>?</p>

<p>A: <code class="language-plaintext highlighter-rouge">gdb --silent --batch -ex 'ptype /o struct my_t' some.o</code></p>

<!--more-->

<p>Over the past few days I investigated some performance issues with a proprietary SoC:
The code consists of closed-source pre-compiled binaries combined with public header files.
Some public glue-code added accessors to allocate, copy, and free the data structures.</p>

<h2 id="padding">Padding</h2>

<p>We are required to extend the data structure and add some additional members.
As some of the code is closed-source, it is important not to change the layout of the existing structure.
Luckily <abbr title="Catalog">C</abbr> adds padding between members to align the next member according to common hardware constraints:</p>
<blockquote>
  <p>The start address of a 1,2,4,8,16,32,64,… sized member must align to that size.</p>
</blockquote>

<p>Consider this:</p>
<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="cp">#include</span> <span class="cpf">&lt;stdint.h&gt;</span><span class="cp">
</span><span class="k">struct</span> <span class="n">my0_t</span> <span class="p">{</span>
    <span class="kt">uint8_t</span> <span class="n">foo</span><span class="p">;</span>
    <span class="kt">uint32_t</span> <span class="n">bar</span><span class="p">;</span>
<span class="p">}</span> <span class="n">var0</span><span class="p">[</span><span class="mi">3</span><span class="p">];</span>
<span class="n">static_assert</span><span class="p">(</span><span class="k">sizeof</span><span class="p">(</span><span class="n">var0</span><span class="p">)</span> <span class="o">==</span> <span class="mi">8</span> <span class="o">*</span> <span class="mi">3</span><span class="p">,</span> <span class="s">"Unexpected sizeof"</span><span class="p">);</span>
</code></pre></div></div>

<p><code class="language-plaintext highlighter-rouge">bar</code> has size 32 bits or 4 bytes, so <code class="language-plaintext highlighter-rouge">var0[0]</code> must be placed into memory so that <code class="language-plaintext highlighter-rouge">(uintptr_t)&amp;var0[0].bar % 4 == 0</code> is true.
The <abbr title="Catalog">C</abbr>-compiler will thus add <em>padding bytes</em> before <code class="language-plaintext highlighter-rouge">bar</code> to satisfy that requirement.
Compiling the code with <code class="language-plaintext highlighter-rouge">-Wpadded</code> shows this:</p>
<div class="language-console highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="gp">$</span><span class="w"> </span>gcc <span class="nt">-c</span> <span class="nt">-g</span> <span class="nt">-Wpadded</span> c-padding.c
<span class="go">c-padding.c:4:14: warning: padding struct to align ‘bar’ [-Wpadded]
</span><span class="gp">    4 |     uint32_t bar;</span><span class="w">
</span><span class="go">      |              ^~~
</span></code></pre></div></div>

<p>But the warning does not tell you where the padding ends up.
Conceptually <code class="language-plaintext highlighter-rouge">gcc</code> could insert it <strong>after</strong> <code class="language-plaintext highlighter-rouge">foo</code> or <strong>before</strong> it:</p>
<pre><code class="language-struct">struct my1_t {
    uint8_t foo;
    uint8_t _padding[3];
    uint32_t bar;
} var1[3];
static_assert(sizeof(var1) == 8 * 3, "Unexpected sizeof");
</code></pre>
<p>or</p>
<pre><code class="language-struct">struct my2_t {
    uint8_t _padding[3];
    uint8_t foo;
    uint32_t bar;
} var2[3];
static_assert(sizeof(var2) == 8 * 3, "Unexpected sizeof");
</code></pre>

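<p>Both have the same size, but only the first one is what a conforming compiler produces: the <abbr title="Catalog">C</abbr> standard forbids padding before the first member, so padding is always inserted after the previous member and before the next member.</p>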

<p>You can use <code class="language-plaintext highlighter-rouge">gdb</code>’s <code class="language-plaintext highlighter-rouge">ptype</code> command to dump the exact layout including offset, size and inserted padding:</p>
<div class="language-console highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="gp">$</span><span class="w"> </span>gdb <span class="nt">--silent</span> <span class="nt">--batch</span> <span class="nt">-ex</span> <span class="s1">'ptype /o struct my0_t'</span> c-padding.o
<span class="go">/* offset      |    size */  type = struct my0_t {
</span><span class="gp">/*      0      |       1 */    uint8_t foo;</span><span class="w">
</span><span class="go">/* XXX  3-byte hole      */
</span><span class="gp">/*      4      |       4 */    uint32_t bar;</span><span class="w">
</span><span class="go">                               /* total size (bytes):    8 */
                             }
</span></code></pre></div></div>
<p>For this to work you need DWARF debugging information.
So please make sure you compile your code with <code class="language-plaintext highlighter-rouge">-g</code> enabled!</p>
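<p>If no debugger is at hand, the same layout can also be pinned down at compile time.
The following is only a minimal sketch, assuming a C11 compiler and an <abbr title="Application Binary Interface">ABI</abbr> where <code class="language-plaintext highlighter-rouge">uint32_t</code> is 4-byte aligned; <code class="language-plaintext highlighter-rouge">my0_bar_offset()</code> is a hypothetical helper that exists purely for illustration:</p>
<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code>#include &lt;assert.h&gt;   /* static_assert (C11) */
#include &lt;stddef.h&gt;   /* offsetof, size_t */
#include &lt;stdint.h&gt;

struct my0_t {
    uint8_t foo;
    uint32_t bar;
};

/* These checks fail at compile time as soon as the layout changes. */
static_assert(offsetof(struct my0_t, foo) == 0, "foo moved");
static_assert(offsetof(struct my0_t, bar) == 4, "bar moved");
static_assert(sizeof(struct my0_t) == 8, "Unexpected sizeof");

/* Hypothetical helper: report the offset of bar at run time. */
size_t my0_bar_offset(void)
{
    return offsetof(struct my0_t, bar);
}
</code></pre></div></div>
<p>If somebody later changes the <code class="language-plaintext highlighter-rouge">struct</code>, the build breaks instead of the closed-source binary.</p>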

<p>This extra padding increases the size of your <code class="language-plaintext highlighter-rouge">struct</code>, which might be undesired:
On embedded systems you often have less memory and excessive padding might waste a lot of memory.
To minimize this, you have multiple options:</p>

<h3 id="packing">Packing</h3>
<p>You can declare the <code class="language-plaintext highlighter-rouge">struct</code> as <em>packed</em>:</p>
<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">struct</span> <span class="n">my3_t</span> <span class="p">{</span>
    <span class="kt">uint8_t</span> <span class="n">foo</span><span class="p">;</span>
    <span class="kt">uint32_t</span> <span class="n">bar</span><span class="p">;</span>
<span class="p">}</span> <span class="n">__attribute__</span><span class="p">((</span><span class="n">packed</span><span class="p">))</span> <span class="n">var3</span><span class="p">[</span><span class="mi">3</span><span class="p">];</span>
<span class="n">static_assert</span><span class="p">(</span><span class="k">sizeof</span><span class="p">(</span><span class="n">var3</span><span class="p">)</span> <span class="o">==</span> <span class="mi">5</span> <span class="o">*</span> <span class="mi">3</span><span class="p">,</span> <span class="s">"Unexpected sizeof"</span><span class="p">);</span>
</code></pre></div></div>
<p>This makes the <code class="language-plaintext highlighter-rouge">struct</code> as compact as possible by <strong>not</strong> inserting any padding automatically.
But you will get into trouble and risk getting a <code class="language-plaintext highlighter-rouge">SIGBUS</code> error on some architectures:
Accessing a 32 bit variable which is not 4 byte aligned requires additional work:</p>
<ol>
  <li>Either the hardware has some extra logic to split non-aligned memory access into multiple accesses and to recombine both parts into the final value,</li>
  <li>Or the compiler has to generate extra code to not do the unaligned access,</li>
  <li>Or your program terminates with <code class="language-plaintext highlighter-rouge">SIGBUS</code> as the processor raises the <em>unaligned trap</em>.</li>
</ol>

<p>Please do not use <code class="language-plaintext highlighter-rouge">-fpack-struct</code> to make every <code class="language-plaintext highlighter-rouge">struct</code> packed by default!</p>
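<p>When a packed member has to be handed to code that expects an aligned value, copying it out is a common way to stay safe.
This is a sketch under my own assumptions, not code from the SoC vendor: <code class="language-plaintext highlighter-rouge">my3_get_bar()</code> is a hypothetical accessor, and <code class="language-plaintext highlighter-rouge">memcpy()</code> is used because it makes no alignment assumptions about its arguments:</p>
<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code>#include &lt;assert.h&gt;   /* static_assert (C11) */
#include &lt;stddef.h&gt;   /* offsetof */
#include &lt;stdint.h&gt;
#include &lt;string.h&gt;   /* memcpy */

struct my3_t {
    uint8_t foo;
    uint32_t bar;
} __attribute__((packed));

static_assert(offsetof(struct my3_t, bar) == 1, "bar is expected to be unaligned");

/* Hypothetical accessor: copy the unaligned member into an aligned local. */
uint32_t my3_get_bar(const struct my3_t *p)
{
    uint32_t value;

    memcpy(&amp;value, (const unsigned char *)p + offsetof(struct my3_t, bar), sizeof(value));
    return value;
}
</code></pre></div></div>
<p>The caller then works with the returned copy and never holds a <code class="language-plaintext highlighter-rouge">uint32_t *</code> pointing into the packed <code class="language-plaintext highlighter-rouge">struct</code>.</p>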

<h3 id="re-ordering-descending-by-size">Re-ordering descending by size</h3>
<p>Most often you can re-order your members descending by size – assuming sizes being a power-of-two.
The compiler still adds padding, but only at the end of the structure.
That way you do not have holes in the middle.</p>

<p>That changes the layout and breaks any <abbr title="Application Binary Interface">ABI</abbr> compatibility!
So do not do this with <code class="language-plaintext highlighter-rouge">struct</code>s which are used to communicate with your hardware or with some closed-source binary that assumes the old layout.</p>
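<p>To get a feeling for the effect, here is a small sketch with two made-up <code class="language-plaintext highlighter-rouge">struct</code>s (<code class="language-plaintext highlighter-rouge">unsorted_t</code> and <code class="language-plaintext highlighter-rouge">sorted_t</code> are hypothetical names), assuming the usual 4-byte alignment for <code class="language-plaintext highlighter-rouge">uint32_t</code>:</p>
<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code>#include &lt;assert.h&gt;   /* static_assert (C11) */
#include &lt;stddef.h&gt;   /* size_t */
#include &lt;stdint.h&gt;

/* Members in "historic" order: one hole in the middle plus tail padding. */
struct unsorted_t {
    uint8_t a;      /* offset 0, followed by a 3-byte hole */
    uint32_t b;     /* offset 4 */
    uint8_t c;      /* offset 8, followed by 3 bytes of tail padding */
};

/* Same members sorted descending by size: only tail padding remains. */
struct sorted_t {
    uint32_t b;     /* offset 0 */
    uint8_t a;      /* offset 4 */
    uint8_t c;      /* offset 5, followed by 2 bytes of tail padding */
};

static_assert(sizeof(struct unsorted_t) == 12, "Unexpected sizeof");
static_assert(sizeof(struct sorted_t) == 8, "Unexpected sizeof");

/* Hypothetical helper: bytes saved per array element by re-ordering. */
size_t bytes_saved_per_element(void)
{
    return sizeof(struct unsorted_t) - sizeof(struct sorted_t);
}
</code></pre></div></div>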

<h3 id="careful-re-ordering">Careful re-ordering</h3>
<p>If you only need to insert some small data, look for those holes:
As the compiler added the padding there <strong>automatically</strong>, there is no guarantee that these bits/bytes are zero-initialized.
If you need that guarantee, you must manually insert padding bytes!</p>

<p>On the other hand, that provides the opportunity to re-use those <em>undefined bits</em> for additional members.
Just look for a hole which is large enough (and suitably aligned) for your data and add your member in between the members bordering that hole.</p>

<p>Just be careful with structures which are used with hardware:
If their accessor function does a <code class="language-plaintext highlighter-rouge">memset(…, 0, …)</code> to initialize the <code class="language-plaintext highlighter-rouge">struct</code> to zero, it might be important that those bits remain cleared.
If you then start using those bits, the hardware might get confused.</p>
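<p>Whether a new member really fits into a hole can be checked by the compiler.
The following sketch assumes the <code class="language-plaintext highlighter-rouge">my0_t</code> layout from above; <code class="language-plaintext highlighter-rouge">old_t</code>, <code class="language-plaintext highlighter-rouge">new_t</code>, <code class="language-plaintext highlighter-rouge">flags</code> and <code class="language-plaintext highlighter-rouge">layout_is_compatible()</code> are hypothetical names used only for illustration:</p>
<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code>#include &lt;assert.h&gt;   /* static_assert (C11) */
#include &lt;stddef.h&gt;   /* offsetof */
#include &lt;stdint.h&gt;

/* Layout the closed-source binary was compiled against. */
struct old_t {
    uint8_t foo;
    /* 3-byte hole inserted by the compiler */
    uint32_t bar;
};

/* Extended layout: the new member lives inside the former hole. */
struct new_t {
    uint8_t foo;
    uint8_t flags;  /* hypothetical new member */
    /* 2-byte hole remains */
    uint32_t bar;
};

static_assert(sizeof(struct new_t) == sizeof(struct old_t), "size changed");
static_assert(offsetof(struct new_t, bar) == offsetof(struct old_t, bar), "bar moved");

/* Hypothetical run-time variant of the same check. */
int layout_is_compatible(void)
{
    return offsetof(struct new_t, foo) == offsetof(struct old_t, foo)
        &amp;&amp; offsetof(struct new_t, bar) == offsetof(struct old_t, bar)
        &amp;&amp; sizeof(struct new_t) == sizeof(struct old_t);
}
</code></pre></div></div>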

<h2 id="alignment">Alignment</h2>
<p>You might have noticed that <code class="language-plaintext highlighter-rouge">sizeof(struct my0_t) == 8</code> and not <code class="language-plaintext highlighter-rouge">5 == sizeof(uint8_t) + sizeof(uint32_t)</code>.
<code class="language-plaintext highlighter-rouge">gcc</code> also adds padding after the last member to extend the <code class="language-plaintext highlighter-rouge">struct</code> until its <code class="language-plaintext highlighter-rouge">sizeof</code> is a multiple of the alignment of its most strictly aligned member.
This is important for arrays where multiple instances are placed after each other.
There each instance’s start address must be aligned properly, which requires padding in between.
The distance between two elements is called “stride size”, which equals the <code class="language-plaintext highlighter-rouge">sizeof</code>.</p>

<p>This also applies to nested <code class="language-plaintext highlighter-rouge">struct</code>s like this:</p>
<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">struct</span> <span class="n">my5_t</span> <span class="p">{</span>
    <span class="k">struct</span> <span class="n">my4_t</span> <span class="n">baz</span><span class="p">;</span>
    <span class="kt">uint8_t</span> <span class="n">bla</span><span class="p">;</span>
<span class="p">}</span> <span class="n">var5</span><span class="p">[</span><span class="mi">3</span><span class="p">];</span>
<span class="n">static_assert</span><span class="p">(</span><span class="k">sizeof</span><span class="p">(</span><span class="n">var5</span><span class="p">)</span> <span class="o">==</span> <span class="mi">12</span> <span class="o">*</span> <span class="mi">3</span><span class="p">,</span> <span class="s">"Unexpected sizeof"</span><span class="p">);</span>
</code></pre></div></div>
<p>This might be unexpected as <code class="language-plaintext highlighter-rouge">my4_t</code> ends with 3 padding bytes, where <code class="language-plaintext highlighter-rouge">bla</code> might fit in.
Instead <code class="language-plaintext highlighter-rouge">bla</code> gets placed after the padding from <code class="language-plaintext highlighter-rouge">baz</code>, after which 3 more padding bytes are required.
So in total you get 6 bytes of padding.</p>
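<p>The numbers can be verified with a few assertions.
The definition of <code class="language-plaintext highlighter-rouge">my4_t</code> is not shown above, so the sketch below <strong>assumes</strong> it is the size-ordered variant with <code class="language-plaintext highlighter-rouge">bar</code> first; <code class="language-plaintext highlighter-rouge">my5_stride()</code> is a hypothetical helper:</p>
<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code>#include &lt;assert.h&gt;   /* static_assert (C11) */
#include &lt;stddef.h&gt;   /* offsetof, size_t */
#include &lt;stdint.h&gt;

/* Assumed definition: 5 bytes of data plus 3 bytes of tail padding. */
struct my4_t {
    uint32_t bar;
    uint8_t foo;
};

struct my5_t {
    struct my4_t baz;   /* occupies 8 bytes including its tail padding */
    uint8_t bla;        /* offset 8, followed by 3 more padding bytes */
};

static_assert(sizeof(struct my4_t) == 8, "Unexpected sizeof");
static_assert(offsetof(struct my5_t, bla) == 8, "bla re-used the tail padding");
static_assert(sizeof(struct my5_t) == 12, "Unexpected sizeof");

/* Hypothetical helper: distance in bytes between two array elements. */
size_t my5_stride(void)
{
    static struct my5_t arr[2];

    return (size_t)((char *)&amp;arr[1] - (char *)&amp;arr[0]);
}
</code></pre></div></div>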

<h3 id="cache-line-size">Cache line size</h3>
<p>Alignment becomes even more important for performance.
Modern <abbr title="Central Processing Units">CPUs</abbr> have lots of caches and their <em>line size</em> specifies the smallest quantity for data transfer.
Even when you only require a single bit, the cache will transfer 32 or 64 or even more bytes from <abbr title="Random Access Memory">RAM</abbr>.</p>
<ul>
  <li>with more tightly packed <code class="language-plaintext highlighter-rouge">struct</code>s you get more data per cache-line and require fewer cache-lines, leaving more free cache lines for other tasks.</li>
  <li>on the other hand <em>false sharing</em> might become a performance issue with multi-threading, where data with different access patterns are stored in the same cache line.</li>
</ul>

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">struct</span> <span class="n">my6_t</span> <span class="p">{</span>
    <span class="kt">uint8_t</span> <span class="n">foo</span><span class="p">;</span>
    <span class="kt">uint32_t</span> <span class="n">bar</span><span class="p">;</span>
<span class="p">}</span> <span class="n">__attribute__</span><span class="p">((</span><span class="n">aligned</span><span class="p">(</span><span class="mi">32</span><span class="p">)))</span> <span class="n">var6</span><span class="p">[</span><span class="mi">3</span><span class="p">];</span>
<span class="n">static_assert</span><span class="p">(</span><span class="k">sizeof</span><span class="p">(</span><span class="n">var6</span><span class="p">)</span> <span class="o">==</span> <span class="mi">32</span> <span class="o">*</span> <span class="mi">3</span><span class="p">,</span> <span class="s">"Unexpected sizeof"</span><span class="p">);</span>
</code></pre></div></div>

<p>In this case we get 3 bytes of padding between <code class="language-plaintext highlighter-rouge">foo</code> and <code class="language-plaintext highlighter-rouge">bar</code>.
But we also get 24 bytes of padding after <code class="language-plaintext highlighter-rouge">bar</code> to make <code class="language-plaintext highlighter-rouge">sizeof(struct my6_t)</code> a multiple of 32 as requested by <code class="language-plaintext highlighter-rouge">__attribute__((aligned(32)))</code>.</p>

<p>This easily becomes worse with nested <code class="language-plaintext highlighter-rouge">struct</code>s where inner <code class="language-plaintext highlighter-rouge">struct</code>s also have alignments:</p>
<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">struct</span> <span class="n">my7_t</span> <span class="p">{</span>
    <span class="kt">uint8_t</span> <span class="n">foo</span><span class="p">;</span>
    <span class="k">struct</span> <span class="n">inner</span> <span class="p">{</span>
        <span class="kt">uint8_t</span> <span class="n">foo</span><span class="p">;</span>
    <span class="p">}</span> <span class="n">__attribute__</span><span class="p">((</span><span class="n">aligned</span><span class="p">(</span><span class="mi">32</span><span class="p">)))</span> <span class="n">bar</span><span class="p">;</span>
<span class="p">}</span> <span class="n">var7</span><span class="p">[</span><span class="mi">3</span><span class="p">];</span>
<span class="n">static_assert</span><span class="p">(</span><span class="k">sizeof</span><span class="p">(</span><span class="n">var7</span><span class="p">)</span> <span class="o">==</span> <span class="mi">64</span> <span class="o">*</span> <span class="mi">3</span><span class="p">,</span> <span class="s">"Unexpected sizeof"</span><span class="p">);</span>
</code></pre></div></div>
<p>Running <code class="language-plaintext highlighter-rouge">gdb</code> shows what happens:</p>
<div class="language-console highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="gp">$</span><span class="w"> </span>gdb <span class="nt">--silent</span> <span class="nt">--batch</span> <span class="nt">-ex</span> <span class="s1">'ptype /o var7'</span> c-padding.o
<span class="go">type = struct my7_t {
</span><span class="gp">/*      0      |       1 */    uint8_t foo;</span><span class="w">
</span><span class="go">/* XXX 31-byte hole      */
/*     32      |      32 */    struct inner {
</span><span class="gp">/*     32      |       1 */        uint8_t foo;</span><span class="w">
</span><span class="go">/* XXX 31-byte padding   */
                                   /* total size (bytes):   32 */
</span><span class="gp">                               } bar;</span><span class="w">
</span><span class="go">                               /* total size (bytes):   64 */
                             } [3]
</span></code></pre></div></div>
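<p>The same attribute can be turned around to avoid <em>false sharing</em>.
The following is only a sketch: the cache line size of 64 bytes is an assumption that must be checked for the real <abbr title="Central Processing Unit">CPU</abbr>, and <code class="language-plaintext highlighter-rouge">counter_t</code>, <code class="language-plaintext highlighter-rouge">counters</code> and <code class="language-plaintext highlighter-rouge">on_different_lines()</code> are hypothetical names:</p>
<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code>#include &lt;assert.h&gt;   /* static_assert (C11) */
#include &lt;stdint.h&gt;

#define CACHE_LINE_SIZE 64  /* assumption: verify for your CPU */

/* One counter per cache line, so two threads never write to the same line. */
struct counter_t {
    uint64_t hits;
} __attribute__((aligned(CACHE_LINE_SIZE)));

static_assert(sizeof(struct counter_t) == CACHE_LINE_SIZE, "Unexpected sizeof");

/* One slot per (hypothetical) worker thread. */
struct counter_t counters[4];

/* Returns non-zero if both slots start in different cache lines. */
int on_different_lines(const struct counter_t *a, const struct counter_t *b)
{
    return (uintptr_t)a / CACHE_LINE_SIZE != (uintptr_t)b / CACHE_LINE_SIZE;
}
</code></pre></div></div>
<p>The price is 56 bytes of padding per counter, which is exactly the trade-off described above.</p>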

<h2 id="summary">Summary</h2>
<ul>
  <li>Use <code class="language-plaintext highlighter-rouge">gcc</code>’s <code class="language-plaintext highlighter-rouge">-Wpadded</code> to get a warning.</li>
  <li>Use <code class="language-plaintext highlighter-rouge">gdb</code>’s <code class="language-plaintext highlighter-rouge">ptype</code> to print the real layout.</li>
  <li>Verify your assumptions, especially if the same code is compiled for multiple platforms with different alignment requirements.</li>
  <li>Do not trust the comments in the code claiming ancient values for <code class="language-plaintext highlighter-rouge">sizeof</code> or proper cache line alignment.</li>
  <li>Explicitly add padding bytes as they are then also initialized; otherwise the compiler may do as it likes.</li>
</ul>

<h2 id="further-reading">Further reading</h2>
<ul>
  <li><a href="https://gcc.gnu.org/onlinedocs/gcc/Warning-Options.html#index-Wpadded"><abbr title="GNU Compiler Collection">gcc</abbr>: -Wpadded</a></li>
  <li><a href="http://www.catb.org/esr/structure-packing/">ESR: The Lost Art of Structure Packing</a></li>
</ul>

<!-- *[FD]: File Daemon -->
<!-- *[FD]: File Descriptor -->
<!-- *[GPT]: Generative Pre-trained Transformer -->
<!-- *[GPT]: Global Partitioning Table -->
<!-- *[GPT]: GUID Partition Table -->]]></content><author><name>Philipp Hahn</name></author><category term="c" /><summary type="html"><![CDATA[Q: How to debug padding and alignment issues of C struct? A: gdb --silent --batch -ex 'ptype /o struct my_t' some.o]]></summary></entry><entry><title type="html">Linux Kernel Module Symbol Versioning</title><link href="https://blog.pmhahn.de/linux-kernel-symbol-versioning/" rel="alternate" type="text/html" title="Linux Kernel Module Symbol Versioning" /><published>2025-08-23T12:59:00+02:00</published><updated>2025-08-23T12:59:00+02:00</updated><id>https://blog.pmhahn.de/linux-kernel-symbol-versioning</id><content type="html" xml:base="https://blog.pmhahn.de/linux-kernel-symbol-versioning/"><![CDATA[<p>The Linux kernel itself and its modules may export symbols, so that other modules can import and use them.
As the functions are written in <abbr title="Catalog">C</abbr>, it is important that the function signature matches:</p>
<ul>
  <li>the number of arguments must match</li>
  <li>the ordering of the arguments must match</li>
  <li>the data types must match, which includes the structure and layout of all input and output parameters</li>
</ul>

<p>If any of them changes, the <em>Application Binary Interface</em> (<abbr title="Application Binary Interface">ABI</abbr>) changes and you risk crashing the kernel.
If you’re lucky, recompiling the kernel and the modules is enough for both ends to pick up the new <em>Application Programming Interface</em> (<abbr title="Application Programming Interface">API</abbr>).</p>

<p>To detect such breaking changes, the Linux kernel can be compiled with <code class="language-plaintext highlighter-rouge">CONFIG_MODVERSIONS</code> enabled:
This calculates a <em>Cyclic Redundancy Check</em> (CRC) checksum over the function signature and embeds this information with the kernel and the modules.
The dynamic linker of the Linux kernel checks that, for each requested symbol, its CRC matches the CRC exported by the Linux kernel or by an already loaded module.
A module is only loaded, if a match is found for all symbols.
Otherwise loading fails.</p>
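<p>Conceptually that check boils down to comparing two tables.
The following is a simplified sketch and <strong>not</strong> the kernel’s actual code: <code class="language-plaintext highlighter-rouge">struct sym_crc</code> and <code class="language-plaintext highlighter-rouge">versions_match()</code> are hypothetical names I made up for illustration:</p>
<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code>#include &lt;assert.h&gt;
#include &lt;stddef.h&gt;   /* size_t */
#include &lt;stdint.h&gt;
#include &lt;string.h&gt;   /* strcmp */

/* Hypothetical, simplified record: a symbol name plus its CRC. */
struct sym_crc {
    const char *name;
    uint32_t crc;
};

/* Returns 1 if every imported symbol is exported with the same CRC. */
int versions_match(const struct sym_crc *imports, size_t n_imports,
                   const struct sym_crc *exports, size_t n_exports)
{
    for (size_t i = 0; i &lt; n_imports; i++) {
        int found = 0;

        for (size_t j = 0; j &lt; n_exports; j++) {
            if (strcmp(imports[i].name, exports[j].name) == 0 &amp;&amp;
                imports[i].crc == exports[j].crc) {
                found = 1;
                break;
            }
        }
        if (!found)
            return 0;   /* unknown symbol or CRC mismatch: refuse to load */
    }
    return 1;
}
</code></pre></div></div>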

<!--more-->

<h2 id="rust-goes-dwarf">Rust goes DWARF</h2>

<p>The mechanism described here does not work with Rust.
As such the Linux kernel learned a new trick and can use the DWARF (<em>Debugging With Arbitrary Record Formats</em>) debugging information to calculate the CRC.
When <code class="language-plaintext highlighter-rouge">CONFIG_RUST</code> is enabled, <code class="language-plaintext highlighter-rouge">gendwarfksyms</code> is used instead of <code class="language-plaintext highlighter-rouge">genksyms</code>.
Both versions are incompatible as they calculate different CRCs for the same function.
But they work similarly enough that I will not go into details here.
If you’re interested, look for <code class="language-plaintext highlighter-rouge">CONFIG_EXTENDED_MODVERSIONS</code>.</p>

<h2 id="executable-and-linkable-format">Executable and Linkable Format</h2>

<p>Linux kernel modules are object files using the <em>Executable and Linkable Format</em> (ELF).
Instead of using the well-known suffix <code class="language-plaintext highlighter-rouge">.o</code>, they use the suffix <code class="language-plaintext highlighter-rouge">.ko</code>, but are otherwise the same.
They consist of multiple <em>sections</em> containing executable code, read-only constants, initialized data and other information required for linking.</p>

<details><summary>Example: ELF sections of a Linux Kernel Module</summary>
<div>

    <div class="language-console highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="gp">$</span><span class="w"> </span>objdump <span class="nt">--section-headers</span> <span class="nt">--wide</span> avm-modver.ko
<span class="gp">$</span><span class="w"> </span><span class="nv">LC_ALL</span><span class="o">=</span>C readelf <span class="nt">--wide</span> <span class="nt">--section-headers</span> avm-modver.ko
<span class="go">There are 38 section headers, starting at offset 0x25d80:

Section Headers:
  [Nr] Name                              Type     Address     Off   Size ES Flg Lk Inf Al  Usage
🔵[ 0]                                   NULL            0      0      0  0      0   0  0  ELF header
🟠[ 1] .note.gnu.build-id                NOTE            0     40     24  0   A  0   0  4  unique build ID bitstring
🟣[ 2] .note.Linux                       NOTE            0     64     30  0   A  0   0  4  Architecture data
🟢[ 3] .text                             PROGBITS        0     a0     1f  0  AX  0   0 16  Code
⚪[ 4] .rela.text                        RELA            0  14e50     30 18   I 35   3  8
🔴[ 5] __ksymtab                         PROGBITS        0     c0      c  0   A  0   0  4  EXPORT_SYMBOL
⚪[ 6] .rela__ksymtab                    RELA            0  14e80     48 18   I 35   5  8
🔴[ 7] __kcrctab                         PROGBITS        0     cc      4  0   A  0   0  4  CRC
🟣[ 8] __mcount_loc                      PROGBITS        0     d0      8  0   A  0   0  1  ftrace()
⚪[ 9] .rela__mcount_loc                 RELA            0  14ec8     18 18   I 35   8  8
🟣[10] .modinfo                          PROGBITS        0     d8     92  0   A  0   0  1  MODULE_INFO
🟣[11] .return_sites                     PROGBITS        0    16a      4  0   A  0   0  1  Live patching
⚪[12] .rela.return_sites                RELA            0  14ee0     18 18   I 35  11  8
🟣[13] .call_sites                       PROGBITS        0    16e      4  0   A  0   0  1  Live patching
⚪[14] .rela.call_sites                  RELA            0  14ef8     18 18   I 35  13  8
🔴[15] __ksymtab_strings                 PROGBITS        0    172      d  1 AMS  0   0  1  EXPORT_SYMBOL
🔴[16] __versions                        PROGBITS        0    180     51  0   A  0   0 32  CRC
🟣[17] __patchable_function_entries      PROGBITS       58    1d8      8  0 WAL  3   0  8  NOPs
⚪[18] .rela__patchable_function_entries RELA            0  14f10     18 18   I 35  17  8
⚫[19] .data                             PROGBITS        0    1e0      0  0  WA  0   0  1  Initialized data
🟣[20] .gnu.linkonce.this_module         PROGBITS        0    200    500  0  WA  0   0 64
⚫[21] .bss                              NOBITS          0    700      0  0  WA  0   0  1  Uninitialized data
🟠[22] .debug_info                       PROGBITS        0    700   b51f  0      0   0  1
⚪[23] .rela.debug_info                  RELA            0  14f28   fc00 18   I 35  22  8
🟠[24] .debug_abbrev                     PROGBITS        0   bc1f    71e  0      0   0  1
🟠[25] .debug_aranges                    PROGBITS        0   c33d     50  0      0   0  1
⚪[26] .rela.debug_aranges               RELA            0  24b28     48 18   I 35  25  8
🟠[27] .debug_line                       PROGBITS        0   c38d    3be  0      0   0  1
⚪[28] .rela.debug_line                  RELA            0  24b70   1050 18   I 35  27  8
🟠[29] .debug_str                        PROGBITS        0   c74b   7792  1  MS  0   0  1
🟠[30] .debug_line_str                   PROGBITS        0  13edd    943  1  MS  0   0  1
🟡[31] .comment                          PROGBITS        0  14820     58  1  MS  0   0  1  Compiler version
🟡[32] .note.GNU-stack                   PROGBITS        0  14878      0  0      0   0  1  Stack hardening flag
🟠[33] .debug_frame                      PROGBITS        0  14878     40  0      0   0  8
⚪[34] .rela.debug_frame                 RELA            0  25bc0     30 18   I 35  33  8
🔵[35] .symtab                           SYMTAB          0  148b8    438 18     36  40  8  Symbols
🔵[36] .strtab                           STRTAB          0  14cf0    15f  0      0   0  1  Symbol names
🔵[37] .shstrtab                         STRTAB          0  25bf0    18b  0      0   0  1  Section names
Key to Flags:
  Write, Alloc, eXecute, Merge, Strings, Info, Link order, extra Os processing required, Group, TLS,
  Compressed, x=unknown, o=OS specific, Exclude, mbinD, large, processor specific
</span></code></pre></div>    </div>

    <ul>
      <li>🔴 Linux kernel module specific sections</li>
      <li>🟣 Linux specific sections</li>
      <li>⚫ data</li>
      <li>🟢 executable code</li>
      <li>⚪ relocations</li>
      <li>🟠 debug information</li>
      <li>🟡 compiler information</li>
      <li>🔵 ELF</li>
    </ul>

  </div>
</details>

<p>The <em>section names</em> have varying lengths.
As such the names are collected in their own section called <code class="language-plaintext highlighter-rouge">.shstrtab</code>, which is referenced by index in the ELF file header.
All sections are listed in the <em>section header table</em> and their names are referenced by offset.
Run <code class="language-plaintext highlighter-rouge">readelf -p .shstrtab avm-job.ko</code> to dump those names.</p>

<p>Symbols work similarly:
Their names are collected in the section <code class="language-plaintext highlighter-rouge">.strtab</code> and referenced via offset from <code class="language-plaintext highlighter-rouge">.symtab</code>.
Run <code class="language-plaintext highlighter-rouge">readelf -p .strtab avm-job.ko</code> to dump those names.</p>
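<p>Looking up a name in such a string table is nothing more than pointer arithmetic.
A minimal sketch, assuming the table has already been read into memory; <code class="language-plaintext highlighter-rouge">strtab_get()</code> is a hypothetical helper and not part of any ELF library:</p>
<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code>#include &lt;assert.h&gt;
#include &lt;stddef.h&gt;   /* size_t, NULL */
#include &lt;string.h&gt;   /* memchr */

/*
 * Return the NUL-terminated name starting at `offset` inside a string table
 * of `size` bytes, or NULL if the offset is out of range or the name is not
 * terminated within the table.
 */
const char *strtab_get(const char *table, size_t size, size_t offset)
{
    if (offset &gt;= size)
        return NULL;
    if (memchr(table + offset, '\0', size - offset) == NULL)
        return NULL;
    return table + offset;
}
</code></pre></div></div>
<p>The section header table stores such offsets into <code class="language-plaintext highlighter-rouge">.shstrtab</code> and the symbol table stores them into <code class="language-plaintext highlighter-rouge">.strtab</code>.</p>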

<p><code class="language-plaintext highlighter-rouge">.symtab</code> contains all symbols and <code class="language-plaintext highlighter-rouge">.strtab</code> their names.
When <em>shared objects</em> (<code class="language-plaintext highlighter-rouge">.so</code>) are used, the linker moves those symbols to <code class="language-plaintext highlighter-rouge">.dynsym</code> and their names to <code class="language-plaintext highlighter-rouge">.dynstr</code>.
Already resolved symbols may be removed, or both tables <code class="language-plaintext highlighter-rouge">.symtab</code> and <code class="language-plaintext highlighter-rouge">.strtab</code> may be stripped completely.</p>

<p>The remaining dynamic symbols are only resolved by the <em>dynamic linker</em> when a section is loaded.
The <em>dynamic linker</em> has to go through the section and substitute the placeholders with the then correct address.
For that the ELF file contains the <em>relocation sections</em>, of which there are two types:
<code class="language-plaintext highlighter-rouge">REL</code> (relocation without addend) and <code class="language-plaintext highlighter-rouge">RELA</code> (relocation with addend), which allows adding an additional constant.
Either one may be used per section, and each relocation table references the <em>symbol table</em> it uses.</p>
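<p>As an illustration, here is how a single <code class="language-plaintext highlighter-rouge">RELA</code> entry of type <code class="language-plaintext highlighter-rouge">R_X86_64_PC32</code> could be applied: the value written is “symbol address + addend - place”.
This is a sketch only; <code class="language-plaintext highlighter-rouge">apply_pc32()</code> and its parameters are hypothetical, and the overflow check a real linker performs is left out:</p>
<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code>#include &lt;assert.h&gt;
#include &lt;stdint.h&gt;
#include &lt;string.h&gt;   /* memcpy */

/*
 * Apply one PC-relative 32-bit relocation (S + A - P):
 *   section      - bytes of the section being patched
 *   section_addr - address at which the section gets loaded
 *   offset       - position of the placeholder within the section
 *   sym_addr     - resolved address of the referenced symbol (S)
 *   addend       - constant stored in the RELA entry (A)
 */
void apply_pc32(unsigned char *section, uint64_t section_addr,
                uint64_t offset, uint64_t sym_addr, int64_t addend)
{
    uint64_t place = section_addr + offset;     /* P */
    uint32_t value = (uint32_t)(sym_addr + (uint64_t)addend - place);

    /* The placeholder may be unaligned, so copy instead of casting. */
    memcpy(section + offset, &amp;value, sizeof(value));
}
</code></pre></div></div>
<p>With <code class="language-plaintext highlighter-rouge">REL</code> the addend would not come from the relocation entry but would have to be read from the placeholder itself before it is overwritten.</p>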

<p>Not all sections are loaded into memory, and some are freed again when they are no longer needed by the linker.
Only those sections which contain information that is necessary for runtime execution of the file are kept.
Multiple (similar) sections can be combined and are then called <em>segments</em>.
But that is only relevant for fully linked executables: Only they have a <em>program header</em>.</p>

<p>References to functions are then resolved by the linker and the place-holders get replaced by the real addresses.
This is where versioning kicks in.</p>

<h2 id="anatomy-of-a-linux-kernel-module">Anatomy of a Linux kernel module</h2>

<p>When you write and export a function in the Linux kernel or a module, the following happens:</p>
<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kt">void</span> <span class="nf">my_function</span><span class="p">(</span><span class="kt">void</span><span class="p">)</span> <span class="p">{</span>
    <span class="k">return</span><span class="p">;</span>
<span class="p">}</span>
</code></pre></div></div>
<ol>
  <li>The compiler/assembler puts the code into the <code class="language-plaintext highlighter-rouge">.text</code> section.</li>
  <li>The name of the function is added to the <code class="language-plaintext highlighter-rouge">.strtab</code> section.</li>
  <li>An entry is added to the <code class="language-plaintext highlighter-rouge">.symtab</code> section linking the offset within the <code class="language-plaintext highlighter-rouge">.text</code> section to the name via its offset in the <code class="language-plaintext highlighter-rouge">.strtab</code> section.</li>
</ol>

<p>Using <code class="language-plaintext highlighter-rouge">EXPORT_SYMBOL</code> adds more magic:</p>
<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="cp">#include</span> <span class="cpf">&lt;linux/module.h&gt;</span><span class="cp">
</span><span class="n">EXPORT_SYMBOL</span><span class="p">(</span><span class="n">my_function</span><span class="p">);</span>
</code></pre></div></div>
<ol>
  <li>It puts the name of the function into a section called <code class="language-plaintext highlighter-rouge">__ksymtab_strings</code>.
    <div class="language-console highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="gp"> $</span><span class="w"> </span><span class="nv">LC_ALL</span><span class="o">=</span>C readelf <span class="nt">--wide</span> <span class="nt">--string-dump</span><span class="o">=</span>__ksymtab_strings avm-modver.ko
<span class="go"> String dump of section '__ksymtab_strings':
   [     0]  my_function
</span></code></pre></div>    </div>
  </li>
  <li>It creates a new section called <code class="language-plaintext highlighter-rouge">__ksymtab+my_function</code> with a single <code class="language-plaintext highlighter-rouge">struct kernel_symbol</code> linking the address of the function to its name.
 Later on these sections will be collected by the linker script <code class="language-plaintext highlighter-rouge">scripts/module-common.lds</code> and will be put into the section called <code class="language-plaintext highlighter-rouge">__ksymtab</code>.
 Something similar happens for <code class="language-plaintext highlighter-rouge">EXPORT_SYMBOL_GPL</code> and <code class="language-plaintext highlighter-rouge">EXPORT_SYMBOL_GPL_FUTURE</code> and <code class="language-plaintext highlighter-rouge">EXPORT_SYMBOL_NS</code>, but with different prefixes.
    <div class="language-console highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="gp"> $</span><span class="w"> </span><span class="nv">LC_ALL</span><span class="o">=</span>C readelf <span class="nt">--wide</span> <span class="nt">--relocated-dump</span><span class="o">=</span>__ksymtab <span class="nt">--relocs</span> avm-modver.ko | <span class="nb">grep</span> <span class="nt">-A4</span> __ksymtab
<span class="go"> Relocation section '.rela__ksymtab' at offset 0x14fc8 contains 3 entries:
     Offset             Info             Type               Symbol's Value  Symbol's Name + Addend
 0000000000000000  0000002e00000002 R_X86_64_PC32          0000000000000010 my_function + 0
 0000000000000004  0000001b00000002 R_X86_64_PC32          0000000000000000 __kstrtab_my_function + 0
 0000000000000008  0000001c00000002 R_X86_64_PC32          000000000000000c __kstrtabns_my_function + 0
 --
 Hex dump of section '__ksymtab':
   0x00000000 10000000 fcffffff 04000000          ............
</span></code></pre></div>    </div>
  </li>
</ol>

<p>To see more details, use <code class="language-plaintext highlighter-rouge">make avm-modver.i</code> to run the pre-processor and to get the intermediate file, where all macros have been expanded.</p>

<p>With <code class="language-plaintext highlighter-rouge">CONFIG_MODVERSIONS</code> enabled even more magic happens.
If a module uses <code class="language-plaintext highlighter-rouge">EXPORT_SYMBOL</code>, then <code class="language-plaintext highlighter-rouge">genksyms</code> is called.
The source code of the module is pre-processed again via <code class="language-plaintext highlighter-rouge">cpp</code>, but with a different definition for <code class="language-plaintext highlighter-rouge">EXPORT_SYMBOL</code>.</p>
<ol>
  <li>For each function exported via <code class="language-plaintext highlighter-rouge">EXPORT_SYMBOL</code> a CRC for the function signature is computed by parsing the C function declaration.
 A new section called <code class="language-plaintext highlighter-rouge">___kcrctab+my_function</code> with a single <code class="language-plaintext highlighter-rouge">long</code> containing the CRC is created.
 Later on these sections will be collected by the linker script <code class="language-plaintext highlighter-rouge">scripts/module-common.lds</code> and will be put into the section called <code class="language-plaintext highlighter-rouge">__kcrctab</code>.
 Something similar happens for <code class="language-plaintext highlighter-rouge">EXPORT_SYMBOL_GPL</code>, <code class="language-plaintext highlighter-rouge">EXPORT_SYMBOL_GPL_FUTURE</code> and <code class="language-plaintext highlighter-rouge">EXPORT_SYMBOL_NS</code>, but with different prefixes.</li>
  <li>For each used symbol the CRC is looked up in the <code class="language-plaintext highlighter-rouge">Module.symvers</code> files.
 They are created as part of the kernel or any module compilation process when <code class="language-plaintext highlighter-rouge">CONFIG_MODVERSIONS</code> is enabled.
 The file collects the CRC and module path for each symbol.
 The symbol names and their CRCs are collected in a <code class="language-plaintext highlighter-rouge">const char __versions[]</code> array in section <code class="language-plaintext highlighter-rouge">__versions</code>.</li>
</ol>
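
<p>To compare such a CRC by hand, it can be looked up in <code class="language-plaintext highlighter-rouge">Module.symvers</code>.
The following is only a minimal sketch: it assumes the usual tab-separated layout with the CRC in the first and the symbol name in the second column, and <code class="language-plaintext highlighter-rouge">symvers_crc</code> is a helper name made up for this example.</p>
<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code>#!/bin/sh
# Hypothetical helper: print the CRC recorded for symbol $1 in file $2.
# Assumes column 1 holds the CRC and column 2 the symbol name.
symvers_crc() {
    awk -v sym="$1" '$2 == sym { print $1 }' "$2"
}

# Example: which CRC did the kernel build record for _printk?
if [ -r Module.symvers ]; then
    symvers_crc _printk Module.symvers
fi
</code></pre></div></div>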

<h2 id="module-loading">Module loading</h2>

<p>When a kernel module is loaded, the Linux kernel linker resolves all dynamic symbols of the module.
It looks up each unresolved symbol from <code class="language-plaintext highlighter-rouge">.symtab</code> and resolves it against the symbols of the kernel and of all modules loaded so far.
You can view them from user-space in <code class="language-plaintext highlighter-rouge">/proc/kallsyms</code>.</p>

<p>In addition to that simple lookup the loader also checks the module’s license from <code class="language-plaintext highlighter-rouge">.modinfo</code>:
Symbols exported via <code class="language-plaintext highlighter-rouge">EXPORT_SYMBOL_GPL</code> can only be resolved if the module declares <code class="language-plaintext highlighter-rouge">MODULE_LICENSE("GPL")</code> or another GPL-compatible license.</p>

<p>When <code class="language-plaintext highlighter-rouge">CONFIG_MODVERSIONS</code> is enabled, the linker inside the Linux kernel also checks the CRC:
For every undefined symbol there is a matching entry for it in section <code class="language-plaintext highlighter-rouge">__versions</code>, which contains the CRC of the symbol from compile time.</p>
<div class="language-console highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="gp">$</span><span class="w"> </span><span class="nv">LC_ALL</span><span class="o">=</span>C readelf <span class="nt">--wide</span> <span class="nt">-s</span> avm-modver.ko | <span class="nb">grep </span>UND
<span class="go">     0: 0000000000000000     0 NOTYPE  LOCAL  DEFAULT  UND
    42: 0000000000000000     0 NOTYPE  GLOBAL DEFAULT  UND __fentry__
    43: 0000000000000000     0 NOTYPE  GLOBAL DEFAULT  UND _printk
    44: 0000000000000000     0 NOTYPE  GLOBAL DEFAULT  UND __x86_return_thunk
</span></code></pre></div></div>

<p>But there are two different layouts used:</p>

<h3 id="upstream-linux-kernel">Upstream Linux Kernel</h3>

<div class="language-console highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="gp">$</span><span class="w"> </span><span class="nv">LC_ALL</span><span class="o">=</span>C readelf <span class="nt">--wide</span> <span class="nt">--hex-dump</span><span class="o">=</span>__versions avm-modver.ko
<span class="go">Hex dump of section '__versions':
  0x00000000 bb6dfbbd 00000000 5f5f6665 6e747279 .m......__fentry
  0x00000010 5f5f0000 00000000 00000000 00000000 __..............
  0x00000020 00000000 00000000 00000000 00000000 ................
  0x00000030 00000000 00000000 00000000 00000000 ................
  0x00000040 d87e9992 00000000 5f707269 6e746b00 .~......_printk.
  0x00000050 00000000 00000000 00000000 00000000 ................
  0x00000060 00000000 00000000 00000000 00000000 ................
  0x00000070 00000000 00000000 00000000 00000000 ................
  0x00000080 cb8119bf 00000000 6d6f6475 6c655f6c ........module_l
  0x00000090 61796f75 74000000 00000000 00000000 ayout...........
  0x000000a0 00000000 00000000 00000000 00000000 ................
  0x000000b0 00000000 00000000 00000000 00000000 ................
</span></code></pre></div></div>

<!-- ~/REPOS/LINUX/linux/scripts/mod/modpost.c -->
<p>The original Linux kernel uses <code class="language-plaintext highlighter-rouge">const struct modversion_info __version[]</code>.
The structure has a fixed size of 64 bytes:</p>
<ul>
  <li>the first 8 bytes contain the CRC.</li>
  <li>the remaining 56 bytes contain the symbol name.</li>
</ul>

<p>Longer symbol names are not supported and require the use of the <em>extended modversions</em>.</p>
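
<p>Reading the dump by hand is easier with two small helpers.
This is only a sketch for the fixed 64-byte layout on a little-endian machine such as x86_64; the function names <code class="language-plaintext highlighter-rouge">entry_count</code> and <code class="language-plaintext highlighter-rouge">le32_to_hex</code> are made up for this example.</p>
<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code>#!/bin/sh
# Number of fixed-size entries in a __versions section of $1 bytes.
entry_count() {
    echo $(( $1 / 64 ))
}

# readelf prints the raw bytes; the CRC is stored little-endian,
# so the four bytes of a dumped word have to be reversed.
le32_to_hex() {
    b1=$(printf '%s' "$1" | cut -c1-2)
    b2=$(printf '%s' "$1" | cut -c3-4)
    b3=$(printf '%s' "$1" | cut -c5-6)
    b4=$(printf '%s' "$1" | cut -c7-8)
    echo "0x$b4$b3$b2$b1"
}

entry_count 0xc0        # the dump above is 0xc0 bytes long
le32_to_hex bb6dfbbd    # first word of the __fentry__ entry
</code></pre></div></div>
<p>For the dump above this prints <code class="language-plaintext highlighter-rouge">3</code> and <code class="language-plaintext highlighter-rouge">0xbdfb6dbb</code>.</p>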

<h3 id="ubuntu-linux-kernel">Ubuntu Linux Kernel</h3>

<div class="language-console highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="gp">$</span><span class="w"> </span><span class="nv">LC_ALL</span><span class="o">=</span>C readelf <span class="nt">--wide</span> <span class="nt">--hex-dump</span><span class="o">=</span>__versions avm-modver.ko
<span class="go">Hex dump of section '__versions':
  0x00000000 14000000 bb6dfbbd 5f5f6665 6e747279 .....m..__fentry
  0x00000010 5f5f0000 10000000 7e3a2c12 5f707269 __......~:,._pri
  0x00000020 6e746b00 1c000000 ca39825b 5f5f7838 ntk......9.[__x8
  0x00000030 365f7265 7475726e 5f746875 6e6b0000 6_return_thunk..
  0x00000040 18000000 eb7b33e1 6d6f6475 6c655f6c .....{3.module_l
  0x00000050 61796f75 74000000 00000000 00000000 ayout...........
  0x00000060 00
</span></code></pre></div></div>

<!-- /usr/src/linux-headers-6.8.0-84-generic/scripts/mod/modpost.c -->
<p>Ubuntu has changed this and uses <code class="language-plaintext highlighter-rouge">const char ____versions[]</code>:</p>
<ul>
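  <li>judging from the dump above, the first 4 bytes contain the length of the entry, which is the offset to the next entry, and the next 4 bytes contain the CRC.</li>
  <li>next follows the symbol name with a terminating NUL byte.</li>
  <li>more NUL bytes follow for padding up to the next offset divisible by 4.</li>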
</ul>
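
<p>The dump can be cross-checked with a little arithmetic.
The sketch below assumes, based only on the dump above, that the 8-byte header is made up of a 4-byte length word (<code class="language-plaintext highlighter-rouge">0x14</code> for the first entry) and the 4-byte CRC, followed by the NUL-terminated name padded to a multiple of 4; <code class="language-plaintext highlighter-rouge">entry_len</code> is a made-up helper name.</p>
<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code>#!/bin/sh
# Size in bytes of one variable-length entry for symbol name $1:
# 8 bytes header, then the name plus its NUL byte, rounded up to 4.
entry_len() {
    name=$1
    echo $(( 8 + (${#name} + 1 + 3) / 4 * 4 ))
}

for sym in __fentry__ _printk __x86_return_thunk module_layout; do
    printf '%-20s 0x%02x\n' "$sym" "$(entry_len "$sym")"
done
</code></pre></div></div>
<p>The computed lengths <code class="language-plaintext highlighter-rouge">0x14</code>, <code class="language-plaintext highlighter-rouge">0x10</code>, <code class="language-plaintext highlighter-rouge">0x1c</code> and <code class="language-plaintext highlighter-rouge">0x18</code> match the first word of each entry in the dump.</p>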

<p>Ubuntu changed this to support longer symbol names, which Ubuntu claims is required for Rust support.
See <a href="https://lists.ubuntu.com/archives/kernel-team/2023-March/137814.html">modpost: support arbitrary symbol length in modversion</a> for details.
This has been reverted by <a href="https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2039010">2039010</a>.</p>

<p>…</p>

<h2 id="further-reading">Further reading</h2>
<ul>
  <li>Linux: <a href="https://docs.kernel.org/kbuild/modules.html#module-versioning">Building External Modules - Module Versioning</a></li>
  <li><abbr title="Linux Weekly News">LWN</abbr>: <a href="https://lwn.net/Articles/986892/">A new version of modversions</a></li>
  <li><a href="https://terenceli.github.io/技术/2018/06/02/linux-loadable-module">Anatomy of the Linux loadable kernel module</a></li>
  <li>Linux manual page: <a href="https://man7.org/linux/man-pages/man5/elf.5.html">Executable and Linking Format</a></li>
</ul>

<!-- *[FD]: File Daemon -->
<!-- *[FD]: File Descriptor -->
<!-- *[GPT]: Generative Pre-trained Transformer -->
<!-- *[GPT]: Global Partitioning Table -->
<!-- *[GPT]: GUID Partition Table -->]]></content><author><name>Philipp Hahn</name></author><category term="linux" /><summary type="html"><![CDATA[The Linux kernel itself and its modules may export symbols, so that other modules can import and use them. As the functions are written in C, it is important that the function signature matches: the number of arguments must match, the ordering of the arguments must match, and the data types must match, which includes the structure and layout of all input and output parameters. If any of them changes, the Application Binary Interface (ABI) changes and you risk crashing the kernel. If you’re lucky, recompiling the kernel and the modules is enough for both ends to pick up the new Application Programming Interface (API). To detect such breaking changes, the Linux kernel can be compiled with CONFIG_MODVERSIONS enabled: This calculates a Cyclic Redundancy Check (CRC) checksum over the function signature and embeds this information with the kernel and the modules. The dynamic linker of the Linux kernel checks that for each requested symbol its CRC matches the CRC of the Linux kernel or already loaded modules. A module is only loaded if a match is found for all symbols. Otherwise loading fails.]]></summary></entry><entry><title type="html">Shell-trivia #3: set -e</title><link href="https://blog.pmhahn.de/shell-trivia-3-set-e/" rel="alternate" type="text/html" title="Shell-trivia #3: set -e" /><published>2025-08-13T08:39:00+02:00</published><updated>2025-08-13T08:39:00+02:00</updated><id>https://blog.pmhahn.de/shell-trivia-3-set-e</id><content type="html" xml:base="https://blog.pmhahn.de/shell-trivia-3-set-e/"><![CDATA[<p>There have already been two blog posts, <a href="/shell-trivia-1-set-e/">Shell-trivia #1</a> and <a href="/shell-trivia-2-set-e/">Shell-trivia #2</a>, on the topic of <code class="language-plaintext highlighter-rouge">set -e</code>.
But this morning my colleague N. Schier surprised me with yet another shell absurdity:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c">#!/bin/sh</span>
<span class="nb">set</span> <span class="nt">-e</span>
<span class="nb">date</span> <span class="o">&amp;&amp;</span> <span class="nb">false</span> <span class="o">&amp;&amp;</span> <span class="nb">true
date</span>
</code></pre></div></div>

<p>How many times is <code class="language-plaintext highlighter-rouge">date</code> executed?</p>

<!--more-->

<p>As usual, you have to read the <a href="https://manpages.debian.org/stretch/bash/bash.1.en.html#Shell_Function_Definitions">bash manual page</a> very carefully:</p>

<blockquote>
  <p>The ERR trap is <strong>not</strong> executed if the failed command is … part of a command executed in a &amp;&amp; or || list except the command <strong>following the final</strong> &amp;&amp; or ||.</p>
</blockquote>

<p>So the correct answer is: 2</p>

<p>In <code class="language-plaintext highlighter-rouge">date &amp;&amp; false &amp;&amp; true</code>, execution of the list stops after <code class="language-plaintext highlighter-rouge">false</code> and the exit code is 1:
The following <code class="language-plaintext highlighter-rouge">&amp;&amp; …</code> is no longer executed.
But since this means that <code class="language-plaintext highlighter-rouge">false</code> is not the last command of the list, <code class="language-plaintext highlighter-rouge">set -e</code> does not abort and the second <code class="language-plaintext highlighter-rouge">date</code> is executed anyway.</p>
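
<p>The difference can be reproduced in isolation.
The following sketch runs a command list under <code class="language-plaintext highlighter-rouge">set -e</code> in a fresh <code class="language-plaintext highlighter-rouge">sh</code> and reports whether the command after the list is still reached; <code class="language-plaintext highlighter-rouge">survives_set_e</code> is a helper name made up for this example.</p>
<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code>#!/bin/sh
# Run the list "$1" with `set -e` active, followed by an `echo`.
# Prints "yes" if the echo is still reached, "no" otherwise.
survives_set_e() {
    if sh -ec "$1; echo reached" | grep -q reached; then
        echo yes
    else
        echo no
    fi
}

survives_set_e 'true &amp;&amp; false &amp;&amp; true'  # false is not the last command
survives_set_e 'true &amp;&amp; false'           # false follows the final &amp;&amp;
</code></pre></div></div>
<p>The first call prints <code class="language-plaintext highlighter-rouge">yes</code>, the second one <code class="language-plaintext highlighter-rouge">no</code>.</p>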

<p>Keep this in mind when <a href="https://www.shellcheck.net/">shellcheck</a> spits out the following warnings, which nudge you towards using more <code class="language-plaintext highlighter-rouge">&amp;&amp;</code>:</p>
<ul>
  <li><a href="https://www.shellcheck.net/wiki/SC2015">SC2015</a>: Note that <code class="language-plaintext highlighter-rouge">A &amp;&amp; B || C</code> is not if-then-else. <code class="language-plaintext highlighter-rouge">C</code> may run when <code class="language-plaintext highlighter-rouge">A</code> is true.</li>
  <li><a href="https://www.shellcheck.net/wiki/SC2166">SC2166</a>: Prefer <code class="language-plaintext highlighter-rouge">[ p ] &amp;&amp; [ q ]</code> as <code class="language-plaintext highlighter-rouge">[ p -a q ]</code> is not well-defined.</li>
</ul>

<p>This makes it easy to introduce subtle semantic differences.</p>

<p>Or, to put it in the words of G. Aschemann from 1995:</p>
<blockquote>
  <p>Every good shell script starts with #!/usr/bin/perl.</p>
</blockquote>

<p>Well, that was 30 years ago, and these days I would rather replace <code class="language-plaintext highlighter-rouge">perl</code> with <code class="language-plaintext highlighter-rouge">python</code> 😉</p>

<!-- *[FD]: File Daemon -->
<!-- *[FD]: File Descriptor -->
<!-- *[GPT]: Generative Pre-trained Transformer -->
<!-- *[GPT]: Global Partitioning Table -->
<!-- *[GPT]: GUID Partition Table -->]]></content><author><name>Philipp Hahn</name></author><category term="shell" /><summary type="html"><![CDATA[There have already been two blog posts, Shell-trivia #1 and Shell-trivia #2, on the topic of set -e. But this morning my colleague N. Schier surprised me with yet another shell absurdity: #!/bin/sh set -e date &amp;&amp; false &amp;&amp; true date How many times is date executed?]]></summary></entry><entry><title type="html">Debian 13 Trixie released</title><link href="https://blog.pmhahn.de/debian-13-trixie/" rel="alternate" type="text/html" title="Debian 13 Trixie released" /><published>2025-08-11T10:31:00+02:00</published><updated>2025-08-11T10:31:00+02:00</updated><id>https://blog.pmhahn.de/debian-13-trixie</id><content type="html" xml:base="https://blog.pmhahn.de/debian-13-trixie/"><![CDATA[<p>Last Saturday - 2025-08-09 - <a href="https://www.debian.org/News/2025/20250809">Debian 13 “Trixie”</a> has been released after 2 years of work. 🥳</p>

<p>I just updated my laptop and servers and stumbled upon some issues:</p>

<!--more-->

<h2 id="cyrus-imapd"><code class="language-plaintext highlighter-rouge">cyrus-imapd</code></h2>

<p>I’m running my own mail server infrastructure:
I have grown up with <em>Unix-to-Unix-Copy-Protocol</em> (UUCP) and was an admin for <em>UUCP Freunde Lahn e.V.</em> for a long time.
I’m still using <em>Postfix</em> with <em>Cyrus IMAPd</em> and never switched to <em>Dovecot</em>.</p>

<p>After the upgrade I noticed that no mails were delivered:
<code class="language-plaintext highlighter-rouge">mailq</code> showed a growing list of mails stuck in the queue.
Postfix was complaining that its <code class="language-plaintext highlighter-rouge">lmtp</code> service was no longer able to establish an encrypted connection to <code class="language-plaintext highlighter-rouge">lmtpd</code> from Cyrus.</p>

<p>For historic reasons my setup is using <code class="language-plaintext highlighter-rouge">STARTTLS</code>, which is now deprecated and has been disabled by default in Cyrus IMAPd.
You have to explicitly re-enable it in your <code class="language-plaintext highlighter-rouge">/etc/imapd.conf</code> by adding some lines:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>imap_allowstarttls: yes
lmtp_allowstarttls: yes
</code></pre></div></div>

<h2 id="saslauthd"><code class="language-plaintext highlighter-rouge">saslauthd</code></h2>

<p><code class="language-plaintext highlighter-rouge">saslauthd.service</code> failed to start as I had to move its UNIX socket to <code class="language-plaintext highlighter-rouge">/var/spool/postfix/var/run/saslauthd/</code>.
This also moves the location of the <abbr title="Process Identifier">PID</abbr> file to that directory, which then no longer matches the information in <code class="language-plaintext highlighter-rouge">/usr/lib/systemd/system/saslauthd.service</code>, which expects the file in <code class="language-plaintext highlighter-rouge">/var/run/saslauthd.pid</code>.</p>

<p>I fixed this by creating an override with <code class="language-plaintext highlighter-rouge">systemctl edit saslauthd.service</code>:</p>
<div class="language-ini highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nn">[Service]</span>
<span class="py">PIDFile</span><span class="p">=</span><span class="s">/var/spool/postfix/var/run/saslauthd/saslauthd.pid</span>
</code></pre></div></div>

<p>Previously I had a shell-hack in <code class="language-plaintext highlighter-rouge">/etc/default/saslauthd</code> to replace the old location with a symbolic link to the <code class="language-plaintext highlighter-rouge">chroot</code>-location.
This no longer works as that file is not sourced by <code class="language-plaintext highlighter-rouge">systemd</code>, which does not execute that shell code.
Therefore I had to tell Cyrus IMAPd to also use that changed location by putting this into <code class="language-plaintext highlighter-rouge">/etc/imapd.conf</code>:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>sasl_saslauthd_path: /var/spool/postfix/var/run/saslauthd/mux
</code></pre></div></div>

<p>PS: On a side note: <code class="language-plaintext highlighter-rouge">/var/run/</code> is deprecated and should be replaced by just <code class="language-plaintext highlighter-rouge">/run/</code>; <code class="language-plaintext highlighter-rouge">systemd</code> already complains about this every time it sees <code class="language-plaintext highlighter-rouge">/var/run/</code>.</p>

<h2 id="dockerio-and-libvirt"><code class="language-plaintext highlighter-rouge">docker.io</code> and <code class="language-plaintext highlighter-rouge">libvirt</code></h2>

<p>For some unknown reason <code class="language-plaintext highlighter-rouge">docker.io</code> and <code class="language-plaintext highlighter-rouge">libvirt</code> got removed during the upgrade.
Running <code class="language-plaintext highlighter-rouge">apt autopurge</code> afterwards was a very bad idea as that purged all images, volumes and containers. 🤦</p>

<p>I have to investigate why that happened. 🔍</p>

<h2 id="php-84">PHP-8.4</h2>

<p>Debian-13-Trixie has PHP-8.4, while Debian-12-Bookworm had PHP-8.2.
My local Nextcloud (and WordPress) setup was unhappy about that as it needs several <code class="language-plaintext highlighter-rouge">php8.4-…</code> packages.
Luckily, just installing the PHP 8.4 equivalents of the previously installed packages fixed this.</p>

<h2 id="kde"><abbr title="K Desktop Environment">KDE</abbr></h2>

<p>In the past I did not install <code class="language-plaintext highlighter-rouge">kde-full</code> as it depends on many optional packages like KMail, KOrganizer, DragonPlayer, and such.
I don’t use many of those and thus don’t want them to be installed.
During the upgrade <code class="language-plaintext highlighter-rouge">plasmashell</code> got removed so on the next login I did not get back a working <abbr title="K Desktop Environment">KDE</abbr> session.
Installing <code class="language-plaintext highlighter-rouge">kde-standard</code> fixed this.
As it only <code class="language-plaintext highlighter-rouge">Recommends</code> most other packages, I was able to get rid of those packages I do not want.</p>

<p>And I got Wayland, which has this annoying bug: Konsole no longer stores the open sessions and starts with only one shell in <code class="language-plaintext highlighter-rouge">$HOME</code>. 🤔</p>

<h2 id="out-of-space-usr">Out-of-space <code class="language-plaintext highlighter-rouge">/usr</code></h2>

<p>My desktop system has many packages.
Upgrading all those (<abbr title="K Desktop Environment">KDE</abbr>-)libraries required too much space on <code class="language-plaintext highlighter-rouge">/usr</code>.
<code class="language-plaintext highlighter-rouge">dpkg</code> failed to unpack a package during upgrade.</p>

<p>After some manual <code class="language-plaintext highlighter-rouge">dpkg --configure --pending</code>, <code class="language-plaintext highlighter-rouge">apt install --fix-broken</code>, <code class="language-plaintext highlighter-rouge">apt autopurge</code> and <code class="language-plaintext highlighter-rouge">dpkg -P</code> I was finally able to continue.
I would have expected <abbr title="Advanced Packaging Tool">APT</abbr> to check for enough disk space, but apparently it does not.
So double-check manually before doing an upgrade.</p>
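
<p>Such a manual check could look like the following sketch.
It only prints the free space; how much an upgrade really needs depends on the installed packages, and <code class="language-plaintext highlighter-rouge">avail_kib</code> is a helper name made up for this example.</p>
<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code>#!/bin/sh
# Print the free space of the file system containing "$1" in KiB.
# `df -P` selects the portable one-line-per-file-system output format.
avail_kib() {
    df -Pk "$1" | awk 'NR == 2 { print $4 }'
}

for dir in /usr /var/cache/apt/archives; do
    if [ -d "$dir" ]; then
        echo "$dir: $(avail_kib "$dir") KiB free"
    fi
done
</code></pre></div></div>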

<p>PS: Afterwards <code class="language-plaintext highlighter-rouge">systemd</code> complains about <code class="language-plaintext highlighter-rouge">usr-not-merged</code>, but that is <a href="https://www.debian.org/releases/trixie/release-notes/issues.html#systemd-message-system-is-tainted-unmerged-bin">normal and expected</a>.</p>

<h2 id="keepassxc">KeePassXC</h2>

<p>I used a self-compiled version of KeePassXC.
Debian now has two packages <code class="language-plaintext highlighter-rouge">keepassxc</code> and <code class="language-plaintext highlighter-rouge">keepassxc-full</code> – the latter has support for browser-integration and more.
As some files have been moved, the upgrade failed and I had to manually remove my self-compiled version.</p>

<h2 id="network">Network</h2>

<p>Running the upgrade while being logged into <abbr title="K Desktop Environment">KDE</abbr> is not a good idea:
During the upgrade NetworkManager got restarted and killed my local network connection.
Afterwards even <code class="language-plaintext highlighter-rouge">ping</code> no longer worked, as I already had the <a href="https://www.debian.org/releases/trixie/release-notes/issues.de.html#ping-no-longer-runs-with-elevated-privileges">new version</a> but was still running the old Linux kernel.</p>

<p>Sadly I still need my <code class="language-plaintext highlighter-rouge">r8168-dkms</code> and <code class="language-plaintext highlighter-rouge">v4l2loopback-dkms</code> packages.</p>

<h2 id="prometheus-mysqlmariadb-exporter">Prometheus MySQL/MariaDB exporter</h2>

<p><a href="https://github.com/prometheus/mysqld_exporter/releases/tag/v0.15.0">v0.15.0</a> has a breaking change, which is neither mentioned in any <code class="language-plaintext highlighter-rouge">NEWS</code> file nor the <a href="https://salsa.debian.org/go-team/packages/prometheus-mysqld-exporter/-/blob/debian/sid/debian/changelog?ref_type=heads">debian/changelog</a>.</p>
<ul>
  <li><code class="language-plaintext highlighter-rouge">DATA_SOURCE_NAME</code> is no longer supported and you must pass the credentials via <code class="language-plaintext highlighter-rouge">--mysqld.username=</code> and via <code class="language-plaintext highlighter-rouge">MYSQLD_EXPORTER_PASSWORD=</code>.</li>
  <li>You also <a href="https://github.com/prometheus/mysqld_exporter/issues/754">cannot specify the UNIX domain socket</a> <code class="language-plaintext highlighter-rouge">/run/mysqld/mysqld.sock</code></li>
</ul>

<p>I’m now using <code class="language-plaintext highlighter-rouge">--config.my-cnf /var/lib/prometheus/mysql.cnf</code> to configure the credentials via another file.</p>

<h2 id="mailman3">Mailman3</h2>

<h3 id="cron">cron</h3>

<p><code class="language-plaintext highlighter-rouge">mailman3-web</code> still runs a CRON job <strong>every minute</strong>, which imports <code class="language-plaintext highlighter-rouge">robot_detection</code>, which spams you with a ton of <code class="language-plaintext highlighter-rouge">SyntaxWarning</code>s.
See <a href="https://bugs.debian.org/1082541">mailman3-web#1082541</a> and <a href="https://bugs.debian.org/1078661">python3-robot-detection#1078661</a></p>

<p>Edit <code class="language-plaintext highlighter-rouge">/etc/cron.d/mailman3-web</code> and add <code class="language-plaintext highlighter-rouge">2&gt;/dev/null</code> to each command.</p>

<h3 id="authentication">authentication</h3>

<p>Authentication was now broken for me:
<code class="language-plaintext highlighter-rouge">/var/log/mailman3/web/mailman-web.log</code> complains about <code class="language-plaintext highlighter-rouge">Missing column 'socialaccount_socialapp.provider_id'</code>.
Run <code class="language-plaintext highlighter-rouge">/usr/bin/mailman-web migrate</code> as user <code class="language-plaintext highlighter-rouge">root</code> to fix this.</p>

<h3 id="template">template</h3>

<p><code class="language-plaintext highlighter-rouge">/var/log/mailman3/web/mailman-web.log</code> showed another error:</p>
<blockquote>
  <p>django.template.exceptions.TemplateSyntaxError: ‘humanize’ is not a registered tag library.</p>
</blockquote>

<p>Adding <code class="language-plaintext highlighter-rouge">django.contrib.humanize</code> to <code class="language-plaintext highlighter-rouge">INSTALLED_APPS</code> in <code class="language-plaintext highlighter-rouge">/etc/mailman3/mailman-web.py</code> fixes this.</p>

<!-- *[FD]: File Daemon -->
<!-- *[FD]: File Descriptor -->
<!-- *[GPT]: Generative Pre-trained Transformer -->
<!-- *[GPT]: Global Partitioning Table -->
<!-- *[GPT]: GUID Partition Table -->]]></content><author><name>Philipp Hahn</name></author><category term="debian" /><summary type="html"><![CDATA[Last Saturday - 2025-08-09 - Debian 13 “Trixie” has been released after 2 years of work. 🥳]]></summary></entry><entry><title type="html">shell `trap` signal</title><link href="https://blog.pmhahn.de/shell-trap-signal/" rel="alternate" type="text/html" title="shell `trap` signal" /><published>2025-06-30T07:47:00+02:00</published><updated>2025-06-30T07:47:00+02:00</updated><id>https://blog.pmhahn.de/shell-trap-signal</id><content type="html" xml:base="https://blog.pmhahn.de/shell-trap-signal/"><![CDATA[<p>What’s wrong with signal handling like this:</p>
<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c">#!/bin/sh</span>
<span class="nb">trap</span> <span class="s1">'echo Cleanup…'</span> EXIT HUP INT TERM
...
</code></pre></div></div>

<!--more-->

<h2 id="exit-and-signals">Exit and signals</h2>

<p>Before we begin:
Actually <em>exit codes</em> are <strong>mutually exclusive</strong> with <em>signal statuses</em>:
A process may either exit normally using <code class="language-plaintext highlighter-rouge">exit</code> or terminate via a signal.</p>

<p>If you read <a href="man:bash(1)">man:bash</a> you will find this:</p>
<blockquote>
  <p>The return value of a simple command is its exit status, or 128+n if the command is terminated by signal n.</p>
</blockquote>

<p>That might give you the idea, that they are the same, but that is only a (broken) shell convention to map <em>signal statuses</em> to <em>exit codes</em>.
Reading <a href="man:exit(2)">man:exit</a> you see this:</p>
<blockquote>
  <p>The value status &amp; 0xFF is returned to the parent process as the process’s exit status,</p>
</blockquote>

<p>So there are 256 exit codes from 0 to 255, which a process can use to exit.</p>
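
<p>The masking is easy to observe.
A small sketch, assuming <code class="language-plaintext highlighter-rouge">sh</code> is bash or dash (POSIX leaves <code class="language-plaintext highlighter-rouge">exit</code> with values above 255 unspecified, so other shells may behave differently); <code class="language-plaintext highlighter-rouge">exit_code_of</code> is a made-up helper name.</p>
<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code>#!/bin/sh
# Let a child shell exit with value $1 and print what the parent sees.
exit_code_of() {
    rc=0
    sh -c "exit $1" || rc=$?
    echo "$rc"
}

exit_code_of 255   # largest value that survives unchanged
exit_code_of 256   # only the lowest 8 bits are kept
exit_code_of 300   # only the lowest 8 bits are kept
</code></pre></div></div>
<p>With bash or dash this prints <code class="language-plaintext highlighter-rouge">255</code>, <code class="language-plaintext highlighter-rouge">0</code> and <code class="language-plaintext highlighter-rouge">44</code>.</p>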

<p>The parent process then uses <a href="man:waitpid(2)">waitpid()</a> to wait for the child’s <em>state change</em>:</p>
<blockquote>
  <p>That may be the process exited by calling <code class="language-plaintext highlighter-rouge">exit()</code> itself or caught a <code class="language-plaintext highlighter-rouge">signal()</code>, which might have <code class="language-plaintext highlighter-rouge">kill()</code>ed the process or just suspended it.</p>
</blockquote>

<p>You then first have to use <code class="language-plaintext highlighter-rouge">WIFEXITED()</code> or <code class="language-plaintext highlighter-rouge">WIFSIGNALED()</code> to check whether the child exited normally via <code class="language-plaintext highlighter-rouge">exit()</code> or was terminated by a signal.
Only after that you should either use <code class="language-plaintext highlighter-rouge">WEXITSTATUS()</code> to extract the byte containing the <em>exit code</em> or use <code class="language-plaintext highlighter-rouge">WTERMSIG()</code> to extract the <em>signal number</em>.</p>

<p>In a shell script you do not have access to these low-level C functions, but only get the mangled exit status.
You cannot distinguish whether the called process did <code class="language-plaintext highlighter-rouge">exit(130)</code> itself or was terminated by the user pressing <em>Ctrl-C</em> to send <code class="language-plaintext highlighter-rouge">SIGINT</code> to it.</p>
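
<p>The ambiguity can be demonstrated with two child shells: one exits with code 130 on its own, the other one kills itself with <code class="language-plaintext highlighter-rouge">SIGINT</code>.
This is a minimal sketch for a non-interactive script, assuming <code class="language-plaintext highlighter-rouge">SIGINT</code> is not ignored; <code class="language-plaintext highlighter-rouge">status_of</code> is a helper name made up for this example.</p>
<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code>#!/bin/sh
# Run "$1" in a child shell and print the status the parent shell sees.
status_of() {
    rc=0
    sh -c "$1" || rc=$?
    echo "$rc"
}

status_of 'exit 130'      # normal exit with code 130
status_of 'kill -INT $$'  # terminated by SIGINT (signal 2): 128+2
</code></pre></div></div>
<p>Both calls print <code class="language-plaintext highlighter-rouge">130</code>, so the caller cannot tell them apart.</p>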

<h2 id="signals-and-exit-trap">Signals and EXIT trap</h2>

<p>Here’s a short overview of commonly used signals and traps.</p>

<table>
  <thead>
    <tr>
      <th>signal</th>
      <th style="text-align: right">number</th>
      <th>trigger</th>
      <th>when</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>EXIT</td>
      <td style="text-align: right">“0”</td>
      <td><code class="language-plaintext highlighter-rouge">exit</code></td>
      <td>shell process exits</td>
    </tr>
    <tr>
      <td>SIGHUP</td>
      <td style="text-align: right">1</td>
      <td> </td>
      <td>login <abbr title="Tele Type Writer">TTY</abbr> closed</td>
    </tr>
    <tr>
      <td>SIGINT</td>
      <td style="text-align: right">2</td>
      <td>Ctrl-C</td>
      <td>user aborts process</td>
    </tr>
    <tr>
      <td>SIGQUIT</td>
      <td style="text-align: right">3</td>
      <td>Ctrl-\</td>
      <td>user aborts process</td>
    </tr>
    <tr>
      <td>SIGTERM</td>
      <td style="text-align: right">15</td>
      <td> </td>
      <td><code class="language-plaintext highlighter-rouge">kill $PID</code></td>
    </tr>
  </tbody>
</table>

<p>Please note that shells misuse signal <code class="language-plaintext highlighter-rouge">0</code> here:
By default there is no signal numbered <code class="language-plaintext highlighter-rouge">0</code>.
Actually it is a no-operation and can be used to check, if <em>process A can send signals to process B</em> or if <em>process B is still alive</em>.
<code class="language-plaintext highlighter-rouge">bash</code> and other shells re-use that number to give their <code class="language-plaintext highlighter-rouge">EXIT</code> handler a number, which is supposed to be called on <em>any exit from shell</em>.
But that behaviour is very implementation dependent as you will see later on.</p>
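
<p>The no-operation use of signal <code class="language-plaintext highlighter-rouge">0</code> looks like this in practice.
A minimal sketch; <code class="language-plaintext highlighter-rouge">can_signal</code> is a helper name made up for this example, and it cannot tell a dead process from one that belongs to another user.</p>
<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code>#!/bin/sh
# Signal 0 delivers nothing; kill only reports whether it could have.
can_signal() {
    if kill -0 "$1" 2&gt;/dev/null; then
        echo yes
    else
        echo no
    fi
}

can_signal "$$"               # the running shell itself
gone=$(sh -c 'echo $$')       # PID of a child that has already exited
can_signal "$gone"
</code></pre></div></div>
<p>Unless the PID has been reused in the meantime, this prints <code class="language-plaintext highlighter-rouge">yes</code> followed by <code class="language-plaintext highlighter-rouge">no</code>.</p>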

<h2 id="implementation-specific-handling-of-exit">Implementation specific handling of EXIT</h2>

<p>Let’s try this with the more informative shell script <code class="language-plaintext highlighter-rouge">trap.sh</code>:</p>
<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c">#!/bin/bash</span>
cleanup <span class="o">()</span> <span class="o">{</span>
    <span class="nb">local </span><span class="nv">rv</span><span class="o">=</span><span class="nv">$?</span> <span class="nv">sig</span><span class="o">=</span><span class="k">${</span><span class="nv">1</span><span class="k">:-</span><span class="nv">0</span><span class="k">}</span>
    <span class="nb">echo</span> <span class="s2">"Process </span><span class="nv">$$</span><span class="s2"> received signal </span><span class="nv">$sig</span><span class="s2"> after rv=</span><span class="nv">$rv</span><span class="s2">"</span>
    <span class="k">case</span> <span class="s2">"</span><span class="nv">$sig</span><span class="s2">"</span> <span class="k">in
    </span>0|<span class="s1">''</span><span class="p">)</span> <span class="nb">exit</span> <span class="s2">"</span><span class="nv">$rv</span><span class="s2">"</span><span class="p">;;</span>
    <span class="k">*</span><span class="p">)</span> <span class="nb">trap</span> - <span class="s2">"</span><span class="nv">$sig</span><span class="s2">"</span><span class="p">;</span> <span class="nb">kill</span> <span class="s2">"-</span><span class="nv">$sig</span><span class="s2">"</span> <span class="s2">"</span><span class="nv">$$</span><span class="s2">"</span><span class="p">;;</span>
    <span class="k">esac</span>
<span class="o">}</span>
<span class="nb">trap</span> <span class="s1">'cleanup 0'</span> EXIT
<span class="nb">trap</span> <span class="s1">'cleanup 1'</span> HUP
<span class="nb">trap</span> <span class="s1">'cleanup 2'</span> INT
<span class="nb">trap</span> <span class="s1">'cleanup 3'</span> QUIT
<span class="nb">trap</span> <span class="s1">'cleanup 15'</span> TERM

<span class="o">[</span> <span class="nt">-n</span> <span class="s2">"</span><span class="k">${</span><span class="nv">1</span><span class="k">:-}</span><span class="s2">"</span> <span class="o">]</span> <span class="o">&amp;&amp;</span> <span class="nb">kill</span> <span class="s2">"-</span><span class="nv">$1</span><span class="s2">"</span> <span class="s2">"</span><span class="nv">$$</span><span class="s2">"</span>
</code></pre></div></div>

<h3 id="bash">bash</h3>
<div class="language-console highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="gp">$</span><span class="w"> </span>bash ./trap.sh 0  <span class="c"># EXIT</span>
<span class="go">Process 499218 received signal 0 after rv=0
</span><span class="gp">$</span><span class="w"> </span>bash ./trap.sh 1  <span class="c"># SIGHUP</span>
<span class="go">Process 499237 received signal 1 after rv=0
Process 499237 received signal 0 after rv=0
Hangup
</span><span class="gp">$</span><span class="w"> </span>bash ./trap.sh 2  <span class="c"># SIGINT</span>
<span class="go">Process 499256 received signal 2 after rv=0
Process 499256 received signal 0 after rv=0

</span><span class="gp">$</span><span class="w"> </span>bash ./trap.sh 3  <span class="c"># SIGQUIT</span>
<span class="go">Process 499275 received signal 3 after rv=0
Process 499275 received signal 0 after rv=0
</span><span class="gp">$</span><span class="w"> </span>bash ./trap.sh 15  <span class="c"># SIGTERM</span>
<span class="go">Process 499294 received signal 15 after rv=0
Process 499294 received signal 0 after rv=0
Terminated
</span></code></pre></div></div>

<p>As you can see <a href="https://www.gnu.org/software/bash/"><code class="language-plaintext highlighter-rouge">bash</code></a> <strong>always</strong> calls the trap handler for <code class="language-plaintext highlighter-rouge">EXIT</code>!</p>

<h3 id="dash">dash</h3>
<p>Let’s repeat this with <a href="http://gondor.apana.org.au/~herbert/dash/"><code class="language-plaintext highlighter-rouge">dash</code></a>:</p>
<div class="language-console highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="gp">$</span><span class="w"> </span>dash ./trap.sh 0  <span class="c"># EXIT</span>
<span class="go">Process 502873 received signal 0 after rv=1
</span><span class="gp">$</span><span class="w"> </span>dash ./trap.sh 1  <span class="c"># SIGHUP</span>
<span class="go">Process 501892 received signal 1 after rv=0
Hangup
</span><span class="gp">$</span><span class="w"> </span>dash ./trap.sh 2  <span class="c"># SIGINT</span>
<span class="go">Process 501912 received signal 2 after rv=0

</span><span class="gp">$</span><span class="w"> </span>dash ./trap.sh 3  <span class="c"># SIGQUIT</span>
<span class="go">Process 501929 received signal 3 after rv=0
Quit (core dumped)
</span><span class="gp">$</span><span class="w"> </span>dash ./trap.sh 15  <span class="c"># SIGTERM</span>
<span class="go">Process 501971 received signal 15 after rv=0
Terminated
</span></code></pre></div></div>

<h3 id="busybox">busybox</h3>
<p>And once more with <a href="https://busybox.net/"><code class="language-plaintext highlighter-rouge">busybox</code></a>:</p>
<div class="language-console highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="gp">$</span><span class="w"> </span>busybox sh ./trap.sh 0  <span class="c"># EXIT</span>
<span class="go">Process 502338 received signal 0 after rv=0
</span><span class="gp">$</span><span class="w"> </span>busybox sh ./trap.sh 1  <span class="c"># SIGHUP</span>
<span class="go">Process 502366 received signal 1 after rv=0
Hangup
</span><span class="gp">$</span><span class="w"> </span>busybox sh ./trap.sh 2  <span class="c"># SIGINT</span>
<span class="go">Process 502402 received signal 2 after rv=0

</span><span class="gp">$</span><span class="w"> </span>busybox sh ./trap.sh 3  <span class="c"># SIGQUIT</span>
<span class="go">Process 502439 received signal 3 after rv=0
Process 502439 received signal 0 after rv=0
</span><span class="gp">$</span><span class="w"> </span>busybox sh ./trap.sh 15  <span class="c"># SIGTERM</span>
<span class="go">Process 502269 received signal 15 after rv=0
Terminated
</span></code></pre></div></div>

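<p>With <code class="language-plaintext highlighter-rouge">dash</code> and <code class="language-plaintext highlighter-rouge">busybox</code>, <code class="language-plaintext highlighter-rouge">EXIT</code> is <em>almost</em> never called after a signal; the only exception is <code class="language-plaintext highlighter-rouge">busybox</code> on <code class="language-plaintext highlighter-rouge">SIGQUIT</code>.</p>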

<p>That is why <strong>portable</strong> shell scripts set up <code class="language-plaintext highlighter-rouge">trap</code> not only for <code class="language-plaintext highlighter-rouge">EXIT</code>, but also for other <code class="language-plaintext highlighter-rouge">SIG</code>nals.</p>

<p>But if you do that, please make sure to do it right:</p>
<ol>
  <li>Reset the <code class="language-plaintext highlighter-rouge">trap</code> handler to its default.</li>
  <li>Afterwards kill the process by re-sending the received signal to the process again.</li>
</ol>

<h2 id="why-proper-trap-handling-is-important">Why proper trap handling is important</h2>

<p>Viacheslav Biriukov wrote a great blog post about <a href="https://biriukov.dev/docs/fd-pipe-session-terminal/3-process-groups-jobs-and-sessions/">Process groups, jobs and sessions</a> explaining why proper exiting is important.
A program might set up a signal handler for <code class="language-plaintext highlighter-rouge">SIGINT</code> to prevent the program from just terminating, which might lose important data.
It might ask the user if terminating is okay or if the data should be saved first before quitting.
A surrounding shell script must then decide whether this is an <em>abnormal exit</em> and it should terminate itself <strong>afterwards</strong>, or whether it should continue normally.
The UNIX convention is to transfer that detail via <em>exit codes</em> and <em>signal statuses</em>.
So be careful and do it right if your shell script starts using <code class="language-plaintext highlighter-rouge">trap</code>.</p>
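
<p>A minimal sketch of how a surrounding script can read that detail, assuming the common convention that the shell reports a child killed by signal <em>N</em> as exit status <em>128 + N</em>.
The helper <code class="language-plaintext highlighter-rouge">describe_status</code> is hypothetical and only for illustration:</p>
<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code>#!/bin/sh
# describe_status RV: hypothetical helper decoding $? as the shell reports it.
describe_status () {
    if [ "$1" -gt 128 ]; then
        # "kill -l" translates such an exit status back into a signal name.
        echo "killed by SIG$(kill -l "$1")"
    else
        echo "exited with status $1"
    fi
}

sh -c 'exit 3'
describe_status "$?"

# Stand-in for a program which dies from SIGINT.
sh -c 'kill -INT $$'
describe_status "$?"
</code></pre></div></div>
<p>The decoding is ambiguous: a program calling <code class="language-plaintext highlighter-rouge">exit 130</code> looks the same to the shell as one killed by <code class="language-plaintext highlighter-rouge">SIGINT</code>.
That is one more reason to re-raise the signal instead of emulating its exit code.</p>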

<h2 id="conclusion">Conclusion</h2>

<ol>
  <li>Use <code class="language-plaintext highlighter-rouge">bash</code> as it has consistent handling of <code class="language-plaintext highlighter-rouge">trap EXIT</code>.</li>
  <li>If you want to or must use other shells: Do not use the same <code class="language-plaintext highlighter-rouge">cleanup</code> trap for <code class="language-plaintext highlighter-rouge">EXIT</code> and other signals.</li>
  <li>If you trap signals, make sure to reset the handler and to re-raise the signal to properly propagate it.</li>
</ol>
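
<p>These points can be sketched as follows.
This is only one possible layout, not a definitive implementation: the names <code class="language-plaintext highlighter-rouge">cleanup</code>, <code class="language-plaintext highlighter-rouge">on_signal</code> and <code class="language-plaintext highlighter-rouge">cleaned</code> are made up, and the guard variable is an assumption to cope with shells that run <code class="language-plaintext highlighter-rouge">EXIT</code> after a signal handler.</p>
<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code>#!/bin/sh
tmp=$(mktemp -d)
cleaned=

cleanup () {
    # Run the body only once, even if EXIT fires after a signal handler.
    if [ -n "$cleaned" ]; then return; fi
    cleaned=1
    rm -r "$tmp"
}

on_signal () {
    cleanup
    trap - "$1"         # 1. reset the handler to its default
    kill -s "$1" "$$"   # 2. re-raise, so the parent sees the real cause
}

# Separate handlers for EXIT and for signals.
trap cleanup EXIT
trap 'on_signal HUP' HUP
trap 'on_signal INT' INT
trap 'on_signal TERM' TERM
</code></pre></div></div>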

<!-- *[FD]: File Daemon -->
<!-- *[FD]: File Descriptor -->
<!-- *[GPT]: Generative Pre-trained Transformer -->
<!-- *[GPT]: Global Partitioning Table -->
<!-- *[GPT]: GUID Partition Table -->]]></content><author><name>Philipp Hahn</name></author><category term="shell" /><summary type="html"><![CDATA[What’s wrong with signal handling like this: #!/bin/sh trap 'echo Cleanup…' EXIT HUP INT TERM ...]]></summary></entry><entry><title type="html">shell `trap` and proper quoting</title><link href="https://blog.pmhahn.de/shell-trap-quote/" rel="alternate" type="text/html" title="shell `trap` and proper quoting" /><published>2025-06-28T07:45:00+02:00</published><updated>2025-06-28T07:45:00+02:00</updated><id>https://blog.pmhahn.de/shell-trap-quote</id><content type="html" xml:base="https://blog.pmhahn.de/shell-trap-quote/"><![CDATA[<p>What’s wrong with</p>
<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c">#!/bin/bash</span>
<span class="nv">TMPDIR</span><span class="o">=</span><span class="si">$(</span><span class="nb">mktemp</span> <span class="nt">-d</span><span class="si">)</span>
<span class="nb">trap</span> <span class="s1">'rm -r $TMPDIR'</span> EXIT
...
</code></pre></div></div>

<!--more-->

<p>Let’s ask <a href="https://www.shellcheck.net/">shellcheck</a>:</p>
<blockquote>
  <p>No issues detected!</p>
</blockquote>

<p>Actually there are multiple issues:</p>

<h2 id="tmpdir">TMPDIR</h2>

<p>Please do not assign to <code class="language-plaintext highlighter-rouge">TMPDIR</code> as that variable is an <em>input parameter</em> to <code class="language-plaintext highlighter-rouge">mktemp</code> itself:
When you read <a href="man:mktemp(1)">man:mktemp</a> you will find this for option <code class="language-plaintext highlighter-rouge">-p</code>:</p>
<blockquote>
  <p>if DIR is not specified, use $TMPDIR if set, else /tmp.</p>
</blockquote>

<p>The variable is used for example by <a href="man:pam_tmpdir.8">pam_tmpdir</a> to set up <em>per user temporary directories</em> to improve security on multi-user systems.
By using <code class="language-plaintext highlighter-rouge">TMPDIR</code> inside your script to store the path of your <strong>specific</strong> temporary directory, you risk changing the behavior of child processes that also use <code class="language-plaintext highlighter-rouge">mktemp</code>.
Other <a href="https://docs.python.org/3/library/tempfile.html#tempfile.mkstemp">equivalent implementations thereof</a> also use <code class="language-plaintext highlighter-rouge">TEMP</code> and <code class="language-plaintext highlighter-rouge">TMP</code>, so better do not use these either.</p>

<p>So let’s use <code class="language-plaintext highlighter-rouge">tmp</code>:</p>
<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nv">tmp</span><span class="o">=</span><span class="si">$(</span><span class="nb">mktemp</span> <span class="nt">-d</span><span class="si">)</span>
<span class="nb">trap</span> <span class="s1">'rm -r $tmp'</span> EXIT
</code></pre></div></div>

<h2 id="ifs">IFS</h2>

<p>By default <a href="man:mktemp(1)">man:mktemp</a> will only create <em>safe</em> file names, e.g. none containing blanks and characters of <code class="language-plaintext highlighter-rouge">$IFS</code>.
Remember that <code class="language-plaintext highlighter-rouge">$IFS</code> is used by the shell to split every argument — which is not quoted — into multiple arguments.
By default it is set to <em>space</em>, <em>tab</em> and <em>newline</em>.
But you can redefine or extend it, after which <em>hell breaks loose</em>:</p>
<div class="language-console highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="gp">$</span><span class="w"> </span>bash <span class="nt">-c</span> <span class="s1">'IFS="$IFS/."; . my-trap-script'</span>
<span class="go">rm: cannot remove '': No such file or directory
rm: cannot remove 'tmp': No such file or directory
rm: cannot remove 'user': No such file or directory
rm: cannot remove '1000': No such file or directory
rm: cannot remove 'tmp': No such file or directory
rm: cannot remove 'XyJlR6AHpn': No such file or directory
</span></code></pre></div></div>

<p>Luckily <strong><code class="language-plaintext highlighter-rouge">$IFS</code> is re-set for each shell</strong> to its default value, but do keep that in mind when you fiddle with <code class="language-plaintext highlighter-rouge">$IFS</code>.
My advice is to do that only in functions and to use <code class="language-plaintext highlighter-rouge">local IFS</code> there, so that the change is confined to the function.</p>
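
<p>A minimal sketch of that advice, assuming a shell that supports <code class="language-plaintext highlighter-rouge">local</code> (it is not part of POSIX, but <code class="language-plaintext highlighter-rouge">bash</code>, <code class="language-plaintext highlighter-rouge">dash</code> and <code class="language-plaintext highlighter-rouge">busybox sh</code> provide it).
The helper <code class="language-plaintext highlighter-rouge">split_path</code> is hypothetical:</p>
<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code>#!/bin/bash
# split_path PATH: hypothetical helper printing each path component on its own line.
split_path () {
    local IFS=/     # the change is confined to this function
    # Unquoted on purpose: split "$1" on "/".
    # Beware: pathname expansion applies as well, so do not feed it globs.
    # shellcheck disable=SC2086
    set -- $1
    printf '%s\n' "$@"
}

split_path usr/local/bin
# Back here $IFS still has the value it had before the call.
</code></pre></div></div>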

<h2 id="quoting">quoting</h2>

<p>To prevent <code class="language-plaintext highlighter-rouge">$IFS</code>-splitting you have to quote arguments.
So let’s try with this:</p>
<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nv">tmp</span><span class="o">=</span><span class="si">$(</span><span class="nb">mktemp</span> <span class="nt">-d</span><span class="si">)</span>
<span class="nb">trap</span> <span class="s1">'rm -r "$tmp"'</span> EXIT
</code></pre></div></div>

<p>You may wonder why I didn’t quote <code class="language-plaintext highlighter-rouge">tmp=$(…)</code>, as the splitting occurs after <em>command substitution</em>.
For that you have to read <a href="man:bash(1)">man:bash</a> very carefully.
In section <em>parameters</em> you have this:</p>
<blockquote>
  <p>All values undergo tilde expansion, parameter and variable expansion, command substitution, arithmetic expansion, and quote removal.</p>
</blockquote>

<p>Compare that to section <em>expansion</em>:</p>
<blockquote>
  <p>There are seven kinds of expansion performed: brace expansion, tilde expansion, parameter and variable expansion, command substitution, arithmetic expansion, <strong>word splitting</strong>, and pathname expansion.</p>
</blockquote>

<p>The important difference here is that parameter assignment expects a <em>single argument</em>, so <em>word splitting</em> does <strong>not</strong> occur there.
So no quoting is needed for parameter assignments, but you can do it for consistency; it does not hurt.</p>
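
<p>A short illustration of that difference, with made-up variable names:</p>
<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code>#!/bin/sh
value="a  b   c"
copy=$value                       # unquoted, yet the blanks survive
stamp=$(printf '%s' "$value")     # the same holds for command substitution
printf '[%s]\n' "$copy" "$stamp"  # prints the original value twice

# As a command argument the unquoted value is split into a, b and c.
# shellcheck disable=SC2086
printf '[%s]\n' $value
</code></pre></div></div>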

<h2 id="late-vs-early-evaluation">late vs. early evaluation</h2>

<p>While <code class="language-plaintext highlighter-rouge">shellcheck</code> is happy, there is a lingering problem:
The <code class="language-plaintext highlighter-rouge">trap</code> is executed only later on when the shell exits.
<code class="language-plaintext highlighter-rouge">$tmp</code> might get changed (by accident) or be used for something else.
In that case the <code class="language-plaintext highlighter-rouge">rm</code> will delete whatever file <code class="language-plaintext highlighter-rouge">$tmp</code> points to.</p>

<p>That is because the <em>outer quotes</em> are <em>single quotes</em> while the <em>inner quotes</em> are <em>double quotes</em>:
<em>single quotes</em> prevent evaluation of the command when the <code class="language-plaintext highlighter-rouge">trap</code> statement is executed.
Later on when the trap is executed, the command is evaluated a second time.
That is when the <em>double quotes</em> prevent <code class="language-plaintext highlighter-rouge">$tmp</code> from being split on <code class="language-plaintext highlighter-rouge">$IFS</code>.</p>

<p>So let’s look at the following variant:</p>
<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nv">tmp</span><span class="o">=</span><span class="si">$(</span><span class="nb">mktemp</span> <span class="nt">-d</span><span class="si">)</span>
<span class="nb">trap</span> <span class="s2">"rm -r </span><span class="nv">$tmp</span><span class="s2">"</span> EXIT
tmp+<span class="o">=</span><span class="s2">"/subdir"</span>
</code></pre></div></div>

<p><em>Double quotes</em> are now used when the trap is set up:
<code class="language-plaintext highlighter-rouge">$tmp</code> gets inserted here as it is currently defined.
If <code class="language-plaintext highlighter-rouge">$tmp</code> is changed later on, we still delete the temporary directory we just created.</p>

<p>But <code class="language-plaintext highlighter-rouge">shellcheck</code> is unhappy now:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>trap "rm -r $tmp" EXIT
            ^-- SC2064 (warning): Use single quotes, otherwise this expands now rather than when signalled.
</code></pre></div></div>
<p>Personally I think <a href="https://www.shellcheck.net/wiki/SC2064">SC2064</a> is bad advice here as we want to evaluate “$tmp” now and not later.
I want to delete the file <code class="language-plaintext highlighter-rouge">$tmp</code> is pointing to right now, not where it might point to in the future.
I’m not alone with that opinion and <a href="https://github.com/koalaman/shellcheck/issues/1945">issue 1945</a> calls SC2064 questionable.</p>

<p>But there is an even bigger problem:
What will happen when the trap fires?</p>

<h2 id="late-quoting">late quoting</h2>

<p>Remember that <code class="language-plaintext highlighter-rouge">$tmp</code> might contain <code class="language-plaintext highlighter-rouge">$IFS</code> characters!
For example I can set <code class="language-plaintext highlighter-rouge">TMPDIR=/tmp/I like blanks</code>.
The trap command will be <code class="language-plaintext highlighter-rouge">rm -r /tmp/I like blanks</code>.
It will fail as there are no files <code class="language-plaintext highlighter-rouge">/tmp/I</code>, <code class="language-plaintext highlighter-rouge">./like</code> and <code class="language-plaintext highlighter-rouge">./blanks</code>, hopefully.</p>

<p>So how do we fix that?
I give you two variants:</p>
<ol>
  <li>Nested double quoting using backslash-escaping:
    <div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nv">tmp</span><span class="o">=</span><span class="si">$(</span><span class="nb">mktemp</span> <span class="nt">-d</span><span class="si">)</span>
<span class="nb">trap</span> <span class="s2">"rm -r </span><span class="se">\"</span><span class="nv">$tmp</span><span class="se">\"</span><span class="s2">"</span> EXIT
</code></pre></div>    </div>
  </li>
  <li>Single quotes inside double quotes:
    <div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nv">tmp</span><span class="o">=</span><span class="si">$(</span><span class="nb">mktemp</span> <span class="nt">-d</span><span class="si">)</span>
<span class="nb">trap</span> <span class="s2">"rm -r '</span><span class="nv">$tmp</span><span class="s2">'"</span> EXIT
</code></pre></div>    </div>
  </li>
</ol>

<p>Which one is correct?</p>

<p>The answer is very disappointing:
None!</p>

<p>Variant 1 will fail for <code class="language-plaintext highlighter-rouge">TMPDIR=/tmp/\"</code> and variant 2 will fail for <code class="language-plaintext highlighter-rouge">TMPDIR=/tmp/\'</code>.
<code class="language-plaintext highlighter-rouge">$tmp</code> will then be a path containing a <em>double quote</em> in variant 1 and a <em>single quote</em> in variant 2.
Because of the <em>early evaluation</em> <code class="language-plaintext highlighter-rouge">$tmp</code> is inserted as-is during the first evaluation when <code class="language-plaintext highlighter-rouge">trap</code> is set up.
On the 2nd evaluation when the trap is executed, you will have an odd number of quotes!</p>

<h2 id="correct-quoting">correct quoting</h2>

<p>So we need a mechanism to quote <code class="language-plaintext highlighter-rouge">$tmp</code> correctly, so it survives two rounds of evaluation.</p>

<p>Luckily <code class="language-plaintext highlighter-rouge">bash</code> has such a feature:</p>
<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nv">tmp</span><span class="o">=</span><span class="si">$(</span><span class="nb">mktemp</span> <span class="nt">-d</span><span class="si">)</span>
<span class="c"># shellcheck disable=SC2064</span>
<span class="nb">trap</span> <span class="s2">"rm -r </span><span class="k">${</span><span class="nv">tmp</span><span class="p">@Q</span><span class="k">}</span><span class="s2">"</span> EXIT
</code></pre></div></div>

<p><code class="language-plaintext highlighter-rouge">@Q</code> is an <em>operator</em>, which is documented like this in <a href="man:bash(1)">man:bash</a>:</p>
<blockquote>
  <p>The expansion is a string that is the value of parameter quoted in a format that can be reused as input.</p>
</blockquote>

<p>That is exactly what we want:</p>
<ul>
  <li>the outer quotes prevent <code class="language-plaintext highlighter-rouge">$tmp</code> from being split when the <code class="language-plaintext highlighter-rouge">trap</code> is set up.</li>
  <li>the <code class="language-plaintext highlighter-rouge">@Q</code> adds the necessary escaping to also prevent <code class="language-plaintext highlighter-rouge">$tmp</code> from being split when the trap executes.</li>
</ul>

<h2 id="closing-words">closing words</h2>

<p>Be warned that the operator <code class="language-plaintext highlighter-rouge">@Q</code> is a <code class="language-plaintext highlighter-rouge">bash</code>ism:
This is not supported by <code class="language-plaintext highlighter-rouge">ash</code>, <code class="language-plaintext highlighter-rouge">dash</code>, or <code class="language-plaintext highlighter-rouge">busybox sh</code>:
There you have to quote <code class="language-plaintext highlighter-rouge">"</code> and <code class="language-plaintext highlighter-rouge">'</code> manually.
I leave that to you.</p>
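
<p>For those who want a starting point, here is a minimal sketch of manual quoting for POSIX shells.
The helper name <code class="language-plaintext highlighter-rouge">quote_sh</code> is made up; it uses the well-known idiom of wrapping the value in single quotes and rewriting each embedded single quote as <code class="language-plaintext highlighter-rouge">'\''</code>:</p>
<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code>#!/bin/sh
# quote_sh STRING: hypothetical stand-in for bash's ${var@Q}.
# Single quotes protect everything except the single quote itself,
# so each ' is replaced by '\'' and the result is wrapped in '...'.
quote_sh () {
    printf '%s\n' "$1" | sed "s/'/'\\\\''/g; 1s/^/'/; \$s/\$/'/"
}

tmp=$(mktemp -d)
# shellcheck disable=SC2064
trap "rm -r $(quote_sh "$tmp")" EXIT
</code></pre></div></div>
<p>Double quotes inside the value need no extra treatment, because they are not special within single quotes.</p>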

<p>I will simply accept <code class="language-plaintext highlighter-rouge">bash</code> and use <code class="language-plaintext highlighter-rouge">@Q</code> as that is much more readable and — most importantly — correct.</p>

<!-- *[FD]: File Daemon -->
<!-- *[FD]: File Descriptor -->
<!-- *[GPT]: Generative Pre-trained Transformer -->
<!-- *[GPT]: Global Partitioning Table -->
<!-- *[GPT]: GUID Partition Table -->]]></content><author><name>Philipp Hahn</name></author><category term="shell" /><summary type="html"><![CDATA[What’s wrong with #!/bin/bash TMPDIR=$(mktemp -d) trap 'rm -r $TMPDIR' EXIT ...]]></summary></entry><entry><title type="html">Strange GitLab shell eval behaviour</title><link href="https://blog.pmhahn.de/gitlab-eval/" rel="alternate" type="text/html" title="Strange GitLab shell eval behaviour" /><published>2025-06-05T13:35:00+02:00</published><updated>2025-06-05T13:35:00+02:00</updated><id>https://blog.pmhahn.de/gitlab-eval</id><content type="html" xml:base="https://blog.pmhahn.de/gitlab-eval/"><![CDATA[<p>Two colleagues of mine contacted me this week with a strange GitLab runner behaviour.
The following pipeline did not succeed:</p>

<div class="language-yaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="na">original</span><span class="pi">:</span>
  <span class="na">script</span><span class="pi">:</span>
    <span class="pi">-</span> <span class="s2">"</span><span class="s">sh</span><span class="nv"> </span><span class="s">-c</span><span class="nv"> </span><span class="s">'exit</span><span class="nv"> </span><span class="s">42'"</span>
  <span class="na">allow_failure</span><span class="pi">:</span>
    <span class="na">exit_codes</span><span class="pi">:</span>
      <span class="pi">-</span> <span class="m">42</span>
</code></pre></div></div>

<!--more-->

<p>The <code class="language-plaintext highlighter-rouge">sh -c 'exit 42'</code> launches a sub-process, which exits with return value <code class="language-plaintext highlighter-rouge">42</code>.
In my case I was launching some linter, which indicated a special condition by returning that code.</p>

<p>GitLab executes commands within a shell, where <code class="language-plaintext highlighter-rouge">set -e</code> is used.
This terminates the sequence of commands as soon as a command does <strong>not</strong> succeed, i.e. returns a non-zero exit status.
But instead of getting <code class="language-plaintext highlighter-rouge">42</code> as the result, GitLab reports <code class="language-plaintext highlighter-rouge">1</code> as the exit status for the job.
As that code is not listed in <code class="language-plaintext highlighter-rouge">exit_codes</code>, the job fails with an error instead of <em>okay with warnings</em>.</p>
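
<p>For comparison, this is the behaviour one would expect from <code class="language-plaintext highlighter-rouge">set -e</code>.
It is a minimal sketch outside of GitLab, which only assumes that some <code class="language-plaintext highlighter-rouge">sh</code> is available:</p>
<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code>#!/bin/sh
# The inner "exit 42" fails, so -e stops the outer shell before "echo"
# and the outer shell passes the failing status on.
rv=0
sh -e -c 'sh -c "exit 42"; echo "not reached"' || rv=$?
echo "exit status: $rv"
</code></pre></div></div>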

<h2 id="working-alternatives">Working alternatives</h2>

<p>In contrast to that the following two jobs did work as expected:</p>

<div class="language-yaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="na">direct</span><span class="pi">:</span>
  <span class="na">script</span><span class="pi">:</span>
    <span class="pi">-</span> <span class="s2">"</span><span class="s">exit</span><span class="nv"> </span><span class="s">42"</span>
<span class="na">fail</span><span class="pi">:</span>
  <span class="na">script</span><span class="pi">:</span>
    <span class="pi">-</span> <span class="s2">"</span><span class="s">sh</span><span class="nv"> </span><span class="s">-c</span><span class="nv"> </span><span class="s">'exit</span><span class="nv"> </span><span class="s">42'</span><span class="nv"> </span><span class="s">||</span><span class="nv"> </span><span class="s">exit</span><span class="nv"> </span><span class="s">$?"</span>
</code></pre></div></div>

<h2 id="debugging-job-failures">Debugging job failures</h2>

<p>Using <code class="language-plaintext highlighter-rouge">variables: CI_DEBUG_TRACE: true</code> did not show anything strange:
GitLab sets up a trap handler called <code class="language-plaintext highlighter-rouge">runner_script_trap</code> to collect the return code of the failing command and converts it into a <abbr title="JavaScript Object Notation">JSON</abbr> output, which is then consumed by GitLab.
There the wrong return value <code class="language-plaintext highlighter-rouge">1</code> was also visible.</p>

<p>But actually it showed another hint:
GitLab executed the job by <a href="https://gitlab.com/gitlab-org/gitlab-runner/-/blob/main/shells/bash.go?ref_type=heads#L394-398">generating a shell script</a> in a temporary file, which then gets executed.
By using <code class="language-plaintext highlighter-rouge">cp "$0" ./debug.txt</code> combined with <code class="language-plaintext highlighter-rouge">artifacts: paths: [debug.txt]</code> you can get hold of it.
Downloading that artifact shows the following (abbreviated) content:</p>

<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code>runner_script_trap<span class="o">()</span> <span class="o">{</span> <span class="nv">exit_code</span><span class="o">=</span><span class="nv">$?</span><span class="p">;</span> <span class="nb">echo </span>JSON… <span class="o">}</span>
<span class="nb">trap </span>runner_script_trap EXIT
<span class="nb">set</span> <span class="nt">-x</span> <span class="nt">-e</span> <span class="nt">-o</span> pipefail +o noclobber
: | <span class="nb">eval</span> <span class="s1">$'export=CI… CODE…'</span>
</code></pre></div></div>

<ol>
  <li>It sets up a trap handler to record the exit status as <abbr title="JavaScript Object Notation">JSON</abbr>.</li>
  <li>It sets up the environment to fail on error.</li>
  <li>It executes the ‘script’ code as part of a <em>shell pipeline</em> using <code class="language-plaintext highlighter-rouge">eval</code>.</li>
</ol>

<h2 id="several-experiments">Several experiments</h2>

<p>The last part is the relevant thing here and shows some very strange behaviour.
Modifying the command only slightly changes the return value:</p>

<div class="language-console highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="gp">$</span><span class="w"> </span>bash <span class="nt">-e</span> <span class="nt">-c</span> <span class="s1">':|eval "sh -e -c \"exit 12\""'</span><span class="p">;</span> <span class="nb">echo</span> <span class="nv">$?</span>
<span class="go">1
</span><span class="gp">$</span><span class="w"> </span>bash <span class="nt">--posix</span> <span class="nt">-e</span> <span class="nt">-c</span> <span class="s1">':|eval "sh -e -c \"exit 12\""'</span><span class="p">;</span> <span class="nb">echo</span> <span class="nv">$?</span>
<span class="go">1
</span><span class="gp">$</span><span class="w"> </span>busybox sh <span class="nt">-e</span> <span class="nt">-c</span> <span class="s1">':|eval "sh -e -c \"exit 12\""'</span><span class="p">;</span> <span class="nb">echo</span> <span class="nv">$?</span>
<span class="go">12
</span><span class="gp">$</span><span class="w"> </span>dash <span class="nt">-e</span> <span class="nt">-c</span> <span class="s1">':|eval "sh -e -c \"exit 12\""'</span><span class="p">;</span> <span class="nb">echo</span> <span class="nv">$?</span>
<span class="go">12
</span><span class="gp">$</span><span class="w"> </span>sh <span class="nt">-e</span> <span class="nt">-c</span> <span class="s1">':|eval "sh -e -c \"exit 12\""'</span><span class="p">;</span> <span class="nb">echo</span> <span class="nv">$?</span>
<span class="go">12
</span><span class="gp">$</span><span class="w"> </span>bash <span class="nt">-e</span> <span class="nt">-c</span> <span class="s1">'eval "exec sh -e -c \"exit 12\""'</span><span class="p">;</span> <span class="nb">echo</span> <span class="nv">$?</span>
<span class="go">12
</span><span class="gp">$</span><span class="w"> </span>bash <span class="nt">-e</span> <span class="nt">-c</span> <span class="s1">'eval "sh -e -c \"exit 12\" || exit \$?"'</span><span class="p">;</span> <span class="nb">echo</span> <span class="nv">$?</span>
<span class="go">12
</span><span class="gp">$</span><span class="w"> </span>bash <span class="nt">-e</span> <span class="nt">-c</span> <span class="s1">'eval "sh -e -c \"exit 12\""'</span><span class="p">;</span> <span class="nb">echo</span> <span class="nv">$?</span>
<span class="go">12
</span><span class="gp">$</span><span class="w"> </span>bash <span class="nt">-e</span> <span class="nt">-c</span> <span class="s1">':|sh -e -c "exit 12"'</span><span class="p">;</span> <span class="nb">echo</span> <span class="nv">$?</span>
<span class="go">12
</span><span class="gp">$</span><span class="w"> </span>bash <span class="nt">-e</span> <span class="nt">-c</span> <span class="s1">':|eval "exit 12"'</span><span class="p">;</span> <span class="nb">echo</span> <span class="nv">$?</span>
<span class="go">12
</span><span class="gp">$</span><span class="w"> </span>bash <span class="nt">-e</span> <span class="nt">-c</span> <span class="s1">' : |(eval "sh -e -c \"exit 12\"")'</span><span class="p">;</span> <span class="nb">echo</span> <span class="nv">$?</span>
<span class="go">12
</span></code></pre></div></div>

<p>So this is some strange behaviour when <code class="language-plaintext highlighter-rouge">|</code> is combined with <code class="language-plaintext highlighter-rouge">eval</code> in <code class="language-plaintext highlighter-rouge">bash</code>.
This looks like <a href="https://savannah.gnu.org/support/index.php?109840">bash bug 109840</a>.</p>

<h2 id="the-fix">The fix</h2>

<p>We fixed this strange GitLab behaviour by enabling the <a href="https://docs.gitlab.com/runner/configuration/feature-flags/#available-feature-flags">Runner Feature Flag</a> <code class="language-plaintext highlighter-rouge">FF_USE_NEW_BASH_EVAL_STRATEGY</code>.
This puts one level of parentheses around the <code class="language-plaintext highlighter-rouge">eval</code> command to run it in a sub-shell – see last example from above.</p>

<p>This can be done in the pipeline itself by setting the variable to <code class="language-plaintext highlighter-rouge">true</code> or <code class="language-plaintext highlighter-rouge">1</code>:</p>
<div class="language-yaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="na">fixed</span><span class="pi">:</span>
  <span class="na">variables</span><span class="pi">:</span>
    <span class="na">FF_USE_NEW_BASH_EVAL_STRATEGY</span><span class="pi">:</span> <span class="s2">"</span><span class="s">true"</span>
  <span class="na">script</span><span class="pi">:</span>
    <span class="pi">-</span> <span class="s2">"</span><span class="s">sh</span><span class="nv"> </span><span class="s">-c</span><span class="nv"> </span><span class="s">'exit</span><span class="nv"> </span><span class="s">42'"</span>
  <span class="na">allow_failure</span><span class="pi">:</span>
    <span class="na">exit_codes</span><span class="pi">:</span>
      <span class="pi">-</span> <span class="m">42</span>
</code></pre></div></div>

<h2 id="closing-words">Closing words</h2>

<p>So be careful when using <code class="language-plaintext highlighter-rouge">allow_failure</code> with <code class="language-plaintext highlighter-rouge">exit_codes</code> and calling <em>external</em> programs.
Make sure to either enable the feature flag or use <code class="language-plaintext highlighter-rouge">exec …</code> or <code class="language-plaintext highlighter-rouge">… || exit $?</code> to really <strong>exit</strong> the shell.</p>

<p>The original <a href="https://gitlab.com/gitlab-org/gitlab-runner/-/issues/27668">GitLab issue 27668</a> has some more details.
Sadly the feature flag is still not enabled by default as of today, so please vote for <a href="https://gitlab.com/gitlab-org/gitlab-runner/-/issues/27909">issue 27909</a>.</p>

<!-- *[GPT]: GUID Partition Table -->]]></content><author><name>Philipp Hahn</name></author><category term="gitlab" /><category term="shell" /><summary type="html"><![CDATA[Two colleagues of mine contacted me this week with a strange GitLab runner behaviour. The following pipeline did not succeed: original: script: - "sh -c 'exit 42'" allow_failure: exit_codes: - 42]]></summary></entry><entry><title type="html">AVM FRITZ!Smart Energy 200 CSV issues</title><link href="https://blog.pmhahn.de/avm-smart-energy-200-csv/" rel="alternate" type="text/html" title="AVM FRITZ!Smart Energy 200 CSV issues" /><published>2025-04-30T14:24:00+02:00</published><updated>2025-04-30T14:24:00+02:00</updated><id>https://blog.pmhahn.de/avm-smart-energy-200-csv</id><content type="html" xml:base="https://blog.pmhahn.de/avm-smart-energy-200-csv/"><![CDATA[<p>I privately own several (~25) devices made by my employer <a href="https://avm.de/"><abbr title="Audio-Visuelles Marketing">AVM</abbr> <abbr title="Gesellschaft mit beschränkter Haftung">GmbH</abbr></a>, among them a switchable power socket, the <a href="https://fritz.com/produkte/smart-home/fritzsmart-energy-200/">FRITZ!Smart Energy 200</a>, formerly known as <em>FRITZ!DECT 200</em>.
It also records the energy consumption of the connected devices.
The readings are aggregated over 24 hours, 1 week, 1 month or 1-2 years and provided as a <abbr title="Comma Separated Values">CSV</abbr> data set.
You can either download it on demand from the FRITZ!Box web interface or have it sent to you regularly as a push service mail.</p>

<!--more-->

<h2 id="prozedur">Procedure</h2>

<p>First, a look at the procedure for getting at the data.</p>

<h3 id="kritik-1-push-e-mail">Criticism 1: Push e-mail</h3>

<p>The push service did not work for years (at least for me).
The cause was that I changed my provider at some point.
The login name of my old provider, however, was still configured as the sender e-mail address.
At some point that address stopped working, and from then on all mails ended up in the void.</p>

<p>Please show somewhere in the user interface when there are problems with the push e-mail.</p>

<h3 id="kritik-2-fritzbox-web-oberfläche">Criticism 2: FRITZ!Box web interface</h3>

<p>Navigating the web interface is anything but intuitive either:</p>
<ol>
  <li>The configuration of the global push service is under <em>System</em> → <em>Push Service</em></li>
  <li>The <em>Übersicht</em> (overview) does list the 200 as a <em>Smart Home</em> device, but without a direct link to its settings</li>
  <li>Those are only found under <em>Smart Home</em> → <em>Geräte und Gruppen</em> → <em>device name</em> → <em>Einstellungen</em> → <em>Allgemein</em> or <em>Energieanzeige</em> → <em>Gesamtenergie (kWh)</em></li>
</ol>

<p>Please shorten the path to this information.</p>

<h3 id="kritik-3-api">Criticism 3: <abbr title="Application Programming Interface">API</abbr></h3>

<p>E-mail is suboptimal for automated processing:</p>
<ol>
  <li>You have to fetch the mails somehow via <abbr title="Internet Message Access Protocol">IMAP</abbr> or <abbr title="Post Office Protocol version 3">POP3</abbr></li>
  <li>You have to parse the e-mail and search it for the <abbr title="Comma Separated Values">CSV</abbr> attachment</li>
  <li>You then have to store the data somewhere.</li>
</ol>

<p>The download from the web interface cannot be called externally either.</p>

<p>Please provide an <abbr title="Application Programming Interface">API</abbr> that allows downloading the information without much effort.
Authentication should use a standardized mechanism such as username and password, an <abbr title="Hypertext Transfer Protocol">HTTP</abbr> header token, or similar.</p>

<p>PS: In the meantime I have asked my colleagues and they pointed me to the <a href="https://fritz.com/fileadmin/user_upload/Global/Service/Schnittstellen/AHA-HTTP-Interface.pdf"><abbr title="Audio-Visuelles Marketing">AVM</abbr> Home Automation Interface</a>. Thanks.</p>
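
<p>As a rough illustration of what a scripted request against that interface could look like, here is a minimal Python sketch.
The path <code class="language-plaintext highlighter-rouge">/webservices/homeautoswitch.lua</code> and the parameters <code class="language-plaintext highlighter-rouge">ain</code>, <code class="language-plaintext highlighter-rouge">switchcmd</code> and <code class="language-plaintext highlighter-rouge">sid</code> are how I understand the linked interface description and should be checked against it.
The host name <code class="language-plaintext highlighter-rouge">fritz.box</code> and the helper name <code class="language-plaintext highlighter-rouge">aha_url</code> are assumptions, and obtaining a valid session ID is not shown.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>from urllib.parse import urlencode


def aha_url(ain, command, sid, host="fritz.box"):
    """Build an AHA-HTTP request URL; host, ain and sid are placeholders."""
    # urlencode takes care of escaping, e.g. the space inside some AINs.
    query = urlencode({"ain": ain, "switchcmd": command, "sid": sid})
    return f"http://{host}/webservices/homeautoswitch.lua?{query}"
</code></pre></div></div>

<p>The resulting URL could then be fetched with <code class="language-plaintext highlighter-rouge">urllib.request.urlopen</code>; which commands a given device supports has to be looked up in the interface description.</p>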

<h3 id="kritik-4-dateinamen">Criticism 4: File names</h3>

<p>The file names of the <abbr title="Comma Separated Values">CSV</abbr> files follow 2 schemes, depending on whether you have the file sent to you by the push service or download it from the web interface:</p>
<ol>
  <li><code class="language-plaintext highlighter-rouge">YYYYmmdd-HHMMSS-idXXXXX_PERIOD.csv</code> (push service e-mail)</li>
  <li><code class="language-plaintext highlighter-rouge">$NAME_dd.mm.YYYY_HH-MM_PERIOD.csv</code> (download)</li>
</ol>

<ul>
  <li><code class="language-plaintext highlighter-rouge">NAME</code> is the configured device name, the <code class="language-plaintext highlighter-rouge">ID</code> an arbitrary(?) number.
Why is the name not included in the file name of the push service e-mail?
If you have several devices, you have to parse the device name out of the e-mail some other way.
The <code class="language-plaintext highlighter-rouge">ID</code> does not show up anywhere else.</li>
  <li>Why does one variant contain <em>seconds</em> while the other does not?</li>
  <li><code class="language-plaintext highlighter-rouge">PERIOD</code> is <code class="language-plaintext highlighter-rouge">24h</code> or <code class="language-plaintext highlighter-rouge">week</code> or <code class="language-plaintext highlighter-rouge">month</code> or <code class="language-plaintext highlighter-rouge">2years</code>.<br />
<strong>Exception</strong>: With the push service it is <code class="language-plaintext highlighter-rouge">1month</code>, i.e. with a leading <code class="language-plaintext highlighter-rouge">1</code>.</li>
</ul>

<p>Please unify the file names for the push service and the download.</p>
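
<p>Until then, both schemes have to be recognised separately.
The following is a minimal Python sketch of how that could be done; the two regular expressions and the function name <code class="language-plaintext highlighter-rouge">parse_filename</code> are my own assumptions derived from the two patterns above, not anything the firmware documents.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import re
from datetime import datetime

# Hypothetical patterns derived from the two naming schemes described above.
PUSH_RE = re.compile(r"(\d{8}-\d{6})-id(\d+)_(\w+)\.csv")
DOWNLOAD_RE = re.compile(r"(.+)_(\d{2}\.\d{2}\.\d{4}_\d{2}-\d{2})_(\w+)\.csv")


def parse_filename(filename):
    """Return (device, created, period); device is the ID or the name."""
    match = PUSH_RE.fullmatch(filename)
    if match:
        stamp, device, period = match.groups()
        created = datetime.strptime(stamp, "%Y%m%d-%H%M%S")
    else:
        match = DOWNLOAD_RE.fullmatch(filename)
        if match is None:
            raise ValueError(f"unknown file name scheme: {filename!r}")
        device, stamp, period = match.groups()
        created = datetime.strptime(stamp, "%d.%m.%Y_%H-%M")
    # The push service writes "1month" where the download writes "month".
    if period == "1month":
        period = "month"
    return device, created, period
</code></pre></div></div>

<p>Note that the first element of the result is the numeric ID for push files and the device name for downloads, so the two still cannot be matched to each other without additional information.</p>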

<hr />

<h2 id="kopfzeilen">Header lines</h2>

<p>Let us now take a closer look at the content of the <abbr title="Comma Separated Values">CSV</abbr> files.
They roughly have the following structure:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>sep=;
Header line
Data lines…
</code></pre></div></div>

<h3 id="kritik-5-csv-format">Criticism 5: <abbr title="Comma Separated Values">CSV</abbr> format</h3>

<p><a href="https://de.wikipedia.org/wiki/CSV_(Dateiformat)"><abbr title="Comma Separated Values">CSV</abbr> files</a> are easy to generate, but processing them is anything but trivial, because there are many sub-formats:</p>
<ul>
  <li>different <strong>character encodings</strong>, e.g. <em>ASCII</em>, <em><abbr title="Unicode Transformation Format">UTF</abbr>-8</em>, <em><abbr title="Unicode Transformation Format">UTF</abbr>-16</em>, <em><abbr title="International Organization for Standardization">ISO</abbr>-8859-1</em>, …</li>
  <li>different characters for the <strong>line break</strong>, e.g. <em>LF</em> (<code class="language-plaintext highlighter-rouge">\n</code>), <em>CR</em> (<code class="language-plaintext highlighter-rouge">\r</code>), <em>CR</em>+<em>LF</em> (<code class="language-plaintext highlighter-rouge">\r\n</code>), …</li>
  <li>different <strong>separators</strong> for the columns: <em>comma</em> (<code class="language-plaintext highlighter-rouge">,</code>), <em>semicolon</em> (<code class="language-plaintext highlighter-rouge">;</code>), <em>tab</em> (<code class="language-plaintext highlighter-rouge">\t</code>), space (<code class="language-plaintext highlighter-rouge"> </code>), …</li>
  <li><strong>strings</strong> enclosed in <em>single</em> (<code class="language-plaintext highlighter-rouge">'</code>) or <em>double quotes</em> (<code class="language-plaintext highlighter-rouge">"</code>), or not quoted at all</li>
  <li>different <strong>decimal separators</strong> for numbers, e.g. <em>dot</em> (<code class="language-plaintext highlighter-rouge">.</code>) or <em>comma</em> (<code class="language-plaintext highlighter-rouge">,</code>)</li>
  <li><strong>escape mechanism</strong> for characters that would otherwise be interpreted as separators</li>
  <li>leading <strong>spaces</strong> after a separator are significant or are ignored</li>
  <li>the file contains a <strong>header line</strong> or starts directly with the first <strong>data line</strong></li>
</ul>

<p>Please provide the data in a structured format that is easy to process and, if possible, has unambiguous semantics.</p>

<h3 id="kritik-5-csv-trennzeichen">Criticism 5: <abbr title="Comma Separated Values">CSV</abbr> separator</h3>

<p>The file starts with a <code class="language-plaintext highlighter-rouge">sep=;</code>.
This is an Excel extension that many other <abbr title="Comma Separated Values">CSV</abbr> parsers do not understand.
It is not part of <a href="https://www.rfc-editor.org/rfc/rfc4180"><abbr title="Request for Comment">RFC</abbr> 4180: Common Format and MIME Type for Comma-Separated Values (<abbr title="Comma Separated Values">CSV</abbr>) Files</a>.
<a href="https://www.w3.org/TR/tabular-data-model/#h-sotd">W3C: Model for Tabular Data and Metadata on the Web</a> merely mentions that some programs store <em>metadata</em> this way.
It is advised against, because it tends to cause problems:</p>
<ul>
  <li>The <a href="https://docs.python.org/3/library/csv.html#csv.DictReader">Python parser</a>, for example, only recognises a header line if it is the first line.
The additional line with the <code class="language-plaintext highlighter-rouge">sep=;</code> breaks this mechanism.</li>
</ul>

<p>Please remove this line.</p>
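
<p>As long as the line is there, it has to be stripped before the data reaches the parser.
A minimal sketch with Python's <code class="language-plaintext highlighter-rouge">csv</code> module; the helper name <code class="language-plaintext highlighter-rouge">read_rows</code> and the fallback to <code class="language-plaintext highlighter-rouge">;</code> are assumptions of mine:</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import csv


def read_rows(text):
    """Parse CSV text that may start with an Excel-style 'sep=' line."""
    lines = text.splitlines()
    delimiter = ";"  # assumed default for these files
    if lines and lines[0].startswith("sep="):
        # Take the separator from the hint and drop the line, so that the
        # real header line is the first one the csv module gets to see.
        delimiter = lines[0][len("sep="):] or delimiter
        lines = lines[1:]
    return list(csv.DictReader(lines, delimiter=delimiter))
</code></pre></div></div>

<p>With the hint removed, <code class="language-plaintext highlighter-rouge">csv.DictReader</code> picks up the real header line again. For the actual export files this alone is not enough, because their header line contains duplicate and empty column names.</p>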

<h3 id="kritik-6-kopfzeile-uneinheitlich">Criticism 6: Inconsistent header line</h3>

<p>Next comes the header line.
Normally it serves to name the columns.
Here are a few examples<sup id="fnref:1" role="doc-noteref"><a href="#fn:1" class="footnote" rel="footnote">1</a></sup>:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>1            |2             |3      |4                |5      |6           |7      |8|9       |10        |11|12     |13
Datum/Uhrzeit;Verbrauchswert;Einheit;Verbrauch in Euro;Einheit;CO2-Ausstoss;Einheit; ;Ansicht:;Datum     ;  ;1 Monat;dd.mm.YYYY HH:MM Uhr
Datum/   Zeit;Energie       ;Einheit;Energie   in Euro;Einheit;CO2-Ausstoss;Einheit; ;Ansicht:;1 Monat   ;  ;Datum  ;dd.mm.YYYY HH-MM Uhr
Datum/   Zeit;Energie       ;Einheit;Energie   in Euro;Einheit;CO2-Ausstoss;Einheit; ;Ansicht:;1 Woche   ;  ;Datum  ;dd.mm.YYYY HH-MM Uhr
Datum/   Zeit;Energie       ;Einheit;Energie   in Euro;Einheit;CO2-Ausstoss;Einheit; ;Ansicht:;24 Stunden;  ;Datum  ;dd.mm.YYYY HH-MM Uhr
Datum/   Zeit;Energie       ;Einheit;Energie   in Euro;Einheit;CO2-Ausstoss;Einheit; ;Ansicht:;2 Jahre   ;  ;Datum  ;dd.mm.YYYY HH-MM Uhr
</code></pre></div></div>
<ol>
  <li>Column 1 is <strong>inconsistently</strong> named <code class="language-plaintext highlighter-rouge">…/Uhrzeit</code> in some files and just <code class="language-plaintext highlighter-rouge">…/Zeit</code> in others.</li>
  <li>Column 2 is <strong>inconsistently</strong> named <code class="language-plaintext highlighter-rouge">Energie</code> in some files and <code class="language-plaintext highlighter-rouge">Verbrauchswert</code> in others.</li>
  <li>Columns 3, 5 and 7 each have the heading <code class="language-plaintext highlighter-rouge">Einheit</code> and give the physical unit of the column before them.
The column names should rather be unique, because some <abbr title="Comma Separated Values">CSV</abbr> parsers do not allow duplicate column names.</li>
  <li>Column 4 is <strong>inconsistently</strong> named <code class="language-plaintext highlighter-rouge">Verbrauch in Euro</code><sup id="fnref:2" role="doc-noteref"><a href="#fn:2" class="footnote" rel="footnote">2</a></sup> in some files and <code class="language-plaintext highlighter-rouge">Energie in Euro</code><sup id="fnref:3" role="doc-noteref"><a href="#fn:3" class="footnote" rel="footnote">3</a></sup> in others.
Why is the unit <em>Euro</em> in the heading at all, when there is a separate column 5 for it?</li>
  <li>Columns 8-13 exist only in the header line.
They therefore do not name columns here, but contain <em>metadata</em> about the whole file.</li>
  <li>Columns 8 and 11 are empty.
That confuses some parsers.
LibreOffice, for example, allows ignoring such empty columns.</li>
  <li>Columns 10 and 12 are sometimes <strong>swapped</strong>:
Sometimes the <em>period</em> is in column 10, sometimes in column 12.</li>
  <li>Column 13 contains the <em>point in time</em> at which the file was generated.
The hours are separated from the minutes by different <strong>separators</strong>: sometimes by a <em>colon</em> (<code class="language-plaintext highlighter-rouge">:</code>), sometimes by a <em>minus sign</em> (<code class="language-plaintext highlighter-rouge">-</code>).</li>
</ol>

<p>Please generate a uniform and consistent header line!</p>
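
<p>One way to cope with the variants listed above is to ignore the labels entirely and go by position and value instead.
The following Python sketch does that; the canonical column names, the set of period labels (taken from the sample lines above) and the function name <code class="language-plaintext highlighter-rouge">split_header</code> are assumptions for illustration only.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>from datetime import datetime

# Hypothetical canonical names, assigned purely by position because the
# labels vary between files and "Einheit" occurs three times.
COLUMNS = ["time", "energy", "energy_unit", "cost", "cost_unit", "co2", "co2_unit"]
# Period labels as they appear in the sample header lines.
PERIODS = {"24 Stunden", "1 Woche", "1 Monat", "2 Jahre"}


def split_header(line):
    """Return (column names, metadata) for one raw header line."""
    cells = [cell.strip() for cell in line.split(";")]
    extra = [cell for cell in cells[len(COLUMNS):] if cell]
    # The period and the literal "Datum" swap places, so search by value.
    period = next(cell for cell in extra if cell in PERIODS)
    # The creation time is the last cell; accept both "HH:MM" and "HH-MM".
    created = datetime.strptime(extra[-1].replace("-", ":"), "%d.%m.%Y %H:%M Uhr")
    return COLUMNS, {"period": period, "created": created}
</code></pre></div></div>

<p>The returned column names can be passed to <code class="language-plaintext highlighter-rouge">csv.DictReader</code> as <code class="language-plaintext highlighter-rouge">fieldnames</code>, which sidesteps the duplicate <code class="language-plaintext highlighter-rouge">Einheit</code> headings. A header with an unknown period label makes <code class="language-plaintext highlighter-rouge">next()</code> raise <code class="language-plaintext highlighter-rouge">StopIteration</code>, which a real parser would want to turn into a proper error.</p>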

<hr />

<h2 id="datensätze">Records</h2>

<p>Depending on the period, the records use a different format for the first column with the <em>Datum/[Uhr]zeit</em> (date/time).
So you need a separate parser for each format.</p>

<h3 id="kritik-7-tag--24h">Criticism 7: Day / 24h</h3>

<p>The file with the records for one day contains, for each of the last 24 hours, 4 records at 15-minute intervals.
The first column looks as follows:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>23:45;…
0:00;…
jetzt;…
</code></pre></div></div>

<p>Without context knowledge these timestamps cannot be interpreted:</p>
<ol>
  <li>You need the creation time of the data from the header line or the file name in order to add the correct date.</li>
  <li>You have to detect yourself between which lines the date changed.</li>
  <li>The <code class="language-plaintext highlighter-rouge">jetzt</code> (now) requires yet another special case.</li>
</ol>

<p>It remains unclear whether the time refers to the <em>beginning</em> or the <em>end</em> of the recording period.
From <code class="language-plaintext highlighter-rouge">jetzt</code> one might infer the <em>end</em>, but apparently it is the <strong>beginning</strong> in each case.
In that respect the label <code class="language-plaintext highlighter-rouge">jetzt</code> is doubly wrong.</p>

<p>Please always give a complete timestamp consisting of date and time.<br />
Please document whether it is the <em>beginning</em> or the <em>end</em> of the recording period.</p>
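
<p>The three steps above can be sketched in Python roughly as follows.
The sketch assumes that the rows are in chronological order, that each label marks the beginning of a 15-minute slot, and that <code class="language-plaintext highlighter-rouge">jetzt</code> stands for the slot containing the creation time; none of this is documented, and the function name is hypothetical.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>from datetime import datetime, timedelta


def expand_day_labels(labels: list[str], created: datetime) -> list[datetime]:
    """Turn labels like '23:45', '0:00', 'jetzt' into full datetimes."""
    day = created.replace(hour=0, minute=0, second=0, microsecond=0)
    later = None  # timestamp of the chronologically following row
    result = []
    for label in reversed(labels):
        if label == "jetzt":
            # Assumption: "jetzt" is the 15-minute slot containing `created`.
            stamp = day.replace(hour=created.hour, minute=created.minute // 15 * 15)
        else:
            hour, minute = map(int, label.split(":"))
            stamp = day.replace(hour=hour, minute=minute)
            if later is not None and stamp > later:
                # Walking backwards, the time jumped up: midnight was crossed.
                day -= timedelta(days=1)
                stamp -= timedelta(days=1)
        later = stamp
        result.append(stamp)
    return result[::-1]
</code></pre></div></div>

<p>Walking backwards from the creation time means that only the newest row has to be anchored; every older row follows from the one after it.</p>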

<h3 id="kritik-8-woche">Criticism 8: Week</h3>

<p>The file with the records for one week contains, for each of the last 7 days, 4 records at 6-hour intervals.
The first column looks as follows:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>12;…
18;…
Do.;…
6;…
</code></pre></div></div>

<p>Without context knowledge these timestamps cannot be interpreted:</p>
<ol>
  <li>You need the creation time of the data from the header line or the file name in order to add the correct date.</li>
  <li>Instead of 0 o'clock the <em>weekday</em> is named, which however remains useless without context knowledge.
It is also unclear whether the weekday is localised, i.e. whether the weekdays are named differently depending on the language setting.</li>
  <li>The different data types <em>weekday</em> and <em>hour</em> only make parsing more complicated.</li>
  <li>The lines must be processed in exactly this order.
They must never be re-sorted, because the <em>hour</em> lines could then no longer be assigned unambiguously to a weekday.</li>
</ol>

<p>It remains unclear whether the time refers to the <em>beginning</em> or the <em>end</em> of the recording period.
Presumably the <strong>beginning</strong>.</p>

<p>Please always give a complete timestamp consisting of date and time.<br />
Please document whether it is the <em>beginning</em> or the <em>end</em> of the recording period.</p>
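
<p>A Python sketch of how the mixed labels could be resolved without knowing the weekday names at all.
It assumes that the rows are in chronological order, that the newest row belongs to the day the file was created, and that every non-numeric label stands for 0:00; the function name is hypothetical.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>from datetime import datetime, timedelta


def expand_week_labels(labels: list[str], created: datetime) -> list[datetime]:
    """Turn labels like '12', '18', 'Do.', '6' into full datetimes."""
    day = created.replace(hour=0, minute=0, second=0, microsecond=0)
    result = []
    for label in reversed(labels):
        if label.isdigit():
            result.append(day.replace(hour=int(label)))
        else:
            # Any non-numeric label is a weekday name standing in for 0:00.
            result.append(day)
            # Everything before this row belongs to the previous day.
            day -= timedelta(days=1)
    return result[::-1]
</code></pre></div></div>

<p>Because the weekday name is only used as a marker for midnight, the question of localisation does not matter for this approach. The order of the rows, however, remains essential.</p>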

<h3 id="kritik-9-monat">Criticism 9: Month</h3>

<p>The file with the records for one month contains one record per day for the last 31 days.
The first column looks as follows:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>31.12.;…
1.1.;…
</code></pre></div></div>

<p>Without context knowledge these dates cannot be interpreted:</p>
<ol>
  <li>You need the creation time of the data from the header line in order to add the correct year.</li>
  <li>You have to detect yourself between which lines the year changed.</li>
  <li>31 days are always listed, even if the month only has 30/29/28 days.
If you aggregate several files, you have to watch out for overlapping days and possibly handle them separately.</li>
</ol>

<p>The value presumably refers to a complete day, i.e. from 00:00 to 00:00 of the following day.
When converting it into a timestamp you therefore have to add <code class="language-plaintext highlighter-rouge">00:00:00</code> or <code class="language-plaintext highlighter-rouge">23:59:59</code> as the time, depending on whether you work with the <em>beginning</em> or the <em>end</em>.</p>

<p>Please always give a complete date including the year.</p>
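
<p>Adding the missing year works much like the day and week cases: walk backwards from the creation date and step back one year whenever the date jumps up.
A minimal Python sketch, assuming chronological rows that end on or before the creation date; the function name is hypothetical.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>from datetime import date


def expand_month_labels(labels: list[str], created: date) -> list[date]:
    """Turn labels like '31.12.', '1.1.' into full dates."""
    year = created.year
    later = (created.month, created.day)  # (month, day) of the following row
    result = []
    for label in reversed(labels):
        day, month = map(int, label.rstrip(".").split("."))
        if (month, day) > later:
            # Walking backwards, the date jumped up: New Year was crossed.
            year -= 1
        later = (month, day)
        result.append(date(year, month, day))
    return result[::-1]
</code></pre></div></div>

<p>The overlap between consecutive exports is not handled here; when merging several files, duplicate dates still have to be dropped or reconciled.</p>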

<h3 id="kritik-10-jahr">Criticism 10: Year</h3>

<p>The file with the records for the last 1-2 years contains one record per month.
The first column looks as follows:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>Mai 2023;…
Juni 2023;…
</code></pre></div></div>

<ol>
  <li>It remains unclear how the months are localised, i.e. how they are named depending on the language setting.</li>
</ol>

<p>The value presumably refers to a complete month, i.e. from 00:00 of the 1st day inclusive to 00:00 of the 1st day of the following month exclusive.
In detail, however, it remains unclear here as well whether the firmware simply always calculates with 31 days per month internally.
When converting it into a timestamp, care must again be taken whether the first or the last day of the month is used and which time of day is chosen.</p>

<p>Please do not localise dates.
Please give an exact date for <em>beginning</em> and <em>end</em>.</p>
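
<p>For German labels such as the two shown above, the conversion into an explicit half-open range could look like this in Python.
The table of full German month names is an assumption based on those two samples; abbreviated or differently localised names would need their own table, and the function name is hypothetical.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>from datetime import date

# Full German month names; assumed from the samples "Mai" and "Juni".
MONTHS = ["Januar", "Februar", "März", "April", "Mai", "Juni", "Juli",
          "August", "September", "Oktober", "November", "Dezember"]


def month_range(label: str) -> tuple[date, date]:
    """Return (first day, first day of next month) for e.g. 'Mai 2023'."""
    name, year = label.split()
    month = MONTHS.index(name) + 1
    begin = date(int(year), month, 1)
    # December rolls over into January of the following year.
    end = date(int(year) + month // 12, month % 12 + 1, 1)
    return begin, end
</code></pre></div></div>

<p>Returning the first day of the following month as an exclusive end avoids having to decide between <code class="language-plaintext highlighter-rouge">23:59:59</code> and the varying length of the months.</p>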

<h3 id="kritik-11-daten-nicht-konstant">Criticism 11: Data not constant</h3>

<p>If you export the data several times in a row, you notice that the values for identical periods are not identical:
They differ by only a few watts, but it is unpleasant nonetheless.
This is probably due to the <a href="https://oss.oetiker.ch/rrdtool/">round-robin database</a>, which internally uses <em>sampling</em> to interpolate (missing) values.</p>

<p>Please use a database that reproducibly returns the same data.</p>

<hr />

<h2 id="fazit">Conclusion</h2>

<p>Within the FRITZ ecosystem the products work wonderfully with each other, but exporting the data for further processing in another system is a disaster.
I consider <abbr title="Comma Separated Values">CSV</abbr> in particular a very problematic format, since processing it is anything but simple.
The lack of an <abbr title="Application Programming Interface">API</abbr> for fetching the current data from a script makes it even more complicated.</p>

<p>My parser, written in Python, currently comes to 324 lines, including some test cases and validation of the strings.
Not exactly little for a program that only has to parse <abbr title="Comma Separated Values">CSV</abbr> files and bring them into a uniform format.</p>

<hr />

<div class="footnotes" role="doc-endnotes">
  <ol>
    <li id="fn:1" role="doc-endnote">
      <p>For better readability I have inserted spaces to make the columns easier to tell apart. <a href="#fnref:1" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:2" role="doc-endnote">
      <p>As a physicist I take issue with the word <em>Verbrauch</em> (consumption), because according to the <a href="https://de.wikipedia.org/wiki/Thermodynamik">first law of thermodynamics</a> energy is not <em>consumed</em> but <em>converted</em> (into thermal energy, among other forms). <a href="#fnref:2" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:3" role="doc-endnote">
      <p><a href="https://de.wikipedia.org/wiki/Energie">Energie</a> (energy) is the wrong term as well.
  Energie<strong>kosten</strong> (energy <strong>costs</strong>) would be correct.
  And the unit for <em>energy</em> would be <em>joule</em>, not <em>euro</em>. <a href="#fnref:3" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
  </ol>
</div>]]></content><author><name>Philipp Hahn</name></author><category term="linux" /><summary type="html"><![CDATA[I privately own several (~25) devices made by my employer AVM GmbH, among them a switchable power socket, the FRITZ!Smart Energy 200, formerly known as FRITZ!DECT 200. It also records the energy consumption of the connected devices. The readings are aggregated over 24 hours, 1 week, 1 month or 1-2 years and provided as a CSV data set. You can either download it on demand from the FRITZ!Box web interface or have it sent to you regularly as a push service mail.]]></summary></entry></feed>