<?xml version="1.0" encoding="utf-8"?><feed xmlns="http://www.w3.org/2005/Atom" ><generator uri="https://jekyllrb.com/" version="3.10.0">Jekyll</generator><link href="https://blog.pmhahn.de/feed.xml" rel="self" type="application/atom+xml" /><link href="https://blog.pmhahn.de/" rel="alternate" type="text/html" /><updated>2026-04-20T11:58:28+02:00</updated><id>https://blog.pmhahn.de/feed.xml</id><title type="html">Philipp Hahn</title><subtitle>Open Source Software Developer</subtitle><author><name>Philipp Hahn</name></author><entry><title type="html">Building Python packages via GitLab pipeline with type checking</title><link href="https://blog.pmhahn.de/gitlab-python-packaging/" rel="alternate" type="text/html" title="Building Python packages via GitLab pipeline with type checking" /><published>2026-04-18T09:13:00+02:00</published><updated>2026-04-18T09:13:00+02:00</updated><id>https://blog.pmhahn.de/gitlab-python-packaging</id><content type="html" xml:base="https://blog.pmhahn.de/gitlab-python-packaging/"><![CDATA[<p>My main programming language is <a href="https://python.org/">Python</a> and I’m a huge fan of <a href="https://docs.python.org/3/library/typing.html">static type hinting</a>.
As of 2026 there are four type checkers:</p>

<ul>
  <li>Dropbox <a href="https://mypy-lang.org/"><code class="language-plaintext highlighter-rouge">mypy</code></a> (Python)</li>
  <li>Microsoft <a href="https://github.com/microsoft/pyright"><code class="language-plaintext highlighter-rouge">pyright</code></a> (TypeScript)</li>
  <li>Facebook <a href="https://pyrefly.org/"><code class="language-plaintext highlighter-rouge">pyrefly</code></a> (Rust)</li>
  <li>Astral <a href="https://docs.astral.sh/ty/"><code class="language-plaintext highlighter-rouge">ty</code></a> (Rust)</li>
</ul>

<p>I like to integrate them into my <a href="https://gitlab.com/">GitLab</a> workflow, which includes generating a <a href="https://docs.gitlab.com/ci/testing/code_quality/">Code Quality</a> report.</p>

<p>On top of this my pipeline also runs <a href="https://docs.astral.sh/ruff/">Astral <code class="language-plaintext highlighter-rouge">ruff</code></a> as a linter and code formatter and uses <a href="https://docs.astral.sh/uv/">Astral <code class="language-plaintext highlighter-rouge">uv</code></a> to build and publish the Python package to GitLab’s <a href="https://docs.gitlab.com/user/packages/pypi_repository/"><abbr title="Python Package Index">PyPI</abbr> package repository</a>.</p>

<!--more-->

<p>The complete example is available on <a href="https://gitlab.com/pmhahn/python-packaging/">gitlab.com</a>.</p>

<h2 id="basic-gitlab-pipeline">Basic GitLab pipeline</h2>

<div class="language-yaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="na">workflow</span><span class="pi">:</span>
  <span class="na">rules</span><span class="pi">:</span>
    <span class="pi">-</span> <span class="na">if</span><span class="pi">:</span> <span class="s">$CI_PIPELINE_SOURCE == "merge_request_event"</span>  <span class="c1"># MR</span>
    <span class="pi">-</span> <span class="na">if</span><span class="pi">:</span> <span class="s">$CI_COMMIT_BRANCH &amp;&amp; $CI_OPEN_MERGE_REQUESTS</span>
      <span class="na">when</span><span class="pi">:</span> <span class="s">never</span>
    <span class="pi">-</span> <span class="na">if</span><span class="pi">:</span> <span class="s">$CI_COMMIT_BRANCH</span>  <span class="c1"># Branch,Schedule,Web,CLI</span>
    <span class="pi">-</span> <span class="na">if</span><span class="pi">:</span> <span class="s">$CI_COMMIT_TAG</span>  <span class="c1"># Tag</span>
</code></pre></div></div>

<p>My workflow requires pipelines for merge requests, stable/default/protected branches and tags, as well as manually triggered and scheduled pipelines.</p>

<div class="language-yaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="na">stages</span><span class="pi">:</span>
  <span class="pi">-</span> <span class="s">lint</span>
  <span class="pi">-</span> <span class="s">build</span>
  <span class="pi">-</span> <span class="s">publish</span>
  <span class="pi">-</span> <span class="s">release</span>

<span class="na">default</span><span class="pi">:</span>
  <span class="na">interruptible</span><span class="pi">:</span> <span class="no">true</span>
  <span class="na">artifacts</span><span class="pi">:</span>
    <span class="na">expire_in</span><span class="pi">:</span> <span class="s">1 day</span>

<span class="na">variables</span><span class="pi">:</span>
  <span class="na">FF_SCRIPT_SECTIONS</span><span class="pi">:</span> <span class="no">true</span>
  <span class="na">FF_TIMESTAMPS</span><span class="pi">:</span> <span class="no">true</span>
  <span class="na">FF_USE_NEW_BASH_EVAL_STRATEGY</span><span class="pi">:</span> <span class="no">true</span>

<span class="na">.py</span><span class="pi">:</span>
  <span class="na">rules</span><span class="pi">:</span>
    <span class="pi">-</span> <span class="na">changes</span><span class="pi">:</span>
        <span class="na">paths</span><span class="pi">:</span>
          <span class="pi">-</span> <span class="s">pyproject.toml</span>
          <span class="pi">-</span> <span class="s">uv.lock</span>
          <span class="pi">-</span> <span class="s2">"</span><span class="s">**/*.py"</span>
  <span class="na">variables</span><span class="pi">:</span>
    <span class="na">GIT_DEPTH</span><span class="pi">:</span> <span class="m">1</span>
</code></pre></div></div>

<p>I set several <a href="https://docs.gitlab.com/runner/configuration/feature-flags/">GitLab Runner feature flags</a> to get a better experience.</p>

<h2 id="prepare-uv">Prepare <code class="language-plaintext highlighter-rouge">uv</code></h2>

<div class="language-yaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="na">.uv</span><span class="pi">:</span>
  <span class="na">variables</span><span class="pi">:</span>
    <span class="na">UV_VERSION</span><span class="pi">:</span> <span class="s2">"</span><span class="s">0.11"</span>
    <span class="na">PYTHON_VERSION</span><span class="pi">:</span> <span class="s2">"</span><span class="s">3.13"</span>
    <span class="na">BASE_LAYER</span><span class="pi">:</span> <span class="s">trixie</span>
    <span class="c1"># GitLab CI creates a separate mountpoint for the build directory,</span>
    <span class="c1"># so we need to copy instead of using hard links.</span>
    <span class="na">UV_LINK_MODE</span><span class="pi">:</span> <span class="s">copy</span>
    <span class="na">UV_CACHE_DIR</span><span class="pi">:</span> <span class="s">.uv-cache</span>
  <span class="na">image</span><span class="pi">:</span> <span class="s">ghcr.io/astral-sh/uv:$UV_VERSION-python$PYTHON_VERSION-$BASE_LAYER</span>
  <span class="na">cache</span><span class="pi">:</span>
    <span class="pi">-</span> <span class="na">key</span><span class="pi">:</span>
        <span class="na">files</span><span class="pi">:</span>
          <span class="pi">-</span> <span class="s">uv.lock</span>
      <span class="na">paths</span><span class="pi">:</span>
        <span class="pi">-</span> <span class="s">$UV_CACHE_DIR</span>
  <span class="na">after_script</span><span class="pi">:</span>
    <span class="pi">-</span> <span class="s">uv cache prune --ci</span>
</code></pre></div></div>

<p>I’m using a specific version of the <a href="https://docs.astral.sh/uv/"><code class="language-plaintext highlighter-rouge">uv</code></a> image here, based on Debian 13 “Trixie”.
This sets up caching as documented in <a href="https://docs.astral.sh/uv/guides/integration/gitlab/"><code class="language-plaintext highlighter-rouge">uv</code>’s GitLab integration</a>.</p>

<h2 id="running-ruff">Running <code class="language-plaintext highlighter-rouge">ruff</code></h2>

<div class="language-yaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="na">.ruff</span><span class="pi">:</span>
  <span class="na">extends</span><span class="pi">:</span> <span class="pi">[</span><span class="nv">.uv</span><span class="pi">,</span> <span class="nv">.py</span><span class="pi">]</span>
  <span class="na">stage</span><span class="pi">:</span> <span class="s">lint</span>

<span class="na">ruff check</span><span class="pi">:</span>
  <span class="na">extends</span><span class="pi">:</span> <span class="pi">[</span><span class="nv">.ruff</span><span class="pi">]</span>
  <span class="na">script</span><span class="pi">:</span>
    <span class="pi">-</span> <span class="s">uvx ruff check --output-format=gitlab --output-file=code-quality-report.json</span>
  <span class="na">artifacts</span><span class="pi">:</span>
    <span class="na">reports</span><span class="pi">:</span>
      <span class="na">codequality</span><span class="pi">:</span> <span class="s">$CI_PROJECT_DIR/code-quality-report.json</span>

<span class="na">ruff format</span><span class="pi">:</span>
  <span class="na">extends</span><span class="pi">:</span> <span class="pi">[</span><span class="nv">.ruff</span><span class="pi">]</span>
  <span class="na">script</span><span class="pi">:</span>
    <span class="pi">-</span> <span class="s">uvx ruff format --diff</span>
</code></pre></div></div>
<p>This runs <a href="https://docs.astral.sh/ruff/integrations/#gitlab-cicd"><code class="language-plaintext highlighter-rouge">ruff</code></a> twice:</p>
<ol>
  <li>Once as a linter to check for <a href="https://docs.astral.sh/ruff/rules/">common issues</a>.</li>
  <li>Once as a code formatter to check whether the formatting follows the configured style.</li>
</ol>

<h2 id="running-the-type-checkers">Running the type checkers</h2>

<div class="language-yaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="na">.lint</span><span class="pi">:</span>
  <span class="na">stage</span><span class="pi">:</span> <span class="s">lint</span>
  <span class="na">extends</span><span class="pi">:</span> <span class="pi">[</span><span class="nv">.uv</span><span class="pi">,</span> <span class="nv">.py</span><span class="pi">]</span>
  <span class="na">artifacts</span><span class="pi">:</span>
    <span class="na">reports</span><span class="pi">:</span>
      <span class="na">codequality</span><span class="pi">:</span> <span class="s">$CI_PROJECT_DIR/gl-code-quality-report.json</span>

<span class="na">mypy</span><span class="pi">:</span>
  <span class="na">extends</span><span class="pi">:</span> <span class="pi">[</span><span class="nv">.lint</span><span class="pi">]</span>
  <span class="na">script</span><span class="pi">:</span>
    <span class="pi">-</span> <span class="s">uvx mypy --no-error-summary &gt;mypy-out.txt</span>
  <span class="na">after_script</span><span class="pi">:</span>
    <span class="pi">-</span> <span class="s">uvx mypy-gitlab-code-quality &lt;mypy-out.txt &gt;gl-code-quality-report.json</span>
    <span class="pi">-</span> <span class="kt">!reference</span> <span class="pi">[</span><span class="nv">.uv</span><span class="pi">,</span> <span class="nv">after_script</span><span class="pi">]</span>

<span class="na">pyright</span><span class="pi">:</span>
  <span class="na">extends</span><span class="pi">:</span> <span class="pi">[</span><span class="nv">.lint</span><span class="pi">]</span>
  <span class="na">script</span><span class="pi">:</span>
    <span class="pi">-</span> <span class="s">uvx --from=pyright[nodejs] pyright --outputjson &gt;pyright-raw.json</span>
  <span class="na">after_script</span><span class="pi">:</span>
    <span class="pi">-</span> <span class="s">uvx pyright-to-gitlab -i pyright-raw.json -o gl-code-quality-report.json</span>
    <span class="pi">-</span> <span class="kt">!reference</span> <span class="pi">[</span><span class="nv">.uv</span><span class="pi">,</span> <span class="nv">after_script</span><span class="pi">]</span>

<span class="na">pyrefly</span><span class="pi">:</span>
  <span class="na">extends</span><span class="pi">:</span> <span class="pi">[</span><span class="nv">.uv</span><span class="pi">,</span> <span class="nv">.py</span><span class="pi">]</span>  <span class="c1"># .lint</span>
  <span class="na">stage</span><span class="pi">:</span> <span class="s">lint</span>  <span class="c1"># TEMPORARY</span>
  <span class="na">script</span><span class="pi">:</span>
    <span class="pi">-</span> <span class="s">uvx pyrefly check</span>  <span class="c1"># --output-format CodeQuality --output gl-code-quality-report.json</span>

<span class="na">ty</span><span class="pi">:</span>
  <span class="na">extends</span><span class="pi">:</span> <span class="pi">[</span><span class="nv">.lint</span><span class="pi">]</span>
  <span class="na">script</span><span class="pi">:</span>
    <span class="pi">-</span> <span class="s">uvx ty check  --output-format gitlab &gt;gl-code-quality-report.json</span>
</code></pre></div></div>

<p><code class="language-plaintext highlighter-rouge">mypy</code> and <code class="language-plaintext highlighter-rouge">pyright</code> do not generate the <a href="https://docs.gitlab.com/ci/testing/code_quality/#code-quality-report-format">Code Quality <abbr title="JavaScript Object Notation">JSON</abbr></a> themselves.
They require a converter, which transforms their output into that format.
This is done in <code class="language-plaintext highlighter-rouge">after_script</code> so that the converter always runs, even when the type checker exits with a code other than 0.
This overrides the <code class="language-plaintext highlighter-rouge">after_script</code> from the template job <code class="language-plaintext highlighter-rouge">.uv</code>, which calls <code class="language-plaintext highlighter-rouge">uv cache prune --ci</code> to maintain its cache.
As such we have to restore that functionality and use <a href="https://docs.gitlab.com/ci/yaml/yaml_optimization/#reference-tags"><code class="language-plaintext highlighter-rouge">!reference</code></a> to do that.</p>

<p><code class="language-plaintext highlighter-rouge">ty</code> already generates the required <abbr title="JavaScript Object Notation">JSON</abbr>.
For <code class="language-plaintext highlighter-rouge">pyrefly</code> there is <a href="https://github.com/facebook/pyrefly/issues/3049">issue 3049</a>, where I asked to add native support for it.</p>

<h2 id="build-and-publish-the-python-package">Build and publish the Python package</h2>

<div class="language-yaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="na">build python package</span><span class="pi">:</span>
  <span class="na">stage</span><span class="pi">:</span> <span class="s">build</span>
  <span class="na">extends</span><span class="pi">:</span> <span class="pi">[</span><span class="nv">.uv</span><span class="pi">]</span>
  <span class="na">rules</span><span class="pi">:</span>
    <span class="pi">-</span> <span class="na">if</span><span class="pi">:</span> <span class="s">$CI_COMMIT_BRANCH</span>
    <span class="pi">-</span> <span class="na">if</span><span class="pi">:</span> <span class="s">$CI_COMMIT_TAG</span>
  <span class="na">variables</span><span class="pi">:</span>
    <span class="na">GIT_DEPTH</span><span class="pi">:</span> <span class="m">0</span>
    <span class="na">GIT_FETCH_EXTRA_FLAGS</span><span class="pi">:</span> <span class="s">--prune --quiet --tags --filter=tree:0</span>
  <span class="na">script</span><span class="pi">:</span>
    <span class="c1"># - |</span>
    <span class="c1">#   VERSION=$(git describe --exact-match --tags) &amp;&amp;</span>
    <span class="c1">#   uvx --from=toml-cli toml set --toml-path=pyproject.toml project.version "$VERSION"</span>
    <span class="pi">-</span> <span class="s">uv build</span>
  <span class="na">artifacts</span><span class="pi">:</span>
    <span class="na">paths</span><span class="pi">:</span>
      <span class="pi">-</span> <span class="s">dist/</span>
</code></pre></div></div>
<p>This builds the Python package.</p>

<p>Later <code class="language-plaintext highlighter-rouge">uv publish</code> will fail when you try to upload a Python package with an already existing version.
This happens mostly because I forget to update <code class="language-plaintext highlighter-rouge">[project]version</code> in the <code class="language-plaintext highlighter-rouge">pyproject.toml</code>.
The commented-out code above would poke the version into the file before each build.</p>

<p>My alternative was to switch to <a href="https://pypi.org/project/setuptools-scm/"><code class="language-plaintext highlighter-rouge">setuptools-scm</code></a>:
This will use <code class="language-plaintext highlighter-rouge">git describe</code> to generate the version from <code class="language-plaintext highlighter-rouge">git tags</code>, which requires two things:</p>

<ol>
  <li>The image must contain the <code class="language-plaintext highlighter-rouge">git</code> binary. Debian’s <code class="language-plaintext highlighter-rouge">-slim</code>-images do not.
Switch to the non-<code class="language-plaintext highlighter-rouge">-slim</code> versions.</li>
  <li>Fetch enough history:
<code class="language-plaintext highlighter-rouge">GIT_DEPTH: 1</code> may not be enough to walk the git commits from <code class="language-plaintext highlighter-rouge">HEAD</code> to any previous <code class="language-plaintext highlighter-rouge">git tag</code>.
Therefore I use <code class="language-plaintext highlighter-rouge">GIT_DEPTH: 0</code> to fetch the complete history, but combine it with <code class="language-plaintext highlighter-rouge">GIT_FETCH_EXTRA_FLAGS: --filter=tree:0</code>:
That way I use <a href="https://git-scm.com/docs/partial-clone"><code class="language-plaintext highlighter-rouge">git</code>’s partial-clone</a> feature:
It fetches all commit objects, but only the tree and blob objects required for HEAD.</li>
</ol>
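
<p>Both requirements can be checked early in the job.
The following is only a sketch and not part of the original pipeline: the <code class="language-plaintext highlighter-rouge">before_script</code> key is an assumption added for illustration.
It runs <code class="language-plaintext highlighter-rouge">git describe --tags</code>, which fails if the <code class="language-plaintext highlighter-rouge">git</code> binary is missing or if no tag is reachable from <code class="language-plaintext highlighter-rouge">HEAD</code>.
The fragment only shows the added key; the remaining keys of the job stay as shown above.</p>

<div class="language-yaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="na">build python package</span><span class="pi">:</span>
  <span class="na">before_script</span><span class="pi">:</span>
    <span class="c1"># Assumption: fail early instead of building without a tag-derived version.</span>
    <span class="pi">-</span> <span class="s">git describe --tags</span>
</code></pre></div></div>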

<div class="language-yaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="na">publish python package</span><span class="pi">:</span>
  <span class="na">stage</span><span class="pi">:</span> <span class="s">publish</span>
  <span class="na">extends</span><span class="pi">:</span> <span class="pi">[</span><span class="nv">.uv</span><span class="pi">]</span>
  <span class="na">rules</span><span class="pi">:</span>
    <span class="pi">-</span> <span class="na">if</span><span class="pi">:</span> <span class="s">$CI_COMMIT_TAG</span>
  <span class="na">needs</span><span class="pi">:</span>
    <span class="pi">-</span> <span class="na">job</span><span class="pi">:</span> <span class="s">build python package</span>
  <span class="na">variables</span><span class="pi">:</span>
    <span class="na">GIT_STRATEGY</span><span class="pi">:</span> <span class="s">none</span>
    <span class="na">UV_PUBLISH_URL</span><span class="pi">:</span> <span class="s">${CI_API_V4_URL}/projects/${CI_PROJECT_ID}/packages/pypi</span>
    <span class="na">UV_PUBLISH_USERNAME</span><span class="pi">:</span> <span class="s">gitlab-ci-token</span>
    <span class="na">UV_PUBLISH_PASSWORD</span><span class="pi">:</span> <span class="s">${CI_JOB_TOKEN}</span>
  <span class="na">script</span><span class="pi">:</span>
    <span class="pi">-</span> <span class="s">uv publish dist/*.whl</span>
</code></pre></div></div>
<p>This publishes the Python package to GitLab’s <a href="https://docs.gitlab.com/user/packages/pypi_repository/"><abbr title="Python Package Index">PyPI</abbr> package registry</a>.</p>

<h2 id="creating-a-gitlab-release">Creating a GitLab release</h2>

<div class="language-yaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="na">release_job</span><span class="pi">:</span>
  <span class="na">stage</span><span class="pi">:</span> <span class="s">release</span>
  <span class="na">image</span><span class="pi">:</span> <span class="s">registry.gitlab.com/gitlab-org/cli:latest</span>
  <span class="na">rules</span><span class="pi">:</span>
    <span class="pi">-</span> <span class="na">if</span><span class="pi">:</span> <span class="s1">'</span><span class="s">$CI_COMMIT_TAG</span><span class="nv"> </span><span class="s">=~</span><span class="nv"> </span><span class="s">/^v?\d+\.\d+\.\d+$/'</span>
  <span class="na">variables</span><span class="pi">:</span>
    <span class="na">GIT_STRATEGY</span><span class="pi">:</span> <span class="s">none</span>
    <span class="na">GLAB_CONFIG_DIR</span><span class="pi">:</span> <span class="s">${CI_PROJECT_DIR}/.glab-config.${CI_PIPELINE_ID}</span>
    <span class="na">GLAB_ENABLE_CI_AUTOLOGIN</span><span class="pi">:</span> <span class="no">true</span>
    <span class="na">GITLAB_HOST</span><span class="pi">:</span> <span class="s">$CI_SERVER_URL</span>
  <span class="na">dependencies</span><span class="pi">:</span> <span class="pi">[]</span>
  <span class="na">script</span><span class="pi">:</span>
    <span class="pi">-</span> <span class="pi">&gt;</span>
      <span class="s">glab changelog generate &gt;changelog.md</span>
      <span class="s">--repo "$CI_PROJECT_PATH"</span>
      <span class="s">--version "$CI_COMMIT_TAG"</span>
      <span class="s">--to "$CI_COMMIT_BRANCH"</span>
  <span class="na">release</span><span class="pi">:</span>
    <span class="na">name</span><span class="pi">:</span> <span class="s1">'</span><span class="s">Release</span><span class="nv"> </span><span class="s">$CI_COMMIT_TAG'</span>
    <span class="na">description</span><span class="pi">:</span> <span class="s">changelog.md</span>
    <span class="na">tag_name</span><span class="pi">:</span> <span class="s">$CI_COMMIT_TAG</span>
    <span class="na">assets</span><span class="pi">:</span>
      <span class="na">links</span><span class="pi">:</span>
        <span class="pi">-</span> <span class="na">name</span><span class="pi">:</span> <span class="s1">'</span><span class="s">PyPi</span><span class="nv"> </span><span class="s">package</span><span class="nv"> </span><span class="s">$CI_COMMIT_TAG'</span>
          <span class="na">url</span><span class="pi">:</span> <span class="s">$CI_PROJECT_URL/-/packages/</span>
          <span class="na">link_type</span><span class="pi">:</span> <span class="s">package</span>
</code></pre></div></div>
<p>The final part creates a <a href="https://docs.gitlab.com/user/project/releases/#create-a-release">GitLab release</a>.
It uses <a href="https://docs.gitlab.com/user/project/changelogs/">GitLab’s changelog <abbr title="Application Programming Interface">API</abbr></a> to automatically create a changelog in Markdown format from the git commits having a <code class="language-plaintext highlighter-rouge">Changelog:</code> trailer.</p>

<p>For my environment I have to tell <code class="language-plaintext highlighter-rouge">glab</code> to use a configuration file in a writable directory.
Without that it will try to write to <code class="language-plaintext highlighter-rouge">/.glab/</code>, which will fail.</p>

<p>That <code class="language-plaintext highlighter-rouge">assets:links:</code> part creates a link to the <abbr title="Python Package Index">PyPI</abbr> package registry.
<a href="https://docs.gitlab.com/releases/18/gitlab-18-11-released/">GitLab 18.11</a> just received a feature, where <a href="https://docs.gitlab.com/user/project/releases/release_evidence/#include-packages-as-release-evidence">packages are included as release evidence</a>, which might make this optional.</p>

<!-- *[GPT]: GUID Partition Table -->]]></content><author><name>Philipp Hahn</name></author><category term="gitlab" /><category term="python" /><summary type="html"><![CDATA[My main programming language is Python and I’m a huge fan of static type hinting. As of 2026 there are four type checkers: Dropbox mypy (Python) Microsoft pyright (TypeScript) Facebook pyrefly (rust) Astral ty (rust) I like to integrate them into my GitLab workflow, which includes generatin a Code Quality report. On top of this my pipeline also runs Astral ruff as a linter and code formatter and uses Astral uv to build and publish the Python package to GitLabs’s PyPI package repository.]]></summary></entry><entry><title type="html">Running linters in GitLab</title><link href="https://blog.pmhahn.de/gitlab-linter/" rel="alternate" type="text/html" title="Running linters in GitLab" /><published>2026-04-07T09:32:00+02:00</published><updated>2026-04-07T09:32:00+02:00</updated><id>https://blog.pmhahn.de/gitlab-linter</id><content type="html" xml:base="https://blog.pmhahn.de/gitlab-linter/"><![CDATA[<p>Linters run as part of GitLab Continuous Integration pipelines to guarantee code quality.
A Merge Request based workflow starts from a branch <code class="language-plaintext highlighter-rouge">main</code>, which is currently at some commit <code class="language-plaintext highlighter-rouge">start</code>.
You normally fork a branch <code class="language-plaintext highlighter-rouge">develop</code> from that commit <code class="language-plaintext highlighter-rouge">start</code> and create several commits <code class="language-plaintext highlighter-rouge">c1</code>, <code class="language-plaintext highlighter-rouge">c2</code>, …, <code class="language-plaintext highlighter-rouge">cN</code>.
When you <a href="/gitlab-merge-request-cli/">push that branch and create a Merge Request</a>, a pipeline may start and will run several jobs.</p>

<!--more-->

<pre><code class="language-mermaid">gitGraph
    commit
    branch develop
    checkout develop
    commit
    commit
    commit
</code></pre>

<p>There are different ways to run your linters:</p>
<ol>
  <li>You may run your linter like <code class="language-plaintext highlighter-rouge">ruff</code> just on the <strong>last</strong> commit <code class="language-plaintext highlighter-rouge">cN</code>, which then just checks the state after applying all commits <code class="language-plaintext highlighter-rouge">c1…cN</code> on top of <code class="language-plaintext highlighter-rouge">main</code>.
This will not only check and find issues introduced by your changes, but will also check all other files and find issues there.</li>
  <li>You may want to build the <strong>diff</strong> from <code class="language-plaintext highlighter-rouge">main</code> to <code class="language-plaintext highlighter-rouge">develop</code> and run your linter on that.
That way you check your <em>effective change</em>.</li>
  <li>You may also want to run the linter on <strong>each</strong> commit.
This is important if you later want to use <code class="language-plaintext highlighter-rouge">git bisect</code> to find regressions.</li>
</ol>

<p>Depending on which strategy you want to use, you have to configure your jobs differently in GitLab.</p>

<h2 id="check-state-after-last-commit">Check state after last commit</h2>

<p>This is the easiest one as GitLab already checks out the branch for us.
You can optimize that by using <a href="https://docs.gitlab.com/ci/runners/configure_runners/#shallow-cloning">Shallow cloning</a> to only clone the last commit.
By default GitLab clones the last 20 commits, which might pull too much data.</p>

<div class="language-yaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="na">run linter</span><span class="pi">:</span>
  <span class="na">variables</span><span class="pi">:</span>
    <span class="na">GIT_DEPTH</span><span class="pi">:</span> <span class="m">1</span>
  <span class="na">script</span><span class="pi">:</span>
    <span class="pi">-</span> <span class="s">ruff check</span>
</code></pre></div></div>

<p>You can reduce the clone depth down to 1 to improve that further, but that may introduce problems later.
<a href="https://docs.gitlab.com/ci/runners/configure_runners/#shallow-cloning">Shallow cloning</a> has more information on that and contains a warning that this might not work with GitLab in some cases:
GitLab-Runner clones by <code class="language-plaintext highlighter-rouge">ref</code>, but might then pick a previous commit for the job to run on.
With <code class="language-plaintext highlighter-rouge">--depth=1</code> that might not work as there will be no previous commits to run on.</p>

<p>Disclaimer: I have not checked that behavior myself and doubt that this is actually correct.</p>

<h2 id="check-diff-from-fork-point-to-last-commit">Check diff from fork-point to last commit</h2>

<p>To build the diff you must also fetch the <em>fork-point</em> in addition to the branch <code class="language-plaintext highlighter-rouge">develop</code>.
This is not equivalent to <code class="language-plaintext highlighter-rouge">main</code> as other commits might have been pushed there.
Luckily GitLab already tells us the <em>fork-point</em> commit:
The <a href="https://docs.gitlab.com/ci/variables/predefined_variables/#predefined-variables-for-merge-request-pipelines">pre-defined variable</a> <code class="language-plaintext highlighter-rouge">CI_MERGE_REQUEST_DIFF_BASE_SHA</code> contains the base <abbr title="Secure Hash Algorithm 1">SHA1</abbr> of the merge request diff.
We have to fetch this either manually by running <code class="language-plaintext highlighter-rouge">git fetch</code> ourselves, or we can add it to <a href="https://docs.gitlab.com/ci/runners/configure_runners/#git-fetch-extra-flags"><code class="language-plaintext highlighter-rouge">GIT_FETCH_EXTRA_FLAGS</code></a>.
That has the benefit that the GitLab-Runner will fetch the commit for us when it fetches <code class="language-plaintext highlighter-rouge">develop</code>.
This is more efficient and also uses the credentials of the Runner.</p>

<div class="language-yaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="na">run linter</span><span class="pi">:</span>
  <span class="na">variables</span><span class="pi">:</span>
    <span class="na">GIT_DEPTH</span><span class="pi">:</span> <span class="m">1</span>
    <span class="na">GIT_FETCH_EXTRA_FLAGS</span><span class="pi">:</span> <span class="s2">"</span><span class="s">--prune</span><span class="nv"> </span><span class="s">--quiet</span><span class="nv"> </span><span class="s">--no-tags</span><span class="nv"> </span><span class="s">$CI_MERGE_REQUEST_DIFF_BASE_SHA"</span>
  <span class="na">script</span><span class="pi">:</span>
    <span class="pi">-</span> <span class="s">git diff "${CI_MERGE_REQUEST_DIFF_BASE_SHA}..HEAD" | checkpatch -</span>
</code></pre></div></div>

<p>Correction (2026-04-17):
Actually this does not work as the GitLab-Runner uses <code class="language-plaintext highlighter-rouge">git init</code> followed by <code class="language-plaintext highlighter-rouge">git fetch</code> by default.
For the latter GitLab will add the refspecs <code class="language-plaintext highlighter-rouge">+refs/heads/*:refs/origin/heads/*</code> and <code class="language-plaintext highlighter-rouge">+refs/tags/*:refs/tags/*</code> to <code class="language-plaintext highlighter-rouge">git fetch</code>, which will fetch all heads and tags.
Neither does <code class="language-plaintext highlighter-rouge">--no-tags</code> disable fetching the tags, nor is the <code class="language-plaintext highlighter-rouge">$CI_MERGE_REQUEST_DIFF_BASE_SHA</code> needed.</p>

<p>Instead of using <code class="language-plaintext highlighter-rouge">init+fetch</code>, <code class="language-plaintext highlighter-rouge">git clone</code> can be used if</p>
<ul>
  <li>the feature-flag <a href="https://docs.gitlab.com/runner/configuration/feature-flags/"><code class="language-plaintext highlighter-rouge">FF_USE_GIT_NATIVE_CLONE</code></a> is enabled.</li>
  <li>the version of <code class="language-plaintext highlighter-rouge">git</code> is at least <code class="language-plaintext highlighter-rouge">2.49</code> (released 2025-03-14), which supports <code class="language-plaintext highlighter-rouge">--branch</code> and the newer <code class="language-plaintext highlighter-rouge">--revision</code>.</li>
  <li><code class="language-plaintext highlighter-rouge">GIT_STRATEGY=clone</code> is set as a job variable.</li>
</ul>

<p><code class="language-plaintext highlighter-rouge">GIT_DEPTH</code> can be used to limit the <code class="language-plaintext highlighter-rouge">git clone --depth</code>, which then also adds <code class="language-plaintext highlighter-rouge">--single-branch</code>.
That option and other options like <code class="language-plaintext highlighter-rouge">--no-tags</code> can be passed via <code class="language-plaintext highlighter-rouge">GIT_CLONE_EXTRA_FLAGS</code>.</p>

<p>The drawback is that <code class="language-plaintext highlighter-rouge">git clone</code> can only clone a single branch or revision.
Fetching <code class="language-plaintext highlighter-rouge">CI_MERGE_REQUEST_DIFF_BASE_SHA</code> thus requires a manual call of <code class="language-plaintext highlighter-rouge">git fetch</code>.</p>

<div class="language-yaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="na">run linter</span><span class="pi">:</span>
  <span class="na">variables</span><span class="pi">:</span>
    <span class="na">GIT_STRATEGY</span><span class="pi">:</span> <span class="s2">"</span><span class="s">clone"</span>
    <span class="na">GIT_DEPTH</span><span class="pi">:</span> <span class="m">1</span>
    <span class="na">GIT_CLONE_EXTRA_FLAGS</span><span class="pi">:</span> <span class="s2">"</span><span class="s">--no-tags</span><span class="nv"> </span><span class="s">--single-branch"</span>
  <span class="na">script</span><span class="pi">:</span>
    <span class="pi">-</span> <span class="s">git fetch origin "$CI_MERGE_REQUEST_DIFF_BASE_SHA"</span>
    <span class="pi">-</span> <span class="s">git diff "${CI_MERGE_REQUEST_DIFF_BASE_SHA}..HEAD" | checkpatch -</span>
</code></pre></div></div>

<h2 id="check-all-commits-from-fork-point-to-last-commit">Check all commits from fork-point to last commit</h2>

<p>To check all commits of a <abbr title="Merge Request">MR</abbr> individually, you have to do more work.
The <a href="https://git-scm.com/book/en/v2/Git-Internals-Git-Objects"><code class="language-plaintext highlighter-rouge">git</code> protocol</a> only supports fetching refs by name or tags / commits / trees / blobs by <abbr title="Secure Hash Algorithm 1">SHA1</abbr>!
The problem is that you don’t know how many commits are between the fork-point and HEAD; you do not know which value to use for <code class="language-plaintext highlighter-rouge">GIT_DEPTH</code>.
By using <code class="language-plaintext highlighter-rouge">GIT_DEPTH: 0</code> you tell the GitLab-Runner to <strong>not</strong> do a shallow-clone and to clone all commits reachable from the tip of the branch.
For large repositories with many commits or large blobs that can become very costly.
So reducing the number or type of objects to download can become important.</p>

<p>There are two sub-cases:</p>

<h3 id="only-check-all-commit-messages">Only check all commit messages</h3>

<p>If you only need the commit messages (and not the files themselves), you can use <a href="https://git-scm.com/docs/git-rev-list#Documentation/git-rev-list.txt---filterfilter-spec"><code class="language-plaintext highlighter-rouge">git clone --filter</code></a> to limit which types of objects are cloned initially: commits, trees, blobs.
Missing objects are fetched lazily, which can cause a dramatic performance problem when your initial clone filters out too much!</p>

<p>But there is one caveat and you have to be careful to use the right <code class="language-plaintext highlighter-rouge">git</code> commands:
If any <code class="language-plaintext highlighter-rouge">git</code> command requires missing data, <code class="language-plaintext highlighter-rouge">git</code> will fetch it on demand.
It will connect to the remote server again and fetch the missing data.
As this results in (many) network connections with lots of round-trips, it is slower than downloading everything at once during the initial <code class="language-plaintext highlighter-rouge">clone</code>.</p>

<p>Therefore check the commands you run:</p>
<ul>
  <li><code class="language-plaintext highlighter-rouge">git format-patch</code> and <code class="language-plaintext highlighter-rouge">git show</code> require the associated <code class="language-plaintext highlighter-rouge">tree</code>s and <code class="language-plaintext highlighter-rouge">blob</code>s to be present and will trigger delayed fetches.</li>
  <li><code class="language-plaintext highlighter-rouge">git log --no-patch</code> does not and works on <code class="language-plaintext highlighter-rouge">commit</code>s alone.</li>
</ul>

<div class="language-yaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="na">run linter</span><span class="pi">:</span>
  <span class="na">variables</span><span class="pi">:</span>
    <span class="na">GIT_DEPTH</span><span class="pi">:</span> <span class="m">0</span>
    <span class="na">GIT_FETCH_EXTRA_FLAGS</span><span class="pi">:</span> <span class="s">--prune --quiet --no-tags --filter=object:type=commit</span>
  <span class="na">script</span><span class="pi">:</span>
    <span class="pi">-</span> <span class="s">git log --no-patch "${CI_MERGE_REQUEST_DIFF_BASE_SHA}..HEAD" | mrcheck</span>
</code></pre></div></div>
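<p>The <code class="language-plaintext highlighter-rouge">mrcheck</code> step is only a placeholder for whatever inspects the log.
As a minimal sketch in Python, the messages could be checked as follows.
It assumes the log is produced with <code class="language-plaintext highlighter-rouge">git log --no-patch --format='%H%x00%B%x01'</code>; that format string, the function names and the two rules (subject of at most 72 characters, blank second line) are assumptions made for this illustration and not taken from any real tool.</p>

```python
import sys

MAX_SUBJECT = 72  # hypothetical limit, adjust to your own convention


def parse_log(text):
    """Split the log output into (sha, message) pairs.

    Each record is expected as: sha, NUL byte, raw message, 0x01 byte.
    """
    commits = []
    for record in text.split("\x01"):
        record = record.strip("\n")
        if not record:
            continue
        sha, _, message = record.partition("\x00")
        commits.append((sha, message))
    return commits


def check_message(message):
    """Return a list of problems found in one commit message."""
    lines = message.splitlines()
    if not lines or not lines[0].strip():
        return ["empty subject"]
    problems = []
    if len(lines[0]) > MAX_SUBJECT:
        problems.append("subject longer than %d characters" % MAX_SUBJECT)
    if len(lines) > 1 and lines[1].strip():
        problems.append("second line is not blank")
    return problems


def main():
    """Read the log from STDIN and return a shell exit code."""
    failed = False
    for sha, message in parse_log(sys.stdin.read()):
        for problem in check_message(message):
            print("%s: %s" % (sha[:12], problem))
            failed = True
    return 1 if failed else 0
```

<p>To use it as a script, end the file with <code class="language-plaintext highlighter-rouge">sys.exit(main())</code> and pipe the log into it.
As only <code class="language-plaintext highlighter-rouge">commit</code> objects are read, this should not trigger any lazy fetch.</p>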

<p>If you also need the latest tree, use <code class="language-plaintext highlighter-rouge">--filter=tree:0</code> instead.</p>

<h3 id="check-all-individual-commits-and-their-trees-wip">Check all individual commits and their trees (<abbr title="Work in Progress">WIP</abbr>)</h3>

<p>Similar to the above, we first fetch only the <code class="language-plaintext highlighter-rouge">commit</code> objects.
We then let <code class="language-plaintext highlighter-rouge">git</code> fetch the required <code class="language-plaintext highlighter-rouge">tree</code> and <code class="language-plaintext highlighter-rouge">blob</code> objects on demand.</p>

<div class="language-yaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="na">run linter</span><span class="pi">:</span>
  <span class="na">variables</span><span class="pi">:</span>
    <span class="na">GIT_DEPTH</span><span class="pi">:</span> <span class="m">0</span>
    <span class="na">GIT_FETCH_EXTRA_FLAGS</span><span class="pi">:</span> <span class="s">--prune --quiet --no-tags --filter=tree:0</span>
  <span class="na">script</span><span class="pi">:</span>
    <span class="pi">-</span> <span class="s">git log --patch "${CI_MERGE_REQUEST_DIFF_BASE_SHA}..HEAD" | checkpatch -</span>
</code></pre></div></div>

<p>The alternative, starting with a limited number of commits and incrementally deepening the history until the fork-point is reached, also works, but requires more work and leads to multiple network round-trips to the repository server.
Which strategy is faster may depend on the history size.</p>
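<p>A rough sketch of that incremental strategy in Python, built on <code class="language-plaintext highlighter-rouge">git merge-base --is-ancestor</code> and <code class="language-plaintext highlighter-rouge">git fetch --deepen</code>.
The helper names, the step size and the doubling are assumptions chosen for illustration; it also assumes that the remote is called <code class="language-plaintext highlighter-rouge">origin</code> and that its configured refspec covers the branch under test.</p>

```python
import subprocess


def git(*args):
    """Run a git command and return its exit code."""
    return subprocess.run(["git", *args], check=False).returncode


def deepen_until_reachable(base_sha, step=50, max_rounds=10, run=git):
    """Deepen a shallow clone until base_sha is an ancestor of HEAD.

    Returns the number of extra fetches that were needed and raises
    RuntimeError if the fork-point is still missing after max_rounds.
    """
    for rounds in range(max_rounds + 1):
        # Exit code 0 means base_sha is reachable from HEAD.
        if run("merge-base", "--is-ancestor", base_sha, "HEAD") == 0:
            return rounds
        if rounds == max_rounds:
            break
        # Each round costs one more round-trip to the server.
        run("fetch", "--quiet", "--deepen=%d" % step, "origin")
        # Double the step so long histories need fewer rounds.
        step *= 2
    raise RuntimeError("fork-point %s not reached" % base_sha)
```

<p>A job would call <code class="language-plaintext highlighter-rouge">deepen_until_reachable(os.environ["CI_MERGE_REQUEST_DIFF_BASE_SHA"])</code> before running the linter.
The <code class="language-plaintext highlighter-rouge">run</code> parameter only exists so the loop can be tried without a real repository.</p>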

<p><abbr title="To be continued">TBC</abbr>…</p>

<!-- *[FD]: File Daemon -->
<!-- *[FD]: File Descriptor -->
<!-- *[GPT]: Generative Pre-trained Transformer -->
<!-- *[GPT]: Global Partitioning Table -->
<!-- *[GPT]: GUID Partition Table -->]]></content><author><name>Philipp Hahn</name></author><category term="gitlab" /><summary type="html"><![CDATA[Linters run as part of GitLab Continuous Integration pipelines to guarantee code quality. A Merge Request based workflow will start from a branch main being currently at some commit start. You normally fork a branch develop from that branch base and will create several commits c1, c2, …, cN. When you push that branch and create a Merge Request, a pipeline may start and will run several jobs.]]></summary></entry><entry><title type="html">Linux and the Windows NT file system</title><link href="https://blog.pmhahn.de/linux-ntfs/" rel="alternate" type="text/html" title="Linux and the Windows NT file system" /><published>2026-02-17T15:01:00+01:00</published><updated>2026-02-17T15:01:00+01:00</updated><id>https://blog.pmhahn.de/linux-ntfs</id><content type="html" xml:base="https://blog.pmhahn.de/linux-ntfs/"><![CDATA[<p>The <em>Windows New Technology File System</em> (<abbr title="New Technology File System">NTFS</abbr>) has a long history with Linux:</p>

<table>
  <thead>
    <tr>
      <th>Driver</th>
      <th>Type</th>
      <th>Based on</th>
      <th>Kernel</th>
      <th>Period</th>
      <th>Read-write</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>Original</td>
      <td>Kernel</td>
      <td>Scratch</td>
      <td>2.1.74</td>
      <td>1995-2001</td>
      <td>Read-only</td>
    </tr>
    <tr>
      <td><a href="https://flatcap.github.io/linux-ntfs/misc.html">Linux-<abbr title="New Technology File System">NTFS</abbr></a></td>
      <td>Kernel</td>
      <td>Scratch</td>
      <td>2.5.11</td>
      <td>2002-2024</td>
      <td>Read-only</td>
    </tr>
    <tr>
      <td><a href="https://en.wikipedia.org/wiki/Captive_NTFS">Captive</a></td>
      <td>FUSE</td>
      <td><code class="language-plaintext highlighter-rouge">ntfs.sys</code></td>
      <td> </td>
      <td>2003-2006</td>
      <td>Read-write</td>
    </tr>
    <tr>
      <td><a href="https://en.wikipedia.org/wiki/NTFS-3G"><abbr title="New Technology File System">NTFS</abbr>-3G</a></td>
      <td>FUSE</td>
      <td>Linux-<abbr title="New Technology File System">NTFS</abbr></td>
      <td>3.18</td>
      <td>2006-</td>
      <td>Read-write</td>
    </tr>
    <tr>
      <td><a href="https://www.kernel.org/doc/html/latest/filesystems/ntfs3.html">NTFS3</a></td>
      <td>Kernel</td>
      <td>Paragon</td>
      <td>5.15</td>
      <td>2021-</td>
      <td>Read-write</td>
    </tr>
    <tr>
      <td><a href="https://git.kernel.org/pub/scm/linux/kernel/git/linkinjeon/ntfs.git/log/?h=ntfs-next&amp;ref=itsfoss.com"><abbr title="New Technology File System">NTFS</abbr> Plus</a></td>
      <td>Kernel</td>
      <td>Linux-<abbr title="New Technology File System">NTFS</abbr></td>
      <td>7.x?</td>
      <td>2026?</td>
      <td>Read-write</td>
    </tr>
    <tr>
      <td><abbr title="Audio-Visuelles Marketing">AVM</abbr> <abbr title="New Technology File System">NTFS</abbr></td>
      <td>Kernel</td>
      <td><abbr title="New Technology File System">NTFS</abbr>-3G</td>
      <td> </td>
      <td>2012-</td>
      <td>Read-write</td>
    </tr>
  </tbody>
</table>

<!--more-->

<ol>
  <li>The <em>Original</em> implementation was from Martin von Löwis.</li>
  <li>Anton Altaparmakov created the 2nd implementation <em>Linux-<abbr title="New Technology File System">NTFS</abbr></em> from scratch, which replaced the original implementation.</li>
  <li><em>Captive</em> was the first user-space based implementation, which used the original Windows driver <code class="language-plaintext highlighter-rouge">ntfs.sys</code> from Microsoft and ran it under Wine.</li>
  <li><em><abbr title="New Technology File System">NTFS</abbr>-3G</em> also runs in user-space and uses FUSE to talk to the kernel.</li>
  <li>Paragon donated an open-source version of its proprietary <em>NTFS3</em> to the Linux kernel. It was the first read-write implementation in the kernel, but is less documented.</li>
  <li><em><abbr title="New Technology File System">NTFS</abbr> Plus</em> is based on the older <em>Linux-<abbr title="New Technology File System">NTFS</abbr></em> implementation, adds read-write support and updates the implementation to use modern Linux APIs. It is scheduled to replace <em>NTFS3</em> again.</li>
  <li><abbr title="Audio-Visuelles Marketing">AVM</abbr> (now FRITZ! Technology) ported <em><abbr title="New Technology File System">NTFS</abbr>-3G</em> to kernel space; this port is used in FritzOS only.</li>
</ol>

<pre><code class="language-mermaid">gantt
    title NTFS
    dateFormat  YYYY-MM-DD
    axisFormat %Y
    Original    : 1997-01-01, 2002-04-01
    Linux-NTFS  : 2002-04-01, 2024-01-01
    Captive     : 2003-01-01, 2006-01-01
    NTFS-3G     : 2006-01-01, 2030-01-01
    NTFS3       : 2021-11-01, 2026-01-01
    NTFS+       : 2026-03-01, 2030-01-01
    ANTFS       : 2012-01-01, 2030-01-01
    2.0.0       : vert, 1996-06-09, 1m
    2.2.0       : vert, 1999-01-26, 1m
    2.4.0       : vert, 2001-01-04, 1m
    2.6.0       : vert, 2003-12-18, 1m
    2.6.16      : vert, 2006-03-20, 1m
    2.6.27      : vert, 2008-10-09, 1m
    3.0         : vert, 2011-07-21, 1m
    3.8         : vert, 2013-02-18, 1m
    4.4         : vert, 2016-01-10, 1m
    4.19        : vert, 2018-10-22, 1m
    5.10        : vert, 2020-12-13, 1m
    5.15        : vert, 2021-10-31, 1m
    6.1         : vert, 2022-12-11, 1m
    6.6         : vert, 2023-10-29, 1m
    6.12        : vert, 2024-11-17, 1m
</code></pre>

<h2 id="links">Links</h2>
<ul>
  <li>Wikipedia: <a href="https://en.wikipedia.org/wiki/Linux_kernel_version_history">Linux Kernel version history</a></li>
</ul>

<!-- <https://www.cyberark.com/resources/threat-research-blog/the-linux-kernel-and-the-cursed-driver -->

<!-- *[FD]: File Daemon -->
<!-- *[FD]: File Descriptor -->
<!-- *[GPT]: Generative Pre-trained Transformer -->
<!-- *[GPT]: Global Partitioning Table -->
<!-- *[GPT]: GUID Partition Table -->]]></content><author><name>Philipp Hahn</name></author><category term="linux" /><category term="filesystem" /><summary type="html"><![CDATA[The Windows New Technology File System (NTFS) has a long history with Linux: Driver Type Based on Kernel Period Read-write Original Kernel Scratch 2.1.74 1995-2001 Read-only Linux-NTFS Kernel Scratch 2.5.11 2002-2024 Read-only Captive FUSE ntfs.sys   2003-2006 Read-write NTFS-3G FUSE Linux-NTFS 3.18 2006- Read-write NTFS3 Kernel Paragon 5.15 2021- Read-write NTFS Plus Kernel Linux-NTFS 7.x? 2026? Read-write AVM NTFS Kernel NTFS-3G   2012- Read-write]]></summary></entry><entry><title type="html">Proper dependency tracking in GNU make</title><link href="https://blog.pmhahn.de/make/" rel="alternate" type="text/html" title="Proper dependency tracking in GNU make" /><published>2025-11-22T12:50:00+01:00</published><updated>2025-11-22T12:50:00+01:00</updated><id>https://blog.pmhahn.de/make</id><content type="html" xml:base="https://blog.pmhahn.de/make/"><![CDATA[<p><code class="language-plaintext highlighter-rouge">make</code> is used to build projects, e.g. compile source code into binaries.
If the project consists of multiple files, explicit dependencies must be specified to run the command in the correct order.</p>

<p>In addition to that <code class="language-plaintext highlighter-rouge">Makefiles</code> can also be used to track implicit dependencies:
If one file is modified, only those commands are re-run which are needed.
For large projects that can be a big time-saver if incremental changes are done.</p>

<p>But how to do that properly (for a C project)?</p>

<!--more-->

<h2 id="the-historical-way">The historical way</h2>

<ul>
  <li><code class="language-plaintext highlighter-rouge">Makefile</code>
    <div class="language-make highlighter-rouge"><div class="highlight"><pre class="highlight"><code>  <span class="nl">main</span><span class="o">:</span> <span class="nf">main.o</span>
  <span class="nl">main.o</span><span class="o">:</span> <span class="nf">main.c</span>
</code></pre></div>    </div>
  </li>
  <li><code class="language-plaintext highlighter-rouge">main.c</code>
    <div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code>  <span class="cp">#include</span> <span class="cpf">&lt;stdio.h&gt;</span><span class="cp">
</span>  <span class="cp">#include</span> <span class="cpf">"main.h"</span><span class="cp">
</span></code></pre></div>    </div>
  </li>
  <li><code class="language-plaintext highlighter-rouge">main.h</code>
    <div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code>  <span class="cp">#include</span> <span class="cpf">&lt;stdint.h&gt;</span><span class="cp">
</span></code></pre></div>    </div>
  </li>
</ul>

<p>In the past many projects implemented that themselves.
They used the pre-processor <code class="language-plaintext highlighter-rouge">cpp</code> to process all <code class="language-plaintext highlighter-rouge">#include</code> statements and then used <em>regular expressions</em> to extract the path of all files, which have been read.
These dependencies are then converted into a <code class="language-plaintext highlighter-rouge">make</code> fragment, which declares that dependency:</p>
<div class="language-make highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nl">main.o</span><span class="o">:</span> <span class="nf">main.h /usr/include/stdio.h /usr/include/stdint.h</span>
</code></pre></div></div>
<p>The main <code class="language-plaintext highlighter-rouge">Makefile</code> has to include this fragment using something like <code class="language-plaintext highlighter-rouge">-include main.d</code>.</p>
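<p>Such a home-grown extraction tool can be sketched in a few lines of Python.
The sketch assumes GCC-style linemarkers like <code class="language-plaintext highlighter-rouge"># 1 "main.h" 1</code> in the pre-processor output; the function names are made up for this illustration, and file names containing quotes or spaces are not handled.</p>

```python
import re

# GCC-style linemarker: '# 12 "path/to/file.h" 1 3'
LINEMARKER = re.compile(r'^# \d+ "([^"]+)"')


def extract_dependencies(preprocessed, source):
    """Return the files named in linemarkers, in order, without duplicates."""
    deps = []
    for line in preprocessed.splitlines():
        match = LINEMARKER.match(line)
        if not match:
            continue
        name = match.group(1)
        # Skip pseudo files (built-in, command-line), the source itself
        # and files which were already recorded.
        if name.startswith("<") or name == source or name in deps:
            continue
        deps.append(name)
    return deps


def make_fragment(target, deps):
    """Format the dependencies as a single make rule."""
    return "%s: %s\n" % (target, " ".join(deps))
```

<p>Feeding it the output of <code class="language-plaintext highlighter-rouge">cpp main.c</code> and writing the result of <code class="language-plaintext highlighter-rouge">make_fragment("main.o", deps)</code> to <code class="language-plaintext highlighter-rouge">main.d</code> yields a rule of the shape shown above.</p>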

<p>This solution has multiple issues.</p>

<h3 id="vanishing-dependencies">Vanishing dependencies</h3>

<p>Suppose you refactor your code and remove <code class="language-plaintext highlighter-rouge">main.h</code>.
In that case your automatically generated dependencies cause a problem:
As <code class="language-plaintext highlighter-rouge">main.o</code> depends on <code class="language-plaintext highlighter-rouge">main.h</code>, which is no longer there, <code class="language-plaintext highlighter-rouge">make</code> will fail as there is no recipe to remake it.</p>

<p>To fix this, your dependency generation tool needs to output empty rules for all dependencies:</p>
<div class="language-make highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nl">main.h</span><span class="o">:</span>
<span class="nl">/usr/include/stdio.h</span><span class="o">:</span>
<span class="nl">/usr/include/stdint.h</span><span class="o">:</span>
</code></pre></div></div>
<p>There are three cases:</p>
<ol>
  <li>if the file still exists and was not updated (it is older than the target), no remake is triggered by this dependency, but others may still trigger one.</li>
  <li>if the file still exists and was updated (it is newer than the target), a rebuild is triggered for the target.</li>
  <li>if the file no longer exists, <code class="language-plaintext highlighter-rouge">make</code> invokes the <em>empty recipe</em> to remake it. This will not really create the file, but <code class="language-plaintext highlighter-rouge">make</code> will consider it as <em>newer than the target</em>, continue with case 2 above and remake the target.</li>
</ol>
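<p>The three cases can be written down as a small decision function.
This is only a simplified model of the timestamp comparison in Python, not how <code class="language-plaintext highlighter-rouge">make</code> is implemented; the function name and the use of <code class="language-plaintext highlighter-rouge">None</code> for a deleted file are assumptions for this illustration.</p>

```python
def remake_reasons(target_mtime, dep_mtimes):
    """Return why a target must be remade; an empty list means up to date.

    dep_mtimes maps each dependency to its modification time, or to None
    when the file was deleted and only its empty rule remains.
    """
    reasons = []
    for name, mtime in dep_mtimes.items():
        if mtime is None:
            # Case 3: the empty rule counts as having just remade the file.
            reasons.append("%s was removed" % name)
        elif mtime > target_mtime:
            # Case 2: the dependency is newer than the target.
            reasons.append("%s is newer" % name)
        # Case 1: an older dependency adds no reason.
    return reasons
```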

<p>Without that any developer would have to invoke <code class="language-plaintext highlighter-rouge">make clean</code> to remove all targets and dependency files, resulting in a full rebuild:</p>
<div class="language-make highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nl">.PHONY</span><span class="o">:</span> <span class="nf">clean</span>
<span class="nl">clean</span><span class="o">:</span>
    <span class="err">$(RM)</span> <span class="err">main</span> <span class="err">*.o</span> <span class="err">*.d</span>
</code></pre></div></div>

<h3 id="maintaining-the-dependency-tool">Maintaining the dependency tool</h3>

<p>First of all you must run the pre-processor a 2nd time to generate the input for your dependency extraction tool.
For small projects that cost might be negligible, but for larger projects that might add up.</p>

<p>Second you must maintain yet another tool.
While the pre-processed output is relatively easy to parse, newer compiler versions may add new features or change the output slightly, which your tool then must handle also.</p>

<p>Third you must make sure to invoke your pre-process run with exactly the same arguments as your real compilation:
Any <code class="language-plaintext highlighter-rouge">-Ddefine</code>, <code class="language-plaintext highlighter-rouge">-Idirectory</code>, <code class="language-plaintext highlighter-rouge">-include</code>, <code class="language-plaintext highlighter-rouge">-imacros</code> is important as otherwise you might miss or record wrong dependencies.</p>

<p>You must also decide, <strong>when</strong> to call your tool:
Many projects call it <strong>before</strong> the actual compilation, but that is unneeded:
If the target is missing, <code class="language-plaintext highlighter-rouge">make</code> must remake it anyway.
If the target exists, but you no longer have the dependency information, you must also remake the target, as you cannot rule out that a (changed) header introduces a significant change.</p>

<p>Generating the dependency information afterwards looks okay.
But you might get into situations, where you have stale information, for example if you interrupt <code class="language-plaintext highlighter-rouge">make</code> between the compilation and dependency-gathering steps.</p>

<p>Best would be to do it at the same time.
Luckily that is possible with <code class="language-plaintext highlighter-rouge">gcc</code> and other modern compilers like <code class="language-plaintext highlighter-rouge">clang</code>.</p>

<h2 id="the-gcc-way">The <abbr title="GNU Compiler Collection">gcc</abbr> way</h2>

<p>Luckily <em>modern</em> GCC has built-in support to <a href="https://gcc.gnu.org/onlinedocs/cpp/Invocation.html">generate dependency information</a> in <code class="language-plaintext highlighter-rouge">make</code>-syntax itself:</p>
<ul>
  <li><code class="language-plaintext highlighter-rouge">-M</code> enables generating dependency information <strong>instead</strong> of compiling the file. The output is written to <abbr title="Standard Output">STDOUT</abbr> unless <code class="language-plaintext highlighter-rouge">-o</code> is used to redirect it to a file.</li>
  <li><code class="language-plaintext highlighter-rouge">-MM</code> similar to the above, but <em>system header files</em> are not mentioned.</li>
  <li><code class="language-plaintext highlighter-rouge">-MD</code> and <code class="language-plaintext highlighter-rouge">-MMD</code> are variants of <code class="language-plaintext highlighter-rouge">-M</code> and <code class="language-plaintext highlighter-rouge">-MM</code> respectively, which generate dependency information in <strong>addition</strong> to the requested action, e.g. <code class="language-plaintext highlighter-rouge">-c</code> to compile the unit.</li>
  <li><code class="language-plaintext highlighter-rouge">-MF file</code> writes the information to the given file instead of <abbr title="Standard Output">STDOUT</abbr>.</li>
  <li><code class="language-plaintext highlighter-rouge">-MP</code> adds an additional phony target (an empty rule, not a <code class="language-plaintext highlighter-rouge">.PHONY</code> declaration) for each dependency to solve the <a href="#vanishing-dependencies">Vanishing dependencies</a> problem from above.</li>
  <li><code class="language-plaintext highlighter-rouge">-MT target</code> allows overriding the target name. By default the base-name of the <em>main input file</em> is used, where the suffix is replaced by <code class="language-plaintext highlighter-rouge">.o</code>.</li>
  <li><code class="language-plaintext highlighter-rouge">-MQ target</code> is a variant of the above, which also quotes any <code class="language-plaintext highlighter-rouge">make</code> meta-characters to make sure the name is not mangled by <code class="language-plaintext highlighter-rouge">make</code> but reaches the shell command as given.</li>
</ul>

<p>So let’s rewrite our <code class="language-plaintext highlighter-rouge">Makefile</code> and try this:</p>
<div class="language-make highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nl">main</span><span class="o">:</span> <span class="nf">main.o</span>

<span class="nl">%.o %.d &amp;</span><span class="o">:</span> <span class="nf">%.c</span>
	<span class="nv">$(CC)</span> <span class="nv">$(CPPFLAGS)</span> <span class="nv">$(CFLAGS)</span> <span class="nt">-MMD</span> <span class="nt">-MF</span> <span class="nv">$*</span>.d <span class="nt">-MP</span> <span class="nt">-c</span> <span class="nt">-o</span> <span class="nv">$*</span>.o <span class="nv">$&lt;</span>

<span class="k">-include</span><span class="sx"> *.d</span>
</code></pre></div></div>

<ol>
  <li><code class="language-plaintext highlighter-rouge">&amp;:</code> tells <code class="language-plaintext highlighter-rouge">make</code> that the recipe generates both files at the same time (<a href="https://www.gnu.org/software/make/manual/html_node/Multiple-Targets.html">grouped targets</a>).</li>
  <li><code class="language-plaintext highlighter-rouge">-MMD</code> tells <code class="language-plaintext highlighter-rouge">gcc</code> to both compile and generate dependency information at the same time. <em>System header files</em> are excluded.</li>
  <li><code class="language-plaintext highlighter-rouge">-MF $*.d</code> tells <code class="language-plaintext highlighter-rouge">gcc</code> to write the dependency information into a file with the file name extension <code class="language-plaintext highlighter-rouge">.d</code>.</li>
  <li><code class="language-plaintext highlighter-rouge">-MP</code> tells <code class="language-plaintext highlighter-rouge">gcc</code> to generate empty phony targets for all included files to make the dependency information future-proof in case one of them gets deleted.</li>
  <li><code class="language-plaintext highlighter-rouge">-c -o $*.o $&lt;</code> to compile the unit.</li>
  <li><code class="language-plaintext highlighter-rouge">-include *.d</code> includes the dependency information as far as it already exists.</li>
</ol>
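<p>The generated <code class="language-plaintext highlighter-rouge">main.d</code> is a plain <code class="language-plaintext highlighter-rouge">make</code> fragment: typically one rule for <code class="language-plaintext highlighter-rouge">main.o</code> listing <code class="language-plaintext highlighter-rouge">main.c</code> and <code class="language-plaintext highlighter-rouge">main.h</code>, followed by an empty rule for <code class="language-plaintext highlighter-rouge">main.h</code>.
To inspect what got recorded, a minimal Python sketch can read such a file back.
The function name is hypothetical, and the parser ignores escaped spaces and paths containing a colon.</p>

```python
def parse_depfile(text):
    """Parse make-syntax dependency rules into {target: [prerequisites]}."""
    rules = {}
    # Join lines which are continued with a trailing backslash.
    joined = text.replace("\\\n", " ")
    for line in joined.splitlines():
        if ":" not in line:
            continue
        target, _, prereqs = line.partition(":")
        rules[target.strip()] = prereqs.split()
    return rules
```

<p>Targets with an empty prerequisite list are the rules added by <code class="language-plaintext highlighter-rouge">-MP</code>.</p>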

<h3 id="first-compilation-issue">First compilation issue</h3>

<p>This does not work as expected:
<code class="language-plaintext highlighter-rouge">make</code> has a built-in mechanism to <a href="https://www.gnu.org/software/make/manual/html_node/Remaking-Makefiles.html">Remake Makefiles</a>.
All files included via <code class="language-plaintext highlighter-rouge">include</code> are considered <em>Makefiles</em> and <code class="language-plaintext highlighter-rouge">make</code> tries to update them.
If there is no file <code class="language-plaintext highlighter-rouge">*.d</code>, <code class="language-plaintext highlighter-rouge">make</code> applies our rule and will try to compile <code class="language-plaintext highlighter-rouge">*.c</code> to <code class="language-plaintext highlighter-rouge">*.d</code> :-(
(That is why the above rule already uses <code class="language-plaintext highlighter-rouge">$*.o</code> instead of <code class="language-plaintext highlighter-rouge">$@</code> as the latter would be <code class="language-plaintext highlighter-rouge">*.d</code>, which then is passed to both <code class="language-plaintext highlighter-rouge">-MF</code> and <code class="language-plaintext highlighter-rouge">-o</code> with catastrophic results.)</p>

<p>We can avoid this by explicitly using <code class="language-plaintext highlighter-rouge">$(wildcard )</code> to include only the existing files:</p>
<div class="language-make highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">-include</span><span class="sx"> $(wildcard *.d)</span>
</code></pre></div></div>

<h3 id="second-compilation-issue">Second compilation issue</h3>

<p>While the solution looks okay, actually it is not:
This way dependency information is optional.
If you delete all dependency files <code class="language-plaintext highlighter-rouge">*.d</code>, modify <code class="language-plaintext highlighter-rouge">main.h</code> and re-run <code class="language-plaintext highlighter-rouge">make</code>: Nothing will happen.
We lost the information, that <code class="language-plaintext highlighter-rouge">main.o</code> depends on <code class="language-plaintext highlighter-rouge">main.h</code>.
Therefore we must change the rule so that the associated file <code class="language-plaintext highlighter-rouge">$*.d</code> is always required:</p>
<div class="language-make highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nl">%.o</span><span class="o">:</span> <span class="nf">%.c %.d</span>
	<span class="nv">$(CC)</span> <span class="nv">$(CPPFLAGS)</span> <span class="nv">$(CFLAGS)</span> <span class="nt">-MMD</span> <span class="nt">-MF</span> <span class="nv">$*</span>.d <span class="nt">-MP</span> <span class="nt">-c</span> <span class="nt">-o</span> <span class="nv">$@</span> <span class="nv">$&lt;</span>
<span class="nl">%.d</span><span class="o">:</span> <span class="nf">;</span>
<span class="nl">.NOTINTERMEDIATE</span><span class="o">:</span> <span class="nf">%.d</span>
</code></pre></div></div>
<ul>
  <li>the <em>empty rule</em> for <code class="language-plaintext highlighter-rouge">%.d</code> is needed for <code class="language-plaintext highlighter-rouge">make</code> to handle the case, when the file is missing.
For that case we tell <code class="language-plaintext highlighter-rouge">make</code> that it should consider that file as <code class="language-plaintext highlighter-rouge">remade</code>, so it is newer than the target.
That will remake the target to actually generate the real dependency information.</li>
  <li>
    <p>the <code class="language-plaintext highlighter-rouge">.NOTINTERMEDIATE</code> is needed as <code class="language-plaintext highlighter-rouge">%.d</code> is never mentioned as a real target.
<code class="language-plaintext highlighter-rouge">make</code> will search its <a href="https://www.gnu.org/software/make/manual/html_node/Chained-Rules.html">chain of implicit rules</a> <code class="language-plaintext highlighter-rouge">main</code> → <code class="language-plaintext highlighter-rouge">main.o</code> → <code class="language-plaintext highlighter-rouge">main.d</code> and mark it as <em>intermediate</em>.
Because of that the file is not remade and/or will be deleted if it is remade.
By marking it as non-intermediate we tell <code class="language-plaintext highlighter-rouge">make</code> to handle it as a regular file and to keep it afterwards.</p>

    <p>This is only available since <em>GNU make 4.4</em>!</p>
  </li>
</ul>

<h3 id="final-version--make-44">Final version — make 4.4</h3>

<div class="language-make highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c">#!/usr/bin/make -f
# Disable built-in rules; built-in variables like $(CC) stay available
</span><span class="nv">MAKEFLAGS</span> <span class="o">+=</span> <span class="nt">--no-builtin-rules</span>

<span class="nl">main</span><span class="o">:</span> <span class="nf">main.o</span>

<span class="nv">CFLAGS</span> <span class="o">:=</span> <span class="nt">-g</span>
<span class="nv">MYCFLAGS</span> <span class="o">:=</span> <span class="nt">-Wall</span> <span class="nt">-Werror</span>
<span class="nv">DEPFLAGS</span> <span class="o">=</span> <span class="nt">-MMD</span> <span class="nt">-MP</span> <span class="nt">-MF</span> <span class="nv">$*</span>.d <span class="nt">-MT</span> <span class="nv">$@</span>

<span class="nv">COMPILE.c</span> <span class="o">=</span> <span class="nv">$(CC)</span> <span class="nv">$(DEPFLAGS)</span> <span class="nv">$(CPPFLAGS)</span> <span class="nv">$(CFLAGS)</span> <span class="nv">$(MYCFLAGS)</span> <span class="nt">-c</span>

<span class="nl">%.o</span><span class="o">:</span> <span class="nf">%.c %.d</span>
	<span class="nv">$(COMPILE.c)</span> <span class="nv">$(OUTPUT_OPTION)</span> <span class="nv">$&lt;</span>
<span class="nl">%.d</span><span class="o">:</span> <span class="nf">;</span>
<span class="nl">.NOTINTERMEDIATE</span><span class="o">:</span> <span class="nf">%.d</span>
<span class="nl">%</span><span class="o">:</span> <span class="nf">%.o</span>
	<span class="nv">$(LINK.o)</span> <span class="nv">$^</span> <span class="nv">$(LOADLIBES)</span> <span class="nv">$(LDLIBS)</span> <span class="nt">-o</span> <span class="nv">$@</span>

<span class="k">-include</span><span class="sx"> $(wildcard *.d)</span>

<span class="nl">.PHONY</span><span class="o">:</span> <span class="nf">clean</span>
<span class="nl">clean</span><span class="o">:</span>
	<span class="nv">$(RM)</span> main <span class="k">*</span>.o <span class="k">*</span>.d
</code></pre></div></div>

<h3 id="final-version--make-43">Final version — make 4.3</h3>

<div class="language-make highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c">#!/usr/bin/make -f
# Disable built-in rules (built-in variables like $(CC) and $(RM) stay available)
</span><span class="nv">MAKEFLAGS</span> <span class="o">+=</span> <span class="nt">--no-builtin-rules</span>

<span class="nv">SRCS</span> <span class="o">:=</span> main.c
<span class="nv">OBJS</span> <span class="o">:=</span> <span class="nv">$(SRCS:%.c=%.o)</span>
<span class="nv">DEPS</span> <span class="o">:=</span> <span class="nv">$(SRCS:%.c=%.d)</span>

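<span class="nl">main</span><span class="o">:</span> <span class="nf">$(OBJS)</span>
	<span class="nv">$(LINK.o)</span> <span class="nv">$^</span> <span class="nv">$(LOADLIBES)</span> <span class="nv">$(LDLIBS)</span> <span class="nt">-o</span> <span class="nv">$@</span>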

<span class="nv">CFLAGS</span> <span class="o">:=</span> <span class="nt">-g</span>
<span class="nv">MYCFLAGS</span> <span class="o">:=</span> <span class="nt">-Wall</span> <span class="nt">-Werror</span>
<span class="nv">DEPFLAGS</span> <span class="o">=</span> <span class="nt">-MMD</span> <span class="nt">-MP</span> <span class="nt">-MF</span> <span class="nv">$*</span>.d <span class="nt">-MT</span> <span class="nv">$@</span>

<span class="nv">COMPILE.c</span> <span class="o">=</span> <span class="nv">$(CC)</span> <span class="nv">$(DEPFLAGS)</span> <span class="nv">$(CPPFLAGS)</span> <span class="nv">$(CFLAGS)</span> <span class="nv">$(MYCFLAGS)</span> <span class="nt">-c</span>

<span class="nl">%.o</span><span class="o">:</span> <span class="nf">%.c %.d</span>
	<span class="nv">$(COMPILE.c)</span> <span class="nv">$(OUTPUT_OPTION)</span> <span class="nv">$&lt;</span>
<span class="nl">$(DEPS)</span><span class="o">:</span>

<span class="k">-include</span><span class="sx"> $(wildcard $(DEPS))</span>

<span class="nl">.PHONY</span><span class="o">:</span> <span class="nf">clean</span>
<span class="nl">clean</span><span class="o">:</span>
	<span class="nv">$(RM)</span> main <span class="nv">$(OBJS)</span> <span class="nv">$(DEPS)</span>
</code></pre></div></div>

<h2 id="the-kbuild-way">The kbuild way</h2>

<p>The Linux kernel uses its own <a href="https://docs.kernel.org/kbuild/index.html">build system</a> called <code class="language-plaintext highlighter-rouge">kbuild</code>, which is based on a bunch of <code class="language-plaintext highlighter-rouge">make</code> recipes.
It has some additional requirements:</p>
<ol>
  <li>The Linux kernel is heavily configurable.
There is a huge <code class="language-plaintext highlighter-rouge">.config</code> file, which lists all options.
If that file were used as a prerequisite, every file would get rebuilt each time a single option changed.
Therefore kbuild uses some mechanisms to split that big file into smaller chunks, so that each compilation unit depends only on the options it actually uses.</li>
  <li>The above solution does not track the <code class="language-plaintext highlighter-rouge">$(…FLAGS)</code> variables or <code class="language-plaintext highlighter-rouge">$(CC)</code>.
Changing them might require a complete rebuild to get a consistent kernel again.
As such kbuild also logs the final command used to compile the target in the dependency information file.
On the next run the commands are compared and the invocation may only be skipped if they match.</li>
</ol>

<p>For that kbuild overrides most of <code class="language-plaintext highlighter-rouge">make</code>’s dependency mechanism with its own implementation:</p>
<ol>
  <li>Most targets have <code class="language-plaintext highlighter-rouge">FORCE</code> as a prerequisite, so that the recipe will always run.</li>
  <li>The recipe itself then uses some heavy macro magic to read back its dependency information from a file and compare that to the current run.
The command is only executed if the command line, any prerequisite, or any relevant configuration option has changed.</li>
  <li>If a command cannot determine in advance whether it needs to run, it runs by default but writes its output to a temporary file.
That file is then compared to the previous version.
    <ul>
      <li>if the content differs, the temporary file is renamed over the real output file.</li>
      <li>if the content did not change, the temporary file is deleted.
That way the old time stamp is preserved if nothing changed.
This prevents needless downstream rebuilds.</li>
    </ul>
  </li>
</ol>
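<p>The third step can be sketched in plain C. This is only my illustration of the "write to a temporary file, compare, then rename or delete" idea and not the code kbuild actually uses; the names <code class="language-plaintext highlighter-rouge">same_content()</code> and <code class="language-plaintext highlighter-rouge">update_if_changed()</code> and the <code class="language-plaintext highlighter-rouge">.tmp</code> suffix are made up for this example, and replacing an existing file via <code class="language-plaintext highlighter-rouge">rename()</code> assumes POSIX semantics.</p>

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Return true if both files can be opened and have identical content. */
static bool same_content(const char *path_a, const char *path_b)
{
    FILE *a = fopen(path_a, "rb");
    FILE *b = fopen(path_b, "rb");
    bool same = (a != NULL && b != NULL);

    while (same) {
        int ca = fgetc(a);
        int cb = fgetc(b);
        if (ca != cb)
            same = false;      /* content or length differs */
        else if (ca == EOF)
            break;             /* both files ended together */
    }
    if (a != NULL)
        fclose(a);
    if (b != NULL)
        fclose(b);
    return same;
}

/*
 * Write `content` to `path`, but leave `path` (and its time stamp) alone
 * if it already holds exactly that content.
 * Returns 1 if `path` was replaced, 0 if it was kept, -1 on error.
 */
static int update_if_changed(const char *path, const char *content)
{
    char tmp[4096];
    int len;
    FILE *out;
    bool ok;

    assert(path != NULL && content != NULL);
    len = snprintf(tmp, sizeof(tmp), "%s.tmp", path);
    if (len < 0 || (size_t)len >= sizeof(tmp))
        return -1;

    out = fopen(tmp, "wb");
    if (out == NULL)
        return -1;
    ok = (fputs(content, out) != EOF);
    if (fclose(out) != 0)
        ok = false;
    if (!ok) {
        remove(tmp);
        return -1;
    }

    if (same_content(tmp, path)) {
        remove(tmp);           /* unchanged: keep the old file and its mtime */
        return 0;
    }
    return rename(tmp, path) == 0 ? 1 : -1;  /* changed: replace the output */
}
```

<p>Calling <code class="language-plaintext highlighter-rouge">update_if_changed()</code> twice with the same content only touches the file the first time, so anything depending on that file is not rebuilt after the second call.</p>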

<h2 id="closing-word">Closing word</h2>

<p>Much of this was inspired by the article <a href="https://make.mad-scientist.net/papers/advanced-auto-dependency-generation/">Auto-Dependency Generation</a> from <em>Paul D. Smith</em>.
Thank you very much for writing this in the first place.
The main difference is that he uses a variable <code class="language-plaintext highlighter-rouge">$(SRCS)</code>, which explicitly lists all C source files.
That way he can <strong>explicitly</strong> name the expected <code class="language-plaintext highlighter-rouge">*.o</code> and <code class="language-plaintext highlighter-rouge">*.d</code> files, which bypasses the problem with intermediate files from my solution above.
That version also works for <code class="language-plaintext highlighter-rouge">make 4.3</code> and earlier, as <code class="language-plaintext highlighter-rouge">.NOTINTERMEDIATE</code> is only available since <code class="language-plaintext highlighter-rouge">make 4.4</code>.</p>

<!-- *[GPT]: GUID Partition Table -->]]></content><author><name>Philipp Hahn</name></author><category term="linux" /><summary type="html"><![CDATA[make is used to build projects, e.g. compile source code into binaries. If the project consists of multiple files, explicit dependencies must be specified to run the command in the correct order. In addition to that Makefiles can also be used to track implicit dependencies: If one file is modified, only those commands are re-run which are needed. For large projects that can be a big time-saver if incremental changes are done. But how to do that properly (for a C project)?]]></summary></entry><entry><title type="html">Padding and alignment of C structs</title><link href="https://blog.pmhahn.de/C-padding/" rel="alternate" type="text/html" title="Padding and alignment of C structs" /><published>2025-10-01T10:46:00+02:00</published><updated>2025-10-01T10:46:00+02:00</updated><id>https://blog.pmhahn.de/C-padding</id><content type="html" xml:base="https://blog.pmhahn.de/C-padding/"><![CDATA[<p>Q: How to debug padding and alignment issues of <abbr title="Catalog">C</abbr> <code class="language-plaintext highlighter-rouge">struct</code>?</p>

<p>A: <code class="language-plaintext highlighter-rouge">gdb --silent --batch -ex 'ptype /o struct my_t' some.o</code></p>

<!--more-->

<p>Over the past few days I was investigating some performance issues with a proprietary SoC:
The code consists of closed-source pre-compiled binaries combined with public header files.
Some public glue-code added accessors to allocate, copy, and free the data structures.</p>

<h2 id="padding">Padding</h2>

<p>We have the requirement to extend the data structure with some additional members.
As some code is closed-source, it is important not to change the layout of the existing structure.
Luckily C adds padding between members to align the next member according to common hardware constraints:</p>
<blockquote>
  <p>The start address of a 1,2,4,8,16,32,64,… sized member must align to that size.</p>
</blockquote>

<p>Consider this:</p>
<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="cp">#include</span> <span class="cpf">&lt;stdint.h&gt;</span><span class="cp">
</span><span class="k">struct</span> <span class="n">my0_t</span> <span class="p">{</span>
    <span class="kt">uint8_t</span> <span class="n">foo</span><span class="p">;</span>
    <span class="kt">uint32_t</span> <span class="n">bar</span><span class="p">;</span>
<span class="p">}</span> <span class="n">var0</span><span class="p">[</span><span class="mi">3</span><span class="p">];</span>
<span class="n">static_assert</span><span class="p">(</span><span class="k">sizeof</span><span class="p">(</span><span class="n">var0</span><span class="p">)</span> <span class="o">==</span> <span class="mi">8</span> <span class="o">*</span> <span class="mi">3</span><span class="p">,</span> <span class="s">"Unexpected sizeof"</span><span class="p">);</span>
</code></pre></div></div>

<p><code class="language-plaintext highlighter-rouge">bar</code> has size 32 bits or 4 bytes, so <code class="language-plaintext highlighter-rouge">var0[0]</code> must be placed into memory so that <code class="language-plaintext highlighter-rouge">(uintptr_t)&amp;var0[0].bar % 4 == 0</code> is true.
The C compiler will thus add <em>padding bytes</em> before <code class="language-plaintext highlighter-rouge">bar</code> to satisfy that requirement.
Compiling the code with <code class="language-plaintext highlighter-rouge">-Wpadded</code> shows this:</p>
<div class="language-console highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="gp">$</span><span class="w"> </span>gcc <span class="nt">-c</span> <span class="nt">-g</span> <span class="nt">-Wpadded</span> c-padding.c
<span class="go">c-padding.c:4:14: warning: padding struct to align ‘bar’ [-Wpadded]
</span><span class="gp">    4 |     uint32_t bar;</span><span class="w">
</span><span class="go">      |              ^~~
</span></code></pre></div></div>

<p>But the warning does not tell you what the C compiler actually does here:
in principle the padding could go either <strong>after</strong> <code class="language-plaintext highlighter-rouge">foo</code> or <strong>before</strong> it:</p>
<pre><code class="language-c">struct my1_t {
    uint8_t foo;
    uint8_t _padding[3];
    uint32_t bar;
} var1[3];
static_assert(sizeof(var1) == 8 * 3, "Unexpected sizeof");
</code></pre>
<p>or</p>
<pre><code class="language-c">struct my2_t {
    uint8_t _padding[3];
    uint8_t foo;
    uint32_t bar;
} var2[3];
static_assert(sizeof(var2) == 8 * 3, "Unexpected sizeof");
</code></pre>

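<p>Both layouts have the same size, but the C standard does not allow padding before the first member of a <code class="language-plaintext highlighter-rouge">struct</code>, so a conforming compiler only ever produces the first one: padding is inserted after the previous member and before the next member.</p>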

<p>But you can use <code class="language-plaintext highlighter-rouge">gdb</code>’s <code class="language-plaintext highlighter-rouge">ptype</code> command to dump the exact layout including offset, size and inserted padding:</p>
<div class="language-console highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="gp">$</span><span class="w"> </span>gdb <span class="nt">--silent</span> <span class="nt">--batch</span> <span class="nt">-ex</span> <span class="s1">'ptype /o struct my0_t'</span> c-padding.o
<span class="go">/* offset      |    size */  type = struct my0_t {
</span><span class="gp">/*      0      |       1 */    uint8_t foo;</span><span class="w">
</span><span class="go">/* XXX  3-byte hole      */
</span><span class="gp">/*      4      |       4 */    uint32_t bar;</span><span class="w">
</span><span class="go">                               /* total size (bytes):    8 */
                             }
</span></code></pre></div></div>
<p>For this to work you need DWARF debugging information.
So please make sure you compile your code with <code class="language-plaintext highlighter-rouge">-g</code> enabled!</p>
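<p>If you want the build itself to catch layout changes, the same numbers can be pinned down at compile time. The following is a minimal sketch, assuming an ABI where <code class="language-plaintext highlighter-rouge">uint32_t</code> is 4-byte aligned (true for all mainstream platforms I know of); the macro <code class="language-plaintext highlighter-rouge">HOLE_BETWEEN</code> is a hypothetical helper of mine and not part of any library.</p>

```c
#include <assert.h>   /* static_assert */
#include <stddef.h>   /* offsetof */
#include <stdint.h>

struct my0_t {
    uint8_t foo;
    uint32_t bar;
};

/* Hypothetical helper: number of padding bytes between two adjacent members. */
#define HOLE_BETWEEN(type, prev, next) \
    (offsetof(type, next) - (offsetof(type, prev) + sizeof(((type *)0)->prev)))

/* These mirror the columns printed by `ptype /o`. */
static_assert(offsetof(struct my0_t, foo) == 0, "foo moved");
static_assert(offsetof(struct my0_t, bar) == 4, "bar moved");
static_assert(HOLE_BETWEEN(struct my0_t, foo, bar) == 3, "hole changed");
static_assert(sizeof(struct my0_t) == 8, "size changed");
```

<p>Unlike the <code class="language-plaintext highlighter-rouge">gdb</code> approach this needs no debug information, but you have to write down every expectation yourself.</p>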

<p>This extra padding increases the size of your <code class="language-plaintext highlighter-rouge">struct</code>, which might be undesired:
On embedded systems memory is often scarce, and excessive padding might waste a lot of it.
To minimize this, you have multiple options:</p>

<h3 id="packing">Packing</h3>
<p>You can declare the <code class="language-plaintext highlighter-rouge">struct</code> as <em>packed</em>:</p>
<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">struct</span> <span class="n">my3_t</span> <span class="p">{</span>
    <span class="kt">uint8_t</span> <span class="n">foo</span><span class="p">;</span>
    <span class="kt">uint32_t</span> <span class="n">bar</span><span class="p">;</span>
<span class="p">}</span> <span class="n">__attribute__</span><span class="p">((</span><span class="n">packed</span><span class="p">))</span> <span class="n">var3</span><span class="p">[</span><span class="mi">3</span><span class="p">];</span>
<span class="n">static_assert</span><span class="p">(</span><span class="k">sizeof</span><span class="p">(</span><span class="n">var3</span><span class="p">)</span> <span class="o">==</span> <span class="mi">5</span> <span class="o">*</span> <span class="mi">3</span><span class="p">,</span> <span class="s">"Unexpected sizeof"</span><span class="p">);</span>
</code></pre></div></div>
<p>This makes the <code class="language-plaintext highlighter-rouge">struct</code> as compact as possible by <strong>not</strong> inserting any padding automatically.
But you will get into trouble and risk getting a <code class="language-plaintext highlighter-rouge">SIGBUS</code> error on some architectures:
Accessing a 32 bit variable which is not 4 byte aligned requires additional work:</p>
<ol>
  <li>Either the hardware has some extra logic to split non-aligned memory access into multiple accesses and to recombine both parts into the final value,</li>
  <li>Or the compiler has to generate extra code to not do the unaligned access,</li>
  <li>Or your program terminates with <code class="language-plaintext highlighter-rouge">SIGBUS</code> as the processor raises the <em>unaligned trap</em></li>
</ol>

<p>Please do not use <code class="language-plaintext highlighter-rouge">-fpack-struct</code> to make every <code class="language-plaintext highlighter-rouge">struct</code> packed by default!</p>
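<p>If you do have to work with a packed <code class="language-plaintext highlighter-rouge">struct</code>, one common way to stay clear of the unaligned trap is to never form a <code class="language-plaintext highlighter-rouge">uint32_t *</code> to the member, but to copy its bytes into an aligned local variable. A minimal sketch, assuming <code class="language-plaintext highlighter-rouge">gcc</code> or <code class="language-plaintext highlighter-rouge">clang</code> for the attribute syntax; the accessor name <code class="language-plaintext highlighter-rouge">get_bar()</code> is made up for this example:</p>

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

struct my3_t {
    uint8_t foo;
    uint32_t bar;
} __attribute__((packed));

static_assert(sizeof(struct my3_t) == 5, "Unexpected sizeof");
static_assert(offsetof(struct my3_t, bar) == 1, "bar is expected to be unaligned");

/*
 * Copy the possibly unaligned `bar` into an aligned local variable.
 * memcpy() has no alignment requirement, so the compiler is free to
 * pick whatever access is safe on the target.
 */
static uint32_t get_bar(const struct my3_t *s)
{
    uint32_t value;

    memcpy(&value, (const unsigned char *)s + offsetof(struct my3_t, bar),
           sizeof(value));
    return value;
}
```

<p>In an array of such structs the second element starts at offset 5, so its <code class="language-plaintext highlighter-rouge">bar</code> sits at offset 6; the accessor does not care.</p>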

<h3 id="re-ordering-descending-by-size">Re-ordering descending by size</h3>
<p>Most often you can re-order your members descending by size – assuming sizes being a power-of-two.
The compiler still adds padding, but only at the end of the structure.
That way you do not have holes in the middle.</p>

<p>That changes the layout and breaks any <abbr title="Application Binary Interface">ABI</abbr> compatibility!
So do not do this with <code class="language-plaintext highlighter-rouge">struct</code>s which are used to communicate with your hardware or some closed-source binary which assumes the old layout.</p>
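<p>To illustrate the effect of re-ordering, here is a small sketch with two made-up structs holding the same four members. The exact size of the unsorted variant depends on the ABI (I get 24 bytes on x86-64), so only the relation between the two is asserted.</p>

```c
#include <assert.h>
#include <stdint.h>

/* Members in "random" order: holes appear in the middle. */
struct unsorted_t {
    uint8_t a;    /* followed by a hole so that `b` is aligned */
    uint64_t b;
    uint8_t c;    /* followed by a 3-byte hole so that `d` is aligned */
    uint32_t d;
};

/* Same members, ordered descending by size: only tail padding remains. */
struct sorted_t {
    uint64_t b;
    uint32_t d;
    uint8_t a;
    uint8_t c;    /* followed by 2 bytes of tail padding */
};

static_assert(sizeof(struct sorted_t) == 16, "8 + 4 + 1 + 1 + 2 padding");
static_assert(sizeof(struct sorted_t) < sizeof(struct unsorted_t),
              "re-ordering should save space");
```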

<h3 id="careful-re-ordering">Careful re-ordering</h3>
<p>If you only need to insert some small data, look for those holes:
As the compiler added padding there <strong>automatically</strong>, there is no guarantee that these bits/bytes are zero-initialized.
If you need that guarantee, you must manually insert padding bytes!</p>

<p>On the other hand that provides the opportunity to re-use those <em>undefined bits</em> for additional members.
Just look for a hole which is large enough for your data and add your member in between the members bordering that hole.</p>

<p>Just be careful with structures which are used with hardware:
If their accessor function does a <code class="language-plaintext highlighter-rouge">memset(…, 0, …)</code> to initialize the <code class="language-plaintext highlighter-rouge">struct</code> to zero, it might be important that those bits remain cleared.
If you then start using those bits, the hardware might get confused.</p>
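<p>A sketch of how such an extension can be guarded: the new members go into the former 3-byte hole, and compile-time checks make sure that the old members did not move. The struct names and the members <code class="language-plaintext highlighter-rouge">flags</code> and <code class="language-plaintext highlighter-rouge">count</code> are invented for this example.</p>

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Layout the closed-source side was compiled against. */
struct old_t {
    uint8_t foo;
    /* 3-byte hole inserted by the compiler */
    uint32_t bar;
};

/* Extended layout: the hole is now occupied by two new members. */
struct new_t {
    uint8_t foo;
    uint8_t flags;    /* new, lives in the former hole */
    uint16_t count;   /* new, lives in the former hole */
    uint32_t bar;
};

/* The existing members must stay where they were. */
static_assert(sizeof(struct new_t) == sizeof(struct old_t), "size changed");
static_assert(offsetof(struct new_t, foo) == offsetof(struct old_t, foo), "foo moved");
static_assert(offsetof(struct new_t, bar) == offsetof(struct old_t, bar), "bar moved");
```

<p>Code that only knows <code class="language-plaintext highlighter-rouge">struct old_t</code> will still find <code class="language-plaintext highlighter-rouge">foo</code> and <code class="language-plaintext highlighter-rouge">bar</code> at the same offsets, but it will neither initialize nor copy the new members reliably unless it treats the struct as an opaque block of bytes.</p>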

<h2 id="alignment">Alignment</h2>
<p>You might have noticed that <code class="language-plaintext highlighter-rouge">sizeof(struct my0_t) == 8</code> and not <code class="language-plaintext highlighter-rouge">5 == sizeof(uint8_t) + sizeof(uint32_t)</code>.
<code class="language-plaintext highlighter-rouge">gcc</code> also adds padding after the last member to extend the <code class="language-plaintext highlighter-rouge">struct</code> until its <code class="language-plaintext highlighter-rouge">sizeof</code> is a multiple of the alignment of its most strictly aligned member.
This is important for arrays where multiple instances are placed after each other.
There each instance’s start address must be aligned properly, which requires padding in between.
The distance between two elements is called “stride size”, which equals the <code class="language-plaintext highlighter-rouge">sizeof</code>.</p>
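<p>The relation between alignment, <code class="language-plaintext highlighter-rouge">sizeof</code> and stride can be checked directly. A minimal sketch using C11 <code class="language-plaintext highlighter-rouge">alignof</code>; the helper <code class="language-plaintext highlighter-rouge">stride_of_var0()</code> is a name I made up:</p>

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>

struct my0_t {
    uint8_t foo;
    uint32_t bar;
};

static struct my0_t var0[3];

/* The struct inherits the alignment of its most strictly aligned member ... */
static_assert(alignof(struct my0_t) == alignof(uint32_t), "Unexpected alignof");
/* ... and its size is always a multiple of that alignment. */
static_assert(sizeof(struct my0_t) % alignof(struct my0_t) == 0, "Unexpected sizeof");

/* Distance in bytes between two neighbouring array elements. */
static size_t stride_of_var0(void)
{
    return (size_t)((const char *)&var0[1] - (const char *)&var0[0]);
}
```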

<p>This also applies to nested <code class="language-plaintext highlighter-rouge">struct</code>s like this:</p>
<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">struct</span> <span class="n">my5_t</span> <span class="p">{</span>
    <span class="k">struct</span> <span class="n">my4_t</span> <span class="n">baz</span><span class="p">;</span>
    <span class="kt">uint8_t</span> <span class="n">bla</span><span class="p">;</span>
<span class="p">}</span> <span class="n">var5</span><span class="p">[</span><span class="mi">3</span><span class="p">];</span>
<span class="n">static_assert</span><span class="p">(</span><span class="k">sizeof</span><span class="p">(</span><span class="n">var5</span><span class="p">)</span> <span class="o">==</span> <span class="mi">12</span> <span class="o">*</span> <span class="mi">3</span><span class="p">,</span> <span class="s">"Unexpected sizeof"</span><span class="p">);</span>
</code></pre></div></div>
<p>This might be unexpected as <code class="language-plaintext highlighter-rouge">my4_t</code> ends with 3 padding bytes, where <code class="language-plaintext highlighter-rouge">bla</code> might fit in.
Instead <code class="language-plaintext highlighter-rouge">bla</code> gets placed after the padding from <code class="language-plaintext highlighter-rouge">baz</code>, after which 3 more padding bytes are required.
So in total you get 6 bytes of padding.</p>
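<p>The numbers can be reproduced with the sketch below. Note that the definition of <code class="language-plaintext highlighter-rouge">my4_t</code> used here is my assumption (a <code class="language-plaintext highlighter-rouge">uint32_t</code> followed by a <code class="language-plaintext highlighter-rouge">uint8_t</code>, which gives the 3 bytes of tail padding mentioned above), and <code class="language-plaintext highlighter-rouge">flat_t</code> is a made-up comparison with the same members in a single <code class="language-plaintext highlighter-rouge">struct</code>.</p>

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed definition: 5 bytes of data plus 3 bytes of tail padding. */
struct my4_t {
    uint32_t bar;
    uint8_t foo;
};

struct my5_t {
    struct my4_t baz;   /* occupies 8 bytes including its own padding */
    uint8_t bla;        /* cannot re-use the tail padding of `baz` */
};

/* Same members without nesting: `bla` fits right behind `foo`. */
struct flat_t {
    uint32_t bar;
    uint8_t foo;
    uint8_t bla;
};

static_assert(sizeof(struct my4_t) == 8, "Unexpected sizeof");
static_assert(offsetof(struct my5_t, bla) == 8, "bla is placed after the padding of baz");
static_assert(sizeof(struct my5_t) == 12, "Unexpected sizeof");
static_assert(sizeof(struct flat_t) == 8, "Unexpected sizeof");
```

<p>The nested <code class="language-plaintext highlighter-rouge">struct</code> has to keep its full size, because code may copy or assign <code class="language-plaintext highlighter-rouge">baz</code> as a whole, which would overwrite anything stored in its padding.</p>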

<h3 id="cache-line-size">Cache line size</h3>
<p>Alignment becomes even more important for performance.
Modern <abbr title="Central Processing Units">CPUs</abbr> have lots of caches and their <em>line size</em> specifies the smallest quantity for data transfer.
Even when you only require a single bit, the cache will transfer 32 or 64 or even more bytes from <abbr title="Random Access Memory">RAM</abbr>.</p>
<ul>
  <li>with more tightly packed <code class="language-plaintext highlighter-rouge">struct</code>s you get more data per cache-line and require fewer cache-lines, leaving more free cache lines for other tasks.</li>
  <li>on the other hand <em>false sharing</em> might become a performance issue with multi-threading, where data with different access patterns is stored in the same cache line.</li>
</ul>

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">struct</span> <span class="n">my6_t</span> <span class="p">{</span>
    <span class="kt">uint8_t</span> <span class="n">foo</span><span class="p">;</span>
    <span class="kt">uint32_t</span> <span class="n">bar</span><span class="p">;</span>
<span class="p">}</span> <span class="n">__attribute__</span><span class="p">((</span><span class="n">aligned</span><span class="p">(</span><span class="mi">32</span><span class="p">)))</span> <span class="n">var6</span><span class="p">[</span><span class="mi">3</span><span class="p">];</span>
<span class="n">static_assert</span><span class="p">(</span><span class="k">sizeof</span><span class="p">(</span><span class="n">var6</span><span class="p">)</span> <span class="o">==</span> <span class="mi">32</span> <span class="o">*</span> <span class="mi">3</span><span class="p">,</span> <span class="s">"Unexpected sizeof"</span><span class="p">);</span>
</code></pre></div></div>

<p>In this case we get 3 bytes of padding between <code class="language-plaintext highlighter-rouge">foo</code> and <code class="language-plaintext highlighter-rouge">bar</code>.
But we also get 24 bytes of padding after <code class="language-plaintext highlighter-rouge">bar</code> to make <code class="language-plaintext highlighter-rouge">sizeof(struct my6_t)</code> a multiple of 32 as requested by <code class="language-plaintext highlighter-rouge">__attribute__((aligned(32)))</code>.</p>
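<p>The same attribute is a common way to avoid the false sharing mentioned above: each per-thread item gets a cache line of its own. A minimal sketch, assuming a line size of 64 bytes (check your CPU) and <code class="language-plaintext highlighter-rouge">gcc</code> or <code class="language-plaintext highlighter-rouge">clang</code>; <code class="language-plaintext highlighter-rouge">counter_t</code>, <code class="language-plaintext highlighter-rouge">CACHE_LINE_SIZE</code> and <code class="language-plaintext highlighter-rouge">cache_line_of()</code> are names invented for this example.</p>

```c
#include <assert.h>
#include <stdalign.h>
#include <stdint.h>

#define CACHE_LINE_SIZE 64   /* assumption: typical for current x86-64 CPUs */

/* One counter per thread, each padded to a full cache line. */
struct counter_t {
    uint64_t value;
} __attribute__((aligned(CACHE_LINE_SIZE)));

static struct counter_t counters[2];

static_assert(alignof(struct counter_t) == CACHE_LINE_SIZE, "Unexpected alignof");
static_assert(sizeof(struct counter_t) == CACHE_LINE_SIZE, "Unexpected sizeof");

/* Index of the cache line an address falls into. */
static uintptr_t cache_line_of(const void *addr)
{
    return (uintptr_t)addr / CACHE_LINE_SIZE;
}
```

<p>The price is 56 bytes of padding per counter, so this only pays off for data which is actually written by different threads.</p>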

<p>This easily becomes worse with nested <code class="language-plaintext highlighter-rouge">struct</code>s where inner <code class="language-plaintext highlighter-rouge">struct</code>s also have alignments:</p>
<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">struct</span> <span class="n">my7_t</span> <span class="p">{</span>
    <span class="kt">uint8_t</span> <span class="n">foo</span><span class="p">;</span>
    <span class="k">struct</span> <span class="n">inner</span> <span class="p">{</span>
        <span class="kt">uint8_t</span> <span class="n">foo</span><span class="p">;</span>
    <span class="p">}</span> <span class="n">__attribute__</span><span class="p">((</span><span class="n">aligned</span><span class="p">(</span><span class="mi">32</span><span class="p">)))</span> <span class="n">bar</span><span class="p">;</span>
<span class="p">}</span> <span class="n">var7</span><span class="p">[</span><span class="mi">3</span><span class="p">];</span>
<span class="n">static_assert</span><span class="p">(</span><span class="k">sizeof</span><span class="p">(</span><span class="n">var7</span><span class="p">)</span> <span class="o">==</span> <span class="mi">64</span> <span class="o">*</span> <span class="mi">3</span><span class="p">,</span> <span class="s">"Unexpected sizeof"</span><span class="p">);</span>
</code></pre></div></div>
<p>Running <code class="language-plaintext highlighter-rouge">gdb</code> shows what happens:</p>
<div class="language-console highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="gp">$</span><span class="w"> </span>gdb <span class="nt">--silent</span> <span class="nt">--batch</span> <span class="nt">-ex</span> <span class="s1">'ptype /o var7'</span> c-padding.o
<span class="go">type = struct my7_t {
</span><span class="gp">/*      0      |       1 */    uint8_t foo;</span><span class="w">
</span><span class="go">/* XXX 31-byte hole      */
/*     32      |      32 */    struct inner {
</span><span class="gp">/*     32      |       1 */        uint8_t foo;</span><span class="w">
</span><span class="go">/* XXX 31-byte padding   */
                                   /* total size (bytes):   32 */
</span><span class="gp">                               } bar;</span><span class="w">
</span><span class="go">                               /* total size (bytes):   64 */
                             } [3]
</span></code></pre></div></div>

<h2 id="summary">Summary</h2>
<ul>
  <li>Use <code class="language-plaintext highlighter-rouge">gcc</code>’s <code class="language-plaintext highlighter-rouge">-Wpadded</code> to get a warning.</li>
  <li>Use <code class="language-plaintext highlighter-rouge">gdb</code>’s <code class="language-plaintext highlighter-rouge">ptype</code> to print the real layout.</li>
  <li>Verify your assumptions, especially if the same code is compiled for multiple platforms with different alignment requirements.</li>
  <li>Do not trust the comments in the code claiming ancient values for <code class="language-plaintext highlighter-rouge">sizeof</code> or proper cache line alignment.</li>
  <li>Explicitly add padding bytes as they are then also initialized; otherwise the compiler may do as it likes.</li>
</ul>

<h2 id="further-reading">Further reading</h2>
<ul>
  <li><a href="https://gcc.gnu.org/onlinedocs/gcc/Warning-Options.html#index-Wpadded"><abbr title="GNU Compiler Collection">gcc</abbr>: -Wpadded</a></li>
  <li><a href="http://www.catb.org/esr/structure-packing/">ESR: The Lost Art of Structure Packing</a></li>
</ul>

<!-- *[GPT]: GUID Partition Table -->]]></content><author><name>Philipp Hahn</name></author><category term="c" /><summary type="html"><![CDATA[Q: How to debug padding and alignment issues of C struct? A: gdb --silent --batch -ex 'ptype /o struct my_t' some.o]]></summary></entry><entry><title type="html">Linux Kernel Module Symbol Versioning</title><link href="https://blog.pmhahn.de/linux-kernel-symbol-versioning/" rel="alternate" type="text/html" title="Linux Kernel Module Symbol Versioning" /><published>2025-08-23T12:59:00+02:00</published><updated>2025-08-23T12:59:00+02:00</updated><id>https://blog.pmhahn.de/linux-kernel-symbol-versioning</id><content type="html" xml:base="https://blog.pmhahn.de/linux-kernel-symbol-versioning/"><![CDATA[<p>The Linux kernel itself and its modules may export symbols, so that other modules can import and use them.
As the functions are written in C, it is important that the function signature matches:</p>
<ul>
  <li>the number of arguments must match</li>
  <li>the ordering of the arguments must match</li>
  <li>the data types must match, which includes the structure and layout of all input and output parameters</li>
</ul>

<p>If any of them changes, the <em>Application Binary Interface</em> (<abbr title="Application Binary Interface">ABI</abbr>) changes and you risk crashing the kernel.
If you’re lucky, recompiling the kernel and the modules is enough for both ends to pick up the new <em>Application Programming Interface</em> (<abbr title="Application Programming Interface">API</abbr>).</p>

<p>To detect such breaking changes, the Linux kernel can be compiled with <code class="language-plaintext highlighter-rouge">CONFIG_MODVERSIONS</code> enabled:
This calculates a <em>Cyclic Redundancy Check</em> (CRC) checksum over the function signature and embeds this information with the kernel and the modules.
The dynamic linker of the Linux kernel checks that, for each requested symbol, its CRC matches the CRC recorded by the Linux kernel or by an already loaded module.
A module is only loaded if a match is found for all symbols.
Otherwise loading fails.</p>

<!--more-->
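<p>The check itself boils down to a table lookup. The following is a simplified sketch of the idea and not the kernel’s actual implementation: the type <code class="language-plaintext highlighter-rouge">struct sym_crc</code> and both functions are hypothetical, and in the usage below the symbol names and CRC values are made up.</p>

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical record: a symbol name together with the CRC of its signature. */
struct sym_crc {
    const char *name;
    uint32_t crc;
};

/* Look up `name` in a table of exported symbols; NULL if it is not exported. */
static const struct sym_crc *find_symbol(const struct sym_crc *table,
                                         size_t count, const char *name)
{
    for (size_t i = 0; i < count; i++)
        if (strcmp(table[i].name, name) == 0)
            return &table[i];
    return NULL;
}

/*
 * Return true only if every imported symbol is exported with exactly the
 * CRC the importing module was compiled against.
 */
static bool all_crcs_match(const struct sym_crc *exports, size_t n_exports,
                           const struct sym_crc *imports, size_t n_imports)
{
    for (size_t i = 0; i < n_imports; i++) {
        const struct sym_crc *found =
            find_symbol(exports, n_exports, imports[i].name);

        if (found == NULL || found->crc != imports[i].crc)
            return false;   /* missing symbol or changed signature */
    }
    return true;
}
```

<p>A single mismatch is enough to reject the whole module, which matches the all-or-nothing behaviour described above.</p>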

<h2 id="rust-goes-dwarf">Rust goes DWARF</h2>

<p>The mechanism described here does not work with Rust.
As such the Linux kernel learned a new trick and can use the DWARF (<em>Debugging With Arbitrary Record Formats</em>) debugging information to calculate the CRC.
When <code class="language-plaintext highlighter-rouge">CONFIG_RUST</code> is enabled, <code class="language-plaintext highlighter-rouge">gendwarfksyms</code> is used instead of <code class="language-plaintext highlighter-rouge">genksyms</code>.
Both versions are incompatible as they calculate different CRCs for the same function.
But they work similarly enough, so I will not go into details here.
If you’re interested, look for <code class="language-plaintext highlighter-rouge">CONFIG_EXTENDED_MODVERSIONS</code>.</p>

<h2 id="executable-and-linkable-format">Executable and Linkable Format</h2>

<p>Linux kernel modules are object files using the <em>Executable and Linkable Format</em> (ELF).
Instead of using the well-known suffix <code class="language-plaintext highlighter-rouge">.o</code>, they use the suffix <code class="language-plaintext highlighter-rouge">.ko</code>, but are otherwise the same.
They consist of multiple <em>sections</em> containing executable code, read-only constants, initialized data and other information required for linking.</p>

<details><summary>Example: ELF sections of a Linux Kernel Module</summary>
<div>

    <div class="language-console highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="gp">$</span><span class="w"> </span>objdump <span class="nt">--section-headers</span> <span class="nt">--wide</span> avm-modver.ko
<span class="gp">$</span><span class="w"> </span><span class="nv">LC_ALL</span><span class="o">=</span>C readelf <span class="nt">--wide</span> <span class="nt">--section-headers</span> avm-modver.ko
<span class="go">There are 38 section headers, starting at offset 0x25d80:

Sections Header:
  [Nr] Name                              Type     Addresse    Off   Size ES Flg Lk Inf Al  Usage
🔵[ 0]                                   NULL            0      0      0  0      0   0  0  ELF header
🟠[ 1] .note.gnu.build-id                NOTE            0     40     24  0   A  0   0  4  unique build ID bitstring
🟣[ 2] .note.Linux                       NOTE            0     64     30  0   A  0   0  4  Architecture data
🟢[ 3] .text                             PROGBITS        0     a0     1f  0  AX  0   0 16  Code
⚪[ 4] .rela.text                        RELA            0  14e50     30 18   I 35   3  8
🔴[ 5] __ksymtab                         PROGBITS        0     c0      c  0   A  0   0  4  EXPORT_SYMBOL
⚪[ 6] .rela__ksymtab                    RELA            0  14e80     48 18   I 35   5  8
🔴[ 7] __kcrctab                         PROGBITS        0     cc      4  0   A  0   0  4  CRC
🟣[ 8] __mcount_loc                      PROGBITS        0     d0      8  0   A  0   0  1  ftrace()
⚪[ 9] .rela__mcount_loc                 RELA            0  14ec8     18 18   I 35   8  8
🟣[10] .modinfo                          PROGBITS        0     d8     92  0   A  0   0  1  MODULE_INFO
🟣[11] .return_sites                     PROGBITS        0    16a      4  0   A  0   0  1  Live patching
⚪[12] .rela.return_sites                RELA            0  14ee0     18 18   I 35  11  8
🟣[13] .call_sites                       PROGBITS        0    16e      4  0   A  0   0  1  Live patching
⚪[14] .rela.call_sites                  RELA            0  14ef8     18 18   I 35  13  8
🔴[15] __ksymtab_strings                 PROGBITS        0    172      d  1 AMS  0   0  1  EXPORT_SYMBOL
🔴[16] __versions                        PROGBITS        0    180     51  0   A  0   0 32  CRC
🟣[17] __patchable_function_entries      PROGBITS       58    1d8      8  0 WAL  3   0  8  NOPs
⚪[18] .rela__patchable_function_entries RELA            0  14f10     18 18   I 35  17  8
⚫[19] .data                             PROGBITS        0    1e0      0  0  WA  0   0  1  Initialized data
🟣[20] .gnu.linkonce.this_module         PROGBITS        0    200    500  0  WA  0   0 64
⚫[21] .bss                              NOBITS          0    700      0  0  WA  0   0  1  Uninitialized data
🟠[22] .debug_info                       PROGBITS        0    700   b51f  0      0   0  1
⚪[23] .rela.debug_info                  RELA            0  14f28   fc00 18   I 35  22  8
🟠[24] .debug_abbrev                     PROGBITS        0   bc1f    71e  0      0   0  1
🟠[25] .debug_aranges                    PROGBITS        0   c33d     50  0      0   0  1
⚪[26] .rela.debug_aranges               RELA            0  24b28     48 18   I 35  25  8
🟠[27] .debug_line                       PROGBITS        0   c38d    3be  0      0   0  1
⚪[28] .rela.debug_line                  RELA            0  24b70   1050 18   I 35  27  8
🟠[29] .debug_str                        PROGBITS        0   c74b   7792  1  MS  0   0  1
🟠[30] .debug_line_str                   PROGBITS        0  13edd    943  1  MS  0   0  1
🟡[31] .comment                          PROGBITS        0  14820     58  1  MS  0   0  1  Compiler version
🟡[32] .note.GNU-stack                   PROGBITS        0  14878      0  0      0   0  1  Stack hardening flag
🟠[33] .debug_frame                      PROGBITS        0  14878     40  0      0   0  8
⚪[34] .rela.debug_frame                 RELA            0  25bc0     30 18   I 35  33  8
🔵[35] .symtab                           SYMTAB          0  148b8    438 18     36  40  8  Symbols
🔵[36] .strtab                           STRTAB          0  14cf0    15f  0      0   0  1  Symbol names
🔵[37] .shstrtab                         STRTAB          0  25bf0    18b  0      0   0  1  Section names
Key to Flags:
  Write, Alloc, eXecute, Merge, Strings, Info, Link order, extra Os processing required, Group, TLS,
  Compressed, x=unknown, o=OS specific, Exclude, mbinD, large, processor specific
</span></code></pre></div>    </div>

    <ul>
      <li>🔴 Linux kernel module specific sections</li>
      <li>🟣 Linux specific sections</li>
      <li>⚫ data</li>
      <li>🟢 executable code</li>
      <li>⚪ relocations</li>
      <li>🟠 debug information</li>
      <li>🟡 compiler information</li>
      <li>🔵 ELF</li>
    </ul>

  </div>
</details>

<p>The <em>section names</em> have varying lengths.
As such the names are collected in their own section called <code class="language-plaintext highlighter-rouge">.shstrtab</code>, which is referenced by index in the ELF file header.
All sections are listed in the <em>section header table</em> and their names are referenced by offset.
Run <code class="language-plaintext highlighter-rouge">readelf -p .shstrtab avm-job.ko</code> to dump those names.</p>

<p>The same applies to symbols:
Their names are collected in the section <code class="language-plaintext highlighter-rouge">.strtab</code> and referenced via offset from <code class="language-plaintext highlighter-rouge">.symtab</code>.
Run <code class="language-plaintext highlighter-rouge">readelf -p .strtab avm-job.ko</code> to dump those names.</p>
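
<p>The lookup by offset can be sketched in a few lines of Python.
This is only an illustration: the helper <code class="language-plaintext highlighter-rouge">name_at</code> and the table contents are made up for this example and are not read from a real ELF file.</p>

```python
def name_at(table: bytes, offset: int) -> str:
    """Return the NUL-terminated name starting at `offset` in a string table."""
    end = table.index(b"\0", offset)
    return table[offset:end].decode("ascii")


# Hypothetical table contents: a leading NUL, then two names back to back.
shstrtab = b"\0.symtab\0.strtab\0"
print(name_at(shstrtab, 1))  # .symtab
print(name_at(shstrtab, 9))  # .strtab
```

<p>Because every name ends with a NUL byte, a header only has to store the start offset, no matter how long the name is.</p>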

<p><code class="language-plaintext highlighter-rouge">.symtab</code> contains all symbols and <code class="language-plaintext highlighter-rouge">.strtab</code> their names.
When <em>shared objects</em> (<code class="language-plaintext highlighter-rouge">.so</code>) are used, the linker moves those symbols to <code class="language-plaintext highlighter-rouge">.dynsym</code> and their names to <code class="language-plaintext highlighter-rouge">.dynstr</code>.
Symbols that are already resolved may be removed, or both tables <code class="language-plaintext highlighter-rouge">.symtab</code> and <code class="language-plaintext highlighter-rouge">.strtab</code> may be stripped completely.</p>

<p>The remaining dynamic symbols are only resolved by the <em>dynamic linker</em>, when the section is loaded.
The <em>dynamic linker</em> has to go through the section and replace the placeholders with the addresses that are valid at that point.
For that the ELF file contains the <em>relocation sections</em>, of which there are two types:
<code class="language-plaintext highlighter-rouge">REL</code> (relocation without addend) and <code class="language-plaintext highlighter-rouge">RELA</code> (relocation with addend), which allows adding a constant.
Either one may be used per section, and each relocation table references the <em>symbol table</em> it uses.</p>

<p>Not all sections are loaded into memory, and some are freed again once the linker no longer needs them.
Only those sections that contain information necessary for runtime execution of the file are kept.
Multiple (similar) sections can be combined and are then called <em>segments</em>.
But that is only relevant for fully linked executables: only they have a <em>program header</em>.</p>

<p>References to functions are then resolved by the linker and the place-holders get replaced by the real addresses.
This is where versioning kicks in.</p>

<h2 id="anatomy-of-a-linux-kernel-module">Anatomy of a Linux kernel module</h2>

<p>When you write and export a function in the Linux kernel or a module, the following happens:</p>
<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kt">void</span> <span class="nf">my_function</span><span class="p">(</span><span class="kt">void</span><span class="p">)</span> <span class="p">{</span>
    <span class="k">return</span><span class="p">;</span>
<span class="p">}</span>
</code></pre></div></div>
<ol>
  <li>The compiler/assembler puts the code into the <code class="language-plaintext highlighter-rouge">.text</code> section.</li>
  <li>The name of the function is added to the <code class="language-plaintext highlighter-rouge">.strtab</code> section.</li>
  <li>An entry is added to the <code class="language-plaintext highlighter-rouge">.symtab</code> section linking the offset within the <code class="language-plaintext highlighter-rouge">.text</code> section to the name via its offset in the <code class="language-plaintext highlighter-rouge">.strtab</code> section.</li>
</ol>

<p>Using <code class="language-plaintext highlighter-rouge">EXPORT_SYMBOL</code> adds more magic:</p>
<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="cp">#include</span> <span class="cpf">&lt;linux/module.h&gt;</span><span class="cp">
</span><span class="n">EXPORT_SYMBOL</span><span class="p">(</span><span class="n">my_function</span><span class="p">);</span>
</code></pre></div></div>
<ol>
  <li>It puts the name of the function into a section called <code class="language-plaintext highlighter-rouge">__ksymtab_strings</code>.
    <div class="language-console highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="gp"> $</span><span class="w"> </span><span class="nv">LC_ALL</span><span class="o">=</span>C readelf <span class="nt">--wide</span> <span class="nt">--string-dump</span><span class="o">=</span>__ksymtab_strings avm-modver.ko
<span class="go"> String dump of section '__ksymtab_strings':
   [     0]  my_function
</span></code></pre></div>    </div>
  </li>
  <li>It creates a new section called <code class="language-plaintext highlighter-rouge">__ksymtab+my_function</code> with a single <code class="language-plaintext highlighter-rouge">struct kernel_symbol</code> linking the address of the function to its name.
 Later on these sections will be collected by the linker script <code class="language-plaintext highlighter-rouge">scripts/module-common.lds</code> and will be put into the section called <code class="language-plaintext highlighter-rouge">__ksymtab</code>.
 Something similar happens for <code class="language-plaintext highlighter-rouge">EXPORT_SYMBOL_GPL</code>, <code class="language-plaintext highlighter-rouge">EXPORT_SYMBOL_GPL_FUTURE</code> and <code class="language-plaintext highlighter-rouge">EXPORT_SYMBOL_NS</code>, but with different prefixes.
    <div class="language-console highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="gp"> $</span><span class="w"> </span><span class="nv">LC_ALL</span><span class="o">=</span>C readelf <span class="nt">--wide</span> <span class="nt">--relocated-dump</span><span class="o">=</span>__ksymtab <span class="nt">--relocs</span> avm-modver.ko | <span class="nb">grep</span> <span class="nt">-A4</span> __ksymtab
<span class="go"> Relocation section '.rela__ksymtab' at offset 0x14fc8 contains 3 entries:
     Offset             Info             Type               Symbol's Value  Symbol's Name + Addend
 0000000000000000  0000002e00000002 R_X86_64_PC32          0000000000000010 my_function + 0
 0000000000000004  0000001b00000002 R_X86_64_PC32          0000000000000000 __kstrtab_my_function + 0
 0000000000000008  0000001c00000002 R_X86_64_PC32          000000000000000c __kstrtabns_my_function + 0
 --
 Hex dump of section '__ksymtab':
   0x00000000 10000000 fcffffff 04000000          ............
</span></code></pre></div>    </div>
  </li>
</ol>
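
<p>The three relocations above explain the bytes in the hex dump.
A minimal sketch in Python, assuming the usual <code class="language-plaintext highlighter-rouge">R_X86_64_PC32</code> rule "symbol value plus addend minus place" and assuming that every section starts at address 0, which seems to be what <code class="language-plaintext highlighter-rouge">readelf</code> does for an unlinked object.
The helper name <code class="language-plaintext highlighter-rouge">pc32</code> is made up for this example.</p>

```python
def pc32(symbol_value: int, addend: int, place: int) -> bytes:
    """R_X86_64_PC32: store S + A - P as a 32-bit little-endian value."""
    value = (symbol_value + addend - place) % 2**32
    return value.to_bytes(4, "little")


# (place, symbol value, addend) taken from the relocation table above.
relocs = [
    (0x0, 0x10, 0),  # my_function
    (0x4, 0x00, 0),  # __kstrtab_my_function
    (0x8, 0x0C, 0),  # __kstrtabns_my_function
]
section = b"".join(pc32(s, a, p) for p, s, a in relocs)
print(section.hex(" ", 4))  # 10000000 fcffffff 04000000
```

<p>The second value is negative (-4), which is why it shows up as <code class="language-plaintext highlighter-rouge">fcffffff</code>.</p>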

<p>To see more details, use <code class="language-plaintext highlighter-rouge">make avm-modver.i</code> to run the pre-processor and to get the intermediate file, where all macros have been expanded.</p>

<p>With <code class="language-plaintext highlighter-rouge">CONFIG_MODVERSIONS</code> enabled even more magic happens.
If a module uses <code class="language-plaintext highlighter-rouge">EXPORT_SYMBOL</code>, then <code class="language-plaintext highlighter-rouge">genksyms</code> is called.
The source code of the module is pre-processed again via <code class="language-plaintext highlighter-rouge">cpp</code>, but with a different definition for <code class="language-plaintext highlighter-rouge">EXPORT_SYMBOL</code>.</p>
<ol>
  <li>For each function exported via <code class="language-plaintext highlighter-rouge">EXPORT_SYMBOL</code> a CRC for the function signature is computed by parsing the C function declaration.
 A new section called <code class="language-plaintext highlighter-rouge">___kcrctab+my_function</code> with a single <code class="language-plaintext highlighter-rouge">long</code> containing the CRC is created.
 Later on these sections will be collected by the linker script <code class="language-plaintext highlighter-rouge">scripts/module-common.lds</code> and will be put into the section called <code class="language-plaintext highlighter-rouge">__kcrctab</code>.
 Something similar happens for <code class="language-plaintext highlighter-rouge">EXPORT_SYMBOL_GPL</code>, <code class="language-plaintext highlighter-rouge">EXPORT_SYMBOL_GPL_FUTURE</code> and <code class="language-plaintext highlighter-rouge">EXPORT_SYMBOL_NS</code>, but with different prefixes.</li>
  <li>For each used symbol the CRC is looked up in the <code class="language-plaintext highlighter-rouge">Module.symvers</code> files.
 They are created as part of the kernel or any module compilation process when <code class="language-plaintext highlighter-rouge">CONFIG_MODVERSIONS</code> is enabled.
 The file collects the CRC and module path for each symbol.
 The symbol name and its CRC are collected in a <code class="language-plaintext highlighter-rouge">const char __versions[]</code> array in section <code class="language-plaintext highlighter-rouge">__versions</code>.</li>
</ol>
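
<p>The lookup in <code class="language-plaintext highlighter-rouge">Module.symvers</code> from the second step can be sketched in Python.
This assumes the tab-separated column order CRC, symbol, module, export type, followed by an optional namespace.
The names <code class="language-plaintext highlighter-rouge">Export</code> and <code class="language-plaintext highlighter-rouge">parse_symvers</code> and the sample line are made up for this example; it is not what <code class="language-plaintext highlighter-rouge">modpost</code> does internally.</p>

```python
from typing import NamedTuple


class Export(NamedTuple):
    crc: int
    symbol: str
    module: str
    export_type: str


def parse_symvers(text: str) -> dict[str, Export]:
    """Map each symbol name to its entry from Module.symvers-style text."""
    exports = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        crc, symbol, module, export_type = line.split("\t")[:4]
        exports[symbol] = Export(int(crc, 16), symbol, module, export_type)
    return exports


# Hypothetical line; module and export type are assumptions for illustration.
sample = "0xbdfb6dbb\t__fentry__\tvmlinux\tEXPORT_SYMBOL\t\n"
print(hex(parse_symvers(sample)["__fentry__"].crc))  # 0xbdfb6dbb
```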

<h2 id="module-loading">Module loading</h2>

<p>When a kernel module is loaded, the Linux kernel linker resolves all dynamic symbols of the module.
It looks up each unresolved symbol from <code class="language-plaintext highlighter-rouge">.symtab</code> and resolves it against all symbols loaded so far.
You can view them from user-space in <code class="language-plaintext highlighter-rouge">/proc/kallsyms</code>.</p>

<p>In addition to that simple lookup the loader also checks the module's license from <code class="language-plaintext highlighter-rouge">.modinfo</code>:
Symbols exported via <code class="language-plaintext highlighter-rouge">EXPORT_SYMBOL_GPL</code> can only be resolved if the module declares <code class="language-plaintext highlighter-rouge">MODULE_LICENSE("GPL")</code> or another GPL-compatible license.</p>

<p>When <code class="language-plaintext highlighter-rouge">CONFIG_MODVERSIONS</code> is enabled, the linker inside the Linux kernel also checks the CRC:
For every undefined symbol there is a matching entry for it in section <code class="language-plaintext highlighter-rouge">__versions</code>, which contains the CRC of the symbol from compile time.</p>
<div class="language-console highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="gp">$</span><span class="w"> </span><span class="nv">LC_ALL</span><span class="o">=</span>C readelf <span class="nt">--wide</span> <span class="nt">-s</span> avm-modver.ko | <span class="nb">grep </span>UND
<span class="go">     0: 0000000000000000     0 NOTYPE  LOCAL  DEFAULT  UND
    42: 0000000000000000     0 NOTYPE  GLOBAL DEFAULT  UND __fentry__
    43: 0000000000000000     0 NOTYPE  GLOBAL DEFAULT  UND _printk
    44: 0000000000000000     0 NOTYPE  GLOBAL DEFAULT  UND __x86_return_thunk
</span></code></pre></div></div>

<p>But there are two different layouts used:</p>

<h3 id="upstream-linux-kernel">Upstream Linux Kernel</h3>

<div class="language-console highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="gp">$</span><span class="w"> </span><span class="nv">LC_ALL</span><span class="o">=</span>C readelf <span class="nt">--wide</span> <span class="nt">--hex-dump</span><span class="o">=</span>__versions avm-modver.ko
<span class="go">Hex dump of section '__versions':
  0x00000000 bb6dfbbd 00000000 5f5f6665 6e747279 .m......__fentry
  0x00000010 5f5f0000 00000000 00000000 00000000 __..............
  0x00000020 00000000 00000000 00000000 00000000 ................
  0x00000030 00000000 00000000 00000000 00000000 ................
  0x00000040 d87e9992 00000000 5f707269 6e746b00 .~......_printk.
  0x00000050 00000000 00000000 00000000 00000000 ................
  0x00000060 00000000 00000000 00000000 00000000 ................
  0x00000070 00000000 00000000 00000000 00000000 ................
  0x00000080 cb8119bf 00000000 6d6f6475 6c655f6c ........module_l
  0x00000090 61796f75 74000000 00000000 00000000 ayout...........
  0x000000a0 00000000 00000000 00000000 00000000 ................
  0x000000b0 00000000 00000000 00000000 00000000 ................
</span></code></pre></div></div>

<!-- ~/REPOS/LINUX/linux/scripts/mod/modpost.c -->
<p>The original Linux kernel uses <code class="language-plaintext highlighter-rouge">const struct modversion_info __version[]</code>.
The structure has a fixed size of 64 bytes:</p>
<ul>
  <li>the first 8 bytes contain the CRC.</li>
  <li>the remaining 56 bytes contain the symbol name.</li>
</ul>

<p>Longer symbol names are not supported and require the use of the <em>extended modversions</em>.</p>
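
<p>Reading this fixed layout is straightforward.
A minimal sketch in Python, assuming little-endian byte order as on x86_64; the function name <code class="language-plaintext highlighter-rouge">parse_fixed</code> is made up for this example, and the test data are the first two entries from the hex dump above.</p>

```python
ENTRY_SIZE = 64  # fixed size of one entry
CRC_SIZE = 8     # the CRC field; the name fills the remaining 56 bytes


def parse_fixed(section: bytes) -> dict[str, int]:
    """Split a `__versions` section into {symbol name: CRC}."""
    result = {}
    for start in range(0, len(section), ENTRY_SIZE):
        entry = section[start:start + ENTRY_SIZE]
        crc = int.from_bytes(entry[:CRC_SIZE], "little")
        name = entry[CRC_SIZE:].split(b"\0", 1)[0].decode("ascii")
        result[name] = crc
    return result


# First two entries of the hex dump above; names are padded with NUL bytes.
dump = bytes.fromhex(
    "bb6dfbbd00000000" + b"__fentry__".hex().ljust(112, "0")
    + "d87e999200000000" + b"_printk".hex().ljust(112, "0")
)
for name, crc in parse_fixed(dump).items():
    print(f"{name}: {crc:#010x}")
```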

<h3 id="ubuntu-linux-kernel">Ubuntu Linux Kernel</h3>

<div class="language-console highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="gp">$</span><span class="w"> </span><span class="nv">LC_ALL</span><span class="o">=</span>C readelf <span class="nt">--wide</span> <span class="nt">--hex-dump</span><span class="o">=</span>__versions avm-modver.ko
<span class="go">Hex dump of section '__versions':
  0x00000000 14000000 bb6dfbbd 5f5f6665 6e747279 .....m..__fentry
  0x00000010 5f5f0000 10000000 7e3a2c12 5f707269 __......~:,._pri
  0x00000020 6e746b00 1c000000 ca39825b 5f5f7838 ntk......9.[__x8
  0x00000030 365f7265 7475726e 5f746875 6e6b0000 6_return_thunk..
  0x00000040 18000000 eb7b33e1 6d6f6475 6c655f6c .....{3.module_l
  0x00000050 61796f75 74000000 00000000 00000000 ayout...........
  0x00000060 00
</span></code></pre></div></div>

<!-- /usr/src/linux-headers-6.8.0-84-generic/scripts/mod/modpost.c -->
<p>Ubuntu has changed this and uses <code class="language-plaintext highlighter-rouge">const char ____versions[]</code>:</p>
<ul>
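  <li>the first 4 bytes contain the size of the entry, which is also the offset to the next entry.</li>
  <li>the next 4 bytes contain the CRC.</li>
  <li>next follows the symbol name with a terminating NUL byte.</li>
  <li>more NUL bytes for padding up to the next address divisible by 4.</li>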
</ul>

<p>Ubuntu changed this to support longer symbol names, which Ubuntu claims is required for Rust support.
See <a href="https://lists.ubuntu.com/archives/kernel-team/2023-March/137814.html">modpost: support arbitrary symbol length in modversion</a> for details.
This has been reverted by <a href="https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2039010">2039010</a>.</p>
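
<p>The variable-length layout can be walked entry by entry.
A minimal sketch in Python, based only on what the hex dump above shows: a 4-byte entry size, a 4-byte CRC, then the NUL-padded name, and an all-zero entry at the end.
The byte order is assumed to be little-endian and the function name <code class="language-plaintext highlighter-rouge">parse_variable</code> is made up for this example.</p>

```python
def parse_variable(section: bytes) -> dict[str, int]:
    """Walk size-prefixed entries until a zero size or the end is reached."""
    result = {}
    pos = 0
    while len(section) - pos >= 8:
        size = int.from_bytes(section[pos:pos + 4], "little")
        if size == 0:
            break  # all-zero terminator entry
        crc = int.from_bytes(section[pos + 4:pos + 8], "little")
        name = section[pos + 8:pos + size].split(b"\0", 1)[0].decode("ascii")
        result[name] = crc
        pos += size
    return result


# First two entries of the hex dump above, followed by a zero terminator.
dump = bytes.fromhex(
    "14000000" "bb6dfbbd" "5f5f66656e7472795f5f0000"
    "10000000" "7e3a2c12" "5f7072696e746b00"
    "0000000000000000"
)
for name, crc in parse_variable(dump).items():
    print(f"{name}: {crc:#010x}")
```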

<p>…</p>

<h2 id="further-reading">Further reading</h2>
<ul>
  <li>Linux: <a href="https://docs.kernel.org/kbuild/modules.html#module-versioning">Building External Modules - Module Versioning</a></li>
  <li><abbr title="Linux Weekly News">LWN</abbr>: <a href="https://lwn.net/Articles/986892/">A new version of modversions</a></li>
  <li><a href="https://terenceli.github.io/技术/2018/06/02/linux-loadable-module">Anatomy of the Linux loadable kernel module</a></li>
  <li>Linux manual page: <a href="https://man7.org/linux/man-pages/man5/elf.5.html">Executable and Linking Format</a></li>
</ul>

<!-- *[FD]: File Daemon -->
<!-- *[FD]: File Descriptor -->
<!-- *[GPT]: Generative Pre-trained Transformer -->
<!-- *[GPT]: Global Partitioning Table -->
<!-- *[GPT]: GUID Partition Table -->]]></content><author><name>Philipp Hahn</name></author><category term="linux" /><summary type="html"><![CDATA[The Linux kernel itself and its modules may export symbols, so that other modules can import and use them. As the functions are written in C, it is important that the function signature matches: the number of arguments must match; the ordering of the arguments must match; the data types must match, which includes the structure and layout of all input and output parameters. If any of them changes, the Application Binary Interface (ABI) changes and you risk crashing the kernel. If you’re lucky, recompiling the kernel and the modules is enough for both ends to pick up the new Application Programming Interface (API). To detect such breaking changes, the Linux kernel can be compiled with CONFIG_MODVERSIONS enabled: This calculates a Cyclic Redundancy Check (CRC) checksum over the function signature and embeds this information with the kernel and the modules. The dynamic linker of the Linux kernel checks that for each requested symbol its CRC matches the CRC of the Linux kernel or already loaded modules. A module is only loaded if a match is found for all symbols. Otherwise loading fails.]]></summary></entry><entry><title type="html">Shell-trivia #3: set -e</title><link href="https://blog.pmhahn.de/shell-trivia-3-set-e/" rel="alternate" type="text/html" title="Shell-trivia #3: set -e" /><published>2025-08-13T08:39:00+02:00</published><updated>2025-08-13T08:39:00+02:00</updated><id>https://blog.pmhahn.de/shell-trivia-3-set-e</id><content type="html" xml:base="https://blog.pmhahn.de/shell-trivia-3-set-e/"><![CDATA[<p>There have already been two blog posts, <a href="/shell-trivia-1-set-e/">Shell-trivia #1</a> and <a href="/shell-trivia-2-set-e/">Shell-trivia #2</a>, on the topic of <code class="language-plaintext highlighter-rouge">set -e</code>.
But this morning my colleague N. Schier surprised me with yet another shell absurdity:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c">#!/bin/sh</span>
<span class="nb">set</span> <span class="nt">-e</span>
<span class="nb">date</span> <span class="o">&amp;&amp;</span> <span class="nb">false</span> <span class="o">&amp;&amp;</span> <span class="nb">true
date</span>
</code></pre></div></div>

<p>How many times is <code class="language-plaintext highlighter-rouge">date</code> executed?</p>

<!--more-->

<p>As usual you have to read the <a href="https://manpages.debian.org/stretch/bash/bash.1.en.html#Shell_Function_Definitions">manual page of bash</a> very carefully:</p>

<blockquote>
  <p>The ERR trap is <strong>not</strong> executed if the failed command is … part of a command executed in a &amp;&amp; or || list except the command <strong>following the final</strong> &amp;&amp; or ||.</p>
</blockquote>

<p>So the correct answer is: 2</p>

<p>With <code class="language-plaintext highlighter-rouge">date &amp;&amp; false &amp;&amp; true</code> execution ends after the <code class="language-plaintext highlighter-rouge">false</code> and the exit code is 1:
The following <code class="language-plaintext highlighter-rouge">&amp;&amp; …</code> is no longer executed.
But as the <code class="language-plaintext highlighter-rouge">false</code> is therefore not the last command, <code class="language-plaintext highlighter-rouge">set -e</code> does not abort and the second <code class="language-plaintext highlighter-rouge">date</code> is executed anyway.</p>
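
<p>The behaviour is easy to verify from Python.
This is a small sketch that assumes a POSIX <code class="language-plaintext highlighter-rouge">sh</code> is available in <code class="language-plaintext highlighter-rouge">PATH</code>; it uses <code class="language-plaintext highlighter-rouge">echo</code> instead of <code class="language-plaintext highlighter-rouge">date</code> so that the output is predictable.</p>

```python
import subprocess

# Same structure as the script above, with echo in place of date.
SCRIPT = """
set -e
echo first && false && true
echo second
"""

result = subprocess.run(["sh", "-c", SCRIPT], capture_output=True, text=True)
print(result.stdout.split())  # ['first', 'second']
print(result.returncode)      # 0
```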

<p>Keep this in mind when <a href="https://www.shellcheck.net/">shellcheck</a> spits out the following warnings and encourages you to use more <code class="language-plaintext highlighter-rouge">&amp;&amp;</code>:</p>
<ul>
  <li><a href="https://www.shellcheck.net/wiki/SC2015">SC2015</a>: Note that <code class="language-plaintext highlighter-rouge">A &amp;&amp; B || C</code> is not if-then-else. <code class="language-plaintext highlighter-rouge">C</code> may run when <code class="language-plaintext highlighter-rouge">A</code> is true.</li>
  <li><a href="https://www.shellcheck.net/wiki/SC2166">SC2166</a>: Prefer <code class="language-plaintext highlighter-rouge">[ p ] &amp;&amp; [ q ]</code> as <code class="language-plaintext highlighter-rouge">[ p -a q ]</code> is not well-defined.</li>
</ul>

<p>This makes it easy to introduce semantic differences.</p>

<p>Or, to put it in the words of G. Aschemann from 1995:</p>
<blockquote>
  <p>Every good shell script starts with #!/usr/bin/perl.</p>
</blockquote>

<p>Well, that was 30 years ago and I would rather replace <code class="language-plaintext highlighter-rouge">perl</code> with <code class="language-plaintext highlighter-rouge">python</code> 😉</p>

<!-- *[FD]: File Daemon -->
<!-- *[FD]: File Descriptor -->
<!-- *[GPT]: Generative Pre-trained Transformer -->
<!-- *[GPT]: Global Partitioning Table -->
<!-- *[GPT]: GUID Partition Table -->]]></content><author><name>Philipp Hahn</name></author><category term="shell" /><summary type="html"><![CDATA[There have already been two blog posts, Shell-trivia #1 and Shell-trivia #2, on the topic of set -e. But this morning my colleague N. Schier surprised me with yet another shell absurdity: #!/bin/sh set -e date &amp;&amp; false &amp;&amp; true date How many times is date executed?]]></summary></entry><entry><title type="html">Debian 13 Trixie released</title><link href="https://blog.pmhahn.de/debian-13-trixie/" rel="alternate" type="text/html" title="Debian 13 Trixie released" /><published>2025-08-11T10:31:00+02:00</published><updated>2025-08-11T10:31:00+02:00</updated><id>https://blog.pmhahn.de/debian-13-trixie</id><content type="html" xml:base="https://blog.pmhahn.de/debian-13-trixie/"><![CDATA[<p>Last Saturday - 2025-08-09 - <a href="https://www.debian.org/News/2025/20250809">Debian 13 “Trixie”</a> was released after 2 years of work. 🥳</p>

<p>I just updated my laptop and servers and stumbled upon some issues:</p>

<!--more-->

<h2 id="cyrus-imapd"><code class="language-plaintext highlighter-rouge">cyrus-imapd</code></h2>

<p>I’m running my own mail server infrastructure:
I have grown up with <em>Unix-to-Unix-Copy-Protocol</em> (UUCP) and was an admin for <em>UUCP Freunde Lahn e.V.</em> for a long time.
I’m still using <em>Postfix</em> with <em>Cyrus IMAPd</em> and never switched to <em>Dovecot</em>.</p>

<p>After the upgrade I noticed that no mails were delivered:
<code class="language-plaintext highlighter-rouge">mailq</code> showed a growing list of mails stuck in the queue.
Postfix was complaining that its <code class="language-plaintext highlighter-rouge">lmtp</code> service was no longer able to establish an encrypted connection to <code class="language-plaintext highlighter-rouge">lmtpd</code> from Cyrus.</p>

<p>For historical reasons my setup is using <code class="language-plaintext highlighter-rouge">STARTTLS</code>, which is now deprecated and has been disabled by default in Cyrus IMAPd.
You have to explicitly re-enable it in your <code class="language-plaintext highlighter-rouge">/etc/imapd.conf</code> by adding some lines:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>imap_allowstarttls: yes
lmtp_allowstarttls: yes
</code></pre></div></div>

<h2 id="saslauthd"><code class="language-plaintext highlighter-rouge">saslauthd</code></h2>

<p><code class="language-plaintext highlighter-rouge">saslauthd.service</code> failed to start as I had to move its UNIX socket to <code class="language-plaintext highlighter-rouge">/var/spool/postfix/var/run/saslauthd/</code>.
This also moves the location of the <abbr title="Process Identifier">PID</abbr> file to that directory, which then no longer matches the information in <code class="language-plaintext highlighter-rouge">/usr/lib/systemd/system/saslauthd.service</code>, which expects the file in <code class="language-plaintext highlighter-rouge">/var/run/saslauthd.pid</code>.</p>

<p>I fixed this by creating an override with <code class="language-plaintext highlighter-rouge">systemctl edit saslauthd.service</code>:</p>
<div class="language-ini highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nn">[Service]</span>
<span class="py">PIDFile</span><span class="p">=</span><span class="s">/var/spool/postfix/var/run/saslauthd/saslauthd.pid</span>
</code></pre></div></div>

<p>Previously I had a shell-hack in <code class="language-plaintext highlighter-rouge">/etc/default/saslauthd</code> to replace the old location with a symbolic link to the <code class="language-plaintext highlighter-rouge">chroot</code>-location.
This no longer works as that file is not sourced by <code class="language-plaintext highlighter-rouge">systemd</code>, which does not execute that shell code.
Therefore I had to tell Cyrus IMAPd to also use that changed location by putting this into <code class="language-plaintext highlighter-rouge">/etc/imapd.conf</code>:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>sasl_saslauthd_path: /var/spool/postfix/var/run/saslauthd/mux
</code></pre></div></div>

<p>PS: On a side note: <code class="language-plaintext highlighter-rouge">/var/run/</code> is deprecated and should be replaced by just <code class="language-plaintext highlighter-rouge">/run/</code>; <code class="language-plaintext highlighter-rouge">systemd</code> already complains about this every time it sees <code class="language-plaintext highlighter-rouge">/var/run/</code>.</p>

<h2 id="dockerio-and-libvirt"><code class="language-plaintext highlighter-rouge">docker.io</code> and <code class="language-plaintext highlighter-rouge">libvirt</code></h2>

<p>For some unknown reason <code class="language-plaintext highlighter-rouge">docker.io</code> and <code class="language-plaintext highlighter-rouge">libvirt</code> got removed during the upgrade.
Running <code class="language-plaintext highlighter-rouge">apt autopurge</code> afterwards was a very bad idea as that purged all images, volumes and containers. 🤦</p>

<p>I have to investigate why that happened. 🔍</p>

<h2 id="php-84">PHP-8.4</h2>

<p>Debian-13-Trixie has PHP-8.4, while Debian-12-Bookworm had PHP-8.2.
My local NextCloud (and Wordpress) setup was unhappy about that as it needs several <code class="language-plaintext highlighter-rouge">php8.4-…</code> packages.
Luckily, just installing the <code class="language-plaintext highlighter-rouge">php8.4-…</code> equivalents of the previously installed packages fixed this.</p>

<h2 id="kde"><abbr title="K Desktop Environment">KDE</abbr></h2>

<p>In the past I did not install <code class="language-plaintext highlighter-rouge">kde-full</code> as it depends on many optional packages like KMail, KOrganizer, DragonPlayer, and such.
I don’t use many of those and thus don’t want them to be installed.
During the upgrade <code class="language-plaintext highlighter-rouge">plasmashell</code> got removed so on the next login I did not get back a working <abbr title="K Desktop Environment">KDE</abbr> session.
Installing <code class="language-plaintext highlighter-rouge">kde-standard</code> fixed this.
As it only <code class="language-plaintext highlighter-rouge">Recommends</code> most other packages, I was able to get rid of those packages I do not want.</p>

<p>And I got Wayland, which has this annoying bug: Konsole no longer stores the open sessions and starts with only one shell in <code class="language-plaintext highlighter-rouge">$HOME</code>. 🤔</p>

<h2 id="out-of-space-usr">Out-of-space <code class="language-plaintext highlighter-rouge">/usr</code></h2>

<p>My desktop system has many packages.
Upgrading all those (<abbr title="K Desktop Environment">KDE</abbr>-)libraries required too much space on <code class="language-plaintext highlighter-rouge">/usr</code>.
<code class="language-plaintext highlighter-rouge">dpkg</code> failed to unpack a package during upgrade.</p>

<p>After some manual <code class="language-plaintext highlighter-rouge">dpkg --configure --pending</code>, <code class="language-plaintext highlighter-rouge">apt install --fix-broken</code>, <code class="language-plaintext highlighter-rouge">apt autopurge</code> and <code class="language-plaintext highlighter-rouge">dpkg -P</code> I was finally able to continue.
I would have expected <abbr title="Advanced Packaging Tool">APT</abbr> to check for enough disk space, but apparently it does not.
So double-check manually before doing an upgrade.</p>
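
<p>Such a manual check can be scripted.
A minimal sketch in Python using <code class="language-plaintext highlighter-rouge">shutil.disk_usage</code>; the helper <code class="language-plaintext highlighter-rouge">has_free_space</code> and the threshold of 2 GiB are assumptions for this example, the real requirement depends on the packages being upgraded.</p>

```python
import shutil


def has_free_space(path: str, required_bytes: int) -> bool:
    """Return True if the file system holding `path` has enough free space."""
    return shutil.disk_usage(path).free >= required_bytes


# Hypothetical threshold: 2 GiB for the file system holding /usr.
if not has_free_space("/usr", 2 * 1024**3):
    print("Not enough free space on /usr, clean up before upgrading")
```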

<p>PS: Afterwards <code class="language-plaintext highlighter-rouge">systemd</code> complains about <code class="language-plaintext highlighter-rouge">usr-not-merged</code>, but that is <a href="https://www.debian.org/releases/trixie/release-notes/issues.html#systemd-message-system-is-tainted-unmerged-bin">normal and expected</a>.</p>

<h2 id="keepassxc">KeePassXC</h2>

<p>I used a self-compiled version of KeePassXC.
Debian now has two packages <code class="language-plaintext highlighter-rouge">keepassxc</code> and <code class="language-plaintext highlighter-rouge">keepassxc-full</code> – the latter has support for browser integration and more.
As some files have been moved, the upgrade failed and I had to manually remove my self-compiled version.</p>

<h2 id="network">Network</h2>

<p>Running the upgrade while being logged into <abbr title="K Desktop Environment">KDE</abbr> is not a good idea:
During the upgrade NetworkManager got restarted and killed my local network connection.
Afterwards even <code class="language-plaintext highlighter-rouge">ping</code> no longer worked, as I already had the <a href="https://www.debian.org/releases/trixie/release-notes/issues.de.html#ping-no-longer-runs-with-elevated-privileges">new version</a> but still the old Linux kernel.</p>

<p>Sadly I still need my <code class="language-plaintext highlighter-rouge">r8168-dkms</code> and <code class="language-plaintext highlighter-rouge">v4l2loopback-dkms</code> packages.</p>

<h2 id="prometheus-mysqlmariadb-exporter">Prometheus MySQL/MariaDB exporter</h2>

<p><a href="https://github.com/prometheus/mysqld_exporter/releases/tag/v0.15.0">v0.15.0</a> has a breaking change, which is neither mentioned in any <code class="language-plaintext highlighter-rouge">NEWS</code> file nor the <a href="https://salsa.debian.org/go-team/packages/prometheus-mysqld-exporter/-/blob/debian/sid/debian/changelog?ref_type=heads">debian/changelog</a>.</p>
<ul>
  <li><code class="language-plaintext highlighter-rouge">DATA_SOURCE_NAME</code> is no longer supported and you must pass the credentials via <code class="language-plaintext highlighter-rouge">--mysqld.username=</code> and via <code class="language-plaintext highlighter-rouge">MYSQLD_EXPORTER_PASSWORD=</code>.</li>
  <li>You also <a href="https://github.com/prometheus/mysqld_exporter/issues/754">cannot specify the UNIX domain socket</a> <code class="language-plaintext highlighter-rouge">/run/mysqld/mysqld.sock</code></li>
</ul>

<p>I’m now using <code class="language-plaintext highlighter-rouge">--config.my-cnf /var/lib/prometheus/mysql.cnf</code> to configure the credentials via another file.</p>

<h2 id="mailman3">Mailman3</h2>

<h3 id="cron">cron</h3>

<p><code class="language-plaintext highlighter-rouge">mailman3-web</code> still runs a CRON job <strong>every minute</strong>, which imports <code class="language-plaintext highlighter-rouge">robot_detection</code>, which spams you with a ton of <code class="language-plaintext highlighter-rouge">SyntaxWarning</code>s.
See <a href="https://bugs.debian.org/1082541">mailman3-web#1082541</a> and <a href="https://bugs.debian.org/1078661">python3-robot-detection#1078661</a></p>

<p>Edit <code class="language-plaintext highlighter-rouge">/etc/cron.d/mailman3-web</code> and add <code class="language-plaintext highlighter-rouge">2&gt;/dev/null</code> to each command.</p>

<h3 id="authentication">authentication</h3>

<p>Authentication was now broken for me:
<code class="language-plaintext highlighter-rouge">/var/log/mailman3/web/mailman-web.log</code> complains about <code class="language-plaintext highlighter-rouge">Missing column 'socialaccount_socialapp.provider_id'</code>.
Run <code class="language-plaintext highlighter-rouge">/usr/bin/mailman-web migrate</code> as user <code class="language-plaintext highlighter-rouge">root</code> to fix this.</p>

<h3 id="template">template</h3>

<p><code class="language-plaintext highlighter-rouge">/var/log/mailman3/web/mailman-web.log</code> showed another error:</p>
<blockquote>
  <p>django.template.exceptions.TemplateSyntaxError: ‘humanize’ is not a registered tag library.</p>
</blockquote>

<p>Adding <code class="language-plaintext highlighter-rouge">django.contrib.humanize</code> to <code class="language-plaintext highlighter-rouge">INSTALLED_APPS</code> in <code class="language-plaintext highlighter-rouge">/etc/mailman3/mailman-web.py</code> fixes this.</p>

<!-- *[FD]: File Daemon -->
<!-- *[FD]: File Descriptor -->
<!-- *[GPT]: Generative Pre-trained Transformer -->
<!-- *[GPT]: Global Partitioning Table -->
<!-- *[GPT]: GUID Partition Table -->]]></content><author><name>Philipp Hahn</name></author><category term="debian" /><summary type="html"><![CDATA[Last Saturday - 2025-08-09 - Debian 13 “Trixie” has been released after 2 years of work. 🥳]]></summary></entry><entry><title type="html">shell `trap` signal</title><link href="https://blog.pmhahn.de/shell-trap-signal/" rel="alternate" type="text/html" title="shell `trap` signal" /><published>2025-06-30T07:47:00+02:00</published><updated>2025-06-30T07:47:00+02:00</updated><id>https://blog.pmhahn.de/shell-trap-signal</id><content type="html" xml:base="https://blog.pmhahn.de/shell-trap-signal/"><![CDATA[<p>What’s wrong with signal handling like this:</p>
<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c">#!/bin/sh</span>
<span class="nb">trap</span> <span class="s1">'echo Cleanup…'</span> EXIT HUP INT TERM
...
</code></pre></div></div>

<!--more-->

<h2 id="exit-and-signals">Exit and signals</h2>

<p>Before we begin:
Actually <em>exit codes</em> are <strong>mutually exclusive</strong> with <em>signal statuses</em>:
A process may either exit normally using <code class="language-plaintext highlighter-rouge">exit</code> or terminate via a signal.</p>

<p>If you read <a href="man:bash(1)">man:bash</a> you will find this:</p>
<blockquote>
  <p>The return value of a simple command is its exit status, or 128+n if the command is terminated by signal n.</p>
</blockquote>

<p>That might give you the idea that they are the same, but that is only a (broken) shell convention to map <em>signal statuses</em> to <em>exit codes</em>.
Reading <a href="man:exit(2)">man:exit</a> you see this:</p>
<blockquote>
  <p>The value status &amp; 0xFF is returned to the parent process as the process’s exit status,</p>
</blockquote>

<p>So there are 256 exit codes from 0 to 255, which a process can use to exit.</p>

<p>The parent process then uses <a href="man:waitpid(2)">waitpid()</a> to wait for the child’s <em>state change</em>:</p>
<blockquote>
  <p>That may be the process exited by calling <code class="language-plaintext highlighter-rouge">exit()</code> itself or caught a <code class="language-plaintext highlighter-rouge">signal()</code>, which might have <code class="language-plaintext highlighter-rouge">kill()</code>ed the process or just suspended it.</p>
</blockquote>

<p>You first have to use <code class="language-plaintext highlighter-rouge">WIFEXITED()</code> or <code class="language-plaintext highlighter-rouge">WIFSIGNALED()</code> to check whether the child exited normally via <code class="language-plaintext highlighter-rouge">exit()</code> or was terminated by a signal.
Only after that should you either use <code class="language-plaintext highlighter-rouge">WEXITSTATUS()</code> to extract the byte containing the <em>exit code</em> or use <code class="language-plaintext highlighter-rouge">WTERMSIG()</code> to extract the <em>signal number</em>.</p>

<p>In a shell script you do not have access to these low-level C functions, but only get the mangled exit status.
You cannot distinguish whether the called process did <code class="language-plaintext highlighter-rouge">exit(130)</code> itself or was terminated by the user pressing <em>Ctrl-C</em> to send <code class="language-plaintext highlighter-rouge">SIGINT</code> to it.</p>
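
<p>A minimal sketch of that ambiguity. It uses <code class="language-plaintext highlighter-rouge">SIGTERM</code> (signal 15, so 128+15=143) instead of <code class="language-plaintext highlighter-rouge">SIGINT</code>, on the assumption that a script started in the background may have <code class="language-plaintext highlighter-rouge">SIGINT</code> ignored; the variable names are only illustrative:</p>

```sh
#!/bin/sh
# Child 1 exits normally with exit code 143.
sh -c 'exit 143'
rv_exit=$?

# Child 2 is terminated by SIGTERM (signal 15), which it sends to itself.
sh -c 'kill -TERM "$$"'
rv_signal=$?

# Both cases look identical to the calling shell.
echo "exit 143: rv=$rv_exit, SIGTERM: rv=$rv_signal"
```

<p>Both variables should end up with the same value, so the calling script cannot tell the two cases apart.</p>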

<h2 id="signals-and-exit-trap">Signals and EXIT trap</h2>

<p>Here’s a short overview of commonly used signals and traps.</p>

<table>
  <thead>
    <tr>
      <th>signal</th>
      <th style="text-align: right">number</th>
      <th>trigger</th>
      <th>when</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>EXIT</td>
      <td style="text-align: right">“0”</td>
      <td><code class="language-plaintext highlighter-rouge">exit</code></td>
      <td>shell process exits</td>
    </tr>
    <tr>
      <td>SIGHUP</td>
      <td style="text-align: right">1</td>
      <td> </td>
      <td>login <abbr title="Tele Type Writer">TTY</abbr> closed</td>
    </tr>
    <tr>
      <td>SIGINT</td>
      <td style="text-align: right">2</td>
      <td>Ctrl-C</td>
      <td>user aborts process</td>
    </tr>
    <tr>
      <td>SIGQUIT</td>
      <td style="text-align: right">3</td>
      <td>Ctrl-\</td>
      <td>user aborts process</td>
    </tr>
    <tr>
      <td>SIGTERM</td>
      <td style="text-align: right">15</td>
      <td> </td>
      <td><code class="language-plaintext highlighter-rouge">kill $PID</code></td>
    </tr>
  </tbody>
</table>

<p>Please note that shells misuse signal <code class="language-plaintext highlighter-rouge">0</code> here:
There is no real signal numbered <code class="language-plaintext highlighter-rouge">0</code>.
Sending it is actually a no-operation, which can be used to check whether <em>process A can send signals to process B</em> or whether <em>process B is still alive</em>.
<code class="language-plaintext highlighter-rouge">bash</code> and other shells re-use that number for their <code class="language-plaintext highlighter-rouge">EXIT</code> handler, which is supposed to be called on <em>any exit from the shell</em>.
But that behaviour is very implementation dependent, as you will see later on.</p>
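
<p>The no-operation use of signal <code class="language-plaintext highlighter-rouge">0</code> can be sketched like this; the <code class="language-plaintext highlighter-rouge">sleep</code> process and the variable names are just stand-ins for illustration:</p>

```sh
#!/bin/sh
# Start a background process to probe.
sleep 30 &
pid=$!

# Signal 0 delivers nothing; kill only reports whether signalling would work.
if kill -0 "$pid" 2>/dev/null; then alive_before=yes; else alive_before=no; fi

# Terminate and reap the process, then probe again.
kill "$pid"
wait "$pid" 2>/dev/null
if kill -0 "$pid" 2>/dev/null; then alive_after=yes; else alive_after=no; fi

echo "before: $alive_before, after: $alive_after"
```

<p>Keep in mind that a failing <code class="language-plaintext highlighter-rouge">kill -0</code> can also mean that you lack the permission to signal the process, not only that it is gone.</p>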

<h2 id="implementation-specific-handling-of-exit">Implementation specific handling of EXIT</h2>

<p>Let’s try this with the more informative shell script <code class="language-plaintext highlighter-rouge">trap.sh</code>:</p>
<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c">#!/bin/bash</span>
cleanup <span class="o">()</span> <span class="o">{</span>
    <span class="nb">local </span><span class="nv">rv</span><span class="o">=</span><span class="nv">$?</span> <span class="nv">sig</span><span class="o">=</span><span class="k">${</span><span class="nv">1</span><span class="k">:-</span><span class="nv">0</span><span class="k">}</span>
    <span class="nb">echo</span> <span class="s2">"Process </span><span class="nv">$$</span><span class="s2"> received signal </span><span class="nv">$sig</span><span class="s2"> after rv=</span><span class="nv">$rv</span><span class="s2">"</span>
    <span class="k">case</span> <span class="s2">"</span><span class="nv">$sig</span><span class="s2">"</span> <span class="k">in
    </span>0|<span class="s1">''</span><span class="p">)</span> <span class="nb">exit</span> <span class="s2">"</span><span class="nv">$rv</span><span class="s2">"</span><span class="p">;;</span>
    <span class="k">*</span><span class="p">)</span> <span class="nb">trap</span> - <span class="s2">"</span><span class="nv">$sig</span><span class="s2">"</span><span class="p">;</span> <span class="nb">kill</span> <span class="s2">"-</span><span class="nv">$sig</span><span class="s2">"</span> <span class="s2">"</span><span class="nv">$$</span><span class="s2">"</span><span class="p">;;</span>
    <span class="k">esac</span>
<span class="o">}</span>
<span class="nb">trap</span> <span class="s1">'cleanup 0'</span> EXIT
<span class="nb">trap</span> <span class="s1">'cleanup 1'</span> HUP
<span class="nb">trap</span> <span class="s1">'cleanup 2'</span> INT
<span class="nb">trap</span> <span class="s1">'cleanup 3'</span> QUIT
<span class="nb">trap</span> <span class="s1">'cleanup 15'</span> TERM

<span class="o">[</span> <span class="nt">-n</span> <span class="s2">"</span><span class="k">${</span><span class="nv">1</span><span class="k">:-}</span><span class="s2">"</span> <span class="o">]</span> <span class="o">&amp;&amp;</span> <span class="nb">kill</span> <span class="s2">"-</span><span class="nv">$1</span><span class="s2">"</span> <span class="s2">"</span><span class="nv">$$</span><span class="s2">"</span>
</code></pre></div></div>

<h3 id="bash">bash</h3>
<div class="language-console highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="gp">$</span><span class="w"> </span>bash ./trap.sh 0  <span class="c"># EXIT</span>
<span class="go">Process 499218 received signal 0 after rv=0
</span><span class="gp">$</span><span class="w"> </span>bash ./trap.sh 1  <span class="c"># SIGHUP</span>
<span class="go">Process 499237 received signal 1 after rv=0
Process 499237 received signal 0 after rv=0
Hangup
</span><span class="gp">$</span><span class="w"> </span>bash ./trap.sh 2  <span class="c"># SIGINT</span>
<span class="go">Process 499256 received signal 2 after rv=0
Process 499256 received signal 0 after rv=0

</span><span class="gp">$</span><span class="w"> </span>bash ./trap.sh 3  <span class="c"># SIGQUIT</span>
<span class="go">Process 499275 received signal 3 after rv=0
Process 499275 received signal 0 after rv=0
</span><span class="gp">$</span><span class="w"> </span>bash ./trap.sh 15  <span class="c"># SIGTERM</span>
<span class="go">Process 499294 received signal 15 after rv=0
Process 499294 received signal 0 after rv=0
Terminated
</span></code></pre></div></div>

<p>As you can see <a href="https://www.gnu.org/software/bash/"><code class="language-plaintext highlighter-rouge">bash</code></a> <strong>always</strong> calls the trap handler for <code class="language-plaintext highlighter-rouge">EXIT</code>!</p>

<h3 id="dash">dash</h3>
<p>Let’s repeat this with <a href="http://gondor.apana.org.au/~herbert/dash/"><code class="language-plaintext highlighter-rouge">dash</code></a>:</p>
<div class="language-console highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="gp">$</span><span class="w"> </span>dash ./trap.sh 0  <span class="c"># EXIT</span>
<span class="go">Process 502873 received signal 0 after rv=1
</span><span class="gp">$</span><span class="w"> </span>dash ./trap.sh 1  <span class="c"># SIGHUP</span>
<span class="go">Process 501892 received signal 1 after rv=0
Hangup
</span><span class="gp">$</span><span class="w"> </span>dash ./trap.sh 2  <span class="c"># SIGINT</span>
<span class="go">Process 501912 received signal 2 after rv=0

</span><span class="gp">$</span><span class="w"> </span>dash ./trap.sh 3  <span class="c"># SIGQUIT</span>
<span class="go">Process 501929 received signal 3 after rv=0
Quit (core dumped)
</span><span class="gp">$</span><span class="w"> </span>dash ./trap.sh 15  <span class="c"># SIGTERM</span>
<span class="go">Process 501971 received signal 15 after rv=0
Terminated
</span></code></pre></div></div>

<h3 id="busybox">busybox</h3>
<p>And once more with <a href="https://busybox.net/"><code class="language-plaintext highlighter-rouge">busybox</code></a>:</p>
<div class="language-console highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="gp">$</span><span class="w"> </span>busybox sh ./trap.sh 0  <span class="c"># EXIT</span>
<span class="go">Process 502338 received signal 0 after rv=0
</span><span class="gp">$</span><span class="w"> </span>busybox sh ./trap.sh 1  <span class="c"># SIGHUP</span>
<span class="go">Process 502366 received signal 1 after rv=0
Hangup
</span><span class="gp">$</span><span class="w"> </span>busybox sh ./trap.sh 2  <span class="c"># SIGINT</span>
<span class="go">Process 502402 received signal 2 after rv=0

</span><span class="gp">$</span><span class="w"> </span>busybox sh ./trap.sh 3  <span class="c"># SIGQUIT</span>
<span class="go">Process 502439 received signal 3 after rv=0
Process 502439 received signal 0 after rv=0
</span><span class="gp">$</span><span class="w"> </span>busybox sh ./trap.sh 15  <span class="c"># SIGTERM</span>
<span class="go">Process 502269 received signal 15 after rv=0
Terminated
</span></code></pre></div></div>

<p>Here the <code class="language-plaintext highlighter-rouge">EXIT</code> handler is <em>almost</em> never called after a signal; the only exception is <code class="language-plaintext highlighter-rouge">busybox</code> on <code class="language-plaintext highlighter-rouge">SIGQUIT</code>.</p>

<p>That is why <strong>portable</strong> shell scripts set up <code class="language-plaintext highlighter-rouge">trap</code> not only for <code class="language-plaintext highlighter-rouge">EXIT</code>, but also for other <code class="language-plaintext highlighter-rouge">SIG</code>nals.</p>

<p>But if you do that, please make sure to do it right:</p>
<ol>
  <li>Reset the <code class="language-plaintext highlighter-rouge">trap</code> handler to its default.</li>
  <li>Afterwards kill the process by re-sending the received signal to the process again.</li>
</ol>
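
<p>The two steps above can be sketched as follows. This is only one possible layout: the names <code class="language-plaintext highlighter-rouge">cleanup</code> and <code class="language-plaintext highlighter-rouge">on_signal</code> are made up, and the script is wrapped in <code class="language-plaintext highlighter-rouge">sh -c</code> merely so that the demo can observe the resulting status:</p>

```sh
#!/bin/sh
# The demo script is run via "sh -c" so that its status can be inspected.
script='
cleanup () {
    echo "cleanup"
}
on_signal () {
    cleanup
    trap - EXIT "$1"    # 1. reset the handlers to their defaults
    kill "-$1" "$$"     # 2. re-send the received signal to ourselves
}
trap cleanup EXIT
trap "on_signal HUP" HUP
trap "on_signal INT" INT
trap "on_signal TERM" TERM
kill -TERM "$$"         # simulate an external kill
echo "not reached"
'
out=$(sh -c "$script")
rv=$?
echo "output: $out, status: $rv"
```

<p>Resetting <code class="language-plaintext highlighter-rouge">EXIT</code> together with the signal keeps the cleanup from running a second time in shells that also call the <code class="language-plaintext highlighter-rouge">EXIT</code> handler after a signal.</p>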

<h2 id="why-proper-trap-handling-is-important">Why proper trap handling is important</h2>

<p>Viacheslav Biriukov wrote a great blog post about <a href="https://biriukov.dev/docs/fd-pipe-session-terminal/3-process-groups-jobs-and-sessions/">Process groups, jobs and sessions</a> explaining why proper exiting is important.
A program might set up a signal handler for <code class="language-plaintext highlighter-rouge">SIGINT</code> to prevent the program from just terminating, which might lose important data.
It might ask the user if terminating is okay or if the data should be saved first before quitting.
A surrounding shell script must then decide whether this is an <em>abnormal exit</em> and it should terminate itself <strong>afterwards</strong>, or whether it should continue normally.
The UNIX convention is to transfer that detail via <em>exit codes</em> and <em>signal statuses</em>.
So be careful and do it right if your shell script starts using <code class="language-plaintext highlighter-rouge">trap</code>.</p>

<h2 id="conclusion">Conclusion</h2>

<ol>
  <li>Use <code class="language-plaintext highlighter-rouge">bash</code> as it has consistent handling of <code class="language-plaintext highlighter-rouge">trap EXIT</code>.</li>
  <li>If you want to or must use other shells: Do not use the same <code class="language-plaintext highlighter-rouge">cleanup</code> trap for <code class="language-plaintext highlighter-rouge">EXIT</code> and other signals.</li>
  <li>If you trap signals, make sure to reset the handler and to re-raise the signal to properly propagate it.</li>
</ol>

<!-- *[FD]: File Daemon -->
<!-- *[FD]: File Descriptor -->
<!-- *[GPT]: Generative Pre-trained Transformer -->
<!-- *[GPT]: Global Partitioning Table -->
<!-- *[GPT]: GUID Partition Table -->]]></content><author><name>Philipp Hahn</name></author><category term="shell" /><summary type="html"><![CDATA[What’s wrong with signal handling like this: #!/bin/sh trap 'echo Cleanup…' EXIT HUP INT TERM ...]]></summary></entry><entry><title type="html">shell `trap` and proper quoting</title><link href="https://blog.pmhahn.de/shell-trap-quote/" rel="alternate" type="text/html" title="shell `trap` and proper quoting" /><published>2025-06-28T07:45:00+02:00</published><updated>2025-06-28T07:45:00+02:00</updated><id>https://blog.pmhahn.de/shell-trap-quote</id><content type="html" xml:base="https://blog.pmhahn.de/shell-trap-quote/"><![CDATA[<p>What’s wrong with</p>
<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c">#!/bin/bash</span>
<span class="nv">TMPDIR</span><span class="o">=</span><span class="si">$(</span><span class="nb">mktemp</span> <span class="nt">-d</span><span class="si">)</span>
<span class="nb">trap</span> <span class="s1">'rm -r $TMPDIR'</span> EXIT
...
</code></pre></div></div>

<!--more-->

<p>Let’s ask <a href="https://www.shellcheck.net/">shellcheck</a>:</p>
<blockquote>
  <p>No issues detected!</p>
</blockquote>

<p>Actually there are multiple issues:</p>

<h2 id="tmpdir">TMPDIR</h2>

<p>Please do not assign to <code class="language-plaintext highlighter-rouge">TMPDIR</code> as that variable is an <em>input parameter</em> to <code class="language-plaintext highlighter-rouge">mktemp</code> itself:
When you read <a href="man:mktemp(1)">man:mktemp</a> you will find this for option <code class="language-plaintext highlighter-rouge">-p</code>:</p>
<blockquote>
  <p>if DIR is not specified, use $TMPDIR if set, else /tmp.</p>
</blockquote>

<p>The variable is used for example by <a href="man:pam_tmpdir.8">pam_tmpdir</a> to set up <em>per user temporary directories</em> to improve security on multi-user systems.
By using <code class="language-plaintext highlighter-rouge">TMPDIR</code> inside your script to store the path of your <strong>specific</strong> temporary directory, you risk changing the behavior of other called child-processes also using <code class="language-plaintext highlighter-rouge">mktemp</code>.
Other <a href="https://docs.python.org/3/library/tempfile.html#tempfile.mkstemp">equivalent implementations thereof</a> also use <code class="language-plaintext highlighter-rouge">TEMP</code> and <code class="language-plaintext highlighter-rouge">TMP</code>, so better avoid these as well.</p>

<p>So let’s use <code class="language-plaintext highlighter-rouge">tmp</code>:</p>
<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nv">tmp</span><span class="o">=</span><span class="si">$(</span><span class="nb">mktemp</span> <span class="nt">-d</span><span class="si">)</span>
<span class="nb">trap</span> <span class="s1">'rm -r $tmp'</span> EXIT
</code></pre></div></div>

<h2 id="ifs">IFS</h2>

<p>By default <a href="man:mktemp(1)">man:mktemp</a> will only create <em>safe</em> file names, e.g. none containing blanks and characters of <code class="language-plaintext highlighter-rouge">$IFS</code>.
Remember that <code class="language-plaintext highlighter-rouge">$IFS</code> is used by the shell to split every argument — which is not quoted — into multiple arguments.
By default it is set to <em>space</em>, <em>tab</em> and <em>newline</em>.
But you can redefine or extend it, after which <em>hell breaks loose</em>:</p>
<div class="language-console highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="gp">$</span><span class="w"> </span>bash <span class="nt">-c</span> <span class="s1">'IFS="$IFS/."; . my-trap-script'
</span><span class="go">rm: cannot remove '': No such file or directory
rm: cannot remove 'tmp': No such file or directory
rm: cannot remove 'user': No such file or directory
rm: cannot remove '1000': No such file or directory
rm: cannot remove 'tmp': No such file or directory
rm: cannot remove 'XyJlR6AHpn': No such file or directory
</span></code></pre></div></div>

<p>Luckily <strong><code class="language-plaintext highlighter-rouge">$IFS</code> is re-set for each shell</strong> to its default value, but do keep that in mind when you fiddle with <code class="language-plaintext highlighter-rouge">$IFS</code>.
My advice is to do that only in functions and to use <code class="language-plaintext highlighter-rouge">local IFS</code> there, which confines the change to that function.</p>
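
<p>A small sketch of that advice. The helper <code class="language-plaintext highlighter-rouge">count_fields</code> and the variable <code class="language-plaintext highlighter-rouge">nfields</code> are hypothetical names chosen for this example:</p>

```sh
#!/bin/bash
# Hypothetical helper: count the ':'-separated fields of its first argument.
count_fields () {
    local IFS=:     # the change is limited to this function call
    # Splitting on $IFS is intended here, so $1 stays unquoted.
    # shellcheck disable=SC2086
    set -- $1
    nfields=$#
}

count_fields "usr:local:bin"
echo "fields: $nfields, IFS length afterwards: ${#IFS}"
```

<p>After the call <code class="language-plaintext highlighter-rouge">$IFS</code> still has its three default characters, so the rest of the script is not affected.</p>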

<h2 id="quoting">quoting</h2>

<p>To prevent <code class="language-plaintext highlighter-rouge">$IFS</code>-splitting you have to quote arguments.
So let’s try with this:</p>
<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nv">tmp</span><span class="o">=</span><span class="si">$(</span><span class="nb">mktemp</span> <span class="nt">-d</span><span class="si">)</span>
<span class="nb">trap</span> <span class="s1">'rm -r "$tmp"'</span> EXIT
</code></pre></div></div>

<p>You may wonder why I didn’t quote <code class="language-plaintext highlighter-rouge">tmp=$(…)</code>, as the splitting occurs after <em>command substitution</em>?
For that you have to read <a href="man:bash(1)">man:bash</a> very carefully.
In section <em>parameters</em> you have this:</p>
<blockquote>
  <p>All values undergo tilde expansion, parameter and variable expansion, command substitution, arithmetic expansion, and quote removal.</p>
</blockquote>

<p>Compare that to section <em>expansion</em>:</p>
<blockquote>
  <p>There are seven kinds of expansion performed: brace expansion, tilde expansion, parameter and variable expansion, command substitution, arithmetic expansion, <strong>word splitting</strong>, and pathname expansion.</p>
</blockquote>

<p>The important difference here is that parameter assignment expects a <em>single argument</em> and thus <em>word splitting</em> does <strong>not</strong> occur there.
So no quoting is needed for parameter assignments, but you can do it for consistency; it does not hurt.</p>
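
<p>The difference is easy to observe in a short experiment; the variable names are again only illustrative:</p>

```sh
#!/bin/sh
value="two   words"

# Assignment: no word splitting, the blanks survive without quotes.
copy=$value

# Command argument: the unquoted expansion is split on $IFS.
# shellcheck disable=SC2086
set -- $value
nargs=$#

echo "copy=[$copy] nargs=$nargs"
```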

<h2 id="late-vs-early-evaluation">late vs. early evaluation</h2>

<p>While <code class="language-plaintext highlighter-rouge">shellcheck</code> is happy, there is a lingering problem:
The <code class="language-plaintext highlighter-rouge">trap</code> is executed only later on when the shell exits.
<code class="language-plaintext highlighter-rouge">$tmp</code> might get changed (by accident) or be used for something else.
In that case the <code class="language-plaintext highlighter-rouge">rm</code> will delete whatever file <code class="language-plaintext highlighter-rouge">$tmp</code> points to.</p>

<p>That is because the <em>outer quotes</em> are <em>single quotes</em> while the <em>inner quotes</em> are <em>double quotes</em>:
The <em>single quotes</em> prevent evaluation of the command when the <code class="language-plaintext highlighter-rouge">trap</code> statement is executed.
Later on when the trap is executed, the command is evaluated a second time.
That is when the <em>double quotes</em> prevent <code class="language-plaintext highlighter-rouge">$tmp</code> from being split on <code class="language-plaintext highlighter-rouge">$IFS</code>.</p>

<p>So let’s look at the following variant:</p>
<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nv">tmp</span><span class="o">=</span><span class="si">$(</span><span class="nb">mktemp</span> <span class="nt">-d</span><span class="si">)</span>
<span class="nb">trap</span> <span class="s2">"rm -r </span><span class="nv">$tmp</span><span class="s2">"</span> EXIT
tmp+<span class="o">=</span><span class="s2">"/subdir"</span>
</code></pre></div></div>

<p><em>Double quotes</em> are now used when the trap is set up:
<code class="language-plaintext highlighter-rouge">$tmp</code> gets inserted here as it is currently defined.
If <code class="language-plaintext highlighter-rouge">$tmp</code> is changed later on, we still delete the temporary directory we just created.</p>

<p>But <code class="language-plaintext highlighter-rouge">shellcheck</code> is unhappy now:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>trap "rm -r $tmp" EXIT
            ^-- SC2064 (warning): Use single quotes, otherwise this expands now rather than when signalled.
</code></pre></div></div>
<p>Personally I think <a href="https://www.shellcheck.net/wiki/SC2064">SC2064</a> is bad advice here as we want to evaluate “$tmp” now and not later.
I want to delete the file <code class="language-plaintext highlighter-rouge">$tmp</code> is pointing to right now, not where it might point to in the future.
I’m not alone with that opinion and <a href="https://github.com/koalaman/shellcheck/issues/1945">issue 1945</a> calls SC2064 questionable.</p>
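
<p>To make the difference between late and early expansion visible, the following sketch stores both command strings in variables instead of passing them to <code class="language-plaintext highlighter-rouge">trap</code>; nothing is deleted, and the names <code class="language-plaintext highlighter-rouge">late</code> and <code class="language-plaintext highlighter-rouge">early</code> are made up for the demo:</p>

```sh
#!/bin/sh
tmp=/tmp/first
# shellcheck disable=SC2016
late='rm -r "$tmp"'    # single quotes: $tmp stays unexpanded in the string
early="rm -r $tmp"     # double quotes: $tmp is expanded right now
tmp=/tmp/second

echo "late:  $late"
echo "early: $early"
```

<p>The <code class="language-plaintext highlighter-rouge">late</code> string still contains the literal <code class="language-plaintext highlighter-rouge">$tmp</code> and would pick up the changed value when evaluated, while <code class="language-plaintext highlighter-rouge">early</code> has the original path baked in.</p>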

<p>But there is an even bigger problem:
What will happen when the trap fires?</p>

<h2 id="late-quoting">late quoting</h2>

<p>Remember that <code class="language-plaintext highlighter-rouge">$tmp</code> might contain <code class="language-plaintext highlighter-rouge">$IFS</code> characters!
For example I can set <code class="language-plaintext highlighter-rouge">TMPDIR=/tmp/I like blanks</code>.
The trap command will be <code class="language-plaintext highlighter-rouge">rm -r /tmp/I like blanks</code>.
It will fail as there are no files <code class="language-plaintext highlighter-rouge">/tmp/I</code>, <code class="language-plaintext highlighter-rouge">./like</code> and <code class="language-plaintext highlighter-rouge">./blanks</code>, hopefully.</p>

<p>So how do we fix that?
I give you two variants:</p>
<ol>
  <li>Nested double quoting using backslash-escaping:
    <div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nv">tmp</span><span class="o">=</span><span class="si">$(</span><span class="nb">mktemp</span> <span class="nt">-d</span><span class="si">)</span>
<span class="nb">trap</span> <span class="s2">"rm -r </span><span class="se">\"</span><span class="nv">$tmp</span><span class="se">\"</span><span class="s2">"</span> EXIT
</code></pre></div>    </div>
  </li>
  <li>Single quotes inside double quotes:
    <div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nv">tmp</span><span class="o">=</span><span class="si">$(</span><span class="nb">mktemp</span> <span class="nt">-d</span><span class="si">)</span>
<span class="nb">trap</span> <span class="s2">"rm -r '</span><span class="nv">$tmp</span><span class="s2">'"</span> EXIT
</code></pre></div>    </div>
  </li>
</ol>

<p>Which one is correct?</p>

<p>The answer is very disappointing:
None!</p>

<p>Variant 1 will fail for <code class="language-plaintext highlighter-rouge">TMPDIR=/tmp/\"</code> and variant 2 will fail for <code class="language-plaintext highlighter-rouge">TMPDIR=/tmp/\'</code>.
<code class="language-plaintext highlighter-rouge">$tmp</code> will then be a path containing a <em>double quote</em> in variant 1 and a <em>single quote</em> in variant 2.
Because of the <em>early evaluation</em> <code class="language-plaintext highlighter-rouge">$tmp</code> is inserted as-is during the first evaluation when <code class="language-plaintext highlighter-rouge">trap</code> is set up.
On the 2nd evaluation when the trap is executed, you will have an odd number of quotes!</p>
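
<p>The broken second evaluation can be checked without deleting anything. The sketch below uses two hypothetical paths and lets <code class="language-plaintext highlighter-rouge">bash -n</code> parse the resulting command strings without executing them:</p>

```sh
#!/bin/bash
# Hypothetical paths containing a quote character; nothing is ever deleted.
tmp_dq='/nonexistent/a"b'
tmp_sq="/nonexistent/a'b"

variant1="rm -r \"$tmp_dq\""    # becomes: rm -r "/nonexistent/a"b"
variant2="rm -r '$tmp_sq'"      # becomes: rm -r '/nonexistent/a'b'

# "bash -n" only parses the command string and does not execute it.
if bash -n -c "$variant1" 2>/dev/null; then v1=ok; else v1=broken; fi
if bash -n -c "$variant2" 2>/dev/null; then v2=ok; else v2=broken; fi
echo "variant 1: $v1, variant 2: $v2"
```

<p>Both strings end in an unterminated quote, so the parser rejects them.</p>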

<h2 id="correct-quoting">correct quoting</h2>

<p>So we need a mechanism to quote <code class="language-plaintext highlighter-rouge">$tmp</code> correctly, so it survives two rounds of evaluation.</p>

<p>Luckily <code class="language-plaintext highlighter-rouge">bash</code> has such a feature:</p>
<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nv">tmp</span><span class="o">=</span><span class="si">$(</span><span class="nb">mktemp</span> <span class="nt">-d</span><span class="si">)</span>
<span class="c"># shellcheck disable=SC2064</span>
<span class="nb">trap</span> <span class="s2">"rm -r </span><span class="k">${</span><span class="nv">tmp</span><span class="p">@Q</span><span class="k">}</span><span class="s2">"</span> EXIT
</code></pre></div></div>

<p><code class="language-plaintext highlighter-rouge">@Q</code> is an <em>operator</em>, which is documented like this in <a href="man:bash(1)">man:bash</a>:</p>
<blockquote>
  <p>The expansion is a string that is the value of parameter quoted in a format that can be reused as input.</p>
</blockquote>

<p>That is exactly what we want:</p>
<ul>
  <li>the outer quotes prevent <code class="language-plaintext highlighter-rouge">$tmp</code> from being split when the <code class="language-plaintext highlighter-rouge">trap</code> is set up.</li>
  <li>the <code class="language-plaintext highlighter-rouge">@Q</code> adds the necessary escaping to also prevent <code class="language-plaintext highlighter-rouge">$tmp</code> from being split when the trap executes.</li>
</ul>
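
<p>A quick round-trip check of the two evaluations, using <code class="language-plaintext highlighter-rouge">eval</code> as a stand-in for the moment the trap fires. The path is hypothetical, <code class="language-plaintext highlighter-rouge">printf</code> replaces <code class="language-plaintext highlighter-rouge">rm</code> so nothing is deleted, and <code class="language-plaintext highlighter-rouge">bash</code> 4.4 or newer is assumed:</p>

```sh
#!/bin/bash
# Requires bash 4.4 or newer for the @Q operator.
tmp="/nonexistent/it's a \"test\""
cmd="printf '%s' ${tmp@Q}"    # first evaluation: $tmp is inserted, quoted
result=$(eval "$cmd")         # second evaluation: the quoting is consumed

echo "command: $cmd"
echo "result:  $result"
```

<p>The result should be identical to the original value of <code class="language-plaintext highlighter-rouge">$tmp</code>, including the blanks and both kinds of quotes.</p>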

<h2 id="closing-words">closing words</h2>

<p>Be warned that the operator <code class="language-plaintext highlighter-rouge">@Q</code> is a <code class="language-plaintext highlighter-rouge">bash</code>ism:
This is not supported by <code class="language-plaintext highlighter-rouge">ash</code>, <code class="language-plaintext highlighter-rouge">dash</code>, or <code class="language-plaintext highlighter-rouge">busybox sh</code>:
There you have to quote <code class="language-plaintext highlighter-rouge">"</code> and <code class="language-plaintext highlighter-rouge">'</code> manually.
I leave that to you.</p>
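
<p>As a starting point, here is one common way to do the manual quoting in plain <code class="language-plaintext highlighter-rouge">sh</code>. The helper name <code class="language-plaintext highlighter-rouge">quote</code> is made up, and this is only a sketch: it wraps the value in single quotes and escapes embedded single quotes, but trailing newlines in the value are lost because of the command substitution:</p>

```sh
#!/bin/sh
# Hypothetical helper: wrap $1 in single quotes and escape embedded ones.
quote () {
    printf "'%s'" "$(printf '%s' "$1" | sed "s/'/'\\\\''/g")"
}

tmp="/nonexistent/it's a \"test\""
cmd="printf '%s' $(quote "$tmp")"   # first evaluation: insert the quoted value
result=$(eval "$cmd")               # second evaluation: quoting is consumed
echo "result: $result"
```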

<p>I will simply accept <code class="language-plaintext highlighter-rouge">bash</code> and use <code class="language-plaintext highlighter-rouge">@Q</code> as that is much more readable and — most importantly — correct.</p>

<!-- *[FD]: File Daemon -->
<!-- *[FD]: File Descriptor -->
<!-- *[GPT]: Generative Pre-trained Transformer -->
<!-- *[GPT]: Global Partitioning Table -->
<!-- *[GPT]: GUID Partition Table -->]]></content><author><name>Philipp Hahn</name></author><category term="shell" /><summary type="html"><![CDATA[What’s wrong with #!/bin/bash TMPDIR=$(mktemp -d) trap 'rm -r $TMPDIR' EXIT ...]]></summary></entry></feed>