Philipp Hahn

make dependency tracking

2025-11-22T12:50:00+01:00

Proper dependency tracking in GNU make

make is used to build projects, e.g. to compile source code into binaries. If the project consists of multiple files, explicit dependencies must be specified so that the commands run in the correct order.

In addition, Makefiles can also be used to track implicit dependencies: if one file is modified, only those commands that are actually needed are re-run. For large projects that can be a big time-saver when making incremental changes.

But how to do that properly (for a C project)?

The historical way

Makefile
```
  main: main.o
  main.o: main.c
```
main.c
```
  #include <stdio.h>
  #include "main.h"
```
main.h
```
  #include <stdint.h>
```

In the past many projects implemented that themselves. They used the pre-processor cpp to process all #include statements and then used regular expressions to extract the paths of all files that were read. These dependencies were then converted into a make fragment, which declares them:

main.o: main.h /usr/include/stdio.h /usr/include/stdint.h

The main Makefile has to include this fragment using something like -include main.d.

This solution has multiple issues.

Vanishing dependencies

Consider what happens when you refactor your code and remove main.h. In that case your automatically generated dependencies cause a problem: main.o depends on main.h, which is no longer there, so make fails because there is no recipe to remake it.

To fix this, your dependency generation tool needs to output empty rules for all dependencies:

main.h:
/usr/include/stdio.h:
/usr/include/stdint.h:

There are three cases:

If the file still exists and was not updated (it is older than the target), no remake is triggered by this dependency, but others may still trigger one.
If the file still exists and was updated (it is newer than the target), a rebuild is triggered for the target.
If the file no longer exists, make invokes the empty recipe to remake it. That will not really create the file, but make will consider it newer than the target, continue as in case 2 above and remake the target.
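
The failure and the effect of the empty rule can be reproduced in a scratch directory. The following is only a sketch under a few assumptions: GNU make is installed as make, the function name demo is made up for illustration, and echo compile; touch stands in for the real compiler call.

```shell
#!/bin/sh
# Sketch only: assumes GNU make is installed as `make`.
# `echo compile; touch` stands in for the real compiler call.
demo() {
    dir=$(mktemp -d) || return 1
    (
        cd "$dir" || exit 1
        touch main.c main.h
        # An inline recipe after ';' avoids literal tabs in this sketch.
        printf '%s\n' 'main.o: main.c main.h ; @echo compile; touch $@' > Makefile

        make -s main.o                        # first build
        rm main.h                             # the header vanishes
        make -s main.o 2>/dev/null || echo "no rule for main.h"
        printf '%s\n' 'main.h:' >> Makefile   # add the empty rule
        make -s main.o                        # main.h counts as updated: rebuild
    )
    rm -rf "$dir"
}

demo
```

With GNU make this should print compile, then no rule for main.h, then compile again: the second run fails because main.h cannot be remade, the third run succeeds and rebuilds main.o because of the empty rule.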

Without that any developer would have to invoke make clean to remove all targets and dependency files, resulting in a full rebuild:

.PHONY: clean
clean:
	$(RM) main *.o *.d

Maintaining the dependency tool

First of all you must run the pre-processor a second time to generate the input for your dependency extraction tool. For small projects that cost might be negligible, but for larger projects it adds up.

Second you must maintain yet another tool. While the pre-processed output is relatively easy to parse, newer compiler versions may add new features or change the output slightly, which your tool then must handle also.

Third you must make sure to invoke your pre-process run with exactly the same arguments as your real compilation: Any -Ddefine, -Idirectory, -include, -imacros is important as otherwise you might miss or record wrong dependencies.

You must also decide when to call your tool. Many projects call it before the actual compilation, but that is unnecessary: if the target is missing, make must remake it anyway. If the target exists but you no longer have the dependency information, you must also remake the target, as you cannot rule out that a (changed) header introduces a significant change.

Generating the dependency information afterwards looks okay. But you might get into situations, where you have stale information, for example if you interrupt make between the compilation and dependency-gathering steps.

Best would be to do it at the same time. Luckily that is possible with gcc and other modern compilers like clang.

The gcc way

Luckily modern GCC has built-in support to generate dependency information in make-syntax itself:

-M enables generating dependency information instead of compiling the file. The output is written to STDOUT unless -o is used to redirect it to a file.
-MM similar to the above, but system header files are not mentioned.
-MD and -MMD are variants of -M and -MM respectively, which generate dependency information in addition to the requested action, e.g. -c to compile the unit.
-MF file writes the information to the given file instead of STDOUT.
-MP adds a phony target (an empty rule without prerequisites) for each dependency to solve the Vanishing dependencies problem from above.
-MT target allows overriding the target name. By default the base name of the main input file is used, with the suffix replaced by .o.
-MQ target is a variant of -MT that also quotes any make meta-characters to make sure the name is not mangled by make but reaches the shell command as given.

So let’s rewrite our Makefile and try this:

main: main.o

%.o %.d &: %.c
	$(CC) $(CPPFLAGS) $(CFLAGS) -MMD -MF $*.d -MP -c -o $*.o $<

-include *.d

&: tells make that the recipe generates both files at the same time (grouped targets, available since GNU make 4.3).
-MMD tells gcc to both compile and generate dependency information at the same time. System header files are excluded.
-MF $*.d tells gcc to write the dependency information into a file with the file name extension .d.
-MP tells gcc to generate phony targets (empty rules) for all included files to make the dependency information future-proof in case one of them gets deleted.
-c -o $*.o $< compiles the unit.
-include *.d includes the dependency information as far as it already exists

First compilation issue

This does not work as expected: make has a built-in mechanism to Remake Makefiles. All files included via include are considered Makefiles and make tries to update them. If there is no file matching *.d, make applies our rule and tries to compile *.c to *.d :-( (That is why the above rule already uses $*.o instead of $@, as the latter would be *.d, which would then be passed to both -MF and -o with catastrophic results.)

We can avoid this by explicitly using $(wildcard ) to include only the existing files:

-include $(wildcard *.d)

Second compilation issue

While the solution looks okay, it actually is not: this way the dependency information is optional. If you delete all dependency files *.d, modify main.h and re-run make, nothing will happen. We lost the information that main.o depends on main.h. Therefore we must change the rule to require that the associated file $*.d always exists:

%.o: %.c %.d
	$(CC) $(CPPFLAGS) $(CFLAGS) -MMD -MF $*.d -MP -c -o $@ $<
%.d: ;
.NOTINTERMEDIATE: %.d

The empty rule for %.d is needed for make to handle the case where the file is missing. In that case we tell make to consider the file as remade, so it is newer than the target. That remakes the target, which generates the real dependency information.
The .NOTINTERMEDIATE is needed as %.d is never mentioned as a real target. make searches its chain of implicit rules main → main.o → main.d and marks the file as intermediate. Because of that the file is not remade and/or is deleted after it was remade. By marking it as non-intermediate we tell make to handle it as a regular file and to keep it afterwards.

This is only available since GNU make 4.4!

Final version — make 4.4

#!/usr/bin/make -f
# Disable built-in rules (built-in variables like $(RM) and $(LINK.o) are still used below)
MAKEFLAGS += --no-builtin-rules

main: main.o

CFLAGS := -g
MYCFLAGS := -Wall -Werror
DEPFLAGS = -MMD -MP -MF $*.d -MT $@

COMPILE.c = $(CC) $(DEPFLAGS) $(CPPFLAGS) $(CFLAGS) $(MYCFLAGS) -c

%.o: %.c %.d
	$(COMPILE.c) $(OUTPUT_OPTION) $<
%.d: ;
.NOTINTERMEDIATE: %.d
%: %.o
	$(LINK.o) $^ $(LOADLIBES) $(LDLIBS) -o $@

-include $(wildcard *.d)

.PHONY: clean
clean:
	$(RM) main *.o *.d

Final version — make 4.3

#!/usr/bin/make -f
# Disable built-in rules (built-in variables like $(RM) and $(COMPILE.c) are still used below)
MAKEFLAGS += --no-builtin-rules

SRCS := main.c
OBJS := $(SRCS:%.c=%.o)
DEPS := $(SRCS:%.c=%.d)

main: $(OBJS)

CFLAGS := -g
MYCFLAGS := -Wall -Werror
DEPFLAGS = -MMD -MP -MF $*.d -MT $@

COMPILE.c = $(CC) $(DEPFLAGS) $(CPPFLAGS) $(CFLAGS) $(MYCFLAGS) -c

%.o: %.c %.d
	$(COMPILE.c) $(OUTPUT_OPTION) $<
$(DEPS):
%: %.o
	$(LINK.o) $^ $(LOADLIBES) $(LDLIBS) -o $@

-include $(wildcard $(DEPS))

.PHONY: clean
clean:
	$(RM) main $(OBJS) $(DEPS)

The kbuild way

The Linux kernel uses its own build system called kbuild, which is based on a bunch of make recipes. It has some additional requirements:

The Linux kernel is heavily configurable. There is a huge .config file, which lists all options. If that file were used as a prerequisite, all files would get rebuilt each time a single option was changed. Therefore kbuild uses some mechanisms to split that big file into smaller chunks, so that each compilation unit depends only on those options it really uses.
The above solution does not track the $(…FLAGS) variables or $(CC). Changing them might require a complete rebuild to get a consistent kernel again. Therefore kbuild also logs the final command used to compile the target in the dependency information file. On the next run the commands are compared and the invocation may only be skipped if they match.

For that kbuild replaces most of make's dependency mechanism with its own implementation:

Most targets have FORCE as their prerequisite, so that the recipe always runs.
The recipe itself then uses some heavy macro magic to read back its dependency information from a file and compare that to the actual run. The command is only executed if any prerequisite or any relevant configuration option has changed.
If a command cannot determine whether it needs to run, it runs by default but writes its output to a temporary file. That file is then compared to the previous version.
- If the content differs, the temporary file is renamed over the real output file.
- If the content did not change, the temporary file is deleted. That way the old time stamp is preserved if nothing changed. This prevents needless downstream rebuilds.

Closing word

Much of this was inspired by the article Auto-Dependency Generation from Paul D. Smith. Thank you very much for writing this in the first place. The main difference is that he uses a variable $(SRCS), which explicitly lists all C source files. That way he can explicitly name the expected *.o and *.d files, which bypasses the problem with intermediate files from my solution above. That version also works for make 4.3 and earlier, as .NOTINTERMEDIATE is only available since make 4.4.

Padding and alignment of C structs

2025-10-01T10:46:00+02:00

Q: How to debug padding and alignment issues of C struct?

A: gdb --silent --batch -ex 'ptype /o struct my_t' some.o

Over the past few days I was investigating some performance issues with a proprietary SoC: the code consists of closed-source pre-compiled binaries combined with public header files. Some public glue code adds accessors to allocate, copy, and free the data structures.

Padding

We have the requirement to extend the data structure and add some additional members. As some code is closed-source, it is important not to change the layout of the existing structure. Luckily C adds padding between members to align the next member according to common hardware constraints:

The start address of a member of size 1, 2, 4, 8, 16, 32, 64, … must be aligned to a multiple of that size.

Consider this:

#include <stdint.h>
struct my0_t {
    uint8_t foo;
    uint32_t bar;
} var0[3];
static_assert(sizeof(var0) == 8 * 3, "Unexpected sizeof");

bar has a size of 32 bits or 4 bytes, so var0[0] must be placed into memory so that the address of var0[0].bar is a multiple of 4. The C compiler will thus add padding bytes before bar to satisfy that requirement. Compiling the code with -Wpadded shows this:

$ gcc -c -g -Wpadded c-padding.c
c-padding.c:4:14: warning: padding struct to align ‘bar’ [-Wpadded]
    4 |     uint32_t bar;
      |              ^~~

But you don’t know, what the C compiler does here: gcc may either insert padding before foo or after it:

struct my1_t {
    uint8_t foo;
    uint8_t _padding[3];
    uint32_t bar;
} var1[3];
static_assert(sizeof(var1) == 8 * 3, "Unexpected sizeof");

struct my2_t {
    uint8_t _padding[3];
    uint8_t foo;
    uint32_t bar;
} var2[3];
static_assert(sizeof(var2) == 8 * 3, "Unexpected sizeof");

Both are valid, but all I have ever seen is padding being inserted after the previous member and before the next member.

But you can use gdb's ptype command to dump the exact layout including offsets, sizes and inserted padding:

$ gdb --silent --batch -ex 'ptype /o struct my0_t' c-padding.o
/* offset      |    size */  type = struct my0_t {
/*      0      |       1 */    uint8_t foo;
/* XXX  3-byte hole      */
/*      4      |       4 */    uint32_t bar;
                               /* total size (bytes):    8 */
                             }

For this to work you need DWARF debugging information. So please make sure you compile your code with -g enabled!

This extra padding increases the size of your struct, which might be undesired: On embedded systems you often have less memory and excessive padding might waste a lot of memory. To minimize this, you have multiple options:

Packing

You can declare the struct as packed:

struct my3_t {
    uint8_t foo;
    uint32_t bar;
} __attribute__((packed)) var3[3];
static_assert(sizeof(var3) == 5 * 3, "Unexpected sizeof");

This makes the struct as compact as possible by not inserting any padding automatically. But you will get into trouble and risk getting a SIGBUS error on some architectures: Accessing a 32 bit variable which is not 4 byte aligned requires additional work:

Either the hardware has some extra logic to split non-aligned memory access into multiple accesses and to recombine both parts into the final value,
Or the compiler has to generate extra code to not do the unaligned access,
Or your program terminates with SIGBUS as the processor raises the unaligned trap

Please do not use -fpack-struct to make every struct packed by default!

Re-ordering descending by size

Most often you can re-order your members descending by size – assuming sizes being a power-of-two. The compiler still adds padding, but only at the end of the structure. That way you do not have holes in the middle.

That changes the layout and breaks any ABI compatibility! So do not do this with structs which are used to communicate with your hardware or some closed-source binary which assumes the old layout.

Careful re-ordering

If you only need to insert some small data, look for those holes. As the compiler added the padding there automatically, there is no guarantee that these bits/bytes are zero-initialized. If you need that guarantee, you must manually insert padding bytes!

On the other hand that provides the opportunity, to re-use those undefined bits for additional members. Just look for a hole which is large enough for your data and add your member in between the members bordering that hole.

Just be careful with structures which are used with hardware: If their accessor function does a memset(…, 0, …) to initialize the struct to zero, it might be important that those bits remain cleared. If you then start using those bits, the hardware might get confused.

Alignment

You might have noticed that sizeof(struct my0_t) == 8 and not 5 == sizeof(uint8_t) + sizeof(uint32_t). gcc also adds padding after the last member to extend the struct until its sizeof is a multiple of the alignment of its most strictly aligned member. This is important for arrays, where multiple instances are placed one after another. There each instance's start address must be aligned properly, which requires padding in between. The distance between two elements is called the “stride”, which equals the sizeof.

This also applies to nested structs like this:

struct my5_t {
    struct my4_t baz;
    uint8_t bla;
} var5[3];
static_assert(sizeof(var5) == 12 * 3, "Unexpected sizeof");

This might be unexpected, as my4_t ends with 3 padding bytes where bla might fit in. Instead bla gets placed after the padding of baz, after which 3 more padding bytes are required. So in total you get 6 bytes of padding.

Cache line size

Alignment becomes even more important for performance. Modern CPUs have lots of caches and their line size specifies the smallest quantity for data transfer. Even when you only require a single bit, the cache will transfer 32 or 64 or even more bytes from RAM.

With more tightly packed structs you get more data per cache line and require fewer cache lines, leaving more free cache lines for other tasks.
On the other hand false sharing might become a performance issue with multi-threading, where data with different access patterns is stored in the same cache line.

struct my6_t {
    uint8_t foo;
    uint32_t bar;
} __attribute__((aligned(32))) var6[3];
static_assert(sizeof(var6) == 32 * 3, "Unexpected sizeof");

In this case we get 3 bytes of padding between foo and bar. But we also get 24 bytes of padding after bar to make sizeof(struct my6_t) a multiple of 32 as requested by __attribute__((aligned(32))).

This easily becomes worse with nested structs where inner structs also have alignment requirements:

struct my7_t {
    uint8_t foo;
    struct inner {
        uint8_t foo;
    } __attribute__((aligned(32))) bar;
} var7[3];
static_assert(sizeof(var7) == 64 * 3, "Unexpected sizeof");

Running gdb shows what happens:

$ gdb --silent --batch -ex 'ptype /o var7' c-padding.o
type = struct my7_t {
/*      0      |       1 */    uint8_t foo;
/* XXX 31-byte hole      */
/*     32      |      32 */    struct inner {
/*     32      |       1 */        uint8_t foo;
/* XXX 31-byte padding   */
                                   /* total size (bytes):   32 */
                               } bar;
                               /* total size (bytes):   64 */
                             } [3]

Summary

Use gcc's -Wpadded to get a warning.
Use gdb's ptype to print the real layout.
Verify your assumptions, especially if the same code is compiled for multiple platforms with different alignment requirements.
Do not trust the comments in the code claiming ancient values for sizeof or proper cache line alignment.
Explicitly add padding bytes as they are then also initialized; otherwise the compiler may do as it likes.

Linux Kernel Module Symbol Versioning

2025-08-23T12:59:00+02:00

The Linux kernel itself and its modules may export symbols, so that other modules can import and use them. As the functions are written in C, it is important that the function signature matches:

the number of arguments must match
the ordering of the arguments must match
the data types must match, which includes the structure and layout of all input and output parameters

If any of them changes, the Application Binary Interface (ABI) changes and you risk crashing the kernel. If you’re lucky, recompiling the kernel and the modules is enough for both ends to pick up the new Application Programming Interface (API).

To detect such breaking changes, the Linux kernel can be compiled with CONFIG_MODVERSIONS enabled: This calculates a Cyclic Redundancy Check (CRC) checksum over the function signature and embeds this information with the kernel and the modules. The dynamic linker of the Linux kernel checks, that for each requested symbol its CRC matches the CRC of the Linux kernel or already loaded modules. A module is only loaded, if a match is found for all symbols. Otherwise loading fails.

Rust goes DWARF

The mechanism described here does not work with Rust. As such the Linux kernel learned a new trick and can use the DWARF (Debugging With Arbitrary Record Formats) debugging information to calculate the CRC. When CONFIG_RUST is enabled, gendwarfksyms is used instead of genksyms. Both versions are incompatible as they calculate different CRCs for the same function. But they work similarly enough, so I will not go into details here. If you’re interested, look for CONFIG_EXTENDED_MODVERSIONS.

Executable and Linkable Format

Linux kernel modules are object files using the Executable and Linkable Format (ELF). Instead of the well-known suffix .o they use the suffix .ko, but are otherwise the same. They consist of multiple sections containing executable code, read-only constants, initialized data and other information required for linking.

Example: ELF sections of a Linux Kernel Module

Shell-trivia #3: set -e

2025-08-13T08:39:00+02:00

There have already been two blog posts, Shell-trivia #1 and Shell-trivia #2, on the topic of set -e. But this morning my colleague N. Schier surprised me with yet another shell absurdity:

#!/bin/sh
set -e
date && false && true
date

How often is date executed?

As usual you have to read the manual page of bash very carefully:

The ERR trap is not executed if the failed command is … part of a command executed in a && or || list except the command following the final && or ||.

So the correct answer is: 2

For date && false && true the execution ends after false and the exit code is 1: the following && … is no longer executed. But because false is therefore not the last command of the list, set -e does not abort and the second date is still executed.
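
The behaviour is easy to reproduce. The following sketch replaces date with echo to get deterministic output and runs each variant in its own child shell; the function names run_demo and run_contrast are made up for this illustration.

```shell
#!/bin/sh
# Hypothetical helpers: each runs a snippet in a child shell with `set -e`.
run_demo() {
    sh -c '
        set -e
        echo first && false && true   # list fails at false, but set -e does not abort
        echo second                   # still reached
    '
}

run_contrast() {
    sh -c '
        set -e
        echo first && false           # false is the final command of the list: abort
        echo second                   # never reached
    ' || echo "aborted with status $?"
}

run_demo
run_contrast
```

run_demo prints first and second, while run_contrast prints first followed by aborted with status 1. The only difference between them is the trailing && true.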

Keep that in mind when shellcheck spits out the following warnings and nudges you to use more &&:

SC2015: Note that A && B || C is not if-then-else. C may run when A is true.
SC2166: Prefer [ p ] && [ q ] as [ p -a q ] is not well-defined.

That way you easily introduce semantic differences.

Or to put it in the words of G. Aschemann from 1995:

Every good shell script starts with #!/usr/bin/perl.

Well, that was 30 years ago and I would rather replace perl with python 😉

Debian 13 Trixie released

2025-08-11T10:31:00+02:00

Last Saturday (2025-08-09) Debian 13 “Trixie” was released after 2 years of work. 🥳

I just updated my laptop and servers and stumbled upon some issues:

`cyrus-imapd`

I’m running my own mail server infrastructure: I grew up with Unix-to-Unix-Copy-Protocol (UUCP) and was an admin for UUCP Freunde Lahn e.V. for a long time. I’m still using Postfix with Cyrus IMAPd and never switched to Dovecot.

After the upgrade I noticed that no mails were delivered: mailq showed a growing list of mails stuck in the queue. Postfix was complaining that its lmtp service was no longer able to establish an encrypted connection to lmtpd from Cyrus.

For historical reasons my setup is using STARTTLS, which is now deprecated and has been disabled by default in Cyrus IMAPd. You have to explicitly re-enable it in your /etc/imapd.conf by adding some lines:

imap_allowstarttls: yes
lmtp_allowstarttls: yes

`saslauthd`

saslauthd.service failed to start as I had to move its UNIX socket to /var/spool/postfix/var/run/saslauthd/. This also moves the location of the PID file to that directory, which then no longer matches the information in /usr/lib/systemd/system/saslauthd.service, which expects the file in /var/run/saslauthd.pid.

I fixed this by creating an override with systemctl edit saslauthd.service:

[Service]
PIDFile=/var/spool/postfix/var/run/saslauthd/saslauthd.pid

Previously I had a shell-hack in /etc/default/saslauthd to replace the old location with a symbolic link to the chroot-location. This no longer works as that file is not sourced by systemd, which does not execute that shell code. Therefore I had to tell Cyrus IMAPd to also use that changed location by putting this into /etc/imapd.conf:

sasl_saslauthd_path: /var/spool/postfix/var/run/saslauthd/mux

PS: On a side note: /var/run/ is deprecated and should be replaced by just /run/; systemd already complains about this every time it sees /var/run/.

`docker.io` and `libvirt`

For some unknown reason docker.io and libvirt got removed during the upgrade. Running apt autopurge afterwards was a very bad idea as that purged all images, volumes and containers. 🤦

I have to investigate why that happened. 🔍

PHP-8.4

Debian-13-Trixie has PHP-8.4, while Debian-12-Bookworm had PHP-8.2. My local NextCloud (and WordPress) setup was unhappy about that as it needs several php8.4-… packages. Luckily just installing the php8.4-… equivalents of the previously installed packages fixed this.

KDE

In the past I did not install kde-full as it depends on many optional packages like KMail, KOrganizer, DragonPlayer, and such. I don’t use many of those and thus don’t want them to be installed. During the upgrade plasmashell got removed, so on the next login I did not get back a working KDE session. Installing kde-standard fixed this. As it only Recommends most other packages, I was able to get rid of those packages I do not want.

And I got Wayland, which has this annoying bug: Konsole no longer stores the open sessions and starts with only one shell in $HOME. 🤔

Out-of-space `/usr`

My desktop system has many packages. Upgrading all those (KDE-)libraries required too much space on /usr. dpkg failed to unpack a package during upgrade.

After some manual dpkg --configure --pending, apt install --fix-broken, apt autopurge and dpkg -P I was finally able to continue. I would have expected for APT to check for enough disk space, but apparently it does not. So double-check manually before doing an upgrade.
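
A manual check before the upgrade could look like the following sketch. The helper name avail_kib and the threshold of 2 GiB are assumptions made for illustration only; how much space an upgrade really needs depends on the installed packages.

```shell
#!/bin/sh
# Hypothetical helper: print the free space of the file system holding $1 in KiB.
# `df -P` selects the portable one-line-per-file-system output format.
avail_kib() {
    df -Pk "$1" | awk 'NR == 2 { print $4 }'
}

need_kib=$((2 * 1024 * 1024))   # assumed threshold of 2 GiB, adjust to taste
if [ "$(avail_kib /usr)" -lt "$need_kib" ]; then
    echo "less than 2 GiB free on /usr" >&2
fi
```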

PS: Afterwards systemd complains about usr-not-merged, but that is normal and expected.

KeePassXC

I used a self-compiled version of KeePassXC. Debian now has two packages keepassxc and keepassxc-full – the latter has support for browser integration and more. As some files have been moved, the upgrade failed and I had to manually remove my self-compiled version.

Network

Running the upgrade while being logged into KDE is not a good idea: During the upgrade NetworkManager got restarted and killed my local network connection. Afterwards even ping no longer worked, as I already had the new version but was still running the old Linux kernel.

Sadly I still need my r8168-dkms and v4l2loopback-dkms packages.

Prometheus MySQL/MariaDB exporter

v0.15.0 has a breaking change, which is neither mentioned in any NEWS file nor the debian/changelog.

DATA_SOURCE_NAME is no longer supported and you must pass the credentials via --mysqld.username= and via MYSQLD_EXPORTER_PASSWORD=.
You also cannot specify the UNIX domain socket /run/mysqld/mysqld.sock

I’m now using --config.my-cnf /var/lib/prometheus/mysql.cnf to configure the credentials via another file.

Mailman3

cron

mailman3-web still runs a CRON job every minute, which imports robot_detection, which spams you with a ton of SyntaxWarnings. See mailman3-web#1082541 and python3-robot-detection#1078661

Edit /etc/cron.d/mailman3-web and add 2>/dev/null to each command.

authentication

Authentication was now broken for me: /var/log/mailman3/web/mailman-web.log complains about Missing column 'socialaccount_socialapp.provider_id'. Run /usr/bin/mailman-web migrate as user root to fix this.

template

/var/log/mailman3/web/mailman-web.log showed another error:

django.template.exceptions.TemplateSyntaxError: ‘humanize’ is not a registered tag library.

Adding django.contrib.humanize to INSTALLED_APPS in /etc/mailman3/mailman-web.py fixes this.

shell `trap` signal

2025-06-30T07:47:00+02:00

What’s wrong with signal handling like this:

#!/bin/sh
trap 'echo Cleanup…' EXIT HUP INT TERM
...

Exit and signals

Before we begin: exit codes and signal statuses are mutually exclusive: a process either exits normally using exit or is terminated by a signal.

If you read man:bash you will find this:

The return value of a simple command is its exit status, or 128+n if the command is terminated by signal n.

That might give you the idea, that they are the same, but that is only a (broken) shell convention to map signal statuses to exit codes. Reading man:exit you see this:

The value status & 0xFF is returned to the parent process as the process’s exit status,

So there are 256 exit codes from 0 to 255, which a process can use to exit.

The parent process then uses waitpid() to wait for the child's state change:

The process may have exited by calling exit() itself, or it may have received a signal, which might have killed the process or just suspended it.

You first have to use WIFEXITED() or WIFSIGNALED() to check whether the child exited normally via exit() or was terminated by a signal. Only after that should you use either WEXITSTATUS() to extract the byte containing the exit code or WTERMSIG() to extract the signal number.

In a shell script you do not have access to these low-level C functions, but only get the mangled exit status. You cannot distinguish whether the called process did exit(130) itself or was terminated by the user pressing Ctrl-C to send SIGINT to it.
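
A small sketch shows the ambiguity. It uses SIGTERM (15, so 128+15 = 143) instead of SIGINT, because SIGINT may be ignored when the script runs as a background job; the same reasoning applies to 130 and SIGINT. The helper name status_of is made up for this illustration.

```shell
#!/bin/sh
# Hypothetical helper: run a command line in a child shell and print its status.
status_of() {
    sh -c "$1" 2>/dev/null
    echo "$?"
}

# The child calls exit 143 itself.
echo "exit 143    -> $(status_of 'exit 143')"
# The child is terminated by SIGTERM (signal 15); the shell reports 128+15.
echo "kill -TERM  -> $(status_of 'kill -TERM $$')"
```

Both lines report 143, although the first child exited normally and the second one was terminated by a signal.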

Signals and EXIT trap

Here’s a short overview of commonly used signals and traps.

signal	number	trigger	when
EXIT	“0”	`exit`	shell process exits
SIGHUP	1		login TTY closed
SIGINT	2	Ctrl-C	user aborts process
SIGQUIT	3	Ctrl-\	user aborts process
SIGTERM	15		`kill $PID`

Please note that shells misuse signal 0 here: there is no signal numbered 0. Sending it is a no-operation that can be used to check whether process A can send signals to process B or whether process B is still alive. bash and other shells re-use that number for their EXIT handler, which is supposed to be called on any exit from the shell. But that behaviour is very implementation dependent, as you will see later on.
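
The liveness check with signal 0 can be sketched like this; the helper name is_alive is made up for this illustration. Note that the check also fails for a process that exists but belongs to another user, as the permission check fails then.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if we may signal the process, i.e. it exists
# and we are allowed to send signals to it.
is_alive() {
    kill -0 "$1" 2>/dev/null
}

is_alive "$$" && echo "shell $$ is alive"

sleep 0 &            # short-lived child
child=$!
wait "$child"        # reap it, so the PID is gone
is_alive "$child" || echo "child $child is gone"
```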

Implementation specific handling of EXIT

Let’s try this with the more informative shell script trap.sh:

#!/bin/bash
cleanup () {
    local rv=$? sig=${1:-0}
    echo "Process $$ received signal $sig after rv=$rv"
    case "$sig" in
    0|'') exit "$rv";;
    *) trap - "$sig"; kill "-$sig" "$$";;
    esac
}
trap 'cleanup 0' EXIT
trap 'cleanup 1' HUP
trap 'cleanup 2' INT
trap 'cleanup 3' QUIT
trap 'cleanup 15' TERM

[ -n "${1:-}" ] && kill "-$1" "$$"

bash

$ bash ./trap.sh 0  # EXIT
Process 499218 received signal 0 after rv=0
$ bash ./trap.sh 1  # SIGHUP
Process 499237 received signal 1 after rv=0
Process 499237 received signal 0 after rv=0
Hangup
$ bash ./trap.sh 2  # SIGINT
Process 499256 received signal 2 after rv=0
Process 499256 received signal 0 after rv=0

$ bash ./trap.sh 3  # SIGQUIT
Process 499275 received signal 3 after rv=0
Process 499275 received signal 0 after rv=0
$ bash ./trap.sh 15  # SIGTERM
Process 499294 received signal 15 after rv=0
Process 499294 received signal 0 after rv=0
Terminated

As you can see bash always calls the trap handler for EXIT!

dash

Let’s repeat this with dash:

$ dash ./trap.sh 0  # EXIT
Process 502873 received signal 0 after rv=1
$ dash ./trap.sh 1  # SIGHUP
Process 501892 received signal 1 after rv=0
Hangup
$ dash ./trap.sh 2  # SIGINT
Process 501912 received signal 2 after rv=0

$ dash ./trap.sh 3  # SIGQUIT
Process 501929 received signal 3 after rv=0
Verlassen (Speicherabzug geschrieben)
$ dash ./trap.sh 15  # SIGTERM
Process 501971 received signal 15 after rv=0
Terminated

busybox

And once more with busybox:

$ busybox sh ./trap.sh 0  # EXIT
Process 502338 received signal 0 after rv=0
$ busybox sh ./trap.sh 1  # SIGHUP
Process 502366 received signal 1 after rv=0
Hangup
$ busybox sh ./trap.sh 2  # SIGINT
Process 502402 received signal 2 after rv=0

$ busybox sh ./trap.sh 3  # SIGQUIT
Process 502439 received signal 3 after rv=0
Process 502439 received signal 0 after rv=0
$ busybox sh ./trap.sh 15  # SIGTERM
Process 502269 received signal 15 after rv=0
Terminated

With dash and busybox the EXIT handler is almost never called after a signal, except by busybox on SIGQUIT.

That is why portable shell scripts set up traps not only for EXIT, but also for other signals.

But if you do that, please make sure to do it right:

Reset the trap handler to its default.
Afterwards kill the process by re-sending the received signal to the process again.
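
The two steps can be sketched as follows. This is a minimal illustration, not a complete template: the names demo, cleanup and on_signal are made up, only SIGTERM is handled, and the child script is fed to a fresh sh via a here-document so that its termination status can be inspected.

```shell
#!/bin/sh
# Hypothetical demo: the child script is fed to a fresh `sh` via a here-document.
demo() {
    sh -s <<'EOF' 2>/dev/null
cleanup() {
    echo "cleanup"              # stand-in for the real cleanup work
}
on_signal() {
    trap - EXIT "$1"            # 1. reset the handlers to their defaults
    cleanup
    kill "-$1" "$$"             # 2. re-send the signal to ourselves
}
trap cleanup EXIT               # normal exit
trap 'on_signal TERM' TERM      # one trap per signal, passing its name
kill -TERM "$$"                 # simulate an external kill
echo "not reached"
EOF
    echo "status=$?"
}

demo
```

The cleanup runs exactly once and the parent sees status=143, i.e. the child is reported as terminated by SIGTERM instead of as a normal exit.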

Why proper trap handling is important

Viacheslav Biriukov wrote a great blog post about Process groups, jobs and sessions explaining why proper exiting is important. A program might set up a signal handler for SIGINT to prevent the program from just terminating, which might lose important data. It might ask the user if terminating is okay or if the data should be saved first before quitting. A surrounding shell script must then decide whether this is an abnormal exit and it should terminate itself afterwards, or whether it should continue normally. The UNIX convention is to transfer that detail via exit codes and signal statuses. So be careful and do it right if your shell script starts using trap.

Conclusion

Use bash as it has consistent handling of trap EXIT.
If you want to or must use other shells: Do not use the same cleanup trap for EXIT and other signals.
If you trap signals, make sure to reset the handler and to re-raise the signal to properly propagate it.

shell `trap` and proper quoting

2025-06-28T07:45:00+02:00

What’s wrong with

#!/bin/bash
TMPDIR=$(mktemp -d)
trap 'rm -r $TMPDIR' EXIT
...

Let’s ask shellcheck:

No issues detected!

Actually there are multiple issues:

TMPDIR

Please do not assign to TMPDIR as that variable is an input parameter to mktemp itself: When you read man:mktemp you will find this for option -p:

if DIR is not specified, use $TMPDIR if set, else /tmp.

The variable is used for example by pam_tmpdir to set up per-user temporary directories to improve security on multi-user systems. By using TMPDIR inside your script to store the path of your specific temporary directory, you risk changing the behavior of child processes that also call mktemp. Other tools (or equivalent implementations thereof) also use TEMP and TMP, so better avoid these as well.

So let’s use tmp:

tmp=$(mktemp -d)
trap 'rm -r $tmp' EXIT

IFS

By default man:mktemp will only create safe file names, e.g. none containing blanks and characters of $IFS. Remember that $IFS is used by the shell to split every argument — which is not quoted — into multiple arguments. By default it is set to space, tab and newline. But you can redefine or extend it, after which hell breaks loose:

$ bash -c 'IFS="$IFS/."; . my-trap-script'
rm: cannot remove '': No such file or directory
rm: cannot remove 'tmp': No such file or directory
rm: cannot remove 'user': No such file or directory
rm: cannot remove '1000': No such file or directory
rm: cannot remove 'tmp': No such file or directory
rm: cannot remove 'XyJlR6AHpn': No such file or directory

Luckily $IFS is reset to its default value for each shell, but do keep that in mind when you fiddle with $IFS. My advice is to do that only in functions and to use local IFS there, so the change is confined to the function.
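
A minimal sketch of that advice, assuming a shell that supports local (bash, dash and busybox sh do, although it is not part of POSIX). The helper name join_by_comma is made up for this example.

```shell
# Hypothetical helper: join all arguments with commas.
join_by_comma() {
    local IFS=,          # the change is confined to this function
    printf '%s\n' "$*"   # "$*" joins the arguments with the first character of IFS
}

join_by_comma tmp user 1000    # prints tmp,user,1000
```

After the function returns, $IFS has its previous value again, so unquoted expansions elsewhere in the script are split as usual.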

quoting

To prevent $IFS-splitting you have to quote arguments. So let’s try with this:

tmp=$(mktemp -d)
trap 'rm -r "$tmp"' EXIT

You may wonder why I didn’t quote tmp=$(…), as the splitting occurs after command substitution. For that you have to read man:bash very carefully. In section parameters you have this:

All values undergo tilde expansion, parameter and variable expansion, command substitution, arithmetic expansion, and quote removal.

Compare that to section expansion:

There are seven kinds of expansion performed: brace expansion, tilde expansion, parameter and variable expansion, command substitution, arithmetic expansion, word splitting, and pathname expansion.

The important difference here is that a parameter assignment expects a single argument, and thus word splitting does not occur there. So no quoting is needed for parameter assignments, but you can do it for consistency; it does not hurt.
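
The difference is easy to check. The following sketch uses a made-up helper count_args that just prints how many arguments it received:

```shell
# Hypothetical helper: print the number of arguments received.
count_args() { echo "$#"; }

value=$(printf 'two  words')          # assignment: no splitting, no quotes needed
count_args $(printf 'two  words')     # unquoted argument: prints 2
count_args "$(printf 'two  words')"   # quoted argument: prints 1
printf '%s\n' "$value"                # prints: two  words (both blanks preserved)
```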

late vs. early evaluation

While shellcheck is happy, there is a lingering problem: The trap is executed only later on when the shell exits. $tmp might get changed (by accident) or be used for something else. In that case the rm will delete whatever file $tmp points to.

That is because the outer quotes are single quotes while the inner quotes are double quotes: single quotes prevent evaluation of the command when the trap statement is executed. Later on when the trap is executed, the command is evaluated a second time. That is when the double quotes prevent $tmp from being split on $IFS.

So let’s look at the following variant:

tmp=$(mktemp -d)
trap "rm -r $tmp" EXIT
tmp+="/subdir"

Double quotes are now used when the trap is set up: $tmp gets inserted here as it is currently defined. If $tmp is changed later on, we still delete the temporary file we just created.
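
To see both behaviours side by side without waiting for the shell to exit, the stored command can be run with eval, which here stands in for what the shell does when the trap fires. The paths are placeholders:

```shell
tmp=/tmp/first
late='echo "$tmp"'    # single quotes: $tmp is stored literally and expanded when run
early="echo $tmp"     # double quotes: $tmp is expanded right now
tmp=/tmp/second       # the variable changes after the command was stored

eval "$late"     # prints /tmp/second
eval "$early"    # prints /tmp/first
```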

But shellcheck is unhappy now:

trap "rm -r $tmp" EXIT
            ^-- SC2064 (warning): Use single quotes, otherwise this expands now rather than when signalled.

Personally I think SC2064 is bad advice here as we want to evaluate “$tmp” now and not later. I want to delete the file $tmp is pointing to right now, not where it might point to in the future. I’m not alone with that opinion and issue 1945 calls SC2064 questionable.

But there is a bigger problem again: What will happen when the trap fires?

late quoting

Remember that $tmp might contain $IFS characters! For example I can set TMPDIR=/tmp/I like blanks. The trap command will be rm -r /tmp/I like blanks. It will fail as there are no files /tmp/I, ./like and ./blanks (hopefully).
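
A small sketch of that splitting, again using eval in place of the firing trap and a made-up helper count_args instead of rm, so nothing gets deleted:

```shell
# Hypothetical helper: print the number of arguments received.
count_args() { echo "$#"; }

tmp='/tmp/I like blanks'
command="count_args $tmp"   # early expansion, as in trap "rm -r $tmp" EXIT
eval "$command"             # prints 3: the path was split into three arguments
```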

So how do we fix that? I give you two variants:

Nested double quoting using backslash-escaping:

tmp=$(mktemp -d)
trap "rm -r \"$tmp\"" EXIT

Single quotes inside double quotes:

tmp=$(mktemp -d)
trap "rm -r '$tmp'" EXIT

Which one is correct?

The answer is very disappointing: None!

Variant 1 will fail for TMPDIR=/tmp/\" and variant 2 will fail for TMPDIR=/tmp/\'. $tmp will then be a path containing a double quote in variant 1 and a single quote in variant 2. Because of the early evaluation $tmp is inserted as-is during the first evaluation when the trap is set up. On the second evaluation, when the trap is executed, you will have an odd number of quotes!
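
Both failures can be reproduced harmlessly with echo instead of rm. In this sketch the helper runs_ok is made up; it evaluates a command string in a sub-shell and reports whether that worked:

```shell
# Hypothetical helper: succeed only if the string parses and runs as a command.
runs_ok() { (eval "$1") >/dev/null 2>&1; }

tmp='/tmp/"'                                          # path containing a double quote
runs_ok "echo \"$tmp\"" || echo "variant 1 breaks"    # odd number of double quotes
tmp="/tmp/'"                                          # path containing a single quote
runs_ok "echo '$tmp'" || echo "variant 2 breaks"      # odd number of single quotes
```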

correct quoting

So we need a mechanism to quote $tmp correctly, so it survives two rounds of evaluation.

Luckily bash has such a feature:

tmp=$(mktemp -d)
# shellcheck disable=SC2064
trap "rm -r ${tmp@Q}" EXIT

@Q is an operator, which is documented like this in man:bash:

The expansion is a string that is the value of parameter quoted in a format that can be reused as input.

That is exactly what we want:

the outer quotes prevent $tmp from being split when the trap is set up.
the @Q adds the necessary escaping to also prevent $tmp from being split when the trap executes.

closing words

Be warned that the operator @Q is a bashism: This is not supported by ash, dash, or busybox sh: There you have to quote " and ' manually. I leave that to you.
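
For those who cannot use bash, one possible approach is sketched here, not a tested library routine: wrap the value in single quotes and rewrite every embedded single quote as '\''. The function name sh_quote and its helper variables are made up; they are global because POSIX sh has no local.

```shell
# Hypothetical helper: print $1 wrapped in single quotes, with every embedded
# single quote rewritten as '\'' so the result survives one more evaluation.
sh_quote() {
    rest=$1
    out=
    while :; do
        case $rest in
            *\'*)
                before=${rest%%\'*}    # text before the first single quote
                rest=${rest#*\'}       # text after it
                out="$out$before'\\''"
                ;;
            *)
                out="$out$rest"
                break
                ;;
        esac
    done
    printf "'%s'\n" "$out"
}

tmp=$(mktemp -d)
# Early evaluation of $tmp, but quoted so that it survives the second round.
trap "rm -r -- $(sh_quote "$tmp")" EXIT
```

Inside single quotes nothing is special to the shell, so double quotes, blanks and $ in the path need no extra treatment.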

I will simply accept bash and use @Q as that is much more readable and — most importantly — correct.

Strange GitLab shell eval behaviour

2025-06-05T13:35:00+02:00

Two colleagues of mine contacted me this week with a strange GitLab runner behaviour. The following pipeline did not succeed:

original:
  script:
    - "sh -c 'exit 42'"
  allow_failure:
    exit_codes:
      - 42

The sh -c 'exit 42' launches a sub-process, which exits with return value 42. In my case I was launching some linter, which indicated a special condition by returning that code.

GitLab executes commands within a shell, where set -e is used. This terminates the sequence of commands as soon as a command does not succeed, i.e. returns a non-zero exit status. But instead of getting 42 as the result, GitLab reports 1 as the exit status for the job. As that code is not listed in exit_codes, the job fails with an error instead of passing with warnings.

Working alternatives

In contrast to that the following two jobs did work as expected:

direct:
  script:
    - "exit 42"
fail:
  script:
    - "sh -c 'exit 42' || exit $?"

Debugging job failures

Using variables: CI_DEBUG_TRACE: true did not show anything strange: GitLab sets up a trap handler called runner_script_trap to collect the return code of the failing command and converts it into a JSON output, which is then consumed by GitLab. There the wrong return value 1 was also visible.

But actually it showed another hint: GitLab executes the job by generating a shell script in a temporary file, which then gets executed. By using cp "$0" ./debug.txt combined with artifacts: paths: [debug.txt] you can get hold of it. Downloading that artifact shows the following (abbreviated) content:

runner_script_trap() { exit_code=$?; echo JSON… }
trap runner_script_trap EXIT
set -x -e -o pipefail +o noclobber
: | eval $'export=CI… CODE…'

It sets up a trap handler to record the exit status as JSON.
It sets up the environment to fail on error.
It executes the ‘script’ code as part of a shell pipeline using eval.

Several experiments

The last part is the relevant thing here and shows some very strange behaviour. Modifying the command only slightly changes the return value:

$ bash -e -c ':|eval "sh -e -c \"exit 12\""'; echo $?
1
$ bash --posix -e -c ':|eval "sh -e -c \"exit 12\""'; echo $?
1
$ busybox sh -e -c ':|eval "sh -e -c \"exit 12\""'; echo $?
12
$ dash -e -c ':|eval "sh -e -c \"exit 12\""'; echo $?
12
$ sh -e -c ':|eval "sh -e -c \"exit 12\""'; echo $?
12
$ bash -e -c 'eval "exec sh -e -c \"exit 12\""'; echo $?
12
$ bash -e -c 'eval "sh -e -c \"exit 12\" || exit \$?"'; echo $?
12
$ bash -e -c 'eval "sh -e -c \"exit 12\""'; echo $?
12
$ bash -e -c ':|sh -e -c "exit 12"'; echo $?
12
$ bash -e -c ':|eval "exit 12"'; echo $?
12
$ bash -e -c ' : |(eval "sh -e -c \"exit 12\"")'; echo $?
12

So this is some strange behaviour when | is combined with eval in bash. It looks like bash bug 109840.

The fix

We fixed this strange GitLab behaviour by enabling the Runner Feature Flag FF_USE_NEW_BASH_EVAL_STRATEGY. This puts one level of parentheses around the eval command to run it in a sub-shell (see the last example above).

This can be done in the pipeline itself by setting the variable to true or 1:

fixed:
  variables:
    FF_USE_NEW_BASH_EVAL_STRATEGY: "true"
  script:
    - "sh -c 'exit 42'"
  allow_failure:
    exit_codes:
      - 42

Closing words

So be careful when using allow_failure with exit_codes and calling external programs. Make sure to either enable the feature flag or use exec … or … || exit $? to really exit the shell.

The original GitLab issue 27668 has some more details. Sadly the feature flag is still not enabled by default as of today, so please vote for issue 27909.

AVM FRITZ!Smart Energy 200 CSV issues

2025-04-30T14:24:00+02:00

I privately own several (~25) devices made by my employer AVM GmbH, among them a switchable power socket FRITZ!Smart Energy 200, formerly known as FRITZ!DECT 200. It also records the energy consumption of the connected devices. The readings are aggregated over 24 hours, 1 week, 1 month or 1-2 years and made available as a CSV data set. You can either download it on demand from the web interface of the FRITZ!Box or have it sent to you regularly as a push service mail.

Procedure

First a look at the procedure for getting at the data.

Criticism 1: Push e-mail

The push service did not work for years (at least for me). The cause was that I changed my provider at some point, but the login name of my old provider was still configured as the sender e-mail address. At some point that address stopped working, and from then on all mails ended up nowhere.

Please show somewhere in the user interface when there are problems with the push e-mail.

Criticism 2: FRITZ!Box web interface

Navigating the web interface is anything but intuitive either:

The configuration of the global push service is under System → Push Service.
The overview does list the 200 as a smart home device, but without a direct link to its settings.
Those are only found under Smart Home → Geräte und Gruppen → device name → Einstellungen → Allgemein or Energieanzeige → Gesamtenergie (kWh) (menu labels as shown in the German user interface).

Please shorten the path to get to this information.

Criticism 3: API

E-mail is suboptimal for automated processing:

You have to fetch it somehow via IMAP or POP3.
You have to parse the e-mail and search it for the CSV attachment.
You then have to store the data somewhere.

The download from the web interface cannot be invoked from outside either.

Please provide an API through which the information can be downloaded without much effort. Authentication should use a standardized mechanism such as user name and password, an HTTP header token, or similar.

PS: In the meantime I asked my colleagues and they pointed me to the AVM Home Automation Interface. Thanks.

Criticism 4: File names

The names of the CSV files follow two schemes, depending on whether you have the file sent to you by the push service or download it from the web interface:

YYYYmmdd-HHMMSS-idXXXXX_PERIOD.csv (push service e-mail)
$NAME_dd.mm.YYYY_HH-MM_PERIOD.csv (download)

NAME is the configured device name, the ID an arbitrary(?) number. Why is the name not included in the push service e-mail? If you have several devices, you therefore have to parse the device name out of the e-mail some other way. The ID does not show up anywhere else.
Why does one variant contain seconds while the other does not?
PERIOD is 24h, week, month or 2years.
Exception: for the push service it is 1month, i.e. with a leading 1.

Please unify the file names for the push service and the download.
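
Until then, a script has to normalise the names itself. The following is a minimal sketch that extracts the trailing period part from either scheme. The function name period_of and the two file names are made up for illustration.

```shell
# Hypothetical helper: extract the period from either file name scheme.
period_of() {
    period=${1##*_}         # keep the text after the last underscore
    period=${period%.csv}   # drop the extension
    case $period in
        1month) period=month ;;   # push service spelling with leading 1
    esac
    printf '%s\n' "$period"
}

period_of '20250430-142400-id12345_1month.csv'    # prints month
period_of 'Socket_30.04.2025_14-24_2years.csv'    # prints 2years
```

Cutting at the last underscore keeps the sketch working even if the device name itself contains underscores.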

Header lines

Let us now take a closer look at the content of the CSV files. They roughly have the following structure:

sep=;
header line
data lines…

Criticism 5: CSV format

CSV files are easy to generate, but processing them is anything but trivial because there are many sub-formats:

different character encodings, e.g. ASCII, UTF-8, UTF-16, ISO-8859-1, …
different characters for the line break, e.g. LF (\n), CR (\r), CR+LF (\r\n), …
different column separators: comma (,), semicolon (;), tab (\t), space ( ), …
strings enclosed in single (') or double quotes (") or not at all
different decimal separators for numbers, e.g. point (.) or comma (,)
escape mechanism for characters that would otherwise be interpreted as separators
leading spaces after a separator are significant or are ignored
file contains a header line or starts directly with the first data line

Please provide the data in a structured format that is easy to process and, if possible, has unambiguous semantics.

Criticism 5: CSV separator

The file starts with sep=;. This is an Excel extension which many other CSV parsers do not understand. It is not part of RFC 4180: Common Format and MIME Type for Comma-Separated Values (CSV) Files. W3C: Model for Tabular Data and Metadata on the Web merely mentions that some programs store metadata this way. This is discouraged because it tends to cause problems:

The Python parser, for example, only detects header lines if they are on the first line. The additional sep=; line breaks this mechanism.

Please remove this line.
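
As a workaround the line can be dropped before the file reaches the CSV parser. This is a minimal sketch. The function name strip_sep_line and the sample data are made up.

```shell
# Hypothetical helper: copy CSV from stdin to stdout, dropping a leading sep= line.
strip_sep_line() {
    sed '1{/^sep=/d;}'
}

# Prints the header line and the data line, but not sep=;
printf 'sep=;\nDatum/Zeit;Energie\n0:00;1,5\n' | strip_sep_line
```

Only the first line is examined, so a data line that happens to start with sep= is left alone.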

Criticism 6: Inconsistent header line

Next comes the header line. Normally it names the columns. Here are a few examples¹:

1            |2             |3      |4                |5      |6           |7      |8|9       |10        |11|12     |13
Datum/Uhrzeit;Verbrauchswert;Einheit;Verbrauch in Euro;Einheit;CO2-Ausstoss;Einheit; ;Ansicht:;Datum     ;  ;1 Monat;dd.mm.YYYY HH:MM Uhr
Datum/   Zeit;Energie       ;Einheit;Energie   in Euro;Einheit;CO2-Ausstoss;Einheit; ;Ansicht:;1 Monat   ;  ;Datum  ;dd.mm.YYYY HH-MM Uhr
Datum/   Zeit;Energie       ;Einheit;Energie   in Euro;Einheit;CO2-Ausstoss;Einheit; ;Ansicht:;1 Woche   ;  ;Datum  ;dd.mm.YYYY HH-MM Uhr
Datum/   Zeit;Energie       ;Einheit;Energie   in Euro;Einheit;CO2-Ausstoss;Einheit; ;Ansicht:;24 Stunden;  ;Datum  ;dd.mm.YYYY HH-MM Uhr
Datum/   Zeit;Energie       ;Einheit;Energie   in Euro;Einheit;CO2-Ausstoss;Einheit; ;Ansicht:;2 Jahre   ;  ;Datum  ;dd.mm.YYYY HH-MM Uhr

Column 1 is inconsistently named …/Uhrzeit in some files and just …/Zeit in others.
Column 2 is inconsistently named Energie in some files but Verbrauchswert in others.
Columns 3, 5 and 7 each carry the heading Einheit and give the physical unit of the preceding column. Column names should be unique, because some CSV parsers do not allow duplicate column names.
Column 4 is inconsistently named Verbrauch in Euro² in some files but Energie in Euro³ in others. Why does the heading contain the unit Euro at all when there is a dedicated column 5 for it?
Columns 8-13 only exist in the header line. They therefore do not name columns but contain metadata about the whole file.
Columns 8 and 11 are empty. That confuses some parsers. LibreOffice, for example, allows such empty columns to be ignored.
Columns 10 and 12 are sometimes swapped: sometimes the period is in column 10, sometimes in column 12.
Column 13 contains the time at which the file was generated. Hours and minutes are separated by different characters: sometimes by a colon (:), sometimes by a minus sign (-).

Please generate a uniform and consistent header line!
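
Since columns 8-13 only carry metadata, one way to get at the real column names is to cut the header line after the seventh field. This is a sketch under that assumption. The function name column_names is made up and the sample header is one of the examples above without the padding spaces and with a placeholder timestamp.

```shell
# Hypothetical helper: keep only the seven real column names of a header line.
column_names() {
    cut -d ';' -f 1-7
}

printf '%s\n' 'Datum/Zeit;Energie;Einheit;Energie in Euro;Einheit;CO2-Ausstoss;Einheit;;Ansicht:;1 Woche;;Datum;30.04.2025 14-24 Uhr' | column_names
# prints Datum/Zeit;Energie;Einheit;Energie in Euro;Einheit;CO2-Ausstoss;Einheit
```

The duplicate Einheit names remain and still have to be renamed before handing the file to a parser that insists on unique column names.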

Records

Depending on the period, the records use a different format for the first column with the Datum/[Uhr]zeit: so you need a separate parser per format.

Criticism 7: Day / 24h

The file with the records for one day contains, for each of the last 24 hours, 4 records at 15-minute intervals. The first column looks like this:

23:45;…
0:00;…
jetzt;…

Without context these timestamps cannot be interpreted:

You need the creation time of the data from the header line or the file name to add the correct date.
You have to detect yourself between which lines the date changed.
The jetzt (German for “now”) requires yet another special case.

It remains unclear whether the time refers to the beginning or the end of the recording period. jetzt would suggest the end, but apparently it is always the beginning. Hence the label jetzt is doubly wrong.

Please always give a complete timestamp consisting of date and time.
Please document whether it is the beginning or the end of the recording period.

Criticism 8: Week

The file with the records for one week contains, for each of the last 7 days, 4 records at 6-hour intervals. The first column looks like this:

12;…
18;…
Do.;…
6;…

Without context these timestamps cannot be interpreted:

You need the creation time of the data from the header line or the file name to add the correct date.
Instead of hour 0 the weekday is named, which remains useless without context. In addition it is unclear whether the weekday is localized, i.e. whether weekdays are named differently depending on the language setting.
The different data types weekday and hour only make parsing more complicated.
The lines must be processed in exactly this order. They must not be re-sorted under any circumstances, because otherwise the hour lines can no longer be unambiguously assigned to a weekday.

It remains unclear whether the time refers to the beginning or the end of the recording period. Presumably the beginning.

Please always give a complete timestamp consisting of date and time.
Please document whether it is the beginning or the end of the recording period.

Criticism 9: Month

The file with the records for one month contains one record per day for the last 31 days. The first column looks like this:

31.12.;…
1.1.;…

Without context these dates cannot be interpreted:

You need the creation time of the data from the header line to add the correct year.
You have to detect yourself between which lines the year changed.
31 days are always listed, even if the month only has 30/29/28 days. If you aggregate several files, you have to watch out for overlapping days and handle them separately if necessary.

The value presumably refers to a complete day, i.e. from 00:00 to 00:00 of the following day. When converting it into a timestamp you therefore have to add 00:00:00 or 23:59:59 as the time, depending on whether you calculate with the beginning or the end.

Please always give a complete date including the year.

Criticism 10: Year

The file with the records for the last 1-2 years contains one record for each month. The first column looks like this:

Mai 2023;…
Juni 2023;…

It remains unclear how the months are localized, i.e. how they are named depending on the language setting.

The value presumably refers to a complete month, i.e. from 00:00 of the 1st day inclusive to 00:00 of the 1st day of the following month exclusive. In detail, however, it remains unclear here too whether internally every month is simply counted as 31 days. When converting it into a timestamp, care must again be taken whether the first or the last day of the month is used and which time of day is used.

Please do not localize dates. Please give an exact date for beginning and end.

Criticism 11: Data not constant

If you export the data several times in a row, you notice that they are not identical for identical periods: they only differ by a few watts, but that is still unpleasant. Presumably this is due to the round-robin database, which internally uses sampling to interpolate (missing) values.

Please use a database that reproducibly delivers the same data.

Conclusion

Within the FRITZ ecosystem the products work wonderfully with each other, but exporting the data for further processing in another system is a disaster. I consider CSV as a format particularly problematic, as further processing is anything but easy. The lack of an API for fetching the current data by script makes it even more complicated.

My parser written in Python, including some test cases and validation of the strings, currently comes to 324 lines. Not exactly little for a program that is only supposed to parse CSV files and bring them into a uniform format.

¹ For better readability I inserted spaces to make the columns easier to identify.
² As a physicist I take issue with the word Verbrauch (consumption), because according to the first law of thermodynamics energy is not consumed but converted (among other things into thermal energy).
³ Energie (energy) is also the wrong term. Energiekosten (energy costs) would be correct. And the unit of energy would be joule, not euro.

Variadic C arguments and other gcc tricks

2025-04-23T16:17:00+02:00

Q: How do you write a debug helper function passing through printf() like data?

A: Use a combination of macros and variadic functions combined with the power of gcc extensions.

Using a macro

Very simple

#define dbg(...) \
    printf(__VA_ARGS__)

Very simple as it just passes through all arguments unmodified.

Separate format argument

#define dbg(fmt, ...) \
    printf(fmt, ##__VA_ARGS__)

## is a gcc extension also supported by clang, but may require -Wno-gnu-zero-variadic-macro-arguments. It removes the trailing comma when only a single argument is passed as fmt, which prevents compilers from aborting with a syntax error otherwise. If you need support for other compilers too, see alternative.

Include additional information

Let’s prepend the file name of the compile unit and line number of the call-site:

#define dbg(fmt, ...) \
    printf("%s:%d:" fmt, __FILE__, __LINE__, ##__VA_ARGS__)

See Standard Predefined Macros for more macros.

As a function

Using a macro has the drawback that you blow up your code size, as the expansion is done at every call-site. Therefore it might be beneficial to create a single re-usable function, which then gets called from multiple locations:

#include <stdarg.h>  /* va_list, va_start, va_end */
#include <stdio.h>   /* vprintf */
static int dbg(const char *fmt, ...)
    __attribute__ ((format (printf, 1, 2)));
static int dbg(const char *fmt, ...)
{
    va_list args;
    va_start(args, fmt);
    int result = vprintf(fmt, args);
    va_end(args);
    return result;
}

__attribute__(format) is a gcc extension also supported by clang. This allows gcc to check, that the types of the arguments match the format string.

Combined

Using a function is less flexible compared to using a macro. But you can combine both techniques to get the best of both worlds:

use macros to insert additional call-site specific information like file name and line number.
use a single function for logging implementing all logic.

#include <stdarg.h>  /* va_list, va_start, va_end */
#include <stdio.h>   /* vprintf */
#define dbg(fmt, ...) \
    _dbg("%s:%d:" fmt, __FILE__, __LINE__, ##__VA_ARGS__)
static int _dbg(const char *fmt, ...)
    __attribute__ ((format (printf, 1, 2)));
static void _va_end(va_list *args)
{
    va_end(*args);
}
static int _dbg(const char *fmt, ...)
{
    va_list args __attribute__ ((cleanup (_va_end)));
    va_start(args, fmt);
    return vprintf(fmt, args);
}

__attribute__(cleanup) is another gcc extension also supported by clang. This allows gcc to register a cleanup function, which gets called automatically when the variable goes out-of-scope (at the end of the function). That way va_end() is called on all exit paths. Also very handy to call close() after open() or free() after malloc().