For the past several months, I’ve been trying to get exceptions to work on AVR as a hobby project, and I’ve finally done it. Exceptions work: you can include <vector> and it just works.

How to use it

Currently I put prebuilt toolchains on GitHub. My changes are a set of patches to GCC 13.2.0, binutils, and a few other packages.
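As a rough idea of what this enables, here is a minimal sketch of ordinary C++ that relies on exceptions. It assumes the patched toolchain accepts standard code like this unchanged. The names read_or_default and self_check are made up for illustration, and no compiler flags are shown because none are given here.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Returns values[index], or `fallback` when the index is out of range.
// std::vector::at throws std::out_of_range on a bad index, so the catch
// block below exercises the full throw / unwind / catch path.
int read_or_default(const std::vector<int>& values, std::size_t index, int fallback) {
    try {
        return values.at(index);
    } catch (const std::out_of_range&) {
        return fallback;
    }
}

// Hypothetical self-check: one in-range read and one read that throws.
void self_check() {
    const std::vector<int> samples{10, 20, 30};
    assert(read_or_default(samples, 1, -1) == 20);
    assert(read_or_default(samples, 7, -1) == -1);
}
```

Calling self_check() from main (or from setup() in an Arduino-style sketch) should pass silently if throwing and catching both work.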

What I actually did

I’ve replaced libunwind, the personality function, and the CFI macros with my own implementations. This was a lot of work: it involved researching unwinding and compiler internals, and creating a replacement for DWARF debug information.

Why libunwind doesn’t work

At first, I tried actually porting libunwind to AVR. The problem is that AVR is a very limited platform, and porting libunwind results in a 12k binary on a microcontroller that might have 32k of flash. Using more than a third of flash for an error-handling mechanism that the user may not have even asked for seems kind of bad, so naturally I went ahead and reimplemented the whole thing in assembly.

Why is libunwind big

I suspect libunwind is so large because of two major factors. First, the ABI is defined in 32-bit words, and AVR is an 8-bit architecture. Naturally, this means 4 loads, 4 registers, and 4 adds are required to do even basic arithmetic on a 32-bit int on AVR. This is kind of degenerate, but it is necessary all the time because libunwind assumes a 32-bit architecture.
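To make the cost concrete, here is a small sketch of a 32-bit addition done one byte at a time, which is roughly what an 8-bit core has to do (on AVR this is an add followed by three add-with-carry instructions). The Word32 type and the helper names are made up for illustration.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// A 32-bit value stored as four separate bytes, least significant first.
using Word32 = std::array<std::uint8_t, 4>;

// Adds two 32-bit values using only 8-bit pieces and a carry bit.
Word32 add32(const Word32& a, const Word32& b) {
    Word32 sum{};
    unsigned carry = 0;
    for (std::size_t i = 0; i < 4; ++i) {
        const unsigned partial = unsigned(a[i]) + unsigned(b[i]) + carry;
        sum[i] = static_cast<std::uint8_t>(partial & 0xFF);
        carry = partial >> 8;
        assert(carry <= 1);  // 255 + 255 + 1 never overflows past one carry bit
    }
    return sum;  // any carry out of the top byte is dropped, as in uint32_t math
}

// Helpers to convert between a native uint32_t and the byte-wise form.
Word32 to_bytes(std::uint32_t value) {
    Word32 bytes{};
    for (std::size_t i = 0; i < 4; ++i) {
        bytes[i] = static_cast<std::uint8_t>(value >> (8 * i));
    }
    return bytes;
}

std::uint32_t from_bytes(const Word32& bytes) {
    std::uint32_t value = 0;
    for (std::size_t i = 0; i < 4; ++i) {
        value |= static_cast<std::uint32_t>(bytes[i]) << (8 * i);
    }
    return value;
}
```

Every 32-bit field touched during unwinding pays this four-step cost, in both code size and cycles.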

Secondly, libunwind only lazily unwinds the stack. Rather than restoring registers as it goes, it parses the stack and updates a struct that holds the unwound state. When the time comes to unwind, special architecture-specific functions are called to actually load the state into registers. This is convenient to avoid writing everything in C, but also means that libunwind must manipulate various structs (of which there are many) during unwinding. On normal processors this is fine, but on AVR this is actually somewhat costly due to the aforementioned issues with 32-bit integers.

Moreover, libunwind does a lot of redundant work, resulting in a lot of wasted space. Libunwind unwinds the stack twice and calls the personality function twice. Libunwind also duplicates functions wholesale (via #include) for the sake of performance, which is bad for AVR because the duplicated functions are enormous.
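The double walk can be illustrated with a toy model of two-phase unwinding: a search pass that looks for a handler, then a cleanup pass over the same frames. This is only a sketch of the general scheme, not libunwind's code. Frame, UnwindStats, and two_phase_unwind are invented names, and the model assumes every frame has a personality routine.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// One stack frame in the toy model; index 0 is the throwing frame.
struct Frame {
    bool has_handler;
};

struct UnwindStats {
    std::size_t personality_calls = 0;
    std::size_t handler_frame = 0;
    bool found = false;
};

UnwindStats two_phase_unwind(const std::vector<Frame>& stack) {
    UnwindStats stats;

    // Phase 1 (search): ask each frame's personality routine whether it
    // has a handler, without changing any state.
    for (std::size_t i = 0; i < stack.size(); ++i) {
        ++stats.personality_calls;
        if (stack[i].has_handler) {
            stats.found = true;
            stats.handler_frame = i;
            break;
        }
    }
    if (!stats.found) {
        return stats;  // a real runtime would end up in std::terminate here
    }

    // Phase 2 (cleanup): walk the same frames again, calling the
    // personality routine a second time for each one.
    for (std::size_t i = 0; i <= stats.handler_frame; ++i) {
        ++stats.personality_calls;
    }

    // Every frame up to the handler was visited exactly twice.
    assert(stats.personality_calls == 2 * (stats.handler_frame + 1));
    return stats;
}
```

With a handler three frames up, the model makes six personality calls where a single pass would make three.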

Why is my solution better

My solution parses a bytecode that specifies how to unwind a frame, then parses a compiler-generated LSDA section to see if handlers need to be called. The bytecode is 8-bit sized, which avoids the issues with 32-bit integers. It is also intentionally very simple compared to libunwind, which piggy-backs off of DWARF debug info. The issue with DWARF is that it is a general-purpose debug information format, meaning the parser needs to be far more complex than really necessary. Even assuming you use a subset of DWARF, DWARF is also 32-bit aligned, which causes issues with AVR.
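To give a feel for how an 8-bit unwind bytecode can work, here is a minimal interpreter sketch. The opcodes, their encoding, and the Machine model are all invented for illustration and are not the actual format used by my toolchain. The stack model is also simplified and does not follow AVR's exact push/pop conventions.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical opcodes; each opcode and operand is a single byte.
enum Op : std::uint8_t {
    kSkipBytes  = 0x01,  // operand: number of stack bytes (locals) to discard
    kRestoreReg = 0x02,  // operand: index of a saved register to reload
    kReturn     = 0x03,  // no operand: pop a 2-byte return address and stop
};

// Simplified machine state: stack[sp] is the next byte to be popped.
struct Machine {
    std::uint8_t regs[32] = {};
    std::vector<std::uint8_t> stack;
    std::size_t sp = 0;
    std::uint16_t pc = 0;
};

std::uint8_t pop(Machine& m) {
    assert(m.sp < m.stack.size());
    return m.stack[m.sp++];
}

// Undoes one frame by running its bytecode. Returns true once the return
// address has been recovered, false if the program is malformed.
bool unwind_frame(Machine& m, const std::vector<std::uint8_t>& code) {
    std::size_t ip = 0;
    while (ip < code.size()) {
        switch (code[ip++]) {
        case kSkipBytes: {
            if (ip >= code.size()) return false;
            const std::uint8_t count = code[ip++];
            if (m.sp + count > m.stack.size()) return false;
            m.sp += count;
            break;
        }
        case kRestoreReg: {
            if (ip >= code.size()) return false;
            const std::uint8_t reg = code[ip++];
            if (reg >= 32 || m.sp >= m.stack.size()) return false;
            m.regs[reg] = pop(m);
            break;
        }
        case kReturn: {
            if (m.sp + 2 > m.stack.size()) return false;
            const std::uint8_t hi = pop(m);
            const std::uint8_t lo = pop(m);
            m.pc = static_cast<std::uint16_t>((hi << 8) | lo);
            return true;
        }
        default:
            return false;  // unknown opcode
        }
    }
    return false;  // ran off the end without a kReturn
}
```

For example, a frame with two bytes of locals that saved r29 and r28 would be described by the seven bytes {kSkipBytes, 2, kRestoreReg, 29, kRestoreReg, 28, kReturn}. Nothing in the interpreter needs more than 8-bit operands apart from the return address itself.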

My solution ends up being about 2k in size, with 0.8k for the unwinding and 1.2k for parsing the bytecode and the LSDA. I haven’t done benchmarks, given that I never got libunwind to work, but I’d expect my implementation to be faster too.
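The LSDA side boils down to asking "does the call at this address have a landing pad?". Below is a simplified sketch of that lookup. A real LSDA stores its call-site table in a compact encoded form, so the plain CallSite struct and the find_landing_pad name here are assumptions made for readability.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Simplified call-site record: a code range, plus where to jump if an
// exception passes through it. landing_pad == 0 means "no handler here".
struct CallSite {
    std::uint16_t start;
    std::uint16_t length;
    std::uint16_t landing_pad;
    std::uint8_t action;  // index into the action table; unused in this sketch
};

// Returns the landing pad covering `pc`, or 0 if the frame has no handler
// or cleanup for that address and unwinding should simply continue.
std::uint16_t find_landing_pad(const std::vector<CallSite>& table, std::uint16_t pc) {
    for (const CallSite& site : table) {
        assert(site.length > 0);  // empty ranges would never match anything
        if (pc >= site.start && pc - site.start < site.length) {
            return site.landing_pad;
        }
    }
    return 0;
}
```

This version is a plain linear scan, which is the simplest thing that works.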

Implications for other platforms

My implementation is really only better because of the limitations of AVR, and many of the advantages are lessened on more capable platforms. That said, ARM actually uses a similar framework in the ARM EH ABI, which uses a similar bytecode.

Future Work

Currently, my implementation doesn’t fully implement C++ semantics surrounding exceptions. It does not support terminating on noexcept or allow for the unwinding of signal frames, as those normally use scratch registers.

Moreover, there are optimizations out there (mostly research by Khalil Estell) for the exception table lookup function. I also have ideas regarding the unwinding itself that I want to look into.

Using this on Arduino

Unfortunately, using this with Arduino on Windows has proven difficult, since the way it launches programs does not allow scripts to be used. I’ve tried getting around this via ps2exe, but the way it parses arguments means that doesn’t work. I suspect that it might work for Arduino on Linux, but I don’t really have the means to test this myself.