Modifying GCC

Quite a while ago, I integrated my exceptions work into a fork of GCC. At that point, I was already modifying GCC to suppress the libunwind symbols, so putting all of my code into GCC was a no-brainer. Moreover, it isn’t actually that hard to build GCC.

Editing GCC

GCC is built with the Autoconf build system. Autoconf is somewhat like CMake, but the configure scripts it generates are written entirely in shell script and are intended to be portable to any system that can run shell scripts and has a C compiler. Autoconf is used to adapt programs to all kinds of Unix systems, and as such assumes very little about the host platform. In any case, I originally based my build script on ZakKemble’s build script, although I eventually deviated quite significantly from it.

Learning how to compile GCC took a few days, and at first I built GCC in a Docker container every single time. Eventually though, I figured out it was better to just compile it locally and rerun make to pick up changes. Then, for every commit, I run the build in a Docker container to make sure my changes are portable. The configure scripts, by virtue of being shell scripts, are horrifically slow, but there’s not much I can do about that.

Moreover, GCC is very large. I typically use VS Code for code editing, but the sheer number of files does not fit comfortably in its side bar. Because of that, I’ve grown accustomed to using a bare Sublime Text editor by itself. Programming with no IDE or autocomplete is surprisingly not that bad, but it means that I’ve often had to dive into the GCC internals documentation to understand the code. Notably, I find the GCC internals documentation to be surprisingly complete, far more complete than Clang’s.

libgcc

As you may already know, I’ve implemented the unwind functions in AVR assembly. In order to move my work to GCC, the first task was to move my unwind routines into GCC. Oddly enough, the libunwind functions actually live in libgcc, a basic core library that most programs are linked with, instead of libsupc++, the basic support library for C++. This is because the unwind mechanism is (meant to be) language and compiler agnostic, and theoretically an exception thrown from one language can be caught in another. There is code in GCC for catching Java and Ada exceptions, but I suspect nobody actually uses these. In any case, moving my code to libgcc was as simple as putting it in the AVR-specific folder and adding it to the target’s build file so that it gets compiled.
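To get a feel for that language-agnostic interface, here is a minimal sketch that calls the C-level unwinder directly from a hosted system (for example x86-64 Linux with GCC or Clang). It is not my AVR code: `_Unwind_Backtrace` and `<unwind.h>` are the real interface, while `count_one_frame` and `count_frames` are names made up for this example.

```cpp
#include <cassert>
#include <unwind.h>  // C-level unwinder interface provided by the compiler runtime

// Callback invoked once per stack frame. `arg` is the pointer handed to
// _Unwind_Backtrace below; in this sketch it points at a frame counter.
static _Unwind_Reason_Code count_one_frame(struct _Unwind_Context* context,
                                           void* arg) {
    assert(arg != nullptr);
    (void)context;               // the context is not inspected in this sketch
    ++*static_cast<int*>(arg);   // one more frame seen
    return _URC_NO_REASON;       // tell the unwinder to keep walking
}

// Hypothetical helper: how many frames the unwinder can see from here.
int count_frames() {
    int frames = 0;
    _Unwind_Backtrace(count_one_frame, &frames);
    return frames;
}
```

Nothing here mentions C++ exceptions at all, which is the point: the same walker sits underneath any language that plugs into it. The exact count depends on the platform and optimization level, so it is best treated as "at least one".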

Frame data emission

I was investigating how GCC emits exception frame data when I found out that GCC does not actually emit the data itself. Instead, oddly enough, that is done by the assembler. GCC emits .cfi_* directives, which the assembler then turns into .eh_frame sections. This meant that I had to fork not only GCC but also binutils in order to replace the exception information emitter with my own.
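You can see this split on a stock toolchain. The sketch below is an ordinary pair of C++ functions, one that throws and one that catches; the function names and the file name are made up for illustration, and it assumes a hosted target that uses DWARF call frame information, such as x86-64 Linux.

```cpp
#include <stdexcept>

// Throws when asked to divide by zero, so the compiler has to describe
// this frame to the unwinder.
int checked_divide(int numerator, int denominator) {
    if (denominator == 0) {
        throw std::invalid_argument("division by zero");
    }
    return numerator / denominator;
}

// Catches the exception, which additionally requires a landing pad and
// an entry in the exception table for this function.
int divide_or_zero(int numerator, int denominator) {
    try {
        return checked_divide(numerator, denominator);
    } catch (const std::invalid_argument&) {
        return 0;
    }
}
```

Compiling this with something like `g++ -S example.cpp` and opening the resulting assembly file should show each function wrapped in `.cfi_startproc` and `.cfi_endproc`, with further `.cfi_*` lines in between. The compiler only writes those directives; the bytes of `.eh_frame` appear once the assembler has processed them.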

Fortunately, it seems ARM had already beaten me to the punch. ARM, as it turns out, converged on a similar design for their EHABI, which also uses unwind instructions. In any case, I found the code they used to emit ARM unwinding instructions, and replaced it with my own directives. Similarly to ARM, my code emits a function start and end directive, then directives for popping registers and allocating stack space. That said, there was a hiccup when interpreting the stack adjustment. Because AVR registers are 8-bit, a single register wouldn’t have enough bits to address a meaningful amount of stack or code. Because of that, both the stack pointer and the program counter are 16 bits wide, and the stack pointer has to be accessed as two 8-bit I/O registers rather than as a normal register. This causes the RTL emitted for stack adjustments to be a read-modify-write sequence rather than a normal pointer adjustment.
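To make the read-modify-write shape concrete, here is a small host-side model of it. It is only a sketch of the idea, not the RTL GCC generates: SPL and SPH are the real names of the two halves of the AVR stack pointer, but the struct and function names below are hypothetical.

```cpp
#include <cstdint>

// Model of the two 8-bit I/O registers that together hold the stack pointer.
struct StackPointerRegs {
    std::uint8_t spl;  // low byte (SPL)
    std::uint8_t sph;  // high byte (SPH)
};

// Read: combine the two halves into one 16-bit value.
constexpr std::uint16_t read_sp(const StackPointerRegs& io) {
    return static_cast<std::uint16_t>((io.sph << 8) | io.spl);
}

// Write: split a 16-bit value back into the two halves.
constexpr StackPointerRegs write_sp(std::uint16_t sp) {
    return StackPointerRegs{static_cast<std::uint8_t>(sp & 0xFF),
                            static_cast<std::uint8_t>(sp >> 8)};
}

// Allocating stack space is read, modify (the stack grows downwards), write.
constexpr StackPointerRegs allocate_stack(StackPointerRegs io,
                                          std::uint16_t bytes) {
    return write_sp(static_cast<std::uint16_t>(read_sp(io) - bytes));
}

// Allocating 16 bytes from 0x08FF leaves the stack pointer at 0x08EF.
static_assert(read_sp(allocate_stack(write_sp(0x08FF), 16)) == 0x08EF,
              "stack pointer should drop by 16 bytes");
```

Anything that wants to recover "the stack moved by N bytes" from a sequence like this has to recognize all three steps together, which is why the adjustment is harder to spot than a single subtraction on a pointer register.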

After modifying both GCC and binutils, my code just worked. I was able to unwind the stack just as before. Moreover, because I was forking binutils already, I was able to modify the base linker script for AVR, meaning that there was no longer any need to use a special linker script. Because of that, my GCC fork works with exceptions out of the box.

Crosstool-NG

Several months later, I was looking at how to build GCC when I rediscovered crosstool-NG. I didn’t try it earlier due to a lack of experience, but I figured crosstool-NG was better than my ad-hoc build script. As it turns out, crosstool-NG is really easy if you’ve already compiled GCC manually before. You specify the compile flags and the options you want, and it just works (on Linux; I have no experience with anything else). It takes care of downloading all of the dependencies, and you can tell it to use your own git repo or to apply patches if you’re modifying GCC. I did have to modify the script lightly, but since it’s a shell script that wasn’t too bad.

What now?

As it turns out, modifying compilers isn’t that hard. I’ll probably work on porting the build to Windows and macOS so I can release this for Arduino users, but other than that I’ll probably be done with AVR. After this, I’ll probably do exception optimization research. I’m not sure I’ve mentioned this before, but I’m friends with Khalil Estell, who’s done research on exception handling optimization. He has been researching this independently (he had been doing this research before I got to know him), and has been optimizing exceptions on ARM. I’ll probably try to do some exception optimization of my own on ARM, forking either GCC again or Clang. In particular, I want to do link time exception optimizations.