I have this habit of starting projects, coming across some sort of roadblock while working on them, spending a bunch of time going down the rabbit hole to address the roadblock, and then working my way back up the stack. That's an egregious mix of metaphors, but I'm sure it's a phenomenon many are familiar with.
mstrap v0.5.0 has been tagged and released, and it is one such journey of roadblocks, rabbit holes, and stack climbing. The changelog is relatively meager for a release with over a year's worth of changes. There are a few new features and some bugfixes, but the bulk of the work went into one thing: Apple Silicon support.
Poor timing
I'm a big fan of the move towards ARM64 generally. The Apple Silicon chips have been really impressive and I was excited to support them in mstrap, despite the first M1 laptops releasing right around the time the first public mstrap version was released.
I wasn't psyched about the idea of a tool designed specifically around bootstrapping new machines not having support for the new machines people would be setting up from then on out, so I got to work on determining what was needed for Apple Silicon support. Unfortunately, the first roadblock came almost immediately: support within Crystal, the language mstrap is implemented in.
Language support for Apple Silicon was iffy in late 2020 and early 2021. Some languages had partial support thanks to folks contributing with the Apple A12Z-based dev kits before the first M1 machines were released. Crystal's ecosystem is still rather small and its community is still developing, though, so no Apple Silicon work had happened during the dev kit period, and it was not until the M1 machines came out that issues started being opened around adding support.
Adding Apple Silicon support to Crystal
Romain Franceschini opened that issue on the Crystal issue tracker in December 2020 with the first important piece for supporting Apple Silicon: the libc bindings. Luckily, the ABI support was mostly already there. Apple's ARM64 implementation has a few differences from the main ARM64 ABI, but Romain had linked the relevant Apple documentation that explained those differences and they mostly did not affect Crystal's implementation.
I was able to follow along with what he had done so far to cross-compile a mostly-working Crystal compiler, and to figure out the pieces needed to support Apple's choice to use its own target triple (arm64-apple-darwin) rather than something like aarch64-apple-darwin (aarch64 being the canonical term for ARM64, at least in the LLVM world). This was merged in February 2021, in time for Crystal's 1.0.0 release, but it was just the beginning of supporting Apple Silicon in Crystal.
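To make the naming quirk concrete, here is a small Python sketch of treating the two spellings as the same target. It is purely illustrative: the alias table and function names are hypothetical, and this is not how Crystal or LLVM parse triples internally.

```python
# Hypothetical alias table: maps vendor-preferred architecture names to the
# names LLVM treats as canonical. Only the entries needed here are listed.
ARCH_ALIASES = {
    "arm64": "aarch64",
    "amd64": "x86_64",
}


def normalize_triple(triple: str) -> str:
    """Rewrite only the architecture component of a target triple."""
    arch, sep, rest = triple.partition("-")
    return ARCH_ALIASES.get(arch, arch) + sep + rest


def same_target(a: str, b: str) -> bool:
    """Two triples name the same target if they match after normalization."""
    return normalize_triple(a) == normalize_triple(b)
```

With this, `same_target("arm64-apple-darwin", "aarch64-apple-darwin")` is true, while triples that already use the canonical spelling pass through unchanged.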
Cross-compiling from source
At this time, in order to cross-compile a Crystal program, you first needed to compile Crystal yourself, because the official macOS builds did not ship with ARM64 support. After that, you also needed to cross-compile something called libcrystal.a, a small C library responsible for setting up a signal handler for segfaults in Crystal.
Unfortunately, this added an annoying hurdle to cross-compiling, which is already an annoying endeavor that requires you to cross-compile other dependencies. Crystal itself provides some great facilities for binding to C libraries, including libc, so it became clear that it would be possible to port libcrystal.a to Crystal and remove the need for this awkward build step. This required exploring how signals are implemented across a few platforms in order to add bindings for various signal data structures, but the result was a nice simplification of the build process.
I wrote up a blog post describing the process for cross-compiling the compiler for ARM64, so at least those who were interested could follow a charted path, but it was still inconvenient compared to an official binary release.
Distribution is hard
Crystal was still only set up to ship amd64 releases for macOS, which meant that anyone who wanted to run the compiler needed to run it through Rosetta. In addition, the version of LLVM shipping with the release builds (LLVM 7) was too old to support ARM64 code generation for macOS. I worked with Brian Cardiff on the Crystal core team to update the version to LLVM 10, which didn't officially support Apple Silicon, but supported enough of the pieces for it to work. LLVM 11, unfortunately, carried a bug that affected the ability of a Crystal compiler compiled with it to compile (try saying that ten times fast). Next, we had to recompile LLVM 10 to add support for targeting AArch64.
At this point, I was a bit stumped on how best to get an ARM64 build out of CI. There was talk about eventual CircleCI macOS ARM64 runners, but it seemed unlikely those would materialize soon. Instead, I started looking at cross-compiling for ARM64 from x86_64 and getting that into CI somehow. During that process, I made a few attempts at separate native and cross-compiling builds, but that doubled the macOS resources needed and felt more complex than it needed to be. Eventually, I stumbled upon a tweet from Apple compiler engineer Kuba Mracek:
Apple Silicon Mac porting tip #8: For Makefile-based, CMake-based and other non-Xcode projects, building universal is often just about adding CFLAGS="-arch arm64 -arch x86_64", and the compiler and the linker will handle that — they will create universal .o files, link them...
— Kuba (Brecka) Mracek (@kubamracek), June 23, 2020
I had forgotten about universal macOS binaries, and that they had effectively been revived from the PowerPC-to-Intel days to play a role in the Intel-to-ARM64 transition. This tip meant that we could create so-called "fat" libraries targeting both architectures without changing much about the overall build infrastructure, so it was a relatively easy fix for the various C dependencies needed. Unfortunately, it was a little bit more complicated for the Crystal compiler itself.
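As a rough illustration of the tip, the sketch below builds an environment with the `-arch` flags appended to `CFLAGS`, which could then be handed to a dependency's build. The function name and defaults are hypothetical, and this is not the actual Crystal CI configuration.

```python
import os


def universal_build_env(archs=("arm64", "x86_64"), base_env=None):
    """Return a copy of the environment with one -arch flag per architecture
    appended to CFLAGS, preserving any flags that were already set."""
    env = dict(os.environ if base_env is None else base_env)
    arch_flags = " ".join(f"-arch {arch}" for arch in archs)
    existing = env.get("CFLAGS", "")
    env["CFLAGS"] = f"{existing} {arch_flags}".strip()
    return env
```

The returned mapping can be passed as `env=` to `subprocess.run` when invoking `make` or a configure script. Some projects also need the same flags at link time, which this sketch leaves out.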
Crystal itself doesn't support building universal binaries, and I didn't feel like taking a crack at enhancing the compiler to do code generation for two targets at once, but luckily I found out about the lipo tool. lipo takes two binaries for different architectures and spits out a single universal binary. All that was needed, then, was to compile Crystal natively, cross-compile it for ARM64, and pass both builds to lipo. There were a few tricks to doing this, but ultimately we got it working. It was merged in September 2021, and the first universal macOS build of Crystal with ARM64 support shipped with Crystal 1.2.0.
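The final merge step can be sketched as follows. The file names are hypothetical placeholders, and the real release scripts involved more than this. The only external command used is `lipo -create -output`, which runs on macOS with the Xcode command line tools installed.

```python
import subprocess


def lipo_create_command(inputs, output):
    """Build the argv for merging per-architecture binaries into one
    universal binary with lipo."""
    if len(inputs) < 2:
        raise ValueError("a universal binary needs at least two input slices")
    return ["lipo", "-create", "-output", output, *inputs]


def make_universal(inputs, output):
    """Run lipo (macOS only) and raise if it fails."""
    subprocess.run(lipo_create_command(inputs, output), check=True)
```

For example, `make_universal(["crystal-x86_64", "crystal-arm64"], "crystal")` would combine a native build and a cross-compiled build, after which `lipo -info crystal` lists the architectures the result contains.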
Adding Apple Silicon support to mstrap
Making mstrap itself support Apple Silicon was actually quite easy. There wasn't a whole lot to do here, other than ensure that Homebrew on Apple Silicon got installed to the right place. By Crystal 1.2.0's release, Docker Desktop had come out with support for Apple Silicon and had squashed most of the bugs that affected the preview releases in the winter and spring of 2021, so there was little to do there. Unfortunately, again, distribution and cross-compiling were the chief roadblocks.
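For context on "the right place": Homebrew's default prefix is /opt/homebrew on Apple Silicon and /usr/local on Intel Macs. A minimal Python sketch of that choice follows. It is not mstrap's actual code (mstrap is written in Crystal), and the function name is hypothetical.

```python
import platform


def default_homebrew_prefix(system=None, machine=None):
    """Pick Homebrew's default install prefix for a platform.

    Defaults to the current machine. Note that a process running under
    Rosetta reports an x86_64 machine even on Apple Silicon hardware.
    """
    system = system or platform.system()
    machine = machine or platform.machine()
    if system == "Darwin":
        return "/opt/homebrew" if machine == "arm64" else "/usr/local"
    # Homebrew on Linux uses a dedicated linuxbrew home directory.
    return "/home/linuxbrew/.linuxbrew"
```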
Cross-compiling mstrap
mstrap, being a Crystal program, also depends on the same things Crystal depends on. This means that libraries like libpcre, libevent, libgc (bdw-gc), and openssl needed to be cross-compiled for the target architecture. Dual-architecture builds of these libraries ship in Crystal's official tarball releases, but are absent from the more commonly used Homebrew releases. The official tarball releases are mostly useful for compiler bootstrapping and don't track the latest library versions, so the shipped dual-architecture libraries are not really intended for consumption by anything other than the compiler. mstrap also has some specific requirements around statically linking as much as possible, since many libraries may not be present on a new machine, which adds another layer of complexity to an already complex story.
Unfortunately, cross-compiling mstrap prior to v0.5.0 involved a Docker-based setup for cross-compiling static binaries within an Alpine container for both arm64 and x86_64 on Linux. This wouldn't help for macOS, and frankly the whole setup was slow and kludgy anyway. I wanted something better and didn't want to reinvent a bunch of cross-compilation machinery. Tools like Autotools and CMake, which are used by most of the libraries mstrap depends on, already support cross-compiling, and you just have to bring a suitable compiler. Luckily, clang is already a cross-compiler, so I really just needed to bring the libraries and build something to enable that same kind of cross-compiling of mstrap and all of its dependencies without specialization across platforms.
In my experience, Autotools is pretty easy to use as a user: you run a ./configure script in the project, maybe customize some options, run make, and at the end, you get your output. Autotools supports a bunch of cross-compiling options for a bunch of targets that are the same no matter what project you're compiling, so you can be reasonably sure cross-compiling is at least supported by the build tooling. Unfortunately, Autotools a) is incredibly complex and b) doesn't solve the problem of cross-compiling dependencies for me, and writing or debugging it is a terrible way to spend an afternoon.
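As an illustration of those shared options, here is a small Python sketch that assembles a cross-compiling `./configure` invocation using the standard `--host` and `--prefix` options plus `CC` and `CFLAGS` variable assignments. The function name, prefix, and flag values are hypothetical, and individual projects may need more than this.

```python
def configure_command(host_triple, prefix, cc="clang", cflags=""):
    """Build the argv for an Autotools configure run that cross-compiles
    for host_triple and installs into prefix."""
    cmd = [
        "./configure",
        f"--host={host_triple}",  # the platform the built code will run on
        f"--prefix={prefix}",     # where `make install` puts the results
        f"CC={cc}",
    ]
    if cflags:
        cmd.append(f"CFLAGS={cflags}")
    return cmd
```

For example, `configure_command("aarch64-apple-darwin", "/tmp/deps", cflags="--target=arm64-apple-macos11")` points clang at an ARM64 macOS target, and the resulting list can be passed to `subprocess.run` before running `make`.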
CMake solves a number of the same problems as Autotools with a more approachable configuration language than Autotools macros and supports sub-projects, but writing CMake configuration also did not excite me.
Enter Meson
The Meson build system had been on my radar for a bit, after seeing lots of projects starting to migrate to it. I wasn't entirely aware of how it was different, but I knew it existed and decided to take a look. Its ethos was exactly what I was after:
The main design point of Meson is that every moment a developer spends writing or debugging build definitions is a second wasted. So is every second spent waiting for the build system to actually start compiling code. mesonbuild.com
Meson is similar to CMake in a number of ways: it supports sub-projects, can automatically find dependencies, supports cross-compiling, and does lots of other cool stuff. What really sold me on trying it, though, was its concept of Wraps. Wraps provide a way to package the rules for building various dependencies with Meson, turning it into a library package manager of sorts. Instead of writing Meson rules for compiling different libraries, I can just pull in community-provided rules. Many of the projects I need are served by the Meson WrapDB: bdw-gc, libpcre, openssl, and zlib. They're not the latest versions, but they're recent enough. The only two not covered were readline/libedit and libevent. Two packages seemed doable.
One nicety of Meson is that it provides some integration with CMake. There are some caveats, but more or less you can include a CMake project into your Meson build and let Meson do the heavy work of configuring and building it, assuming there's nothing too complex going on. libevent supports CMake, so it was relatively straightforward to integrate into the project's meson.build.
Readline (or libedit, the BSD version) was the last remaining library. Prior to v0.5.0, it was used for a handful of interactive prompts in mstrap. However, I quickly learned that cross-compiling libedit might be one of the hardest problems in computing, and after spending way too long wading through and reverse engineering various parts of the build configuration, I thought "There's Gotta Be A Better Way™", and replaced it with Term::Prompt, a terminal solution built in Crystal without readline/libedit (well, almost).
With all the native libraries cross-compiling, it was finally possible to wire up Meson and update the build to support a more robust cross-compiling (and static linking) experience, culminating in a pleasing revamp of the build system that was merged in April of 2022. All that remained was to release a new version of mstrap with builds for both macOS on x86_64 and arm64. For some reason, this took me six months to do. Whoops.
What's next?
There's still more to do to better support Apple Silicon in Crystal itself. LLVM bugs have caused some issues, and the official builds are still pinned to LLVM 10 (though support for LLVM 12, 13, and 14 has been added). Ironically, mstrap itself will not compile on ARM64 with the official Crystal builds built with LLVM 10 due to bugs not fixed until LLVM 13, and there are likely other code generation bugs hiding or outstanding. Perhaps I will find myself diving back down this particular rabbit hole again.