Posixcafe Who's asking

Not Porting 9front to POWER64

A couple months ago while on a break between jobs I had taken some interest in exploring a new arch to port 9front to. 9front already runs on a fair share of architectures and 9legacy has a working risc-v port so I wound up turning my attention to POWER64. In doing so I was drawn inevitably to Raptor Computing, as the only maker of reasonably priced hobbyist grade systems. Now there hasn't been a new Raptor Computing machine in a little bit and since the pandemic the prices of their existing systems (Blackbird and Talos II) have gone up to almost double of what they were originally. Before I pulled the trigger on an expensive(4k USD) toy I wanted to get a good start enough to convince myself that I was going to really get it done.

I started by taking the existing POWER64 toolkit from 9legacy and imported that in to 9front. This gave me a compiler, linker and assembler which mostly worked and after a bit of patching they were capable of at least compiling the entire source tree. So far so good. Next up I got the POWER9 QEMU binary up and running and set out to get some code generated on 9front actually running on the machine. I first needed to decide exactly which entry in the boot chain I wanted to put 9front at as these POWER9 systems have an interesting bootchain. It looks something like:

Hostboot -> Skiboot -> Petiboot -> Linux Kernel

Now for QEMU hostboot is kind of "baked in", not something that is easy to change. Skiboot is there mainly to provide OPAL (OpenPower Abstraction Layer) to the next piece in the chain. OPAL itself is somewhat similar to a BIOS in a PC, in that it sits somewhere in physical memory and has routines that do some abstracted operations that would otherwise be machine specific. In this case it also passes us a device tree detailing the peripherals. Now typically skiboot is used to load Petiboot, which has the ability to do a full linux kexec boot. Somewhat influenced by this blog I had decided that I should aim to have skiboot boot my binary and go from there. In part since getting linux to kexec a 9front kernel is not something that works and fixing it sounded like a bit of a waste of time. So I spent a lot of time reading about OPAL and getting our assembler and linker to a point that it was generating something skiboot would actually run. This took slightly longer then expected but the end result did work.

At this point I was pretty bought in, I was having a fun time reading the ISA and learning about the boot process. I had a mostly working userspace and had some actual POWER64 code running under a virtual machine. Those Raptor Computing machines were looking quite tempting now. I had also learned then that OpenBSD did indeed run on the Talos II but not within the QEMU POWER9 environment. What I really wanted was this code running on hardware so I figured I should see if I could try getting some hardware under the assumption that I should be testing on what I want to be running on.

I had first attempted to reach out via the OpenPower Foundation. On their website you can contact them to request access to resources for open source work. I submitted the form and requested access to some Raptor Computing systems hardware to continue this work on. Roughly a month later I hear back from another organization within Open Power who offered to assist me with access to their virtual lab. They had mentioned they had attempted to reach out to Raptor on my behalf but had heard nothing back, but instead they could offer access to their IBM systems in their labs. I thought this was great! But there was some issues. They had access to some proper IBM big iron POWER boxes, the kind that would go for more then a house in the midwest, and figuring out a method to do OS development on these machines remotely turned out to be a bit troublesome. Short of throwing a kernel directly on the machine and hogging a whole one to myself, there wasn't really much that could be done. Everything else would be virtual, and while the prospect of running 9front on a IBM big iron was neat it didn't seem practical. Not to mention my fear of some dumb code running on bare metal harmering this very very expensive hardware. So my thought was "well I can keep going with QEMU for now, if I get it working there I can test it running with hardware accelerated virtualization here to make sure that also works" and set my sights back on to poking at QEMU.

A bit later a netizen of SDF mentions they have their hand on a Talos II, and would be willing to sell to me for a reasonable price. The problem was that the machine did not fully IPL, there was something going on with it. The original owner thought perhaps it just needed its firmware reflashed or something but didn't quite have the time themselves to tinker with it. So we agreed to ship it over to me, see if I can get it working, and then deal with the money later. The machine arrived and I see the same error and set to work on diagnosing it. I spent a lot of time on the raptor wiki, I asked around in the IRC, I tried all the upgrading, downgrading, sidegrading, compiling from source, all of it to see if I could get lucky and have this just be a software issue. After a week worth of evenings messing around with it, I came to the conclusion that there indeed was some hardware issue with the board.

I was a bit out of my element now. The error specifically was an i2c timeout. Through enough debug prints in the boot loader we found that this was specifically related to the dram. So I bundled everything up, asked a couple more times on IRC and submitted a support ticket to Raptor. I figured that it might be possible to pay to send the board back to Raptor to have it properly diagnosed and fixed. I submitted this ticket in Oct 2023, with all the information I had gained and decided to take a break from POWER for the time being.

Fast forward to February 2024 and I had still not heard back. I was itching to get going again and make use of this machine I had sitting in my apartment. I had recalled that someone mentioned in IRC that the only reliable way to get in contact with Raptor was through twitter. I don't have an account there so I had a friend of mine @ them and ask about the ticket status. Some 3 days later I got a response to my ticket. They confirmed that issue was likely a hardware issue, and that it is something that they are familiar with. They said it was likely some pads or caps on the underside of the mother board had been damaged between the main chip and the DRAM. Repairability they said would depend on if the pads were still intact.

I responded an hour later with pictures of the entire motherboard and reviewed the board myself. They had not supplied a reference photo and it seemed that there was not one available online, so it was hard for me to tell if there was an issue. That was a bit ago now and I still have yet to hear back. Their response did not have any details on if it were possible to send it in for repairs to them, I assume that is not something they are interested in.

Now I know I wasn't the original buyer of this box and its out of warranty and they're likely a small shop of engineers, but something about the whole experience with Raptor just didn't feel right to me. If my only options were to attempt repairs myself and/or buy a new motherboard for something that is a known issue I would honestly rather spend my money elsewhere.

In my initial support ticket I had talked a bit about my intention to do this port work. I thought that folks who cared a lot about open-ness would think the port work is cool. Something like "Look at how usable our devices are because you can just read and fork the code" kinda deal. The folks behind the MNT reform was very happy to learn that Cinap had ported 9front to their machine. But that just didn't seem to be the case here. It seemed like Raptor was very removed from the overall POWER community. They didn't have any employees in the IRC, they ghosted the other folks from OpenPower, and it took months and twitter pestering to get them to reply to a support ticket.

There are great members of the community that I met in my journey to figuring all this stuff out, and those folks were awesome. I still learned a lot and had a lot of fun getting to this point. I hope that another company gets involved with POWER64 hardware, I am most particularly interested in this potential PowerPi. However at this point I think I am putting all this work on hiatus.

The existing code for the port can be found here if anyone is curious.

There is a part 2 of this post that continues this adventure.