Switching between VFIO-enabled virtual machines

For the past few months, I’ve been using a setup for software development where I pass the mouse, keyboard and graphics card to a virtual machine via Linux VFIO. The setup is described in detail in its own blog post.

In short though, the idea is to provision a separate virtual machine for each project, so that I can install whatever tools I need to solve the problem at hand, with no real consequences if I break my install.

The problem

I have multiple virtual machines which are configured to use the same hardware, so I can only run one at a time. Before making the changes outlined in this blog post, the method I used to switch to a different VM was unwieldy: I needed to shut down the current VM, switch monitor inputs to the host graphics, start a different VM, then switch monitor inputs back to the VM graphics.

I intend to do all of my work in VM’s, so my basic goal was to make a ‘switch VM’ mechanism available from within a VM, to remove any need to interact with the host operating system directly.

Creating an API

My plan for this was to run an agent on the host, which would accept a request to switch VM’s. It would then shut down the current VM, and start up a different one instead.

This was my first time using the libvirt Python bindings, and I was quickly able to make a proof of concept which toggled between two VM’s (nested within the VM I’m using for development, because of course they are).

Toggling which VM is active, using the Python libvirt API in PyCharm.

I then extended this to use an orderly shutdown, and to work with an arbitrary number of VM’s. CirrOS was not completing its boot in this environment, so I switched the test VM’s to Debian, and also passed through a USB device to each of them so that I would get an error if I ever tried to boot both VM’s simultaneously.
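
The agent itself uses the Python bindings, but the same toggle can be expressed with plain virsh commands, roughly as follows (VM names are placeholders):

$ virsh -c qemu:///system shutdown test-vm-1     # request an orderly guest shutdown
$ virsh -c qemu:///system list --state-running   # poll until test-vm-1 is no longer listed
$ virsh -c qemu:///system start test-vm-2        # start the next VM once the hardware is free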

Last, I exposed this over HTTP as a simple API.

For the first iteration, I installed this on the host, and placed some scripts in a folder which is shared between all of the VM’s. The scripts use curl commands to call the API. To trigger a VM switch (or a shutdown of the host), I could go to a folder, and run one of these scripts.
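
Each script boils down to a single curl call. The example below is only illustrative: the hostname, port and endpoint path are placeholders, and the real API routes live in the project repository linked at the end of this post.

#!/bin/sh
# Ask the agent on the host to switch to the named VM (placeholder URL).
curl -X POST http://kvm-host.lan:8000/vm/switch/project-a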

Web UI

With that proof-of-concept out of the way, I made a web interface, so that I could use this instead of running bash scripts from a folder. This is a very simple Angular app.

GNOME extension

For the most part, I’m working on Debian using the GNOME desktop environment, which gives me some options for integrating this further into my desktop.

I first considered making a search provider, but instead settled on a simple standalone extension which displays an indicator, based on the example app shown running here.

As if nested virtualization was not mind-bending enough, I started testing this using a GNOME session within a GNOME session.

The extension is written in JavaScript, though the JavaScript API for GNOME extensions was very unfamiliar to me. There are some obvious quality issues in the code, but it’s not being published to a repository, and it fires off the correct HTTP requests, so I’ll call it a success.

Deploying a GNOME extension manually is as simple as placing some files in my home directory.
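
Specifically, the files go into the standard per-user extensions directory, in a folder named after the extension’s UUID (the UUID below is a placeholder):

$ mkdir -p ~/.local/share/gnome-shell/extensions/vm-switcher@example.com
$ cp extension.js metadata.json ~/.local/share/gnome-shell/extensions/vm-switcher@example.com/
$ gnome-extensions enable vm-switcher@example.com

The directory name must match the UUID declared in metadata.json, and GNOME Shell needs to be restarted (or the session logged out and back in) before the extension can be enabled.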

Result and next steps

I’ve bookmarked the web UI in each of my VM’s.

I’ve also installed the GNOME extension on the VM that runs on boot.

The idea of this setup is to have one virtual machine per project, so a good next step would be to make it easier to kick off the install process.

I’ve put the code for this project online at mike42/vfio-vm-switcher on GitHub.

Going all-in on GPU passthrough for software development

I recently spent some time improving my software development workflow at home, since my previous setup was starting to limit me. I settled on a configuration which uses GPU passthrough with the KVM hypervisor, running Debian as both a host and a guest.

This post aims to show some of the benefits and drawbacks of this more complex setup for my use case, as well as a specific combination of hardware and software which can be made to work.

Background and previous setup

I was using Debian testing as a host system, and provisioning project-specific virtual machines to work in via virt-manager. I use two monitors, and typically had an IDE (in the VM) open on one monitor, and a web browser (on the host) open on the other.

The setup was simple and effective:

  • The VMs ran in the QEMU user session – previously blogged about here.
  • The path /home/mike/workspace on the host was shared with every VM via a 9p fileshare.
  • Each desktop was set to 1920 x 1080 resolution, with auto-resize disabled. I kept the VM’s at this lower resolution to get acceptable graphical performance on 4K monitors when I set this up.

This allowed me to keep the host operating system uncluttered, which I value for both maintainability and security. Modern software development involves running a lot of code, such as dependencies pulled in via a language-specific package manager, random binaries from GitHub, or tools installed via a sketchy curl ... | sudo sh command. Better to keep that in a VM.

The main drawbacks are with the interface into the VM.

  • Due to scaling, I had imperfect image quality. I also had a small but noticeable input lag when working in my IDE.
  • Development involving 3D graphics was impractical due to the performance difference. I reported a trivial bug in Blender earlier this year but didn’t have an environment suitable for more extensive development on that codebase.
  • There was an audio delay when testing within the VM. I created a NES demo last year and this delay was less than ideal when it came to adding sound.

I was at one point triple-booting Debian, Ubuntu 22.04 and Windows, so that I could also run some software which wouldn’t work well in this setup.

Hardware

GPU passthrough on consumer hardware is highly dependent on motherboard and BIOS support. The most critical components in my setup are:

  • Motherboard: ASUS TUF X470-PLUS GAMING
  • CPU: AMD Ryzen 7 5800X
  • Monitors: 2 x LG 27UD58
  • Graphics card: AMD Radeon RX 6950 XT
    • Installed at PCIEX16_1, directly connected to the CPU
    • Connected to DisplayPort inputs on the two monitors

I added one component specifically for this setup:

  • Second graphics card: AMD Radeon RX 6400
    • Installed at PCIEX16_2, connected to chipset
    • Connected to HDMI inputs on the two monitors (one directly, one via an adapter)

The second graphics card will be used by the host. The RX 6400 is a low-cost, low-power, low-profile card which supports UEFI, Vulkan and the modern amdgpu driver.

In the slot where I’ve installed it, it’s limited to PCIe 2.0 x4, and 1920 x 1080 is the highest resolution I can run at 60 Hz on these monitor inputs. I only need to use this to set up the VM’s, or as a recovery interface, so I’m not expecting this to be a major issue.

Inspecting IOMMU groups

On a fresh install of Debian 12, I started walking through the PCI passthrough via OVMF guide on the Arch Wiki, adapting it for my specific setup as necessary.

I verified that I was using UEFI boot, had SVM and IOMMU enabled, and I also enabled resizable BAR. I did not set any IOMMU-related Linux boot options, and also could not find a firmware setting to select which graphics card to use during boot.
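
The guide includes a short shell script which lists every PCI device by IOMMU group. It is roughly the following loop over /sys/kernel/iommu_groups (reproduced approximately here, so check the wiki for the current version):

#!/bin/bash
# Print each IOMMU group, followed by the PCI devices it contains.
shopt -s nullglob
for g in $(find /sys/kernel/iommu_groups/* -maxdepth 0 -type d | sort -V); do
    echo "IOMMU Group ${g##*/}:"
    for d in "$g"/devices/*; do
        echo -e "\t$(lspci -nns "${d##*/}")"
    done
done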

Using the script on the Arch Wiki, I then printed out the IOMMU groups on my hardware, which are as follows:

IOMMU Group 0:
    00:01.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Starship/Matisse PCIe Dummy Host Bridge [1022:1482]
IOMMU Group 1:
    00:01.1 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Starship/Matisse GPP Bridge [1022:1483]
IOMMU Group 2:
    00:01.3 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Starship/Matisse GPP Bridge [1022:1483]
IOMMU Group 3:
    00:02.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Starship/Matisse PCIe Dummy Host Bridge [1022:1482]
IOMMU Group 4:
    00:03.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Starship/Matisse PCIe Dummy Host Bridge [1022:1482]
IOMMU Group 5:
    00:03.1 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Starship/Matisse GPP Bridge [1022:1483]
IOMMU Group 6:
    00:04.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Starship/Matisse PCIe Dummy Host Bridge [1022:1482]
IOMMU Group 7:
    00:05.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Starship/Matisse PCIe Dummy Host Bridge [1022:1482]
IOMMU Group 8:
    00:07.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Starship/Matisse PCIe Dummy Host Bridge [1022:1482]
IOMMU Group 9:
    00:07.1 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Starship/Matisse Internal PCIe GPP Bridge 0 to bus[E:B] [1022:1484]
IOMMU Group 10:
    00:08.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Starship/Matisse PCIe Dummy Host Bridge [1022:1482]
IOMMU Group 11:
    00:08.1 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Starship/Matisse Internal PCIe GPP Bridge 0 to bus[E:B] [1022:1484]
IOMMU Group 12:
    00:14.0 SMBus [0c05]: Advanced Micro Devices, Inc. [AMD] FCH SMBus Controller [1022:790b] (rev 61)
    00:14.3 ISA bridge [0601]: Advanced Micro Devices, Inc. [AMD] FCH LPC Bridge [1022:790e] (rev 51)
IOMMU Group 13:
    00:18.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Matisse/Vermeer Data Fabric: Device 18h; Function 0 [1022:1440]
    00:18.1 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Matisse/Vermeer Data Fabric: Device 18h; Function 1 [1022:1441]
    00:18.2 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Matisse/Vermeer Data Fabric: Device 18h; Function 2 [1022:1442]
    00:18.3 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Matisse/Vermeer Data Fabric: Device 18h; Function 3 [1022:1443]
    00:18.4 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Matisse/Vermeer Data Fabric: Device 18h; Function 4 [1022:1444]
    00:18.5 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Matisse/Vermeer Data Fabric: Device 18h; Function 5 [1022:1445]
    00:18.6 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Matisse/Vermeer Data Fabric: Device 18h; Function 6 [1022:1446]
    00:18.7 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Matisse/Vermeer Data Fabric: Device 18h; Function 7 [1022:1447]
IOMMU Group 14:
    01:00.0 Non-Volatile memory controller [0108]: Samsung Electronics Co Ltd NVMe SSD Controller SM981/PM981/PM983 [144d:a808]
IOMMU Group 15:
    02:00.0 USB controller [0c03]: Advanced Micro Devices, Inc. [AMD] Device [1022:43d0] (rev 01)
    02:00.1 SATA controller [0106]: Advanced Micro Devices, Inc. [AMD] 400 Series Chipset SATA Controller [1022:43c8] (rev 01)
    02:00.2 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] 400 Series Chipset PCIe Bridge [1022:43c6] (rev 01)
    03:00.0 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] 400 Series Chipset PCIe Port [1022:43c7] (rev 01)
    03:01.0 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] 400 Series Chipset PCIe Port [1022:43c7] (rev 01)
    03:02.0 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] 400 Series Chipset PCIe Port [1022:43c7] (rev 01)
    03:03.0 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] 400 Series Chipset PCIe Port [1022:43c7] (rev 01)
    03:04.0 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] 400 Series Chipset PCIe Port [1022:43c7] (rev 01)
    03:09.0 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] 400 Series Chipset PCIe Port [1022:43c7] (rev 01)
    04:00.0 Ethernet controller [0200]: Realtek Semiconductor Co., Ltd. RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller [10ec:8168] (rev 15)
    08:00.0 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD/ATI] Navi 10 XL Upstream Port of PCI Express Switch [1002:1478] (rev c7)
    09:00.0 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD/ATI] Navi 10 XL Downstream Port of PCI Express Switch [1002:1479]
    0a:00.0 VGA compatible controller [0300]: Advanced Micro Devices, Inc. [AMD/ATI] Navi 24 [Radeon RX 6400/6500 XT/6500M] [1002:743f] (rev c7)
    0a:00.1 Audio device [0403]: Advanced Micro Devices, Inc. [AMD/ATI] Navi 21/23 HDMI/DP Audio Controller [1002:ab28]
    0b:00.0 Non-Volatile memory controller [0108]: Sandisk Corp WD Blue SN550 NVMe SSD [15b7:5009] (rev 01)
IOMMU Group 16:
    0c:00.0 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD/ATI] Navi 10 XL Upstream Port of PCI Express Switch [1002:1478] (rev c0)
IOMMU Group 17:
    0d:00.0 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD/ATI] Navi 10 XL Downstream Port of PCI Express Switch [1002:1479]
IOMMU Group 18:
    0e:00.0 VGA compatible controller [0300]: Advanced Micro Devices, Inc. [AMD/ATI] Navi 21 [Radeon RX 6950 XT] [1002:73a5] (rev c0)
IOMMU Group 19:
    0e:00.1 Audio device [0403]: Advanced Micro Devices, Inc. [AMD/ATI] Navi 21/23 HDMI/DP Audio Controller [1002:ab28]
IOMMU Group 20:
    0f:00.0 Non-Essential Instrumentation [1300]: Advanced Micro Devices, Inc. [AMD] Starship/Matisse PCIe Dummy Function [1022:148a]
IOMMU Group 21:
    10:00.0 Non-Essential Instrumentation [1300]: Advanced Micro Devices, Inc. [AMD] Starship/Matisse Reserved SPP [1022:1485]
IOMMU Group 22:
    10:00.1 Encryption controller [1080]: Advanced Micro Devices, Inc. [AMD] Starship/Matisse Cryptographic Coprocessor PSPCPP [1022:1486]
IOMMU Group 23:
    10:00.3 USB controller [0c03]: Advanced Micro Devices, Inc. [AMD] Matisse USB 3.0 Host Controller [1022:149c]
IOMMU Group 24:
    10:00.4 Audio device [0403]: Advanced Micro Devices, Inc. [AMD] Starship/Matisse HD Audio Controller [1022:1487]

Passthrough for an IOMMU group is all-or-nothing. For example, all of the chipset-connected devices are grouped together, and I can’t pass any of them through to a VM unless I pass them all through.

I’m mainly interested in a graphics card, an audio device, and a USB controller, and helpfully I have one of each isolated in their own groups, presumably because they are connected to PCIe lanes which go directly to the CPU.

Diagram illustrating that a graphics card, audio device, and USB controller are connected to PCIe lanes which go directly to the CPU, while a number of other devices are connected to the chipset.

Graphics

I created a virtual machine, also Debian 12, in the QEMU system session. The only important settings for now are the chipset and firmware: in my case I selected Q35 and the OVMF UEFI firmware. GPU passthrough will not work if a virtual machine is booting with legacy BIOS.

I use the WebGL Aquarium as a simple test for whether 3D acceleration is working. It runs much faster on the host system (this is using the RX 6400). The copy in the VM runs at just 3 frames per second, using SPICE display and virtio virtual GPU at this stage.

The next step was to isolate the GPU from the host. The relevant line from lspci -nn is:

0e:00.0 VGA compatible controller [0300]: Advanced Micro Devices, Inc. [AMD/ATI] Navi 21 [Radeon RX 6950 XT] [1002:73a5] (rev c0)

The vendor and device ID for this card is shown at the end of the line, 1002:73a5.

This needs to be added to the vfio-pci.ids boot option, which in my case involves updating /etc/default/grub.

GRUB_CMDLINE_LINUX="vfio-pci.ids=1002:73a5"

This is then applied by running update-grub2, and rebooting. It’s apparently possible to accomplish this without a reboot, but I’m following the easy path for now.

I verified that it worked by checking that the vfio-pci kernel module is in use for this device.

$ lspci -v
0e:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Navi 21 [Radeon RX 6950 XT] (rev c0) (prog-if 00 [VGA controller])
...
        Kernel driver in use: vfio-pci
        Kernel modules: amdgpu

Before booting up the VM, I added the card as a “Host PCI Device”. It was then visible in lspci output within the VM, but was not being used for video output yet.

To encourage the VM to output to the physical graphics card, I switched the virtualised video card model from “virtio” to “None”.

Booting up the VM after this change, the SPICE display no longer produces output, but USB redirection still works. I passed through the keyboard, then the mouse.

I then switched monitor inputs to the VM, now equipped with a graphics card, and the WebGL aquarium test ran at 30 FPS.

Now that I was switching between near-identical Debian systems, I started using different desktop backgrounds to stay oriented.

Audio output

I needed to make sure that audio output worked reliably in the virtual machine.

Sound was working out of the box through the emulated AC97 device, but I checked an online Audio Video Sync Test, and confirmed that there was a significant delay, somewhere in the 200-300 ms range. This is not good enough, so I deleted the emulated device and decided to try some other options.

I first tried passing through the audio device associated with the graphics card, identified as 1002:ab28.

0e:00.1 Audio device [0403]: Advanced Micro Devices, Inc. [AMD/ATI] Navi 21/23 HDMI/DP Audio Controller [1002:ab28]

I took the ID, and added it to the option in /etc/default/grub. Now that there are multiple devices, they are separated by commas.

GRUB_CMDLINE_LINUX="vfio-pci.ids=1002:73a5,1002:ab28"

As before, I ran update-grub2, rebooted, added the device to the VM through virt-manager, booted up the VM, and tried to use the device.

With that change, I could connect headphones to the monitor and the audio was no longer delayed. This environment would now be viable for developing apps with sound, or following along with a video tutorial for example.

Audio input

Next I attempted to pass through the on-board audio controller to see if I could get both audio input and output. Discussions online suggest that this doesn’t always work, but there is no harm in trying.

I’ll skip through the exact process this time, but I again identified the device, isolated it from the host, and passed it through to a VM.

10:00.4 Audio device [0403]: Advanced Micro Devices, Inc. [AMD] Starship/Matisse HD Audio Controller [1022:1487]

Output worked immediately, but the settings app did not show any microphone level, and attempts to capture would immediately stop. I did some basic reading and troubleshooting, but didn’t have a good idea of what was happening.

What worked for me was blindly upgrading my way out of the problem by switching to the Debian testing rolling release in the guest VM.

I was then able to see the input level in settings, and capture the audio with Audacity.

Audio input is not critical for me, but does allow me to move more types of work into virtual machines without switching to a USB sound card.

USB

My computer has two USB controllers. One is available for passthrough, while the other is in the same IOMMU group as all other chipset-attached devices.

02:00.0 USB controller [0c03]: Advanced Micro Devices, Inc. [AMD] Device [1022:43d0] (rev 01)
10:00.3 USB controller [0c03]: Advanced Micro Devices, Inc. [AMD] Matisse USB 3.0 Host Controller [1022:149c]

The controller I passed through is the “Matisse USB 3.0 Host Controller”, device ID 1022:149c, which I initially expected would have plenty of USB 3 ports attached to it.

Reading the manual for my motherboard more carefully, I discovered that this controller is only responsible for 2 USB-A ports and 1 USB-C port, or 20% of the USB ports on the system.

I use a lot of USB peripherals for hardware development. More physical USB ports would be ideal, but I can work around this by using a USB hub.

I’ll also continue to use USB passthrough via libvirt to get the keyboard and mouse into the VM. Instead of manually passing these through each time I start the VM, I added each device in the configuration as a “Physical USB Device”.

This automatically connects the device when the VM boots, and returns it to the host when the VM shuts down, without the SPICE console needing to be open. If I need to control the host and guest at the same time, I can connect a different mouse and keyboard temporarily – I’ve got plenty of USB ports on the host after all.

If this ever becomes a major issue, it should also be possible to switch this setup around, and pass through the IOMMU group containing all chipset-attached PCIe devices instead of the devices I’ve chosen. This would provide extensive I/O and expansion options to the VM, at the cost of things like on-board networking, SATA ports, and an NVMe slot on the host. The 3 USB ports on the “Matisse USB 3.0 Host Controller”, if left to the host, would be just enough for a mouse, keyboard, and USB-C Ethernet adapter.

File share

On my previous setup, I used a 9p fileshare to map a directory on the host to the same path on every VM, allowing an easy way to exchange files.

The path was /home/mike/workspace – a carry-over from when I used Eclipse. In practice it has been slower to work on a fileshare, so I’ll switch to developing in a local directory.

I’ll still set up a fileshare, but with two changes:

  • I’ll map a share to a more generic /home/mike/Shared, and start to back up anything that lands there.
  • I’ll use virtiofs instead of virtio-9p. This claims to be more performant, and it’s apparently possible to get this working on Windows as well.

This is added as a hardware device in virt-manager on the host.

In the guest, I added the following line to /etc/fstab.

shared /home/mike/Shared/ virtiofs

To activate this change, I would previously have created the mount-point and run mount -a. I recently learned that systemd creates mount-points automatically on modern systems, so I instead ran the correct incantation to trigger this process, which is:

$ sudo systemctl daemon-reload
$ sudo systemctl restart local-fs.target

Power management

In my current setup, sleep/wake causes instability, so I initially disabled idle suspend on the host in GNOME’s power settings.
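
For reference, the same change can be made from the command line with a single gsettings key (this is the standard GNOME key for suspend on AC power):

$ gsettings set org.gnome.settings-daemon.plugins.power sleep-inactive-ac-type 'nothing'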

While testing a VM which I had configured to start on boot, the system decided to go to sleep while I was using it. In hindsight this makes complete sense: as far as the host could tell, it was sitting on the login page with no mouse or keyboard input, and had been configured to go to sleep after an idle time-out (the setting within GNOME only applies after login).

Dec 05 16:24:43 mike-kvm systemd-logind[706]: The system will suspend now!

To avoid this, I additionally disabled relevant systemd units (source of this command).

$ sudo systemctl mask sleep.target suspend.target hibernate.target hybrid-sleep.target
Created symlink /etc/systemd/system/sleep.target → /dev/null.
Created symlink /etc/systemd/system/suspend.target → /dev/null.
Created symlink /etc/systemd/system/hibernate.target → /dev/null.
Created symlink /etc/systemd/system/hybrid-sleep.target → /dev/null.

I’ve also configured each VM to simply blank the screen when idle, since allowing a guest to suspend causes its own problems.

Recap of unexpected issues

Despite my efforts to plan around the limitations of my hardware, I did hit four unanticipated problems, mostly highlighted above.

  • I assumed that there would be a setting in my BIOS to change which graphics card to use during boot, but I couldn’t find one.
  • The audio devices associated with both of my graphics cards had the same vendor and device ID. The way I have it configured, no audio output is available on the host as a result.
  • Microphone input didn’t work on the HD Audio Controller until I upgraded the guest operating system. The usual workaround for this is apparently to use a USB sound card.
  • I misunderstood the limitations of my USB setup, so I have fewer USB ports directly available in the VM than I had hoped for.

Wrap-up and future ideas

I really like the idea of switching into an environment which only has the tools I need for the task at hand. Compared with my previous setup for developing in a VM, GPU passthrough (and sound controller passthrough, and USB controller passthrough) is a huge improvement.

I’ve manually provisioned 3 VM’s, including a general-purpose development VM which has a mix of basic tools so that I can get started.

Since I can only have one VM using the graphics card at a time, this setup works similarly to having multiple operating systems installed as a multi-boot setup. It does however have far better separation between the systems – they can’t read and write each others’ disks for example.

The next steps for me will be to streamline the process of switching VM’s and shutting down the system. Both of these currently require me to manually switch monitor inputs to interact with the host, which I would prefer to avoid.

Building a 1U quiet NAS

I recently built a compact, quiet rackmount NAS for home. I haven’t seen any builds quite like it online, so I’m writing a bit about how it came together.

The problem

My old backup was a mirrored pair of 2 TB hard drives in an old desktop computer, with a portable 2 TB hard drive as an off-site copy. The disks are now 9 years old, and 2 TB is small enough that I need to ration the space. I also recently repurposed most of the components in that system, so I no longer have an up-to-date backup.

I want to solve this properly, and hopefully build a replacement setup which I can install in my network cabinet, to run 24/7.

But if I’m going to do that, the must-haves are:

  • It fits in a 1U rack-mount form-factor with maximum 25cm depth
  • It’s quiet
  • It has front USB for making an offline copy of the backups
  • It has 2 SATA disks – I don’t want to be running a NAS off USB-SATA adapters, for example
  • It has wired ethernet – my network is 1 Gigabit

Nice-to-haves would be:

  • Fast front USB
  • Hot-swap disks or more disks
  • Faster network

I’ve worked with servers, and I’ve built small-form-factor computers, so how hard can it be to build a small-form-factor server?

Parts list

I spent a lot of time sketching out possible builds in LibreOffice Draw. Every combination of parts had some compromise, which is a familiar theme from small-form-factor PC building.

I ended up deciding on parts which follow normal PC standards, hopefully giving me a good chance of keeping it working for many years.

  • Case: Athena Power RM1UC138
  • Power supply: HDPLEX GAN 250W
  • Motherboard: Topton N6005 Mini ITX
  • RAM: Crucial 16GB (2 x 8GB) DDR4 3200 SO-DIMM
  • Boot disk: Samsung 970 EVO Plus 1TB M.2 SSD
  • Storage disks: 2 x Samsung 870 QVO 8TB 2.5″ SSD
  • Fans: 4 x Noctua A4x20 PWM
  • Drive cage: Icy Dock MB608SP-B – 6 x 2.5″ SATA bays
  • Set of 6 x 50cm SATA cables

There are some compatibility issues in the above part list, and it took a little bit of problem-solving to get everything working together.

Closer look at the Athena Power RM1UC138

Short-depth 1U computer cases are nearly impossible to find in Australia. I ordered the Athena Power RM1UC138 from the United States, which is an OEM case with some flexibility built in.

On the front it has a 5.25″ bay, which I plan to use to add a drive cage. The front USB is only USB 2.0, which would normally be a disadvantage, but in this instance is a good match for the motherboard I’m using. The power button looks like a toggle switch, but is actually a momentary switch.

The side shows that the rack ears can be put on the front or back, allowing the case to be mounted in either direction.

On the back there is a wire mesh panel for cutting out a custom I/O shield, which is handy since I have an off-brand motherboard.

On the inside, it’s configured for 2 x 3.5″ hard drives by default. It also includes 2 x 2-pin 12 V fans. They are loud, similar to what you would find in most managed network switches, but are not jet-engine loud like most servers. The fan controller simply distributes 12 V, and has no speed control.

This is not a very common case, and I read everything I could find online about it. In order to help the next person who is searching for it, here are two more random pieces of information which I could not confirm until I had the case in-hand:

  • Stand-offs on the case are all 4mm tall and non-removable
  • Screw spacing of USB 2.0 front panel connector is approx 30.5mm (from centre of each screw). Stacked USB 3 headers that are 30mm spacing are available online and could be made to work.

Closer look at the Topton N6005 Mini ITX

I chose to use a Topton motherboard with a built-in Intel N6005 CPU for this build, since the alternatives were either too tall, use a socketed CPU, use an old CPU, would require add-in cards to get multiple SATA ports, or were not sold in Australia. All of these would make it far more difficult to complete the small, quiet build which I was aiming for.

From the few threads online about this board, I gathered that it is fussy about RAM compatibility, so I booted it up at the first opportunity with an SSD containing Pop!_OS to check that it worked. It’s not my use case, but this motherboard would definitely be viable as a lightweight desktop.

The specific memory I used was a 16 GB kit with the model number CT2K8G4SFRA32A. According to Intel’s product documentation, the N6005 only supports 16 GB maximum, and while I could find claims that higher-capacity memory does work, I couldn’t find anybody who posted actual part numbers.

I was happy to find that the built-in cooler is inaudible at idle loads. This was a bit of a risk: the cooler doesn’t have standard dimensions, so I couldn’t have easily replaced it with an alternative if it was noisy. Based on other people’s experiences with this board, I re-pasted the cooler with Noctua NT-H1 thermal paste, to hopefully help keep temperatures down at higher loads so that the fan will not need to spin up as much. I also avoided using the M.2 slot which receives the most hot air from the cooler.

Topton also sells an N5105 variant which appears to be more popular (more info here), as well as an alternative layout which has a PCIe slot instead of a second M.2 slot.

Custom power supply adapter plate

The case is designed for a Flex ATX power supply, which is not what I’m using. I’ve instead opted to use a passively-cooled HDPLEX GAN 250W power supply, which ships with both an IEC C6 and IEC C14 cable.

I needed to choose one of these cables, and figure out how to securely mount it to the case.

I designed an adapter plate in FreeCAD around the included IEC C6 cable, since it had threaded holes, and screws were included.

I ordered it from a prototype supplier in laser cut 1mm steel, painted in matte black.

This is the first time I’ve used FreeCAD to design a part, and parametric CAD certainly has a learning curve. For this build, it was well worth it, since the result is better (and safer) than anything I could have improvised.

At the time of writing this post, HDPLEX sells plates for mounting their IEC C14 cables in cases accepting SFX and ATX power supplies, but none for cases which accept Flex ATX power supplies.

Custom fan controller

Cooling this build quietly was always going to be a challenge. The case shipped with 2 x 12 V fans, and had a simple splitter which ran them at max speed, which was just too loud.

I designed a replacement fan controller in KiCad, which allows me to upgrade to high-quality 4-pin fans with PWM speed control, and to set the speed using a potentiometer. I wrote about prototyping this in a separate blog post.

This photo shows the custom controller alongside the original one it replaces.

As you can probably guess, I’ve designed this to use the same mounting location, at the front of the case. My power supply has no Molex connectors, so I’m using a SATA-Molex power adapter.

The main drawback to this simple design is that once I close up the case, I can no longer adjust the fan speed.

Final assembly

Before continuing any further, I took apart the case completely and deleted three standoffs with a belt sander, to leave a flat area for power supply installation later.

Once I got the case back together, the motherboard went in first. I raised it by 1mm using plastic washers, hoping to line it up better with the I/O shield included with the motherboard.

The I/O opening for 1U servers is narrower than standard PC builds, so I needed to cut the I/O shield, which I unfortunately did not do correctly.

Since that did not work, I carefully marked and cut out the wire mesh I/O shield included with the case instead. I still left the motherboard raised up on 1mm washers, though this is not necessary anymore. You need to use slightly longer screws if you try this.

After that I installed the four case fans, plus the fan controller. I’m using a front-to-back airflow, with 2 x 40mm fans mounted at the front, and 2 x 40mm fans at the back. I added a Y splitter to the back fans, which did not have long enough cables to reach the fan controller.

The next component I installed was the drive cage. It’s worth mentioning at this point that the drive cage also has a fan header, which is the same as a 3-pin header that you would find on a PC motherboard. It supplies a different voltage for each of the speed settings. Medium is approximately 7.5 volts and is relatively quiet with the included “Good quality DC fan” fan, and high speed is 12 volts. I set mine to off but left the fan installed.

To install items into the 5.25″ bay in this case, you attach a bracket, then fasten it from above. The bracket allows the depth to be adjusted as well.

I also installed the disks in the drive cage at this point, and numbered them on the front. Disk 1 is connected to SATA0 on the motherboard, disk 2 is connected to SATA1, and so on.

Next was the power supply. I installed the custom plate for the power connector, and also installed the mounting plate on the bottom of the PSU so that it would have a flat surface. After confirming that it would fit, I cleaned both surfaces with alcohol, and applied double-sided tape.

I then followed a rehearsed path to drop the PSU into place. There is no opportunity to adjust it once it sticks.

At this point I connected everything up and booted up the system to start checking for problems, since it’s easier to troubleshoot in this state. Two modifications I made here were to disconnect the bright red HDD LED, and to introduce a SATA power Y splitter, because the power supply SATA cables were stretched to the limit.

It took a lot of work (and cable ties) to arrange the cables flat so that the case could close. In defence of cable ties, they do make maintenance more difficult, but that’s a worthwhile trade-off for keeping cables clear of airflow paths, fan blades, and the guillotine-like action of the top cover sliding shut.

Completed build

After closing the case, the build measures 434 mm × 254 mm × 44 mm, or 4.8 litres, excluding rack ears.

This is how it appears from the front.

And this is how it appears from the back.

Software

I’m starting with Proxmox, with OpenMediaVault deployed as a virtual machine. I haven’t used either of these before, but both are Debian-based and provide convenient web front-ends to the tools I would otherwise be configuring on the command-line.

I’m passing through the disks as block devices. Running the NAS like this should make it possible to provision extra workloads which need their own SATA disks in future, or to switch from OpenMediaVault to stock Debian if necessary, all without connecting a monitor.
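
In Proxmox, attaching a whole physical disk to a VM is a single qm command; the VM ID and the disk’s by-id name below are placeholders rather than my actual values:

# Attach a whole SATA disk to VM 100 as a virtual SCSI disk (IDs are placeholders).
$ qm set 100 -scsi1 /dev/disk/by-id/ata-Samsung_SSD_870_QVO_8TB_XXXXXXXXXXXX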

Within OpenMediaVault, I’ve configured Linux software RAID, with an ext4 filesystem, shared via Samba, and can access that file share over the network.
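
OpenMediaVault drives all of this from its web UI, but under the hood it is ordinary Linux tooling. A rough sketch of the equivalent manual setup, with assumed device names, would be:

# Roughly what OpenMediaVault configures via its web UI (device names are assumptions).
$ sudo mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/sdb /dev/sdc
$ sudo mkfs.ext4 -L backup /dev/md0

The Samba share on top of that filesystem is a separate step, also handled through the web UI.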

I’ve enabled some basic power management features such as C-states. The system idles in the range of 12-14 watts measured from the wall, and goes up to 20 watts when moving files around.

I don’t need a lot of disk capacity, so I’ve been able to preserve a useful property of my old setup, where every disk in the system has a full copy of the data, in a format which can be understood by a normal Linux system. This makes single-disk recovery possible using any surviving disk from the system on practically any computer, and that disk can be from either an offline copy or one of the disks in the RAID mirror.

I haven’t tested the process of making an offline copy of the backup volume, but that will be up next.

Wrap-up

This is possibly the most effort I’ve ever put into a PC build. The only unexpected issue I encountered is how heavy it is, and I won’t be rack-mounting it until I get some generic rails.

The computer uses a strange mix of parts, but meets my requirements well. I hope that by writing this up, I’ll be providing some useful notes to anybody attempting to build something similar.

This project also gave me a chance to practice my entry-level CAD skills to build something which I’ll actually be using. I find a lot of utility in paper prototyping, and printed each design in 1:1 scale to check the physical dimensions before ordering anything.

For the circuit board, I used a print-out to check each part footprint, as well as the hole locations for fitting it in the case.

As with many of the projects which I blog about, I’ve put the design files up on GitHub. The fan controller is at mike42/fan-controller-athena-power, while the Flex ATX adapter plate is at mike42/flexatx-adapter-hdplex.

libvirt: Migrate a VM from qemu:///session to qemu:///system

In recent versions of the libvirt virtualisation libraries, you can create and manage virtual machines as a regular user, using the qemu:///session connection.

This is great, but the networking is quite limited. I found that machines defined in GNOME Boxes could not speak to each other, and that libvirt commands for networking were unavailable.

For this reason, I’ve written this quick guide for booting up an existing VM image under the qemu:///system instance, which is faster than re-installing the machine. Unlike most sorts of migrations, this leaves the disk image at the same location on the same host machine.

There are many different ways to run VM’s on Linux. These steps will only be useful if you use libvirt/KVM with qcow2 images on Debian. As always, consider doing a backup before trying new things.

Configurations

First, find your virtual machine in virsh, and dump its configuration to a text file in your home directory, as a regular user.

$ virsh list --all
 Id    Name                           State
----------------------------------------------------
 -     foo-machine                    shut off
$ virsh dumpxml foo-machine > foo-machine.xml

Now remove the VM definition from your user:

$ virsh undefine foo-machine
Domain foo-machine has been undefined

Import the definitions into virsh as the root user:

$ sudo virsh define foo-machine.xml 
Domain foo-machine defined from foo-machine.xml

Attempt to start the new VM definition. Depending on where the disk image is, expect an error.

$ sudo virsh start foo-machine

Disk images

The disk image needs to be accessible to the libvirt-qemu user. There are two basic ways to achieve this: re-permission the directories above it, or move it.
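
The second option, moving the image into the system location, would look something like this (after which you would update the <source file> path in the domain XML to match):

$ sudo mv /home/example/.local/share/gnome-boxes/images/foo-machine /var/lib/libvirt/images/
$ sudo virsh edit foo-machine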

I chose to just re-permission it, since it’s not an issue to have world-readable directories on this particular box:

$ cat foo-machine.xml | grep source
      <source file='/home/example/.local/share/gnome-boxes/images/foo-machine'/>

This one-liner outputs the commands to run to make a directory navigable by other users:

$ dir=`pwd`; while [ "$dir" != "/" ]; do echo "chmod o+x,g+x \"$dir\""; dir=`dirname $dir`; done
chmod o+x,g+x "/home/example/.local/share/gnome-boxes/images"
chmod o+x,g+x "/home/example/.local/share/gnome-boxes"
chmod o+x,g+x "/home/example/.local/share"
chmod o+x,g+x "/home/example/.local"
chmod o+x,g+x "/home/example"
chmod o+x,g+x "/home"

And the libvirt-qemu user needs to be able to write to the image as well:

$ sudo chown libvirt-qemu /home/example/.local/share/gnome-boxes/images/foo-machine

Once you have the permissions right, the VM should start, using the same command as before:

$ sudo virsh start foo-machine

More importantly, you can now hook up virt-manager and view your machine on qemu:///system, allowing you to configure the VM with any network settings you need.

How to resize a Windows VM image with virt-resize

I recently had a Windows 7 virtual machine stored on an undersized qcow2 file. This post steps through the simplest way that I know to produce a new, bigger disk and expand the filesystem onto it.

Empty out the empty space

Because the guest VM is stored on a qcow2 file, we can recover unused space on disk by zeroing it out now. Download the sdelete utility from Microsoft and run it on the system.

sdelete -z

Once this is done, power off the guest.

Assuming the host is Linux, you need the qemu-utils and libguestfs-tools packages to follow these steps. On Debian:

apt-get install libguestfs-tools qemu-utils

Move the VM image to a new filename and inspect it.

mv windows.img windows.img.bak

The file command indicates that the virtual disk size is about 30 GB.

$ file windows.img.bak 
windows.img.bak: QEMU QCOW Image (v3), 32212254720 bytes

The qemu-img command shows that the disk is 83% full:

$ qemu-img check windows.img.bak 
No errors were found on the image.
411337/491520 = 83.69% allocated, 5.66% fragmented, 0.00% compressed clusters
Image end offset: 26961969152

Check the filesystem names, and note that /dev/sda2 is the partition we want to up-size in this case:

$ virt-filesystems -a windows.img -l
Name       Type        VFS   Label            Size         Parent
/dev/sda1  filesystem  ntfs  System Reserved  104857600    -
/dev/sda2  filesystem  ntfs  -                32105299968  -

Make a new, bigger disk image

Create a new disk of the desired size. In my case, 50G is sufficient:

$ qemu-img create -f qcow2 windows.img 50G
Formatting 'windows.img', fmt=qcow2 size=53687091200 encryption=off cluster_size=65536 lazy_refcounts=off refcount_bits=16

The file command shows that this new empty disk image is larger than the old image.

$ file windows.img
windows.img: QEMU QCOW Image (v3), 53687091200 bytes

Copy the old disk to the new one. The --expand option names a partition which will be grown to fill the extra space.

virt-resize --expand /dev/sda2 windows.img.bak windows.img

The virt-resize command shows a progress bar while it works. Because the output is a fresh qcow2 file, the blocks we zeroed earlier are not allocated in the new image:

2016-04-disk01

The final line of output suggests holding on to your backup until you’ve checked it, which is wise:

Resize operation completed with no errors. Before deleting the old disk,
carefully check that the resized disk boots and works correctly.

Check that the new disk is valid and contains partitions at the expected size:

$ virt-filesystems -a windows.img -l
Name       Type        VFS   Label            Size         Parent
/dev/sda1  filesystem  ntfs  System Reserved  104857600    -
/dev/sda2  filesystem  ntfs  -                53579939840  -

Boot up the guest

When the machine boots up, you may get a disk check prompt. Because the console I was using triggered the ‘Press any key to cancel’ prompt, I had to reboot and leave the console disconnected in order for the check to start.

2016-04-disk02

2016-04-disk03

After booting, the C:\ drive should display at its new size:

2016-04-disk04

Crash course: Run Windows on desktop Linux

Sometimes, you need to use a tricky Windows-only proprietary program on a GNU/Linux desktop. If you have a Windows install disk and licence at your disposal, then this post will show you how to get a Windows environment running without dual-booting.

The host here is a Debian box, and the guest is running Windows 7. The instructions will work with slight modifications for any mix of GNU/Linux and Windows.

On the desktop, some things are not as important as they are in the server world. A few are excluded here for simplicity: network bridging, para-virtualised disks, migration between hosts, and disk replication.

Software setup

Everything required from the host machine can be pulled in via Debian’s qemu-kvm package.

sudo apt-get install qemu-kvm

Install

Prepare a disk image for Windows. The qcow2 format is suggested for the desktop as it will not expand the file to the full size until the guest uses the space:

qemu-img create -f qcow2 windows.img 30G

Launch the Windows installer in KVM with a command that looks something like this:

kvm -hda windows.img --cdrom windows-install-disc.iso -vga std -localtime -net nic,model=ne2k_pci -m 2048

Note the -m option is the number of megabytes of RAM to allocate. You can set it a little lower if you don’t have much to spare, but if it’s too low you’ll get this screen:

2014-06-04-capture-noboot

If you have a physical disk but no .iso of it, then using the disk drive via --cdrom /dev/cdrom will work.

Install

If you have GNU/Linux, chances are you have installed an OS before. In case you haven’t seen the Windows 7 installer, the steps are below:

Select language, accept the licence agreement, choose the target disk, and let the files copy:

2014-06-04-capture-005
2014-06-04-capture-023
2014-06-04-capture-024
2014-06-04-capture-028
2014-06-04-capture-033
2014-06-04-capture-035
2014-06-04-capture-081
2014-06-04-capture-149

After reboot, enter the user details, licence key, update settings and timezone:

2014-06-04-capture-160
2014-06-04-capture-178
2014-06-04-capture-190
2014-06-04-capture-191
2014-06-04-capture-199
2014-06-04-capture-205
2014-06-04-capture-214
2014-06-04-capture-250
2014-06-04-capture-254
2014-06-04-capture-261

After another reboot, Windows is installed in the virtual machine:

2014-06-04-capture-263
2014-06-04-capture-265
2014-06-04-capture-277

Post-install tasks

The guest you have now will only run at standard VGA resolutions, and will probably not be networked. This section will show you how to fix that.

Network drivers

You will notice that we launched the installer with -net nic,model=ne2k_pci, which emulates an old physical network card. For the installed system it is worth switching to -net nic,model=virtio, a paravirtualised network card which performs better, but which Windows has no built-in driver for. You need to fetch a disk image with the latest binary virtio drivers, which are best tracked down on linux-kvm.org via Google.

Once you have the driver disk image in the same folder as your virtual machine, shut down the guest, then launch it again with the driver disc attached and the network model switched to virtio:

kvm -hda windows.img --cdrom virtio-win-0.1-74.iso -vga std -localtime -net nic,model=virtio -net user -m 2048

Under "My Computer" track down the "Device Manager", find your network card, and tell Windows to update the drivers. You can then point it to the CDROM’s "Win7" directory (or other one, if you are installing a different guest). After the network adapter is recognised, you will be connected automatically.

Note that you are using "user-mode" networking, which means you are on a simulated subnet, and can only communicate via TCP and UDP (ie, ping will not work). This can be a little slow, but will work on a laptop host whether plugged in or running on WiFi.

Remote desktop

You may also be annoyed by the screen resolution and mouse sensitivity having strange settings. The best way around this is not to fiddle with settings and drivers, but to enable remote desktop and log in via the network. This lets you use an arbitrary screen size, and match mouse speed to the host.

This is set up to run locally, so it is neither laggy nor a security issue, and makes it possible to leverage all RDP features.

First, in the guest, enable remote desktop using these Microsoft instructions.

Then shut down and boot up with the extra -redir tcp:3389::3389 option:

kvm -hda windows.img -vga std -localtime -net nic,model=virtio -net user -m 2048 -redir tcp:3389::3389

On the host, wait for the guest to boot, then use rdesktop to log in:

rdesktop localhost

Once this works, you can shut down and boot with the extra -nographic option to leave remote desktop as the only way to interact with the guest:

kvm -hda windows.img -vga std -localtime -net nic,model=virtio -net user -m 2048 -nographic -redir tcp:3389::3389

The rdesktop tool supports sound, file and printer redirection. It can also run fullscreen when launched with -f. All the details are in man rdesktop

If you end up using the guest operating system more, it is worth investigating USB-redirection for any peripherals (printers or mobile phones), running a virtual sound card, or running SAMBA on the host to share files.

A tour of ReactOS 0.3.15

ReactOS is a project which aims to create an open source operating system which is binary-compatible with Windows. Although it is still cautiously labelled “alpha”, its basic use is about as reliable as Windows once was.

This post runs through the steps to install ReactOS 0.3.15 as a KVM guest on Linux.

Preparation

Before attempting anything, check that your CPU supports Intel VT or AMD-V. This command will return the number of CPU cores with svm or vmx flags:

cat /proc/cpuinfo | grep -E 'svm|vmx' | wc -l

Now download the ReactOS 0.3.15 disk from reactos.org, extract it to get the .iso, and fetch some packages if you don’t have them installed:

apt-get install libvirt-bin kvm qemu-utils

Prepare a disk image to install to. If your hardware is slower, then a raw image is a better idea than the qcow2:

qemu-img --help
qemu-img create -f qcow2 reactos.img 4G

The working directory now has:

mike@mikebox:~/vm/reactos$ ls -Ahl
total 77M
-rw-r--r-- 1 mike mike  77M May 19  2013 ReactOS-BootCD.iso
-rw-r--r-- 1 mike mike 193K Jan 30 21:05 reactos.img

Installation and first boot

The kvm command will pop up a window with the guest operating system. To boot from the install disk, run:

kvm -hda reactos.img --cdrom ReactOS-BootCD.iso -vga std -localtime -net nic,model=ne2k_pci -net user

The meaning of each of these options is:

-hda reactos.img
Sets the HDD image file.
--cdrom ReactOS-BootCD.iso
Sets the CDROM image file. Because reactos.img is blank, the machine will boot from this disc.
-vga std
Sets the VGA card.
-localtime
Emulates a system clock in local time, rather than UTC.
-net nic,model=ne2k_pci
Sets the network card to something ReactOS will recognise.
-net user
Enables user-mode networking. Your computer will emulate a network and pass on TCP and UDP connections. This is the easiest mode to use, but ICMP packets (such as pings) will not work, and the VM will not be accessible from other computers.

Installation was fast, error-free, and did not require a network connection. The first screen capture below was taken at 16:04:49, and the desktop was captured at 16:06:07 (1 minute 18 seconds later). Most of that time would have been wasted waiting for user input.

ReactOS installer language select
ReactOS install or repair
ReactOS installer disclaimer
Confirm install settings
Select partition
Formatting options for new partition
Formatting confirmation dialog
ReactOS install directory
ReactOS installer copying files
Bootloader options (freeloader)

After copying files, the installer reboots to a more user-friendly mode (similar to the Windows installer):

ReactOS boot menu
ReactOS loading NTOSKRNL.EXE
ReactOS boot screen
ReactOS Install - Installing devices
ReactOS Install - Welcome to the ReactOS Setup Wizard
ReactOS Install - Acknowledgements
ReactOS Install - Personalize your Software
ReactOS Install - Computer name and Administrative Password
ReactOS Install - Regional settings
ReactOS Install - Date and Time
ReactOS Install - Registering Components
ReactOS Install - Completing the ReactOS Setup Wizard
ReactOS desktop after installation

The installed system

After installation, the --cdrom option can be dropped:

kvm -hda reactos.img -vga std -localtime -net nic,model=ne2k_pci -net user

The first thing I did was correct the colour depth, and then attempt to install VLC. This did not turn out well (the console screenshot is the QEMU monitor):

Reactos display properties
VLC installation in ReactOS Applications Manager
Ctrl+Alt+Del from QEMU monitor
BSOD during VLC installation

I used command prompt to verify that networking was fine (note the lack of ICMP in user-mode networking):

Ping from command prompt with user-mode networking as KVM guest
Testing network on ReactOS

The Firefox 22 install worked, but it went awry after that. Several reboots later I gave up:

Firefox installer on ReactOS
Firefox frozen on startup
BSOD while running firefox on ReactOS

The built-in programs were much more usable:

ReactOS paint
ReactOS paint Save As dialog
ReactOS Explorer
ReactOS Start Menu showing Administrative Tools
ReactOS Device Manager

PuTTY installed flawlessly, and I was able to SSH to the host computer:

PuTTY installation on ReactOS
PuTTY readme in ReactOS notepad
Main PuTTY window after installation
Using command prompt to find host computer address:
PuTTY with host computer address
PuTTY connected to host computer, showing 'uname -a' output

An example of a frozen program causing graphics glitches (Windows up to XP does this as well):


Frozen Application Manager in ReactOS

And an obligatory screenshot of the “Properties for System” dialog, showing the build as 20130518-r59037:


Reactos 'Properties of System' dialog 20130518-r59037

Conclusion

ReactOS is a cool idea and project, but the OS is still very glitchy. The built-in apps are stable and familiar-looking, but you would require a lot of patience (and a lot of rebooting) to use a ReactOS system for more than a few minutes.

Being open source is a big plus, as there is no need to activate the installation or enter software keys. GNU/Linux users will already be accustomed to this.