There are two sets of dependencies required for the development process: build and run dependencies. We typically build, develop, and test using the latest Ubuntu LTS version and run nrk in QEMU. Other Linux systems will probably work but may require manual installation of all dependencies. Other operating systems likely won't work out of the box without some adjustments to the code and the build process.
Check out the nrk sources first:
```bash
git clone <repo-url>
cd nrk
```
The repository is structured using git submodules. You'll have to initialize and check out the submodules separately:
In case you don't have the SSH key of your machine registered with a GitHub account, you need to convert all submodule URLs to use the https protocol instead of SSH. To do so, run this sed script before proceeding:
```bash
sed -i'' -e 's/git@github.com:/https:\/\/github.com\//' .gitmodules
```
```bash
git submodule update --init
```
If you want to build without Docker, you can install both build and run dependencies by executing `setup.sh` in the root of the repository directly on your machine (this requires the latest Ubuntu LTS). The script will install all required OS packages, install Rust using `rustup`, and install some additional Rust programs.
The build dependencies can be divided into these categories (a sketch of the Rust-related installs follows the list):

- Rust (nightly) and the `rust-src` component for compiling the OS
- `python3` (and some Python libraries) to execute the build and run script
- Test dependencies (qemu, corealloc, dhcpd, redis-benchmark, socat, graphviz etc.)
- Rumpkernel dependencies (gcc, zlib1g etc.)
- Documentation build dependencies (mdbook)
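If you prefer to set up pieces by hand, the Rust-related dependencies can be installed with `rustup` and `cargo`. This is only a sketch, assuming `rustup` is already installed; `setup.sh` normally does all of this for you:

```bash
# Install the nightly toolchain and the rust-src component (needed to build the OS):
rustup toolchain install nightly
rustup component add rust-src --toolchain nightly
# Install mdbook for building the documentation:
cargo install mdbook
```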
We provide scripts to create a Docker image that already contains all build dependencies. To use Docker, it needs to be installed on your system. On Ubuntu, execute the following steps:
```bash
sudo apt install docker.io
sudo service docker restart
sudo addgroup $USER docker
newgrp docker
```
To create the image, execute the following command (the same `docker-run.sh` script referenced below) in the root of the repository:
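```bash
# Builds the image on first run and drops you into a shell in the container:
bash ./scripts/docker-run.sh
```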
This will create the docker image and start the container. You will be dropped into a shell running inside the Docker container. You can build the OS as if you had installed the dependencies natively.
The script will create a user inside the docker container that corresponds to the user on the host system (same username and user ID).
You can rebuild the image with:

```bash
bash ./scripts/docker-run.sh force-build
```
To exit the container, just type `exit` to terminate the shell.
To just build the OS, invoke the `run.py` script (in the `kernel` directory) with the `-n` parameter (no-run flag):

```bash
python3 kernel/run.py -n
```
If you want to run the build in a Docker container, run `bash ./scripts/docker-run.sh` beforehand. The source directory tree will be mounted inside the Docker container.
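For example, a containerized build can look like this, using the commands from above:

```bash
bash ./scripts/docker-run.sh   # drops you into a shell inside the container
python3 kernel/run.py -n       # build the OS without running it
```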
Make sure the installed QEMU version is >= 6. The following steps can be used to build it from scratch if the Ubuntu release ships an older version in its package repository.
First, make sure to uncomment all `#deb-src` lines in `/etc/apt/sources.list` if not already uncommented. Then, run the following commands:
```bash
sudo apt update
sudo apt install build-essential libpmem-dev libdaxctl-dev ninja-build
apt source qemu
sudo apt build-dep qemu
wget https://download.qemu.org/qemu-6.0.0.tar.xz
tar xvJf qemu-6.0.0.tar.xz
cd qemu-6.0.0
./configure --enable-rdma --enable-libpmem
make -j 28
sudo make -j28 install
sudo make rdmacm-mux
# Check version (should be >=6.0.0)
qemu-system-x86_64 --version
```
You can also add `--enable-debug` to the configure script, which will add debug information (useful for source information when stepping through qemu code in a debugger).

`run.py` can enable RDMA support in QEMU. However, you'll manually have to run `rdmacm-mux` and unload the Mellanox kernel modules at the moment.
QEMU has support for `pvrdma` (a para-virtual RDMA driver) which integrates with physical cards (like Mellanox). In order to use it (aside from the `--enable-rdma` flag and `sudo make rdmacm-mux` during building), the following steps are necessary:
Install Mellanox drivers (or any other native drivers for your RDMA card). The OFED version below is a placeholder; adjust it to match your card and distribution:

```bash
wget https://content.mellanox.com/ofed/MLNX_OFED-5.4-<version>/MLNX_OFED_LINUX-5.4-<version>-ubuntu20.04-x86_64.tgz
tar zxvf MLNX_OFED_LINUX-5.4-<version>-ubuntu20.04-x86_64.tgz
cd MLNX_OFED_LINUX-5.4-<version>-ubuntu20.04-x86_64
./mlnxofedinstall --all
```
Before running the `rdmacm-mux`, make sure that both the `ib_cm` and `rdma_cm` kernel modules aren't loaded; otherwise the rdmacm-mux service will fail to start:

```bash
sudo rmmod ib_ipoib
sudo rmmod rdma_cm
sudo rmmod ib_cm
```
Start the QEMU `rdmacm-mux` utility (before launching a qemu VM that uses pvrdma):

```bash
./rdmacm-mux -d mlx5_0 -p 0
```
`run.py` can add persistent memory to the VM. If you want to customize this further, read on.
QEMU has support for NVDIMM, which is provided by a memory-backend-file or memory-backend-ram. A simple way to create a vNVDIMM device at startup time is via the following command-line options (a combined example follows the list below):
```bash
-machine pc,nvdimm
-m $RAM_SIZE,slots=$N,maxmem=$MAX_SIZE
-object memory-backend-file,id=mem1,share=on,mem-path=$PATH,size=$NVDIMM_SIZE
-device nvdimm,id=nvdimm1,memdev=mem1
```
Where:

- The `nvdimm` machine option enables the vNVDIMM feature.
- `slots=$N` should be equal to or larger than the total number of normal RAM devices and vNVDIMM devices; e.g., `$N` should be >= 2 here.
- `maxmem=$MAX_SIZE` should be equal to or larger than the total size of normal RAM devices and vNVDIMM devices.
- `-object memory-backend-file,id=mem1,share=on,mem-path=$PATH,size=$NVDIMM_SIZE` creates a backend storage of size `$NVDIMM_SIZE` on the file `$PATH`.
- `share=on/off` controls the visibility of guest writes. If `share=on`, then the writes from multiple guests will be visible to each other.
- `-device nvdimm,id=nvdimm1,memdev=mem1` creates a read/write virtual NVDIMM device whose storage is provided by the above memory backend device.
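Putting it together, a minimal sketch of a full invocation might look like this (the RAM and NVDIMM sizes and the backing path are examples):

```bash
qemu-system-x86_64 \
    -machine pc,nvdimm \
    -m 4G,slots=2,maxmem=20G \
    -object memory-backend-file,id=mem1,share=on,mem-path=/mnt/pmem0/nvdimm0,size=16G \
    -device nvdimm,id=nvdimm1,memdev=mem1
```

Here `maxmem` (20G) covers the 4G of normal RAM plus the 16G vNVDIMM, and `slots=2` accounts for one RAM device and one vNVDIMM device.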
Though QEMU supports multiple types of vNVDIMM backends on Linux, the only backend that can guarantee guest write persistence is:

- a DAX device (e.g., `/dev/dax0.0`), or
- a DAX file (mounted with the `dax` option)
When using a DAX file (a file supporting direct mapping of persistent memory) as a backend, write persistence is guaranteed if the host kernel has support for the `MAP_SYNC` flag in the `mmap` system call and, additionally, both the 'pmem' and 'share' flags are set to 'on' on the backend.
Users can provide a persistence value to a guest via the optional `nvdimm-persistence` machine command-line option. There are currently two valid values for this option (an example follows the list):
- `mem-ctrl` - The platform supports flushing dirty data from the memory controller to the NVDIMMs in the event of power loss.
- `cpu` - The platform supports flushing dirty data from the CPU cache to the NVDIMMs in the event of power loss.
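For example, the option is passed as part of the machine definition (a sketch; the value shown is one of the two above):

```bash
-machine pc,nvdimm,nvdimm-persistence=cpu
```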
Linux systems allow emulating DRAM as PMEM. These devices are seen as persistent memory regions by the OS. Usually, these devices are faster than actual PMEM devices but do not provide any persistence, so they are used only for development purposes.
On Linux, to find a DRAM region that can be used as PMEM, use dmesg:

```bash
dmesg | grep BIOS-e820
```
A viable region will have the word "usable" at the end:

```
[ 0.000000] BIOS-e820: [mem 0x0000000100000000-0x000000053fffffff] usable
```
This means that the memory region between 4 GiB (0x0000000100000000) and 21 GiB (0x000000053fffffff) is usable. Say we want to reserve a 16 GiB region starting from 4 GiB; we need to add this information to the grub configuration file.
```bash
sudo vi /etc/default/grub
# add: GRUB_CMDLINE_LINUX="memmap=16G!4G"
sudo update-grub2
```
After rebooting with our new kernel parameter, `dmesg | grep user` should show a persistent memory region like the following:

```
[ 0.000000] user: [mem 0x0000000100000000-0x00000004ffffffff] persistent (type 12)
```
We will see this reserved memory range as `/dev/pmem0`. Now the emulated PMEM region is ready to use; mount it with the dax option:
```bash
sudo mkdir /mnt/pmem0
sudo mkfs.ext4 /dev/pmem0
sudo mount -o dax /dev/pmem0 /mnt/pmem0
```
Use it via `mem-path=/mnt/pmem0` as explained earlier.
The NVDIMMs need to be configured and provisioned before they can be used. The `ipmctl` tool can be used to discover and provision the Intel PMs on a Linux machine.
To show all the NVDIMMs attached to the machine, run:
```bash
sudo ipmctl show -dimm
```
To show all the NVDIMMs attached to a particular socket, run:

```bash
sudo ipmctl show -dimm -socket SocketID
```
NVDIMMs can be configured both in volatile (MemoryMode) and non-volatile (AppDirect) modes, or a mix of the two, using the `ipmctl` tool on Linux.
We are only interested in using the NVDIMMs in AppDirect mode. Even in AppDirect mode, the NVDIMMs can be configured in two ways: AppDirect and AppDirectNotInterleaved. In AppDirect mode, the data is interleaved across multiple DIMMs; to use each NVDIMM individually, AppDirectNotInterleaved is used. To configure multiple DIMMs in AppDirect interleaved mode, run:
```bash
sudo ipmctl create -goal PersistentMemoryType=AppDirect
```
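To instead use the non-interleaved mode described above, the same command pattern applies with the other type value (a sketch; consult the ipmctl documentation for your version):

```bash
sudo ipmctl create -goal PersistentMemoryType=AppDirectNotInterleaved
```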
Reboot the machine to apply the changes made by the `-goal` command. The command creates a region on each socket of the machine. To list the regions, run:

```bash
ndctl list --regions
```
To also show the DIMMs included in each region, run:

```bash
ndctl list --regions --dimms
```
Each region can be divided into one or more namespaces in order to expose storage devices to the operating system. To create the namespace(s), run:
```bash
sudo ndctl create-namespace --mode=[raw/sector/fsdax/devdax]
```
The namespace can be created in different modes like raw, sector, fsdax, and devdax. The default mode is fsdax.
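For example, to create a namespace in the default fsdax mode on a specific region (the region name here is illustrative; use one reported by `ndctl list --regions`):

```bash
sudo ndctl create-namespace --mode=fsdax --region=region0
```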
Reboot the machine after creating the namespaces, and the devices will show up in `/dev/*` depending on the mode. For example, if the mode is fsdax, the devices will be named `/dev/pmem*`.
Mount these devices:
```bash
sudo mkdir /mnt/pmem0
sudo mkfs.ext4 /dev/pmem0
sudo mount -o dax /dev/pmem0 /mnt/pmem0
```
These mount points can be used directly by userspace applications or for a QEMU virtual machine as explained earlier.