Tprrt's Blog

Aug 06, 2024

Secure Boot with AHAB on i.MX93: A Complete Guide

The security of embedded devices has never been more critical. In a world where attacks targeting IoT systems are becoming increasingly sophisticated, ensuring the integrity of the boot process is a must. This is where Secure Boot comes in—an essential technology that guarantees only authorized code can execute on a device from the moment it starts. In this article, we will explore the implementation of Secure Boot using AHAB, the solution provided by NXP to secure the i.MX93 from its initial boot stages.

Why is Secure Boot crucial for your device?

A secure boot ensures that no malicious code interferes with the critical boot process, protecting your device from attacks targeting the bootloader and early boot stages. Furthermore, AHAB, integrated into i.MX93 processors, enables advanced authentication right from the initial boot stages, ensuring that only validated components can be loaded, thereby strengthening security from the get-go.

Secure boot is a critical security feature that ensures only authenticated and authorized code can run on a device. It operates through a chain of trust, where each component verifies the integrity of the next element in the chain.

Several mechanisms must be used to authenticate each element of this chain, but the mechanism for authenticating the first boot stages depends on the target SoC. The i.MX93 series uses NXP's Advanced High Assurance Boot (AHAB) to secure the first boot stages.

For subsequent stages, you can implement mechanisms such as:

Using U-Boot's "verified boot" feature to sign the kernel,
Using the default environment (cf. USE_DEFAULT_ENV_FILE) and restricting write access to the few environment variables that must remain writable, for example for OTA updates (cf. ENV_WRITEABLE_LIST),
Using DM-verity to authenticate the root filesystem,
And finally, using OverlayFS combined with DM-crypt to mount encrypted, writable subfolders.

Here, we'll focus on the first part of the secure boot process, using NXP's AHAB to authenticate the bootloader on the NXP i.MX93 in single-boot mode. We will also briefly discuss how to generate the keys to sign the bootloader and provide an introduction to AHAB.

Note: AHAB also provides a complementary encryption feature designed to protect the confidentiality and integrity of data, whereas secure boot focuses on verifying the integrity and authenticity of the boot process. This post will not cover encryption in detail.

AHAB Architecture

The AHAB authentication mechanism is based on public key cryptography using asymmetric keys.

On the i.MX93, AHAB support is provided by a security co-processor, the EdgeLock enclave (ELE), which handles the authentication of binaries signed with one or more private keys. This co-processor contains fuses that must be burned with the hash of the public keys.

AHAB Containers

Since multiple boot stages (e.g., TF-A, OP-TEE, U-Boot, etc.) and firmwares are required to boot i.MX93 platforms, these binaries are packed into containers using the imx-mkimage tool:

bl31.bin
lpddr4_dmem_1d_v202201.bin
lpddr4_dmem_2d_v202201.bin
lpddr4_imem_1d_v202201.bin
lpddr4_imem_2d_v202201.bin
mx93a1-ahab-container.img
tee.bin
u-boot.bin
u-boot-spl.bin

In i.MX93 single-boot mode, the bootloader image contains at least three containers:

mx93a1-ahab-container.img: Contains the ELE Firmware.
u-boot-atf-container.img: Contains at least the SPL.
flash.bin: Contains TF-A, OP-TEE, and U-Boot.

        *start ----> +---------------------------+ ---------
                     |   1st Container header    |   ^
                     |       and signature       |   |
                     +---------------------------+   |
                     | Padding for 1kB alignment |   |
*start + 0x400 ----> +---------------------------+   |
                     |   2nd Container header    |   |
                     |       and signature       |   |
                     +---------------------------+   |
                     |          Padding          |   |  Authenticated at
                     +---------------------------+   |  ELE ROM/FW Level
                     |           ELE FW          |   |
                     +---------------------------+   |
                     |          Padding          |   |
                     +---------------------------+   |
                     |       Cortex-M Image      |   |
                     +---------------------------+   |
                     |         SPL Image         |   v
                     +---------------------------+ ---------
                     |   3rd Container header    |   ^
                     |       and signature       |   |
                     +---------------------------+   |
                     |          Padding          |   | Authenticated
                     +---------------------------+   | at SPL Level
                     |            TF-A           |   |
                     +---------------------------+   |
                     |           OP-TEE          |   |
                     +---------------------------+   |
                     |           U-Boot          |   v
                     +---------------------------+ ---------

These containers are signed offline using NXP Code-Signing Tools (CST), which also allow the creation of an OEM private key infrastructure (PKI) and the generation of the associated public keys (SRK) table, which is burned into the fuses. The CST can also be used with the PKCS#11 standard to access cryptographic services from tokens or devices such as HSM, TPM, and smart cards.

The first container is signed with NXP keys and is authenticated by the ELE ROM, while the other containers are signed with OEM keys.

AHAB Boot Flow

In single boot mode, the Cortex-A55 ROM reads data from the selected boot device, loading all containers in the chosen boot image set one by one. All images within each container (e.g., EdgeLock secure enclave firmware, Cortex-M33 firmware, A55 firmware, OP-TEE, and U-Boot) are loaded, and the EdgeLock secure enclave (ELE) is tasked with authenticating them. The ELE firmware is authenticated by the ELE ROM, and images in the second container are verified by the ELE firmware.

If the bootloader image contains more than two containers, the third and subsequent containers are authenticated by the SPL instead of the ELE.
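The split of responsibilities described above can be summarized in a few lines of Python. This is only an illustrative sketch: the function name authenticator_for and the zero-based container index are assumptions made for this example and are not part of any NXP tool.

```python
def authenticator_for(container_index: int) -> str:
    """Return which component authenticates the container at this position.

    Index 0 is the first container (ELE firmware), index 1 the second one
    (SPL and optional Cortex-M image), index 2 and above the remaining ones.
    """
    if container_index < 0:
        raise ValueError("container index must be non-negative")
    if container_index == 0:
        return "ELE ROM"
    if container_index == 1:
        return "ELE firmware"
    return "SPL"


for index, content in enumerate(["ELE FW", "SPL", "TF-A / OP-TEE / U-Boot"]):
    print(f"container {index} ({content}): authenticated by {authenticator_for(index)}")
```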

PKI Generation

To authenticate the bootloader, we need to generate keys. These keys can be created with the CST. The private key will be used to sign the bootloader, and the hash of the public keys (the SRK table) will be burned into the i.MX93 fuses to authenticate the bootloader during boot.

Follow these steps to generate the keys:

cd cst-3.4.1/keys
echo 00000001 > serial

Write the passphrase that protects the private keys to key_pass.txt, on two identical lines (replace "fooahabcert" with your own choice). It is important to store this passphrase securely, with backups:

echo -e "fooahabcert\nfooahabcert" > key_pass.txt

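Generate a P384 ECC PKI tree with the CST. Answering "n" to the CA flag question below produces SRK user certificates (*_usr_crt.pem) without subordinate SGK keys: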

./ahab_pki_tree.sh
[...]
Do you want to use an existing CA key (y/n)?: n

Key type options (confirm targeted device supports desired key type):
Select the key type (possible values: rsa, rsa-pss, ecc)?: ecc
Enter length for elliptic curve to be used for PKI tree:
Possible values p256, p384, p521:  p384
Enter the digest algorithm to use: sha384
Enter PKI tree duration (years): 10
Do you want the SRK certificates to have the CA flag set? (y/n)?: n

Generate the Super Root Keys (SRK) table and the SRK hash on a 64-bit Linux machine:

cd ../crts/
../linux64/bin/srktool -a -d sha256 -s sha384 -t SRK_1_2_3_4_table.bin \
    -e SRK_1_2_3_4_fuse.bin -f 1 -c \
    SRK1_sha384_secp384r1_v3_usr_crt.pem,\
    SRK2_sha384_secp384r1_v3_usr_crt.pem,\
    SRK3_sha384_secp384r1_v3_usr_crt.pem,\
    SRK4_sha384_secp384r1_v3_usr_crt.pem

Do not enter spaces between the commas when specifying the SRKs in the "-c" or "--certs" option. Otherwise, the certificates specified after the first space will be excluded from the table.

Regenerate the SRK hash (SRK_1_2_3_4_fuse.bin) as the SHA-256 digest of SRK_1_2_3_4_table.bin:

openssl dgst -binary -sha256 SRK_1_2_3_4_table.bin > SRK_1_2_3_4_fuse.bin

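Optionally, dump SRK_1_2_3_4_fuse.bin as 32-bit words to check that it matches the SHA-256 digest of SRK_1_2_3_4_table.bin. These eight words are the values that will be programmed into the fuses: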

od -t x4 SRK_1_2_3_4_fuse.bin
0000000 29eec727 eaed9aa7 c7e53bc0 36835f78
0000020 6901bc47 b244753c f78d3162 27ae36b9
0000040
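The same check can be scripted. The sketch below is a minimal example, assuming the fuse file simply holds the raw SHA-256 digest of the SRK table and that the words are read as little-endian, as od -t x4 does on a little-endian host. The function names and the placeholder table contents are hypothetical.

```python
import hashlib
import struct


def digest_to_words(digest: bytes) -> list:
    """Split a 32-byte digest into eight little-endian 32-bit words."""
    if len(digest) != 32:
        raise ValueError("expected a 32-byte SHA-256 digest")
    return list(struct.unpack("<8I", digest))


def fuse_file_matches(table: bytes, fuse: bytes) -> bool:
    """True when the fuse file content is the SHA-256 digest of the table."""
    return hashlib.sha256(table).digest() == fuse


# Placeholder data only: a real run would read SRK_1_2_3_4_table.bin
# and SRK_1_2_3_4_fuse.bin with pathlib.Path(...).read_bytes().
table = b"placeholder SRK table contents"
fuse = hashlib.sha256(table).digest()

print("match:", fuse_file_matches(table, fuse))
print(" ".join(f"{word:08x}" for word in digest_to_words(fuse)))
```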

Bootloader Signature

The CST uses CSF description files to sign (and encrypt) containers generated by imx-mkimage with OEM keys. When imx-mkimage generates containers, it also specifies the block offsets to be used in the CSF description files. For example, imx-mkimage returns the following values for your bootloader:

CST: CONTAINER 0 offset: 0x0
CST: CONTAINER 0: Signature Block: offset is at 0x190
CST: CONTAINER 0 offset: 0x400
CST: CONTAINER 0: Signature Block: offset is at 0x490

Here, 0x0 and 0x400 are the container header offsets, and 0x190 and 0x490 are the corresponding signature block offsets. The first pair is used to sign u-boot-atf-container.img and the second pair to sign flash.bin.
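Copying these offsets by hand is error-prone, so they can be extracted from the build log instead. The following sketch assumes the log lines have exactly the format shown above; the regular expressions and the function name csf_offsets are illustrative and not part of imx-mkimage or the CST.

```python
import re

# Sample log, copied from the imx-mkimage output shown above.
LOG = """\
CST: CONTAINER 0 offset: 0x0
CST: CONTAINER 0: Signature Block: offset is at 0x190
CST: CONTAINER 0 offset: 0x400
CST: CONTAINER 0: Signature Block: offset is at 0x490
"""

CONTAINER_RE = re.compile(r"CST: CONTAINER \d+ offset: (0x[0-9a-fA-F]+)")
SIGNATURE_RE = re.compile(
    r"CST: CONTAINER \d+: Signature Block: offset is at (0x[0-9a-fA-F]+)"
)


def csf_offsets(log: str) -> list:
    """Return (container_offset, signature_block_offset) pairs from the log."""
    pairs = []
    container = None
    for line in log.splitlines():
        match = CONTAINER_RE.search(line)
        if match:
            container = int(match.group(1), 16)
            continue
        match = SIGNATURE_RE.search(line)
        if match and container is not None:
            pairs.append((container, int(match.group(1), 16)))
            container = None
    return pairs


for container, signature in csf_offsets(LOG):
    print(f"Offsets = {container:#x} {signature:#x}")
```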

The CSF description file used to sign a container contains three sections:

[Header]: Information about the target (AHAB) and the version to use for signing.
[Install SRK]: Information about the SRK table and the key used to sign.
[Authenticate Data]: Information about the container being signed.

The following CSF description file was used to sign the u-boot-atf-container.img in our example:

[Header]
Target = AHAB
Version = 1.0

[Install SRK]
# SRK table generated by srktool
File = "SRK_1_2_3_4_table.bin"
# Public key certificate in PEM format
Source = "SRK1_sha384_secp384r1_v3_usr_crt.pem"
# Index of the public key certificate within the SRK table (0 .. 3)
Source index = 0
# Type of SRK set (NXP or OEM)
Source set = OEM
# bitmask of the revoked SRKs
Revocations = 0x0

[Authenticate Data]
# Binary to be signed generated by mkimage
File = "u-boot-atf-container.img"
# Offsets = Container header  Signature block (printed out by mkimage)
Offsets = 0x0 0x190

The following CSF description file was used to sign flash.bin in our example:

[Header]
Target = AHAB
Version = 1.0

[Install SRK]
# SRK table generated by srktool
File = "SRK_1_2_3_4_table.bin"
# Public key certificate in PEM format
Source = "SRK1_sha384_secp384r1_v3_usr_crt.pem"
# Index of the public key certificate within the SRK table (0 .. 3)
Source index = 0
# Type of SRK set (NXP or OEM)
Source set = OEM
# bitmask of the revoked SRKs
Revocations = 0x0

[Authenticate Data]
# Binary to be signed generated by mkimage
File = "flash.bin"
# Offsets = Container header  Signature block (printed out by mkimage)
Offsets = 0x400 0x490
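Since the two CSF description files differ only in the file name and the offsets, they can be rendered from a single template. This is a minimal sketch that reuses the field names and default values of the files above; the helper render_csf is hypothetical and not part of the CST.

```python
CSF_TEMPLATE = """\
[Header]
Target = AHAB
Version = 1.0

[Install SRK]
File = "{srk_table}"
Source = "{srk_cert}"
Source index = {srk_index}
Source set = OEM
Revocations = {revocations:#x}

[Authenticate Data]
File = "{image}"
Offsets = {container_offset:#x} {signature_offset:#x}
"""


def render_csf(image, container_offset, signature_offset,
               srk_table="SRK_1_2_3_4_table.bin",
               srk_cert="SRK1_sha384_secp384r1_v3_usr_crt.pem",
               srk_index=0, revocations=0):
    """Render a CSF description for one container of the given image."""
    if not 0 <= srk_index <= 3:
        raise ValueError("the SRK table holds four keys, indexed 0 to 3")
    return CSF_TEMPLATE.format(
        image=image,
        container_offset=container_offset,
        signature_offset=signature_offset,
        srk_table=srk_table,
        srk_cert=srk_cert,
        srk_index=srk_index,
        revocations=revocations,
    )


print(render_csf("flash.bin", 0x400, 0x490))
```

Keeping the offsets as integers until the last moment makes it easy to feed them in from a parsed build log rather than from values typed by hand.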

The first step is to generate a u-boot-atf-container.img, then copy the block offsets into the CSF description file to sign it:

make SOC=iMX9 REV=A1 dtbs=imx93-11x11-evk.dtb u-boot-atf-container.img

Next, sign it with the following command and replace the unsigned version:

cst -i u-boot-atf-container.img.csf -o u-boot-atf-container.img.signed
mv u-boot-atf-container.img.signed u-boot-atf-container.img

Then generate a flash.bin containing the signed u-boot-atf-container.img:

make SOC=iMX9 REV=A1 V2X=NO dtbs=imx93-11x11-evk.dtb flash_singleboot

Finally, sign the resulting flash.bin:

cst -i flash.bin.csf -o flash.bin.signed

Burn Fuses

Once the signed flash.bin is flashed, you need to burn the hash of the public keys used to sign the bootloader (the SRK hash) into the i.MX93 fuses to finalize AHAB secure boot. This requires a U-Boot that provides the AHAB functionalities, such as checking ELE events during bootloader authentication and closing the device.

Program SRK

The following U-Boot commands program the SRK_HASH[255:0] fuses on i.MX93 with the eight words of the SRK hash. Once the device is closed, only bootloaders signed with keys matching this hash will be accepted:

fuse prog -y 16 0 0x29eec727
fuse prog -y 16 1 0xeaed9aa7
fuse prog -y 16 2 0xc7e53bc0
fuse prog -y 16 3 0x36835f78
fuse prog -y 16 4 0x6901bc47
fuse prog -y 16 5 0xb244753c
fuse prog -y 16 6 0xf78d3162
fuse prog -y 16 7 0x27ae36b9
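Because a wrong fuse value cannot be corrected, it may be safer to generate these commands from the fuse file than to type them. The sketch below assumes the bank number 16 and the word order used in the commands above; the function name fuse_commands is hypothetical, and the sample bytes are the example hash from this article.

```python
import struct


def fuse_commands(fuse: bytes, bank: int = 16) -> list:
    """Build one U-Boot 'fuse prog' command per 32-bit word of the SRK hash."""
    if len(fuse) != 32:
        raise ValueError("the SRK hash must be exactly 32 bytes long")
    words = struct.unpack("<8I", fuse)  # little-endian, as printed by od -t x4
    return [
        f"fuse prog -y {bank} {index} {word:#010x}"
        for index, word in enumerate(words)
    ]


# Bytes of the example SRK hash; a real run would read SRK_1_2_3_4_fuse.bin.
example_fuse = bytes.fromhex(
    "27c7ee29a79aedeac03be5c7785f8336"
    "47bc01693c7544b262318df7b936ae27"
)

for command in fuse_commands(example_fuse):
    print(command)
```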

Close the Device

Once the SRK fuses are programmed, you can "close" the device so that only bootloaders signed with keys matching the SRK table are allowed to boot. This operation is irreversible:

ahab_close

Before closing the device, you can verify that the fuses have been written correctly by checking that no ELE events are raised:

ahab_status
Lifecycle: 0x00000008, OEM Open

No Events Found!

Once the device is closed, the ahab_status command will show OEM closed:

ahab_status
Lifecycle: 0x00000020, OEM closed

No Events Found!

As long as OEM Open appears in the status, the device is not secured and can still execute unsigned bootloaders or those signed with invalid keys.
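In a production line, this check is usually automated from the captured console output. The following sketch assumes the output format shown above and the two lifecycle values that appear in it (0x08 for OEM Open, 0x20 for OEM closed); the function name is_secured is hypothetical.

```python
import re

LIFECYCLE_RE = re.compile(r"Lifecycle:\s*(0x[0-9a-fA-F]+)")
OEM_OPEN = 0x08
OEM_CLOSED = 0x20


def is_secured(status: str) -> bool:
    """True when the device is OEM closed and no ELE event was reported."""
    match = LIFECYCLE_RE.search(status)
    if match is None:
        raise ValueError("no Lifecycle line found in the ahab_status output")
    lifecycle = int(match.group(1), 16)
    return lifecycle == OEM_CLOSED and "No Events Found!" in status


open_output = "Lifecycle: 0x00000008, OEM Open\n\nNo Events Found!\n"
closed_output = "Lifecycle: 0x00000020, OEM closed\n\nNo Events Found!\n"

print("open device secured:", is_secured(open_output))
print("closed device secured:", is_secured(closed_output))
```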

Conclusion

By implementing AHAB on the i.MX93 platform, you can ensure that your boot process is protected from unauthorized code. The use of public key cryptography and secure containers adds an extra layer of security, making your device more resilient to attacks. This process is crucial for applications where integrity and authenticity from the very first boot stage are paramount.

posted at 19:21 · Security · security embedded imx93 secure-boot ahab nxp

Jul 29, 2022

Zephyr Device Tree Guide

Introduction

The Zephyr project, hosted by the Linux Foundation since 2016, aims to provide a safe and secure real-time operating system (RTOS), released under the Apache 2.0 open source license, for connected devices that are too small to run Linux or that act as a companion core.

It is designed for resource-constrained devices such as microcontrollers and Internet of Things (IoT) devices, to be modular and scalable. This makes it ideal for a wide range of devices, from simple sensors to complex systems. The operating system is written in C and is fully compatible with the C11 and C++17 standards.

One of the key benefits of Zephyr is its small footprint: it can be configured to run on devices with as little as 10 KB of memory.

It supports multiple 32-bit and 64-bit architectures: Cortex-A, Cortex-M, Cortex-R, RISC-V, x86-64, etc. It also supports many boards and extensions: Feather, nRF52840, ST Discovery, ST Nucleo, ESP32, etc. It can manage several kinds of connectivity: Bluetooth, Ethernet, Wi-Fi, LoRa. And it supports a number of network protocols: IPv4, IPv6, UDP, TCP, CoAP, LwM2M, MQTT, DNS, etc.

Like Linux, Zephyr uses Kconfig, and its device model is mainly based on the device tree.

Device tree

Device trees are tree data structures that describe the hardware components of a system and their relationships. They are stored in text files called device tree sources (*.dts), which developers write to describe the hardware architecture of SoCs and boards. The operating system uses them to determine how to initialize and interact with the hardware.

Each node describes a device of the system, has its own properties that describe its characteristics, and has exactly one parent (except for the root node).

Each device driver is associated with a specific device tree node, which represents a hardware component in the system. The device driver provides the necessary code and data to control the behavior of the hardware component.

test_i2c_bme280: bme280@6 {
        compatible = "bosch,bme280";
        reg = <0x6>;
};

In the Linux ecosystem, device tree sources are compiled into device tree binaries (dtb) that are parsed at boot by the bootloader stages (U-Boot, TF-A...) and by the kernel, which allows the same binaries to support several hardware configurations.

In Zephyr, however, device tree sources are transformed at build time into a "devicetree_generated.h" C header file. It contains macro definitions that allow device drivers to access information about the hardware components of the system, such as the memory mapping of a device, its pin assignments, and its IRQ numbers:

#define DT_COMPAT_HAS_OKAY_bosch_bme280 1
#define DT_N_INST_bosch_bme280_NUM_OKAY 1
#define DT_FOREACH_OKAY_bosch_bme280(fn) fn(DT_N_S_soc_S_i2c_40005400_S_bme280_77)
#define DT_FOREACH_OKAY_VARGS_bosch_bme280(fn, ...) fn(DT_N_S_soc_S_i2c_40005400_S_bme280_77, __VA_ARGS__)
#define DT_FOREACH_OKAY_INST_bosch_bme280(fn) fn(0)
#define DT_FOREACH_OKAY_INST_VARGS_bosch_bme280(fn, ...) fn(0, __VA_ARGS__)
#define DT_COMPAT_bosch_bme280_BUS_i2c 1

Where:

DT_COMPAT_HAS_OKAY_bosch_bme280: indicates that there is at least one instance of BME280
DT_N_INST_bosch_bme280_NUM_OKAY: defines the number of BME280 instances that are marked okay
DT_FOREACH_OKAY_bosch_bme280: allows you to apply a function fn to each instance of the BME280
DT_FOREACH_OKAY_VARGS_bosch_bme280: also allows you to apply a function fn to each instance of the BME280, but with additional arguments
DT_FOREACH_OKAY_INST_bosch_bme280: allows you to apply a function fn to each instance of the BME280, passing the instance number as an argument
DT_FOREACH_OKAY_INST_VARGS_bosch_bme280: is similar to the previous macro, but this one allows for additional arguments
DT_COMPAT_bosch_bme280_BUS_i2c: indicates that the BME280 device is connected to an I2C bus.
DT_N_S_soc_S_i2c_40005400_S_bme280_77: refers to a specific node in the device tree, here it refers to the BME280 sensor connected to the I2C controller with the base address 0x40005400 within the SoC. The sensor's address on this I2C bus is 0x77.
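The naming scheme of the last identifier can be reproduced with a short Python sketch. It assumes that each path component is lowercased, that every character which is not valid in a C identifier becomes an underscore, and that each component is prefixed with _S_. The node path used here is inferred from the macro name above, and the function is an illustration, not Zephyr's actual generator.

```python
import re


def node_identifier(path: str) -> str:
    """Turn a devicetree node path into a DT_N_... style identifier."""
    if not path.startswith("/"):
        raise ValueError("a node path must be absolute")
    if path == "/":
        return "DT_N"
    components = path.strip("/").split("/")
    return "DT_N" + "".join(
        "_S_" + re.sub(r"[^a-z0-9]", "_", component.lower())
        for component in components
    )


print(node_identifier("/soc/i2c@40005400/bme280@77"))
```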

In addition, device tree sources can be extended or overridden, for example to connect additional devices to a board, or to disable board devices which will not be used:

/ {
        aliases {
                bme280 = &bme280;
        };
};

&spi1 {
        status = "disabled";
};

&i2c1 {
        status = "okay";
        bme280: bme280@77 {
                compatible = "bosch,bme280";
                reg = <0x77>;
        };
};

Binding

The content of device tree sources is described in binding files, written in YAML, which is human-readable and easy to parse. Binding files are also used to validate device tree sources by comparing the information in the YAML file with the information in the device tree sources.

description: BME280 integrated environmental sensor

compatible: "bosch,bme280"

include: [sensor-device.yaml, i2c-device.yaml]

Device driver

In Zephyr, a device driver can access the properties of its associated device tree node using the macros defined in the generated C header files. For example, the following code can be used to initialize a BME280 sensor using properties defined in the device tree:

#include <errno.h>
#include <device.h>
#include <devicetree.h>
#include <drivers/i2c.h>
#include <init.h>
#include <zephyr.h>

// Node identifier of the BME280 sensor, resolved from its "bme280" node label;
// it expands to the generated DT_N_S_soc_S_i2c_40005400_S_bme280_77 identifier
#define BME280_NODE DT_NODELABEL(bme280)

// Fail the build early if the BME280 node is missing or disabled
#if !DT_NODE_HAS_STATUS(BME280_NODE, okay)
#error "BME280 devicetree node is not enabled"
#endif

// Function to initialize the BME280 sensor
static int bme280_init(const struct device *dev)
{
    // SYS_INIT() callbacks are called with a NULL device, so it must not be used
    ARG_UNUSED(dev);

    // Retrieve the I2C bus controller that the BME280 node is attached to
    const struct device *i2c_dev = DEVICE_DT_GET(DT_BUS(BME280_NODE));

    if (!device_is_ready(i2c_dev)) {
        printk("I2C device not ready\n");
        return -ENODEV;
    }

    // Write some initialization code here, such as configuring registers

    printk("BME280 sensor initialized\n");
    return 0;
}

// Initialize the BME280 sensor at boot time
SYS_INIT(bme280_init, APPLICATION, CONFIG_APPLICATION_INIT_PRIORITY);

Conclusion

Those who have already written a BSP or a driver for Linux shouldn't encounter too much difficulty. On the other hand, the step is a little higher for people coming from the world of microcontrollers.

posted at 20:32 · Embedded · zephyr device-tree rtos embedded

Sep 27, 2020

Build RIOT-OS with Podman

Summary

This article is a tip that explains how to build a RIOT-OS application with Podman and the official build container. I would also like to take this opportunity to introduce Podman and RIOT-OS.

Podman

Some Linux distributions, like Fedora, have chosen to officially support only Podman instead of Docker, for several reasons:

It is a daemonless container engine.
It is rootless.
It follows Open Container Initiative (OCI) standards.
It is safer than the Docker engine.
It introduces the notion of Pods: a group of container(s) that share storage or network resources.

Moreover, Podman can use images that were built by the Docker engine and stored in a Docker registry.

Most of the time, Podman commands are identical to Docker's, so a simple alias is often enough to stand in for it: alias docker=podman.

However, because Podman runs rootless and is stricter than Docker, it is sometimes necessary to specify additional security parameters.

RIOT-OS

RIOT-OS is an RTOS for memory-constrained devices, like Contiki, that provides real-time and multithreading capabilities and runs on 8-bit to 32-bit processors.

It was designed for IoT devices, and therefore for low power consumption, and it provides three very complete network stacks that include protocols such as:

IPv6
6LoWPAN
CoAP
etc.

The RIOT-OS project also provides some useful tools including a build container (riotdocker).

The RIOT-OS build environment provides a Makefile that builds an application inside this container simply by setting the variable BUILD_IN_DOCKER to 1. The prebuilt image is then downloaded and instantiated to execute the make command.

By default, this feature is configured for the Docker engine, but some variables of the build environment can be overridden to use a custom prebuilt image, another engine, or custom engine parameters.

Here, we will use these environment variables to instantiate the container with Podman (instead of Docker) and with the required parameters.

Tip of the day

In the following example, we build the hello-world application for an STM32 Discovery board. To do so, we select the engine by setting the variable DOCKER to podman. The variable DOCKER_USER is left empty because DOCKER_RUN_FLAGS passes --userns=keep-id, which maps the uid:gid of the current rootless host user to the same values inside the container.

export BUILD_IN_DOCKER=1
export DOCKER="podman"
export DOCKER_USER=""
export DOCKER_RUN_FLAGS="--rm -i -t --security-opt seccomp=unconfined --security-opt label=disable --userns=keep-id"
export DOCKER_MAKE_ARGS="-j$(nproc)"

make BOARD=stm32l476g-disco
Launching build container using image "riot/riotbuild:latest".
podman run --rm -i -t --security-opt seccomp=unconfined --security-opt label=disable --userns=keep-id -v '/usr/share/zoneinfo/Europe/Paris:/etc/localtime:ro' -v '/home/tperrot/dev/tprrt/pwm-ramp-gen/RIOT:/data/riotbuild/riotbase:delegated' -e 'RIOTBASE=/data/riotbuild/riotbase' -e 'CCACHE_BASEDIR=/data/riotbuild/riotbase' -e 'BUILD_DIR=/data/riotbuild/riotbase/build' -v '/home/tperrot/dev/tprrt/pwm-ramp-gen:/data/riotbuild/riotproject:delegated' -e 'RIOTPROJECT=/data/riotbuild/riotproject' -e 'RIOTCPU=/data/riotbuild/riotbase/cpu' -e 'RIOTBOARD=/data/riotbuild/riotbase/boards' -e 'RIOTMAKE=/data/riotbuild/riotbase/makefiles'     -v '/home/tperrot/dev/tprrt/pwm-ramp-gen/.git:/home/tperrot/dev/tprrt/pwm-ramp-gen/.git:delegated' -e 'BOARD=stm32l476g-disco'  -w '/data/riotbuild/riotproject/' 'riot/riotbuild:latest' make 'BOARD=stm32l476g-disco'   -j8
Building application "hello-world" for "stm32l476g-disco" with MCU "stm32".

[INFO] cloning stm32cmsis
fatal: not a git repository: /data/riotbuild/riotbase/../.git/modules/RIOT
Cloning into '/data/riotbuild/riotbase/cpu/stm32/include/vendor/cmsis/l4'...
remote: Enumerating objects: 364, done.
remote: Counting objects: 100% (364/364), done.
remote: Compressing objects: 100% (71/71), done.
remote: Total 364 (delta 309), reused 344 (delta 289), pack-reused 0
Receiving objects: 100% (364/364), 709.56 KiB | 561.00 KiB/s, done.
Resolving deltas: 100% (309/309), done.
HEAD is now at e442c72 Release v1.6.1
[INFO] updating stm32cmsis /data/riotbuild/riotbase/cpu/stm32/include/vendor/cmsis/l4/.pkg-state.git-downloaded
echo e442c72651e8d4757f6562acc14da949644944ce   > /data/riotbuild/riotbase/cpu/stm32/include/vendor/cmsis/l4/.pkg-state.git-downloaded
[INFO] patch stm32cmsis
"make" -C /data/riotbuild/riotbase/boards/stm32l476g-disco
"make" -C /data/riotbuild/riotbase/core
"make" -C /data/riotbuild/riotbase/cpu/stm32
"make" -C /data/riotbuild/riotbase/drivers
"make" -C /data/riotbuild/riotbase/sys
"make" -C /data/riotbuild/riotbase/cpu/cortexm_common
"make" -C /data/riotbuild/riotbase/cpu/stm32/periph
"make" -C /data/riotbuild/riotbase/drivers/periph_common
"make" -C /data/riotbuild/riotbase/cpu/stm32/stmclk
"make" -C /data/riotbuild/riotbase/sys/auto_init
"make" -C /data/riotbuild/riotbase/cpu/cortexm_common/periph
"make" -C /data/riotbuild/riotbase/cpu/stm32/vectors
"make" -C /data/riotbuild/riotbase/sys/malloc_thread_safe
"make" -C /data/riotbuild/riotbase/sys/newlib_syscalls_default
"make" -C /data/riotbuild/riotbase/sys/pm_layered
"make" -C /data/riotbuild/riotbase/sys/stdio_uart
   text    data     bss     dec     hex filename
   8900     112    2300   11312    2c30 /data/riotbuild/riotproject/bin/stm32l476g-disco/hello-world.elf
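When the build is driven from a script rather than an interactive shell, the same variables can be prepared programmatically. The sketch below only mirrors the exports used above; the helper name podman_build_env is hypothetical, and the script does not invoke make itself.

```python
import os
import shlex


def podman_build_env(jobs=None, base_env=None):
    """Return an environment that makes the RIOT build use Podman."""
    env = dict(os.environ if base_env is None else base_env)
    jobs = jobs or os.cpu_count() or 1
    run_flags = [
        "--rm", "-i", "-t",
        "--security-opt", "seccomp=unconfined",
        "--security-opt", "label=disable",
        "--userns=keep-id",  # map the rootless host user into the container
    ]
    env.update({
        "BUILD_IN_DOCKER": "1",
        "DOCKER": "podman",
        "DOCKER_USER": "",  # left empty because --userns=keep-id is used
        "DOCKER_RUN_FLAGS": " ".join(run_flags),
        "DOCKER_MAKE_ARGS": f"-j{jobs}",
    })
    return env


# Print the variables as shell exports, starting from an empty environment.
for name, value in sorted(podman_build_env(base_env={}).items()):
    print(f"export {name}={shlex.quote(value)}")
```

The returned dictionary can then be passed as the env argument of subprocess.run(["make", "BOARD=stm32l476g-disco"], ...) from the application directory.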

posted at 13:01 · riot-os · container podman riot-os

Sep 08, 2020

How the Busybox's chrt applet works

Introduction

In this article, I will dissect the chrt applet from Busybox release 1.32.0: how it works and what it does.

chrt is a Linux utility that displays or modifies the scheduling attributes of a process.

chrt -m
SCHED_OTHER min/max priority    : 0/0
SCHED_FIFO min/max priority     : 1/99
SCHED_RR min/max priority       : 1/99
SCHED_BATCH min/max priority    : 0/0
SCHED_IDLE min/max priority     : 0/0
SCHED_DEADLINE min/max priority : 0/0

pidof firefox
6987 6851 6825 6816 6800 6771 6767 6761 6720 6611

chrt -p 6987
pid 6987's current scheduling policy: SCHED_OTHER
pid 6987's current scheduling priority: 0

sudo chrt -f -p 1 6987
chrt -p 6987
pid 6987's current scheduling policy: SCHED_FIFO
pid 6987's current scheduling priority: 1

Busybox provides an applet that, once compiled, is about ten times smaller than the standalone binary, at the cost of some limitations.

The dissection

The chrt applet is implemented in the file util-linux/chrt.c, which contains several functions called from the applet's main function.

The main function of this applet is divided into three parts:

the first parses the command options,
the second prints the scheduler's information,
the last applies the scheduler changes when attributes are being set.

At the start of main, the option string is parsed into a bitfield that is easier to use:

opt = getopt32(argv, "^"
                "+" "mprfobi"
                "\0"
                /* only one policy accepted: */
                "r--fobi:f--robi:o--rfbi:b--rfoi:i--rfob"
);
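The idea of turning option letters into a bitfield while rejecting conflicting policies can be sketched in Python. This is not Busybox's getopt32: it only assumes that bits are assigned in the order of the letters in "mprfobi" and that at most one of the policy options r, f, o, b and i may be set. The names below are hypothetical.

```python
# One bit per option letter, in the order they appear in the option string.
OPTION_BITS = {letter: 1 << bit for bit, letter in enumerate("mprfobi")}
POLICY_MASK = sum(OPTION_BITS[letter] for letter in "rfobi")


def parse_flags(flags: str) -> int:
    """Convert option letters such as "pf" into a bitfield."""
    opt = 0
    for letter in flags:
        if letter not in OPTION_BITS:
            raise ValueError(f"unknown option -{letter}")
        opt |= OPTION_BITS[letter]
    chosen = opt & POLICY_MASK
    if chosen & (chosen - 1):  # more than one policy bit is set
        raise ValueError("only one policy option is accepted")
    return opt


opt = parse_flags("pf")
print(f"bitfield: {opt:#09b}")
print("-p set:", bool(opt & OPTION_BITS["p"]))
print("-f set:", bool(opt & OPTION_BITS["f"]))
```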

If the option -m is set, the minimum and maximum valid priorities of each scheduling policy are shown and the command exits:

if (opt & OPT_m) { /* print min/max and exit */
        show_min_max(SCHED_OTHER);
        show_min_max(SCHED_FIFO);
        show_min_max(SCHED_RR);
        show_min_max(SCHED_BATCH);
        show_min_max(SCHED_IDLE);
        fflush_stdout_and_exit(EXIT_SUCCESS);
}

The function show_min_max uses the POSIX functions sched_get_priority_max and sched_get_priority_min from the standard C library, which issue system calls to the kernel to obtain the minimum and maximum priorities accepted by each policy:

max = sched_get_priority_max(pol);
min = sched_get_priority_min(pol);
if ((max|min) < 0)
    fmt = "SCHED_%s not supported\n";
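The same POSIX calls are exposed by Python's os module on Linux, which makes it easy to experiment with them. The sketch below mirrors the logic of show_min_max under that assumption; the helper name min_max is hypothetical, and policies missing from the running platform are reported as unsupported.

```python
import os

POLICIES = ["SCHED_OTHER", "SCHED_FIFO", "SCHED_RR", "SCHED_BATCH", "SCHED_IDLE"]


def min_max(policy_name):
    """Return (min, max) priorities for a policy, or None if unsupported."""
    policy = getattr(os, policy_name, None)
    if policy is None:
        return None
    try:
        return (os.sched_get_priority_min(policy),
                os.sched_get_priority_max(policy))
    except (AttributeError, OSError):
        return None


for name in POLICIES:
    limits = min_max(name)
    if limits is None:
        print(f"{name} not supported")
    else:
        print(f"{name} min/max priority\t: {limits[0]}/{limits[1]}")
```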

Otherwise, the options and arguments required to show or apply the real-time attributes of a process are parsed:

//if (opt & OPT_r)
//  policy = SCHED_RR; - default, already set
if (opt & OPT_f)
    policy = SCHED_FIFO;
if (opt & OPT_o)
    policy = SCHED_OTHER;
if (opt & OPT_b)
    policy = SCHED_BATCH;
if (opt & OPT_i)
    policy = SCHED_IDLE;

argv += optind;
if (!argv[0])
    bb_show_usage();
if (opt & OPT_p) {
    pid_str = *argv++;
    if (*argv) { /* "-p PRIO PID [...]" */
            priority = pid_str;
            pid_str = *argv;
    }
    /* else "-p PID", and *argv == NULL */
    pid = xatoul_range(pid_str, 1, ((unsigned)(pid_t)ULONG_MAX) >> 1);
} else {
    priority = *argv++;
    if (!*argv)
            bb_show_usage();
}

Then the applet uses the POSIX function sched_getscheduler, provided by the standard C library, to obtain the scheduling policy of the process specified by the pid:

print_rt_info:
    pol = sched_getscheduler(pid);
    if (pol < 0)
            bb_perror_msg_and_die("can't %cet pid %u's policy", 'g', (int)pid);

Finally, when the chrt applet is used to modify scheduling attributes, the POSIX function sched_setscheduler is called and the new scheduling attributes are shown:

if (sched_setscheduler(pid, policy, &sp) < 0)
    bb_perror_msg_and_die("can't %cet pid %u's policy", 's', (int)pid);

if (!argv[0]) /* "-p PRIO PID [...]" */
    goto print_rt_info;
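The get and set paths can also be tried from Python, whose os module wraps the same system calls on Linux. This is a small sketch with hypothetical helper names; changing the policy of a process to SCHED_FIFO normally requires elevated privileges, so set_fifo is defined but not called here.

```python
import os

POLICY_NAMES = {
    getattr(os, name): name
    for name in ("SCHED_OTHER", "SCHED_FIFO", "SCHED_RR",
                 "SCHED_BATCH", "SCHED_IDLE")
    if hasattr(os, name)
}


def describe(pid=0):
    """Return (policy name, priority) for a process; 0 is the caller."""
    policy = os.sched_getscheduler(pid)
    # Ignore the reset-on-fork flag that may be OR-ed into the policy value.
    policy &= ~getattr(os, "SCHED_RESET_ON_FORK", 0)
    priority = os.sched_getparam(pid).sched_priority
    return POLICY_NAMES.get(policy, f"unknown ({policy})"), priority


def set_fifo(pid, priority):
    """Switch a process to SCHED_FIFO; may raise PermissionError."""
    os.sched_setscheduler(pid, os.SCHED_FIFO, os.sched_param(priority))


name, priority = describe()
print(f"current scheduling policy: {name}")
print(f"current scheduling priority: {priority}")
```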

The functions sched_setscheduler and sched_getscheduler issue system calls to the scheduler subsystem of the Linux kernel. This subsystem also exposes this information through /proc:

cat /proc/6987/sched
WebExtensions (6987, #threads: 23)
-------------------------------------------------------------------
se.exec_start                                :       4421312.640001
se.vruntime                                  :        344438.942254
se.sum_exec_runtime                          :         38238.466094
se.nr_migrations                             :                 6811
nr_switches                                  :                49452
nr_voluntary_switches                        :                21749
nr_involuntary_switches                      :                27703
se.load.weight                               :              1048576
se.runnable_weight                           :              1048576
se.avg.load_sum                              :                 3415
se.avg.runnable_load_sum                     :                 3415
se.avg.util_sum                              :              3497621
se.avg.load_avg                              :                   74
se.avg.runnable_load_avg                     :                   74
se.avg.util_avg                              :                   74
se.avg.last_update_time                      :        4421312640000
se.avg.util_est.ewma                         :                   75
se.avg.util_est.enqueued                     :                   75
policy                                       :                    0
prio                                         :                  120
clock-delta                                  :                   89
mm->numa_scan_seq                            :                    0
numa_pages_migrated                          :                    0
numa_preferred_nid                           :                   -1
total_numa_faults                            :                    0
current_node=0, numa_group_id=0
numa_faults node=0 task_private=0 task_shared=0 group_private=0 group_shared=0

Limitations

Below is a short list of the limitations that I observed while analyzing this applet.

Resetting scheduling policy

The chrt applet doesn't offer the option (-R) that specifies whether the scheduling policy should be reset when a process forks to create children. This feature, available since Linux 2.6.32, can only be enabled or disabled when Busybox is built, and it then applies to every scheduling attribute modification done with this applet.

Deadline support

The chrt applet doesn't provide the required scheduling options (-d, -T, -P and -D) to set the deadline scheduling attributes of a process.

posted at 19:20 · busybox · busybox chrt dissection beginner

Jun 27, 2020

Build an embedded Linux in less than 15 minutes

Introduction

For some years, I haven't built an embedded Linux without using a framework such as OpenEmbedded from the Yocto Project. So here, I wanted to write a guide to help you quickly build, from "scratch", a very minimal embedded Linux able to boot a target. The following examples have been written to boot a virtual Qemu target, but they can be adapted to boot a real target. Moreover, the build environment will be bootstrapped with a prebuilt cross-toolchain; I have chosen one provided by Bootlin that uses glibc.

Setup the environment

First, install the packages needed to install and use the cross-toolchain, to compile the host tools and to provide Qemu:

The Ncurses libraries are only required to execute the command make menuconfig.
The certificates and wget will be used to download the prebuilt toolchain.
In the same way, git will be used to check out the sources of Busybox and Linux.
The Qemu packages will be used to emulate the target platform and to execute static binaries cross-compiled for aarch64 on the x86-64 host.

apt update
apt install -y --no-install-recommends \
    bc \
    bison \
    build-essential \
    ca-certificates \
    cpio \
    file \
    flex \
    git \
    ipxe-qemu \
    libncurses5-dev \
    libncursesw5-dev \
    libssl-dev \
    qemu \
    qemu-system-aarch64 \
    qemu-user-static \
    rsync \
    wget

Now, it is time to download and install the prebuilt toolchain:

mkdir ~/src
cd ~/src
wget https://toolchains.bootlin.com/downloads/releases/toolchains/aarch64/tarballs/aarch64--glibc--stable-2020.08-1.tar.bz2
tar xvjf aarch64--glibc--stable-2020.08-1.tar.bz2

Once the toolchain has been extracted, you have to set the environment variables required to cross-compile binaries:

PATH: It shall be extended so that the cross-tools from the cross-toolchain are available from the environment
CROSS_COMPILE: The prefix used by the cross-tools
ARCH: The architecture of the target platform

ls ~/src/aarch64--glibc--stable-2020.08-1/bin/*gcc
~/src/aarch64--glibc--stable-2020.08-1/bin/aarch64-linux-gcc

export PATH=~/src/aarch64--glibc--stable-2020.08-1/bin:$PATH
export CROSS_COMPILE=aarch64-linux-

Now, it is possible to call the cross-tools from the shell:

aarch64-linux-gcc -v
Using built-in specs.
COLLECT_GCC=~/src/aarch64--glibc--stable-2020.08-1/bin/aarch64-linux-gcc.br_real
COLLECT_LTO_WRAPPER=~/src/aarch64--glibc--stable-2020.08-1/bin/../libexec/gcc/aarch64-buildroot-linux-gnu/9.3.0/lto-wrapper
Target: aarch64-buildroot-linux-gnu
<...>
Thread model: posix
gcc version 9.3.0 (Buildroot 2020.08-14-ge5a2a90)

The variable ARCH will be set afterwards, because its value depends on the project that will be built.
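As a quick sanity check of these variables, the following sketch tests whether a tool carrying the current CROSS_COMPILE prefix can be found through PATH. The helper name check_cross_tool is hypothetical; it is not provided by the toolchain.

```shell
#!/bin/sh
# Hypothetical helper (not part of the toolchain): check that the tool
# "${CROSS_COMPILE}<name>" can be found through PATH.
check_cross_tool() {
    tool="${CROSS_COMPILE:-}$1"
    if command -v "$tool" >/dev/null 2>&1; then
        echo "found: $tool"
    else
        echo "missing: $tool (check PATH and CROSS_COMPILE)" >&2
        return 1
    fi
}
```

With PATH and CROSS_COMPILE exported as above, calling check_cross_tool gcc should report aarch64-linux-gcc; a failure usually means that one of the two variables is not set in the current shell.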

Build the Linux kernel

So, the environment is ready to pull the sources of a stable branch of the Linux kernel (here linux-5.4.y) and to build them:

git clone git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git
cd linux
git checkout -b local/linux-5.4.y origin/linux-5.4.y
# git show HEAD

export ARCH=arm64

make defconfig
  HOSTCC  scripts/basic/fixdep
  HOSTCC  scripts/kconfig/conf.o
  HOSTCC  scripts/kconfig/confdata.o
  HOSTCC  scripts/kconfig/expr.o
  LEX     scripts/kconfig/lexer.lex.c
  YACC    scripts/kconfig/parser.tab.[ch]
  HOSTCC  scripts/kconfig/lexer.lex.o
  HOSTCC  scripts/kconfig/parser.tab.o
  HOSTCC  scripts/kconfig/preprocess.o
  HOSTCC  scripts/kconfig/symbol.o
  HOSTLD  scripts/kconfig/conf
*** Default configuration is based on 'defconfig'
#
# configuration written to .config
#

# make menuconfig

make -j$(nproc)
  <...>
  AR      drivers/net/ethernet/built-in.a
  AR      drivers/net/built-in.a
  AR      drivers/built-in.a
  GEN     .version
  CHK     include/generated/compile.h
  LD      vmlinux.o
  MODPOST vmlinux.o
  MODINFO modules.builtin.modinfo
  LD      .tmp_vmlinux.kallsyms1
  KSYM    .tmp_vmlinux.kallsyms1.o
  LD      .tmp_vmlinux.kallsyms2
  KSYM    .tmp_vmlinux.kallsyms2.o
  LD      vmlinux
  SORTEX  vmlinux
  SYSMAP  System.map
  Building modules, stage 2.
  MODPOST 531 modules
  OBJCOPY arch/arm64/boot/Image
  GZIP    arch/arm64/boot/Image.gz

The command make defconfig will apply the default configuration for the target platform (cf. ARCH=arm64), and the compilation will be performed by make -j$(nproc).

The commands git show HEAD and make menuconfig are optional:

the first is useful to verify that the latest commit corresponds to the latest tag of the branch linux-5.4.y.
the second can be used if you want to customize the kernel configuration.

NB. The Linux kernel, Busybox and some other projects use Kbuild (together with Kconfig) to manage their build options.
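Since the configuration ends up in a plain .config file, the result of make defconfig or make menuconfig can be inspected from the shell. The sketch below assumes two hypothetical helpers, kconfig_enabled and kconfig_report; they are not part of Kbuild.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the option $2 is built in (=y)
# in the Kconfig-style configuration file $1.
kconfig_enabled() {
    grep -q "^$2=y\$" "$1"
}

# Hypothetical helper: print the state of several options of the file $1.
kconfig_report() {
    config="$1"
    shift
    for option in "$@"; do
        if kconfig_enabled "$config" "$option"; then
            echo "$option: enabled"
        else
            echo "$option: not built in"
        fi
    done
}
```

For example, kconfig_report .config CONFIG_BLK_DEV_INITRD CONFIG_VIRTIO_BLK reports whether initrd support and the virtio block driver are built in. Note that an option built as a module (=m) is reported as "not built in" by this sketch.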

Populate the sysroot

The easy way to bootstrap a sysroot is to use Busybox, which has been designed to provide common UNIX tools in a single, size-optimized executable. To create a sysroot, it is then only required to add a few configuration files.

The steps to pull and build Busybox are similar to those of the Linux kernel.

git clone git://git.busybox.net/busybox
cd busybox
git checkout -b local/1_32_stable origin/1_32_stable
# git show HEAD

export ARCH=aarch64
export LDFLAGS="--static"

make defconfig
# make menuconfig
make -j$(nproc)

make install

Here, LDFLAGS is set as a quick way to force static linking of Busybox, but it is also possible to use make menuconfig to set CONFIG_STATIC=y. The advantage of a static executable is that it can be tested on the host with Qemu:

qemu-aarch64-static busybox echo "Hello!"
Hello!
qemu-aarch64-static busybox date
Sat Jun 27 15:06:41 UTC 2020

The binary qemu-aarch64-static makes it possible to execute a binary built for another architecture on the host computer; here, it executes the Busybox binary compiled for an aarch64 target on an x86-64 host.
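Before running the binary with qemu-aarch64-static, it can be useful to confirm that it really is statically linked. The sketch below relies on the description printed by the file command, assuming it contains the words "statically linked" for such a binary; the helper name is_static is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: succeed if file(1) describes $1 as statically linked.
# Assumption: the description of a static ELF binary contains
# the words "statically linked".
is_static() {
    file -b "$1" 2>/dev/null | grep -q 'statically linked'
}
```

For example, is_static busybox && echo "busybox is static" can be run from the Busybox source directory once the build is finished.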

The last command, make install, created a tree in the _install directory that can be used to populate the sysroot:

ls -l _install
total 4
drwxr-xr-x. 1 tperrot tperrot 974 Nov 30 15:22 bin
lrwxrwxrwx. 1 tperrot tperrot  11 Nov 30 15:22 linuxrc -> bin/busybox
drwxr-xr-x. 1 tperrot tperrot 986 Nov 30 15:22 sbin
drwxr-xr-x. 1 tperrot tperrot  14 Nov 30 15:22 usr

ls -l _install/bin
<...>
lrwxrwxrwx. 1 tperrot tperrot       7 Nov 30 15:22 umount -> busybox
lrwxrwxrwx. 1 tperrot tperrot       7 Nov 30 15:22 uname -> busybox
lrwxrwxrwx. 1 tperrot tperrot       7 Nov 30 15:22 usleep -> busybox
lrwxrwxrwx. 1 tperrot tperrot       7 Nov 30 15:22 vi -> busybox
lrwxrwxrwx. 1 tperrot tperrot       7 Nov 30 15:22 watch -> busybox
lrwxrwxrwx. 1 tperrot tperrot       7 Nov 30 15:22 zcat -> busybox

To finalize this minimal sysroot, it is required to create an rcS init script:

mkdir _install/proc _install/sys _install/dev _install/etc _install/etc/init.d
cat > _install/etc/init.d/rcS << EOF
#!/bin/sh
mount -t proc none /proc
mount -t sysfs none /sys
/sbin/mdev -s
[ ! -h /etc/mtab ]  && ln -s /proc/mounts /etc/mtab
[ ! -f /etc/resolv.conf ] && cat /proc/net/pnp > /etc/resolv.conf
EOF
chmod +x _install/etc/init.d/rcS
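A forgotten directory or a non-executable rcS only shows up once the target boots, so a small check of the tree can save a boot cycle. The following sketch assumes a hypothetical helper named check_sysroot that verifies the directories and the script created above.

```shell
#!/bin/sh
# Hypothetical helper: verify that the sysroot $1 contains the mount
# points and the executable rcS script created in the previous step.
check_sysroot() {
    root="$1"
    status=0
    for dir in proc sys dev etc/init.d; do
        if [ ! -d "$root/$dir" ]; then
            echo "missing directory: $dir" >&2
            status=1
        fi
    done
    if [ ! -x "$root/etc/init.d/rcS" ]; then
        echo "etc/init.d/rcS is missing or not executable" >&2
        status=1
    fi
    return "$status"
}
```

Running check_sysroot _install from the Busybox source directory should return silently once the sysroot is complete.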

Build the filesystem

The goal of this step is to package the sysroot tree into a filesystem that can be mounted by the kernel. There are two possibilities: build either a ramfs or a rootfs.

Broadly, the difference between the two is that:

the ramfs is a very simple RAM-backed filesystem that the kernel populates from an archive at boot.
the rootfs is a filesystem mounted from a non-volatile device by the kernel.

For more information about the difference between the ramfs and the rootfs, you can refer to the kernel documentation.

Build a ramfs

To build the ramfs, we will use cpio and gzip to construct the compressed archive after fixing the ownership of the files:

mkdir _rootfs
rsync -a _install/ _rootfs
chown -R root:root _rootfs
cd _rootfs
find . | cpio -o --format=newc > ../rootfs.cpio
cd ..
gzip -c rootfs.cpio > rootfs.cpio.gz
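The content of the compressed archive can be listed again with cpio -t, which is a cheap way to confirm that the init program the kernel is asked to run is really in it. The helper below, initramfs_has, is a hypothetical sketch; it accepts paths with or without a leading "./".

```shell
#!/bin/sh
# Hypothetical helper: succeed if the gzip-compressed cpio archive $1
# contains the entry $2 (a leading "./" is ignored on both sides).
initramfs_has() {
    wanted="${2#./}"
    gzip -dc "$1" | cpio -t 2>/dev/null | sed 's|^\./||' | grep -qxF -- "$wanted"
}
```

For example, initramfs_has rootfs.cpio.gz sbin/init || echo "sbin/init is missing" checks the archive built above.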

Build a rootfs

To build the rootfs, the first step is to create an empty image file and to format it as an ext3 filesystem. The image is then mounted through a loop device, so that the tree can be copied into it and the ownership updated.

dd if=/dev/zero of=rootfs.img bs=1M count=10
mke2fs -j rootfs.img
mkdir -p _rootfs
mount -o loop rootfs.img _rootfs
rsync -a _install/ _rootfs
chown -R root:root _rootfs
sync
umount _rootfs

Boot the target

Below are the Qemu commands to boot the minimal embedded Linux system that has been built.

# With the ramfs
qemu-system-aarch64 -nographic -no-reboot -machine virt -cpu cortex-a57 -smp 2 -m 256 \
    -kernel ~/src/linux/arch/arm64/boot/Image \
    -initrd ~/src/busybox/rootfs.cpio.gz \
    -append "panic=5 ro ip=dhcp root=/dev/ram rdinit=/sbin/init"

# With the rootfs
qemu-system-aarch64 -nographic -no-reboot -machine virt -cpu cortex-a57 -smp 2 -m 256 \
    -kernel ~/src/linux/arch/arm64/boot/Image \
    -append "panic=5 ro ip=dhcp root=/dev/vda" \
    -drive file=~/src/busybox/rootfs.img,format=raw,if=none,id=hd0 -device virtio-blk-device,drive=hd0

Then the target will boot to a shell, "It's alive!":

[    0.000000] Booting Linux on physical CPU 0x0000000000 [0x411fd070]
[    0.000000] Linux version 5.10.0-rc5 (tperrot@27ea4a863f61) (aarch64-linux-gcc.br_real (Buildroot 2020.08-14-ge5a2a90) 9.3.0, GNU ld (GNU Binutils) 2.33.1) #1 SMP PREEMPT Mon Nov 30 14:40:05 UTC 2020
[    0.000000] Machine model: linux,dummy-virt
<...>
[    0.858346] Sending DHCP requests ., OK
[    0.870558] IP-Config: Got DHCP answer from 10.0.2.2, my address is 10.0.2.15
[    0.870909] IP-Config: Complete:
[    0.871199]      device=eth0, hwaddr=52:54:00:12:34:56, ipaddr=10.0.2.15, mask=255.255.255.0, gw=10.0.2.2
[    0.871566]      host=10.0.2.15, domain=, nis-domain=(none)
[    0.871825]      bootserver=10.0.2.2, rootserver=10.0.2.2, rootpath=
[    0.871866]      nameserver0=10.0.2.3
[    0.872389]
[    0.875863] ALSA device list:
[    0.876151]   No soundcards found.
[    0.879353] uart-pl011 9000000.pl011: no DMA platform data
[    0.920237] Freeing unused kernel memory: 5952K
[    0.921223] Run /sbin/init as init process

Please press Enter to activate this console.

posted at 13:01 · linux · busybox embedded intermediate linux qemu

May 28, 2020

My blog opening

Welcome,

After closing my last blog seventeen years ago, I am opening a new one in order to share my knowledge and my little experiments about embedded open source. As you might have guessed, this blog will mainly focus on embedded Linux operating systems, but also on open firmware and RTOS, as well as related topics like virtualization, security, etc.

I hope you will like the articles on this blog. Enjoy the reading!

posted at 16:50 · blog