Quick builds and rebuilds of MariaDB using Docker

The MariaDB server has over 2 million lines of code. Downloading, compiling (and re-compiling) and running the test suite can potentially consume a lot of time away from actually making the code changes and being productive. Knowing a few simple shortcuts can help avoid wasting time.

While the official build instructions on mariadb.org and mariadb.com/kb are useful to read, there are ways to make the build (and rebuild) significantly faster and more efficient.

# TL;DR for Debian/Ubuntu users

Get the latest MariaDB 11.0 source code, install build dependencies, configure, build, and run the test suite to validate that the binaries work:

shell
mkdir quick-rebuilds
cd quick-rebuilds
git clone --branch 11.0 --shallow-since=3m \
          --recurse-submodules --shallow-submodules \
          https://github.com/MariaDB/server.git mariadb-server
mkdir -p ccache build data
docker run --interactive --tty --rm -v ${PWD}:/quick-rebuilds \
           -w /quick-rebuilds debian:sid bash
echo 'deb-src http://deb.debian.org/debian sid main' \
     > /etc/apt/sources.list.d/deb-src-sid.list
apt update
apt install -y --no-install-recommends \
            devscripts equivs ccache eatmydata ninja-build clang entr moreutils
mk-build-deps -r -i mariadb-server/debian/control \
              -t 'apt-get -y -o Debug::pkgProblemResolver=yes --no-install-recommends'
export CCACHE_DIR=$PWD/ccache
export CXX=${CXX:-clang++}
export CC=${CC:-clang}
export CXX_FOR_BUILD=${CXX_FOR_BUILD:-clang++}
export CC_FOR_BUILD=${CC_FOR_BUILD:-clang}
export CFLAGS='-Wno-unused-command-line-argument'
export CXXFLAGS='-Wno-unused-command-line-argument'
cmake -S mariadb-server/ -B build/ -G Ninja --fresh \
       -DCMAKE_CXX_COMPILER_LAUNCHER=ccache -DCMAKE_C_COMPILER_LAUNCHER=ccache \
       -DPLUGIN_COLUMNSTORE=NO -DPLUGIN_ROCKSDB=NO -DPLUGIN_S3=NO \
       -DPLUGIN_MROONGA=NO -DPLUGIN_CONNECT=NO -DPLUGIN_TOKUDB=NO \
       -DPLUGIN_PERFSCHEMA=NO -DWITH_WSREP=OFF
eatmydata cmake --build build/
./build/mysql-test/mysql-test-run.pl --force --parallel=auto

To rebuild after a code change, simply run:

shell
eatmydata cmake --build build/

For full details, read the whole article.

# Stay organized, keep directories clean

The first step is to create the working directory and a few subdirectories inside it:

shell
mkdir quick-rebuilds
cd quick-rebuilds
mkdir -p ccache build data

The directory ccache will be used by the tool with the same name to store build cache permanently. Build artifacts will be output in the directory build to avoid polluting the source code directory so that git in the source tree will not accidentally commit any machine-generated files. The data directory is useful for temporary test installs.

The next step is to get the source code into this working directory.

# Don’t download the whole project – use shallow git clone

The oldest git commit in the project is from July 2000. Since then, MariaDB has had nearly 200 000 commits. To build the latest version and perhaps submit a Pull Request with your improvement, you don’t need to have all those 200 000 commits available in your git clone. You can use a shallow git clone to, for example, fetch only the history of the past 3 months:

$ git clone --branch 11.0 --shallow-since=3m \
  --recurse-submodules --shallow-submodules \
  https://github.com/MariaDB/server.git mariadb-server

Cloning into 'mariadb-server'...
remote: Enumerating objects: 41075, done.
remote: Counting objects: 100% (41075/41075), done.
remote: Compressing objects: 100% (29333/29333), done.
remote: Total 41075 (delta 19706), reused 20092 (delta 10708), pack-reused 0
Receiving objects: 100% (41075/41075), 75.85 MiB | 8.48 MiB/s, done.
Resolving deltas: 100% (19706/19706), done.
Checking out files: 100% (24070/24070), done.
Submodule 'extra/wolfssl/wolfssl' (https://github.com/wolfSSL/wolfssl.git) registered for path 'extra/wolfssl/wolfssl'
Submodule 'libmariadb' (https://github.com/MariaDB/mariadb-connector-c.git) registered for path 'libmariadb'
Submodule 'storage/columnstore/columnstore' (https://github.com/mariadb-corporation/mariadb-columnstore-engine.git) registered for path 'storage/columnstore/columnstore'
Submodule 'storage/maria/libmarias3' (https://github.com/mariadb-corporation/libmarias3.git) registered for path 'storage/maria/libmarias3'
Submodule 'storage/rocksdb/rocksdb' (https://github.com/facebook/rocksdb.git) registered for path 'storage/rocksdb/rocksdb'
Submodule 'wsrep-lib' (https://github.com/codership/wsrep-lib.git) registered for path 'wsrep-lib'
Cloning into '/srv/sources/mariadb/quick-rebuilds/mariadb-server/extra/wolfssl/wolfssl'...
remote: Enumerating objects: 2851, done.
remote: Counting objects: 100% (2851/2851), done.
remote: Compressing objects: 100% (2124/2124), done.
remote: Total 2851 (delta 800), reused 1576 (delta 589), pack-reused 0
Receiving objects: 100% (2851/2851), 20.91 MiB | 10.43 MiB/s, done.
Resolving deltas: 100% (800/800), done.
...
Unpacking objects: 100% (3/3), done.
From https://github.com/codership/wsrep-API
 * branch            694d6ca47f5eec7873be99b7d6babccf633d1231 -> FETCH_HEAD
Submodule path 'wsrep-lib/wsrep-API/v26': checked out '694d6ca47f5eec7873be99b7d6babccf633d1231'

$ git -C mariadb-server/ show --oneline --summary
f2dc4d4c (HEAD -> 11.0, origin/HEAD, origin/11.0) MDEV-30673 InnoDB recovery hangs when buf_LRU_get_free_block

$ git -C mariadb-server submodule
 4fbd4fd36a21efd9d1a7e17aba390e91c78693b1 extra/wolfssl/wolfssl (4fbd4fd)
 12bd1d5511fc2ff766ff6256c71b79a95739533f libmariadb (12bd1d5)
 8b032853b7a200d9af4d468ac58bb9f4b6ac7040 storage/columnstore/columnstore (8b03285)
 3846890513df0653b8919bc45a7600f9b55cab31 storage/maria/libmarias3 (3846890)
 bba5e7bc21093d7cfa765e1280a7c4fdcd284288 storage/rocksdb/rocksdb (bba5e7b)
 275a0af8c5b92f0ee33cfe9e23f3db5f59b56e9d wsrep-lib (275a0af)

$ du -shc mariadb-server/.git/modules/{storage/*,extra/wolfssl,libmariadb,wsrep-lib} \
  mariadb-server/.git mariadb-server/
30M	  mariadb-server/.git/modules/storage/columnstore
1M	  mariadb-server/.git/modules/storage/maria
20M	  mariadb-server/.git/modules/storage/rocksdb
40M	  mariadb-server/.git/modules/extra/wolfssl
2M	  mariadb-server/.git/modules/libmariadb
1M	  mariadb-server/.git/modules/wsrep-lib
80M	  mariadb-server/.git
548M	mariadb-server/
=720M	total

With a 3-month history, the main git data for MariaDB is about 50 MB, and the submodules as shallow clones add 30 MB more. If not using shallow cloning, the whole MariaDB repository and submodules would amount to over 1 GB of data, so using shallow clones cuts the amount of data to be downloaded by over 80%.

The checked out data is almost 550 MB, but that is unpacked from the git data, so the actual network transfer was at most 80 MB of git data.
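If the shallow history later turns out to be too short (for example when running git blame or bisecting), a shallow clone can be converted into a full one. The sketch below demonstrates the idea on a tiny throwaway repository created in a temporary directory, so it runs without any network access. The repository, paths and commit messages are made up for illustration; with the real clone you would run the fetch command inside mariadb-server/.

```shell
set -eu

# Throwaway demo repository (hypothetical, only for illustration)
demo=$(mktemp -d)
git init --quiet "$demo/origin"
for i in 1 2 3; do
  git -C "$demo/origin" -c user.name=demo -c user.email=demo@example.com \
      commit --quiet --allow-empty -m "commit $i"
done

# A file:// URL is used because git ignores --depth for plain local paths
git clone --quiet --depth=1 "file://$demo/origin" "$demo/shallow"

# Prints "true" for a shallow clone
git -C "$demo/shallow" rev-parse --is-shallow-repository

shallow_count=$(git -C "$demo/shallow" rev-list --count HEAD)

# Download the rest of the history and turn the clone into a full one
git -C "$demo/shallow" fetch --quiet --unshallow

full_count=$(git -C "$demo/shallow" rev-list --count HEAD)
echo "commits before: $shallow_count, after: $full_count"
```

A middle ground is git fetch --deepen=N, which extends the history by N commits instead of fetching everything.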

# Build inside a throwaway container

In addition to the source code, one also needs a long list of build dependencies installed. Instead of polluting your laptop/workstation with tens of new libraries, install all the dependencies inside a container that has a working directory mounted inside it. This way your system will stay clean, but files written in the working directory will be accessible both inside and outside the container, and persist after the container is gone.

Next, start the container:

shell
docker run --interactive --tty --rm \
  -v ${PWD}:/quick-rebuilds -w /quick-rebuilds debian:sid bash

This example uses Docker, but the principle is the same with any Linux container tool, such as Podman.
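As a rough sketch of that portability, the snippet below picks whichever of the two tools is installed and would start the same container with it. The function names find_container_runtime and start_build_container are hypothetical helpers invented for this example, and it assumes Podman accepts the same options as the Docker command above.

```shell
# Hypothetical helper: prints the first container runtime found on PATH
find_container_runtime() {
  for candidate in docker podman; do
    if command -v "$candidate" >/dev/null 2>&1; then
      echo "$candidate"
      return 0
    fi
  done
  return 1
}

# Hypothetical helper: starts the same throwaway Debian container as above
start_build_container() {
  runtime=$(find_container_runtime) || {
    echo "no container runtime found" >&2
    return 1
  }
  "$runtime" run --interactive --tty --rm \
    -v "${PWD}:/quick-rebuilds" -w /quick-rebuilds debian:sid bash
}

# Only report which runtime would be used; nothing is started here
runtime_name=$(find_container_runtime || echo none)
echo "container runtime: $runtime_name"
```

Calling start_build_container from the quick-rebuilds directory would then drop you into the container shell.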

Inside the Debian container, use apt to automatically install all dependencies (about 160 MB download, over 660 MB when unpacked to disk) as defined in MariaDB sources file debian/control:

shell
echo 'deb-src http://deb.debian.org/debian sid main' \
  > /etc/apt/sources.list.d/deb-src-sid.list
apt update
apt install -y --no-install-recommends \
  devscripts equivs ccache eatmydata ninja-build clang entr moreutils
mk-build-deps -r -i mariadb-server/debian/control \
  -t 'apt-get -y -o Debug::pkgProblemResolver=yes --no-install-recommends'

The single biggest boost to the (re-)compilation speed is gained with Ccache:

shell
export CCACHE_DIR=$PWD/ccache
ccache --show-stats --verbose

We also want to prime the environment to use Clang:

shell
export CXX=${CXX:-clang++}
export CC=${CC:-clang}
export CXX_FOR_BUILD=${CXX_FOR_BUILD:-clang++}
export CC_FOR_BUILD=${CC_FOR_BUILD:-clang}
export CFLAGS='-Wno-unused-command-line-argument'
export CXXFLAGS='-Wno-unused-command-line-argument'
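The ${CC:-clang} form used above means "keep the current value of CC if it is set and non-empty, otherwise fall back to clang". This makes it easy to switch compilers for a comparison without editing the commands. A minimal sketch of the behaviour, using a hypothetical helper pick_compiler that evaluates the expansion in a subshell so your real environment is left untouched:

```shell
# Hypothetical helper: prints the value that 'export CC=${CC:-clang}'
# would assign, given an initial value of CC (possibly empty)
pick_compiler() {
  ( CC=${1-}; echo "${CC:-clang}" )
}

default_choice=$(pick_compiler "")    # CC empty: falls back to clang
override_choice=$(pick_compiler gcc)  # CC preset: the preset value wins

echo "default: $default_choice, override: $override_choice"
```

In practice this means that running export CC=gcc CXX=g++ before the export lines above would make the build use GCC instead.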

The first step in actual compilation is to run CMake, instructing it to look at the source in directory mariadb-server/, output build artifacts in directory build/ and use Ninja as the build system. The same command also forces a fresh configuration (discarding any previous CMakeCache.txt), launches the compiler through ccache instead of calling it directly, and skips a number of rarely used large plugins to save a lot of compilation time.

shell
cmake -S mariadb-server/ -B build/ -G Ninja --fresh \
  -DCMAKE_CXX_COMPILER_LAUNCHER=ccache -DCMAKE_C_COMPILER_LAUNCHER=ccache \
  -DPLUGIN_COLUMNSTORE=NO -DPLUGIN_ROCKSDB=NO -DPLUGIN_S3=NO \
  -DPLUGIN_MROONGA=NO -DPLUGIN_CONNECT=NO -DPLUGIN_TOKUDB=NO \
  -DPLUGIN_PERFSCHEMA=NO -DWITH_WSREP=OFF

If you are interested in knowing all possible build flags available, simply query them from CMake with:

shell
cmake build/ -LH

Note that after the configure stage has run, there are no traditional Makefiles in ‘build/’, only a build.ninja file, since we are using Ninja. Thus, running make will not build anything. With Ninja, the equivalent command is ninja -C build. However, we don’t need to call Ninja directly either, but can let CMake orchestrate everything with:

$ eatmydata cmake --build build/
[173/1462] Building C object plugin/auth_ed25519/CMakeFiles/ref10.dir/ref10/ge_add.c.o

In interactive mode, Ninja will have just one line of output at a time showing progress. The numbers inside the brackets show how many files have been compiled of the total number of files to compile, and the filename after it shows which file is currently being compiled. Ninja runs by default on all available CPU cores, so there is no need to define parallelism manually. If Ninja encounters warnings or errors, it will spit them out but continue to show the one-liner status at the bottom of the terminal. To abort Ninja, feel free to press Ctrl+C at any time.

Re-starting the compilation will continue where it left off – Ninja is very smart and fast in figuring out what files need to be compiled.
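Since Ninja uses all cores by default, a long build can make the machine sluggish for other work. If that bothers you, the job count can be capped. The sketch below assumes you want to use half of the cores; the variable names and the wrapper function build_with_limited_jobs are made up for this example, while --parallel is a standard option of cmake --build.

```shell
# Use half of the available cores, but never fewer than one
cores=$(nproc)
jobs=$(( cores / 2 ))
[ "$jobs" -ge 1 ] || jobs=1
echo "building with $jobs parallel jobs"

# Hypothetical wrapper: same build command as above, with a job limit
build_with_limited_jobs() {
  eatmydata cmake --build build/ --parallel "$jobs"
}
```

Running build_with_limited_jobs inside the container then replaces the plain build command.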

# Running the MariaDB test suite (MTR)

While the MariaDB server does have a small amount of CTest unit tests, the main test system is the mariadb-test-run script (inherited from mysql-test-run). Each test file (suffix .test) consists mainly of SQL code which is executed by mariadb-test-run (MTR) and output compared to the corresponding file with the expected output in text format (suffix .result).
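The record-and-compare principle itself is simple. The sketch below is not how MTR is implemented; it is a deliberately tiny stand-in that uses shell commands instead of SQL and made-up file names (demo.test, demo.result) to show the idea of comparing actual output against a stored expected result.

```shell
set -eu

workdir=$(mktemp -d)

# Stand-in "test file": commands whose output will be checked
printf 'echo 1\necho 2\n' > "$workdir/demo.test"

# Stand-in "result file": the expected output, recorded earlier
printf '1\n2\n' > "$workdir/demo.result"

# Run the test and capture what it actually prints
sh "$workdir/demo.test" > "$workdir/demo.actual"

# The test passes only if actual and expected output are identical
if diff -u "$workdir/demo.result" "$workdir/demo.actual"; then
  verdict=pass
else
  verdict=fail
fi
echo "demo: $verdict"
```

When the outputs differ, diff prints the mismatch, which is the same kind of information you get from a failing test in a result-file based test suite.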

To start the MTR with CMake run:

shell
cmake --build build/ -t test-force

Alternatively, one can simply invoke the script directly after the binaries have been compiled:

shell
./build/mysql-test/mysql-test-run.pl --force

This offers more flexibility, as you can easily add parameters such as --parallel=auto (as the default is to run just one test worker on one CPU) or limit the scope to just one suite or just one individual test:

shell
./build/mysql-test/mysql-test-run.pl --force --parallel=auto --skip-rpl --suite=main

Note that all commands in this example run as root, as it is necessary to start the whole container with a root user inside it to have permissions to apt install build dependencies. However, mariadb-test-run is not designed to be run as root and will end up skipping some tests when run as root. Also, when run like this, a lot of the debugging information isn’t fully shown. To make the most out of the mysql-test-run/mariadb-test-run script, read more in the post Grokking the MariaDB test runner (MTR).

# More build targets

As shown above, the target test-force is for MTR, and the plainly named target test is for the CTest unit tests. The equivalent direct Ninja command for running target test would be ninja -C build/ test. To list all targets, run cmake --build build/ --target help or ninja -C build/ -t targets all.

MariaDB 11.0 currently has over 1300 targets. There does not seem to be a very consistent pattern in how build targets are named or how they are intended to be used. One way to find CMake targets that might be more important than others is to simply grep them from the top-level CMake configuration file:

$ grep ADD_CUSTOM_TARGET mariadb-server/CMakeLists.txt
  ADD_CUSTOM_TARGET(import_executables
  ADD_CUSTOM_TARGET(INFO_SRC ALL
  ADD_CUSTOM_TARGET(INFO_BIN ALL
  ADD_CUSTOM_TARGET(minbuild)
  ADD_CUSTOM_TARGET(smoketest

One of the standard targets is install, which can be run with ninja -C build install or via CMake:

$ cmake --install build/
-- Install configuration: "RelWithDebInfo"
-- Up-to-date: /usr/local/mysql/./README.md
-- Up-to-date: /usr/local/mysql/./CREDITS
-- Up-to-date: /usr/local/mysql/./COPYING
-- Up-to-date: /usr/local/mysql/./THIRDPARTY
-- Up-to-date: /usr/local/mysql/./INSTALL-BINARY
-- Up-to-date: /usr/local/mysql/lib/plugin/dialog.so
-- Up-to-date: /usr/local/mysql/lib/plugin/client_ed25519.so
-- Up-to-date: /usr/local/mysql/lib/plugin/caching_sha2_password.so
-- Up-to-date: /usr/local/mysql/lib/plugin/sha256_password.so
...
-- Installing: /usr/local/mysql/support-files/systemd/mysql.service
-- Installing: /usr/local/mysql/support-files/systemd/mysqld.service
-- Installing: /usr/local/mysql/support-files/systemd/mariadb@.service
-- Installing: /usr/local/mysql/support-files/systemd/mariadb@.socket
-- Installing: /usr/local/mysql/support-files/systemd/mariadb-extra@.socket
-- Up-to-date: /usr/local/mysql/support-files/systemd/mysql.service
-- Up-to-date: /usr/local/mysql/support-files/systemd/mysqld.service

To better understand the full capabilities of the build tools, it is recommended to skim through the cmake man page and the ninja man page.

# Run the built binaries directly

Instead of spending time on running the install target, one can simply invoke the built binaries directly:

$ ./build/client/mariadb --version
./build/client/mariadb from 11.0.1-MariaDB, client 15.2 for Linux (x86_64) using  EditLine wrapper

$ ./build/sql/mariadbd --version
./build/sql/mariadbd  Ver 11.0.1-MariaDB for Linux on x86_64 (Source distribution)

To actually run the server, it needs a data directory and a user, which can be created with:

$ ./build/scripts/mariadb-install-db --srcdir=mariadb-server
$ adduser --disabled-password mariadb
$ chown -R mariadb:mariadb ./data
$ ./build/sql/mariadbd --datadir=./data --user=mariadb &
[Note] Starting MariaDB 11.0.1-MariaDB source revision  as process 5428
[Note] InnoDB: Compressed tables use zlib 1.2.13
[Note] InnoDB: Using transactional memory
[Note] InnoDB: Number of transaction pools: 1
[Note] InnoDB: Using crc32 + pclmulqdq instructions
[Warning] mariadbd: io_uring_queue_init() failed with errno 0
[Warning] InnoDB: liburing disabled: falling back to innodb_use_native_aio=OFF
[Note] InnoDB: Initializing buffer pool, total size = 128.000MiB, chunk size = 2.000MiB
[Note] InnoDB: Completed initialization of buffer pool
[Note] InnoDB: File system buffers for log disabled (block size=512 bytes)
[Note] InnoDB: Opened 3 undo tablespaces
[Note] InnoDB: 128 rollback segments in 3 undo tablespaces are active.
[Note] InnoDB: Setting file './ibtmp1' size to 12.000MiB. Physically writing the file full; Please wait ...
[Note] InnoDB: File './ibtmp1' size is now 12.000MiB.
[Note] InnoDB: log sequence number 47391; transaction id 14
[Note] InnoDB: Loading buffer pool(s) from /quick-rebuilds/data/ib_buffer_pool
[Note] InnoDB: Buffer pool(s) load completed at 230220 20:28:45
[Note] Plugin 'FEEDBACK' is disabled.
[Note] Server socket created on IP: '0.0.0.0'.
[Note] Server socket created on IP: '::'.
[Note] ./build/sql/mariadbd: ready for connections.
Version: '11.0.1-MariaDB'  socket: '/tmp/mysql.sock'  port: 3306  Source distribution

It is necessary to define the custom data directory path and custom user, otherwise mariadbd will fail to start:

[Warning] Can't create test file /usr/local/mysql/data/03727bdc8fe2.lower-test
./build/sql/mariadbd: Can't change dir to '/usr/local/mysql/data/' (Errcode: 2 "No such file or directory")
[ERROR] Aborting

./build/sql/mariadbd: Please consult the Knowledge Base to find out how to run mysqld as root!
[ERROR] Aborting
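Because the server is started in the background, a script that wants to connect right away needs to wait until the server is up. A minimal sketch, assuming the appearance of the socket file (/tmp/mysql.sock in the startup log above) is a good enough readiness signal and that the root user inside the container is allowed to connect through it. The helper names wait_for_path and connect_when_ready are made up for this example.

```shell
# Hypothetical helper: waits until a path exists, or gives up after
# the given number of seconds (default 30)
wait_for_path() {
  path=$1
  timeout=${2:-30}
  elapsed=0
  while [ ! -e "$path" ]; do
    if [ "$elapsed" -ge "$timeout" ]; then
      return 1
    fi
    sleep 1
    elapsed=$(( elapsed + 1 ))
  done
  return 0
}

# Hypothetical usage: query the server version with the freshly built
# client as soon as the socket shows up
connect_when_ready() {
  wait_for_path /tmp/mysql.sock 30 &&
    ./build/client/mariadb --socket=/tmp/mysql.sock -e 'SELECT VERSION()'
}
```

Note that a stale socket file left behind by a crashed server would fool this check, so treat it as a convenience and not as a health check.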

To gracefully stop the server, send it the SIGTERM signal:

$ pkill -ef mariadbd
[Note] ./build/sql/mariadbd (initiated by: unknown): Normal shutdown
[Note] InnoDB: FTS optimize thread exiting.
[Note] InnoDB: Starting shutdown...
[Note] InnoDB: Dumping buffer pool(s) to /quick-rebuilds/data/ib_buffer_pool
[Note] InnoDB: Buffer pool(s) dump completed at 230220 20:29:05
[Note] InnoDB: Removed temporary tablespace data file: "./ibtmp1"
[Note] InnoDB: Shutdown completed; log sequence number 47391; transaction id 15
[Note] ./build/sql/mariadbd: Shutdown complete
mariadbd killed (pid 5428)

# Quick rebuilds

With this setup, you can invoke eatmydata cmake --build build/ to have the source code re-compiled as quickly as possible.

The ‘screenshot’ below showcases how Ninja/CMake will only rebuild the file with changes and its dependencies. In the case of a simple MariaDB client version string change, only 5 files needed to be re-built, and it took less than a second:

$ sed 's/*VER= "15.1"/*VER= "15.2"/' -i mariadb-server/client/mysql.cc
$ time eatmydata cmake --build build/
[5/5] Linking CXX executable client/mariadb
real	0m0.992s
user	0m0.374s
sys	0m0.353s

A similar version string change in the server leads to having to rebuild over a thousand files:

$ sed 's/MYSQL_VERSION_PATCH=1/MYSQL_VERSION_PATCH=2/' -i mariadb-server/VERSION
$ time eatmydata cmake --build build/
[0/1] Re-running CMake...
-- Running cmake version 3.25.1
-- MariaDB 11.0.2
-- Packaging as: mariadb-11.0.2-Linux-x86_64
-- Could NOT find PkgConfig (missing: PKG_CONFIG_EXECUTABLE)
== Configuring MariaDB Connector/C
-- SYSTEM_LIBS: dl;m;dl;m;/usr/lib/x86_64-linux-gnu/libssl.so;/usr/lib/x86_64-linux-gnu/libcrypto.so;/usr/lib/x86_64-linux-gnu/libz.so
-- Configuring OQGraph
-- Configuring done
-- Generating done
-- Build files have been written to: /quick-rebuilds/build
[377/1257] Generating user.t
troff: fatal error: can't find macro file m
[378/1257] Generating user.ps
troff: fatal error: can't find macro file m
[433/1257] Building CXX object storage/archive/CMakeFiles/archive.dir/ha_archive.cc.o
In file included from /quick-rebuilds/mariadb-server/storage/archive/ha_archive.cc:29:
/quick-rebuilds/mariadb-server/storage/archive/ha_archive.h:91:15: warning: 'index_type' overrides a member function but is not marked 'override' [-Winconsistent-missing-override]
  const char *index_type(uint inx) { return "NONE"; }
              ^
/quick-rebuilds/mariadb-server/sql/handler.h:3915:23: note: overridden virtual function is here
  virtual const char *index_type(uint key_number) { DBUG_ASSERT(0); return "";}
                      ^
[...]

In file included from /quick-rebuilds/mariadb-server/storage/archive/ha_archive.cc:29:
/quick-rebuilds/mariadb-server/storage/archive/ha_archive.h:163:7: warning: 'external_lock' overrides a member function but is not marked 'override' [-Winconsistent-missing-override]
  int external_lock(THD *thd, int lock_type);
      ^
/quick-rebuilds/mariadb-server/sql/handler.h:5153:15: note: overridden virtual function is here
  virtual int external_lock(THD *thd __attribute__((unused)),
              ^
36 warnings generated.
[1257/1257] Linking CXX executable extra/mariabackup/mariadb-backup
real	2m7.786s
user	12m56.232s
sys	1m57.842s

The above example also shows how Ninja spits out warnings.

Despite the majority of the project files being re-built, it still took only two minutes, mainly thanks to ccache having a high hit-rate.

$ ccache --show-stats
Cacheable calls:   3235 / 3235 (100.0%)
  Hits:            1932 / 3235 (59.72%)
    Direct:          49 / 1932 ( 2.54%)
    Preprocessed:  1883 / 1932 (97.46%)
  Misses:          1303 / 3235 (40.28%)
Local storage:
  Cache size (GB): 0.11 / 5.00 ( 2.18%)

Without ccache, the build time in the same scenario is 6–8 minutes. There are some extra flags in ccache (such as CCACHE_SLOPPINESS) which can be used to further tune the ccache speed, but when I did some experimenting, I didn’t discover any that made a visible impact.
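The hit rate in the statistics is simply the number of hits divided by the number of cacheable calls. To measure it for one specific rebuild instead of the lifetime of the cache, the counters can be reset first. The sketch below recomputes the percentage from the numbers shown above. The function measure_rebuild is a hypothetical wrapper around the ccache options --zero-stats and --show-stats and is only defined here, not run.

```shell
# Numbers copied from the ccache output above
hits=1932
cacheable=3235

# hit rate = hits / cacheable calls, as a percentage with two decimals
hit_rate=$(LC_ALL=C awk -v h="$hits" -v c="$cacheable" \
  'BEGIN { printf "%.2f", h / c * 100 }')
echo "ccache hit rate: ${hit_rate}%"

# Hypothetical wrapper: statistics for a single rebuild only
measure_rebuild() {
  ccache --zero-stats
  eatmydata cmake --build build/
  ccache --show-stats
}
```

The computed value matches the 59.72% that ccache reports on the Hits line.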

Without eatmydata, the build takes 10–20 seconds longer, as writes to disk wait for fsync and similar system calls to complete. Skipping them is fine here: this is a throwaway environment, so data durability and crash recovery do not matter. Using GCC instead of Clang adds another 20–40 seconds to the rebuild time.

The current two minutes for the build time on my laptop with 8-core Intel i7-8650U CPU @ 1.90GHz is not exactly instant, but it is fast enough that I can sit and wait it out without feeling the need to context switch and lose my focus.

# Automatic rebuild

As showcased in the post How to code 10x faster than an average programmer, as a high-performing software developer, you don’t want to waste time on manually running a lot of commands to build and test your code, but instead you want to have a setup where you write code in your editor and have the code automatically re-compile and run when the source code file is saved.

For MariaDB, the automatic rebuild part can easily be achieved with:

shell
find mariadb-server/* | entr eatmydata cmake --build build/

To automatically rebuild and also run a binary (in this case the mariadb client), define multiple commands in quotes to the -s parameter:

shell
find mariadb-server/* | \
  entr -s 'eatmydata cmake --build build/; ./build/client/mariadb --version'

MariaDB client automatic compilation and re-run

When running the server, use the -r parameter to have Entr automatically restart it:

shell
find mariadb-server/* | \
  entr -r ./build/sql/mariadbd --datadir=./data --user=mariadb

MariaDB server automatic compilation and restart

If you are developing an MTR test by editing *.test files, there is no need to recompile anything, and you can simply have Entr re-run the test every time a file is changed:

shell
find mariadb-server/* | entr -r ./build/mysql-test/mysql-test-run.pl main.connect

MariaDB test run automatic restart
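The find commands above hand every file in the source tree to Entr. A possible refinement, sketched below, is to watch only C/C++ sources and headers and to pick up newly created files as well. The helper names list_sources and watch_and_rebuild and the choice of file patterns are assumptions made for this example. The loop relies on the -d option of Entr, which makes it exit when a new file appears in a watched directory so that the file list can be regenerated.

```shell
# Hypothetical helper: lists only C/C++ sources and headers in a tree
list_sources() {
  find "$1" -type f \( -name '*.c' -o -name '*.cc' -o -name '*.h' \)
}

# Hypothetical wrapper: rebuild on every change, and restart Entr with
# a fresh file list whenever a new file is added
watch_and_rebuild() {
  while sleep 1; do
    list_sources mariadb-server/ | entr -d eatmydata cmake --build build/
  done
}
```

Running watch_and_rebuild inside the container starts the watcher. Extend the patterns in list_sources if you also edit other file types, such as CMake files.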

# Conclusion

The examples above are specific to MariaDB and illustrate in detail how to be efficient and avoid wasting time compiling, but the principles of utilizing ccache/clang/ninja apply to any software project in C/C++, and entr comes in handy in a myriad of situations.

Hopefully this inspires you to raise the bar on what to expect of speed and efficiency in the future!
