Saturday, September 24, 2016

Akro Build - an extreme C++ build system

This blog post describes a new build system for C++, Akro. By the end of this somewhat lengthy writeup, you'll know how to build complex C++ binaries with only three lines of build specification.


The C++ build fiasco


A long time ago I was a C/C++ developer who started to look at the new-language-on-the-block, Java. Language differences aside, what really impressed me was the ease of building Java applications with ant. A straightforward but functional build file only needed to specify the top level of the source tree, and a target destination for the compiled class files.

E.g.:
 <javac srcdir="java_src/" destdir="out/" />

In contrast, C/C++ developers had to deal with crummy, easy-to-break Makefiles that eventually bit-rot. The main reason for this fiasco is the way C/C++ handles modularization: through the preprocessor, which textually includes header files within the to-be-compiled C/C++ file. Header files contain declarations of structures, functions, and classes, whereas C/C++ files contain their definitions. To be link-compatible within the same executable, different C/C++ files have to include exactly the same declarations.

The separation between declaration and definition makes it particularly difficult to construct efficient build systems. In the general sense, because one can't tell where the definition (or implementation) corresponding to a declaration resides, C/C++ build tools require the developer to manually specify all C/C++ files, and potentially the header files as well.

Note that the vast majority of modern languages have reasonable modularization systems which make it straightforward to design and write build systems. Nonetheless, no modern programming language with reasonable industry support matches C++'s speed across the board. In other words, C++ is not obsolete, even though it has many evolutionary vestiges and mis-features.

Meet Akro


Can we do any better than having to specify all the C/C++ sources and header files at least once? In the absolutely general sense, probably not. On the other hand, just because we can implement a function anywhere we like doesn't mean that we should. Good coding practices require that the declaration and the implementation reside in parallel files, with a different extension but everything else equal.

E.g., ntest, the premier Othello-playing engine, has the following files:

~/ntest/src$ ls n64/endgameSearch
endgameSearch.cpp      endgameSearch.h 


endgameSearch.h contains declarations for two classes, EndgameSearch and Empty, while the cpp file has the implementations. Akro relies on these conventions, and makes the following fundamental assumption:

  • If you include a header file, it means that you want to build and link the associated C++ file, if it exists

Normally, you give Akro a top-level file. If you're building an executable, the top-level file is generally the one with the main() function. Akro then determines all the header files included in that top-level file by calling the compiler in dependency-tracking mode (gcc -M). Next, it looks at the header files that reside within the same directory tree (essentially project header files, not system ones from /usr/include), and from those it collects all the associated C++ files. If the top-level file included endgameSearch.h, then endgameSearch.cpp is added to the list of cpp files.

The process is repeated for all collected files (so endgameSearch.cpp is also compiled in dependency-tracking mode, and all its C++ dependencies are added to the list). A file is only added once, and the process stops once the collected list is transitively closed.
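
To make the collection process described above more concrete, here is a minimal Ruby sketch of it. This is not Akro's actual code: the helper names parse_make_deps and collect_sources are made up for illustration, and the rule "a relative path is a project header" is a simplifying assumption. The first helper parses the Makefile-style rule that gcc -M prints; the second computes the transitive closure of source files.

```ruby
require 'set'

# Parse Makefile-style dependency output (the format printed by `gcc -M`)
# into a list of file paths. Paths with escaped spaces are not handled.
def parse_make_deps(text)
  joined = text.gsub(/\\\r?\n/, ' ')   # join backslash-continued lines
  deps = joined.sub(/\A[^:]*:/, '')    # drop the leading "target.o:" part
  deps.split
end

# Hypothetical helper: starting from a top-level source file, collect every
# source file whose header is (transitively) included.
#   deps_of: callable, source path -> list of files it depends on
#   exists:  callable, path -> true if that file exists in the project
def collect_sources(top, deps_of, exists, exts = %w[.cpp .cc .c])
  collected = Set.new([top])
  queue = [top]
  until queue.empty?
    src = queue.shift
    deps_of.call(src).each do |header|
      # Assumption: absolute paths (e.g. /usr/include/...) are system headers.
      next if header.start_with?('/')
      base = header.sub(/\.[^.\/]+\z/, '')   # strip the extension
      exts.each do |ext|
        candidate = base + ext
        next unless exists.call(candidate)
        # Set#add? returns nil when the file was already collected,
        # so each file is queued at most once.
        queue << candidate if collected.add?(candidate)
      end
    end
  end
  collected.to_a
end
```

In a real build, deps_of would run the compiler in dependency-tracking mode and feed its output through parse_make_deps, and exists would be File.exist?. Passing them in as callables keeps the sketch testable without a compiler.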

To make all this tracking and collection conceptually simple, Akro builds all files in the same top-level directory, and with the same compilation flags.


What does Akro need?


Akro requires a rakefile within the top-level directory of your project. Akro is implemented on top of Rake, a superb generic build system written in Ruby.

Normally C++ projects need to specify compile flags. So the first line in the rakefile is:

$COMPILE_FLAGS = "-std=c++1y -fPIC -pthread -Wall -Werror -msse4.2 -I."

If a project links in additional libraries, those need to be specified via additional link flags. For instance, the following says to link in the tcmalloc library (a great replacement for the really slow gnu malloc):

$ADDITIONAL_LINK_FLAGS = "-L/opt/gperftools/lib/ -ltcmalloc_and_profiler"

The final line declares several binaries to build:

add_binaries("bookplay.exe", "ntest.exe")

With the above 3 lines, you can run:

akro release

which will build release/bookplay.exe and release/ntest.exe.

Note 1: none of the above three lines are mandatory. The default compilation flags are -Wall, and then depending on the build mode, debug or release, -g3 or -O3 -g3. The default additional link flags are empty. If you do not specify binaries to build, you can still build by specifying them on the command line, such as:
akro release/ntest.exe

So, it is technically possible to build with an empty rakefile, though it is unlikely that the default compilation flags will work for you, and you will probably prefer not to type the binary names every time.
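
The defaults from Note 1 can be summarized in a few lines of Ruby. This is only a sketch of the behavior described above, not Akro's code: the function name default_compile_flags is hypothetical, and joining -Wall with the mode-specific flags into a single string is an assumption.

```ruby
# Hypothetical helper mirroring the documented defaults:
# -Wall always, plus -g3 for debug or -O3 -g3 for release.
def default_compile_flags(mode)
  mode_flags =
    case mode
    when 'debug'   then '-g3'
    when 'release' then '-O3 -g3'
    else raise ArgumentError, "unknown build mode: #{mode}"
    end
  "-Wall #{mode_flags}"
end
```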

Note 2: the above example is a simplified version of ntest's build file.

What Akro is and is not


Akro is a C++ build system based on Rake, but is not:
  • An automated and portable compilation flag detection system, like autoconf/automake
    • Nonetheless, build portability within Akro may be achieved by running commands such as uname -a and setting variables such as $COMPILE_FLAGS depending on the output. Ruby in fact makes it really easy to do the above.
  • pkg-config, though functionality to use pkg-config to determine build parameters may be added in the future. Akro only tracks dependencies within the project, not external ones.
  • A generic C++ build system. If you don't follow the highlighted best practices, don't use Akro.
  • Make. Akro does not use Make in any way, but relies on Rake instead (seriously, why do projects use an outdated system like Make anyway?)

Akro is a layer on top of Rake - it generates tasks and rules to build C++ projects. All Rake functionality is kept as well.
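
As an illustration of the uname-based portability mentioned in the list above, a rakefile could pick flags with a small helper like the one below. The helper name platform_flags and the particular flag choices are assumptions made for the example; adapt them to what your project actually needs.

```ruby
# Hypothetical helper: choose extra compile flags from the output of
# `uname -s -m` (for example "Linux x86_64" or "Darwin arm64").
def platform_flags(uname_output)
  flags = []
  # Illustrative choices only; your project may need different flags.
  flags << '-msse4.2' if uname_output.include?('x86_64')
  flags << '-pthread' if uname_output.start_with?('Linux')
  flags.join(' ')
end

# In a rakefile this might be used as:
#   $COMPILE_FLAGS = "-std=c++1y -Wall -I. " + platform_flags(`uname -s -m`)
```

Keeping the decision in a function that takes the uname output as a string makes it easy to check the logic for platforms you don't have at hand.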


Installing Akro


You need a Linux system, with Ruby 2.0 or later, though 2.3 is preferred. Ideally your system already has it installed.

$ ruby --version
ruby 2.3.1p112 (2016-04-26) [x86_64-linux-gnu]

If not, it should be available in your distribution's standard repositories. For instance, on Debian or Ubuntu you could run apt-get install ruby.

Akro may work on other unices such as Mac OS X, though it hasn't been tested on them yet.

If you're unlucky enough to run on an ancient Linux system (such as Red Hat Enterprise Linux 6, whose acronym would be a good pun if it had another L), you may have to install Ruby manually, into your own directory, using RVM (the Ruby Version Manager). See https://rvm.io/rvm/install

If your system administrators are not cooperating, you may want to inform them that Linux is more akin to a cheap pale ale than wine or scotch - it really does not age well.

In any case, do not proceed until ruby --version shows 2.0.0 or later.
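
If you'd rather check the version from Ruby itself, a small sketch follows. The helper name ruby_new_enough? is hypothetical; Gem::Version is the version-comparison class that ships with RubyGems.

```ruby
require 'rubygems'

# Gem::Version compares versions component by component, so "2.10.0"
# correctly counts as newer than "2.3.1" (a plain string comparison
# would get this wrong).
def ruby_new_enough?(version = RUBY_VERSION, minimum = '2.0.0')
  Gem::Version.new(version) >= Gem::Version.new(minimum)
end

puts 'This Ruby is too old for Akro' unless ruby_new_enough?
```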

The next step is generally simple (may require root privileges):

gem install akro


Using Akro


Once akro is installed, you should be able to run it:

$ akro
rake aborted!
No Rakefile found (looking for: rakefile, Rakefile, rakefile.rb, Rakefile.rb)
/var/lib/gems/2.3.0/gems/akro-0.0.1/bin/akro:42:in `<top (required)>'
/usr/local/bin/akro:23:in `load'
/usr/local/bin/akro:23:in `<main>'
(See full trace by running task with --trace)

The error is to be expected, since we don't have a rakefile.

Now, go to your project's top-level directory. Create a rakefile which sets the proper $COMPILE_FLAGS. Then, run:

akro debug/path/to/main.exe

The above assumes that your top-level file is path/to/main.{c,C,cpp,c++}. The main file may also reside directly in the top-level directory, in which case the command is akro debug/main.exe
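
The mapping from a target name to its possible top-level source files can be sketched as follows. This is an illustration of the naming convention only, not Akro's code; source_candidates and SOURCE_EXTENSIONS are hypothetical names.

```ruby
# Extensions taken from the convention above: main.{c,C,cpp,c++}
SOURCE_EXTENSIONS = %w[.c .C .cpp .c++].freeze

# Hypothetical helper: split "debug/path/to/main.exe" into the build mode
# and the list of source files that could produce that binary.
def source_candidates(target)
  mode, rest = target.split('/', 2)
  unless %w[debug release].include?(mode) && rest
    raise ArgumentError, "expected debug/... or release/..., got #{target}"
  end
  stem = rest.sub(/\.exe\z/, '')   # path/to/main.exe -> path/to/main
  SOURCE_EXTENSIONS.map { |ext| stem + ext }
end
```

For debug/path/to/main.exe the candidates are path/to/main.c, path/to/main.C, path/to/main.cpp and path/to/main.c++, all relative to the directory that holds the rakefile.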

Note that all the compilations happen in that top-level directory where the rakefile is created. It is your responsibility to set the correct -I include paths, as part of $COMPILE_FLAGS

You can also change the $COMPILER, which is by default g++.

Once you manage to compile your project, you may still run into link issues. If you need to specify third-party libraries to link in, use $ADDITIONAL_LINK_FLAGS.

Finally, add your binary to the default build list via add_binaries, so that you can just run akro release or akro debug

Hacking Akro


Akro's code is hosted on Github: https://github.com/vladpetric/akro

It is released under the MIT license, which I consider the most permissive Free Software license.

It currently consists of three files:

bin/akro
lib/akro.rb
lib/akrobuild.rake

lib/akro.rb contains mostly variables and definitions which users may override in their rakefiles. It's a good idea to take a look at it.

lib/akrobuild.rake contains the code which generates Rake rules and tasks.

Final Notes


Akro's current version is 0.0.1. It's been tested extensively, but it has a really small number of users. 
Your constructive feedback is greatly appreciated :)


Acknowledgements


I'd like to thank Maxim Trokhimtchouk for abuild, whose concepts served as an inspiration for Akro.
