Compilation Databases
Before it can analyze a code base, CBI needs to know how each source file is compiled. Just like a compiler, CBI requires a full list of include paths, macro definitions and other options in order to identify which code is used by each platform. Rather than require all of this information to be specified manually, CBI reads it from a compilation database.
Generating a Compilation Database
Since our sample code base is already set up with a CMakeLists.txt file, we
can ask CMake to generate the compilation database for us with the
CMAKE_EXPORT_COMPILE_COMMANDS option:
cmake_minimum_required(VERSION 3.5)
project(tutorial)
set(CMAKE_EXPORT_COMPILE_COMMANDS ON)
set(SOURCES main.cpp third-party/library.cpp)
option(GPU_OFFLOAD "Enable GPU offload." OFF)
if (GPU_OFFLOAD)
    add_definitions("-D GPU_OFFLOAD=1")
    list(APPEND SOURCES gpu/foo.cpp)
else()
    list(APPEND SOURCES cpu/foo.cpp)
endif()
add_executable(tutorial ${SOURCES})
Important
For projects that don’t use CMake, we can use Bear to intercept the commands generated by other build systems (such as GNU makefiles). Other build systems and tools that produce compilation databases should also be compatible.
CPU Compilation Commands
Let’s start by running CMake without the GPU_OFFLOAD option enabled, to
obtain a compilation database for the CPU:
$ mkdir build-cpu
$ cd build-cpu
$ cmake -G Ninja ../
$ ls
CMakeCache.txt CMakeFiles build.ninja cmake_install.cmake compile_commands.json
Tip
Using the “Ninja” generator is not required, but is often faster and can
improve the quality of CBI’s results. Other generators (such as “Unix
Makefiles”) may use response (.rsp) files to pass command-line
options, and any options passed this way will not be respected by CBI.
You may need to install Ninja on your system (e.g., with pip install
ninja or similar).
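To check whether an existing compilation database is affected by this, one option is to look for arguments that begin with `@`, which is how GCC, Clang and MSVC accept response files. The following is a minimal sketch, not part of CBI; the function names and the `build-cpu/compile_commands.json` path are assumptions made for illustration.

```python
import json
import shlex
from pathlib import Path


def uses_response_file(entry):
    """Return True if any argument of this entry looks like @file."""
    # An entry stores its command line either as an "arguments" list
    # or as a single "command" string.
    args = entry.get("arguments") or shlex.split(entry["command"])
    # Skip args[0], which is the compiler executable itself.
    return any(arg.startswith("@") for arg in args[1:])


def find_response_file_entries(path):
    """Return the source files whose commands rely on response files."""
    with open(path, encoding="utf-8") as f:
        entries = json.load(f)
    return [entry["file"] for entry in entries if uses_response_file(entry)]


if __name__ == "__main__":
    db = Path("build-cpu/compile_commands.json")  # assumed location
    if db.exists():
        for source in find_response_file_entries(db):
            print("uses a response file:", source)
```

If this reports any files, regenerating the database with a different generator may be worth trying.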
This compile_commands.json file includes all the commands required to
build the code, corresponding to the commands that would be executed if we were
to actually build the application (e.g., by running ninja).
Attention
CMake generates compilation databases when the cmake command is
executed, allowing us to generate compilation databases without also
building the application. Other tools (like Bear) may require a build.
In this case, it contains:
[
{
"directory": "/home/username/src/build-cpu",
"command": "/usr/bin/c++ -o CMakeFiles/tutorial.dir/main.cpp.o -c /home/username/src/main.cpp",
"file": "/home/username/src/main.cpp"
},
{
"directory": "/home/username/src/build-cpu",
"command": "/usr/bin/c++ -o CMakeFiles/tutorial.dir/third-party/library.cpp.o -c /home/username/src/third-party/library.cpp",
"file": "/home/username/src/third-party/library.cpp"
},
{
"directory": "/home/username/src/build-cpu",
"command": "/usr/bin/c++ -o CMakeFiles/tutorial.dir/cpu/foo.cpp.o -c /home/username/src/cpu/foo.cpp",
"file": "/home/username/src/cpu/foo.cpp"
}
]
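Each entry in the database pairs a source file with the working directory and the command used to compile it. As a rough illustration of how a tool might read this format, the sketch below loads a database and maps each file to its list of arguments. It is not how CBI itself parses the file; the function names and the `build-cpu/compile_commands.json` path are assumptions.

```python
import json
import shlex
from pathlib import Path


def load_entries(path):
    """Return the list of entries stored in a compilation database."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def entry_arguments(entry):
    """Return an entry's command line as a list of arguments."""
    # The JSON format allows either an "arguments" list or a single
    # "command" string; the string form is split like a shell would.
    if "arguments" in entry:
        return list(entry["arguments"])
    return shlex.split(entry["command"])


def summarize(path):
    """Map each source file to the arguments used to compile it."""
    return {e["file"]: entry_arguments(e) for e in load_entries(path)}


if __name__ == "__main__":
    db = Path("build-cpu/compile_commands.json")  # assumed location
    if db.exists():
        for source, args in summarize(db).items():
            # args[0] is the compiler; the rest are its options.
            print(source, args[1:])
```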
GPU Compilation Commands
Repeating the exercise with GPU_OFFLOAD enabled gives us a different
compilation database for the GPU.
Warning
The GPU_OFFLOAD option is specific to this CMakeLists.txt file, and
isn’t something provided by CMake. Understanding how to build an application
for a specific target platform is beyond the scope of this tutorial.
As expected, we can see that the compilation database refers to gpu/foo.cpp
instead of cpu/foo.cpp, and that the GPU_OFFLOAD macro is defined as part
of each compilation command:
[
{
"directory": "/home/username/src/build-gpu",
"command": "/usr/bin/c++ -D GPU_OFFLOAD=1 -o CMakeFiles/tutorial.dir/main.cpp.o -c /home/username/src/main.cpp",
"file": "/home/username/src/main.cpp"
},
{
"directory": "/home/username/src/build-gpu",
"command": "/usr/bin/c++ -D GPU_OFFLOAD=1 -o CMakeFiles/tutorial.dir/third-party/library.cpp.o -c /home/username/src/third-party/library.cpp",
"file": "/home/username/src/third-party/library.cpp"
},
{
"directory": "/home/username/src/build-gpu",
"command": "/usr/bin/c++ -D GPU_OFFLOAD=1 -o CMakeFiles/tutorial.dir/gpu/foo.cpp.o -c /home/username/src/gpu/foo.cpp",
"file": "/home/username/src/gpu/foo.cpp"
}
]
These differences are the result of code divergence. We’ll explore how to use
codebasin to measure the amount of code divergence in a later tutorial.
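To make the differences between the two databases concrete, the sketch below compares them directly: it reports files compiled for only one platform and files whose macro definitions differ. This is a simplified illustration rather than the analysis codebasin performs, and every function name and path in it is an assumption.

```python
import json
import shlex
from pathlib import Path


def read_database(path):
    """Load the entries of a compilation database from disk."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def extract_defines(args):
    """Collect macro definitions given as "-DX=1" or "-D X=1"."""
    defines = set()
    i = 0
    while i < len(args):
        arg = args[i]
        if arg == "-D" and i + 1 < len(args):
            # Separate form: the definition is the next argument.
            defines.add(args[i + 1])
            i += 2
            continue
        if arg.startswith("-D") and len(arg) > 2:
            # Joined form: the definition follows "-D" directly.
            defines.add(arg[2:])
        i += 1
    return defines


def index_by_file(entries):
    """Map each source file to the set of macros defined for it."""
    result = {}
    for entry in entries:
        args = entry.get("arguments") or shlex.split(entry["command"])
        result[entry["file"]] = extract_defines(args)
    return result


def compare(cpu_entries, gpu_entries):
    """Summarize which files and definitions differ between platforms."""
    cpu = index_by_file(cpu_entries)
    gpu = index_by_file(gpu_entries)
    shared = sorted(cpu.keys() & gpu.keys())
    return {
        "only_cpu": sorted(cpu.keys() - gpu.keys()),
        "only_gpu": sorted(gpu.keys() - cpu.keys()),
        "changed_defines": {
            f: sorted(cpu[f] ^ gpu[f]) for f in shared if cpu[f] != gpu[f]
        },
    }


if __name__ == "__main__":
    cpu_db = Path("build-cpu/compile_commands.json")  # assumed location
    gpu_db = Path("build-gpu/compile_commands.json")  # assumed location
    if cpu_db.exists() and gpu_db.exists():
        report = compare(read_database(cpu_db), read_database(gpu_db))
        print(json.dumps(report, indent=2))
```

A comparison like this only looks at the compilation commands; it says nothing about how much of the source code inside each file is shared between platforms.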