Motivation
The previous package skeleton left out some essential details, such as testing for memory leaks and debugging. This vignette will cover these aspects in more detail. The references for this vignette are Vaughan, Hester, and Francois (2024), Padgham (2022), Vaughan (2019), Wickham and Bryan (2023), and R-pkg-devel (2024).
Compiler Setup
To test that your functions do not lead to memory errors, you can create or edit a src/Makevars.in file within the package folder and add the compiler flags -Wall -O0 -pedantic -UDEBUG -g to it. On Windows, the file is src/Makevars.win and the content is the same.
These flags require some explanation:
- -Wall: Enables most of the commonly used compiler warnings (despite its name, not every available warning).
- -O0: Disables optimizations for debugging purposes. Otherwise, the compiler optimizes the compiled binaries for speed, which makes them harder to debug. Once your code is working, you can switch to -O2 or -O3 to enable optimizations for production. If you plan to submit the package to CRAN, remove the flag, as they do not accept it.
- -pedantic: Enforces strict ISO C++ compliance. This will warn about non-standard constructs, similar to flagging bad grammar or spelling errors in English.
- -UDEBUG -g: Undefines the DEBUG macro and enables debugging information. When a package is ready, -g can be removed to reduce the size of the compiled binaries.
In case the “pedantic” part is not clear, here is an example:
#include "cpp4r.hpp"
#include <numeric>  // std::accumulate
#include <vector>   // std::vector
using namespace cpp4r;
// Non-ISO: Use a variable length array
[[cpp4r::register]] double squared_sum_non_iso_(integers inp) {
  int size = inp.size();
  double array[size]; // will give a warning, but still compile
  for (int i = 0; i < size; ++i) {
    array[i] = inp[i] * inp[i];
  }
  return std::accumulate(array, array + size, 0.0);
}
// ISO: Use a vector
[[cpp4r::register]] double squared_sum_iso_(integers inp) {
  int size = inp.size();
  std::vector<double> vec(size);
  for (int i = 0; i < size; ++i) {
    vec[i] = inp[i] * inp[i];
  }
  return std::accumulate(vec.begin(), vec.end(), 0.0);
}
Although the code compiles, the compiler emits a warning for the variable length array:
code.cpp:489:10: warning: ISO C++ forbids variable length array ‘array’ [-Wvla]
489 | double array[size]; // will give a warning, but still compile
It is possible to verify that the functions are correct:
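As a rough standalone sketch of such a check, the ISO variant can be mirrored outside of R and compared against the closed form for a sum of squares. This is an illustration under assumptions: std::vector<int> stands in for cpp4r's integers, and the names squared_sum and squared_sum_closed_form are hypothetical helpers, not part of cpp4r or cpp4rexamples.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical stand-in for squared_sum_iso_: takes std::vector<int>
// instead of cpp4r::integers so it compiles without R.
double squared_sum(const std::vector<int>& inp) {
  std::vector<double> vec(inp.size());
  for (std::size_t i = 0; i < inp.size(); ++i) {
    // Widen to double before multiplying to avoid int overflow
    const double x = static_cast<double>(inp[i]);
    vec[i] = x * x;
  }
  return std::accumulate(vec.begin(), vec.end(), 0.0);
}

// Closed form for 1^2 + 2^2 + ... + n^2, used as an independent reference
double squared_sum_closed_form(int n) {
  assert(n >= 0);
  return n * (n + 1.0) * (2.0 * n + 1.0) / 6.0;
}
```

For the input 1 to 5, both helpers give 55 (1 + 4 + 9 + 16 + 25), which is the value the registered functions should return for 1:5.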
Testing for Memory Leaks
To test for memory leaks, you can use the valgrind tool. This tool is available on Linux and macOS.
You can test for memory leaks by running the following command in the terminal:
Or, alternatively, to call R in vanilla mode:
with test.R containing:
library(cpp4rexamples)
squared_sum_iso_(1:5)
Adding a configure script
For a portable package, it is recommended to add a configure script. This script checks for the tools needed to build the package. It is a shell script saved as configure in the package root directory.
Here is an example of a configure script for the cpp4rexamples package:
#!/bin/sh
# Module name passed to pkg-config
PKG_CONFIG_NAME="gccsanissue"
# Query pkg-config only if it is installed
pkg-config --version >/dev/null 2>&1
if [ $? -eq 0 ]; then
  PKGCONFIG_CFLAGS=`pkg-config --cflags --silence-errors ${PKG_CONFIG_NAME}`
  PKGCONFIG_LIBS=`pkg-config --libs ${PKG_CONFIG_NAME}`
fi
# Use the pkg-config results when any were found
if [ "$PKGCONFIG_CFLAGS" ] || [ "$PKGCONFIG_LIBS" ]; then
  echo "Found pkg-config cflags and libs!"
  PKG_CFLAGS=${PKGCONFIG_CFLAGS}
  PKG_LIBS=${PKGCONFIG_LIBS}
fi
# -stdlib=libc++ is a Clang option
CXXFLAGS="-stdlib=libc++"
# CXXFLAGS="-O0 -g -stdlib=libc++"
LDFLAGS="-stdlib=libc++"
# Fill in the template placeholders and write src/Makevars
sed -e "s|@cxxflags@|$CXXFLAGS|" \
    -e "s|@ldflags@|$LDFLAGS|" \
    src/Makevars.in > src/Makevars
exit 0
This file, meant for Unix systems, must be accompanied by a configure.win file for Windows systems.
Besides the configure script, a src/Makevars.in file must be created. This file is a template for the src/Makevars file. Here is an example:
LDFLAGS=@ldflags@
# Convert source files to object files
SOURCES = code.cpp \
          cpp4r.cpp
OBJECTS = $(SOURCES:.cpp=.o)
all: $(SHLIB)
$(SHLIB): $(OBJECTS)
clean:
	rm -f $(OBJECTS) $(SHLIB)
Makevars.in is also for Unix systems, and it must be accompanied by an empty Makevars.win for Windows systems.
Finally, a cleanup script helps to get a tidy package build. Here is an example:
#!/bin/sh
rm -f src/Makevars configure.log
The advantage of this approach is that it creates a src/Makevars file with the correct flags for the system. This way, the package is portable and easier to test with GitHub Actions or Docker.
Testing with Docker
CRAN checks packages on different Unix platforms, and additional tests for compiled code include testing for memory leaks with valgrind and address sanitizer.
Derived from the recommendations made by Dr. Krylov and Dr. Eddelbuettel in the R-pkg-devel mailing list, you can create the following script to test the package in a Docker container:
#!/bin/sh
PACKAGE_DIR=$(pwd)
# DOCKER_IMAGE="ghcr.io/r-hub/containers/valgrind:latest"
DOCKER_IMAGE="ghcr.io/r-hub/containers/clang-asan:latest"
docker pull "$DOCKER_IMAGE"
docker run --rm -v "$PACKAGE_DIR":/workspace -w /workspace "$DOCKER_IMAGE" bash -c "
  Rscript -e 'install.packages(\"cpp4r\", repos=\"https://cran.rstudio.com/\")'
  R CMD build .
  R CMD check --as-cran --no-manual gccsanissue_0.1.0.tar.gz
"
You can add the following function to cpp4rexamples:
[[cpp4r::register]] int bad_() {
  int x = 42; // valid integer
  int *ptr = &x; // pointer to `x`
  // undefined behavior (alignment issue)
  auto misaligned_ptr = reinterpret_cast<long*>(ptr);
  return *misaligned_ptr; // Read through misaligned pointer
}
To access the function from the R session, you can add the following R code:
#' @title Bad function
#' @description This function has a GCC SAN issue
#' @export
#' @examples
#' bad()
bad <- function() {
  bad_()
}
The function bad_ introduces undefined behavior by reading an int object through a long pointer. This is a common kind of mistake in C++ code, and it is a good example to test the address sanitizer. reinterpret_cast<long*> creates a pointer that may be misaligned, since ptr is only guaranteed to be aligned for int, not long. On platforms where long is wider than int, the dereference *misaligned_ptr also reads past the end of x, and in either case it is undefined behavior. Without sanitizers, this will usually execute silently and return the value of x. With sanitizers, it will report an error and abort.
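For contrast, one way to obtain the same value without the cast is to copy exactly sizeof(int) bytes and only then widen the result. The sketch below is standalone C++ and assumes nothing from cpp4r; the names read_int_as_long and good are hypothetical and only illustrate the idea.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical helper: reads the int that `ptr` points to and widens it.
// std::memcpy copies exactly sizeof(int) bytes, so the read stays inside
// the int object and has no alignment requirement beyond that of int.
long read_int_as_long(const int* ptr) {
  assert(ptr != nullptr);
  int value;
  std::memcpy(&value, ptr, sizeof(value));
  return static_cast<long>(value);
}

// Hypothetical counterpart to bad_(): same result, no reinterpret_cast
int good() {
  int x = 42;
  return static_cast<int>(read_int_as_long(&x));
}
```

Here good() returns 42, and because no byte outside x is touched, a sanitizer build has nothing to report for it.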
When testing with the clang-asan container and the command bash dev/test-docker.sh, the R checks will pass, but the sanitizer check will fail with the following error:
Shadow bytes around the buggy address:
0x6ffdc4008d80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
Shadow byte legend (one shadow byte represents 8 application bytes):
Addressable: 00
Partially addressable: 01 02 03 04 05 06 07
Heap left redzone: fa
Freed heap region: fd
Stack left redzone: f1
Stack mid redzone: f2
Stack right redzone: f3
Stack after return: f5
Stack use after scope: f8
Global redzone: f9
Global init order: f6
Poisoned by user: f7
Container overflow: fc
Array cookie: ac
Intra object redzone: bb
ASan internal: fe
Left alloca redzone: ca
Right alloca redzone: cb
==1244==ABORTING