These examples were adapted from Vaughan, Hester, and François (2024).

Note

These functions ignore NA values for now. Adjustments for handling NA values are covered in a separate vignette.

R already provides efficient versions of the functions covered here. The implementations below only illustrate how to write C++ code that R can call.

Is any value in a vector ‘true’?

Base R’s any() function returns TRUE if there is at least one TRUE element in a vector, and FALSE otherwise. Below is one possible C++ implementation:

[[cpp4r::register]]
bool any_cpp(logicals x) {
  int n = x.size();
  
  for (int i = 0; i < n; ++i) {
    if (x[i]) {
      return true;
    }
  }
  return false;
}

Its R equivalent would be:

any_r <- function(x) {
  n <- length(x)
  
  for (i in seq_len(n)) { # seq_len() skips the loop when x is empty
    if (x[i]) {
      return(TRUE)
    }
  }
  FALSE
}
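
As a side note, the early-exit loop in any_cpp() can also be written with the standard algorithm std::any_of. The sketch below is standalone C++ and not part of the package: it uses std::vector<bool> as a stand-in for logicals so that it compiles outside R, and the name any_true is hypothetical.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical stand-in for any_cpp(): std::vector<bool> replaces logicals.
bool any_true(const std::vector<bool>& x) {
  // std::any_of stops at the first element for which the predicate holds,
  // mirroring the early return in the hand-written loop.
  return std::any_of(x.begin(), x.end(), [](bool v) { return v; });
}
```

For an empty input the function returns false, which matches any(logical(0)) in R.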

Add and document the functions, update the package as in the previous vignettes, and then compare the functions' speed with:

# install.packages("bench")
library(bench)

set.seed(123) # for reproducibility
x <- rpois(1e6, lambda = 2) # 1,000,000 elements
y <- x > 2 # the comparison already returns a logical vector

any(y)
any_cpp(y)
any_r(y)

mark(
  any(y),
  any_cpp(y),
  any_r(y)
)

Which elements in a vector are ‘true’?

Base R’s which() function returns the indices of the TRUE elements in a vector. Here is a possible C++ implementation:

[[cpp4r::register]]
integers which_cpp(logicals x) {
  int n = x.size();
  writable::integers res;
  int j = 0;

  for (int i = 0; i < n; ++i) {
    if (x[i]) {
      ++j;
      res.push_back(i + 1);
    }
  }

  if (j == 0) {
    return integers(0);
  } else {
    return res;
  }
}

Its R equivalent would be:

which_r <- function(x) {
  n <- length(x)
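  res <- integer(0) # start with an empty integer vector
  j <- 0

  for (i in seq_len(n)) { # seq_len() skips the loop when x is empty
    if (x[i]) {
      res <- c(res, i)
      j <- j + 1
    }
  }

  if (j == 0) {
    return(integer(0)) # same result as which() when nothing is TRUE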
  } else {
    return(res)
  }
}
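
The index bookkeeping can be checked outside R with a standalone sketch. It is not the package code: std::vector<bool> and std::vector<int> stand in for logicals and writable::integers, and the name which_true is hypothetical. It also shows that, with a plain growable vector, the empty result falls out naturally without a separate counter.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for which_cpp(): std::vector replaces the cpp4r types.
std::vector<int> which_true(const std::vector<bool>& x) {
  std::vector<int> res;
  for (std::size_t i = 0; i < x.size(); ++i) {
    if (x[i]) {
      // C++ counts from 0 and R counts from 1, hence the + 1.
      res.push_back(static_cast<int>(i) + 1);
    }
  }
  // If nothing matched, res is simply empty; no counter is needed here.
  return res;
}
```

Under these assumptions, an input of false, true, true, false, true yields the 1-based positions 2, 3, and 5.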

To test the functions, you can run the following benchmark code in the R console:

which(y[1:100])
which_cpp(y[1:100])
which_r(y[1:100])

mark(
  which(y[1:1000]),
  which_cpp(y[1:1000]),
  which_r(y[1:1000])
)

Are all values in a vector ‘true’?

Base R’s all() function checks if all elements in a vector are TRUE. Here is a possible C++ implementation that loops over the vector:

[[cpp4r::register]]
bool all_cpp_1(logicals x) {
  int n = x.size();
  for (int i = 0; i < n; ++i) {
    if (!x[i]) {
      return false;
    }
  }
  return true;
}

More concise C++ alternatives are:

[[cpp4r::register]]
bool all_cpp_2(logicals x) {
  for (int i = 0; i < x.size(); ++i) {
    if (!x[i]) {
      return false;
    }
  }
  return true;
}

[[cpp4r::register]]
bool all_cpp_3(logicals x) {
  for (bool i : x) {
    if (!i) {
      return false;
    }
  }
  return true;
}

[[cpp4r::register]]
bool all_cpp_4(logicals x) {
  // std::all_of requires #include <algorithm>
  return std::all_of(x.begin(), x.end(), [](bool v) { return v; });
}

To avoid typing std:: every time, you can add using namespace std; at the top of src/code.cpp. However, this is not recommended because it brings every name from the standard library into scope, which can lead to conflicts if two functions have the same name. A better option is to declare using std::the_function; which means you can write the_function instead of std::the_function each time while leaving all other names untouched (Akbiggs 2024).
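
A minimal sketch of such a using-declaration follows. It is standalone C++ rather than package code: std::vector<bool> stands in for logicals so that it compiles outside R, and the name all_true is hypothetical.

```cpp
#include <algorithm>
#include <vector>

// Bring only std::all_of into scope; every other name still needs std::.
using std::all_of;

// Hypothetical stand-in for all_cpp_4(): std::vector<bool> replaces logicals.
bool all_true(const std::vector<bool>& x) {
  // The unqualified call resolves to std::all_of via the declaration above.
  return all_of(x.begin(), x.end(), [](bool v) { return v; });
}
```

Because only one name is imported, a user-defined function with the same name as some other standard library function cannot clash with it.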

To test the functions, you can run the following tests and benchmark code in the R console:

set.seed(123) # for reproducibility
x <- rpois(1e6, lambda = 2) # 1,000,000 elements

all(x > 2)
all_cpp_1(x > 2)
all_cpp_2(x > 2)
all_cpp_3(x > 2)
all_cpp_4(x > 2)

# also test the TRUE-only case
all(x >= 0)
all_cpp_1(x >= 0)
all_cpp_2(x >= 0)
all_cpp_3(x >= 0)
all_cpp_4(x >= 0)

mark(
  all(x > 2),
  all_cpp_1(x > 2),
  all_cpp_2(x > 2),
  all_cpp_3(x > 2),
  all_cpp_4(x > 2)
)

References

Akbiggs. 2024. “What’s the Problem with "Using Namespace Std;"?” Forum post. Stack Overflow. https://stackoverflow.com/q/1452721/3720258.
Vaughan, Davis, Jim Hester, and Romain François. 2024. “Get Started with Cpp11.” https://cpp11.r-lib.org/articles/cpp11.html#intro.