Note

The functions in the previous vignette ignore NA values. The versions here extend them to handle NA values.

R already provides efficient versions of the functions covered here. This is just to illustrate how to use C++ code.

Is any value in a vector ‘true’?

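The following function extends the previous any_cpp() function to handle missing values. As in base R, a TRUE anywhere in the vector gives TRUE; otherwise the result is NA if the vector contains an NA and na_rm is false.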

[[cpp4r::register]]
r_bool any2_cpp(logicals x, bool na_rm = false) {
  // r_bool (assumed available as in cpp11) can hold NA_LOGICAL, unlike bool
  int n = x.size();
  bool has_na = false;

  for (int i = 0; i < n; ++i) {
    if (x[i] == NA_LOGICAL) {
      has_na = true; // remember the NA; it matters only if no TRUE is found
    } else if (x[i]) {
      return true; // a single TRUE decides the result
    }
  }

  // No TRUE was found: the result is NA if an NA was seen and not removed
  if (has_na && !na_rm) {
    return NA_LOGICAL;
  }

  return false;
}
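
R stores a logical NA as the integer INT_MIN, so a function that returns a plain C++ bool cannot hand an NA back to R: the implicit conversion yields true. The following standalone sketch shows the pitfall. The constant kNaLogical and both helper functions are hypothetical stand-ins, used so the snippet compiles without the R headers.

```cpp
#include <climits>

// Hypothetical stand-in for R's NA_LOGICAL, which R defines as INT_MIN.
constexpr int kNaLogical = INT_MIN;

// Passing the sentinel through a bool: the implicit int -> bool
// conversion turns every nonzero value, including INT_MIN, into true.
inline bool through_bool(int value) { return value; }

// Passing it through an int keeps the sentinel intact.
inline int through_int(int value) { return value; }
```

Here through_bool(kNaLogical) evaluates to true, while through_int(kNaLogical) still equals INT_MIN. This is why a return type that can carry the sentinel is needed whenever NA is a possible result.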

To test the functions, you can run the following benchmark code in the R console:

library(bench) # provides mark()

set.seed(123) # for reproducibility
x <- rpois(1e6, lambda = 2) # 1,000,000 elements
y <- x > 2 # logical vector without NA values

any(c(TRUE, NA, FALSE))
any2_cpp(c(TRUE, NA, FALSE))
any2_cpp(c(TRUE, NA, FALSE), na_rm = TRUE)

mark(
  any(y),
  any2_cpp(y)
)
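
The results expected from the checks above follow R's three-valued logic: a TRUE anywhere wins, otherwise an NA makes the answer NA unless it is removed. A minimal standalone sketch of that rule follows. It uses a hypothetical Tri enum and any_tri() function instead of the cpp4r types, so it compiles without R.

```cpp
#include <vector>

// Hypothetical three-valued logical, standing in for R's TRUE / FALSE / NA.
enum class Tri { False, True, NA };

// Sketch of any(x, na.rm = na_rm) over the stand-in type.
Tri any_tri(const std::vector<Tri>& x, bool na_rm = false) {
  bool has_na = false;
  for (Tri v : x) {
    if (v == Tri::NA) {
      has_na = true;     // undecided so far
    } else if (v == Tri::True) {
      return Tri::True;  // one TRUE settles it, NA or not
    }
  }
  // Only FALSE and possibly NA were seen.
  return (has_na && !na_rm) ? Tri::NA : Tri::False;
}
```

For example, any_tri({Tri::False, Tri::NA}) gives Tri::NA, and the same call with na_rm set to true gives Tri::False.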

Which elements in a vector are ‘true’?

The following function extends the previous which_cpp() function to handle missing values. As in base which(), an NA is never reported as a 'true' position.

[[cpp4r::register]]
integers which2_cpp(logicals x, bool na_rm = false) {
  // NA is never TRUE, so NA positions are dropped whatever na_rm is,
  // as in base which(); na_rm is kept only to match the other functions
  int n = x.size();
  writable::integers res;

  for (int i = 0; i < n; ++i) {
    if (x[i] == NA_LOGICAL) {
      continue; // skip NA values
    } else if (x[i]) {
      res.push_back(i + 1); // R positions are 1-based
    }
  }

  // An empty res is returned as integer(0)
  return res;
}
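
The index handling can be checked outside R. The sketch below is a standalone approximation: it represents logicals as int values, with the hypothetical constant kNaLogical standing in for NA_LOGICAL, and the hypothetical which_sketch() returns 1-based positions as which() does.

```cpp
#include <climits>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for R's NA_LOGICAL (R defines it as INT_MIN).
constexpr int kNaLogical = INT_MIN;

// Returns the 1-based positions of the TRUE elements of x.
std::vector<int> which_sketch(const std::vector<int>& x) {
  std::vector<int> res;
  for (std::size_t i = 0; i < x.size(); ++i) {
    // Keep only genuine TRUE values; FALSE (0) and NA are both dropped.
    if (x[i] != 0 && x[i] != kNaLogical) {
      res.push_back(static_cast<int>(i) + 1);  // shift from 0-based to 1-based
    }
  }
  return res;
}
```

With the input {1, kNaLogical, 0, 1} the sketch returns positions 1 and 4, and an input without TRUE values yields an empty vector.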

To test the functions, you can run the following benchmark code in the R console:

which(c(TRUE, NA, FALSE, TRUE))
which2_cpp(c(TRUE, NA, FALSE, TRUE))
which2_cpp(c(TRUE, NA, FALSE, TRUE), na_rm = TRUE)

mark(
  which(y[1:1000]),
  which2_cpp(y[1:1000])
)

Are all values in a vector ‘true’?

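The following function extends the previous all_cpp() function to handle missing values. As in base R, a FALSE anywhere in the vector gives FALSE; otherwise the result is NA if the vector contains an NA and na_rm is false.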

[[cpp4r::register]]
r_bool all2_cpp_1(logicals x, bool na_rm = false) {
  // r_bool (assumed available as in cpp11) can hold NA_LOGICAL, unlike bool
  int n = x.size();
  bool has_na = false;

  for (int i = 0; i < n; ++i) {
    if (x[i] == NA_LOGICAL) {
      has_na = true; // remember the NA; it matters only if no FALSE is found
    } else if (!x[i]) {
      return false; // a single FALSE decides the result
    }
  }

  // No FALSE was found: the result is NA if an NA was seen and not removed
  if (has_na && !na_rm) {
    return NA_LOGICAL;
  }

  return true;
}

More concise C++ alternatives are:

[[cpp4r::register]]
r_bool all2_cpp_2(logicals x, bool na_rm = false) {
  // r_bool (assumed available as in cpp11) can hold NA_LOGICAL, unlike bool
  bool has_na = false;

  // Same logic as all2_cpp_1, without the separate length variable
  for (int i = 0; i < x.size(); ++i) {
    if (x[i] == NA_LOGICAL) {
      has_na = true;
    } else if (!x[i]) {
      return false;
    }
  }

  if (has_na && !na_rm) {
    return NA_LOGICAL;
  }

  return true;
}

[[cpp4r::register]]
r_bool all2_cpp_3(logicals x, bool na_rm = false) {
  // r_bool (assumed available as in cpp11) can hold NA_LOGICAL, unlike bool
  bool has_na = false;

  // Range-based loop: no index variable is needed
  for (r_bool xi : x) {
    if (xi == NA_LOGICAL) {
      has_na = true;
    } else if (!xi) {
      return false;
    }
  }

  if (has_na && !na_rm) {
    return NA_LOGICAL;
  }

  return true;
}
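
The same decision rule can also be written with standard algorithms instead of a hand-written loop. The following is a sketch under stated assumptions, not the package's implementation: it runs on a plain std::vector<int>, and the constant kNaLogical and the function all_sketch() are hypothetical stand-ins for NA_LOGICAL and the registered functions.

```cpp
#include <algorithm>
#include <climits>
#include <vector>

// Hypothetical stand-in for R's NA_LOGICAL (R defines it as INT_MIN).
constexpr int kNaLogical = INT_MIN;

// Returns 0 (FALSE), 1 (TRUE) or kNaLogical (NA),
// mirroring all(x, na.rm = na_rm).
int all_sketch(const std::vector<int>& x, bool na_rm = false) {
  // Any FALSE decides the result, whether or not NA values are present.
  if (std::find(x.begin(), x.end(), 0) != x.end()) {
    return 0;
  }
  // No FALSE: the answer is NA if an NA remains and is not removed.
  if (!na_rm && std::find(x.begin(), x.end(), kNaLogical) != x.end()) {
    return kNaLogical;
  }
  return 1;
}
```

This version is shorter to read but may scan the vector twice, whereas the explicit loop makes a single pass. An empty input returns 1, matching all(logical(0)) in R.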

To test the functions, you can run the following tests and benchmark code in the R console:

set.seed(123) # for reproducibility
x <- rpois(1e6, lambda = 2) # 1,000,000 elements

all(c(TRUE, NA, TRUE))
all2_cpp_1(c(TRUE, NA, TRUE))
all2_cpp_1(c(TRUE, NA, TRUE), na_rm = TRUE)

all(c(FALSE, NA, TRUE))
all2_cpp_1(c(FALSE, NA, TRUE))
all2_cpp_1(c(FALSE, NA, TRUE), na_rm = TRUE)

# also test the TRUE-only case
all(x >= 0)
all2_cpp_1(x >= 0)
all2_cpp_2(x >= 0)
all2_cpp_3(x >= 0)

mark(
  all(x > 2),
  all2_cpp_1(x > 2),
  all2_cpp_2(x > 2),
  all2_cpp_3(x > 2)
)
