Logical Functions with Missing Values
Source:vignettes/07-logical-functions-2.Rmd
Note
The functions in the previous vignette ignored NA values. This vignette covers the adjustments for handling NA values.
Base R already provides efficient versions of the functions covered here (any(), which(), and all()). The C++ versions only illustrate how to write this kind of code.
Is any value in a vector ‘true’?
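The following function extends the previous any_cpp() function to handle missing values. Like any() in base R, it returns TRUE as soon as any element is TRUE, and otherwise NA when at least one element is NA and na_rm is false.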
[[cpp4r::register]]
r_bool any2_cpp(logicals x, bool na_rm = false) {
  int n = x.size();
  bool has_na = false;

  for (int i = 0; i < n; ++i) {
    if (x[i] == NA_LOGICAL) {
      has_na = true;  // remember the NA and keep scanning for a TRUE
    } else if (x[i]) {
      return true;  // a single TRUE decides the result
    }
  }

  // No TRUE was found: the result is NA if an NA was seen and na_rm is false.
  // The return type is r_bool because a plain C++ bool cannot hold NA
  // (NA_LOGICAL converted to bool is true).
  if (has_na && !na_rm) {
    return r_bool(NA_LOGICAL);
  }

  return false;
}
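The NA handling above can be tried outside R with a minimal standalone sketch. It makes several assumptions: no R headers are used, the hypothetical constant kNaLogical stands in for NA_LOGICAL (which R defines as INT_MIN), logical values are stored as int, and any3() is an illustrative name that is not part of cpp4r.

```cpp
#include <cassert>
#include <climits>
#include <vector>

// Hypothetical stand-in for R's NA_LOGICAL (R defines it as INT_MIN).
constexpr int kNaLogical = INT_MIN;

// Three-valued any(): returns 1 (TRUE), 0 (FALSE), or kNaLogical (NA).
int any3(const std::vector<int>& x, bool na_rm = false) {
  bool has_na = false;
  for (int v : x) {
    if (v == kNaLogical) {
      has_na = true;  // remember the NA and keep looking for a TRUE
    } else if (v != 0) {
      return 1;  // a single TRUE decides the result, NA or not
    }
  }
  // No TRUE found: NA wins over FALSE unless NAs are removed.
  return (has_na && !na_rm) ? kNaLogical : 0;
}
```

Under these assumptions, any3({0, kNaLogical, 1}) yields 1, while any3({0, kNaLogical}) yields kNaLogical unless na_rm is true. The sketch returns an int rather than a bool because converting kNaLogical to bool gives true, which would silently turn NA into TRUE.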
To test the function, compare its results with any() from base R in the R console, including on vectors that contain NA.
Which elements in a vector are ‘true’?
The following function extends the previous which_cpp() function to handle missing values. As with which() in base R, NA elements are never included in the result.
[[cpp4r::register]]
integers which2_cpp(logicals x, bool na_rm = false) {
  int n = x.size();
  writable::integers res;

  for (int i = 0; i < n; ++i) {
    // NA elements are never selected, as in which() in base R,
    // so na_rm does not change the result
    if (x[i] == NA_LOGICAL) {
      continue;
    }
    if (x[i]) {
      res.push_back(i + 1);  // R indices are 1-based
    }
  }

  // If nothing was pushed, res is empty and R receives integer(0)
  return res;
}
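The index logic can also be sketched without R. The sketch below assumes the same hypothetical stand-ins as before: kNaLogical replaces NA_LOGICAL (INT_MIN in R), logical values are stored as int, and which3() is an illustrative name that is not part of cpp4r.

```cpp
#include <cassert>
#include <climits>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for R's NA_LOGICAL (R defines it as INT_MIN).
constexpr int kNaLogical = INT_MIN;

// Returns the 1-based positions of the TRUE elements.
// NA and FALSE elements are both skipped.
std::vector<int> which3(const std::vector<int>& x) {
  std::vector<int> res;
  for (std::size_t i = 0; i < x.size(); ++i) {
    if (x[i] != kNaLogical && x[i] != 0) {
      res.push_back(static_cast<int>(i) + 1);  // shift to R's 1-based indexing
    }
  }
  return res;  // empty when no element is TRUE
}
```

For the input {1, kNaLogical, 0, 1} this sketch returns the positions 1 and 4. An input without any TRUE gives an empty vector, so no special case is needed for it.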
To test the function, compare its results with which() from base R in the R console, including on vectors that contain NA.
Are all values in a vector ‘true’?
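The following function extends the previous all_cpp() function to handle missing values. Like all() in base R, it returns FALSE as soon as any element is FALSE, and otherwise NA when at least one element is NA and na_rm is false.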
[[cpp4r::register]]
r_bool all2_cpp_1(logicals x, bool na_rm = false) {
  int n = x.size();
  bool has_na = false;

  for (int i = 0; i < n; ++i) {
    if (x[i] == NA_LOGICAL) {
      has_na = true;  // remember the NA and keep scanning for a FALSE
    } else if (!x[i]) {
      return false;  // a single FALSE decides the result
    }
  }

  // No FALSE was found: the result is NA if an NA was seen and na_rm is false.
  // The return type is r_bool because a plain C++ bool cannot hold NA
  // (NA_LOGICAL converted to bool is true).
  if (has_na && !na_rm) {
    return r_bool(NA_LOGICAL);
  }

  return true;
}
More concise C++ alternatives are:
[[cpp4r::register]]
r_bool all2_cpp_2(logicals x, bool na_rm = false) {
  bool has_na = false;

  // Same loop as all2_cpp_1, with the size read inline
  for (int i = 0; i < x.size(); ++i) {
    if (x[i] == NA_LOGICAL) {
      has_na = true;
    } else if (!x[i]) {
      return false;
    }
  }

  return (has_na && !na_rm) ? r_bool(NA_LOGICAL) : r_bool(true);
}
[[cpp4r::register]]
r_bool all2_cpp_3(logicals x, bool na_rm = false) {
  bool has_na = false;

  // Range-based for loop: visit each element without an index
  for (r_bool v : x) {
    if (v == NA_LOGICAL) {
      has_na = true;
    } else if (!v) {
      return false;
    }
  }

  return (has_na && !na_rm) ? r_bool(NA_LOGICAL) : r_bool(true);
}
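The precedence of FALSE over NA in all() can be checked with a minimal standalone sketch. It assumes no R headers: the hypothetical constant kNaLogical stands in for NA_LOGICAL (INT_MIN in R), logical values are stored as int, and all3() is an illustrative name that is not part of cpp4r.

```cpp
#include <cassert>
#include <climits>
#include <vector>

// Hypothetical stand-in for R's NA_LOGICAL (R defines it as INT_MIN).
constexpr int kNaLogical = INT_MIN;

// Three-valued all(): returns 1 (TRUE), 0 (FALSE), or kNaLogical (NA).
int all3(const std::vector<int>& x, bool na_rm = false) {
  bool has_na = false;
  for (int v : x) {
    if (v == kNaLogical) {
      has_na = true;  // remember the NA and keep looking for a FALSE
    } else if (v == 0) {
      return 0;  // a single FALSE decides the result, NA or not
    }
  }
  // No FALSE found: NA wins over TRUE unless NAs are removed.
  return (has_na && !na_rm) ? kNaLogical : 1;
}
```

Under these assumptions, all3({1, kNaLogical, 1}) yields kNaLogical, all3({0, kNaLogical, 1}) yields 0, and an empty input yields 1, which matches all(logical(0)) being TRUE in base R.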
To test the functions, you can run the following tests and benchmark code in the R console:
set.seed(123) # for reproducibility
x <- rpois(1e6, lambda = 2) # 1,000,000 elements
all(c(TRUE, NA, TRUE))
all2_cpp_1(c(TRUE, NA, TRUE))
all2_cpp_1(c(TRUE, NA, TRUE), na_rm = TRUE)
all(c(FALSE, NA, TRUE))
all2_cpp_1(c(FALSE, NA, TRUE))
all2_cpp_1(c(FALSE, NA, TRUE), na_rm = TRUE)
# also test the TRUE-only case
all(x >= 0)
all2_cpp_1(x >= 0)
all2_cpp_2(x >= 0)
all2_cpp_3(x >= 0)