Chapter 55 Rcpp Basics

What You’ll Learn:

  • When to use C++
  • Basic Rcpp syntax
  • Common errors
  • Performance gains

Difficulty: ⭐⭐⭐ Advanced

55.1 Introduction

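Rcpp lets you write C++ functions and call them directly from R. The cppFunction() helper compiles a C++ function from a string and makes it available in the R session: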

library(Rcpp)

# Simple C++ function
cppFunction('
int add(int x, int y) {
  return x + y;
}
')

add(2, 3)
#> [1] 5

55.2 When to Use Rcpp

Use Rcpp when:

  • A loop can’t be vectorized
  • A function is recursive
  • You need maximum speed
  • You are processing large data

Don’t use Rcpp when:

  • The R solution is fast enough
  • The code can be vectorized in R
  • The maintenance burden is too high
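As a rough illustration of the first case, consider a running calculation in which each output depends on the previous one, which makes it awkward to vectorize. The sketch below is not from the chapter's own examples: the function names ewma_r and ewma_cpp and the smoothing parameter alpha are hypothetical, chosen only to show the pattern.

```r
library(Rcpp)

# R version: each element depends on the previous result
ewma_r <- function(x, alpha) {
  out <- numeric(length(x))
  if (length(x) == 0) return(out)
  out[1] <- x[1]
  for (i in seq_along(x)[-1]) {
    out[i] <- alpha * x[i] + (1 - alpha) * out[i - 1]
  }
  out
}

# C++ version of the same loop (note the zero-based indexing)
cppFunction('
NumericVector ewma_cpp(NumericVector x, double alpha) {
  int n = x.size();
  NumericVector out(n);
  if (n == 0) return out;
  out[0] = x[0];
  for (int i = 1; i < n; i++) {
    out[i] = alpha * x[i] + (1 - alpha) * out[i - 1];
  }
  return out;
}
')

x <- c(1, 2, 3, 4)
all.equal(ewma_r(x, 0.5), ewma_cpp(x, 0.5))
#> [1] TRUE
```

Whether the C++ version pays off in practice depends on the input size, so it is worth benchmarking on realistic data before committing to it.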

55.3 Basic Syntax

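💡 Key Insight: Rcpp Sugar

Rcpp sugar provides vectorized operators and functions in C++, so many expressions read almost like their R equivalents.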

# R-like syntax in C++
cppFunction('
NumericVector compute(NumericVector x) {
  return sqrt(x * 2 + 1);
}
')

compute(1:5)
#> [1] 1.732051 2.236068 2.645751 3.000000 3.316625

# Loops in C++
cppFunction('
double sum_cpp(NumericVector x) {
  double total = 0;
  for(int i = 0; i < x.size(); i++) {
    total += x[i];
  }
  return total;
}
')

sum_cpp(1:1000000)
#> [1] 500000500000
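cppFunction() wraps a single function. For longer code, Rcpp also provides sourceCpp(), which compiles a whole C++ source (a file, or a string passed via code = as here) and exports every function marked with the // [[Rcpp::export]] attribute. A minimal sketch, where range_cpp is a hypothetical name used only for illustration:

```r
library(Rcpp)

sourceCpp(code = '
#include <Rcpp.h>
using namespace Rcpp;

// [[Rcpp::export]]
double range_cpp(NumericVector x) {
  // Sugar functions max() and min() operate on the whole vector
  double hi = max(x);
  double lo = min(x);
  return hi - lo;
}
')

range_cpp(c(3, 9, 4))
#> [1] 6
```

Unlike cppFunction(), the source here has to spell out the #include and using namespace lines itself.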

55.4 Performance Example

# R version
mean_r <- function(x) {
  sum(x) / length(x)
}

# C++ version
cppFunction('
double mean_cpp(NumericVector x) {
  int n = x.size();
  double total = 0;
  for(int i = 0; i < n; i++) {
    total += x[i];
  }
  return total / n;
}
')

# Compare
x <- rnorm(1000000)

library(microbenchmark)
microbenchmark(
  mean(x),
  mean_r(x),
  mean_cpp(x),
  times = 100
)
#> Unit: microseconds
#>         expr      min       lq      mean   median       uq      max neval
#>      mean(x) 1614.553 1687.730 1825.2776 1755.481 1905.550 2447.942   100
#>    mean_r(x)  855.805  894.779  976.5183  935.518  985.955 1690.318   100
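#>  mean_cpp(x) 2250.870 2366.931 2606.6384 2449.316 2654.944 6136.568   100

In this run the hand-written C++ loop was the slowest of the three and mean_r() the fastest. sum() and mean() are already implemented in compiled code, so rewriting a vectorized built-in in C++ has little to gain. Exact timings vary by machine and compiler settings.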
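Before reading much into timings, it is worth confirming that the implementations agree. The check below repeats the definitions so that it runs on its own. The seed and vector length are arbitrary choices, and all.equal() is used with its default tolerance.

```r
library(Rcpp)

mean_r <- function(x) sum(x) / length(x)

cppFunction('
double mean_cpp(NumericVector x) {
  int n = x.size();
  double total = 0;
  for (int i = 0; i < n; i++) {
    total += x[i];
  }
  return total / n;
}
')

set.seed(1)  # arbitrary seed, only for reproducibility
x <- rnorm(1000)

# Compare up to floating-point tolerance: summation order and
# intermediate precision can differ between implementations
stopifnot(isTRUE(all.equal(mean(x), mean_r(x))))
stopifnot(isTRUE(all.equal(mean(x), mean_cpp(x))))
```

If either check fails, stopifnot() raises an error, which makes this a convenient guard to run ahead of a benchmark.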

55.5 Common Errors

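55.5.1 Error: Missing semicolon

# Wrong: R-style statement with no terminating semicolon
cppFunction('
NumericVector add_one(NumericVector x) {
  return x + 1  // Missing semicolon: compilation fails
}
')

Compilation stops with a message along the lines of "expected ';' before '}' token" (the exact wording depends on the compiler).

Solution: End every C++ statement with a semicolon. Note also that C++ comments start with //, not #.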

cppFunction('
NumericVector add_one(NumericVector x) {
  return x + 1;
}
')
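Another slip that is easy to make when moving between the two languages is indexing: C++ vectors start at 0, whereas R vectors start at 1. A small sketch, with last_cpp as a hypothetical function name; stop() here is Rcpp's C++ function that raises an R error.

```r
library(Rcpp)

cppFunction('
double last_cpp(NumericVector x) {
  if (x.size() == 0) stop("x must not be empty");
  // Valid indices run from 0 to x.size() - 1
  return x[x.size() - 1];
}
')

last_cpp(c(10, 20, 30))
#> [1] 30
```

Writing x[x.size()] instead would read past the end of the vector, so the explicit "- 1" is the part to watch.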

55.6 Summary

Key Takeaways:

  1. Rcpp - Integrates C++ into R for speed
  2. Use for loops - When the code can’t be vectorized
  3. Rcpp Sugar - R-like syntax in C++
  4. Test performance - C++ is not always faster
  5. Maintenance cost - Weigh the added complexity

Quick Reference:

# Inline C++
library(Rcpp)

cppFunction('
double my_function(NumericVector x) {
  double result = 0;  // declare the return value
  // C++ code here
  return result;
}
')

# Common types:
# NumericVector - numeric vector
# IntegerVector - integer vector
# CharacterVector - character vector
# NumericMatrix - matrix
# List - list
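To show how these types combine, here is a short sketch of a function that takes a NumericVector and returns a named List. The name describe_cpp and the choice of fields are assumptions made for illustration; List::create() and Named() are part of Rcpp.

```r
library(Rcpp)

cppFunction('
List describe_cpp(NumericVector x) {
  int n = x.size();
  double total = 0;
  for (int i = 0; i < n; i++) {
    total += x[i];
  }
  // Named() attaches element names, much like list(n = ..., total = ...) in R
  return List::create(Named("n") = n, Named("total") = total);
}
')

res <- describe_cpp(c(1.5, 2.5))
res$total
#> [1] 4
```

The int field arrives in R as an integer and the double as a numeric, so the C++ types map directly onto the R types listed above.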