Chapter 38 Iteration Best Practices
What You’ll Learn:
- When to use loops vs apply vs purrr
- Vectorization strategies
- Performance optimization
- Common pitfalls
- Design patterns
Key Errors Covered: 12+ iteration errors
Difficulty: ⭐⭐⭐ Advanced
38.2 Vectorization First
🎯 Best Practice: Prefer Vectorized Operations
# ❌ Bad: Loop
result <- numeric(length(mtcars$mpg))
for (i in seq_along(mtcars$mpg)) {
result[i] <- mtcars$mpg[i] * 2
}
# ❌ Bad: sapply (still a hidden loop, unnecessary for element-wise math)
result <- sapply(mtcars$mpg, function(x) x * 2)
# ✅ Good: Vectorized
result <- mtcars$mpg * 2
# Performance comparison
n <- 10000
x <- 1:n
system.time(sapply(x, sqrt))
#> user system elapsed
#> 0.003 0.000 0.003
system.time(sqrt(x)) # Much faster!
#> user system elapsed
#> 0 0 0
38.3 When to Use Each
💡 Key Insight: Decision Guide
# Use VECTORIZED operations when possible
x * 2
sqrt(x)
paste0("ID_", x)
# Use FOR LOOPS when:
# - Sequential dependencies
# - Early termination needed
# - Side effects (plotting, writing files)
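As an illustration of the first two cases, here is a minimal sketch of a loop where each step depends on the previous one and the loop stops early. The names (`deposits`, `target`) and the 1% growth factor are made up for this example and do not come from the chapter.

```r
# Hypothetical data: monthly deposits and a savings target
deposits <- c(100, 250, 75, 300, 500, 120)
target <- 700

balance <- numeric(length(deposits))  # pre-allocate the result
running <- 0
months_needed <- NA_integer_

for (i in seq_along(deposits)) {
  # Sequential dependency: each balance builds on the previous one
  running <- running * 1.01 + deposits[i]
  balance[i] <- running

  # Early termination: stop as soon as the target is reached
  if (running >= target) {
    months_needed <- i
    break
  }
}

months_needed
#> [1] 4
```

Entries of `balance` after the break stay at their pre-allocated value of 0, which makes it easy to see where the loop stopped.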
# Use APPLY family when:
# - Row/column operations on matrices
# - Simple transformations on lists
# - Base R only (no tidyverse)
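A short base R sketch of these cases follows. The `scores` matrix and `samples` list are hypothetical inputs chosen only to show the shape of each call.

```r
# Hypothetical 3 x 4 matrix (rows = students, columns = tests)
scores <- matrix(1:12, nrow = 3)

# MARGIN = 1 applies the function to each row, MARGIN = 2 to each column
row_ranges <- apply(scores, 1, function(r) max(r) - min(r))
col_max <- apply(scores, 2, max)

# lapply() always returns a list, one element per input element
samples <- list(a = 1:3, b = c(2.5, 7.5), c = 10)
means_list <- lapply(samples, mean)

# vapply() returns an atomic vector and checks every result
# against the template numeric(1)
means_vec <- vapply(samples, mean, numeric(1))
```

For plain row or column sums and means, `rowSums()`, `colSums()`, `rowMeans()`, and `colMeans()` are already vectorized, so they are usually a better first choice than `apply()`.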
# Use PURRR when:
# - Type safety matters
# - Complex error handling needed
# - Working with nested lists
# - Modern tidyverse workflows
38.4 Growing Objects Anti-Pattern
⚠️ Avoid Growing Objects
# ❌ Very bad: Growing vector
n <- 1000
system.time({
result <- c()
for (i in 1:n) {
result <- c(result, i^2)
}
})
#> user system elapsed
#> 0.004 0.000 0.005
# ✅ Good: Pre-allocate
system.time({
result <- numeric(n)
for (i in 1:n) {
result[i] <- i^2
}
})
#> user system elapsed
#> 0.003 0.000 0.002
# ✅ Best: Vectorize
system.time({
result <- (1:n)^2
})
#> user system elapsed
#> 0 0 0
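The timings above are taken at n = 1000, where all three variants finish within a few milliseconds and are hard to tell apart. The sketch below wraps each approach in a helper function (the names `grow_squares`, `prealloc_squares`, and `vector_squares` are hypothetical) and times them at larger sizes. Actual numbers depend on the machine and R version, so none are shown.

```r
# Three ways to compute the squares of 1..n
grow_squares <- function(n) {
  result <- c()
  for (i in seq_len(n)) result <- c(result, i^2)  # copies the vector each time
  result
}

prealloc_squares <- function(n) {
  result <- numeric(n)
  for (i in seq_len(n)) result[i] <- i^2  # fills an existing vector
  result
}

vector_squares <- function(n) seq_len(n)^2

# All three produce the same values
stopifnot(
  all.equal(grow_squares(100), prealloc_squares(100)),
  all.equal(prealloc_squares(100), vector_squares(100))
)

# Elapsed time for increasing n
for (size in c(1e3, 5e3, 2e4)) {
  timings <- c(
    grow = system.time(grow_squares(size))[["elapsed"]],
    prealloc = system.time(prealloc_squares(size))[["elapsed"]],
    vectorized = system.time(vector_squares(size))[["elapsed"]]
  )
  cat("n =", size, "\n")
  print(timings)
}
```

Because `c(result, i^2)` copies the whole vector on every iteration, the cost of the growing version rises roughly with the square of n, while the other two rise roughly linearly.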
# Growing lists
# ❌ Bad: list grows by one element per iteration
result_list <- list()
for (i in 1:n) {
result_list[[i]] <- i^2
}
# ✅ Good: Pre-allocate
result_list <- vector("list", n)
for (i in 1:n) {
result_list[[i]] <- i^2
}
# ✅ Better: Use map
result_list <- purrr::map(1:n, ~ .x^2)
38.5 Summary
Decision Tree:
Can it be vectorized?
├─ Yes → Use vectorized operations
└─ No → Is it row/column-wise on matrix?
    ├─ Yes → Use apply()
    └─ No → Working with lists?
        ├─ Yes → Need type safety?
        │   ├─ Yes → Use purrr::map_*()
        │   └─ No → Use lapply/sapply
        └─ No → Sequential dependencies?
            └─ Yes → Use for loop
Quick Reference:
| Task | Best Choice | Why |
|---|---|---|
| Element-wise math | Vectorized | Fastest |
| Row operations | apply() | Built-in |
| List operations | purrr::map_*() | Type-safe |
| Sequential | for loop | Clear logic |
| Side effects | for/walk() | Explicit |
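
The "Type-safe" and "walk()" entries in the table can be illustrated with a small sketch. It assumes the purrr package is installed. The input `raw` and the helper `parse_strict()` are hypothetical and exist only for this example.

```r
library(purrr)

# Hypothetical input: numbers stored as text, one of them malformed
raw <- list("1.5", "2", "oops", "4.25")

# map_int() returns an integer vector or fails; map() would return a list
lengths_int <- map_int(raw, nchar)

# as.numeric("oops") only warns and returns NA, so use a stricter parser
parse_strict <- function(x) {
  value <- suppressWarnings(as.numeric(x))
  if (is.na(value)) stop("cannot parse: ", x)
  value
}

# possibly() wraps a function so that an error yields a fallback value
safe_parse <- possibly(parse_strict, otherwise = NA_real_)
parsed <- map_dbl(raw, safe_parse)

# walk() calls a function on each element purely for its side effect
walk(parsed, function(v) cat("value:", v, "\n"))
```

Here `parsed` is a double vector whose third element is `NA`, so one bad input does not abort the whole iteration. Calling `map_dbl(raw, parse_strict)` directly would stop with an error at the malformed element instead.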
Best Practices: