Iterate functions with map

R
purrr
map
apply
When your functions are too slow because they’re not vectorized, map() can be a great alternative to for-loops.
Author

Guilherme D. Garcia

Published

April 24, 2023

I’m often in a situation where I have a function and I need to apply it iteratively to a lot of data. This is especially necessary for the package I maintain (Fonology). We want functions to be fast, of course. In R, this usually means we want them to be vectorized, that is, able to take a whole vector as input and operate on all of its elements in a single call. Not coming from computer science, I find this topic quite interesting.

Not all functions can be vectorized, and that’s the issue. So what can we do? A common option is to run a for-loop. Here’s a quick example: suppose you want to write a sequence of numbers where each number n repeats n times. Here’s one way to do that with a for-loop:

numbers = 1:5

for(i in numbers){
  rep(i, i) |> 
    print()
}
[1] 1
[1] 2 2
[1] 3 3 3
[1] 4 4 4 4
[1] 5 5 5 5 5
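The loop above only prints each sequence. If you need to keep the results, here is a minimal sketch that stores them in a preallocated list and flattens that list at the end. The names `results` and `flat` are just illustrative choices, not something from the original example.

```r
numbers = 1:5

# Preallocate a list with one slot per input value
results = vector("list", length(numbers))

for(i in seq_along(numbers)){
  # Store each repeated sequence instead of printing it
  results[[i]] = rep(numbers[i], numbers[i])
}

# Flatten the list into a single vector
flat = unlist(results)
flat
```

Preallocating `results` avoids growing an object inside the loop, which is a common reason for-loops in R end up slow.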

For-loops tend to do a great job if you don’t have too much data. They also tend to be intuitive, so if you’re not familiar with more esoteric functions, they are a very good place to start. That being said, it’s usually a good idea to avoid for-loops if there’s a better option out there (for-loops can be much slower, especially when they grow an object one iteration at a time). One common alternative is to use the apply() family of functions in R, such as lapply(), which applies a function to each element of its input and returns a list.

numbers = 1:5

lapply(numbers, function(x){
  rep(x, x)}) |> 
  unlist()
 [1] 1 2 2 3 3 3 4 4 4 4 5 5 5 5 5
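When each call returns exactly one value, another member of the same family, `vapply()`, may be a better fit, because you declare the expected type and length of each result up front. The sketch below is only an illustration: it assumes we want the length of each repeated sequence rather than the sequence itself, and `lens` is a made-up name.

```r
numbers = 1:5

# integer(1) declares that every call must return a single integer
lens = vapply(numbers, function(x){
  length(rep(x, x))}, integer(1))
lens
```

If a call returned something other than a single integer, `vapply()` would stop with an error rather than silently change the shape of the output.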

A more recent option is to use the map() function from the purrr package, which is extremely useful. Here’s the same idea with map():

library(purrr)
map(1:5, \(x) rep(x, x)) |> unlist()
 [1] 1 2 2 3 3 3 4 4 4 4 5 5 5 5 5

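So, if you’ve created a non-vectorized function and now need to apply it to several inputs at once (say, to a whole column of data), you can use map() or a function from the apply() family, such as lapply(), to call it once per input and collect the results, which often speeds up the process.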
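To make the column scenario concrete, here is a small sketch. The function `count_vowels()` and the data frame `d` are hypothetical, invented only for this illustration. The function works on one word at a time, so calling it on a whole column silently processes only the first word. `map_int()`, a variant of `map()` that returns an integer vector, calls it once per word.

```r
library(purrr)

# Hypothetical function: counts the vowels in ONE word.
# It is not vectorized: strsplit(...)[[1]] keeps only the first element.
count_vowels = function(word){
  letters_in_word = strsplit(word, "")[[1]]
  sum(letters_in_word %in% c("a", "e", "i", "o", "u"))
}

# Made-up data for illustration
d = data.frame(word = c("banana", "kiwi", "plum"))

# Called on the whole column, the function only looks at "banana"
first_only = count_vowels(d$word)

# map_int() applies the function to each word in turn
d$vowels = map_int(d$word, count_vowels)
d
```

In base R, `vapply(d$word, count_vowels, integer(1))` should do the same job, although it also names the result after the words.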


Copyright © 2023 Guilherme Duarte Garcia