How to use str_which to select rows which contain a string from a Vector

I have a table like this:

name    <- c("Goku","Vegeta","Jiren","Gohan","Piccolo","Kurinin","Trunks","Buu","Frieza","Cell","Muten","Gotens")
surname <- c("San","San","San","San","San","San","San","Majin","Evil","San","Roshi","San")
email   <- c("[email protected]","[email protected]","[email protected]","[email protected]","[email protected]","[email protected]","[email protected]","[email protected]","[email protected]","[email protected]","[email protected]","[email protected]")

table <- data.frame(name, surname, email, stringsAsFactors = FALSE)


I also have a vector of different email address endings. I want to find all rows whose email address uses one of these endings:

searchvector = c("@patrol.ch", "@babidi.com", "@rampage.usa")
searchvector = as.character(searchvector)

I tried two ways of finding the rows that contain an entry from the searchvector:

A. Using str_detect:

table[str_detect(table$email, "@patrol.ch|@babidi.com|@rampage.usa"), ]

This gives me the correct result

     name surname              email
3   Jiren     San    [email protected]
8     Buu   Majin     [email protected]
9  Frieza    Evil [email protected]
10   Cell     San   [email protected]




B. But when using str_which, I only ever get two rows:

table[str_which(table$email, searchvector), ]
table[str_which(table$email, c("@patrol.ch", "@babidi.com", "@rampage.usa")), ]

I get this result in both cases:

    name surname              email
8    Buu   Majin     [email protected]
9 Frieza    Evil [email protected]


Why is that? And how can I use str_which to do what I want to accomplish?

Answers

According to ?str_which, it is a wrapper function:

str_which() is a wrapper around which(str_detect(x, pattern)), and is equivalent to grep(pattern, x).
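The equivalence quoted above can be checked on a small input. This is only a minimal sketch: the vector `x` and the pattern are made up for illustration and are not part of the OP's data.

```r
library(stringr)

# Hypothetical input, invented for this illustration
x <- c("apple", "banana", "cherry")
pattern <- "an"

# Three ways of getting the positions of the matching elements
via_str_which <- str_which(x, pattern)
via_detect    <- which(str_detect(x, pattern))
via_grep      <- grep(pattern, x)

via_str_which
# [1] 2
```

All three calls return the position of "banana", the only element containing "an".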

In order to get the same output, we need a single string as the pattern. It can be created with paste by setting the collapse argument to "|":

table[str_which(table$email, paste(searchvector, collapse="|")), ]
#     name surname              email
#3   Jiren     San    [email protected]
#8     Buu   Majin     [email protected]
#9  Frieza    Evil [email protected]
#10   Cell     San   [email protected]




This is the same single-string pattern that was typed out by hand for str_detect in the OP's post.
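One caveat with the collapsed pattern: it is interpreted as a regular expression, so each "." matches any character and nothing ties the match to the end of the address. The sketch below shows one possible way to tighten it, by escaping the dots and anchoring with "$". The email addresses are hypothetical (invented for this example, since the OP's addresses are not reproduced here), and whether the stricter pattern is needed depends on the data.

```r
library(stringr)

# Hypothetical addresses, invented for illustration only
email <- c("goku@earth.example", "jiren@patrol.ch", "buu@babidi.com",
           "frieza@rampage.usa", "cell@patrol.chx.example")
searchvector <- c("@patrol.ch", "@babidi.com", "@rampage.usa")

# Collapsed pattern as in the answer: matches the ending anywhere in the string
naive <- paste(searchvector, collapse = "|")
naive_rows <- str_which(email, naive)
naive_rows
# [1] 2 3 4 5

# Stricter variant: escape the literal dots and anchor the match at the end
escaped <- gsub(".", "\\.", searchvector, fixed = TRUE)
strict <- paste0("(", paste(escaped, collapse = "|"), ")$")
strict_rows <- str_which(email, strict)
strict_rows
# [1] 2 3 4
```

The fifth address only contains "@patrol.ch" in the middle, so it is picked up by the collapsed pattern but not by the anchored one.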

If we use the vector as pattern in str_detect

table[str_detect(table$email, searchvector),]
#   name surname              email
#8    Buu   Majin     [email protected]
#9 Frieza    Evil [email protected]

this returns the same output as str_which does with the OP's code.

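As for vectorization: str_detect is vectorized over both the string and the pattern, but here 'email' (length 12) and 'searchvector' (length 3) have different lengths. The shorter vector is therefore recycled, so each email is compared with only one of the three patterns (the first email with the first pattern, the second with the second, the third with the third, the fourth with the first again, and so on) rather than with all of them.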
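The recycling can be made visible by repeating the pattern vector explicitly. The following is a rough sketch with hypothetical addresses (not the OP's data). It uses rep_len so that both vectors have the same length, because, as far as I know, stringr 1.5.0 and later reject a length mismatch like the one in the OP's call with an error instead of recycling silently.

```r
library(stringr)

# Hypothetical addresses, invented for illustration only
email <- c("a@patrol.ch", "b@babidi.com", "c@rampage.usa",
           "d@babidi.com", "e@rampage.usa", "f@patrol.ch")
searchvector <- c("@patrol.ch", "@babidi.com", "@rampage.usa")

# What recycling does: pattern i is paired with email i, then the patterns repeat
recycled <- rep_len(searchvector, length(email))
pairwise_rows <- which(str_detect(email, recycled))
pairwise_rows
# [1] 1 2 3

# Testing every email against every pattern, then combining with OR
any_match <- Reduce("|", lapply(searchvector, function(p) str_detect(email, p)))
any_rows <- which(any_match)
any_rows
# [1] 1 2 3 4 5 6
```

Emails 4 to 6 each end in one of the searched endings, but the pairwise comparison misses them because each is checked against a different ending than the one it has. Collapsing the patterns with "|" gives the same rows as the second approach in a single call.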

Posted on by akrun
