15

I have a data frame with three variables and 250K records. As an example consider

df <- data.frame(V1=c(1,2,4), V2=c("a","a","b"), V3=c(2,3,1))
V1 V2 V3
1  a  2
2  a  3
4  b  1

and want to swap values between V1 and V3 based on the value of V2 as follows: if V2 == 'b' then V1 <- V3 and V3 <- V1 resulting in

V1 V2 V3
1  a  2
2  a  3
1  b  4

I tried a do loop but it takes forever. If I use Perl, it takes seconds. I believe this task can be done efficiently in R as well. Any suggestions are appreciated.

NelsonGon
  • 13,015
  • 7
  • 27
  • 57
  • 1
    I'm curious how this situation came up, if you don't mind my asking. I have some experience using software that is at least sort of designed to work with survey data, but as we follow the trend of integrating with IT databases, questions of data structure have started coming up and I've needed to start thinking consciously about how we store stuff in tables. That's why I'm curious how your situation came up :) – Jonathan Oct 12 '11 at 21:38

3 Answers3

22

Try this

 df <- data.frame(V1=c(1,2,4), V2=c("a","a","b"), V3=c(2,3,1))
 df[df$V2 == "b", c("V1", "V3")] <- df[df$V2 == "b", c("V3", "V1")] 

which yields:

> df
  V1 V2 V3
1  1  a  2
2  2  a  3
3  1  b  4
Ryogi
  • 5,497
  • 5
  • 26
  • 46
15

You can use transform to do this.

df <- transform(df, V3 = ifelse(V2 == 'b', V1, V3), V1 = ifelse(V2 == 'b', V3, V1))
paul
  • 1,201
  • 12
  • 9
  • I am trying x=read.table("1.txt") x <- transform(x, x[[1]] <- ifelse(x[[1]]>x[[2]], x[[2]], x[[1]]), x[[2]] <- ifelse(x[[1]]>x[[2]], x[[1]], x[[2]])) but no success! – phoenix Oct 09 '13 at 12:14
4

Editted I got tripped up with column names, sorry. This works.

If you don't mind the rows ending up in different orders, this is kind of a 'cute' way to do this:

dat <- read.table(textConnection("V1 V2 V3
1  a  2
2  a  3
4  b  1"),sep = "",header = TRUE)

tmp <- dat[dat$V2 == 'b',3:1]
colnames(tmp) <- colnames(dat)
rbind(dat[dat$V2 != 'b',],tmp)

Basically, that's just grabbing the rows where V2 == 'b', reverses the columns and slaps it back together with everything else. This can be extended if you have more columns that don't need switching; you'd just use an integer index with those values transposed, rather than just 3:1.

joran
  • 169,992
  • 32
  • 429
  • 468