2

I have a data set like the one below, and I want to swap the two values in a row whenever one variable is greater than the other.

data

   start_year   end_year
    1991         1995
    1994         1990
    1997         1999
    1994         1995
    1995         1995
    1996         1991

I want to swap the values of start_year and end_year in the rows where start_year is greater than end_year.

Expected output:

   start_year   end_year
    1991         1995
    1990         1994
    1997         1999
    1994         1995
    1995         1995
    1991         1996

Tried:

data_created = if(data$start_year > data$end_year, data$start_year == data$end_year & 
   data$end_year == data$start_year, data$start_year == data$start_year & 
   data$end_year == data$end_year)

Please help me with this.

talat

2 Answers

4

You can use pmin and pmax (see ?pmin and ?pmax) to compute the elementwise minimum and maximum of the years. Combined with transform, this would be:

transform(df, 
  start_year = pmin(start_year, end_year), 
  end_year = pmax(start_year, end_year))
#  start_year end_year
#1       1991     1995
#2       1990     1994
#3       1997     1999
#4       1994     1995
#5       1995     1995
#6       1991     1996
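
For comparison, here is a minimal sketch of a more explicit alternative that swaps only the offending rows by index. It rebuilds the question's data under the name df (as in the code above) and assumes neither column contains missing values, since a logical index containing NA is not allowed in this kind of data frame assignment. The helper name swap is made up for illustration.

```r
# Rebuild the question's data (named df to match the code above).
df <- data.frame(
  start_year = c(1991, 1994, 1997, 1994, 1995, 1996),
  end_year   = c(1995, 1990, 1999, 1995, 1995, 1991)
)

# Logical index of the rows whose years are in the wrong order
# (assumption: no NA values in either column).
swap <- df$start_year > df$end_year

# Assign the two columns in reversed order, for those rows only.
# The assignment is positional, so end_year lands in start_year and vice versa.
df[swap, c("start_year", "end_year")] <- df[swap, c("end_year", "start_year")]
df
```

Unlike the pmin/pmax version, this leaves every other row untouched, which may be easier to reason about when only a few rows need fixing.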
talat
0

You could use apply and sort for this:

t(apply(df, 1, sort))

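This sorts the data row by row in ascending order (the 1 argument in apply means "by rows"). apply returns each sorted row as a column, so t() transposes the result back; note that the result is a matrix, not a data frame.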

CJB
  • This will get nasty if other columns are present, especially when those are not `numeric` since `apply` converts `data.frame`s to `matrix` class – talat Dec 14 '15 at 09:56
  • Easy enough to isolate the columns with `df[, c("start_year", "end_year")]` and then `cbind` them back in. Or perhaps even just change those columns as a subset with one line. – CJB Dec 14 '15 at 10:01
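
The suggestion in the comments above could be sketched roughly as follows. The id column and the shortened data are made up for illustration, and the sketch assumes the year columns have no missing values (sort drops NA by default, which would leave rows of unequal length).

```r
# Hypothetical data: a few of the question's years plus a made-up character column.
df <- data.frame(
  id         = c("a", "b", "c"),
  start_year = c(1991, 1994, 1996),
  end_year   = c(1995, 1990, 1991)
)

# Names of the columns to sort within each row.
yrs <- c("start_year", "end_year")

# Sort only the year columns row by row. apply() returns the sorted rows
# as columns, so t() turns them back into rows before assigning.
df[yrs] <- t(apply(df[yrs], 1, sort))
df
```

Because only the two numeric columns are passed to apply, the character column is never coerced into the matrix and stays as it was.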