The data frame I am working with has two columns: 1) person ID and 2) date. I am trying to assign a numeric day value to each date, counted separately for each person.

For instance, person 1 has dates from 2016-01-01 (baseline) to 2016-01-05 (the last date for person 1). I want to create a day column that translates these to 1, 2, 3, 4, 5. If person 2 has dates from 2016-01-13 to 2016-01-16, the day column for person 2 would be 1, 2, 3, 4.

df <- for(i in length(unique(per1$date))){df$day[per1$date[1] + i] <- i+1}

This is basically what I am trying to do, but I get an error message saying:

"replacement has 17119 rows, data has 1670"

Please let me know how I can write the code for this. Thank you.

alistaire
JParkDS
  • Can you add the output from `dput(head(df))` so we can help you with this? Please add it to the question as an edit rather than as a comment. – morgan121 Jan 29 '19 at 05:55
  • Welcome to SO. Please read how to give a [reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example). This will make it much easier for others to help you. Otherwise, see `tidyr::complete`. – A. Suliman Jan 29 '19 at 06:10

1 Answer

You can use `data.table` for this:

library(data.table)

## Create Data
df <- data.table(personID = c(1,1,1,2,2,2,2), 
                 Date = c("2016-01-01", "2016-01-02", "2016-01-03", "2016-01-13", "2016-01-14", "2016-01-15", "2016-01-16"))

## Order the data according to date, per user 
df <- df[order(Date), .SD, by = personID]

## Number the rows within each personID group (rows are already sorted by date)
df[, Day := seq_len(.N), by = personID]
df

   personID       Date Day
1:        1 2016-01-01   1
2:        1 2016-01-02   2
3:        1 2016-01-03   3
4:        2 2016-01-13   1
5:        2 2016-01-14   2
6:        2 2016-01-15   3
7:        2 2016-01-16   4
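
Note that numbering the rows only matches the calendar day when each person has exactly one row per consecutive date. If a person's dates can have gaps, one alternative is to compute the number of days elapsed since that person's earliest date. The following is a minimal base R sketch of that idea, not part of the original answer; the example data (with a deliberate gap for person 2) is made up for illustration.

```r
# Hypothetical example data: person 2 has no row for 2016-01-15
df <- data.frame(
  personID = c(1, 1, 1, 2, 2, 2),
  Date = as.Date(c("2016-01-01", "2016-01-02", "2016-01-03",
                   "2016-01-13", "2016-01-14", "2016-01-16"))
)

# Dates as day counts, so plain arithmetic works per group
d <- as.numeric(df$Date)

# Days since each person's earliest date; the baseline date counts as day 1
df$Day <- as.integer(d - ave(d, df$personID, FUN = min)) + 1L

print(df)
```

With this data, person 2's last row gets day 4 rather than 3, because the gap on 2016-01-15 is counted. This version also does not depend on the rows being sorted first.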
Hardik Gupta
  • Could shorten the last step with `df[, rowid(personID)]` and the second step to `df[order(Date, personID)]`. – s_baldur Jan 29 '19 at 08:39