I'm testing a function I'm writing that uses a data.table object. Previously, I "duplicated" objects by assigning the object I wanted to copy to a new name, like this:

# my data.table object
unFildata <- vcfLoad_datatable("test.vcf")

# copy/duplicate unFildata to dataIn
dataIn <- unFildata

Then I perform an operation on dataIn:

# do some operation to replace some rows of GT column with NA
dataIn[!(GT %in% c("0/1", "0/0", "1/1")), GT := NA]

I looked at the unique values of GT in dataIn, and the output is:

# unique GT
unique(dataIn$GT)

# output
[1] "0/1" "0/0" NA    "1/1"

Then I checked the unique values of GT in unFildata, which I had not touched except to assign it to dataIn:

# unique GT of unFildata
unique(unFildata$GT)

# output 
[1] "0/1" "0/0" NA    "1/1"

So I got the same output from both objects. I then reloaded the file and looked at the unique values of unFildata again:

unFildata <- vcfLoad_datatable("test.vcf")
unique(unFildata$GT)
[1] "0/1" "0/0" "0/4" "1/3" "1/2" "1/1" "0/2" "./." "0/3" "3/3" "2/2" 

It seems that dataIn and unFildata are linked. Is this data.table behaviour?

Or is this an RStudio thing? I recall that the number of columns shown for a data.table object in RStudio's Environment panel does not update after operations on that data.table.

And how do I create an independent copy of a data.table object?
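
For anyone trying to reproduce this without the VCF file, here is a minimal sketch of the behaviour described above, together with the `copy()` approach. The toy table and the names `DT`, `alias`, and `indep` are made up for illustration and only stand in for the real data; `NA_character_` is used in place of `NA` to keep the column type explicit.

```r
library(data.table)

# Toy stand-in for the VCF table (hypothetical data, not from test.vcf)
DT <- data.table(ID = 1:4, GT = c("0/1", "0/4", "1/1", "./."))

alias <- DT        # plain assignment: a second name for the same table
indep <- copy(DT)  # copy(): an independent table

# Update by reference through the alias, as in the question
alias[!(GT %in% c("0/1", "0/0", "1/1")), GT := NA_character_]

unique(DT$GT)     # DT now contains NA as well, although only alias was modified
unique(indep$GT)  # the copy still holds the original four values
```

In this sketch, only the table created with `copy()` is protected from the `:=` update.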

din
  • Use `DT2 <- copy(DT)`. – Frank Apr 17 '17 at 03:25
  • Is this a data.table property, i.e. linking objects to avoid making multiple copies that could eat up RAM? – din Apr 17 '17 at 03:29
  • Yeah, it's tied up with that. R copies an object as soon as it can recognize that it has been modified, which is arguably too often for large objects like tables. data.table makes many modifications without triggering such a copy. The only reference I know for this is http://stackoverflow.com/questions/15759117/what-exactly-is-copy-on-modify-semantics-in-r-and-where-is-the-canonical-source – Frank Apr 17 '17 at 03:44
  • That seems like a long read to fully understand. Why not post your comments as an answer? =D – din Apr 17 '17 at 03:55
  • Ah right, here's the longer answer: http://stackoverflow.com/q/10225098/ I can't close it but someone else could. – Frank Apr 17 '17 at 04:01
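
The contrast drawn in the comments above (base R copying on modification versus data.table modifying in place) can be sketched roughly as follows. This is only an illustration with made-up objects (`df1`, `df2`, `dt1`, `dt2`, `dt3`); it uses data.table's `address()` helper to check whether two names refer to the same object.

```r
library(data.table)

# Base R: modifying through the second name triggers a copy
df1 <- data.frame(x = 1:3)
df2 <- df1
df2$x[1] <- 99L
df1$x[1]                      # still 1: df1 is unaffected

# data.table: := modifies in place, so both names see the change
dt1 <- data.table(x = 1:3)
dt2 <- dt1
address(dt1) == address(dt2)  # TRUE: two names, one object
dt2[1, x := 99L]
dt1$x[1]                      # 99: changed through dt2

# copy() yields a separate object
dt3 <- copy(dt1)
address(dt1) == address(dt3)  # FALSE
```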

0 Answers