
I have created an empty data frame using another data frame with the code below.

compare<-data.frame(nrow=nrow(test_email),ncol=ncol(test_data))
colnames(compare)<-c("email", "gender")

Now I am trying to assign values to the columns of the compare data frame based on some conditions, using simple assignment statements.

compare[1,1]<-test_email[1,1]
compare[1,2]<-test_data[1,2]

In the above, test_email[1,1] holds an email ID like "abc@gmail.com". But after the assignment, compare[1,1] has the value 81 and not the email ID. I do not understand why a numeric value is assigned instead of the email. Can anyone explain the reason and how to solve it? The structure of test_email is below:

structure(list(email = structure(c(81L, 75L, 57L, 61L, 79L, 76L),
.Label = "ajay.bansal@siemens.com", "amanmeet.bhalla@gmail.com",
"aoneshp@gmail.com", "aparna_anand@msn.com", "ar.ashwani@gmail.com",
"ar.parulbansal@gmail.com", "ar.preet02@gmail.com",
"asdawsd@yahoo.com", "assd@yopmail.com",
"avijeet_yadav@rediffmail.com", "avneng1.negi@gmail.com",
"avnihatnagar@yahoo.com", "bansalanuj007@yahoo.com.au",
"bhanu5877@yahoo.co.in"), class = "factor")), .Names = "email",
row.names = c(NA, 6L), class = "data.frame")

I am not able to find out why R is converting email into some numeric values during assignment.

Tyler Rinker
Kunal Batra
  • Can you provide some sample data using `dput()`? Based on what you've written, it seems like you've entered your email IDs as factors (or R has automatically converted them to factors, which it generally does with strings unless you tell it not to). – A5C1D2H2I1M1N2O1R2T1 Aug 14 '12 at 09:48
  • `test_email<-sqlQuery(channel, "select distinct email from abc limit 100"); test_data<-sqlQuery(channel, "select * from pqr");` Then I just created an empty data frame, and I am trying to assign it values from the two data frames created above, test_email and test_data. But it's converting them into numeric. Please find the sample data. email: 1 shweta.katta@jasperindia.com 2 sanjaykhanna99@hotmail.com 3 neoneo006@gmail.com. test_data: name gender 87 Aanand M 88 Aanandaswarup M – Kunal Batra Aug 14 '12 at 09:58
  • Maybe `stringsAsFactors=FALSE` helps. See `?data.frame`. – sgibb Aug 14 '12 at 10:04
  • @KunalBatra, sample data pasted into your comment like this is not very helpful. Please use something like `dput(head(--dataset--))` to at least show us the first few lines of your data. Using this approach will retain any modifications R has made to your data when reading it in (for example, factor conversions). [This is (well, should be) mandatory reading](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example). – A5C1D2H2I1M1N2O1R2T1 Aug 14 '12 at 10:12
  • I added this option while creating the compare data frame, as below: `compare<-data.frame(nrow=nrow(test_email),ncol=ncol(test_data),stringsAsFactors=FALSE);` but it did not help. – Kunal Batra Aug 14 '12 at 10:14
  • @mrdwab: structure(list(email = structure(c(81L, 75L, 57L, 61L, 79L, 76L ), .Label = c("ajay.bansal@siemens.com", "amanmeet.bhalla@gmail.com", "aoneshp@gmail.com", "aparna_anand@msn.com", "ar.ashwani@gmail.com", "ar.parulbansal@gmail.com", "ar.preet02@gmail.com", "asdawsd@yahoo.com", .Names = "email", row.names = c(NA, 6L), class = "data.frame") hope this helps – Kunal Batra Aug 14 '12 at 10:17
  • Can you edit your question and add this information there? – Roman Luštrik Aug 14 '12 at 10:45
  • Can anyone please let me know what can be done to prevent this conversion from string to numeric? – Kunal Batra Aug 14 '12 at 11:53
  • I think `as.character(test_email[1,1])` should do the trick. You should check the structure you posted for `test_email` by the way, there is something wrong with it. You probably mis-copied it. – plannapus Aug 14 '12 at 12:07
  • To prevent it though, you could probably have written your data.frame directly with the data as follows `compare <- data.frame(email=test_email, gender=test_data[,2])`. – plannapus Aug 14 '12 at 13:30
  • Try `dput` again. The code you provided here and above are missing elements. If it's too big use `dput(head(test_email))` and copy that output directly to here. Then use code tags to wrap it as you did your other code. – Tyler Rinker Aug 14 '12 at 13:39

1 Answer


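Your assignment is still reading the email addresses as factors: the `email` column of `test_email` is a factor, so what gets copied is the factor's underlying integer code (81 for the first row) rather than its label.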

A simple approach to this would be:

compare[1,1] <- as.character(test_email[1,1])
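
To see the effect in isolation, here is a minimal sketch. The three addresses and the pre-allocated `compare` frame are made up for illustration and stand in for the real `test_email`; the point is only that a factor carries integer codes, and that converting to character before assigning copies the label instead.

```r
# Hypothetical stand-in for test_email: one factor column of addresses.
test_email <- data.frame(
  email = factor(c("abc@gmail.com", "xyz@yahoo.com", "def@msn.com"))
)

# Pre-allocate compare with character columns of the right length
# (an assumption about what the "empty" data frame was meant to be).
compare <- data.frame(
  email  = character(nrow(test_email)),
  gender = character(nrow(test_email)),
  stringsAsFactors = FALSE
)

# A factor stores integer codes internally; as.integer() exposes the
# code that would otherwise end up in a numeric column.
code <- as.integer(test_email[1, 1])

# Converting to character first copies the label, not the code.
compare[1, 1] <- as.character(test_email[1, 1])
```

If the whole column is needed as text, `test_email$email <- as.character(test_email$email)` converts it once, so later assignments need no per-cell conversion.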