I want to assign data to a column of a data frame using a for loop and a function, but I get the common warning:

"SettingWithCopyWarning: A value is trying to be set on a copy of a slice from a DataFrame"

My data frame has three date columns (year, month and day), and I want a new column that combines those three into one.

I am using a for loop to assign the new values to the new column. I tried copy() and deepcopy(), as you can see below, but neither works.

    for i in range(100008):
        df.new_col[i]=convert(df.year[i],df.mounth[i],df.day[i])

What I tried instead of the second line:

    df.new_col[i].copy() = convert(df.year[i],df.mounth[i],df.day[i])
    deepcopy(df.new_col[i]) = convert(df.year[i],df.mounth[i],df.day[i])

I expected my code to assign the values to the column, and it does (I interrupted the kernel and inspected df), but it takes many hours to finish. How can I fix the problem?

beginner
  • Please, when asking a question, remember to show the relevant part. A few lines of your starting dataframe, the expected result, and the relevant parts of your code (for example, the `g_to_j` function). So we can help better: see https://stackoverflow.com/questions/20109391/how-to-make-good-reproducible-pandas-examples – Valentino Aug 25 '19 at 13:34

1 Answer

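Instead of a for loop, use the pandas apply method. It is usually much faster than assigning element by element, and it avoids the chained indexing (df.new_col[i] = ...) that triggers the warning.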

If I understand your code correctly, you want something like this:

    df['new_col'] = df.apply(convert2, axis=1)

where convert2 is defined like:

    def convert2(x):
        return convert(x['year'], x['month'], x['day'])

This is because, when passing a function to apply, the function must take as its argument a row or a column (a row in this case, since axis=1) of the dataframe.
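As a minimal, self-contained sketch of the row-wise call: the `convert` below is a hypothetical stand-in that only formats the date as a string (the real function was not shown), and the sample data is made up.

```python
import pandas as pd

# Hypothetical stand-in for the asker's convert(); the real one was not shown.
def convert(year, month, day):
    return f"{int(year):04d}-{int(month):02d}-{int(day):02d}"

def convert2(row):
    # With axis=1, apply passes each row as a Series indexed by column name.
    return convert(row["year"], row["month"], row["day"])

# Made-up sample data for illustration.
df = pd.DataFrame({"year": [2019, 2020], "month": [8, 1], "day": [25, 3]})

# Assign the whole column in one statement instead of element by element.
df["new_col"] = df.apply(convert2, axis=1)
print(df)
```

Because the column is created in a single assignment on `df` itself, there is no chained indexing such as `df.new_col[i] = ...`.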

Alternatively, instead of defining convert2, you can use a lambda function:

    df['new_col'] = df.apply(lambda x: convert(x['year'], x['month'], x['day']), axis=1)
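If apply is still slow on a large frame, one commonly used alternative is a list comprehension over the zipped columns, which avoids building a Series for every row. This is only a sketch: `convert` is again a hypothetical stand-in and the data is made up, so any speed-up should be measured on the real function.

```python
import pandas as pd

# Hypothetical stand-in for the asker's convert(); the real one was not shown.
def convert(year, month, day):
    return f"{int(year):04d}-{int(month):02d}-{int(day):02d}"

# Made-up sample data for illustration.
df = pd.DataFrame({"year": [2019, 2020], "month": [8, 1], "day": [25, 3]})

# Iterate over the three columns in lockstep and build the new column as a list.
df["new_col"] = [
    convert(y, m, d) for y, m, d in zip(df["year"], df["month"], df["day"])
]
print(df)
```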
Valentino
  • Thank you, I tried the lambda function and it worked. (I changed "g_to_j" to "convert", "comments" to "df" and "jal_col" to "new_col", in case you want to edit your answer for others.) – beginner Aug 25 '19 at 14:01