The following is a subset of my data frame:

id  words  A   B   C   D  E  
1   new    1       1   
2   good   1  
3   star            1
4   never                  
5   final   

I want to add a new column (called FF) and set it to 1 for every row where all of the other variables (columns A to E) are null. The new data frame would look like this:

id  words  A   B   C   D  E  FF
1   new    1       1   
2   good   1  
3   star            1
4   never                     1                
5   final                     1

How can I do this using Python and pandas? Thanks.

Mary

1 Answer

You can define a function that is applied row-wise to the data frame:

def fill_if_nan(row):
    if row[['A', 'B', 'C', 'D', 'E']].isnull().all():
        return 1

    return None

df['FF'] = df.apply(fill_if_nan, axis=1)

Or a more elegant NumPy-based solution:

import numpy as np
df['FF'] = np.where(df[['A', 'B', 'C', 'D', 'E']].isnull().all(axis=1), 1, np.nan)
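
To check the NumPy approach end to end, here is a minimal sketch on a small frame modelled on the question's table. The sample data and the `flag_cols` name are assumptions made for illustration; in particular, the columns holding the 1s are guessed from the flattened table, and empty cells are assumed to be real `NaN` values.

```python
import numpy as np
import pandas as pd

# Hypothetical sample mirroring the question's table; empty cells are NaN.
df = pd.DataFrame({
    "id": [1, 2, 3, 4, 5],
    "words": ["new", "good", "star", "never", "final"],
    "A": [1, 1, np.nan, np.nan, np.nan],
    "B": [np.nan] * 5,
    "C": [1, np.nan, np.nan, np.nan, np.nan],
    "D": [np.nan, np.nan, 1, np.nan, np.nan],
    "E": [np.nan] * 5,
})

# Columns that must all be missing for FF to be set (assumed name).
flag_cols = ["A", "B", "C", "D", "E"]

# Boolean Series: True where every flag column in the row is missing.
all_missing = df[flag_cols].isnull().all(axis=1)

# 1 where all flag columns are missing, NaN otherwise.
df["FF"] = np.where(all_missing, 1, np.nan)
```

With this sample, only the rows for "never" and "final" should end up with `FF` set to 1.
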
Jan Trienes
  • Thank you. The program does not recognize the null values. For some rows, all of the variables are null, but FF is not set to 1. I think I need to replace all spaces with null values. Do you have a solution for that? – Mary Jun 18 '17 at 17:03
  • In case you want to replace a space by `nan` you can use `df.replace(r'\s+', np.nan, regex=True)`. See this [question](https://stackoverflow.com/questions/13445241/replacing-blank-values-white-space-with-nan-in-pandas). – Jan Trienes Jun 18 '17 at 17:10
  • I tried it, but it also replaces values in the words column with null when a cell contains several words separated by spaces. How can I apply it to every column except "words"? – Mary Jun 18 '17 at 17:40
  • @Mary You need to explicitly specify the columns as in `df[['A', 'B', ...]].replace(...)` and assign it back to the df. – Jan Trienes Jun 18 '17 at 17:50
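
Putting the comments above together, one possible sketch is shown below. It assumes the blank cells were read in as empty or whitespace-only strings, and it swaps in an anchored pattern (`^\s*$`) so that only cells consisting entirely of whitespace are replaced. The sample data and the `flag_cols` name are hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical sample: blanks arrive as "" or whitespace-only strings,
# and the words column contains multi-word values with spaces.
df = pd.DataFrame({
    "words": ["new words", "good", "never again"],
    "A": ["1", " ", ""],
    "B": [" ", "  ", " "],
})

# Only these columns are cleaned; "words" is deliberately left out.
flag_cols = ["A", "B"]

# Anchored regex: matches a cell only if it is empty or all whitespace.
df[flag_cols] = df[flag_cols].replace(r"^\s*$", np.nan, regex=True)

# Flag rows where every cleaned column is now missing.
df["FF"] = np.where(df[flag_cols].isnull().all(axis=1), 1, np.nan)
```

Because the replacement is assigned back to `df[flag_cols]` only, the words column keeps values such as "new words" unchanged.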