
I am trying to use a for loop to assign a column with one of two values based on the value of another column. I created a list of the items I want to assign to one element, using else to assign the others. However, my code is only assigning the else value to the column. I also tried elif and it did not work. Here is my code:

#create list of aggressive reasons
aggressive = ['AGGRESSIVE - ANIMAL', 'AGGRESSIVE - PEOPLE', 'BITES']

#create new column assigning 'Aggressive' or 'Not Aggressive'
for reason in top_dogs_reason['Reason']:
    if reason in aggressive:
        top_dogs_reason['Aggression'] = 'Aggressive'
    else:
        top_dogs_reason['Aggression'] = 'Not Aggressive'

My new column top_dogs_reason['Aggression'] only has the value of Not Aggressive. Can someone please tell me why?

roganjosh
  • i think the code for for loop should be "for reason in top_dogs_reason" – Sree Mar 10 '19 at 13:20
  • Yes, because you misunderstand vectorization and pandas. You're assigning to _the entire column_ on every iteration. – roganjosh Mar 10 '19 at 13:27
  • Try `top_dogs_reason['Aggression'] = np.where(top_dogs_reason['Reason'].isin(aggressive), "Aggressive", "Not Aggressive")` – roganjosh Mar 10 '19 at 13:30
  • @roganjosh how would I go about assigning to each row instead of the entire column? – Hunter Pack Mar 10 '19 at 13:31
  • Pandas/numpy is a nightmare on a phone. Sorry for the multiple edits. I think that should be close now. – roganjosh Mar 10 '19 at 13:32
  • As I have shown in my comment. That will be row-wise – roganjosh Mar 10 '19 at 13:32
  • You should really call out the pandas/numpy involvement here--it changes things pretty dramatically. :-) – John Szakmeister Mar 10 '19 at 13:35
  • How are you looking for a match between the series and the list? Is it an exact match or a partial one? – anky Mar 10 '19 at 13:58
  • I would like to point out that the terms "column" or "line" are not used in the dictionary data type. This leads to chaos. The dictionary is something quite different, so you can not talk about "columns" and "rows" ! That's probably the problem why you're wrong because you do not understand exactly what it is like a data type dictionary. The dictionary represents a pair by the following style: `{key: one_or_more_items_of_a_particular_data_type}`. https://docs.python.org/3/tutorial/datastructures.html#dictionaries – s3n0 Mar 10 '19 at 14:09
  • Please add to your question a sample of the resulting data - what to look like. To know exactly what you want to achieve. I assume it will also be a dictionary type. – s3n0 Mar 10 '19 at 14:17

1 Answer


You should use `.loc` for assignments like this; it isolates the part of the DataFrame you want to update. The first line selects the rows where the "Reason" column has a value contained in the list `aggressive` and sets their "Aggression" value. The second line does the same for the rows where "Reason" is not in the list.

top_dogs_reason.loc[top_dogs_reason['Reason'].isin(aggressive), 'Aggression'] = 'Aggressive'
top_dogs_reason.loc[~top_dogs_reason['Reason'].isin(aggressive), 'Aggression'] = 'Not Aggressive'
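
As a minimal sketch of the `.loc` approach: the question does not show the real data, so the four-row frame below is a hypothetical stand-in, and `mask` is just a name introduced here so that `isin` is not computed twice.

```python
import pandas as pd

# Hypothetical sample data; the real top_dogs_reason frame is not shown in the question.
top_dogs_reason = pd.DataFrame({
    'Reason': ['BITES', 'STRAY', 'AGGRESSIVE - PEOPLE', 'OWNER SURRENDER'],
})
aggressive = ['AGGRESSIVE - ANIMAL', 'AGGRESSIVE - PEOPLE', 'BITES']

# Boolean Series: True where the row's Reason is one of the aggressive reasons.
mask = top_dogs_reason['Reason'].isin(aggressive)

# .loc[row_selector, column] assigns only to the selected rows.
top_dogs_reason.loc[mask, 'Aggression'] = 'Aggressive'
top_dogs_reason.loc[~mask, 'Aggression'] = 'Not Aggressive'

print(top_dogs_reason)
```

Each assignment touches only the rows picked out by the mask, which is what the original loop was meant to do row by row.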

Or in one line, as roganjosh explained, with `np.where`, which works much like an Excel IF formula. Here we're saying: if the reason is in `aggressive`, give us "Aggressive", otherwise "Not Aggressive", and assign the result to the "Aggression" column:

top_dogs_reason['Aggression'] = np.where(top_dogs_reason['Reason'].isin(aggressive), "Aggressive", "Not Aggressive")
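
To make the contrast with the question's loop concrete, here is a small sketch on hypothetical sample data (the real frame is not shown in the question). It first runs the original loop, then the `np.where` version; `loop_result` is a name introduced only for this illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; the last row is deliberately a non-aggressive reason.
top_dogs_reason = pd.DataFrame({
    'Reason': ['BITES', 'STRAY', 'AGGRESSIVE - PEOPLE', 'OWNER SURRENDER'],
})
aggressive = ['AGGRESSIVE - ANIMAL', 'AGGRESSIVE - PEOPLE', 'BITES']

# The loop from the question: every iteration overwrites the WHOLE column,
# so only the last row's reason decides the final value for all rows.
for reason in top_dogs_reason['Reason']:
    if reason in aggressive:
        top_dogs_reason['Aggression'] = 'Aggressive'
    else:
        top_dogs_reason['Aggression'] = 'Not Aggressive'
loop_result = top_dogs_reason['Aggression'].tolist()
print(loop_result)

# Vectorized version: np.where picks a value per row from the boolean mask.
top_dogs_reason['Aggression'] = np.where(
    top_dogs_reason['Reason'].isin(aggressive), 'Aggressive', 'Not Aggressive'
)
print(top_dogs_reason['Aggression'].tolist())
```

With this sample, the loop leaves every row as "Not Aggressive" because the final row is not in the list, which matches the behaviour described in the question.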

Or anky_91's approach, which uses `.map` to map values. This is an effective way to feed a dictionary to a pandas Series: each value in the Series (here the True/False result of `isin`) is looked up as a key in the dictionary and replaced by the corresponding value:

top_dogs_reason['Aggression'] = top_dogs_reason['Reason'].isin(aggressive).map({True: 'Aggressive', False: 'Not Aggressive'})
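
A short sketch of the `.map` variant, again on hypothetical sample data, with the intermediate boolean Series kept in a variable (`is_aggressive`, a name chosen here) so the two steps are visible:

```python
import pandas as pd

# Hypothetical sample data; the real top_dogs_reason frame is not shown in the question.
top_dogs_reason = pd.DataFrame({
    'Reason': ['AGGRESSIVE - ANIMAL', 'STRAY', 'BITES'],
})
aggressive = ['AGGRESSIVE - ANIMAL', 'AGGRESSIVE - PEOPLE', 'BITES']

# Step 1: boolean Series, one True/False per row.
is_aggressive = top_dogs_reason['Reason'].isin(aggressive)

# Step 2: translate each boolean through the dictionary and store the result.
top_dogs_reason['Aggression'] = is_aggressive.map(
    {True: 'Aggressive', False: 'Not Aggressive'}
)

print(top_dogs_reason)
```

Note that the result has to be assigned back to a column; calling `.map` on its own only returns a new Series.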
Matt W.
  • one more :D `top_dogs_reason['reason'].isin(aggressive).map({True:'Aggressive',False:'Not Aggressive'})` – anky Mar 10 '19 at 14:25