I have a question about how shape broadcasting works in pandas. Suppose I have a dataframe:

import pandas as pd
df = pd.DataFrame({'A': [1, 2, 3, 4, 5], 'B': [11, 22, 33, 44, 55]})

I want to replace the first two rows of column 'A' with the corresponding values in column 'B'.

When I try assigning the values in column B explicitly as a list:

df.loc[[0,1], 'A'] = list(df['B'])

I get the obvious shape broadcast error:

ValueError: shape mismatch: value array of shape (5,) could not be broadcast to indexing result of shape (2,)

But when I assign column B directly:

df.loc[[0,1], 'A'] = df['B']

I don't get any errors and Pandas implicitly subsets column B and assigns to column A. The final output is

    A   B
0  11  11
1  22  22
2   3  33
3   4  44
4   5  55

Is this expected behavior? Why does Pandas not raise a shape mismatch error in this case?

cs95
Adarsh Chavakula
  • https://stackoverflow.com/questions/29954263/what-does-the-term-broadcasting-mean-in-pandas-documentation – Alexander Jan 06 '20 at 19:11

1 Answer

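Pandas is clever here: when you assign a Series through .loc, it aligns the Series with the selected rows by index label and only assigns the values at those labels, so the shapes don't have to agree. This works every time you assign a Series to a column, as long as the labels you select are present in the Series' index. A plain list has no index, so it is assigned by position and its length has to match the selection exactly, which is why the list version raises.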
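To make the alignment step concrete, here is a minimal sketch using the dataframe from the question. It assumes the Series assignment is roughly equivalent to selecting B's values by label with `reindex` first and then assigning the resulting array. The names `rows`, `aligned`, `manual` and `auto` are only illustrative, and this is not a description of the pandas internals.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3, 4, 5], 'B': [11, 22, 33, 44, 55]})

rows = [0, 1]  # labels of the rows being overwritten

# Manual version: pick B's values for those labels, then assign a plain array
# whose length already matches the selection.
aligned = df['B'].reindex(rows)
manual = df.copy()
manual.loc[rows, 'A'] = aligned.to_numpy()

# Automatic version: hand the whole Series to .loc and let pandas align it.
auto = df.copy()
auto.loc[rows, 'A'] = df['B']

# Both routes should give the same frame.
same = manual.equals(auto)
print(auto)
```

In both cases only the rows labelled 0 and 1 are touched, so the extra three values in column B are simply never used.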

Here's another example of how it works:

df = pd.DataFrame({'a': [1, 2, 3], 'b': [4, 5, 6],}, index=['a', 'b', 'c'])
df
   a  b
a  1  4
b  2  5
c  3  6

df.loc[['a', 'b'], 'a'] = pd.Series([4, 5, 6], index=['b', 'c', 'a'])
df

   a  b
a  6  4
b  4  5
c  3  6
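The contrast with a list can be sketched the same way. This is a small illustration built on the frame above, with `by_label` and `by_position` as made-up names: the Series is matched by label regardless of its order, while a list of the right length is written in positional order.

```python
import pandas as pd

df = pd.DataFrame({'a': [1, 2, 3], 'b': [4, 5, 6]}, index=['a', 'b', 'c'])

# Series: values are looked up by label, so row 'a' gets 6 and row 'b' gets 4.
by_label = df.copy()
by_label.loc[['a', 'b'], 'a'] = pd.Series([4, 5, 6], index=['b', 'c', 'a'])

# List: no labels to align on, so the two values are written in order.
by_position = df.copy()
by_position.loc[['a', 'b'], 'a'] = [4, 5]

print(by_label)
print(by_position)
```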
cs95
  • Well it's great that Pandas is clever and all but it's doing things I didn't ask for. Isn't explicit better than implicit? – Adarsh Chavakula Jan 06 '20 at 19:07
  • @AdarshChavakula That's the zen of python, not pandas. Index-based alignment and broadcasting is a basic pandas feature, and I think a great one. – cs95 Jan 06 '20 at 19:11
  • @AdarshChavakula I should mention it's a little confusing here since you don't explicitly set or use indices, but take a gander at the edited example provided, you'll see how this feature is a lot more useful now. – cs95 Jan 06 '20 at 19:12
  • So if I understand it correctly, as long as Pandas finds common indices on both sides, will it broadcast them correctly regardless of their shapes? – Adarsh Chavakula Jan 06 '20 at 19:16
  • @AdarshChavakula yes. – cs95 Jan 06 '20 at 19:18
  • Makes sense now. Thanks a lot! – Adarsh Chavakula Jan 06 '20 at 19:19