60

I'm new to Python/pandas and came across this code snippet:

df = df[~df['InvoiceNo'].str.contains('C')]

I would be much obliged if someone could explain what the tilde sign does in this context.

Henry Ecker
Nirojan Selvanathan
  • 9
    Tilde means negation, i.e. in this case, the `InvoiceNo` values that do NOT contain `C` – Zero Sep 05 '17 at 11:53
  • Dupe of https://stackoverflow.com/q/8305199 – Zero Sep 05 '17 at 11:54
  • Thanks for the reference. – Nirojan Selvanathan Sep 05 '17 at 11:57
  • 3
    @Zero, arguably not a duplicate question: this question refers specifically to a tilde operating on a pandas DataFrame, which behaves differently from the tilde in standard Python (e.g. on Booleans), whereas the linked question asks about the tilde operator in a broad sense. – Himerzi Mar 14 '19 at 01:30

4 Answers

61

It is the bitwise NOT operator. Here it inverts a boolean mask: every False becomes True and every True becomes False.

Sample:

import pandas as pd

df = pd.DataFrame({'InvoiceNo': ['aaC','ff','lC'],
                   'a':[1,2,5]})
print (df)
  InvoiceNo  a
0       aaC  1
1        ff  2
2        lC  5

# check whether each value in the column contains 'C'
print (df['InvoiceNo'].str.contains('C'))
0     True
1    False
2     True
Name: InvoiceNo, dtype: bool

# invert the mask
print (~df['InvoiceNo'].str.contains('C'))
0    False
1     True
2    False
Name: InvoiceNo, dtype: bool

Filter by boolean indexing:

df = df[~df['InvoiceNo'].str.contains('C')]
print (df)
  InvoiceNo  a
1        ff  2

So the output is all rows of the DataFrame whose InvoiceNo value does not contain C.
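
One caveat worth sketching: if the column has missing values, the result of `str.contains` may not be a plain boolean mask, and inverting it can fail or give surprising results depending on the pandas version. The sample data below is hypothetical; the `na=False` argument tells `str.contains` to treat missing entries as "does not contain".

```python
import pandas as pd

# Hypothetical data: the second invoice number is missing.
df = pd.DataFrame({'InvoiceNo': ['aaC', None, 'ff'],
                   'a': [1, 2, 5]})

# na=False makes missing entries count as "does not contain C",
# so the mask is plain boolean and can safely be inverted with ~.
mask = df['InvoiceNo'].str.contains('C', na=False)

# Keep only the rows whose mask value is False.
kept = df[~mask]
print(kept)
```

With this choice the row with the missing invoice number is kept; pass `na=True` instead if such rows should be dropped.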

jezrael
4

It's used to invert a boolean Series; see the pandas docs.

RobinFrcd
1
df = df[~df['InvoiceNo'].str.contains('C')]

The code above removes every row of the pandas DataFrame whose string value in the InvoiceNo column contains the letter "C".

The tilde (~) works as a NOT (!) operator in this scenario.

The same pattern is commonly used to remove rows that have null values in a column, by negating a null-check mask instead of a `str.contains` mask.
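
The null-value use mentioned above could look like the following minimal sketch. The invoice numbers are made-up sample data, and `isnull()` / `notnull()` are the standard pandas Series methods for building such masks.

```python
import pandas as pd

# Hypothetical sample data with one missing invoice number.
df = pd.DataFrame({'InvoiceNo': ['536365', None, 'C536379'],
                   'a': [1, 2, 5]})

# isnull() is True where the value is missing; ~ flips it,
# so only the rows with a real value are kept.
not_null = df[~df['InvoiceNo'].isnull()]

# notnull() builds the already-inverted mask directly.
same = df[df['InvoiceNo'].notnull()]

print(not_null)
```

Both forms select the same rows; which one reads better is a matter of taste.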

Pasindu Perera
-1

The tilde ~ is the bitwise NOT operator: it flips each bit, so a 1 bit becomes 0 and a 0 bit becomes 1. Applied to a boolean mask, this turns True into False and False into True, so you get the rows of df whose InvoiceNo values do not contain the string 'C'.
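
The bit-flipping behaves as a logical NOT only when the mask has boolean dtype. A small sketch of the difference, with illustrative variable names:

```python
import pandas as pd

# On plain Python integers, ~ flips every bit of the
# two's-complement value, so ~x equals -x - 1.
print(~1)  # -2
print(~0)  # -1

# On a boolean Series, ~ acts element-wise as logical NOT.
bool_mask = pd.Series([True, False, True])
print((~bool_mask).tolist())  # [False, True, False]

# On an integer Series of 0/1 flags, ~ is still bitwise,
# so the result is NOT a usable boolean mask.
int_mask = pd.Series([1, 0, 1])
print((~int_mask).tolist())  # [-2, -1, -2]

# Converting to bool first restores the expected inversion.
print((~int_mask.astype(bool)).tolist())  # [False, True, False]
```

Since `str.contains` returns a boolean Series when there are no missing values, the original snippet falls in the second case.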

Haz