The class myDataFrame inherits from pandas.DataFrame. When I try to modify the DataFrame inside a method by assigning to self ("self = ..."), the call completes without error, but the DataFrame object is not actually modified. Why is this the case, and what is the correct way to modify the DataFrame?

import pandas
class myDataFrame(pandas.DataFrame):
    def __init__(self, adict):
        super().__init__(adict)

    def df_reorder_columns(self):
        self = self[["Name", "Number"]] # this assignment doesn't work
        
my_data = {'Number': [1, 2],
           'Name': ['Adam', 'Abel']}

test_myDataFrame = myDataFrame(my_data)
print(test_myDataFrame)
test_myDataFrame.df_reorder_columns()
print(test_myDataFrame)

Output:

   Number  Name
0       1  Adam
1       2  Abel
   Number  Name
0       1  Adam
1       2  Abel
CarlosE
  • I think the code should work with 'columns' attribute of the DataFrame in order to change it. – Raibek Oct 23 '22 at 03:42
  • Does this answer your question? [Why is `self` in Python objects immutable?](https://stackoverflow.com/questions/1015592/why-is-self-in-python-objects-immutable) – BeRT2me Oct 23 '22 at 05:37
  • TL;DR of that: you should `return self[['Name', 'Number']]` and do `df = df.df_reorder_columns()` rather than trying to modify objects in place. In-place methods tend to be frowned upon. – BeRT2me Oct 23 '22 at 05:44

2 Answers

Personally... I think this makes more sense:

from pandas import DataFrame

class myDataFrame(DataFrame):
    def reorder_columns(self):
        self.__init__(self[['Name', 'Number']])


my_data = {'Number': [1, 2],
           'Name': ['Adam', 'Abel']}

df = myDataFrame(my_data)
print(df)

df.reorder_columns()  # re-initializes df in place and returns None, so do not reassign
print(df)

Output:

   Number  Name
0       1  Adam
1       2  Abel

   Name  Number
0  Adam       1
1  Abel       2

But even better would be to return the reordered DataFrame and rebind the name at the call site:

from pandas import DataFrame

class myDataFrame(DataFrame):
    def reorder_columns(self):
        return self[['Name', 'Number']]

my_data = {'Number': [1, 2],
           'Name': ['Adam', 'Abel']}

df = myDataFrame(my_data)
print(df)

df = df.reorder_columns()
print(df)
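
As for why the original `self = ...` has no effect: inside a method, `self` is just a local name bound to the instance, so assigning to it rebinds that name and leaves the caller's object alone. A minimal sketch in plain Python, where the `Box` class and its method names are hypothetical and exist only for illustration:

```python
class Box:
    def __init__(self, items):
        self.items = list(items)

    def rebind(self):
        # Rebinds the local name `self` to a brand-new Box.
        # The object the caller holds is not touched.
        self = Box(reversed(self.items))

    def mutate(self):
        # Changes the state of the object the caller holds.
        self.items.reverse()


box = Box([1, 2])

box.rebind()
after_rebind = list(box.items)   # still [1, 2]

box.mutate()
after_mutate = list(box.items)   # now [2, 1]

print(after_rebind, after_mutate)
```

The same applies to the DataFrame subclass: `self[["Name", "Number"]]` builds a new object, and binding it to `self` simply discards it when the method returns.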
BeRT2me

I think the DataFrame object should be reinitialized with the desired column order:

import pandas as pd

class myDataFrame(pd.DataFrame):  # use the pd alias; the bare name pandas is not defined here

    def __init__(self, adict):
        super().__init__(adict)

    def df_reorder_columns(self):
        self.__init__({"Name": self.Name, "Number": self.Number})

my_data = {'Number': [1, 2],
           'Name': ['Adam', 'Abel']}

test_myDataFrame = myDataFrame(my_data)
print(test_myDataFrame)
test_myDataFrame.df_reorder_columns()
print(test_myDataFrame)
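
If a genuinely in-place reorder is wanted without calling `__init__` again, one possible approach is to move the column with `DataFrame.pop` and `DataFrame.insert`, both of which modify the frame they are called on. This is a sketch rather than a recommendation; the class name `MyDataFrame` and the method `move_name_first` are made up for the example:

```python
import pandas as pd


class MyDataFrame(pd.DataFrame):
    def move_name_first(self):
        # pop removes the column from this frame and returns it as a Series
        name = self.pop("Name")
        # insert puts it back at position 0, again modifying this frame
        self.insert(0, "Name", name)


df = MyDataFrame({"Number": [1, 2], "Name": ["Adam", "Abel"]})
df.move_name_first()
print(list(df.columns))
```

Because the method mutates the existing object, the call site uses `df.move_name_first()` with no reassignment.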
Raibek