Stats, Money, and NYC


Chaining methods in pandas

Chaining methods in pandas makes your code easy to read.

import pandas as pd

df = (pd.read_csv('test.csv')
      .rename(columns={'column_a': 'ColumnA'})
      .assign(colA=lambda x: x['ColumnA'] * 2))

You don't need to know the pandas API to understand what's going on here. It's also way cleaner than the alternative, which reassigns df after every step.

df = pd.read_csv('test.csv')
df = df.set_index('myIndex')

And so on...

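The best way to chain methods is to take advantage of pipe. pipe takes a function whose first argument is a dataframe and calls it on the dataframe in the chain. As long as that function returns a dataframe, the chain can keep going, so you can put any step you like into a chain. You don't need to wait for pandas to add a method for it.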

def head20(df):
    return df.head(20)

# Then you can throw head20 into your method chain
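Here is a rough sketch of what that might look like. The dataframe is made up in memory so the example runs without test.csv, and add_doubled is a hypothetical helper (not part of pandas) included only to show that pipe forwards extra arguments to your function.

```python
import pandas as pd


def head20(df):
    # Any function that takes a dataframe and returns one can go into pipe.
    return df.head(20)


def add_doubled(df, source, target):
    # Hypothetical helper: adds a column `target` holding 2 * `source`.
    return df.assign(**{target: df[source] * 2})


# Made-up data standing in for test.csv (an assumption for this sketch).
raw = pd.DataFrame({'column_a': range(50)})

result = (raw
          .rename(columns={'column_a': 'ColumnA'})
          .pipe(add_doubled, 'ColumnA', 'colA')  # extra args go to add_doubled
          .pipe(head20))
```

Each pipe call hands the dataframe from the previous step to your function as its first argument, so the chain reads top to bottom just like the built-in methods.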
