Method chaining reference¶

This is the referenc component of the "Method chaining" section of the Advanced Pandas tutorial. For the workbook component, click here.

import pandas as pd
pd.set_option('max_rows', 5)
wine = pd.read_csv("../input/wine-reviews/winemag-data-130k-v2.csv", index_col=0)
ramen = pd.read_csv("../input/ramen-ratings/ramen-ratings.csv", index_col=0)

Why method chaining?¶

Method chaining is the last topic we will cover in this first track of the Advanced Pandas tutorial. It is also the only section of this tutorial which is a technique or a pattern, not a function or variable.

Method chaining is a methodology for performing operations on a DataFrame or Series that emphasizes continuity. To demonstrate what I mean, here's a data cleaning and dropping operation (which you should be familiar with from the last section) done two different ways:

stars = ramen['Stars']
na_stars = stars.replace('Unrated', None).dropna()
float_stars = na_stars.astype('float64')
float_stars.head()

Review #
2580    3.75
2579    1.00
2578    2.25
2577    2.75
2576    3.75
Name: Stars, dtype: float64

(ramen['Stars']
     .replace('Unrated', None)
     .dropna()
     .astype('float64')
     .head())

Review #
2580    3.75
2579    1.00
2578    2.25
2577    2.75
2576    3.75
Name: Stars, dtype: float64

In the first statement we assign data to temporary variables, creating new ones as we go further and further along. In the second statement, written in the method chaining style, we instead "chain" our operations, one after the other, all on the same original DataFrame.

Most pandas operations can written in a method chaining style, and in the last couple years or so pandas has added more and more tools for making these sorts of statements easier to write. This paradigm comes to us from the R programming language—specifically, the dpyler module, part of the "Tidyverse".

Method chaining is advantageous for several reasons. One is that it lessens the need for creating and mentally tracking temporary variables. Another is that it emphasizes a correctly structured interative approach to working with data, where each operation is a "next step" after the last. Debugging is easy: just comment out operations that don't work until you get to one that does, and then start stepping forward again. And it looks kind of cool. =)

For a deeper exploration of why method chaining, read the Method Chaining Section of the Modern Pandas Tutorial (written by a pandas core dev).

Assign and pipe¶

Now that we've learned all these ways of manipulating data with pandas, we're ready to take advantage of method chaining to write clear, clean data manipulation code. Now I'll introduce three additional methods useful for coding in this style.

wine.head()

The first of these is assign. The assign method lets you create new columns or modify old ones inside of a DataFrame inline. For example, to fill the region_1 field with the province field wherever the region_1 is null (useful if we're mixing in our own categories), we would do:

wine.assign(
    region_1=wine.apply(lambda srs: srs.region_1 if pd.notnull(srs.region_1) else srs.province, 
                        axis='columns')
)

Which is equivalent to:

wine['region_1'] = wine['region_1'].apply(
    lambda srs: srs.region_1 if pd.notnull(srs.region_1) else srs.province, 
    axis='columns'
)

You can modify as many old columns and create as many new ones as you'd like with assign, but it does have the limitation that the column being modified must not have any reserved characters like periods (.) or spaces () in the name.

The next method to know is pipe. pipe is a little mind-bending: it lets you perform an operation on the entire DataFrame at once, and replaces the current DataFrame which the output of your pipe.

For example, one way to change the give the DataFrame index a new name would be to do:

def name_index(df):
    df.index.name = 'review_id'
    return df

wine.pipe(name_index)

pipe is a power tool: it comes in handy when you're performing very intricate operations on your DataFrame. You won't need it often, but it'll be super useful when you do.

That concludes this tutorial! Bravo!

	country	description	designation	points	price	province	region_1	region_2	taster_name	taster_twitter_handle	title	variety	winery
0	Italy	Aromas include tropical fruit, broom, brimston...	Vulkà Bianco	87	NaN	Sicily & Sardinia	Etna	NaN	Kerin O’Keefe	@kerinokeefe	Nicosia 2013 Vulkà Bianco (Etna)	White Blend	Nicosia
1	Portugal	This is ripe and fruity, a wine that is smooth...	Avidagos	87	15.0	Douro	NaN	NaN	Roger Voss	@vossroger	Quinta dos Avidagos 2011 Avidagos Red (Douro)	Portuguese Red	Quinta dos Avidagos
2	US	Tart and snappy, the flavors of lime flesh and...	NaN	87	14.0	Oregon	Willamette Valley	Willamette Valley	Paul Gregutt	@paulgwine	Rainstorm 2013 Pinot Gris (Willamette Valley)	Pinot Gris	Rainstorm
3	US	Pineapple rind, lemon pith and orange blossom ...	Reserve Late Harvest	87	13.0	Michigan	Lake Michigan Shore	NaN	Alexander Peartree	NaN	St. Julian 2013 Reserve Late Harvest Riesling ...	Riesling	St. Julian
4	US	Much like the regular bottling from 2012, this...	Vintner's Reserve Wild Child Block	87	65.0	Oregon	Willamette Valley	Willamette Valley	Paul Gregutt	@paulgwine	Sweet Cheeks 2012 Vintner's Reserve Wild Child...	Pinot Noir	Sweet Cheeks

	country	description	designation	points	price	province	region_1	region_2	taster_name	taster_twitter_handle	title	variety	winery
review_id
0	Italy	Aromas include tropical fruit, broom, brimston...	Vulkà Bianco	87	NaN	Sicily & Sardinia	Etna	NaN	Kerin O’Keefe	@kerinokeefe	Nicosia 2013 Vulkà Bianco (Etna)	White Blend	Nicosia
1	Portugal	This is ripe and fruity, a wine that is smooth...	Avidagos	87	15.0	Douro	Douro	NaN	Roger Voss	@vossroger	Quinta dos Avidagos 2011 Avidagos Red (Douro)	Portuguese Red	Quinta dos Avidagos
...	...	...	...	...	...	...	...	...	...	...	...	...	...
129969	France	A dry style of Pinot Gris, this is crisp with ...	NaN	90	32.0	Alsace	Alsace	NaN	Roger Voss	@vossroger	Domaine Marcel Deiss 2012 Pinot Gris (Alsace)	Pinot Gris	Domaine Marcel Deiss
129970	France	Big, rich and off-dry, this is powered by inte...	Lieu-dit Harth Cuvée Caroline	90	21.0	Alsace	Alsace	NaN	Roger Voss	@vossroger	Domaine Schoffit 2012 Lieu-dit Harth Cuvée Car...	Gewürztraminer	Domaine Schoffit

	country	description	designation	points	price	province	region_1	region_2	taster_name	taster_twitter_handle	title	variety	winery
review_id
0	Italy	Aromas include tropical fruit, broom, brimston...	Vulkà Bianco	87	NaN	Sicily & Sardinia	Etna	NaN	Kerin O’Keefe	@kerinokeefe	Nicosia 2013 Vulkà Bianco (Etna)	White Blend	Nicosia
1	Portugal	This is ripe and fruity, a wine that is smooth...	Avidagos	87	15.0	Douro	NaN	NaN	Roger Voss	@vossroger	Quinta dos Avidagos 2011 Avidagos Red (Douro)	Portuguese Red	Quinta dos Avidagos
...	...	...	...	...	...	...	...	...	...	...	...	...	...
129969	France	A dry style of Pinot Gris, this is crisp with ...	NaN	90	32.0	Alsace	Alsace	NaN	Roger Voss	@vossroger	Domaine Marcel Deiss 2012 Pinot Gris (Alsace)	Pinot Gris	Domaine Marcel Deiss
129970	France	Big, rich and off-dry, this is powered by inte...	Lieu-dit Harth Cuvée Caroline	90	21.0	Alsace	Alsace	NaN	Roger Voss	@vossroger	Domaine Schoffit 2012 Lieu-dit Harth Cuvée Car...	Gewürztraminer	Domaine Schoffit