Renaming and combining workbook¶

Introduction¶

This is the worbook part of the "Renaming and combining" section of the Advanced Pandas tutorial. For the reference section, click here.

Renaming is covered in its own section in the "Essential Basic Functionality" (http://archive.is/e20r8#60.7%) section of the extensive official documentation. Combining is covered by the "Merge, join, concatenate" (http://archive.is/f5FsP) section there.

import pandas as pd

from learntools.advanced_pandas.renaming_and_combining import *

pd.set_option('max_rows', 5)
reviews = pd.read_csv("../input/wine-reviews/winemag-data-130k-v2.csv", index_col=0)

Checking Answers¶

Check your answers in each exercise using the check_qN function (replacing N with the number of the exercise). For example here's how you would check an incorrect answer to exercise 1:

check_q1(pd.DataFrame())

False

If you get stuck, use the answer_qN function to see the code with the correct answer.

For the first set of questions, running the check_qN on the correct answer returns True.

For the second set of questions, using this function to check a correct answer will present an informative graph!

Exercises¶

Look at your data by running the cell below:

reviews.head()

Exercise 1: region_1 and region_2 are pretty uninformative names for locale columns in the dataset. Rename these columns to region and locale.

reviews.rename(columns={'region_1': 'region', 'region_2': 'locale'})

print(check_q1(reviews.rename(columns={'region_1': 'region', 'region_2': 'locale'})))
print('-------')
print(answer_q1())

True
-------
reviews.rename(columns={'region_1': 'region', 'region_2': 'locale'})
None

Exercise 2: Set the index name in the dataset to wines.

reviews.rename_axis("wines", axis='rows')

print(check_q2(reviews.rename_axis("wines", axis='rows')))
print('-------')
print(answer_q2())

True
-------
reviews.rename_axis("wines", axis="rows")
None

Exercise 3: The Things on Reddit dataset includes product links from a selection of top-ranked forums ("subreddits") on Reddit.com. Create a DataFrame of products mentioned on either subreddit. Use the following data:

gaming_products = pd.read_csv("../input/things-on-reddit/top-things/top-things/reddits/g/gaming.csv")
gaming_products['subreddit'] = "r/gaming"
movie_products = pd.read_csv("../input/things-on-reddit/top-things/top-things/reddits/m/movies.csv")
movie_products['subreddit'] = "r/movies"

pd.concat([gaming_products, movie_products])

print(check_q3(pd.concat([gaming_products, movie_products])))
print('-------')
print(answer_q3())

True
-------
pd.concat([gaming_products, movie_products])
None

Exercise 4: The Powerlifting Database dataset on Kaggle includes one CSV table for powerlifting meets and a separate one for powerlifting competitors. Both tables include references to a MeetID, a unique key for each meet (competition) included in the database. Using this, generate a dataset combining the two tables into one. Use the following data:

powerlifting_meets = pd.read_csv("../input/powerlifting-database/meets.csv")
powerlifting_competitors = pd.read_csv("../input/powerlifting-database/openpowerlifting.csv")

powerlifting_meets.set_index("MeetID").join(powerlifting_competitors.set_index("MeetID"))

print(check_q4(powerlifting_meets.set_index("MeetID").join(powerlifting_competitors.set_index("MeetID"))))
print('-------')
print(answer_q4())

True
-------
powerlifting_meets.set_index("MeetID").join(powerlifting_competitors.set_index("MeetID"))
None

Keep going¶

Continue to the method chaining workbook.

	country	description	designation	points	price	province	region_1	region_2	taster_name	taster_twitter_handle	title	variety	winery
0	Italy	Aromas include tropical fruit, broom, brimston...	Vulkà Bianco	87	NaN	Sicily & Sardinia	Etna	NaN	Kerin O’Keefe	@kerinokeefe	Nicosia 2013 Vulkà Bianco (Etna)	White Blend	Nicosia
1	Portugal	This is ripe and fruity, a wine that is smooth...	Avidagos	87	15.0	Douro	NaN	NaN	Roger Voss	@vossroger	Quinta dos Avidagos 2011 Avidagos Red (Douro)	Portuguese Red	Quinta dos Avidagos
2	US	Tart and snappy, the flavors of lime flesh and...	NaN	87	14.0	Oregon	Willamette Valley	Willamette Valley	Paul Gregutt	@paulgwine	Rainstorm 2013 Pinot Gris (Willamette Valley)	Pinot Gris	Rainstorm
3	US	Pineapple rind, lemon pith and orange blossom ...	Reserve Late Harvest	87	13.0	Michigan	Lake Michigan Shore	NaN	Alexander Peartree	NaN	St. Julian 2013 Reserve Late Harvest Riesling ...	Riesling	St. Julian
4	US	Much like the regular bottling from 2012, this...	Vintner's Reserve Wild Child Block	87	65.0	Oregon	Willamette Valley	Willamette Valley	Paul Gregutt	@paulgwine	Sweet Cheeks 2012 Vintner's Reserve Wild Child...	Pinot Noir	Sweet Cheeks

	country	description	designation	points	price	province	region	locale	taster_name	taster_twitter_handle	title	variety	winery
0	Italy	Aromas include tropical fruit, broom, brimston...	Vulkà Bianco	87	NaN	Sicily & Sardinia	Etna	NaN	Kerin O’Keefe	@kerinokeefe	Nicosia 2013 Vulkà Bianco (Etna)	White Blend	Nicosia
1	Portugal	This is ripe and fruity, a wine that is smooth...	Avidagos	87	15.0	Douro	NaN	NaN	Roger Voss	@vossroger	Quinta dos Avidagos 2011 Avidagos Red (Douro)	Portuguese Red	Quinta dos Avidagos
...	...	...	...	...	...	...	...	...	...	...	...	...	...
129969	France	A dry style of Pinot Gris, this is crisp with ...	NaN	90	32.0	Alsace	Alsace	NaN	Roger Voss	@vossroger	Domaine Marcel Deiss 2012 Pinot Gris (Alsace)	Pinot Gris	Domaine Marcel Deiss
129970	France	Big, rich and off-dry, this is powered by inte...	Lieu-dit Harth Cuvée Caroline	90	21.0	Alsace	Alsace	NaN	Roger Voss	@vossroger	Domaine Schoffit 2012 Lieu-dit Harth Cuvée Car...	Gewürztraminer	Domaine Schoffit

	country	description	designation	points	price	province	region_1	region_2	taster_name	taster_twitter_handle	title	variety	winery
wines
0	Italy	Aromas include tropical fruit, broom, brimston...	Vulkà Bianco	87	NaN	Sicily & Sardinia	Etna	NaN	Kerin O’Keefe	@kerinokeefe	Nicosia 2013 Vulkà Bianco (Etna)	White Blend	Nicosia
1	Portugal	This is ripe and fruity, a wine that is smooth...	Avidagos	87	15.0	Douro	NaN	NaN	Roger Voss	@vossroger	Quinta dos Avidagos 2011 Avidagos Red (Douro)	Portuguese Red	Quinta dos Avidagos
...	...	...	...	...	...	...	...	...	...	...	...	...	...
129969	France	A dry style of Pinot Gris, this is crisp with ...	NaN	90	32.0	Alsace	Alsace	NaN	Roger Voss	@vossroger	Domaine Marcel Deiss 2012 Pinot Gris (Alsace)	Pinot Gris	Domaine Marcel Deiss
129970	France	Big, rich and off-dry, this is powered by inte...	Lieu-dit Harth Cuvée Caroline	90	21.0	Alsace	Alsace	NaN	Roger Voss	@vossroger	Domaine Schoffit 2012 Lieu-dit Harth Cuvée Car...	Gewürztraminer	Domaine Schoffit

	name	category	amazon_link	total_mentions	subreddit_mentions	subreddit
0	BOOMco Halo Covenant Needler Blaster	Toys & Games	https://www.amazon.com/BOOMco-Halo-Covenant-Ne...	4.0	4	r/gaming
1	Raspberry PI 3 Model B 1.2GHz 64-bit quad-core...	Electronics	https://www.amazon.com/Raspberry-Model-A1-2GHz...	19.0	3	r/gaming
...	...	...	...	...	...	...
301	Apocalypto [Blu-ray]	Movies & TV	https://www.amazon.com/Apocalypto-Blu-ray-Rudy...	1.0	1	r/movies
302	Cinelinx: A Card Game for People Who Love Movi...	Toys & Games	https://www.amazon.com/Cinelinx-Card-Game-Peop...	1.0	1	r/movies

	MeetPath	Federation	Date	MeetCountry	MeetState	MeetTown	MeetName	Name	Sex	Equipment	Age	Division	BodyweightKg	WeightClassKg	Squat4Kg	BestSquatKg	Bench4Kg	BestBenchKg	Deadlift4Kg	BestDeadliftKg	TotalKg	Place	Wilks
MeetID
0	365strong/1601	365Strong	2016-10-29	USA	NC	Charlotte	2016 Junior & Senior National Powerlifting Cha...	Angie Belk Terry	F	Wraps	47.0	Mst 45-49	59.60	60	NaN	47.63	NaN	20.41	NaN	70.31	138.35	1	155.05
0	365strong/1601	365Strong	2016-10-29	USA	NC	Charlotte	2016 Junior & Senior National Powerlifting Cha...	Dawn Bogart	F	Single-ply	42.0	Mst 40-44	58.51	60	NaN	142.88	NaN	95.25	NaN	163.29	401.42	1	456.38
...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...
8481	xpc/2017-finals	XPC	2017-03-03	USA	OH	Columbus	2017 XPC Finals	Jeff Bumanglag	M	Multi-ply	NaN	Elite	126.73	140	NaN	NaN	NaN	NaN	NaN	320.00	320.00	3	181.85
8481	xpc/2017-finals	XPC	2017-03-03	USA	OH	Columbus	2017 XPC Finals	Shane Hammock	M	Multi-ply	NaN	Elite	129.46	140	NaN	NaN	NaN	NaN	NaN	362.50	362.50	2	205.18