Renaming pandas DataFrame columns – with examples.

In the blog post, How to drop columns from a Pandas DataFrame – with examples, I covered how to completely remove certain columns from a pandas DataFrame. But what if you need to keep the columns, yet their names are not to your liking? Perhaps you need to provide a report with meaningful column names in a CSV or Excel file? Again, pandas makes this relatively simple. Let’s look at some examples…

Photo by Ren Ran on Unsplash
OS, Database, and software used:
  • OpenSuse Leap 15.0
  • Python 3.7.2



As always, we need to import pandas. Then, for this specific example, I’ll create a DataFrame object from CSV data with the read_csv() function:

>>> import pandas as pd
>>> df = pd.read_csv('/home/joshua/Practice Data/Fitness_DB_Data/aug_stats.csv', delimiter=',')
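The CSV path above is specific to my workstation. If you want to follow along without that file, one possible sketch is to build an equivalent DataFrame from an in-memory string. The variable name csv_text is my own placeholder, and it holds only the first five rows of the data set, so treat it as a stand-in rather than the real file:

```python
import io

import pandas as pd

# Hypothetical stand-in for aug_stats.csv: only the first five rows of the data.
csv_text = """day_walked,cal_burned,miles_walked,duration,mph,additional_weight,weight_amount,trekking_poles,shoe_id,trail_id
2018-08-01,336.1,3.37,01:01:48,3.3,True,1.5,True,4,7
2018-08-02,355.3,3.70,01:15:14,3.0,False,0.0,False,4,4
2018-08-03,259.9,2.57,00:47:47,3.2,True,1.5,True,4,7
2018-08-05,341.2,3.37,01:02:44,3.2,True,1.5,True,4,7
2018-08-06,357.7,3.64,01:05:46,3.3,True,1.5,True,4,7
"""

# read_csv() accepts a file-like object as well as a path on disk.
df = pd.read_csv(io.StringIO(csv_text), delimiter=',')
print(df.shape)
```

The rest of the examples should behave the same way against this smaller frame, apart from the number of rows returned.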

With the head() method, I’ll retrieve the first 5 rows (the default) from the DataFrame so we can get a sense of its structure and data:

>>> df.head()
   day_walked  cal_burned  miles_walked  duration  mph  additional_weight  weight_amount  trekking_poles  shoe_id  trail_id
0  2018-08-01       336.1          3.37  01:01:48  3.3               True            1.5            True        4         7
1  2018-08-02       355.3          3.70  01:15:14  3.0              False            0.0           False        4         4
2  2018-08-03       259.9          2.57  00:47:47  3.2               True            1.5            True        4         7
3  2018-08-05       341.2          3.37  01:02:44  3.2               True            1.5            True        4         7
4  2018-08-06       357.7          3.64  01:05:46  3.3               True            1.5            True        4         7

Imagine that, for a report, I am only interested in the ‘day_walked’, ‘cal_burned’, ‘miles_walked’, and ‘duration’ columns and want to rename them. I will therefore select just those columns and assign the result to another DataFrame variable:

>>> report_data = df[['day_walked','cal_burned','miles_walked','duration']]
>>> report_data.head()
   day_walked  cal_burned  miles_walked  duration
0  2018-08-01       336.1          3.37  01:01:48
1  2018-08-02       355.3          3.70  01:15:14
2  2018-08-03       259.9          2.57  00:47:47
3  2018-08-05       341.2          3.37  01:02:44
4  2018-08-06       357.7          3.64  01:05:46

Targeting the newly created ‘report_data’ DataFrame, I’ll rename just one column in this first example. DataFrames have a rename() method (see the rename() documentation for more in-depth information) you can use; it returns a new DataFrame rather than changing the original in place. Its columns parameter accepts a Python dictionary that maps each current column name (the key) to the desired column name (the matching value):

>>> report_data.rename(columns={'day_walked':'Day of Walk'})
  Day of Walk  cal_burned  miles_walked  duration
0  2018-08-01       336.1          3.37  01:01:48
1  2018-08-02       355.3          3.70  01:15:14
2  2018-08-03       259.9          2.57  00:47:47
3  2018-08-05       341.2          3.37  01:02:44
4  2018-08-06       357.7          3.64  01:05:46
5  2018-08-17       184.2          1.89  00:39:00
6  2018-08-18       242.9          2.53  00:51:25
7  2018-08-30       204.4          1.95  00:37:35
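Because the call above was not assigned to anything, ‘report_data’ still carries its original column names. Here is a small sketch of that behavior, using a hypothetical two-row stand-in for ‘report_data’ (the names renamed and same are mine, for illustration only):

```python
import pandas as pd

# Hypothetical two-row stand-in for the report_data DataFrame.
report_data = pd.DataFrame({
    'day_walked': ['2018-08-01', '2018-08-02'],
    'cal_burned': [336.1, 355.3],
})

# rename() hands back a new DataFrame ...
renamed = report_data.rename(columns={'day_walked': 'Day of Walk'})

# ... so the original column names are untouched until you reassign.
print(list(report_data.columns))
print(list(renamed.columns))

# By default, keys that match no existing column are silently skipped.
same = report_data.rename(columns={'no_such_column': 'Whatever'})
print(list(same.columns))
```

That last point is worth keeping in mind: a typo in a dictionary key will not raise an error by default, the column simply keeps its old name.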

And if you want to rename the remaining columns, just add a key-value pair to the dictionary for each one. This time I assign the result back to ‘report_data’ so the new names are kept:

>>> report_data = report_data.rename(columns={'day_walked':'Day of Walk', 'cal_burned':'Calories Burned', 'miles_walked':'Distance Walked', 'duration':'Duration of Walk'})
>>> report_data.head()
  Day of Walk  Calories Burned  Distance Walked Duration of Walk
0  2018-08-01            336.1             3.37         01:01:48
1  2018-08-02            355.3             3.70         01:15:14
2  2018-08-03            259.9             2.57         00:47:47
3  2018-08-05            341.2             3.37         01:02:44
4  2018-08-06            357.7             3.64         01:05:46
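Since the goal mentioned at the start was a report with meaningful column names in a CSV file, here is a rough sketch of that last step. It uses a hypothetical two-row stand-in for the renamed ‘report_data’, and it calls to_csv() without a path so the CSV text comes back as a string instead of being written to disk:

```python
import pandas as pd

# Hypothetical two-row stand-in for the renamed report_data DataFrame.
report_data = pd.DataFrame({
    'Day of Walk': ['2018-08-01', '2018-08-02'],
    'Calories Burned': [336.1, 355.3],
})

# With no path argument, to_csv() returns the CSV text as a string.
# index=False leaves the 0, 1, 2... row labels out of the report.
csv_out = report_data.to_csv(index=False)

# The first line of the output is the header built from the column names.
header = csv_out.splitlines()[0]
print(header)
```

Passing a file name as the first argument to to_csv() would write the same content to a file instead.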

DataFrames also have a columns attribute, to which you can assign a list of the desired column names. Here is an example using the lower-case equivalents of those we previously assigned:

>>> report_data.columns = ['day of walk', 'calories burned', 'distance walked', 'duration of walk']
>>> report_data.head()
  day of walk  calories burned  distance walked duration of walk
0  2018-08-01            336.1             3.37         01:01:48
1  2018-08-02            355.3             3.70         01:15:14
2  2018-08-03            259.9             2.57         00:47:47
3  2018-08-05            341.2             3.37         01:02:44
4  2018-08-06            357.7             3.64         01:05:46
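As a side note, when the new names can be computed from the old ones (as with this lower-casing), rename() also accepts a function for its columns parameter and applies it to every column label. The following is only a sketch, with a hypothetical one-row stand-in for ‘report_data’ and a variable name (lowered) of my own choosing:

```python
import pandas as pd

# Hypothetical one-row stand-in for the renamed report_data DataFrame.
report_data = pd.DataFrame({
    'Day of Walk': ['2018-08-01'],
    'Calories Burned': [336.1],
    'Distance Walked': [3.37],
    'Duration of Walk': ['01:01:48'],
})

# str.lower is called once per column label; no dictionary is needed.
lowered = report_data.rename(columns=str.lower)
print(list(lowered.columns))
```

This avoids typing out every name by hand, at the cost of being less explicit about what each column ends up being called.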

However, there is a key difference between using the rename() method and assigning directly to the columns attribute: with direct assignment, you must supply a name for every column in the DataFrame. See the error below when I leave the ‘distance walked’ column out of the list:

>>> report_data.columns = ['day_of_walk', 'calories burned', 'duration of walk']
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/joshua/.pyenv/versions/env_37/lib/python3.7/site-packages/pandas/core/generic.py", line 5192, in __setattr__
    return object.__setattr__(self, name, value)
  File "pandas/_libs/properties.pyx", line 67, in pandas._libs.properties.AxisProperty.__set__
  File "/home/joshua/.pyenv/versions/env_37/lib/python3.7/site-packages/pandas/core/generic.py", line 690, in _set_axis
    self._data.set_axis(axis, labels)
  File "/home/joshua/.pyenv/versions/env_37/lib/python3.7/site-packages/pandas/core/internals/managers.py", line 183, in set_axis
    "values have {new} elements".format(old=old_len, new=new_len)
ValueError: Length mismatch: Expected axis has 4 elements, new values have 3 elements
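If the list of names comes from somewhere you do not fully control, one way to handle this is to catch the ValueError. The sketch below assumes a hypothetical one-row stand-in for ‘report_data’; the names new_names and failed are placeholders of mine:

```python
import pandas as pd

# Hypothetical one-row stand-in for report_data with four columns.
report_data = pd.DataFrame({
    'day of walk': ['2018-08-01'],
    'calories burned': [336.1],
    'distance walked': [3.37],
    'duration of walk': ['01:01:48'],
})

# Only three names for four columns, as in the failing example above.
new_names = ['day_of_walk', 'calories burned', 'duration of walk']

failed = False
try:
    report_data.columns = new_names
except ValueError as exc:
    # pandas rejects the assignment and the old names stay in place.
    failed = True
    print('Could not assign column names:', exc)

print(list(report_data.columns))
```

When only some of the columns need new names, rename() with a dictionary is usually the more forgiving choice, since it leaves unmentioned columns alone.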

There you are. Simple and easy ways to rename the columns of a pandas DataFrame. Those are the ones I know about. Do you know of any others you would like to share with me in the comments below?


Josh Otwell has a passion to study and grow as a SQL Developer and blogger. Other favorite activities find him with his nose buried in a good book, article, or the Linux command line. Among those, he shares a love of tabletop RPG games, reading fantasy novels, and spending time with his wife and two daughters.

Disclaimer: The examples presented in this post are hypothetical ideas of how to achieve similar types of results. They are not the utmost best solution(s). The majority, if not all, of the examples provided are performed on a personal development/learning workstation environment and should not be considered production quality or ready. Your particular goals and needs may vary. Use those practices that best benefit your needs and goals. Opinions are my own.
