CSV to JSON with Python for MySQL upload

Admittedly, I have jumped head-first into the world of JSON within MySQL. I also have a thing for Python, which I use a lot in my day job as a Pipeline Survey Data Analyst. CSVs are everywhere: in data at work, on the web, even in my personal life (as you’ll see). Loading CSV data into MySQL is nothing new to anyone working with either technology. As a matter of fact, there are several ways to get your CSV data into MySQL. In the post Pyodbc SQL CRUD – Create: Examples with MySQL, I cover loading CSV data using the Python pyodbc module. But how about type-casting CSV data – which typically arrives as strings – to compatible JSON data types, then pushing that JSON data to a MySQL JSON column? That, my friends, is the focus of this post. Turns out, Python’s csv and json standard libraries make the whole process virtually painless. Keep reading to see the simple script I devised…

Photo by Pankaj Patel on Unsplash

OS, Software, and DB used:

  • OpenSuse Leap 15.1
  • Python 3.7.3


Self-Promotion:

If you enjoy the content written here, by all means, share this blog and your favorite post(s) with others who may benefit from or like it as well. Since coffee is my favorite drink, you can even buy me one if you would like!



I have a CSV file in which I store walking stats data.

screen shot of csv file
CSV file with data…

In order to store JSON data, we need a JSON column. Table ‘j_walking’ is the target destination and has a ‘stats’ column just for this purpose.

MySQL  localhost:33060+ ssl  walking  SQL > DESC j_walking;
+-------+------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+-------+------+------+-----+---------+-------+
| stats | json | YES  |     | NULL    |       |
+-------+------+------+-----+---------+-------+
1 row in set (0.0034 sec)

A few of the existing rows, displayed below, provide an example of the format and structure:

MySQL  localhost:33060+ ssl  walking  SQL > SELECT JSON_PRETTY(stats) FROM j_walking LIMIT 3\G
*************************** 1. row ***************************
JSON_PRETTY(stats): {
  "mph": 3.2,
  "duration": "00:33:18",
  "cal_burned": 181.1,
  "day_walked": "2019-01-02",
  "shoes_worn": "Keen Koven WP",
  "miles_walked": 1.76
}
*************************** 2. row ***************************
JSON_PRETTY(stats): {
  "mph": 3.2,
  "duration": "00:38:07",
  "cal_burned": 207.3,
  "day_walked": "2019-01-07",
  "shoes_worn": "Oboz Sawtooth Low",
  "miles_walked": 2.03
}
*************************** 3. row ***************************
JSON_PRETTY(stats): {
  "mph": 3.2,
  "duration": "00:40:07",
  "cal_burned": 218.2,
  "day_walked": "2019-01-08",
  "shoes_worn": "Oboz Sawtooth Low",
  "miles_walked": 2.13
}
3 rows in set (0.0011 sec)

While prototyping a viable function, the example JSON output shown below was not quite in the required format. Although it is close to the final data structure I wanted, all of the numeric values remained strings rather than their numeric equivalents.

json file source code
JSON file with all data values as strings…
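
For reference, that first pass was essentially a straight dump of the DictReader rows with no type conversion, something along these lines (a minimal sketch, not the exact prototype; the output file name here is made up):

import csv
import json

# Naive pass: every value csv.DictReader yields is a str, so the output
# contains, e.g., "cal_burned": "181.1" instead of "cal_burned": 181.1
with open('dec_2019_hiking_stats.csv', 'r') as f:
    csv_reader = csv.DictReader(f)
    with open('hiking_stats_strings.json', 'w') as json_f:
        for row in csv_reader:
            json.dump(row, json_f)
            json_f.write('\n')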

This working code provides the desired solution. By all means, savvy readers and developers, please comment below with anything you feel I should modify or improve.

import csv
import json
import datetime
import os

'''
Simple script that converts a CSV file to a JSON file
'''


def csv_to_json(csv_in_file, out_json_file):
    try:
        if os.path.isfile(csv_in_file):
            # CSV 1st row header names
            fieldnames = ('day_walked', 'cal_burned', 'miles_walked',
                'duration', 'mph', 'shoe_id')
            with open(csv_in_file, 'r') as f:
                csv_reader = csv.DictReader(f, fieldnames)
                next(csv_reader)  # Advancing past the header row
                with open(out_json_file, 'w') as json_f:
                    for row in csv_reader:
                        # Typecasting all columns to the appropriate data type
                        row['day_walked'] = datetime.datetime.strptime(
                            row['day_walked'], '%Y-%m-%d').date()
                        row['cal_burned'] = float(row['cal_burned'])
                        row['miles_walked'] = float(row['miles_walked'])
                        row['duration'] = datetime.datetime.strptime(
                            row['duration'], '%H:%M:%S').time()
                        row['mph'] = float(row['mph'])
                        row['shoe_id'] = int(row['shoe_id'])
                        # default=str converts the date and time objects,
                        # which json cannot serialize on its own
                        json.dump(row, json_f, default=str)
                        json_f.write('\n')  # One JSON object per line
        else:
            print('That CSV file does not exist.')
    except Exception as e:
        raise e


if __name__ == '__main__':
    # csv_to_json() requires both file paths; adjust them to your own files
    csv_to_json('/home/joshua/env_37/dec_2019_hiking_stats.csv',
                '/home/joshua/env_37/dec_2019_hiking_stats.json')

I’ll point out some key areas in the csv_to_json function that are essential for successful processing.

  • fieldnames = ('day_walked', 'cal_burned', 'miles_walked', 'duration', 'mph', 'shoe_id') – By setting the field names (or column names), I am essentially specifying what the JSON keys will be.
  • next(csv_reader) – Advancing past the header row of field names keeps that row from being written to the output file.
  • Accessing each column by name – or key, since I am using a DictReader – enables me to typecast each field to its appropriate data type (e.g., row['cal_burned'] = float(row['cal_burned'])).
  • The json.dump() method handles the actual writing of the Python objects – and their converted JSON equivalents – to the JSON file; the default=str argument is what makes that possible for the date and time values (see the snippet after this list).
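
Why default=str? The date and time objects produced by strptime() are not JSON serializable on their own, so json.dump() needs a fallback that converts them to strings first. A quick illustration, using values from the first row shown earlier:

import datetime
import json

row = {'day_walked': datetime.date(2019, 1, 2),
       'duration': datetime.time(0, 33, 18),
       'cal_burned': 181.1}

# json.dumps(row) raises:
# TypeError: Object of type date is not JSON serializable

# With default=str, unsupported objects are passed through str() first
print(json.dumps(row, default=str))
# {"day_walked": "2019-01-02", "duration": "00:33:18", "cal_burned": 181.1}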

And now for the moment of truth. Let’s call the csv_to_json() function, producing the desired JSON file:

>>> from csv_to_json import csv_to_json
>>> csv_to_json(r'/home/joshua/env_37/dec_2019_hiking_stats.csv', '/home/joshua/env_37/dec_2019_hiking_stats.json')
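
As a quick sanity check, reading the first line of the new file back in with json.loads() confirms the numeric values are no longer strings (a small sketch using the same output path as above):

import json

# Each line of the output file is one JSON document
with open('/home/joshua/env_37/dec_2019_hiking_stats.json', 'r') as json_f:
    first_row = json.loads(json_f.readline())

print(type(first_row['cal_burned']))   # <class 'float'>
print(type(first_row['day_walked']))   # <class 'str'> (dates stay as strings)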

Presto!!!

json source code
JSON file with typecasted data types…
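
And since the stated goal was landing this data in the ‘stats’ JSON column, here is a minimal sketch of one way to push the file’s rows into MySQL with the pyodbc module I covered in the earlier post. The DSN name is purely an assumption, so substitute your own connection details; because each line of the file is already a valid JSON document, it can be bound directly to a parameterized INSERT:

import pyodbc

# 'mysql_walking' is a hypothetical ODBC DSN; use your own connection string
conn = pyodbc.connect('DSN=mysql_walking')
cursor = conn.cursor()

with open('/home/joshua/env_37/dec_2019_hiking_stats.json', 'r') as json_f:
    for line in json_f:
        # Bind each JSON document to the parameterized INSERT
        cursor.execute('INSERT INTO j_walking (stats) VALUES (?)',
                       (line.strip(),))

conn.commit()
cursor.close()
conn.close()

MySQL validates the string against the JSON type on insert, so a malformed document is rejected rather than silently stored.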

Python’s handiness as a language is one of many things that draws me to it. Manipulating CSV data with Python is one of my favorite uses for it. And, since the csv and json modules are baked into the language, both are readily available without the need for third-party packages.

Feel free to comment below with any questions or suggestions you may have about the post.

Like what you have read? See anything incorrect? Please comment below and thanks for reading!!!

A Call To Action!

Thank you for taking the time to read this post. I truly hope you discovered something interesting and enlightening. Please share your findings here, with someone else you know who would get the same value out of it as well.

Visit the Portfolio-Projects page to see blog post/technical writing I have completed for clients.

Have I mentioned how much I love a cup of coffee?!?!

To receive email notifications (Never Spam) from this blog (“Digital Owl’s Prose”) for the latest blog posts as they are published, please subscribe (of your own volition) by clicking the ‘Click To Subscribe!’ button in the sidebar on the homepage! (Feel free at any time to review the Digital Owl’s Prose Privacy Policy Page for any questions you may have about: email updates, opt-in, opt-out, contact forms, etc…)

Be sure and visit the “Best Of” page for a collection of my best blog posts.


Josh Otwell has a passion to study and grow as a SQL Developer and blogger. Other favorite activities find him with his nose buried in a good book, article, or the Linux command line. Among those, he shares a love of tabletop RPG games, reading fantasy novels, and spending time with his wife and two daughters.

Disclaimer: The examples presented in this post are hypothetical ideas of how to achieve similar types of results. They are not the utmost best solution(s). The majority, if not all, of the examples provided are performed on a personal development/learning workstation environment and should not be considered production quality or ready. Your particular goals and needs may vary. Use those practices that best benefit your needs and goals. Opinions are my own.
