Admittedly, I have jumped head-first into the world of JSON within MySQL. I also have a thing for Python, which I find myself using a lot in my day job as a Pipeline Survey Data Analyst. CSVs are everywhere: in data at work, on the web, even in my personal life (as you'll see). Loading CSV data into MySQL is nothing new to anyone working with either technology. As a matter of fact, there are several ways you can get your CSV data into MySQL. In the post Pyodbc SQL CRUD – Create: Examples with MySQL, I cover loading CSV data using the Python pyodbc module. But how about type-casting CSV data – which is typically all strings – to compatible JSON data types, then pushing that JSON data to a MySQL JSON column? That, my friends, is the focus of this post. It turns out Python's csv and json standard-library modules make the whole process virtually painless. Keep reading to see the simple script I devised…
OS, Software, and DB used:
- OpenSuse Leap 15.1
- Python 3.7.3
If you enjoy the content written here, by all means, share this blog and your favorite post(s) with others who may benefit from or like it as well. Since coffee is my favorite drink, you can even buy me one if you would like!
Blog posts and information sources that helped me along the way…
I would have had very little success without the help of these absolutely awesome blog posts and information resources. Do check them out!
- Working With JSON Data in Python
- How to Format Dates in Python
- Converting Strings to datetime in Python
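As a quick illustration of the string-to-datetime conversions those posts cover (the date and duration values below are made-up stand-ins, not rows from my actual walking data):

```python
import datetime

# Hypothetical string values standing in for one CSV row's fields.
walked = datetime.datetime.strptime('2019-12-01', '%Y-%m-%d').date()
duration = datetime.datetime.strptime('00:57:02', '%H:%M:%S').time()

print(walked)    # 2019-12-01
print(duration)  # 00:57:02
```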
I have a CSV file in which I store walking stats data.
In order to store JSON data, we need a
JSON column. Table ‘j_walking’ is the target destination and has a ‘stats’ column just for this purpose.
MySQL localhost:33060+ ssl walking SQL > DESC j_walking;
| Field | Type | Null | Key | Default | Extra |
| stats | json | YES | | NULL | |
1 row in set (0.0034 sec)
The rows displayed below provide an example of the present data's format and schema:
MySQL localhost:33060+ ssl walking SQL > SELECT JSON_PRETTY(stats) FROM j_walking LIMIT 3\G
*************************** 1. row ***************************
"shoes_worn": "Keen Koven WP",
*************************** 2. row ***************************
"shoes_worn": "Oboz Sawtooth Low",
*************************** 3. row ***************************
"shoes_worn": "Oboz Sawtooth Low",
3 rows in set (0.0011 sec)
While prototyping a viable function, the example JSON output was not quite in the required format. Although it was similar to the final data structure I wanted, all of the numeric values remained strings rather than their numeric equivalents.
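To make the problem concrete, here is a minimal sketch (with made-up values) of how a DictReader row serializes before and after typecasting:

```python
import json

# A hypothetical row straight from csv.DictReader: every value is a string.
raw_row = {'cal_burned': '243.6', 'mph': '3.2', 'shoe_id': '4'}
print(json.dumps(raw_row))
# {"cal_burned": "243.6", "mph": "3.2", "shoe_id": "4"}  <- quoted strings

# After typecasting, json.dumps emits true JSON numbers.
typed_row = {'cal_burned': float(raw_row['cal_burned']),
             'mph': float(raw_row['mph']),
             'shoe_id': int(raw_row['shoe_id'])}
print(json.dumps(typed_row))
# {"cal_burned": 243.6, "mph": 3.2, "shoe_id": 4}
```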
This working code provides the desired solution. By all means, savvy readers and developers, please comment below with anything you feel I should modify or improve.
A simple script that converts a CSV file to a JSON file:

import csv
import datetime
import json


def csv_to_json(csv_in_file, out_json_file):
    # CSV 1st row header names
    fieldnames = ('day_walked', 'cal_burned', 'miles_walked',
                  'duration', 'mph', 'shoe_id')
    try:
        with open(csv_in_file, 'r') as f:
            csv_reader = csv.DictReader(f, fieldnames)
            next(csv_reader)  # Advancing past the header row
            with open(out_json_file, 'w') as json_f:
                for row in csv_reader:
                    # Typecasting all columns to the appropriate data type.
                    # The strptime format strings assume 'YYYY-MM-DD' dates
                    # and 'HH:MM:SS' durations; adjust them to your data.
                    row['day_walked'] = datetime.datetime.strptime(
                        row['day_walked'], '%Y-%m-%d').date()
                    row['cal_burned'] = float(row['cal_burned'])
                    row['miles_walked'] = float(row['miles_walked'])
                    row['duration'] = datetime.datetime.strptime(
                        row['duration'], '%H:%M:%S').time()
                    row['mph'] = float(row['mph'])
                    row['shoe_id'] = int(row['shoe_id'])
                    json.dump(row, json_f, default=str)
    except FileNotFoundError:
        print('That CSV file does not exist.')
    except Exception as e:
        print(e)


if __name__ == '__main__':
    # Example invocation; substitute your own file paths.
    csv_to_json('walking_stats.csv', 'walking_stats.json')
I’ll point out some key areas in the
csv_to_json function that are essential for successful processing.
- fieldnames = ('day_walked', 'cal_burned', 'miles_walked', 'duration', 'mph', 'shoe_id') – By setting the field names (or column names), I am essentially specifying the keys DictReader uses for each row's dictionary.
- next(csv_reader) – Advancing past the header row of fieldnames prevents that row from being written to the output file.
- Accessing each column by name – or key, since I am using a DictReader – enables me to typecast all of the fields to their appropriate data type (e.g., row['cal_burned'] = float(row['cal_burned'])).
- The json.dump() method handles the actual writing of Python objects – and their converted JSON equivalents – to the JSON file.
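On that last point, default=str is what keeps json.dump() from choking on the date and time objects produced by strptime. A small sketch with made-up values:

```python
import datetime
import json

row = {'day_walked': datetime.date(2019, 12, 1), 'cal_burned': 243.6}
# Without default=str, json.dumps raises TypeError on the date object;
# with it, any non-serializable value falls back to str().
print(json.dumps(row, default=str))
# {"day_walked": "2019-12-01", "cal_burned": 243.6}
```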
And now for the moment of truth. Let’s call the
csv_to_json() function, producing the desired JSON file:
>>> from csv_to_json import csv_to_json
>>> csv_to_json(r'/home/joshua/env_37/dec_2019_hiking_stats.csv', '/home/joshua/env_37/dec_2019_hiking_stats.json')
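Because json.dump() writes each row's object back-to-back, one way I could sanity-check the output file is to decode the objects one at a time with json.JSONDecoder.raw_decode() (this is my own verification idea, not part of the script; the string below stands in for the file's contents):

```python
import json

decoder = json.JSONDecoder()
text = '{"mph": 3.2}{"mph": 2.9}'  # stand-in for the output file's contents
rows, idx = [], 0
while idx < len(text):
    # raw_decode returns the parsed object plus the index where it ended.
    obj, idx = decoder.raw_decode(text, idx)
    rows.append(obj)
print(rows)
# [{'mph': 3.2}, {'mph': 2.9}]
```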
Python's handiness as a language is one of many things that draws me to it. Manipulating CSV data with Python is one of my favorite uses for it. And, since the csv and json modules are baked into the language, both are readily available without the need for 3rd-party packages.
Like what you have read? See anything incorrect? Feel free to comment below with any questions or suggestions you may have about the post, and thanks for reading!!!
A Call To Action!
Thank you for taking the time to read this post. I truly hope you discovered something interesting and enlightening. Please share your findings here, with someone else you know who would get the same value out of it as well.
Visit the Portfolio-Projects page to see blog post/technical writing I have completed for clients.
Have I mentioned how much I love a cup of coffee?!?!
Be sure and visit the “Best Of” page for a collection of my best blog posts.
Josh Otwell has a passion to study and grow as a SQL Developer and blogger. Other favorite activities find him with his nose buried in a good book, article, or the Linux command line. Among those, he shares a love of tabletop RPG games, reading fantasy novels, and spending time with his wife and two daughters.
Disclaimer: The examples presented in this post are hypothetical ideas of how to achieve similar types of results. They are not the utmost best solution(s). The majority, if not all, of the examples provided are performed on a personal development/learning workstation environment and should not be considered production quality or ready. Your particular goals and needs may vary. Use those practices that best benefit your needs and goals. Opinions are my own.