30 mins with MySQL JSON functions
Note: You may also be interested by 30 mins with JSON in MySQL
Note 2: Handling JSON documents could be also done with MySQL Document Store.
JSON (JavaScript Object Notation) is a popular way for moving data between various systems, including databases. Starting with 5.7 MySQL supports a native JSON data type (internal binary format for efficiency) and a set of built-in JSON functions that allows you to perform operations on JSON documents.
This blog post is not a complete overview of the entire MySQL JSON functions set (RTFM instead) but rather an arbitrary presentation of some of them.
Note: MySQL 8 enables an alternative way of working with MySQL as a document store. (Not cover in this blog post).
I’m using MySQL 8.0.11, downloadable here.
JSON Utility Functions
JSON_PRETTY
Improve readability with JSON_PRETTY
By default, display a JSON document in MySQL looks like something like this :
SELECT doc FROM restaurants LIMIT 1\G *************************** 1. row *************************** doc: {"_id": "564b3259666906a86ea90a99", "name": "Dj Reynolds Pub And Restaurant", "grades": [{"date": {"$date": 1409961600000}, "grade": "A", "score": 2}, {"date": {"$date": 1374451200000}, "grade": "A", "score": 11}, {"date": {"$date": 1343692800000}, "grade": "A", "score": 12}, {"date": {"$date": 1325116800000}, "grade": "A", "score": 12}], "address": {"coord": [-73.98513559999999, 40.7676919], "street": "West 57 Street", "zipcode": "10019", "building": "351"}, "borough": "Manhattan", "cuisine": "Irish", "restaurant_id": "30191841"}
You can have a prettier display with JSON_PRETTY :
SELECT JSON_PRETTY(doc) FROM restaurants LIMIT 1\G *************************** 1. row *************************** JSON_PRETTY(doc): { "_id": "564b3259666906a86ea90a99", "name": "Dj Reynolds Pub And Restaurant", "grades": [ { "date": { "$date": 1409961600000 }, "grade": "A", "score": 2 }, { "date": { "$date": 1374451200000 }, "grade": "A", "score": 11 }, { "date": { "$date": 1343692800000 }, "grade": "A", "score": 12 }, { "date": { "$date": 1325116800000 }, "grade": "A", "score": 12 } ], "address": { "coord": [ -73.98513559999999, 40.7676919 ], "street": "West 57 Street", "zipcode": "10019", "building": "351" }, "borough": "Manhattan", "cuisine": "Irish", "restaurant_id": "30191841" }
JSON_STORAGE_SIZE
Return the number of bytes used to store the binary representation of a JSON document with JSON_STORAGE_SIZE.
SELECT max(JSON_STORAGE_SIZE(doc)) FROM restaurants; +-----------------------------+ | max(JSON_STORAGE_SIZE(doc)) | +-----------------------------+ | 916 | +-----------------------------+ SELECT avg(JSON_STORAGE_SIZE(doc)) FROM restaurants; +-----------------------------+ | avg(JSON_STORAGE_SIZE(doc)) | +-----------------------------+ | 537.2814 | +-----------------------------+ SELECT min(JSON_STORAGE_SIZE(doc)) FROM restaurants; +-----------------------------+ | min(JSON_STORAGE_SIZE(doc)) | +-----------------------------+ | 255 | +-----------------------------+
In this collection, the heavier document is 916 bytes, the lighter is 255 and the average size is 537.2814
Note: This is the space used to store the JSON document as it was inserted into the column, prior to any partial updates that may have been performed on it afterwards.
Functions That Search JSON Values
JSON_EXTRACT (->) / JSON_UNQUOTE / ->> operator
JSON_EXTRACT (or ->) returns data from a JSON document.
JSON_UNQUOTE unquotes JSON value and returns the result as a utf8mb4 string.
->> the JSON unquoting extraction operator is a shortcut for JSON_UNQUOTE(JSON_EXTRACT())
SELECT JSON_EXTRACT(doc, "$.cuisine") FROM restaurants LIMIT 1\G *************************** 1. row *************************** JSON_EXTRACT(doc, "$.cuisine"): "Irish" SELECT doc->"$.cuisine" FROM restaurants LIMIT 1\G *************************** 1. row *************************** doc->"$.cuisine": "Irish"
Both queries above are similar.
If you want the same result but without quotes use ->> or JSON_UNQUOTE(JSON_EXTRACT()) :
SELECT JSON_UNQUOTE(JSON_EXTRACT(doc, "$.cuisine")) FROM restaurants LIMIT 1\G *************************** 1. row *************************** JSON_UNQUOTE(JSON_EXTRACT(doc, "$.cuisine")): Irish SELECT doc->>"$.cuisine" FROM restaurants LIMIT 1\G doc->>"$.cuisine": Irish
Both queries above are similar.
JSON_CONTAINS
Search whether the value of specified key matches a specified value with JSON_CONTAINS.
SELECT count(*) FROM restaurants WHERE JSON_CONTAINS(doc, '"Creole"', '$.cuisine'); +----------+ | count(*) | +----------+ | 24 | +----------+ SELECT doc->>"$.name" FROM restaurants WHERE JSON_CONTAINS(doc, '"Creole"', '$.cuisine'); +-----------------------------------------------+ | doc->>"$.name" | +-----------------------------------------------+ | Belvedere Restaurant | | Chez Macoule Restaurant | | Paradise Venus Restaurant | | Heavenly Fritaille Restaurant | | Yolie'S Bar & Restaurant | | Yo-Yo Fritaille | | Kal Bakery & Restaurant | | Bon Appetit Restaurant | | Katou Fin Restaurant | | Alhpa Restaurant | | Lakay Buffet Restaurant | | La Tranquilite Restaurant | | La Caye Restaurant | | Nous Les Amis Restaurant & Bakery | | Yoyo Fritaille | | Fresh Crown Restaurant | | Tonel Restaurant & Lounge | | Grace Devine Pastry And Restaurant Restaurant | | Viva Bubble Tea | | Cafe Creole Restaurant N Bakery | | Delly'S Place Restaurant & Fritaille | | Creole Plate | | Chez Nous Restaurant & Fritaille | | Combite Creole | +-----------------------------------------------+
JSON_CONTAINS_PATH
Indicate whether a JSON document contains data at a given path or paths with JSON_CONTAINS_PATH.
Let’s insert a dummy document in the collection restaurants
INSERT INTO restaurants (doc) VALUES ('{"_id": "1234", "name": "Daz Restaurant", "cuisine": "West Indian", "restaurant_id": "4321"}');
How many documents without grades? :
SELECT count(*), JSON_CONTAINS_PATH(doc, 'one', '$.grades') cp FROM restaurants GROUP BY cp; +----------+------+ | count(*) | cp | +----------+------+ | 1 | 0 | | 25359 | 1 | +----------+------+
Ok, only 1. We can easily check the structure of this document :
SELECT JSON_PRETTY(doc) FROM restaurants WHERE JSON_CONTAINS_PATH(doc, 'one', '$.grades') = 0\G *************************** 1. row *************************** JSON_PRETTY(doc): { "_id": "1234", "name": "Daz Restaurant", "cuisine": "West Indian", "restaurant_id": "4321" }
A bridge between 2 models
To paraphrase David Stokes (MySQL Community Manager) in his book MySQL and JSON – A practical Programming Guide.
“
The advantages of traditional relational data and schemaless data are both large. But in some cases, data in a schema needs to be schemaless, or schemaless-data needs to be in a schema.
“
Making such metamorphosis is very easy to do with MySQL!
Relational to JSON
JSON_OBJECT
Evaluates a list of key-value pairs and returns a JSON object containing those pairs with JSON_OBJECT.
A traditional SQL query with a relational result set. The JSON document output non-JSON data :
SELECT doc->>"$.name" FROM restaurants WHERE JSON_CONTAINS(doc, '"Creole"', '$.cuisine') LIMIT 2; +-------------------------+ | doc->>"$.name" | +-------------------------+ | Belvedere Restaurant | | Chez Macoule Restaurant | +-------------------------+
This result set could be convert in a JSON format, actually a JSON object :
SELECT JSON_OBJECT("Name", doc->>"$.name") FROM restaurants WHERE JSON_CONTAINS(doc, '"Creole"', '$.cuisine') LIMIT 2; +-------------------------------------+ | JSON_OBJECT("Name", doc->>"$.name") | +-------------------------------------+ | {"Name": "Belvedere Restaurant"} | | {"Name": "Chez Macoule Restaurant"} | +-------------------------------------+
Other example :
SELECT Name, Population FROM City WHERE CountryCode='fra' ORDER BY Population DESC LIMIT 5; +-----------+------------+ | Name | Population | +-----------+------------+ | Paris | 2125246 | | Marseille | 798430 | | Lyon | 445452 | | Toulouse | 390350 | | Nice | 342738 | +-----------+------------+ SELECT JSON_OBJECT("CityName",Name, "CityPop", Population) FROM City WHERE CountryCode='fra' ORDER BY Population DESC LIMIT 5; +-----------------------------------------------------+ | JSON_OBJECT("CityName",Name, "CityPop", Population) | +-----------------------------------------------------+ | {"CityPop": 2125246, "CityName": "Paris"} | | {"CityPop": 798430, "CityName": "Marseille"} | | {"CityPop": 445452, "CityName": "Lyon"} | | {"CityPop": 390350, "CityName": "Toulouse"} | | {"CityPop": 342738, "CityName": "Nice"} | +-----------------------------------------------------+
JSON_OBJECTAGG
Takes two column names or expressions and returns a JSON object containing key-value pairs with JSON_OBJECTAGG.
Grouping rows are very often useful. This why we implemented some JSON aggregate functions like this one.
SELECT JSON_OBJECTAGG(Name, CountryCode) FROM City GROUP BY id ORDER BY RAND() LIMIT 5; +-----------------------------------+ | JSON_OBJECTAGG(Name, CountryCode) | +-----------------------------------+ | {"Reno": "USA"} | | {"Hanam": "KOR"} | | {"Laizhou": "CHN"} | | {"Yogyakarta": "IDN"} | | {"Tantoyuca": "MEX"} | +-----------------------------------+
- Note:
- It’s usually not a good idea to use ORDER BY RAND(). It works like a charm for small dataset, but it’s a true performance killer with huge datasets.
- The best practice is to do it in the application or pre-compute random value in the database.
JSON_ARRAY
Evaluate a list of values and returns a JSON array containing those values with JSON_ARRAY.
Next example is an hierarchical query using a recursive Common Table Expression aka recursive CTE (or WITH Syntax)
WITH RECURSIVE emp_ext (id, name, path) AS ( SELECT id, name, CAST(id AS CHAR(200)) FROM employees WHERE manager_id IS NULL UNION ALL SELECT s.id, s.name, CONCAT(m.path, ",", s.id) FROM emp_ext m JOIN employees s ON m.id=s.manager_id ) SELECT id,name, path FROM emp_ext ORDER BY path; +------+---------+-----------------+ | id | name | path | +------+---------+-----------------+ | 333 | Yasmina | 333 | | 198 | John | 333,198 | | 29 | Pedro | 333,198,29 | | 4610 | Sarah | 333,198,29,4610 | | 72 | Pierre | 333,198,29,72 | | 692 | Tarek | 333,692 | | 123 | Adil | 333,692,123 | +------+---------+-----------------+
JSON format output with JSON_OBJECT & JSON_ARRAY :
WITH RECURSIVE emp_ext (id, name, path) AS ( SELECT id, name, CAST(id AS CHAR(200)) FROM employees WHERE manager_id IS NULL UNION ALL SELECT s.id, s.name, CONCAT(m.path, ",", s.id) FROM emp_ext m JOIN employees s ON m.id=s.manager_id ) SELECT JSON_OBJECT("ID",id, "Name",name, "Path", JSON_ARRAY(path)) FROM emp_ext ORDER BY path; +-------------------------------------------------------------+ | JSON_OBJECT("ID",id, "Name",name, "Path", JSON_ARRAY(path)) | +-------------------------------------------------------------+ | {"ID": 333, "Name": "Yasmina", "Path": ["333"]} | | {"ID": 198, "Name": "John", "Path": ["333,198"]} | | {"ID": 29, "Name": "Pedro", "Path": ["333,198,29"]} | | {"ID": 4610, "Name": "Sarah", "Path": ["333,198,29,4610"]} | | {"ID": 72, "Name": "Pierre", "Path": ["333,198,29,72"]} | | {"ID": 692, "Name": "Tarek", "Path": ["333,692"]} | | {"ID": 123, "Name": "Adil", "Path": ["333,692,123"]} | +-------------------------------------------------------------+
JSON_ARRAYAGG
Aggregate a result set as a single JSON array whose elements consist of the rows with JSON_ARRAYAGG.
With this other JSON aggregate functions we will see different combinations of JSON format output :
SELECT CountryCode, JSON_ARRAYAGG(City.Name) FROM City JOIN Country ON (City.CountryCode=Country.Code) WHERE Continent='Europe' GROUP BY 1 LIMIT 5; +-------------+--------------------------------------------------------------------------------------------------------------+ | CountryCode | JSON_ARRAYAGG(City.Name) | +-------------+--------------------------------------------------------------------------------------------------------------+ | ALB | ["Tirana"] | | AND | ["Andorra la Vella"] | | AUT | ["Graz", "Linz", "Salzburg", "Innsbruck", "Wien", "Klagenfurt"] | | BEL | ["Antwerpen", "Brugge", "Gent", "Schaerbeek", "Charleroi", "Namur", "Liège", "Mons", "Bruxelles [Brussel]"] | | BGR | ["Šumen", "Sofija", "Stara Zagora", "Plovdiv", "Pleven", "Varna", "Sliven", "Burgas", "Dobric", "Ruse"] | +-------------+--------------------------------------------------------------------------------------------------------------+
SELECT JSON_OBJECT("CountryCode",CountryCode), JSON_OBJECT("CityName",JSON_ARRAYAGG(City.Name)) FROM City JOIN Country ON (City.CountryCode=Country.Code) WHERE Continent='Europe' GROUP BY 1 LIMIT 5; +----------------------------------------+----------------------------------------------------------------------------------------------------------------------------+ | JSON_OBJECT("CountryCode",CountryCode) | JSON_OBJECT("CityName",JSON_ARRAYAGG(City.Name)) | +----------------------------------------+----------------------------------------------------------------------------------------------------------------------------+ | {"CountryCode": "ALB"} | {"CityName": ["Tirana"]} | | {"CountryCode": "AND"} | {"CityName": ["Andorra la Vella"]} | | {"CountryCode": "AUT"} | {"CityName": ["Wien", "Graz", "Linz", "Salzburg", "Innsbruck", "Klagenfurt"]} | | {"CountryCode": "BEL"} | {"CityName": ["Schaerbeek", "Mons", "Namur", "Brugge", "Liège", "Antwerpen", "Charleroi", "Gent", "Bruxelles [Brussel]"]} | | {"CountryCode": "BGR"} | {"CityName": ["Burgas", "Šumen", "Dobric", "Sliven", "Pleven", "Stara Zagora", "Ruse", "Varna", "Plovdiv", "Sofija"]} | +----------------------------------------+----------------------------------------------------------------------------------------------------------------------------+
SELECT JSON_OBJECT("Code",CountryCode, "CityName", JSON_ARRAYAGG(City.Name)) FROM City JOIN Country ON (City.CountryCode=Country.Code) WHERE Continent='Europe' GROUP BY CountryCode LIMIT 5; +-------------------------------------------------------------------------------------------------------------------------------------------+ | JSON_OBJECT("Code",CountryCode, "CityName", JSON_ARRAYAGG(City.Name)) | +-------------------------------------------------------------------------------------------------------------------------------------------+ | {"Code": "ALB", "CityName": ["Tirana"]} | | {"Code": "AND", "CityName": ["Andorra la Vella"]} | | {"Code": "AUT", "CityName": ["Graz", "Linz", "Salzburg", "Innsbruck", "Wien", "Klagenfurt"]} | | {"Code": "BEL", "CityName": ["Bruxelles [Brussel]", "Antwerpen", "Brugge", "Gent", "Schaerbeek", "Charleroi", "Namur", "Liège", "Mons"]} | | {"Code": "BGR", "CityName": ["Ruse", "Šumen", "Sofija", "Stara Zagora", "Plovdiv", "Pleven", "Varna", "Sliven", "Burgas", "Dobric"]} | +-------------------------------------------------------------------------------------------------------------------------------------------+
JSON to Relational
Ok let’s transform JSON data into relational data!
JSON_TABLE
Extract data from a JSON document and returns it as a relational table having the specified columns with JSON_TABLE.
Actually I highly recommend you to spend time in the documentation for this powerful function, that allows you to map JSON data into a temporary relational table and then query from this table.
Enough blabla, let’s see some examples :
SELECT GNP FROM countryinfo, JSON_TABLE(doc, "$" COLUMNS (GNP int PATH "$.GNP")) AS jst WHERE _id='FRA'; +---------+ | GNP | +---------+ | 1424285 | +---------+
SELECT GNP, Name, LifeExpectancy FROM countryinfo, JSON_TABLE(doc, "$" COLUMNS (GNP int PATH "$.GNP", Name char(255) PATH "$.Name", LifeExpectancy int PATH "$.demographics.LifeExpectancy")) AS jst WHERE _id IN ('FRA', 'USA'); +---------+---------------+----------------+ | GNP | Name | LifeExpectancy | +---------+---------------+----------------+ | 1424285 | France | 79 | | 8510700 | United States | 77 | +---------+---------------+----------------+
SELECT name AS "Creole Cuisine" FROM restaurant.restaurants, JSON_TABLE(doc, "$" COLUMNS (name char(100) PATH "$.name", cuisine char(100) PATH "$.cuisine")) AS jst WHERE cuisine='Creole'; +-----------------------------------------------+ | Creole Cuisine | +-----------------------------------------------+ | Belvedere Restaurant | | Chez Macoule Restaurant | | Paradise Venus Restaurant | | Heavenly Fritaille Restaurant | | Yolie'S Bar & Restaurant | | Yo-Yo Fritaille | | Kal Bakery & Restaurant | | Bon Appetit Restaurant | | Katou Fin Restaurant | | Alhpa Restaurant | | Lakay Buffet Restaurant | | La Tranquilite Restaurant | | La Caye Restaurant | | Nous Les Amis Restaurant & Bakery | | Yoyo Fritaille | | Fresh Crown Restaurant | | Tonel Restaurant & Lounge | | Grace Devine Pastry And Restaurant Restaurant | | Viva Bubble Tea | | Cafe Creole Restaurant N Bakery | | Delly'S Place Restaurant & Fritaille | | Creole Plate | | Chez Nous Restaurant & Fritaille | | Combite Creole | +-----------------------------------------------+
JSON_TABLE – Nested Data
Walk down the JSON document path and retrieve nested data.
For example, extract all grades for Hawaiian cuisine restaurants :
SELECT name, cuisine, gradeID, grade FROM restaurants,JSON_TABLE(doc, "$" COLUMNS (name char(100) PATH "$.name", cuisine char(100) PATH "$.cuisine", NESTED PATH "$.grades[*]" COLUMNS (gradeID FOR ORDINALITY, grade char(20) PATH "$.grade"))) AS jst WHERE cuisine='Hawaiian'; +------------------+----------+---------+-------+ | name | cuisine | gradeID | grade | +------------------+----------+---------+-------+ | Makana | Hawaiian | 1 | C | | Makana | Hawaiian | 2 | C | | Makana | Hawaiian | 3 | A | | Makana | Hawaiian | 4 | C | | Makana | Hawaiian | 5 | A | | General Assembly | Hawaiian | 1 | A | | General Assembly | Hawaiian | 2 | A | | General Assembly | Hawaiian | 3 | A | | General Assembly | Hawaiian | 4 | A | | Onomea | Hawaiian | 1 | A | | Onomea | Hawaiian | 2 | A | +------------------+----------+---------+-------+
JSON_TABLE – Missing Data
Specify what to do when data is missing.
Default behavior :
SELECT name, cuisine, borough FROM restaurant.restaurants,JSON_TABLE(doc, "$" COLUMNS (name char(100) PATH "$.name", cuisine char(100) PATH "$.cuisine", borough char(100) PATH "$.borough")) AS jst LIMIT 2; +--------------------------------+-------------+-----------+ | name | cuisine | borough | +--------------------------------+-------------+-----------+ | Daz Restaurant | West Indian | NULL | | Dj Reynolds Pub And Restaurant | Irish | Manhattan | +--------------------------------+-------------+-----------+
Enforce the default behavior :
SELECT name, cuisine, borough FROM restaurant.restaurants,JSON_TABLE(doc, "$" COLUMNS (name char(100) PATH "$.name", cuisine char(100) PATH "$.cuisine", borough char(100) PATH "$.borough" NULL ON EMPTY)) AS jst LIMIT 2; +--------------------------------+-------------+-----------+ | name | cuisine | borough | +--------------------------------+-------------+-----------+ | Daz Restaurant | West Indian | NULL | | Dj Reynolds Pub And Restaurant | Irish | Manhattan | +--------------------------------+-------------+-----------+
Raise an error :
SELECT name, cuisine, borough FROM restaurant.restaurants,JSON_TABLE(doc, "$" COLUMNS (name char(100) PATH "$.name", cuisine char(100) PATH "$.cuisine", borough char(100) PATH "$.borough" ERROR ON EMPTY)) AS jst LIMIT 2; ERROR 3665 (22035): Missing value for JSON_TABLE column 'borough'
Specify a default value :
SELECT name, cuisine, borough FROM restaurant.restaurants,JSON_TABLE(doc, "$" COLUMNS (name char(100) PATH "$.name", cuisine char(100) PATH "$.cuisine", borough char(100) PATH "$.borough" DEFAULT '"<UNKNOW>"' ON EMPTY)) AS jst LIMIT 2; +--------------------------------+-------------+-----------+ | name | cuisine | borough | +--------------------------------+-------------+-----------+ | Daz Restaurant | West Indian | <UNKNOW> | | Dj Reynolds Pub And Restaurant | Irish | Manhattan | +--------------------------------+-------------+-----------+
Wrapup
I’ll stop here this introduction to this rich MySQL JSON functions world. I presented a subset of these functions but it definitely worth to spend some time to discover the entire set e.g. how to create, modify, indexing, … JSON documents.
Furthermore, if your workload does not fit in the relational model, you should use the MySQL 8 Document Store, that provide a CRUD API and some other cool stuffs. I’ll blog about it soon, so stay tune!
Anyway I’ll recommend you to read : Top 10 reasons for NoSQL with MySQL.
Misc
Documentation
Articles
- 30 mins with JSON in MySQL
- JSON et colonnes générées avec MySQL (in French)
- New JSON functions in MySQL 5.7.22
- MySQL 8.0: From SQL Tables to JSON Documents (and back again)
Other resources
- You’ll find some of the sample databases used in this article here.
- Restaurants collection could be find here.
- Some books that could be useful : here.
Thanks for using MySQL!
Cloud Solutions Architect at Oracle
MySQL Geek, author, blogger and speaker
I’m an insatiable hunger of learning.
—–
Blog: www.dasini.net/blog/en/
Twitter: https://twitter.com/freshdaz
SlideShare: www.slideshare.net/freshdaz
Youtube: https://www.youtube.com/channel/UC12TulyJsJZHoCmby3Nm3WQ
—–
[…] Read this post in English […]
Nice article Olivier, I am a JSON lover and loved that every db support JSON. Nice work on in detailed explanation on json function in mysql.
for JSON I love to suggest a tool which can help every json user https://jsonformatter.org, it helps me to validate and format json data.
Thanks again Olivier for this article.
[…] Note 2: You may also be interested by 30 mins with MySQL JSON functions […]
[…] 30 mins with MySQL JSON functions […]