MySQL Functional Indexes

Since MySQL 5.7 one can put indexes on expressions, aka functional indexes, using generated columns. Basically you first need to use the generated column to define the functional expression, then indexed this column.
Quite useful when dealing with JSON functions, you can find an example here and the documentation there.
Starting with MySQL 8.0.13 we have now an easiest way to create functional indexes (or functional key parts as mentioned in the documentation) \o/
Let’s see how with a quick practical example.
Below salaries table structure:
1 2 3 4 5 6 7 8 9 10 11 12 |
mysql> SHOW CREATE TABLE salaries\G *************************** 1. row *************************** Table: salaries Create Table: CREATE TABLE `salaries` ( `emp_no` int(11) NOT NULL, `salary` int(11) NOT NULL, `from_date` date NOT NULL, `to_date` date NOT NULL, PRIMARY KEY (`emp_no`,`from_date`), CONSTRAINT `salaries_ibfk_1` FOREIGN KEY (`emp_no`) REFERENCES `employees` (`emp_no`) ON DELETE CASCADE ) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci 1 row in set (0,00 sec) |
It contains some data
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 |
mysql> SELECT count(*) FROM salaries; +----------+ | count(*) | +----------+ | 2844047 | +----------+ mysql> SELECT * FROM salaries LIMIT 3; +--------+--------+------------+------------+ | emp_no | salary | from_date | to_date | +--------+--------+------------+------------+ | 10001 | 60117 | 1986-06-26 | 1987-06-26 | | 10001 | 62102 | 1987-06-26 | 1988-06-25 | | 10001 | 66074 | 1988-06-25 | 1989-06-25 | +--------+--------+------------+------------+ |
Let’s focus on the following query:
SELECT * FROM salaries WHERE YEAR(to_date)=1985
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 |
mysql> SELECT * FROM salaries WHERE YEAR(to_date)=1985; +--------+--------+------------+------------+ | emp_no | salary | from_date | to_date | +--------+--------+------------+------------+ | 14688 | 42041 | 1985-07-06 | 1985-08-08 | ...snip... | 498699 | 40000 | 1985-09-25 | 1985-09-28 | +--------+--------+------------+------------+ 89 rows in set (0,80 sec) mysql> explain SELECT * FROM salaries WHERE YEAR(to_date)=1985\G *************************** 1. row *************************** id: 1 select_type: SIMPLE table: salaries partitions: NULL type: ALL possible_keys: NULL key: NULL key_len: NULL ref: NULL rows: 2838426 filtered: 100.00 Extra: Using where |
We have a full table scan ( type: ALL
), meaning no index is used. Perhaps because there is no index on column to_date… 😉
So let’s add an index on to_date !
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 |
mysql> ALTER TABLE salaries ADD INDEX idx_to_date (to_date); Query OK, 0 rows affected (17,13 sec) Records: 0 Duplicates: 0 Warnings: 0 mysql> SHOW CREATE TABLE salaries\G *************************** 1. row *************************** Table: salaries Create Table: CREATE TABLE `salaries` ( `emp_no` int(11) NOT NULL, `salary` int(11) NOT NULL, `from_date` date NOT NULL, `to_date` date NOT NULL, PRIMARY KEY (`emp_no`,`from_date`), KEY `idx_to_date` (`to_date`), CONSTRAINT `salaries_ibfk_1` FOREIGN KEY (`emp_no`) REFERENCES `employees` (`emp_no`) ON DELETE CASCADE ) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci |
And run again the query with the hope of a better execution plan
1 2 3 4 5 6 7 8 9 10 11 12 13 14 |
mysql> explain SELECT * FROM salaries WHERE YEAR(to_date)=1985\G *************************** 1. row *************************** id: 1 select_type: SIMPLE table: salaries partitions: NULL type: ALL possible_keys: NULL key: NULL key_len: NULL ref: NULL rows: 2838426 filtered: 100.00 Extra: Using where |
Ouch! Still have a full table scan !
The index can’t be used because of the use of a function (YEAR()) on the indexed column (to_date).
BTW if you’re really surprise, maybe you should read this. 😉
This is the case when you need a functional index!
1 2 3 |
mysql> ALTER TABLE salaries ADD INDEX idx_year_to_date((YEAR(to_date))); Query OK, 0 rows affected (20,04 sec) Records: 0 Duplicates: 0 Warnings: 0 |
The syntax is very similar of the creation of a “regular” index. Although you must be aware of the double parentheses: (( <expression> ))
We can now see our new index named idx_year_to_date and the indexed expression year(to_date) :
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 |
mysql> SHOW CREATE TABLE salaries\G *************************** 1. row *************************** Table: salaries Create Table: CREATE TABLE `salaries` ( `emp_no` int(11) NOT NULL, `salary` int(11) NOT NULL, `from_date` date NOT NULL, `to_date` date NOT NULL, PRIMARY KEY (`emp_no`,`from_date`), KEY `idx_to_date` (`to_date`), KEY `idx_year_to_date` ((year(`to_date`))), CONSTRAINT `salaries_ibfk_1` FOREIGN KEY (`emp_no`) REFERENCES `employees` (`emp_no`) ON DELETE CASCADE ) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci mysql> SELECT INDEX_NAME, EXPRESSION FROM INFORMATION_SCHEMA.STATISTICS WHERE TABLE_SCHEMA='employees' AND TABLE_NAME = "salaries" AND INDEX_NAME='idx_year_to_date'; +------------------+-----------------+ | INDEX_NAME | EXPRESSION | +------------------+-----------------+ | idx_year_to_date | year(`to_date`) | +------------------+-----------------+ |
Let’s test our query again
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 |
mysql> explain SELECT * FROM salaries WHERE YEAR(to_date)=1985\G *************************** 1. row *************************** id: 1 select_type: SIMPLE table: salaries partitions: NULL type: ref possible_keys: idx_year_to_date key: idx_year_to_date key_len: 5 ref: const rows: 89 filtered: 100.00 Extra: NULL mysql> SELECT * FROM salaries WHERE YEAR(to_date)=1985; +--------+--------+------------+------------+ | emp_no | salary | from_date | to_date | +--------+--------+------------+------------+ | 14688 | 42041 | 1985-07-06 | 1985-08-08 | ...snip... | 498699 | 40000 | 1985-09-25 | 1985-09-28 | +--------+--------+------------+------------+ 89 rows in set (0,00 sec) |
Here we go!
Now the query is able to use the index. And in this case we have a positive impact on the execution time.
It is also interesting to note that it is possible to use idx_to_date, the first index created (the non functional one) if we can rewrite the original query:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 |
mysql> EXPLAIN SELECT * FROM salaries WHERE to_date BETWEEN '1985-01-01' AND '1985-12-31'\G *************************** 1. row *************************** id: 1 select_type: SIMPLE table: salaries partitions: NULL type: range possible_keys: idx_to_date key: idx_to_date key_len: 3 ref: NULL rows: 89 filtered: 100.00 Extra: Using index condition mysql> SELECT * FROM salaries WHERE to_date BETWEEN '1985-01-01' AND '1985-12-31' +--------+--------+------------+------------+ | emp_no | salary | from_date | to_date | +--------+--------+------------+------------+ | 20869 | 40000 | 1985-02-17 | 1985-03-01 | ...snip... | 45012 | 66889 | 1985-08-16 | 1985-12-31 | +--------+--------+------------+------------+ 89 rows in set (0,00 sec) |
This saves an index, I mean less indexes to maintain for the engine. Also speaking of maintenance cost, the cost to maintain a functional index is higher than the cost of a regular one.
In the other side the execution plan is less good (query cost higher) and obviously you must rewrite the query.
Requirements and restrictions.
A primary key cannot be a functional index:
1 2 |
mysql> CREATE TABLE t1 (i INT, PRIMARY KEY ((ABS(i)))); ERROR 3756 (HY000): The primary key cannot be a functional index |
You can not index non-deterministic functions (RAND(), UNIX_TIMESTAMP(), NOW()…)
1 2 |
mysql> CREATE TABLE t1 (i int, KEY ((RAND(i)))); ERROR 3758 (HY000): Expression of functional index 'functional_index' contains a disallowed function. |
SPATIAL and FULLTEXT indexes cannot have functional key parts.
Conclusion
Functional index is an interesting and a relevant feature, it could be very useful to optimize your queries without rewrite them and especially when dealing with JSON documents and other complex types.
Obviously all the details you must know are in the MySQL documentation: Functional Key Parts
If you interested in the high level architecture and the low level design please read the workload.
Thanks for using MySQL!

MySQL Geek, author, blogger and speaker
I’m an insatiable hunger of learning.
—–
Blog: www.dasini.net/blog/en/
Twitter: https://twitter.com/freshdaz
SlideShare: www.slideshare.net/freshdaz
Youtube: https://www.youtube.com/channel/UC12TulyJsJZHoCmby3Nm3WQ
—–
[…] MySQL Functional Indexes […]
It would be nice if MariaDB will implements similar solution
Well MariaDB is a forked of MySQL 5.1.
First available source code of 5.1 was in 2005…
Remember what was the #1 movie in the box office in 2005? : http://dasini.net/blog/2017/07/19/mysql-first-public-releases/
IMHO that does not make much sense to hope that these 2 products will behave in the same way nor have the same features.
Anyway MySQL 8 is Great! 🙂
Thanks for your comment.
Olivier
Hi,
thanks for your post.
I would like to understand why Function-based index is not used in my case.
Thanks and best regards.
Michel Ramirez
select count(*) from TSK_PLANNING_PLN => 4957
select count(*) from TSK_PLANNING_PLN where PLN_END_DATE IS NULL => 1660
Index => ALTER TABLE TSK_PLANNING_PLN ADD INDEX IND_MRO_PLN_END_DATE_COALESCE((COALESCE(
PLN_END_DATE
,STR_TO_DATE(‘9999/01/01′,’%Y/%m/%d’))));Request :
EXPLAIN SELECT *
FROM
TSK_PLANNING_PLN pln USE INDEX (IND_MRO_PLN_END_DATE_COALESCE)
WHERE
COALESCE(PLN_END_DATE,STR_TO_DATE(‘9999/01/01′,’%Y/%m/%d’)) >= STR_TO_DATE(‘2020/12/21′,’%Y/%m/%d’)
AND pln.pln_start_date 1505 lines.
Sorry i forgot to show you, explain plan
id|select_type|table|partitions|type|possible_keys |key|key_len|ref|rows|filtered|Extra |
–|———–|—–|———-|—-|—————————–|—|——-|—|—-|——–|———–|
1|SIMPLE |pln | |ALL |IND_MRO_PLN_END_DATE_COALESCE| | | |5032| 11.26|Using where|
Best regard.
Hello,
well there are many reason why the optimizer would not use an index.
One of them is because the optimizer think it is more efficient to not use it (usually because of the column cardinality).
If you try your query with FORCE INDEX instead of USE INDEX what is the EXPLAIN result ?
If you think the statistics are not accurate you can also run ANALYZE TABLE
Cheers,
Olivier