MySQL Functional Indexes

March 14, 2019
Sunset in Crete by Olivier DASINI

Since MySQL 5.7 one can put indexes on expressions, aka functional indexes, using generated columns. Basically you first need to use the generated column to define the functional expression, then indexed this column.

Quite useful when dealing with JSON functions, you can find an example here and the documentation there.

Starting with MySQL 8.0.13 we have now an easiest way to create functional indexes (or functional key parts as mentioned in the documentation) \o/

Let’s see how with a quick practical example.

Below salaries table structure:

mysql> SHOW CREATE TABLE salaries\G
*************************** 1. row ***************************
       Table: salaries
Create Table: CREATE TABLE `salaries` (
  `emp_no` int(11) NOT NULL,
  `salary` int(11) NOT NULL,
  `from_date` date NOT NULL,
  `to_date` date NOT NULL,
  PRIMARY KEY (`emp_no`,`from_date`),
  CONSTRAINT `salaries_ibfk_1` FOREIGN KEY (`emp_no`) REFERENCES `employees` (`emp_no`) ON DELETE CASCADE
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci
1 row in set (0,00 sec)

It contains some data

mysql> SELECT count(*) FROM salaries;
+----------+
| count(*) |
+----------+
|  2844047 |
+----------+


mysql> SELECT * FROM salaries LIMIT 3;
+--------+--------+------------+------------+
| emp_no | salary | from_date  | to_date    |
+--------+--------+------------+------------+
|  10001 |  60117 | 1986-06-26 | 1987-06-26 |
|  10001 |  62102 | 1987-06-26 | 1988-06-25 |
|  10001 |  66074 | 1988-06-25 | 1989-06-25 |
+--------+--------+------------+------------+

Let’s focus on the following query:
SELECT * FROM salaries WHERE YEAR(to_date)=1985

mysql> SELECT * FROM salaries WHERE YEAR(to_date)=1985;
+--------+--------+------------+------------+
| emp_no | salary | from_date  | to_date    |
+--------+--------+------------+------------+
|  14688 |  42041 | 1985-07-06 | 1985-08-08 |
...snip...
| 498699 |  40000 | 1985-09-25 | 1985-09-28 |
+--------+--------+------------+------------+
89 rows in set (0,80 sec)


mysql> explain SELECT * FROM salaries WHERE YEAR(to_date)=1985\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: salaries
   partitions: NULL
         type: ALL
possible_keys: NULL
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 2838426
     filtered: 100.00
        Extra: Using where

We have a full table scan ( type: ALL), meaning no index is used. Perhaps because there is no index on column to_date… 😉
So let’s add an index on to_date !

mysql> ALTER TABLE salaries ADD INDEX idx_to_date (to_date);
Query OK, 0 rows affected (17,13 sec)
Records: 0  Duplicates: 0  Warnings: 0


mysql> SHOW CREATE TABLE salaries\G
*************************** 1. row ***************************
       Table: salaries
Create Table: CREATE TABLE `salaries` (
  `emp_no` int(11) NOT NULL,
  `salary` int(11) NOT NULL,
  `from_date` date NOT NULL,
  `to_date` date NOT NULL,
  PRIMARY KEY (`emp_no`,`from_date`),
  KEY `idx_to_date` (`to_date`),
  CONSTRAINT `salaries_ibfk_1` FOREIGN KEY (`emp_no`) REFERENCES `employees` (`emp_no`) ON DELETE CASCADE
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci

And run again the query with the hope of a better execution plan

mysql> explain SELECT * FROM salaries WHERE YEAR(to_date)=1985\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: salaries
   partitions: NULL
         type: ALL
possible_keys: NULL
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 2838426
     filtered: 100.00
        Extra: Using where

Ouch! Still have a full table scan !
The index can’t be used because of the use of a function (YEAR()) on the indexed column (to_date).
BTW if you’re really surprise, maybe you should read this. 😉

This is the case when you need a functional index!

mysql> ALTER TABLE salaries ADD INDEX idx_year_to_date((YEAR(to_date)));
Query OK, 0 rows affected (20,04 sec)
Records: 0  Duplicates: 0  Warnings: 0

The syntax is very similar of the creation of a “regular” index. Although you must be aware of the double parentheses: (( <expression> ))
We can now see our new index named idx_year_to_date and the indexed expression year(to_date) :

mysql> SHOW CREATE TABLE salaries\G
*************************** 1. row ***************************
       Table: salaries
Create Table: CREATE TABLE `salaries` (
  `emp_no` int(11) NOT NULL,
  `salary` int(11) NOT NULL,
  `from_date` date NOT NULL,
  `to_date` date NOT NULL,
  PRIMARY KEY (`emp_no`,`from_date`),
  KEY `idx_to_date` (`to_date`),
  KEY `idx_year_to_date` ((year(`to_date`))),
  CONSTRAINT `salaries_ibfk_1` FOREIGN KEY (`emp_no`) REFERENCES `employees` (`emp_no`) ON DELETE CASCADE
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci


mysql> SELECT INDEX_NAME, EXPRESSION 
FROM INFORMATION_SCHEMA.STATISTICS 
WHERE TABLE_SCHEMA='employees' 
    AND TABLE_NAME = "salaries" 
    AND INDEX_NAME='idx_year_to_date';
+------------------+-----------------+
| INDEX_NAME       | EXPRESSION      |
+------------------+-----------------+
| idx_year_to_date | year(`to_date`) |
+------------------+-----------------+

Let’s test our query again

mysql> explain SELECT * FROM salaries WHERE YEAR(to_date)=1985\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: salaries
   partitions: NULL
         type: ref
possible_keys: idx_year_to_date
          key: idx_year_to_date
      key_len: 5
          ref: const
         rows: 89
     filtered: 100.00
        Extra: NULL


mysql> SELECT * FROM salaries WHERE YEAR(to_date)=1985;
+--------+--------+------------+------------+
| emp_no | salary | from_date  | to_date    |
+--------+--------+------------+------------+
|  14688 |  42041 | 1985-07-06 | 1985-08-08 |
...snip...
| 498699 |  40000 | 1985-09-25 | 1985-09-28 |
+--------+--------+------------+------------+
89 rows in set (0,00 sec)

Here we go!
Now the query is able to use the index. And in this case we have a positive impact on the execution time.

It is also interesting to note that it is possible to use idx_to_date, the first index created (the non functional one) if we can rewrite the original query:

mysql> EXPLAIN SELECT * 
FROM salaries 
WHERE to_date BETWEEN '1985-01-01' AND '1985-12-31'\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: salaries
   partitions: NULL
         type: range
possible_keys: idx_to_date
          key: idx_to_date
      key_len: 3
          ref: NULL
         rows: 89
     filtered: 100.00
        Extra: Using index condition


mysql> SELECT * 
FROM salaries 
WHERE to_date BETWEEN '1985-01-01' AND '1985-12-31'
+--------+--------+------------+------------+
| emp_no | salary | from_date  | to_date    |
+--------+--------+------------+------------+
|  20869 |  40000 | 1985-02-17 | 1985-03-01 |
...snip...
|  45012 |  66889 | 1985-08-16 | 1985-12-31 |
+--------+--------+------------+------------+
89 rows in set (0,00 sec)

This saves an index, I mean less indexes to maintain for the engine. Also speaking of maintenance cost, the cost to maintain a functional index is higher than the cost of a regular one.

In the other side the execution plan is less good (query cost higher) and obviously you must rewrite the query.

Requirements and restrictions.

A primary key cannot be a functional index:

mysql> CREATE TABLE t1 (i INT, PRIMARY KEY ((ABS(i))));
ERROR 3756 (HY000): The primary key cannot be a functional index

You can not index non-deterministic functions (RAND(), UNIX_TIMESTAMP(), NOW()…)

mysql> CREATE TABLE t1 (i int, KEY ((RAND(i))));
ERROR 3758 (HY000): Expression of functional index 'functional_index' contains a disallowed function.

SPATIAL and FULLTEXT indexes cannot have functional key parts.

Conclusion

Functional index is an interesting and a relevant feature, it could be very useful to optimize your queries without rewrite them and especially when dealing with JSON documents and other complex types.

Obviously all the details you must know are in the MySQL documentation: Functional Key Parts
If you interested in the high level architecture and the low level design please read the workload.



Thanks for using MySQL!

Follow me on twitter

6 Responses to “MySQL Functional Indexes”

  1. […] MySQL Functional Indexes […]

  2. It would be nice if MariaDB will implements similar solution

  3. Well MariaDB is a forked of MySQL 5.1.
    First available source code of 5.1 was in 2005…
    Remember what was the #1 movie in the box office in 2005? : http://dasini.net/blog/2017/07/19/mysql-first-public-releases/

    IMHO that does not make much sense to hope that these 2 products will behave in the same way nor have the same features.

    Anyway MySQL 8 is Great! 🙂
    Thanks for your comment.

    Olivier

  4. Hi,
    thanks for your post.
    I would like to understand why Function-based index is not used in my case.

    Thanks and best regards.
    Michel Ramirez

    select count(*) from TSK_PLANNING_PLN => 4957

    select count(*) from TSK_PLANNING_PLN where PLN_END_DATE IS NULL => 1660

    Index => ALTER TABLE TSK_PLANNING_PLN ADD INDEX IND_MRO_PLN_END_DATE_COALESCE((COALESCE(`PLN_END_DATE`,STR_TO_DATE(‘9999/01/01′,’%Y/%m/%d’))));

    Request :

    EXPLAIN SELECT *
    FROM
    TSK_PLANNING_PLN pln USE INDEX (IND_MRO_PLN_END_DATE_COALESCE)
    WHERE
    COALESCE(PLN_END_DATE,STR_TO_DATE(‘9999/01/01′,’%Y/%m/%d’)) >= STR_TO_DATE(‘2020/12/21′,’%Y/%m/%d’)
    AND pln.pln_start_date 1505 lines.

  5. Sorry i forgot to show you, explain plan

    id|select_type|table|partitions|type|possible_keys |key|key_len|ref|rows|filtered|Extra |
    –|———–|—–|———-|—-|—————————–|—|——-|—|—-|——–|———–|
    1|SIMPLE |pln | |ALL |IND_MRO_PLN_END_DATE_COALESCE| | | |5032| 11.26|Using where|

    Best regard.

  6. Hello,
    well there are many reason why the optimizer would not use an index.
    One of them is because the optimizer think it is more efficient to not use it (usually because of the column cardinality).
    If you try your query with FORCE INDEX instead of USE INDEX what is the EXPLAIN result ?
    If you think the statistics are not accurate you can also run ANALYZE TABLE

    Cheers,
    Olivier