I have rarely seen dev/DBA using this concept when their query is busy in FTS or index scanning. I have decided to write my own experience here and share it with the outer world. It might help someone who is struggling with their query performance.
In many cases, you have experienced slowness in your select queries even instead after having had an index on the column present in where clause and you are using an in-built function like substring, like, max, min, etc. Please note I’m talking about inbuilt functions only not user-defined functions.
show create table list:
Create table (id int, name char (20)),primary key (Id) ,Key idx_name(name) engine=Innodb;
Now insert at least 1 million records so that we can easily understand the performance impact :
Now if your query is like :
select id,name from list where id=123 and name =’oggy’;
This query will use index idx_name and will give you output in less than a second.
But if your query is like ;
select id, name from list where id=123 and name like ‘%abc%’;
With more than one million records this query will perform FTS (full table scanning) on the name column and will not use the index in spite name column has a secondary index on it and this may take a lot of time depending on the number of records in the table.
Fact is function never use indexes, even if involved column has index created on it and always result in FTS.
Prior to MySQL 5.7, MySQL doesn’t provide any functional-based indexing feature or something similar to it.
In 5.7, we have virtual columns feature which is known as generated columns and can be built at the time of reading data, thus resulting in no consumption of storage space.
Let’s begin with our earlier example where our query was doing FTS. Now if i create a virtual column :
alter table list add name_virt char (10) as substring(name,3,9);
So instead of evaluating at the application level, DB will take care of this. Moreover, we can create an index on these virtual columns. So, if I create an index on this newly created column( I am trying making use of the online DDL, you can choose your own) :
create index idx_name_virt on list(name_virt) algorithm=inplace lock=none;
Once the index has been added, please run the same select query and you will observe that instead of consuming more than 4-5 minutes (suppose), your query will yield a result in less than a second.
If you want to store these values in the table, then you can use the STORAGE keyword while adding a column. But by default, it remains virtual. In my experience, my queries that were taking more than 4 seconds are now taking millisecond only.
Please be informed that performance depends on how heavy evaluation we are doing. If we are doing heavy evaluation then performance may get impacted because of several IO. So we need to decide as per our requirements.
Below is the clause of virtual column from mysql docs:
col_name data_type [GENERATED ALWAYS] AS (expression) [VIRTUAL | STORED] [NOT NULL | NULL] [UNIQUE [KEY]] [[PRIMARY] KEY] [COMMENT ‘string‘]
About Ankit Kapoor
Ankit Kapoor is a database geek and enjoy playing with open source databases like MySQL and PG.
Connect with me on linkedin

One thought on “Function-based indexing in MySQL 5.7”