SEO Explorer Blog:
Try us:

Five ways to improve MySQL select speed (Part 1)

This is the second article in the MySQL optimization series. In this article, I’m going to cover how to speed up select speed and improve query performance and improve MySQL performance.

Some of the concepts here will be similar to the ones in the previous article, so we will summarize them and not go over them again.

This article covers only optimizing InnoDB. Although some of the suggestions here will work for MyISAM, this article assumes InnoDB usage.

Server and hardware selection

Look at this section in our previous article (Buy a fast server  Look at items four through six); a fast summary is that you shouldn’t use mechanical hard drives, and a dedicated server is faster than a cloud instance.

Measuring performance

Man holds a hologram with graphs

Measuring MySQL performance was also covered in the previous article. Another option is to see the time reported by MySQL to perform each query, but you need to ensure that the results were not served from the query cache.

Another possible caching related “issue” that may affect the measurement is OS cache. If there’s enough memory, it can cache MySQL raw files and skew the measurement.

Create an index

An index is the heart of every SQL based database, create the right one, and performance will be good.

If you are a MySQL professional, you can skip this part, as you are probably aware of what an Index is and how it is used.

Full table scan

To understand why indexes are needed, we need to know how MySQL gets the data.

Let’s say we have a simple table:

CREATE TABLE People (
Name VARCHAR(64),
Age int(3)
)

We want to get the age of John, so we make a simple selection:

Select Age from People where Name='John'

That would work great as long as we have a small amount of data, but the speed will decrease when our table grows.

The reason is that MySQL will need to scan every row in the table and compare Name to ‘John.’ It will only exit once it found it, or it didn’t find anything.

Checking every row in the table is called a full table scan, which has zero performance. Think of a table with 1,000,000,000 rows (we have some of those in our data store), and every MySQL query will have to scan the entirety of the data. That would be crazy slow.

For example, we have a table with 300 million keywords. For debugging purposes, we need to do a full table scan once in a while. It takes about 5 minutes to get our data.

Creating the index

Hand holding a drawer in a cabinet

Indexes allow for fast retrieval of data. Just like the index in a phone book (if you remember those), you’d search the Name, and it will point you to the page where that Name is.

There are several types of indexes in MySQL, and each has its use case and performance. It’s essential to choose the right one for what we need.

MySQL index types:

Primary key

Every table should have a primary key (unless you know what you are doing). The primary key can span multiple columns, which guarantees that the data is unique.

So in our example, if we put a primary key on the column Name, we can only have one name ‘John,’ but we can also have ‘Beth’ and ‘Karen.’

The advantage of a primary key is that MySQL sorts the data in a way that matches the key, and it’s the fastest key in terms of select performance.

Spatial index

This index is meant for geometric values, and it’s used for map and GIS-related purposes. I have never worked in that area and never used that index.

Unique index

This index is just like the primary key in that it guarantees that the columns that are part of the index will be unique, but unlike the primary key, managing this index takes space and memory.

For large indexes, you may need a big enough memory to handle any operations. I once used a 16GB instance, and the index had 200m rows; MySQL crashed when I tried to do a copy from one table to another.

Regular index

The regular index allows for multiple values with the same value; it’s managed just like the unique index.

Full text

This index allows us to index substrings inside a string column. This is useful for wildcard string searches.

Descending indexes

These are regular indexes, but their sort order is reversed. This is supported by MySQL 8 (which you should use anyway) and is useful to process new data first.

MySQL table schema

In the next paragraphs, we are going to discuss a table schema, so here it is:

CREATE TABLE Search (
SearchID BigInt,
Keyword Varchar(64)
PRIMARY KEY (SearchID)
INDEX idxKeyword (Keyword)
)

Our test table is populated with 300 million records.

Checking how your selects are behaving

MySQL select reflecting on a silver barrel

Adding indexes and primary keys doesn’t mean that MySQL will use them, MySQL has its own logic of how to use indexes for a given query. For queries that are slow or are running many times, it’s a good practice to check what happens under the hood.

The way you do it is with the command ‘explain.’ You put it before your SQL command, and MySQL lets you know the execution plans for the query.

Explain on a primary key

For example, let’s make a simple selection on a table:

explain SELECT * FROM `Search` where SearchID=1

SearchID is the primary key; the reply we get is:

+----+-------------+--------+------------+-------+---------------+---------+---------+-------+------+----------+-------+
| id | select_type | table  | partitions | type  | possible_keys | key     | key_len | ref   | rows | filtered | Extra |
+----+-------------+--------+------------+-------+---------------+---------+---------+-------+------+----------+-------+
|  1 | SIMPLE      | Search | NULL       | const | PRIMARY       | PRIMARY | 4       | const |    1 |100.00    | NULL  |
+----+-------------+--------+------------+-------+---------------+---------+---------+-------+------+----------+-------+

As we can see, MySQL uses the primary key to do the select and will only make use of one row.

Explain without primary key

Now let’s do a select on a row without a primary key:

explain SELECT * FROM `Search` as Search where Results=1;

The column results have no index or primary key, and the result is:

+----+-------------+--------+------------+------+---------------+------+---------+------+-----------+----------+-------------+
| id | select_type | table  | partitions | type | possible_keys | key  | key_len | ref  | rows      | filtered | Extra       |
+----+-------------+--------+------------+------+---------------+------+---------+------+-----------+----------+-------------+
|  1 | SIMPLE      | Search | NULL       | ALL  | NULL          | NULL | NULL    | NULL | 245410801 |    10.00 | Using where |
+----+-------------+--------+------------+------+---------------+------+---------+------+-----------+----------+-------------+

A full table scan would be very slow.

Explain for index

Our next process is to do a select on a row with an index:

explain SELECT * FROM `Search` as Search where Keyword='test';

And the result is:

+----+-------------+--------+------------+------+---------------+------------+---------+-------+-------+----------+-------+
| id | select_type | table  | partitions | type | possible_keys | key        | key_len | ref   | rows  | filtered | Extra |
+----+-------------+--------+------------+------+---------------+------------+---------+-------+-------+----------+-------+
|  1 | SIMPLE      | Search | NULL       | ref  | IdxKeyword    | IdxKeyword | 12      | const | 49082 |   100.00 | NULL  |
+----+-------------+--------+------------+------+---------------+------------+---------+-------+-------+----------+-------+

MySQL will process 49082 rows, but let’s see how many rows we actually have:

SELECT count(*) FROM `Search` as Search where Keyword='test';

And the result is:

+----------+
| count(*) |
+----------+
|    26094 |
+----------+

There are fewer rows than what the explanation showed. This can be due to how MySQL manages indexes, but it’s still fast enough and not a full table scan.

Checking the where clause

The where clause is one of the most crucial parts of every select query because that’s the purpose of a database: get specific data and fast.

This means that you need to make sure that the columns checked with the where clause are indexed or are small so that a scan will be fast.

Good index selection of complex where clause

The ‘explain’ paragraph covers a few examples with an index in the where clause, but those were straightforward examples. Let’s show another example where we use both a primary key and an index:

explain select * from Search as Search where SearchID between 1 and 1000000 and Keyword between 'aaaaaaaaa' and 'bbbbbbbbb';

The result is:

+----+-------------+--------+------------+-------+--------------------+---------+---------+------+---------+----------+-------------+
| id | select_type | table  | partitions | type  | possible_keys      | key     | key_len | ref  | rows    | filtered | Extra       |
+----+-------------+--------+------------+-------+--------------------+---------+---------+------+---------+----------+-------------+
|  1 | SIMPLE      | Search | NULL       | range | PRIMARY,IdxKeyword | PRIMARY | 4       | NULL | 1052210 |    12.30 | Using where |
+----+-------------+--------+------------+-------+--------------------+---------+---------+------+---------+----------+-------------+

This is good; MySQL used both indexes.

No index selection with calculation on a column

Let’s try to mess it up:

explain select * from Search as Search where MOD(SearchID,2)=1;

In this example, we are doing a calculation on the column with the primary key; look what MySQL will do:

+----+-------------+--------+------------+------+---------------+------+---------+------+-----------+----------+-------------+
| id | select_type | table  | partitions | type | possible_keys | key  | key_len | ref  | rows      | filtered | Extra       |
+----+-------------+--------+------------+------+---------------+------+---------+------+-----------+----------+-------------+
|  1 | SIMPLE      | Search | NULL       | ALL  | NULL          | NULL | NULL    | NULL | 245412211 |   100.00 | Using where |
+----+-------------+--------+------------+------+---------------+------+---------+------+-----------+----------+-------------+

It performed a full table scan, even though we have a primary key! The way to solve it is to add a precalculated field, index it, and use that index field in the where clause.

The wrong type of index

Let’s do a wildcard select on an indexed column:

explain select * from Search as Search where Keyword like '%a%';

We are searching for any keyword that has the letter ‘a’ inside, and MySQL does this:

+----+-------------+--------+------------+------+---------------+------+---------+------+-----------+----------+-------------+
| id | select_type | table  | partitions | type | possible_keys | key  | key_len | ref  | rows      | filtered | Extra       |
+----+-------------+--------+------------+------+---------------+------+---------+------+-----------+----------+-------------+
|  1 | SIMPLE      | Search | NULL       | ALL  | NULL          | NULL | NULL    | NULL | 245412257 |    11.11 | Using where |
+----+-------------+--------+------------+------+---------------+------+---------+------+-----------+----------+-------------+

For a full table scan, in this case, we need to add a full-text index to the column ‘Keyword.’

Order by and index

Silver sort tiles

Here we do an order by using our primary key, which is in the where clause as well:

explain select * from Search where SearchID between 0 and 1000000 order by SearchID

The result is:

+----+-------------+--------+------------+-------+---------------+---------+---------+------+---------+----------+-------------+
| id | select_type | table  | partitions | type  | possible_keys | key     | key_len | ref  | rows    | filtered | Extra       |
+----+-------------+--------+------------+-------+---------------+---------+---------+------+---------+----------+-------------+
|  1 | SIMPLE      | search | NULL       | range | PRIMARY       | PRIMARY | 4       | NULL | 1052210 |   100.00 | Using where |
+----+-------------+--------+------------+-------+---------------+---------+---------+------+---------+----------+-------------+

Which means MySQL used the primary key, now let’s use the index as the sort factor:

explain select * from Search where SearchID between 0 and 1000000 order by Keyword;

This time we have a little surprise:

+----+-------------+--------+------------+-------+---------------+---------+---------+------+---------+----------+-----------------------------+
| id | select_type | table  | partitions | type  | possible_keys | key     | key_len | ref  | rows    | filtered | Extra
|
+----+-------------+--------+------------+-------+---------------+---------+---------+------+---------+----------+-----------------------------+
|  1 | SIMPLE      | search | NULL       | range | PRIMARY       | PRIMARY | 4       | NULL | 1052210 |   100.00 | Using where; Using filesort |
+----+-------------+--------+------------+-------+---------------+---------+---------+------+---------+----------+-----------------------------+

MySQL didn’t use our index; it sorted the data manually, which will slow down our query.

No index on table join

When doing a join between one or more tables (all type of joins: inner join, left join, right join, etc.), make sure that the join column you join on is indexed. Otherwise, MySQL will need to do a full table scan to do a proper join.

How can you know if this is the case? Use EXPLAIN. You should be an expert on using it by now.  😊

Check your MySQL configuration

MySQL out of the box configuration is modest and designed to support operations for web sites like WordPress.

To achieve the performance you’d expect from a modern database, you will need to tweak the default configuration.

The settings reside in the file my.cnf. In case of centos, it’s under /etc/my.cnf

innodb_buffer_pool_size

Personally, I believe this is the most critical MySQL setting, and it sets the amount of memory MySQL uses to cache table and index data. The settings should be 75% of memory because the size is not set in stone, and MySQL may be allocated 10% more of the set size.

Also, you may have other applications running, so you need to reserve memory for them.

I helped a friend (the same one I wrote about), and the first thing I checked even before the table structure is the memory allocated to MySQL. It was on default, which was 2GB on his machine. Changing it to 48GB increased performance drastically.

innodb-buffer-pool-instances

The number of instances the buffer pool is divided into, this is to allow for better concurrency with threads. It’s only relevant if the buffer pool is over 1.3GB.

On Windows 32bit, the default is the buffer pool size divided by 128MB. On all other systems, the default is 8.

With the new AMD processors, you can have 64 or 128 cores, and you can experiment with this value to see if you get better performance.

join_buffer_size

This setting tells MySQL how much memory to allocate to the join buffer, which is used while creating joins.

If the join data is too large, MySQL will resort to using a temporary table on the hard drive.

The default value is 256KB; keep in mind that increasing this value will increase the value for every MySQL thread.

This value is also relatively small because joins should be small, so before raising this value, check if the design is correct, and maybe a design change is a better option.

Another option is to “hint” the optimizer before a query with a large join.

You can read on this setting here: https://dev.mysql.com/doc/refman/8.0/en/server-system-variables.html#sysvar_join_buffer_size

Summary

In this guide we started to cover some aspects of how to optimize MySQL query speed and performance, in the next guide we will cover more topic like Table optimization, memory settings and more.

5 1 vote
Article Rating
Subscribe
Notify of
guest
0 Comments
Inline Feedbacks
View all comments
0
Would love your thoughts, please comment.x
()
x
Share via
Copy link
Powered by Social Snap