Even more ways to increase MySQL query speed and decrease running speed (part 2)
This guide is the second part of our guide on how to improve MySQL select speed. In case you missed it, you can read the previous post: Ten ways to increase MySQL query speed and decrease running speed.
So without further ado, let’s get into the details.
Using the correct limit value without an index
Not all selects are done in real-time, we have many queries that run overnight and do a full table scan, and the results are waiting in the morning.
We migrated a table from a modest server to a strong server and used the script to do some calculations, which took many nights.
Upon inspection, we noticed that the script was using a limit to get parts of the results, and once it was done with the processing, it selected the next chunk.
Because the server was “weak”, the limit was for 10,000 records at once. The issue was that the fields were not indexed, so MySQL had to do too much computation to get the data from the previous point.
The solution was to increase the limit to 500,000 records. A few days turned into a few hours.
Creating a pre-calculated table
Some of our data have multiple indexes (over one billion records per table), and selecting with several indexes and using a limit to get part of the data can take very long.
Furthermore, sorting and using the limit clause can cause the query to run slower.
Our solution was to create a pre-calculated table and use a primary key on the data we wanted to sort.
The drawback of this method is that it takes more space, and the pre-calculated table must be updated to reflect changes to the master table, so it’s not an option for some scenarios.
Important tip: Make sure to update the pre-calculated table if you plan to use new data. We wasted many hours debugging to discover we didn’t calculate the table and received older data.
What are MySQL partitions?
Partition is a way to divide a large table into small tables (all done behind the scenes).
The secret to partition is to select the correct function to build the partition with. The function uses several methods based on the primary key.
Not all primary keys type can be used in a partition, and not all tables are good candidates for partitioning.
A classic scenario would be a primary key based on a date, for example, a year, so that we can place each year’s records inside a partition. If we query for data from a specific year, the MySQL server would only access the partition belonging to that year.
You can read a detailed guide on MySQL partitions here.
When to use MySQL partitions?
Remember that you need to measure everything, so all tips here may not be accurate for your specific scenario and data structure. Here are some reasons to use partitions:
- Large table, if you have millions of raws, don’t bother. We have 1-20 billion records in the tables that we partitioned.
- Select can be done on the primary key and not multiple indexes.
When not to use MySQL partitions
The reasons not to use partitions are mostly a reverse of when to use MySQL partitions:
- Too little data will increase overhead
- A query uses an index that is not the primary key. In this case, the MySQL server would run the query on each partition and do a union which will affect performance.
- There’s no way to structure the data to make sense to partition it.
MySQL server tuning and optimization is a never-ending process specific to your data. It could be that all of the suggestions will do nothing or even make the timing worse. The solution is to constantly measure if the change improved or hurt the query running time.