Analyzing Queries for Speed with EXPLAIN
When you are trying to optimize your queries to run quickly and efficiently, you may encounter queries that really should run faster. That's where EXPLAIN comes in handy. This article shows you how to use EXPLAIN in query analysis. It is excerpted from chapter 13 of the MySQL Certification Guide, written by Paul Dubois et al. (Sams, 2005; ISBN: 0672328127).
When a SELECT query does not run as quickly as you think it should, use the EXPLAIN statement to ask the MySQL server for information about how the query optimizer processes the query. This information is useful in several ways:
EXPLAIN can provide information that points out the need to add an index.
If a table already has indexes, you can use EXPLAIN to find out whether the optimizer is using them. (To see what indexes a table has, use SHOW INDEX, as described in section 13.1.2, "Obtaining Table Index Information.")
If indexes exist but aren't being used, you can try writing the query in different ways. EXPLAIN can tell you whether a rewrite does a better job of helping the server use the available indexes.
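As a sketch of that last point, consider a hypothetical table named orders with an index on its order_date column (the table, its columns, and the index name are assumptions made for illustration and do not come from the chapter). Both SELECT statements below ask for the same rows. Wrapping an indexed column in a function call generally keeps the optimizer from using an ordinary index on that column to look up rows, whereas comparing the bare column against a range of constants leaves the index available.

```sql
-- Hypothetical table, for illustration only.
CREATE TABLE orders (
    order_id   INT UNSIGNED NOT NULL AUTO_INCREMENT,
    customer   VARCHAR(40)  NOT NULL,
    order_date DATE         NOT NULL,
    PRIMARY KEY (order_id),
    INDEX idx_order_date (order_date)
);

-- Form 1: the indexed column is hidden inside a function call.
EXPLAIN SELECT order_id, customer FROM orders
WHERE YEAR(order_date) = 2004;

-- Form 2: the same condition written as a range on the bare column.
EXPLAIN SELECT order_id, customer FROM orders
WHERE order_date >= '2004-01-01' AND order_date < '2005-01-01';
```

Compare the possible_keys, key, and rows columns of the two results. What the optimizer chooses depends on the table's contents, so on an empty or very small table the two plans may look much alike.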
When using EXPLAIN to analyze a query, it's helpful to have a good understanding of the tables involved. If you need to determine a table's structure, remember that you can use DESCRIBE to obtain information about a table's columns, and SHOW INDEX for information about its indexes.
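The following sequence is a minimal sketch of that workflow. The employee table, its columns, and its index name are hypothetical and exist only to give the statements something to act on.

```sql
-- Hypothetical table, for illustration only.
CREATE TABLE employee (
    emp_id    INT UNSIGNED NOT NULL,
    last_name VARCHAR(40)  NOT NULL,
    dept      CHAR(10)     NOT NULL,
    PRIMARY KEY (emp_id),
    INDEX idx_last_name (last_name)
);

-- Column names, types, nullability, key flags, and defaults.
DESCRIBE employee;

-- One row for each column of each index on the table.
SHOW INDEX FROM employee;

-- With the structure in mind, ask how the optimizer handles a query.
EXPLAIN SELECT emp_id, dept FROM employee WHERE last_name = 'Smith';
```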
This section describes how EXPLAIN works. Later in the chapter, section 13.3, "General Query Enhancement," discusses some general query-writing principles that help MySQL use indexes more effectively. You can apply those principles in conjunction with EXPLAIN to determine the best way of writing a query.
13.2.1 Identifying Candidates for Query Analysis
EXPLAIN can be used to analyze any SELECT query, but some query performance characteristics make it especially likely that EXPLAIN will be helpful:
When a query that you issue (for example, using the mysql client) clearly takes a long time.
When a query appears in the slow query log, particularly if it appears consistently each time it is issued.
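Before reading the slow query log, it can help to know what the server counts as slow. The statements below are a small sketch. The GLOBAL keyword assumes MySQL 5.0 or later; older servers accept plain SHOW STATUS.

```sql
-- Threshold, in seconds, above which a statement counts as slow.
SHOW VARIABLES LIKE 'long_query_time';

-- Number of statements that have exceeded that threshold
-- since the server started.
SHOW GLOBAL STATUS LIKE 'Slow_queries';
```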
Keep in mind that "slow" is a relative term. You don't want to waste time optimizing a query that is not inherently slow but only seems slow for external reasons:
Queries are classified as slow on the basis of wall-clock time. They appear in the slow log more often when the server host is heavily loaded than when it is not, so you should evaluate query execution time against general system activity on that host.
A query might appear slow if the machine is very busy, but otherwise perform acceptably. For example, if filesystem backups are taking place, they'll incur heavy disk activity that impedes the performance of other programs, including the MySQL server. The machine might also be under heavy load for other reasons, for example when a very active Web server runs on the same host.
With these considerations in mind, a query is a good candidate for optimization if it is consistently slow compared to other queries no matter when you run it, and you know that the machine isn't simply bogged down all the time.
Another factor to recognize is that the mere presence of a query in the slow query log does not necessarily mean that the query is slow. If the server is run with the --log-long-format option, the slow query log will also contain queries that execute without using any index. In some cases, such a query may indeed be a prime candidate for optimization (for example, by adding an index). But in other cases, MySQL might elect not to use an existing index simply because a table is so small that scanning all of its rows is just as fast as using an index.
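One way to judge whether such a log entry matters is to check how large the table is and what the optimizer reports for the query. The sketch below uses a hypothetical three-row lookup table named currency. The table and its data are invented for illustration, and the plan you see depends on your server version and the table's contents.

```sql
-- Hypothetical lookup table with only a handful of rows.
CREATE TABLE currency (
    code CHAR(3)     NOT NULL,
    name VARCHAR(40) NOT NULL,
    PRIMARY KEY (code),
    INDEX idx_name (name)
);

INSERT INTO currency (code, name) VALUES
    ('EUR', 'Euro'), ('JPY', 'Yen'), ('USD', 'US Dollar');

-- How many rows are there to scan?
SELECT COUNT(*) FROM currency;

-- Does the key column name an index, or is it NULL (no index used)?
EXPLAIN SELECT * FROM currency WHERE name = 'Yen';
```

If the table holds only a few rows, a plan that uses no index is usually nothing to worry about.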
The SHOW PROCESSLIST statement is another useful source of information about query execution. Use it periodically to see which queries are currently running. If you notice that a particular query often causes a backlog by making other queries block, see whether you can optimize it. If you're successful, the backlog will ease. To get the most information from SHOW PROCESSLIST, you should have the PROCESS privilege. The statement then displays queries being run by all clients, not just your own.
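A brief sketch of these statements follows. The account name 'monitor'@'localhost' is hypothetical and is assumed to exist already. The GRANT statement must be issued by a user who is allowed to grant the PROCESS privilege.

```sql
-- Threads you are allowed to see. Without FULL, the Info column
-- shows only the first 100 characters of each statement.
SHOW PROCESSLIST;

-- The same listing with the complete statement text.
SHOW FULL PROCESSLIST;

-- Run by an administrator: let a monitoring account see the
-- threads of all clients. PROCESS is a global privilege, so it
-- is granted ON *.*
GRANT PROCESS ON *.* TO 'monitor'@'localhost';
```

In the output, the Time and State columns show how long each thread has been in its current state. A long-running statement alongside many other threads that are waiting is the kind of backlog described above.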