HomeMySQL Page 4 - Analyzing Queries for Speed with EXPLAIN
13.2.4 EXPLAIN Output Columns - MySQL
When you are trying to optimize your queries to run quickly and efficiently, you may encounter queries that really should run faster. That's where EXPLAIN comes in handy. This article shows you how to use EXPLAIN in query analysis. It is excerpted from chapter 13 of the MySQL Certification Guide, written by Paul Dubois et al. (Sams, 2005; ISBN: 0672328127).
To use EXPLAIN productively, it's important to know the meaning of the columns in each row of output that it produces:
table is the name of the table to which the information in the row applies. EXPLAIN produces one row of output for each table named in the SELECT query. The order of the tables named in these rows indicates the order in which MySQL will read the tables to process the query. This is not necessarily the order in which you name them in the FROM clause, because the optimizer attempts to determine which order will result in the most efficient processing. The example in the preceding section shows this: The table order displayed by EXPLAIN changed as indexes were added.
type indicates the join type. The value is a measure of how efficiently MySQL can scan the table. The possible type values are described later in this section.
possible_keys indicates which of the table's indexes MySQL considers to be candidates for identifying rows that satisfy the query. This value can be a list of one or more index names, or NULL if there are no candidates. The word PRIMARY indicates that MySQL considers the table's primary key to be a candidate.
key indicates the optimizer's decision about which of the candidate indexes listed in possible_keys will yield most efficient query execution. If the key value is NULL, it means no index was chosen. This might happen either because there were no candidates or because the optimizer believes it will be just as fast to scan the table rows as to use any of the possible indexes. A table scan might be chosen over an index scan if the table is small, or because the index would yield too high a percentage of the rows in the table to be of much use.
key_len indicates how many bytes of index rows are used. From this value, you can derive how many columns from the index are used. For example, if you have an index consisting of three INT columns, each index row contains three 4-byte values. If key_len is 12, you know that the optimizer uses all three columns of the index when processing the query. If key_len is 4 or 8, it uses only the first one or two columns (that is, it uses a leftmost prefix of the index).
If you've indexed partial values of string columns, take that into account when assessing the key_len value. Suppose that you have a composite index on two CHAR(8) columns that indexes only the first 4 bytes of each column. In this case, a key_len value of 8 means that both columns of the index would be used, not just the first column.
ref indicates which indexed column or columns are used to choose rows from the table. const means key values in the index are compared to a constant expression, such as in Code='FRA'. NULL indicates that neither a constant nor another column are being used, indicating selection by an expression or range of values. It might also indicate that the column does not contain the value specified by the constant expression. If neither NULL nor const is specified, a table_name.column_name combination will be shown, indicating that the optimizer is looking at column_name in the rows returned from table_name to identify rows for the current table.
rows is the optimizer's estimate of how many rows from the table it will need to examine. The value is an approximation because, in general, MySQL cannot know the exact number of rows without actually executing the query. For a multiple-table query, the product of the rows values is an estimate of the total number of row combinations that need to be read. This product gives you a rough measure of query performance. The smaller the value, the better.
Extra provides other information about the join. The possible values are described later in this section.
The type value indicates the join type, but joins may be performed with varying degrees of efficiency. The type value provides a measure of this efficiency by indicating the basis on which rows are selected from each table. The following list shows the possible values, from the best type to the worst:
The table has exactly one row.
The table has exactly one matching row. This type value is similar to system, except that the table may have other, nonmatching rows. The EXPLAIN output from the query with WHERE Code='FRA' is an example of this:
mysql> EXPLAIN SELECT * FROM Country WHERE
Code = 'FRA'\G
*********************** 1. row ***************************
The query has a type value of const because only one row out of all its rows need be read. If the table contained only the row for France, there would be no nonmatching rows and the type value would be system rather than const.
For both system and const, because only one row matches, any columns needed from it can be read once and treated as constants while processing the rest of the query.
Exactly one row is read from the table for each combination of rows from the tables listed earlier by EXPLAIN. This is common for joins where MySQL can use a primary key to identify table rows.
Several rows may be read from the table for each combination of rows from the tables listed earlier by EXPLAIN. This is similar to eq_ref, but can occur when a nonunique index is used to identify table rows or when only a leftmost prefix of an index is used. For example, the CountryLanguage table has a primary key on the CountryCode and Language columns. If you search using only a CountryCode value, MySQL can use that column as a leftmost prefix, but there might be several rows for a country if multiple languages are spoken there.
The index is used to select rows that fall within a given range of index values. This is common for inequality comparisons such as id<10.
MySQL performs a full scan, but it scans the index rather than the data rows. An index scan is preferable: The index is sorted and index rows usually are shorter than data rows, so index rows can be read in order and more of them can be read at a time.
A full table scan of all data rows. Typically, this indicates that no optimizations are done and represents the worst case. It is particularly unfortunate when tables listed later in EXPLAIN output have a join type of ALL because that indicates a table scan for every combination of rows selected from the tables processed earlier in the join.
The Extra column provides additional information about how the table is processed. Some values indicate that the query is efficient:
MySQL can optimize the query by reading values from the index without having to read the corresponding data rows. This optimization is possible when the query selects only columns that are in the index.
MySQL uses a WHERE clause to identify rows that satisfy the query. Without a WHERE clause, you get all rows from the table.
MySQL reads a single row from the table for each combination of rows from the tables listed earlier in the EXPLAIN output.
MySQL can perform a LEFT JOIN "missing rows" optimization that quickly eliminates rows from consideration.
By contrast, some Extra values indicate that the query is not efficient:
Rows that satisfy the query must be sorted, which adds an extra processing step.
A temporary table must be created to process the query.
Range checked for each record
MySQL cannot determine in advance which index from the table to use. For each combination of rows selected from previous tables, it checks the indexes in the table to see which one will be best. This is not great, but it's better than using no index at all.
Using filesort and Using temporary generally are the two indicators of worst performance.
To use EXPLAIN for query analysis, examine its output for clues to ways the query might be improved. Make the change, and then run EXPLAIN again to see how its output changes. Changes might involve rewriting the query or changing the structure of your tables.
The following query rewriting techniques can be useful:
If the keys value is NULL even when there are indexes available, you can try adding a USE INDEX option as a hint to the optimizer which index is relevant for the query. To force MySQL to use the index, use FORCE INDEX. To tell MySQL to ignore an index that it chose and choose a different one instead, use IGNORE INDEX. Each of these options is used in the FROM clause, following the table name containing the index you want to control. The option is followed by parentheses containing a comma-separated list of one or more index names. PRIMARY means the table's primary key.
SELECT Name FROM CountryList USE
INDEX(PRIMARY) WHERE Code > 'M';
SELECT Name FROM CountryList IGNORE
WHERE Code < 'B' AND Population > 50000000;
The keyword KEY may be used instead of INDEX in all three options.
If you want to force MySQL to join tables in a particular order, begin the query with SELECT STRAIGHT_JOIN rather than SELECT, and then list the tables in the desired order in the FROM clause.
Sometimes a table in a query has an index available, but the query is written in such a way that prevents the index from being used. If you can rewrite the query into an equivalent form that allows use of the index, do so. Some rewriting techniques you can use are given in section 13.3, "General Query Enhancement."
Another way to provide the optimizer with better information on which to base its decisions is to change the structure of your tables:
If the possible_keys value is NULL in the output from EXPLAIN, it means MySQL finds no applicable index for processing the query. See whether an index can be added to the columns that identify which records to retrieve. For example, if you perform a join by matching a column in one table with a column in another, but neither of the columns are indexed, try indexing them.
Keep table index statistics up-to-date to help MySQL choose optimal indexes. If the table is a MyISAM table (or an InnoDB table, as of MySQL 4.0.13), you can update its statistics with the ANALYZE TABLE statement. As a table's contents change, the statistics go out of date and become less useful to the optimizer in making good decisions about query execution strategies. You should run ANALYZE TABLE more frequently for tables that change often than for those that are updated rarely.
Please check back next week for the continuation of this article.