The following example demonstrates how to use EXPLAIN to analyze and optimize a sample query. The purpose of the query is to answer the question, "Which cities have a population of more than eight million?" and to display for each city its name and population, along with the country name. This question could be answered using only city information, except that to get each country's name rather than its code, city information must be joined to country information.

The example uses tables created from world database information. Initially, these tables will have no indexes, so EXPLAIN will show that the query is not optimal. The example then adds indexes and uses EXPLAIN to determine the effect of indexing on query performance.

Begin by creating the initial tables, CountryList and CityList. These are derived from the Country and City tables, but need contain only the columns involved in the query:

mysql> CREATE TABLE CountryList
    -> SELECT Code, Name FROM Country;
Query OK, 239 rows affected (0.04 sec)

mysql> CREATE TABLE CityList
    -> SELECT CountryCode, Name, Population FROM City;
Query OK, 4079 rows affected (0.04 sec)

The query that retrieves the desired information in the required format looks like this:

mysql> SELECT CountryList.Name, CityList.Name,

While the tables are in their initial unindexed state, applying EXPLAIN to the query yields the following result:

mysql> EXPLAIN SELECT CountryList.Name,

The information displayed by EXPLAIN shows that no optimizations could be made:
EXPLAIN shows that MySQL would need to check nearly a million row combinations (239 × 4,079 = 974,881) to produce a query result that contains only 10 rows. Clearly, this query would benefit from the creation of indexes that allow the server to look up information faster.

Good columns to index typically are those that you use for searching, grouping, or sorting records. The query does not have any GROUP BY or ORDER BY clauses, but it does use columns for searching. Specifically, CountryList.Code and CityList.CountryCode are used to match rows between the two tables, and CityList.Population is used to select the cities with a large enough population.
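The difference between checking every row combination and looking rows up directly can be illustrated with a toy model. The sketch below is written in Python, not MySQL, and does not reproduce how the server actually executes a join. The table contents, the `THRESHOLD` constant, and both function names are hypothetical, and a plain dictionary stands in for a unique index on the country code.

```python
# Hypothetical miniature stand-ins for CountryList and CityList (made-up rows).
countries = [("AAA", "Country A"), ("BBB", "Country B"), ("CCC", "Country C")]
cities = [
    ("AAA", "City 1", 9_000_000),
    ("AAA", "City 2", 500_000),
    ("BBB", "City 3", 12_000_000),
    ("CCC", "City 4", 100_000),
]
THRESHOLD = 8_000_000  # "more than eight million"


def join_without_index(countries, cities):
    """Pair every city row with every country row, as an unindexed join must."""
    examined = 0
    result = []
    for code, city, population in cities:
        for country_code, country in countries:
            examined += 1  # one row combination checked
            if code == country_code and population > THRESHOLD:
                result.append((country, city, population))
    return result, examined


def join_with_lookup(countries, cities):
    """Scan the cities once; a dict plays the role of a unique index on the code."""
    by_code = dict(countries)
    examined = 0
    result = []
    for code, city, population in cities:
        examined += 1  # one city row scanned
        country = by_code[code]  # direct lookup instead of scanning all countries
        if population > THRESHOLD:
            result.append((country, city, population))
    return result, examined


slow_rows, slow_count = join_without_index(countries, cities)
fast_rows, fast_count = join_with_lookup(countries, cities)
print("combinations checked without index:", slow_count)
print("rows scanned with lookup:", fast_count)
```

Both functions return the same rows. Only the amount of work differs: the first grows with the product of the two table sizes, the second with the size of the city table alone.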
To see the effect of indexing, try creating indexes on the columns used to join the tables. In the CountryList table, Code can serve as a primary key because it uniquely identifies each row. Add the index using ALTER TABLE:

mysql> ALTER TABLE CountryList ADD PRIMARY KEY

In the CityList table, the index on CountryCode must be nonunique because multiple cities can share the same country code:

mysql> ALTER TABLE CityList ADD INDEX (CountryCode);

After creating the indexes, EXPLAIN reports a somewhat different result:

mysql> EXPLAIN SELECT CountryList.Name,

Observe that EXPLAIN now lists the tables in a different order. CityList appears first, which indicates that MySQL will read rows from that table first and use them to search for matches in the second table, CountryList. The change in table processing order reflects the optimizer's use of the index information that is now available for executing the query.

MySQL will still scan all rows of the CityList table (its type value is ALL), but now the server can use each of those rows to look up the corresponding CountryList row directly. This is seen in the information displayed for the CountryList table:
The result from EXPLAIN shows that indexing CountryList.Code as a primary key improves query performance. However, it still indicates a full scan of the CityList table. The optimizer sees that the index on CountryCode is available, but the key value of NULL indicates that it will not be used. Does that mean the index on the CountryCode column is of no value? It depends. For this query, the index is not used. In general, however, it's good to index joined columns, so you likely would find for other queries on the CityList table that the index does help.

The product of the rows values is now just 4,079. That's much better than 974,881, but perhaps further improvement is possible. The WHERE clause of the query restricts CityList rows based on their Population values, so try creating an index on that column:

mysql> ALTER TABLE CityList ADD INDEX (Population);

After creating the index, run EXPLAIN again:

mysql> EXPLAIN SELECT CountryList.Name,

The output for the CountryList table is unchanged compared to the previous step. That is not a surprise; MySQL already found that it could use a primary key for lookups, which is very efficient. On the other hand, the result for the CityList table is different. The optimizer now sees two indexes in the table as candidates. Furthermore, the key value shows that it will use the index on Population to look up records. This results in an improvement over a full scan, as seen in the change of the rows value from 4,079 to 78.

The query now is optimized. Note that the product of the rows values, 78, still is larger than the actual number of rows produced by the query (10 rows). This is because the rows values are only estimates. The optimizer cannot give an exact count without actually executing the query.

To summarize: with no indexes, the product of the rows values was 974,881; with indexes on the join columns, it dropped to 4,079; with the additional index on Population, it dropped to 78.
The example shows that using indexes effectively can substantially reduce the work required by the server to execute a query, and that EXPLAIN is a useful tool for assessing the effect of indexing.
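The row-estimate arithmetic quoted in the example can be checked with a short calculation. The sketch below is Python rather than MySQL and uses only figures stated in the text. The CountryList value of 1 for the two indexed stages is an inference from the quoted products (4,079 and 78), not a value read from EXPLAIN output, and the `stages` list and `row_product` function are illustrative names.

```python
# (stage label, CityList rows estimate, CountryList rows estimate)
# The unindexed figures are the table sizes reported by CREATE TABLE.
# The CountryList value of 1 in the indexed stages is inferred from the
# products quoted in the text, not taken from shown EXPLAIN output.
stages = [
    ("no indexes", 4079, 239),
    ("indexes on join columns", 4079, 1),
    ("plus index on Population", 78, 1),
]


def row_product(city_rows, country_rows):
    """Estimated number of row combinations for the two-table join."""
    return city_rows * country_rows


baseline = row_product(4079, 239)
for label, city_rows, country_rows in stages:
    product = row_product(city_rows, country_rows)
    share = product / baseline
    print(f"{label:<26} {product:>8,} combinations ({share:.4%} of baseline)")
```

As the text notes, these are optimizer estimates, so even the final figure of 78 overstates the 10 rows the query actually returns.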