Analyzing Queries for Speed with EXPLAIN

13.2.3 Analyzing a Query

When you are trying to optimize your queries to run quickly and efficiently, you may encounter queries that really should run faster. That's where EXPLAIN comes in handy. This article shows you how to use EXPLAIN in query analysis. It is excerpted from chapter 13 of the MySQL Certification Guide, written by Paul Dubois et al. (Sams, 2005; ISBN: 0672328127).

By: Sams Publishing
August 10, 2006


The following example demonstrates how to use EXPLAIN to analyze and optimize a sample query. The purpose of the query is to answer the question, "Which cities have a population of more than eight million?" and to display for each city its name and population, along with the country name. This question could be answered using only city information, except that to get each country's name rather than its code, city information must be joined to country information.

The example uses tables created from world database information. Initially, these tables will have no indexes, so EXPLAIN will show that the query is not optimal. The example then adds indexes and uses EXPLAIN to determine the effect of indexing on query performance.

Begin by creating the initial tables, CountryList and CityList. These are derived from the Country and City tables, but need contain only the columns involved in the query:

mysql> CREATE TABLE CountryList
    -> SELECT Code, Name FROM Country;
Query OK, 239 rows affected (0.04 sec)

mysql> CREATE TABLE CityList
    -> SELECT CountryCode, Name, Population FROM City;
Query OK, 4079 rows affected (0.04 sec)
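
Tables created with CREATE TABLE ... SELECT receive the selected columns and rows, but not the indexes of the source tables. One way to confirm that the copies start out unindexed, sketched here with the table names from the example, is to inspect their definitions. At this point the Key column shown by DESCRIBE should be blank, and SHOW CREATE TABLE should list no KEY clauses:

```sql
-- Column definitions; the Key column should be empty for every column.
DESCRIBE CountryList;
DESCRIBE CityList;

-- Full table definitions; no PRIMARY KEY or KEY clauses should appear yet.
SHOW CREATE TABLE CountryList;
SHOW CREATE TABLE CityList;
```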

The query that retrieves the desired information in the required format looks like this:

mysql> SELECT CountryList.Name, CityList.Name, CityList.Population
    -> FROM CountryList, CityList
    -> WHERE CountryList.Code = CityList.CountryCode
    -> AND CityList.Population > 8000000;
+--------------------+------------------+------------+
| Name               | Name             | Population |
+--------------------+------------------+------------+
| Brazil             | São Paulo        |    9968485 |
| Indonesia          | Jakarta          |    9604900 |
| India              | Mumbai (Bombay)  |   10500000 |
| China              | Shanghai         |    9696300 |
| South Korea        | Seoul            |    9981619 |
| Mexico             | Ciudad de México |    8591309 |
| Pakistan           | Karachi          |    9269265 |
| Turkey             | Istanbul         |    8787958 |
| Russian Federation | Moscow           |    8389200 |
| United States      | New York         |    8008278 |
+--------------------+------------------+------------+
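
The same query can also be written with explicit JOIN ... ON syntax. The sketch below is an equivalent formulation, not part of the original example. The aliases co and ci and the column headings Country and City are names chosen here for illustration; the headings avoid having two output columns that are both labeled Name:

```sql
-- Equivalent query using an explicit inner join and descriptive column headings.
SELECT co.Name AS Country,
       ci.Name AS City,
       ci.Population
FROM CountryList AS co
INNER JOIN CityList AS ci
        ON co.Code = ci.CountryCode   -- join condition
WHERE ci.Population > 8000000;        -- row filter
```

Keeping the join condition in ON and the filter in WHERE makes it easier to see which columns serve which purpose when deciding what to index.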

While the tables are in their initial unindexed state, applying EXPLAIN to the query yields the following result:

mysql> EXPLAIN SELECT CountryList.Name, CityList.Name, CityList.Population
    -> FROM CountryList, CityList
    -> WHERE CountryList.Code = CityList.CountryCode
    -> AND CityList.Population > 8000000\G
*************************** 1. row ***************************
        table: CountryList
         type: ALL
possible_keys: NULL
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 239
        Extra:
*************************** 2. row ***************************
        table: CityList
         type: ALL
possible_keys: NULL
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 4079
        Extra: Using where

The information displayed by EXPLAIN shows that no optimizations could be made:

  • The type value in each row shows how MySQL will read the corresponding table. For CountryList, the value of ALL indicates a full scan of all rows. For CityList, the value of ALL indicates a scan of all its rows to find a match for each CountryList row. In other words, all combinations of rows will be checked to find country code matches between the two tables.

  • The number of row combinations is given by the product of the rows values, where rows represents the optimizer's estimate of how many rows in a table it will need to check at each stage of the join. In this case, the product is 239 * 4,079 or 974,881.

EXPLAIN shows that MySQL would need to check nearly a million row combinations to produce a query result that contains only 10 rows. Clearly, this query would benefit from the creation of indexes that allow the server to look up information faster.
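
The arithmetic can be checked directly in the mysql client. The first statement below multiplies the two rows estimates reported by EXPLAIN. The second, a sketch that assumes the tables created above, computes the same product from the actual table sizes. The column aliases are illustrative:

```sql
-- Product of the rows estimates from the EXPLAIN output: 239 * 4079 = 974881.
SELECT 239 * 4079 AS estimated_combinations;

-- The same product derived from the real row counts of the two tables.
SELECT (SELECT COUNT(*) FROM CountryList) *
       (SELECT COUNT(*) FROM CityList) AS actual_combinations;
```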

Good columns to index typically are those that you use for searching, grouping, or sorting records. The query does not have any GROUP BY or ORDER BY clauses, but it does use columns for searching. Specifically:

  • The query uses CountryList.Code and CityList.CountryCode to match records between tables.

  • The query uses CityList.Population to cull records that do not have a large enough population.
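
One rough way to judge whether these columns are worth indexing is to look at how selective they are. The following sketch, with column aliases chosen here for illustration, counts how many CityList rows pass the population filter and how many distinct country codes the table contains:

```sql
-- In MySQL a comparison evaluates to 1 or 0, so SUM() counts the rows that pass it.
SELECT COUNT(*)                    AS total_rows,
       SUM(Population > 8000000)   AS rows_passing_filter,
       COUNT(DISTINCT CountryCode) AS distinct_country_codes
FROM CityList;
```

A filter that keeps only a small fraction of the rows is usually a good candidate for an index, because the index lets the server skip most of the table.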

To see the effect of indexing, start by creating indexes on the columns used to join the tables. In the CountryList table, Code uniquely identifies each row, so it can be made the primary key. Add the index using ALTER TABLE:

mysql> ALTER TABLE CountryList ADD PRIMARY KEY (Code);

In the CityList table, CountryCode must be given a nonunique index because multiple cities can share the same country code:

mysql> ALTER TABLE CityList ADD INDEX (CountryCode);
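
To confirm that an index was created, list the indexes on a table with SHOW INDEX. When ADD INDEX is given no index name, as here, MySQL names the index after its first column, which is why CountryCode appears as a key name in EXPLAIN output. A brief check might look like this:

```sql
-- Should list one unique index named PRIMARY on the Code column.
SHOW INDEX FROM CountryList;

-- Should list one nonunique index named CountryCode.
SHOW INDEX FROM CityList;
```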

After creating the indexes, EXPLAIN reports a somewhat different result:

mysql> EXPLAIN SELECT CountryList.Name, CityList.Name, CityList.Population
    -> FROM CountryList, CityList
    -> WHERE CountryList.Code = CityList.CountryCode
    -> AND CityList.Population > 8000000\G
*************************** 1. row ***************************
        table: CityList
         type: ALL
possible_keys: CountryCode
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 4079
        Extra: Using where
*************************** 2. row ***************************
        table: CountryList
         type: eq_ref
possible_keys: PRIMARY
          key: PRIMARY
      key_len: 3
          ref: CityList.CountryCode
         rows: 1
        Extra:

Observe that EXPLAIN now lists the tables in a different order. CityList appears first, which indicates that MySQL will read rows from that table first and use them to search for matches in the second table, CountryList. The change in table processing order reflects the optimizer's use of the index information that is now available for executing the query.
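
To compare the optimizer's chosen order with the order in which the tables are written, MySQL's STRAIGHT_JOIN modifier can be used to force the tables to be joined in the order they appear in the FROM clause. This is a side experiment, not a step in the original example:

```sql
-- STRAIGHT_JOIN forces CountryList to be read first, then CityList.
EXPLAIN SELECT STRAIGHT_JOIN
       CountryList.Name, CityList.Name, CityList.Population
FROM CountryList, CityList
WHERE CountryList.Code = CityList.CountryCode
AND CityList.Population > 8000000;
```

Comparing the rows values of this forced plan with those of the optimizer's own plan gives a sense of why the optimizer preferred to read CityList first.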

MySQL will still scan all rows of the CityList table (its type value is ALL), but now the server can use each of those rows to look up the corresponding CountryList row directly. This can be seen in the information displayed for the CountryList table:

  • The type value of eq_ref indicates that, for each CityList row, MySQL reads at most one matching CountryList row by means of an equality comparison against the column named in the ref field, CityList.CountryCode.

  • The possible_keys value of PRIMARY shows that the optimizer sees the primary key as a candidate for optimizing the query, and the key field indicates that it will actually use the primary key when executing the query.

The result from EXPLAIN shows that indexing CountryList.Code as a primary key improves query performance. However, it still indicates a full scan of the CityList table. The optimizer sees that the index on CountryCode is available, but the key value of NULL indicates that it will not be used. Does that mean the index on the CountryCode column is of no value? It depends. For this query, the index is not used. In general, however, it's good to index joined columns, so you likely would find for other queries on the CityList table that the index does help.
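
As an illustration of a query for which the CountryCode index is more likely to matter, consider a lookup of all cities in a single country. The query below is hypothetical and not part of the original example; the country code 'JPN' is just a sample value:

```sql
-- Hypothetical single-table lookup that searches on the indexed CountryCode column.
EXPLAIN SELECT Name, Population
FROM CityList
WHERE CountryCode = 'JPN';
```

For a query like this, one would expect the key value to name the CountryCode index rather than show NULL, although the optimizer makes the final choice based on its statistics.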

The product of the rows now is just 4,079. That's much better than 974,881, but perhaps further improvement is possible. The WHERE clause of the query restricts CityList rows based on their Population values, so try creating an index on that column:

mysql> ALTER TABLE CityList ADD INDEX (Population);

After creating the index, run EXPLAIN again:

mysql> EXPLAIN SELECT CountryList.Name, CityList.Name, CityList.Population
    -> FROM CountryList, CityList
    -> WHERE CountryList.Code = CityList.CountryCode
    -> AND CityList.Population > 8000000\G
*************************** 1. row ***************************
        table: CityList
         type: range
possible_keys: CountryCode,Population
          key: Population
      key_len: 4
          ref: NULL
         rows: 78
        Extra: Using where
*************************** 2. row ***************************
        table: CountryList
         type: eq_ref
possible_keys: PRIMARY
          key: PRIMARY
      key_len: 3
          ref: CityList.CountryCode
         rows: 1
        Extra:

The output for the CountryList table is unchanged compared to the previous step. That is not a surprise; MySQL already found that it could use a primary key for lookups, which is very efficient. On the other hand, the result for the CityList table is different. The optimizer now sees two indexes in the table as candidates. Furthermore, the key value shows that it will use the index on Population to look up records. This results in an improvement over a full scan, as seen in the change of the rows value from 4,079 to 78.

The query now is optimized. Note that the product of the rows values, 78, still is larger than the actual number of rows produced by the query (10 rows). This is because the rows values are only estimates. The optimizer cannot give an exact count without actually executing the query.
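
Because the rows values are derived from index statistics, it can be useful to refresh those statistics and to compare the estimate with a real count. A minimal sketch, using an illustrative column alias:

```sql
-- Refresh the key distribution statistics that the optimizer uses for its estimates.
ANALYZE TABLE CityList;

-- Actual number of CityList rows that satisfy the filter,
-- for comparison with the rows estimate that EXPLAIN reports.
SELECT COUNT(*) AS matching_cities
FROM CityList
WHERE Population > 8000000;
```

After refreshing the statistics, run EXPLAIN on the query again to see whether the estimate has moved closer to the actual count.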

To summarize:

  • With unindexed tables, the rows product was 974,881.

  • After indexing the join columns, the rows product dropped to 4,079, a 99.6% reduction.

  • After indexing the Population column, the rows product dropped to 78, a further reduction of 98.1% relative to the previous step.
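
The percentages in this summary follow from the rows products and can be reproduced in the mysql client. The column aliases are chosen here for illustration:

```sql
-- Relative drop in the rows product at each step, rounded to one decimal place.
SELECT ROUND(100 * (974881 - 4079) / 974881, 1) AS pct_after_join_indexes,      -- 99.6
       ROUND(100 * (4079 - 78) / 4079, 1)       AS pct_after_population_index;  -- 98.1
```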

The example shows that using indexes effectively can substantially reduce the work required by the server to execute a query, and that EXPLAIN is a useful tool for assessing the effect of indexing.



 
 