Analyzing Queries for Speed with EXPLAIN
By: Sams Publishing
    2006-08-10


    Table of Contents:
  • Analyzing Queries for Speed with EXPLAIN
  • 13.2.2 How EXPLAIN Works
  • 13.2.3 Analyzing a Query
  • 13.2.4 EXPLAIN Output Columns



    Analyzing Queries for Speed with EXPLAIN - 13.2.3 Analyzing a Query
    ( Page 3 of 4 )

    The following example demonstrates how to use EXPLAIN to analyze and optimize a sample query. The purpose of the query is to answer the question, "Which cities have a population of more than eight million?" and to display for each city its name and population, along with the country name. This question could be answered using only city information, except that to get each country's name rather than its code, city information must be joined to country information.

    The example uses tables created from world database information. Initially, these tables will have no indexes, so EXPLAIN will show that the query is not optimal. The example then adds indexes and uses EXPLAIN to determine the effect of indexing on query performance.

    Begin by creating the initial tables, CountryList and CityList. These are derived from the Country and City tables, but need contain only the columns involved in the query:

    mysql> CREATE TABLE CountryList
        -> SELECT Code, Name FROM Country;
    Query OK, 239 rows affected (0.04 sec)

    mysql> CREATE TABLE CityList
        -> SELECT CountryCode, Name, Population FROM City;
    Query OK, 4079 rows affected (0.04 sec)

    The query that retrieves the desired information in the required format looks like this:

    mysql> SELECT CountryList.Name, CityList.Name, CityList.Population
        -> FROM CountryList, CityList
        -> WHERE CountryList.Code = CityList.CountryCode
        -> AND CityList.Population > 8000000;
    +--------------------+------------------+------------+
    | Name               | Name             | Population |
    +--------------------+------------------+------------+
    | Brazil             | São Paulo        |    9968485 |
    | Indonesia          | Jakarta          |    9604900 |
    | India              | Mumbai (Bombay)  |   10500000 |
    | China              | Shanghai         |    9696300 |
    | South Korea        | Seoul            |    9981619 |
    | Mexico             | Ciudad de México |    8591309 |
    | Pakistan           | Karachi          |    9269265 |
    | Turkey             | Istanbul         |    8787958 |
    | Russian Federation | Moscow           |    8389200 |
    | United States      | New York         |    8008278 |
    +--------------------+------------------+------------+

    While the tables are in their initial unindexed state, applying EXPLAIN to the query yields the following result:

    mysql> EXPLAIN SELECT CountryList.Name, CityList.Name, CityList.Population
        -> FROM CountryList, CityList
        -> WHERE CountryList.Code = CityList.CountryCode
        -> AND CityList.Population > 8000000\G
    *************************** 1. row ***************************
            table: CountryList
             type: ALL
    possible_keys: NULL
              key: NULL
          key_len: NULL
              ref: NULL
             rows: 239
            Extra:
    *************************** 2. row ***************************
            table: CityList
             type: ALL
    possible_keys: NULL
              key: NULL
          key_len: NULL
              ref: NULL
             rows: 4079
            Extra: Using where

    The information displayed by EXPLAIN shows that no optimizations could be made:

    • The type value in each row shows how MySQL will read the corresponding table. For CountryList, the value of ALL indicates a full scan of all rows. For CityList, the value of ALL indicates a scan of all its rows to find a match for each CountryList row. In other words, all combinations of rows will be checked to find country code matches between the two tables.

    • The number of row combinations is given by the product of the rows values, where rows represents the optimizer's estimate of how many rows in a table it will need to check at each stage of the join. In this case, the product is 239 * 4,079 or 974,881.
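    The arithmetic behind the rows product can be checked with a short Python sketch. The function name rows_product is an illustrative assumption, not part of MySQL; the inputs are the rows values from the EXPLAIN output above.

```python
from math import prod


def rows_product(row_estimates):
    """Multiply the per-table `rows` estimates reported by EXPLAIN."""
    return prod(row_estimates)


# rows values for CountryList and CityList in the unindexed case
unindexed_estimates = [239, 4079]
print(rows_product(unindexed_estimates))  # 974881
```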

    EXPLAIN shows that MySQL would need to check nearly a million row combinations to produce a query result that contains only 10 rows. Clearly, this query would benefit from the creation of indexes that allow the server to look up information faster.

    Good columns to index typically are those that you use for searching, grouping, or sorting records. The query does not have any GROUP BY or ORDER BY clauses, but it does use columns for searching. Specifically:

    • The query uses CountryList.Code and CityList.CountryCode to match records between tables.

    • The query uses CityList.Population to cull records that do not have a large enough population.

    To see the effect of indexing, try creating indexes on the columns used to join the tables. In the CountryList table, Code can serve as a primary key because it uniquely identifies each row. Add the index using ALTER TABLE:

    mysql> ALTER TABLE CountryList ADD PRIMARY KEY (Code);

    In the CityList table, the index on CountryCode must be nonunique because multiple cities can share the same country code:

    mysql> ALTER TABLE CityList ADD INDEX (CountryCode);

    After creating the indexes, EXPLAIN reports a somewhat different result:

    mysql> EXPLAIN SELECT CountryList.Name, CityList.Name, CityList.Population
        -> FROM CountryList, CityList
        -> WHERE CountryList.Code = CityList.CountryCode
        -> AND CityList.Population > 8000000\G
    *************************** 1. row ***************************
            table: CityList
             type: ALL
    possible_keys: CountryCode
              key: NULL
          key_len: NULL
              ref: NULL
             rows: 4079
            Extra: Using where
    *************************** 2. row ***************************
            table: CountryList
             type: eq_ref
    possible_keys: PRIMARY
              key: PRIMARY
          key_len: 3
              ref: CityList.CountryCode
             rows: 1
            Extra:

    Observe that EXPLAIN now lists the tables in a different order. CityList appears first, which indicates that MySQL will read rows from that table first and use them to search for matches in the second table, CountryList. The change in table processing order reflects the optimizer's use of the index information that is now available for executing the query.

    MySQL still will scan all rows of the CityList table (its type value is ALL), but now the server can use each of those rows to directly look up the corresponding CountryList row. This is seen by the information displayed for the CountryList table:

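    • The type value of eq_ref indicates that an equality test is performed by referring to the column named in the ref field, CityList.CountryCode, and that at most one CountryList row can match each CityList row because the lookup uses a unique index.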

    • The possible_keys value of PRIMARY shows that the optimizer sees the primary key as a candidate for optimizing the query, and the key field indicates that it will actually use the primary key when executing the query.
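    The difference between checking every row combination and looking up one row per key can be illustrated with a small Python sketch. This is a simplified model, not how the MySQL server is implemented: the miniature tables are hypothetical sample data, a dict stands in for the primary key index, and the function names are invented for the example.

```python
# Hypothetical miniature tables for illustration only; the real
# CountryList and CityList hold 239 and 4,079 rows respectively.
country_list = [("BRA", "Brazil"), ("IDN", "Indonesia"), ("FRA", "France")]
city_list = [
    ("BRA", "São Paulo", 9968485),
    ("IDN", "Jakarta", 9604900),
    ("FRA", "Paris", 2125246),
    ("BRA", "Rio de Janeiro", 5598953),
]


def join_by_full_scans(countries, cities, min_population):
    """Check every country/city combination (type ALL on both tables)."""
    checked = 0
    result = []
    for code, country_name in countries:
        for country_code, city_name, population in cities:
            checked += 1
            if code == country_code and population > min_population:
                result.append((country_name, city_name, population))
    return result, checked


def join_by_key_lookup(countries, cities, min_population):
    """Scan the cities once; fetch each country by its unique code (eq_ref)."""
    by_code = dict(countries)  # stands in for the primary key on Code
    checked = 0
    result = []
    for country_code, city_name, population in cities:
        checked += 1
        if population > min_population:
            result.append((by_code[country_code], city_name, population))
    return result, checked


full_rows, full_checked = join_by_full_scans(country_list, city_list, 8000000)
keyed_rows, keyed_checked = join_by_key_lookup(country_list, city_list, 8000000)
print(full_checked, keyed_checked)  # 12 4
```

    Both functions return the same two cities, but the keyed version examines one row per city rather than one per country/city pair, which mirrors the drop in the rows product from 239 * 4,079 to 4,079 * 1.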

    The result from EXPLAIN shows that indexing CountryList.Code as a primary key improves query performance. However, it still indicates a full scan of the CityList table. The optimizer sees that the index on CountryCode is available, but the key value of NULL indicates that it will not be used. Does that mean the index on the CountryCode column is of no value? It depends. For this query, the index is not used. In general, however, it's good to index joined columns, so you likely would find for other queries on the CityList table that the index does help.

    The product of the rows now is just 4,079. That's much better than 974,881, but perhaps further improvement is possible. The WHERE clause of the query restricts CityList rows based on their Population values, so try creating an index on that column:

    mysql> ALTER TABLE CityList ADD INDEX (Population);

    After creating the index, run EXPLAIN again:

    mysql> EXPLAIN SELECT CountryList.Name, CityList.Name, CityList.Population
        -> FROM CountryList, CityList
        -> WHERE CountryList.Code = CityList.CountryCode
        -> AND CityList.Population > 8000000\G
    *************************** 1. row ***************************
            table: CityList
             type: range
    possible_keys: CountryCode,Population
              key: Population
          key_len: 4
              ref: NULL
             rows: 78
            Extra: Using where
    *************************** 2. row ***************************
            table: CountryList
             type: eq_ref
    possible_keys: PRIMARY
              key: PRIMARY
          key_len: 3
              ref: CityList.CountryCode
             rows: 1
            Extra:

    The output for the CountryList table is unchanged compared to the previous step. That is not a surprise; MySQL already found that it could use a primary key for lookups, which is very efficient. On the other hand, the result for the CityList table is different. The optimizer now sees two indexes in the table as candidates. Furthermore, the key value shows that it will use the index on Population to look up records. This results in an improvement over a full scan, as seen in the change of the rows value from 4,079 to 78.
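    A range lookup of this kind can be sketched in Python, assuming a sorted list as a rough stand-in for the index on Population (a real index is a more elaborate structure). The first ten values are the populations from the query result shown earlier; the last three are hypothetical smaller values added for illustration, and range_scan is an invented name.

```python
from bisect import bisect_right

# Ten populations from the query result, plus three hypothetical
# smaller values that the range condition should skip.
populations = [
    9968485, 9604900, 10500000, 9696300, 9981619,
    8591309, 9269265, 8787958, 8389200, 8008278,
    2125246, 5598953, 461900,
]

# Keeping the values sorted is what lets a lookup jump to the right spot.
population_index = sorted(populations)


def range_scan(index, lower_bound):
    """Return entries strictly greater than lower_bound without a full scan."""
    start = bisect_right(index, lower_bound)  # binary search for the boundary
    return index[start:]


matches = range_scan(population_index, 8000000)
print(len(matches))  # 10
```

    Only the entries past the boundary are read, which is the idea behind the type value changing from ALL to range.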

    The query now is optimized. Note that the product of the rows values, 78, still is larger than the actual number of rows produced by the query (10 rows). This is because the rows values are only estimates. The optimizer cannot give an exact count without actually executing the query.

    To summarize:

    • With unindexed tables, the rows product was 974,881.

    • After indexing the join columns, the rows product dropped to 4,079, a 99.6% improvement.

    • After indexing the Population column, the rows product dropped to 78, a further improvement of 98.1% over the previous step.
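    The percentages in this summary follow directly from the rows products. A minimal Python check, with percent_reduction as an illustrative helper name:

```python
def percent_reduction(before, after):
    """Percentage by which the rows product shrank between two steps."""
    return (before - after) / before * 100


# rows products: unindexed, join columns indexed, Population indexed
rows_products = [974881, 4079, 78]

for before, after in zip(rows_products, rows_products[1:]):
    reduction = percent_reduction(before, after)
    print(f"{before} -> {after}: {reduction:.1f}% fewer row combinations")
```

    Running it prints reductions of 99.6% and 98.1%, matching the figures above.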

    The example shows that using indexes effectively can substantially reduce the work required by the server to execute a query, and that EXPLAIN is a useful tool for assessing the effect of indexing.



     
     




    © 2003-2009 by Developer Shed. All rights reserved. DS Cluster 2 Hosted by Hostway