MySQL Optimization, part 1
By: Sams Publishing
    2005-04-13

    Table of Contents:
  • MySQL Optimization, part 1
  • 6.1.4 The MySQL Benchmark Suite
  • 6.2.1 EXPLAIN Syntax (Get Information About a SELECT)
  • 6.2.2 Estimating Query Performance
  • 6.2.6 How MySQL Optimizes IS NULL
  • 6.2.9 How MySQL Optimizes ORDER BY
  • 6.2.12 Speed of INSERT Queries
  • 6.2.15 Other Optimization Tips



    MySQL Optimization, part 1 - 6.2.2 Estimating Query Performance



    In most cases, you can estimate the performance by counting disk seeks. For small tables, you can usually find a row in one disk seek (because the index is probably cached). For bigger tables, you can estimate that, using B-tree indexes, you will need this many seeks to find a row:
    log(row_count) / log(index_block_length / 3 * 2 / (index_length + data_pointer_length)) + 1.

    In MySQL, an index block is usually 1024 bytes and the data pointer is usually 4 bytes. For a 500,000-row table with an index length of 3 bytes (medium integer), the formula gives log(500,000)/log(1024/3*2/(3+4)) + 1, which is about 3.9, so roughly 4 seeks.
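
    The estimate above can be reproduced with a few lines of Python. This is only a sketch of the formula as stated in the text; the function name and its default arguments (1024-byte blocks, 4-byte data pointers) are illustrative choices, not part of MySQL.

```python
import math

def estimate_seeks(row_count, index_length,
                   index_block_length=1024, data_pointer_length=4):
    """Estimate B-tree disk seeks per row lookup, per the formula in the text."""
    # Keys that fit in one index block, assuming blocks are about 2/3 full.
    keys_per_block = index_block_length / 3 * 2 / (index_length + data_pointer_length)
    return math.log(row_count) / math.log(keys_per_block) + 1

# 500,000 rows with a 3-byte (medium integer) index.
seeks = estimate_seeks(500_000, 3)
print(f"{seeks:.2f} seeks, rounded to {round(seeks)}")
```

    For the example table the result is about 3.9, which is where the figure of 4 seeks comes from.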

    This index would require storage of about 500,000 * 7 * 3/2 = 5.2MB (assuming a typical index buffer fill ratio of 2/3), so you will probably have much of the index in memory and you will probably need only one or two calls to read data to find the row.
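
    The storage figure follows from the same inputs. A minimal sketch, assuming each index entry is the key plus a data pointer and that blocks are 2/3 full (the function name is hypothetical):

```python
def estimate_index_bytes(row_count, index_length,
                         data_pointer_length=4, fill_ratio=2 / 3):
    """Approximate index size in bytes for the given row count."""
    entry_bytes = index_length + data_pointer_length
    # Partially filled blocks inflate the raw size by 1 / fill_ratio.
    return row_count * entry_bytes / fill_ratio

size = estimate_index_bytes(500_000, 3)
print(f"{size / 1e6:.2f} MB")
```

    With 500,000 rows and 7-byte entries this comes to about 5.25 million bytes, in line with the rounded figure in the text.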

    For writes, however, you will need four seek requests (as above) to find where to place the new index value and normally two seeks to update the index and write the row.

    Note that the preceding discussion doesn't mean that your application performance will slowly degenerate by log N. As long as everything is cached by the OS or SQL server, things will become only marginally slower as the table gets bigger. After the data gets too big to be cached, things will start to go much slower until your application is bound only by disk seeks (which increase by log N). To avoid this, increase the key cache size as the data grows. For MyISAM tables, the key cache size is controlled by the key_buffer_size system variable. See Section 6.5.2, "Tuning Server Parameters."
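
    To see how slowly the seek count grows, the seek formula can be inverted to ask how many rows fit within a given number of seeks. The sketch below assumes the same 1024-byte blocks, 4-byte pointers, and 3-byte index as the earlier example; rows_reachable is a hypothetical helper name.

```python
def rows_reachable(seeks, index_length=3,
                   index_block_length=1024, data_pointer_length=4):
    """Invert the seek formula: rows addressable within `seeks` seeks."""
    keys_per_block = index_block_length / 3 * 2 / (index_length + data_pointer_length)
    # seeks = log(rows) / log(keys_per_block) + 1, solved for rows.
    return keys_per_block ** (seeks - 1)

for seeks in range(2, 6):
    print(seeks, f"{rows_reachable(seeks):,.0f}")
```

    Under these assumptions each additional seek covers roughly 97 times more rows, which is why uncached lookups degrade only logarithmically with table size.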

    6.2.3 Speed of SELECT Queries

    In general, when you want to make a slow SELECT ... WHERE query faster, the first thing to check is whether you can add an index. All references between different tables should usually be done with indexes. You can use the EXPLAIN statement to determine which indexes are used for a SELECT. See Section 6.4.5, "How MySQL Uses Indexes," and Section 6.2.1, "EXPLAIN Syntax (Get Information About a SELECT)."

    Some general tips for speeding up queries on MyISAM tables:

    • To help MySQL optimize queries better, use ANALYZE TABLE or run myisamchk --analyze on a table after it has been loaded with data. This updates a value for each index part that indicates the average number of rows that have the same value. (For unique indexes, this is always 1.) MySQL will use this to decide which index to choose when you join two tables based on a non-constant expression. You can check the result from the table analysis by using SHOW INDEX FROM tbl_name and examining the Cardinality value. myisamchk --description --verbose shows index distribution information.

    • To sort the index and the data rows according to an index, use myisamchk --sort-index --sort-records=1 (if you want to sort on index 1). This is a good way to make queries faster if you have a unique index from which you want to read all records in order according to the index. Note that the first time you sort a large table this way, it may take a long time.
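
    The statistic that table analysis records (the average number of rows sharing the same index value) can be illustrated in plain Python. This is a sketch of the idea only, not MySQL's implementation; the sample columns and the function name are made up for the example.

```python
def average_rows_per_value(column_values):
    """Average number of rows that share the same value in a column."""
    distinct_values = set(column_values)
    return len(column_values) / len(distinct_values)

unique_ids = [1, 2, 3, 4, 5, 6]                            # like a unique index
country_codes = ["se", "se", "se", "us", "us", "fi"]       # repeated values

print(average_rows_per_value(unique_ids))     # always 1 for unique values
print(average_rows_per_value(country_codes))  # 6 rows / 3 distinct values
```

    A lower average means the index narrows a lookup to fewer rows, which is the kind of information the optimizer weighs when choosing between indexes for a join.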

    6.2.4 How MySQL Optimizes WHERE Clauses

    This section discusses optimizations that can be made for processing WHERE clauses. The examples use SELECT statements, but the same optimizations apply for WHERE clauses in DELETE and UPDATE statements.

    Note that work on the MySQL optimizer is ongoing, so this section is incomplete. MySQL does many optimizations, not all of which are documented here.

    Some of the optimizations performed by MySQL are listed here:

    • Removal of unnecessary parentheses:
        ((a AND b) AND c OR (((a AND b) AND (c AND d))))
      -> (a AND b AND c) OR (a AND b AND c AND d)
    • Constant folding:
        (a<b AND b=c) AND a=5
      -> b>5 AND b=c AND a=5
    • Constant condition removal (needed because of constant folding):
        (B>=5 AND B=5) OR (B=6 AND 5=5) OR (B=7 AND 5=6)
      -> B=5 OR B=6
    • Constant expressions used by indexes are evaluated only once.

    • COUNT(*) on a single table without a WHERE is retrieved directly from the table information for MyISAM and HEAP tables. This is also done for any NOT NULL expression when used with only one table.

    • Early detection of invalid constant expressions. MySQL quickly detects that some SELECT statements are impossible and returns no rows.

    • HAVING is merged with WHERE if you don't use GROUP BY or group functions (COUNT(), MIN(), and so on).

    • For each table in a join, a simpler WHERE is constructed to get a fast WHERE evaluation for the table and also to skip records as soon as possible.

    • All constant tables are read first before any other tables in the query. A constant table is any of the following:

      • An empty table or a table with one row.

      • A table that is used with a WHERE clause on a PRIMARY KEY or a UNIQUE index, where all index parts are compared to constant expressions and are defined as NOT NULL.

      All of the following tables are used as constant tables:

      SELECT * FROM t WHERE primary_key=1;
      SELECT * FROM t1,t2
      WHERE t1.primary_key=1 AND t2.primary_key=t1.id;

    • The best join combination for joining the tables is found by trying all possibilities. If all columns in ORDER BY and GROUP BY clauses come from the same table, that table is preferred first when joining.

    • If there is an ORDER BY clause and a different GROUP BY clause, or if the ORDER BY or GROUP BY contains columns from tables other than the first table in the join queue, a temporary table is created.

    • If you use SQL_SMALL_RESULT, MySQL uses an in-memory temporary table.

    • Each table index is queried, and the best index is used unless the optimizer believes that it will be more efficient to use a table scan. At one time, a scan was used based on whether the best index spanned more than 30% of the table. Now the optimizer is more complex and bases its estimate on additional factors such as table size, number of rows, and I/O block size, so a fixed percentage no longer determines the choice between using an index or a scan.

    • In some cases, MySQL can read rows from the index without even consulting the data file. If all columns used from the index are numeric, only the index tree is used to resolve the query.

    • Before each record is output, those that do not match the HAVING clause are skipped.
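
    The first three rewrites in the list above (parenthesis removal, constant folding, and constant condition removal) preserve the meaning of the condition. One way to convince yourself is a brute-force check over a small domain, sketched below in Python. The helper name and the test domains are assumptions for illustration, and NULL values (which follow three-valued logic in SQL) are deliberately left out.

```python
from itertools import product

def equivalent(original, rewritten, domain, arity):
    """True if both predicates agree for every combination of inputs."""
    return all(original(*args) == rewritten(*args)
               for args in product(domain, repeat=arity))

# Removal of unnecessary parentheses.
parentheses_ok = equivalent(
    lambda a, b, c, d: ((a and b) and c or (((a and b) and (c and d)))),
    lambda a, b, c, d: (a and b and c) or (a and b and c and d),
    [False, True], 4)

# Constant folding: (a<b AND b=c) AND a=5 -> b>5 AND b=c AND a=5
folding_ok = equivalent(
    lambda a, b, c: (a < b and b == c) and a == 5,
    lambda a, b, c: b > 5 and b == c and a == 5,
    range(10), 3)

# Constant condition removal.
removal_ok = equivalent(
    lambda B: (B >= 5 and B == 5) or (B == 6 and 5 == 5) or (B == 7 and 5 == 6),
    lambda B: B == 5 or B == 6,
    range(10), 1)

print(parentheses_ok, folding_ok, removal_ok)
```

    A check like this only samples a finite domain, so it demonstrates the equivalences rather than proving them.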

    Some examples of queries that are very fast:

    SELECT COUNT(*) FROM tbl_name;
    SELECT MIN(key_part1),MAX(key_part1) FROM tbl_name;
    SELECT MAX(key_part2) FROM tbl_name
    WHERE key_part1=constant;
    SELECT ... FROM tbl_name
    ORDER BY key_part1,key_part2,... LIMIT 10;
    SELECT ... FROM tbl_name
    ORDER BY key_part1 DESC, key_part2 DESC, ... LIMIT 10;
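
    These queries are fast because the answer can be read from one end of a sorted index or from a short prefix of it. The Python sketch below uses a sorted list of tuples as a toy stand-in for a composite index on (key_part1, key_part2); it is an assumption-laden model, not how MyISAM stores B-trees, and the function names are hypothetical.

```python
from bisect import bisect_right

# Toy composite index on (key_part1, key_part2), kept in sorted order.
index = sorted([(1, 40), (2, 10), (2, 35), (2, 70), (3, 5), (3, 90)])

def max_key_part2(index, key_part1):
    """MAX(key_part2) WHERE key_part1 = constant: the last matching entry."""
    pos = bisect_right(index, (key_part1, float("inf")))
    if pos == 0 or index[pos - 1][0] != key_part1:
        return None  # no entry with this key_part1
    return index[pos - 1][1]

def first_n_in_order(index, n):
    """ORDER BY key_part1, key_part2 LIMIT n: a prefix of the sorted index."""
    return index[:n]

print(index[0][0], index[-1][0])   # MIN(key_part1), MAX(key_part1)
print(max_key_part2(index, 2))     # found by binary search, no scan
print(first_n_in_order(index, 3))  # no separate sorting pass needed
```

    The DESC variant corresponds to reading the same list from the other end.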

    The following queries are resolved using only the index tree, assuming that the indexed columns are numeric:

    SELECT key_part1,key_part2 FROM tbl_name WHERE key_part1=val;
    SELECT COUNT(*) FROM tbl_name
    WHERE key_part1=val1 AND key_part2=val2;
    SELECT key_part2 FROM tbl_name GROUP BY key_part1;

    The following queries use indexing to retrieve the rows in sorted order without a separate sorting pass:

    SELECT ... FROM tbl_name
    ORDER BY key_part1,key_part2,... ;
    SELECT ... FROM tbl_name
    ORDER BY key_part1 DESC, key_part2 DESC, ... ;

    6.2.5 How MySQL Optimizes OR Clauses

    The Index Merge method is used to retrieve rows with several ref, ref_or_null, or range scans and merge the results into one. This method is employed when the table condition is a disjunction of conditions for which ref, ref_or_null, or range could be used with different keys.

    This "join" type optimization is new in MySQL 5.0.0 and represents a significant change in behavior with regard to indexes: the old rule was that the server could use at most one index for each referenced table.

    In EXPLAIN output, this method appears as index_merge in the type column. In this case, the key column contains a list of indexes used, and key_len contains a list of the longest key parts for those indexes.

    Examples:

    SELECT * FROM tbl_name WHERE key_part1 = 10 OR key_part2 = 20;
    SELECT * FROM tbl_name
    WHERE (key_part1 = 10 OR key_part2 = 20) AND non_key_part=30;
    SELECT * FROM t1,t2
    WHERE (t1.key1 IN (1,2) OR t1.key2 LIKE 'value%')
    AND t2.key1=t1.some_col;
    SELECT * FROM t1,t2
    WHERE t1.key1=1
    AND (t2.key1=t1.some_col OR t2.key2=t1.some_col2);
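
    Conceptually, Index Merge looks up each side of the OR in its own index, unions the resulting row sets, and then applies any remaining conditions. The Python sketch below models the second example query under the assumption that key_part1 and key_part2 each have a separate index; the sample rows and the build_index helper are invented for illustration and do not reflect MySQL internals.

```python
from collections import defaultdict

rows = {
    1: {"key_part1": 10, "key_part2": 5, "non_key_part": 30},
    2: {"key_part1": 7, "key_part2": 20, "non_key_part": 30},
    3: {"key_part1": 10, "key_part2": 20, "non_key_part": 99},
    4: {"key_part1": 8, "key_part2": 9, "non_key_part": 30},
}

def build_index(rows, column):
    """Map each column value to the set of row ids holding it."""
    index = defaultdict(set)
    for row_id, row in rows.items():
        index[row[column]].add(row_id)
    return index

idx1 = build_index(rows, "key_part1")
idx2 = build_index(rows, "key_part2")

# WHERE (key_part1 = 10 OR key_part2 = 20) AND non_key_part = 30
candidates = idx1.get(10, set()) | idx2.get(20, set())  # merge two index lookups
matches = sorted(r for r in candidates if rows[r]["non_key_part"] == 30)
print(matches)
```

    Row 3 satisfies both sides of the OR but appears only once in the union, and row 4 is never touched because neither index lookup returns it.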

    This article is excerpted from MySQL Administrator's Guide, by MySQL AB (editor) (Sams, 2004; ISBN 0672326345).
