Optimizing for Query Speed

Optimzing your queries can help them run more efficiently, which can save a significant amount of time. This article covers index optimization and index usage. It is excerpted from chapter 13 of the MySQL Certification Guide, written by Paul Dubois et al. (Sams, 2005; ISBN: 0672328127).

This chapter discusses general principles that are useful for optimizing queries to run more efficiently. It covers the following optimization strategies:

  • The primary optimization technique for reducing lookup times is to create good indexes. This is true not just for retrievals (SELECT statements); indexing reduces row lookup time for UPDATE and DELETE statements as well. You should know general principles for creating useful indexes and for avoiding unnecessary ones.

  • The EXPLAIN statement provides information about how the MySQL optimizer processes queries. This is of value when you’re trying to determine how to make a query run better (for example, if you suspect indexes are not being used as you think they should be).

  • The way a query is written might prevent indexes from being used even if they are available. Rewriting the query often will allow the optimizer to use an index and process a query faster. Other times you can use query modifiers to give the scheduler a hint about how to execute a query.

  • In some cases, query processing for a task can be improved by using a different database design. This includes techniques such as choosing a storage engine with properties that best match application requirements and using summary tables.

Questions on the material in this chapter make up approximately 15% of the exam.

Why be concerned about optimization? The most obvious reason is to make your queries run faster. Another is that optimizing your queries helps everybody who uses the server, not just you. When the server runs more smoothly and does more with less work, it performs better as a whole:

  • A query that takes less time to run doesn’t hold locks as long. Other clients that are waiting to update a table have to wait less time for a fast query than a slow one. This reduces the chance of a query backlog building up.

  • A query might be slow because it does not use indexes and therefore MySQL must scan a table in its entirety. For a large table, that involves a lot of processing and disk activity. This extra overhead affects not only your own query, it takes machine resources that could be devoted to processing other queries. Adding effective indexes allows MySQL to read only the relevant parts of the table, which is quicker and less disk intensive.

The optimization strategies covered here are guidelines known to result in generally improved query performance. However, you must test them in specific circumstances and measure the results, particularly if you can choose from more than one technique in a given situation.

The techniques discussed in this chapter can be used by any client application to improve how the queries it issues are executed by the server. Another approach to performance improvement is to reconfigure the server itself to change its overall operation. Server tuning is addressed in Chapter 16, “Advanced Server Features.”

{mospagebreak title=13.1 Index Optimization and Index Usage}

When you create a table, consider whether it should have indexes, because they have important benefits:

  • Indexes contain sorted values. This allows MySQL to find rows containing particular values faster. The effect can be particularly dramatic for joins, which have the potential to require many combinations of rows to be examined.

  • Indexes result in less disk I/O. The server can use an index to go directly to the relevant table records, which reduces the number of records it needs to read. Furthermore, if a query displays information only from indexed columns, MySQL might be able to process it by reading only the indexes and without accessing data rows at all.

13.1.1 Types of Indexes

MySQL supports four types of indexes:

  • A nonunique index is one in which any key value may occur multiple times. This type of index is defined with the keyword INDEX or KEY.

  • A UNIQUE index is unique-valued; that is, every key value is required to be different than all others. (The exception is that NULL values may occur multiple times.)

  • A PRIMARY KEY also is a unique-valued index. It is similar to a UNIQUE index, but has additional restrictions:

    • A table may have multiple UNIQUE indexes, but at most one PRIMARY KEY.

    • A UNIQUE index can contain NULL values, whereas a PRIMARY KEY cannot.

  • A FULLTEXT index is specially designed for text searching.

To define indexes when you’re initially creating a table, use CREATE TABLE. To add indexes to an already existing table, use ALTER TABLE or CREATE INDEX. To drop indexes, use ALTER TABLE or DROP INDEX.

ALTER TABLE can add or drop several indexes in the same statement, which is faster than processing each one separately. CREATE INDEX and DROP INDEX allow only one index to be added or dropped at a time.

Index creation using the INDEX, UNIQUE, and PRIMARY KEY keywords is discussed in the “Core Study Guide.” FULLTEXT indexes are not covered there because they are a more specialized kind of index. Instead, FULLTEXT indexing is discussed in section 13.1.4, “FULLTEXT Indexes.”

{mospagebreak title=13.1.2 Obtaining Table Index Information}

To find out what indexes a table has, use SHOW CREATE TABLE to display the CREATE TABLE statement that corresponds to the table structure, including its indexes.

For more detailed information about the indexes, use SHOW INDEX. For example, SHOW INDEX produces the following output for the Country table of the world database:

mysql> SHOW INDEX FROM CountryG
********************* 1. row **********************
    Table: Country
 Non_unique: 0
  Key_name: PRIMARY
Seq_in_index: 1
 Column_name: Code
  Collation: A
 Cardinality: NULL
  Sub_part: NULL
   Packed: NULL
    Null:
 Index_type: BTREE
   Comment:

The output indicates that the table has a single index, a primary key on the Code column. The output for the City table is similar except that it indicates the ID column is the primary key:

mysql> SHOW INDEX FROM CityG
********************* 1. row **********************
    Table: City
 Non_unique: 0
  Key_name: PRIMARY
Seq_in_index: 1
 Column_name: ID
  Collation: A
 Cardinality: NULL
  Sub_part: NULL
   Packed: NULL
    Null:
 Index_type: BTREE
   Comment:

For the CountryLanguage table, the output has two rows because the primary key includes two columns, Country and Language:

mysql> SHOW INDEX FROM CountryLanguageG
********************* 1. row ***************************
    Table: CountryLanguage
 Non_unique: 0
  Key_name: PRIMARY
Seq_in_index: 1
 Column_name: Country
  Collation: A
 Cardinality: NULL
  Sub_part: NULL
   Packed: NULL
    Null:
 Index_type: BTREE
   Comment:
******************** 2. row ***************************
    Table: CountryLanguage
 Non_unique: 0
  Key_name: PRIMARY
Seq_in_index: 2
 Column_name: Language
  Collation: A
 Cardinality: NULL
  Sub_part: NULL
   Packed: NULL
    Null:
 Index_type: BTREE
   Comment:

The Seq_in_index values show the order of the columns within the index. They indicate that the primary key columns are Country first and Language second. This information corresponds to the following PRIMARY KEY declaration:

PRIMARY KEY (Country, Language)
{mospagebreak title=13.1.3 Using Indexes}

An index helps MySQL perform retrievals more quickly than if no index is used. But indexes can be used with varying degrees of success, so you should keep several index-related considerations in mind when designing tables:

  • Declare an indexed column NOT NULL if possible. Although NULL values can be indexed, NULL is a special value that requires additional decisions when performing comparisons on key values. An index without NULL can be processed more simply and thus faster.

  • Avoid overindexing; don’t index a column just because you can. If you never refer to a column in comparisons (such as in WHERE, ORDER BY, or GROUP BY clauses), there’s no need to index it. Another reason to avoid unnecessary indexing is that every index you create slows down table updates. If you insert or delete a row, an entry must be added to or removed from each of the table’s indexes. If you update a row, any change to indexed columns require the appropriate indexes to be updated as well.

  • One strategy the MySQL optimizer uses is that if it appears an index will return a large percentage of the records in the table, it will be just as fast to scan the table as to incur the overhead required to process the index. As a consequence, an index on a column that has very few distinct values is unlikely to do much good. Suppose that a column is declared as ENUM('Y','N') and the values are roughly evenly distributed such that a search for either value returns about half of the records. In this case, an index on the column is unlikely to result in faster queries.

  • Choose unique and nonunique indexes appropriately. The choice might be influenced by the type of a column. If the column is declared as an ENUM, there is a fixed number of distinct column values that can be stored in it. This number is equal to the number of enumeration elements, plus one for the '' (empty string) element that is used when you attempt to store an illegal value. Should you choose to index an ENUM column, you likely should create a nonunique index. A PRIMARY KEY allows only as many rows as the number of distinct enumeration values. A UNIQUE index enforces a similar restriction, except that unless the column is declared NOT NULL, the index allows NULL values.

  • Index a column prefix rather than the entire column. MySQL caches index blocks in memory to avoid whenever possible reading them from disk repeatedly. Shortening the length of indexed values can improve performance by reducing the amount of disk I/O needed to read the index and by increasing the number of key values that fit into the key cache. This technique is discussed further in section 13.1.3.1, “Indexing Column Prefixes.”

  • Avoid creating multiple indexes that overlap (have the same initial columns). This is wasteful because MySQL can use a multiple-column index even when a query uses just the initial columns for lookups. For more information, see section 13.1.3.2, “Leftmost Index Prefixes.”

{mospagebreak title=13.1.3.1 Indexing Column Prefixes}

Short index values can be processed more quickly than long ones. Therefore, when you index a column, it’s worth asking whether it’s sufficient to index partial column values rather than complete values. This technique, known as indexing a column prefix, can be applied to string column types.

Suppose that you’re considering creating a table using this definition:

CREATE TABLE t
(
  name CHAR(255),
  INDEX (name)
);

If you index all 255 bytes of the values in the name column, index processing will be relatively slow:

  • It’s necessary to read more information from disk.

  • Longer values take longer to compare.

  • The index cache is not as effective because fewer index values fit into it at a time.

It’s often possible to overcome these problems by indexing only a prefix of the column values. For example, if you expect column values to be distinct most of the time in the first 15 bytes, index only that many bytes of each value, not all 255 bytes.

To specify a prefix length for a column, follow the column name in the index definition by a number in parentheses. The following table definition is the same as the previous one, except that the index uses just the first 15 bytes of each column value:

CREATE TABLE t
(
  name CHAR(255),
  INDEX (name(15))
);

Indexing a column prefix can speed up query processing, but works best when the prefix values tend to have about the same amount of uniqueness as the original values. Don’t use such a short prefix that you produce a very high frequency of duplicate values in the index. It might require some testing to find the optimal balance between long index values that provide good uniqueness versus shorter values that compare more quickly but have more duplicates. To determine the number of records in the table, the number of distinct values in the column, and the number of duplicates, use this query:

SELECT
  COUNT(*) AS 'Total Rows',
  COUNT(DISTINCT name) AS 'Distinct Values',
  COUNT(*) - COUNT(DISTINCT name) AS 'Duplicate
Values' FROM t;

That gives you an estimate of the amount of uniqueness in the name values. Then run a similar query on the prefix values:

SELECT
  COUNT(DISTINCT LEFT(name,n)) AS 'Distinct Prefix
Values', COUNT(*) - COUNT(DISTINCT LEFT(name,n)) AS
'Duplicate Prefix Values' FROM t;

That tells you how the uniqueness characteristics change when you use an n-byte prefix of the name values. Run the query with different values of n to determine an acceptable prefix length.

Note that when an index on a full column is a PRIMARY KEY or UNIQUE index, you might have to change the index to be nonunique if you decide to index prefix values instead. If you index partial column values and require the index to be unique, that means the prefix values must be unique, too.

{mospagebreak title=13.1.3.2 Leftmost Index Prefixes}

In a table that has a composite (multiple column) index, MySQL can use leftmost index prefixes of that index. A leftmost prefix of a composite index consists of one or more of the initial columns of the index. MySQL’s capability to use leftmost index prefixes enables you to avoid creating unnecessary indexes.

The CountryLanguage table in the world database provides an example of how a leftmost prefix applies. The table has a two-part primary key:

mysql> SHOW INDEX FROM CountryLanguageG
******************* 1. row ***************************
    Table: CountryLanguage
 Non_unique: 0
  Key_name: PRIMARY
Seq_in_index: 1
 Column_name: CountryCode
  Collation: A
 Cardinality: 246
  Sub_part: NULL
   Packed: NULL
    Null:
 Index_type: BTREE
   Comment:
******************* 2. row ***************************
    Table: CountryLanguage
 Non_unique: 0
  Key_name: PRIMARY
Seq_in_index: 2
 Column_name: Language
  Collation: A
 Cardinality: 984
  Sub_part: NULL
   Packed: NULL
    Null:
 Index_type: BTREE
   Comment:

The index on the CountryCode and Language columns allows records to be looked up quickly based on a given country name and language. However, MySQL also can use the index given just a country code. Suppose that you want to determine which languages are spoken in France:

SELECT * FROM CountryLanguage WHERE CountryCode =
'FRA';

MySQL can see that CountryCode is a leftmost prefix of the primary key and use it as though it were a separate index. This means there’s no need to define a second index on the CountryCode column alone.

On the other hand, if you want to perform indexed searches using just the Language column of the CountryLanguage table, you do need to create a separate index because Language is not a leftmost prefix of the existing index.

Note that a leftmost prefix of an index and an index on a column prefix are two different things. A leftmost prefix of an index consists of leading columns in a multiple-column index. An index on a column prefix indexes the leading bytes of values in the column.

{mospagebreak title=13.1.4 FULLTEXT Indexes}

FULLTEXT indexes are designed to make text searching fast and easy. They have the following characteristics:

  • FULLTEXT indexes currently are supported only for MyISAM tables.

  • Each column must be either CHAR or VARCHAR without the BINARY option, or one of the TEXT types. You cannot use a binary string column type such as CHAR BINARY, VARCHAR BINARY, or BLOB.

  • FULLTEXT indexes are not case sensitive. This is a consequence of the fact that the index can include only nonbinary string columns.

  • The syntax for defining a full-text index is much like that for other indexes: An index-type keyword (FULLTEXT), an optional index name, and a parenthesized list of one or more column names to be indexed. A FULLTEXT index may be created with CREATE TABLE, added to a table with ALTER TABLE or CREATE INDEX, and dropped from a table with ALTER TABLE or DROP INDEX. The following are all legal statements for FULLTEXT index manipulation:

    CREATE TABLE t (name CHAR(40),
    FULLTEXT (name)); ALTER TABLE t ADD FULLTEXT name_idx (name); ALTER TABLE t DROP INDEX name_idx; CREATE FULLTEXT INDEX name_idx ON t (name); DROP INDEX name_idx ON t;
  • Column prefixes are not applicable for FULLTEXT indexes, which always index entire columns. If you specify a prefix length for a column in a FULLTEXT index, MySQL ignores it.

  • FULLTEXT index indexes can be constructed on multiple columns, allowing searches to be conducted simultaneously on all the indexed columns. However, leftmost index prefixes are not applicable for FULLTEXT indexes. You must construct one index for every column or combination of columns you want to search. Suppose that you want to search for text sometimes only in column c1 and sometimes in both columns c1 and c2. You must construct two FULLTEXT indexes: one on column c1 and another on columns c1 and c2.

To perform a FULLTEXT search, use MATCH and AGAINST(). For example, to search the table t for records that contain 'Wendell' in the name column, use this query:

SELECT * FROM t WHERE MATCH(name)
AGAINST('Wendell');

The MATCH operator names the column or columns you want to search. As mentioned earlier, there must be a FULLTEXT index on exactly those columns. If you want to search different sets of columns, you’ll need one FULLTEXT index for each set. If a table people has name and address columns and you want to search them either separately or together, three FULLTEXT indexes are needed:

CREATE TABLE people
(
  name  CHAR(40),
  address CHAR(40),
  FULLTEXT (name),        # index for searching
name only FULLTEXT (address), # index for searching
address only FULLTEXT (name,address) # index for searching
name and address );

The indexes allow queries such as the following to be formulated:

SELECT * FROM people WHERE MATCH(name)
AGAINST('string'); SELECT * FROM people WHERE MATCH(address)
AGAINST('string'); SELECT * FROM people WHERE MATCH(name,address)
AGAINST('string');

The preceding discussion summary of FULLTEXT indexing and searching is very brief. More information may be found in the MySQL Reference Manual.

Please check back next week for the continuation of this article.

[gp-comments width="770" linklove="off" ]

chat sex hikayeleri Ensest hikaye