Optimizing the Logical Database Structure

In this final article of our series, you will learn more ways of organizing your data to achieve better query performance. It is excerpted from chapter 13 of the MySQL Certification Guide, written by Paul Dubois et al. (Sams, 2005; ISBN: 0672328127).

13.4 Optimizing the Logical Database Structure

This section describes techniques for organizing data that can help you achieve better query performance.

13.4.1 Choosing Appropriate Table Types

When creating a table, ask yourself what types of queries you’ll use it for. Then choose a table type that uses a locking level appropriate for the anticipated query mix. MyISAM table-level locking works best for a query mix that is heavily skewed toward retrievals and includes few updates. Use InnoDB if you must process a query mix containing many updates. InnoDB’s use of row-level locking and multi-versioning provides good concurrency for a mix of retrievals and updates. One query can update rows while other queries read or update different rows of the table.
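For example, you select the storage engine with the TYPE table option when creating the table (the table and column names here are illustrative, not from the book's sample database):

mysql> CREATE TABLE CityLookup (
  ->   ID INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
  ->   Name CHAR(35) NOT NULL
  -> ) TYPE=MyISAM;
mysql> CREATE TABLE CityEdits (
  ->   ID INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
  ->   CityID INT UNSIGNED NOT NULL,
  ->   NewPopulation INT NOT NULL
  -> ) TYPE=InnoDB;

The first table is intended for a read-heavy query mix, where MyISAM's table-level locking is acceptable; the second receives frequent updates, where InnoDB's row-level locking avoids contention.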

If you’re using MyISAM tables, choose their structure to reflect whether you consider processing speed or efficient disk usage to be more important. Different MyISAM storage formats have different performance characteristics. This influences whether you choose fixed-length or variable-length columns to store string data:

  • Use fixed-length columns (CHAR) for best speed. Fixed-length columns allow MySQL to create the table with fixed-length rows. The advantage is that fixed-length rows all are stored in the table at positions that are a multiple of the row length and can be looked up very quickly. The disadvantage is that fixed-length values are always the same length even for values that do not use the full width of the column, so the column takes more storage space.

  • Use variable-length columns (VARCHAR, TEXT, BLOB) for best use of disk space. For example, values in a VARCHAR column take only as much space as necessary to store each value and on average use less storage than a CHAR column. The disadvantage is that variable-length columns result in variable-length rows. These are not stored at fixed positions within the table, so they cannot be retrieved as quickly as fixed-length rows. In addition, the contents of variable-length rows might not even be stored all in one place, another source of processing overhead.

For InnoDB tables, it is also true that CHAR columns take more space on average than VARCHAR. But there is no retrieval speed advantage for InnoDB as there is with MyISAM, because the InnoDB engine implements storage for both CHAR and VARCHAR in a similar way. In fact, retrieval of CHAR values might be slower because on average they require more information to be read from disk.

If a MyISAM table contains a mix of fixed-length and variable-length columns, the table format will be dynamic. However, if many of the queries on the table access only its fixed-length columns, it is sometimes possible to gain the advantages of both static tables (faster retrieval) and dynamic tables (lower storage requirements) by splitting the table into two tables. Use a fixed-format table to hold the fixed-length columns and a dynamic-format table to hold the variable-length columns. To split the table into two, use this procedure:

  1. Make sure that the table contains a primary key that allows each record to be uniquely identified. (You might use an AUTO_INCREMENT column, for example.)

  2. Create a second table that has columns for all the variable-length columns in the original table, plus a column to store values from the primary key of the original table. (This column should be a primary key as well, but should not be an AUTO_INCREMENT column.)

  3. Copy the primary key column and the variable-length columns from the original table to the second table. The second table will be dynamic.

  4. Use ALTER TABLE to drop the variable-length columns (but not the primary key) from the original table. MySQL will notice that the table no longer contains any variable-length columns and convert it to static format.

After modifying the table structure this way, queries that retrieve only fixed-width columns can use the static table, and will be quicker. For queries that retrieve both fixed-width and variable-width columns, join the two tables using the primary key values to match up rows.
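The four-step procedure might look like this in practice; the names t, t_var, id, and comments are hypothetical, standing in for an original table t with an AUTO_INCREMENT primary key id and one variable-length column comments:

mysql> CREATE TABLE t_var (
  ->   id INT UNSIGNED NOT NULL PRIMARY KEY,
  ->   comments VARCHAR(255) NOT NULL
  -> ) TYPE=MyISAM;
mysql> INSERT INTO t_var (id, comments)
  -> SELECT id, comments FROM t;
mysql> ALTER TABLE t DROP comments;

After the ALTER TABLE, t contains only fixed-length columns and converts to static format; t_var holds the variable-length data, keyed by the same id values for use in joins.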

Another option with MyISAM tables is to use compressed read-only tables.

For more information about MyISAM table structure, see section 14.2.1, “MyISAM Storage Formats.”

MERGE tables can use a mix of compressed and uncompressed tables. This can be useful for time-based records. For example, if you log records each year to a different log file, you can use an uncompressed log table for the current year so that you can update it, but compress the tables for past years to save space. If you then create a MERGE table from the collection, you can easily run queries that search all tables together.
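As a sketch (the yearly log table names are hypothetical), you would compress the closed tables with the myisampack utility, rebuild their indexes with myisamchk, and then declare a MERGE table over the collection; all tables in the UNION list must have identical structure:

shell> myisampack log_2003 log_2004
shell> myisamchk -rq log_2003 log_2004

mysql> CREATE TABLE log_all (
  ->   logged DATETIME NOT NULL,
  ->   msg CHAR(100) NOT NULL
  -> ) TYPE=MERGE UNION=(log_2003,log_2004,log_2005)
  -> INSERT_METHOD=LAST;

Here log_2005 remains uncompressed so that the current year can still be updated, and a query against log_all searches all three underlying tables.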

13.4.2 Using Summary Tables

Suppose that you run an analysis consisting of a set of retrievals that each perform a complex SELECT of a set of records (perhaps using an expensive join), and that differ only in the way they summarize the records. That’s inefficient because it unnecessarily does the work of selecting the records repeatedly. A better technique is to select the records once, and then use them to generate the summaries. In such a situation, consider the following strategy:

  1. Select the set of to-be-summarized records into a temporary table. In MySQL, you can do this easily with a CREATE TEMPORARY TABLE … SELECT statement.

  2. Create any appropriate indexes on the temporary table.

  3. Calculate the summaries using the temporary table.

The following example creates a summary table containing the average GNP value of countries in each continent. Then it compares the summary information to individual countries to find those countries with a GNP much less than the average and much more than the average.

First, create the summary table:

mysql> CREATE TABLE ContinentGNP
  -> SELECT Continent, AVG(GNP) AS AvgGNP
  -> FROM Country GROUP BY Continent;
mysql> SELECT * FROM ContinentGNP;
+---------------+---------------+
| Continent     | AvgGNP        |
+---------------+---------------+
| Asia          | 150105.725490 |
| Europe        | 206497.065217 |
| North America | 261854.789189 |
| Africa        |  10006.465517 |
| Oceania       |  14991.953571 |
| Antarctica    |      0.000000 |
| South America | 107991.000000 |
+---------------+---------------+

Next, compare the summary table to the original table to find countries that have a GNP less than 1% of the continental average:

mysql> SELECT
  ->   Country.Continent, Country.Name,
  ->   Country.GNP AS CountryGNP,
  ->   ContinentGNP.AvgGNP AS ContinentAvgGNP
  -> FROM Country, ContinentGNP
  -> WHERE
  ->   Country.Continent = ContinentGNP.Continent
  ->   AND Country.GNP < ContinentGNP.AvgGNP * .01
  -> ORDER BY Country.Continent, Country.Name;
+-----------+---------------------------+------------+-----------------+
| Continent | Name                      | CountryGNP | ContinentAvgGNP |
+-----------+---------------------------+------------+-----------------+
| Asia      | Bhutan                    |     372.00 |  150105.725490  |
| Asia      | East Timor                |       0.00 |  150105.725490  |
| Asia      | Laos                      |    1292.00 |  150105.725490  |
| Asia      | Maldives                  |     199.00 |  150105.725490  |
| Asia      | Mongolia                  |    1043.00 |  150105.725490  |
| Europe    | Andorra                   |    1630.00 |  206497.065217  |
| Europe    | Faroe Islands             |       0.00 |  206497.065217  |
| Europe    | Gibraltar                 |     258.00 |  206497.065217  |
| Europe    | Holy See (Vat. City State)|       9.00 |  206497.065217  |
| Europe    | Liechtenstein             |    1119.00 |  206497.065217  |
...

Use the summary table again to find countries that have a GNP more than 10 times the continental average:

mysql> SELECT
  ->   Country.Continent, Country.Name,
  ->   Country.GNP AS CountryGNP,
  ->   ContinentGNP.AvgGNP AS ContinentAvgGNP
  -> FROM Country, ContinentGNP
  -> WHERE
  ->   Country.Continent = ContinentGNP.Continent
  ->   AND Country.GNP > ContinentGNP.AvgGNP * 10
  -> ORDER BY Country.Continent, Country.Name;
+---------------+---------------+------------+-----------------+
| Continent     | Name          | CountryGNP | ContinentAvgGNP |
+---------------+---------------+------------+-----------------+
| Asia          | Japan         | 3787042.00 |  150105.725490  |
| Europe        | Germany       | 2133367.00 |  206497.065217  |
| North America | United States | 8510700.00 |  261854.789189  |
| Africa        | South Africa  |  116729.00 |   10006.465517  |
| Oceania       | Australia     |  351182.00 |   14991.953571  |
+---------------+---------------+------------+-----------------+

The technique of using a summary table has several benefits:

  • Calculating the summary information a single time reduces the overall computational burden by eliminating most of the repetition involved in performing the initial record selection.

  • If the original table is a type that is subject to table-level locking, such as a MyISAM table, using a summary table leaves the original table available more of the time for updates by other clients by reducing the amount of time that the table remains locked.

  • If the summary table is small enough that it’s reasonable to hold in memory, you can increase performance even more by making it a HEAP table. Queries on the table will be especially fast because they require no disk I/O. When the HEAP table no longer is needed, drop it to free the memory allocated for it.

  • Some queries are difficult or impossible to perform without using a summary table. For example, you cannot compute a summary from a set of rows and compare each row to the summarized value within a single query. However, you can use a summary table and join it to the original table to do this.
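The HEAP suggestion above can be sketched with the ContinentGNP table shown earlier, which holds only one row per continent and so easily fits in memory:

mysql> CREATE TEMPORARY TABLE ContinentGNP TYPE=HEAP
  -> SELECT Continent, AVG(GNP) AS AvgGNP
  -> FROM Country GROUP BY Continent;

The comparison queries then run entirely from memory; drop the table when the analysis is finished to free the memory allocated for it.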

Use of summary tables has the disadvantage that the records they contain are up-to-date only as long as the original values remain unchanged, and thus so are any summaries calculated from them. If the original table rarely or never changes, this might be only a minor concern. For many applications, summaries that are close approximations are sufficiently accurate.

The summary table technique can be applied at multiple levels. Create a summary table that holds the results of an initial summary, and then summarize that table in different ways to produce secondary summaries. This avoids the computational expense of generating the initial summary repeatedly.

When a summary consists of a single value, you need not create a table at all. Use a SQL variable to hold the value. Then you can use the value for comparison purposes in subsequent queries without having to calculate it again.
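For example, to compare individual countries against the single worldwide average GNP, you might hold the average in a SQL variable (a sketch using the same Country table; the variable name is arbitrary):

mysql> SELECT @WorldAvgGNP := AVG(GNP) FROM Country;
mysql> SELECT Name, GNP FROM Country
  -> WHERE GNP > @WorldAvgGNP * 10
  -> ORDER BY GNP DESC;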

13.5 Exercises

Question 1:

Consider the following table with two indexes:

mysql> DESCRIBE fastindex;
+-------+----------+------+-----+
| Field | Type     | Null | Key |
+-------+----------+------+-----+
| i1    | char(10) |      | MUL |
| i2    | char(10) | YES  | MUL |
+-------+----------+------+-----+

With no other facts given, which of the following queries would you expect to run faster?

SELECT i1 FROM fastindex WHERE i1 LIKE 'mid%';

SELECT i2 FROM fastindex WHERE i2 LIKE 'mid%';

Question 2:

Consider the following table with indexes:

mysql> SHOW CREATE TABLE fastindex;
+-----------+---------------------------
| Table     | Create Table
+-----------+---------------------------
| fastindex | CREATE TABLE `fastindex` (
 `i1` char(10) NOT NULL default '',
 `i2` char(10) NOT NULL default '',
 KEY `i1` (`i1`(3)),
 KEY `i2` (`i2`)
) TYPE=MyISAM |
+-----------+---------------------------

With no other facts given, which of the following queries would you expect to run faster?

SELECT i1 FROM fastindex WHERE i1 LIKE 'mid%';

SELECT i2 FROM fastindex WHERE i2 LIKE 'mid%';

Question 3:

For what reason can adding indexes to a table make table operations slower?

Question 4:

Consider the following table structure, which will be used for the next four questions:

mysql> DESCRIBE City;
+-------------+----------+------+-----+---------+----------------+
| Field       | Type     | Null | Key | Default | Extra          |
+-------------+----------+------+-----+---------+----------------+
| ID          | int(11)  |      | PRI | NULL    | auto_increment |
| Name        | char(35) | YES  |     | NULL    |                |
| CountryCode | char(3)  | YES  |     | NULL    |                |
| District    | char(20) | YES  |     | NULL    |                |
| Population  | int(11)  | YES  |     | 0       |                |
+-------------+----------+------+-----+---------+----------------+

You frequently retrieve data from the City table, using queries similar to those shown here:

mysql> SELECT * FROM City WHERE Name BETWEEN 'E' AND 'G'
  -> ORDER BY Name;
+------+------------------+-------------+--------------+------------+
| ID   | Name             | CountryCode | District     | Population |
+------+------------------+-------------+--------------+------------+
|  735 | East London      | ZAF         | Eastern Cape |     221047 |
| 3963 | East Los Angeles | USA         | California   |     126379 |
| 1845 | East York        | CAN         | Ontario      |     114034 |
|  533 | Eastbourne       | GBR         | England      |      90000 |
| 1720 | Ebetsu           | JPN         | Hokkaido     |     118805 |
| ...  | ...              | ...         | ...          | ...        |

mysql> SELECT * FROM City WHERE CountryCode >= 'Y'
  -> ORDER BY Name;
+------+------------+-------------+----------------+------------+
| ID   | Name       | CountryCode | District       | Population |
+------+------------+-------------+----------------+------------+
| 1781 | Aden       | YEM         | Aden           |     398300 |
| 1784 | al-Mukalla | YEM         | Hadramawt      |     122400 |
|  721 | Alberton   | ZAF         | Gauteng        |     410102 |
|  724 | Benoni     | ZAF         | Gauteng        |     365467 |
| 1792 | Beograd    | YUG         | Central Serbia |    1204000 |
| ...  | ...        | ...         | ...            | ...        |

How would you determine the number of rows MySQL must inspect to calculate the result sets?

Question 5:

Consider, once again, the table structure and sample queries shown for the City table in the previous question. What index or indexes would you add to the table to speed up the queries?

Question 6:

Here again are the table structure and sample queries first shown for the City table two questions previously, but with the addition of the indexes on the Name and CountryCode columns from the previous question:

mysql> DESCRIBE City;
+-------------+----------+------+-----+---------+----------------+
| Field       | Type     | Null | Key | Default | Extra          |
+-------------+----------+------+-----+---------+----------------+
| ID          |  int(11) |      | PRI | NULL    | auto_increment |
| Name        | char(35) | YES  | MUL | NULL    |                |
| CountryCode |  char(3) | YES  | MUL | NULL    |                |
| District    | char(20) | YES  |     | NULL    |                |
| Population  |  int(11) | YES  |     | 0       |                |
+-------------+----------+------+-----+---------+----------------+

mysql> SELECT * FROM City WHERE Name BETWEEN 'E' AND 'G'
  -> ORDER BY Name;
+------+------------------+-------------+--------------+------------+
| ID   | Name             | CountryCode | District     | Population |
+------+------------------+-------------+--------------+------------+
|  735 | East London      | ZAF         | Eastern Cape |     221047 |
| 3963 | East Los Angeles | USA         | California   |     126379 |
| 1845 | East York        | CAN         | Ontario      |     114034 |
|  533 | Eastbourne       | GBR         | England      |      90000 |
| 1720 | Ebetsu           | JPN         | Hokkaido     |     118805 |
| ...  | ...              | ...         | ...          | ...        |

mysql> SELECT * FROM City WHERE CountryCode >= 'Y'
  -> ORDER BY Name;
+------+------------+-------------+----------------+------------+
| ID   | Name       | CountryCode | District       | Population |
+------+------------+-------------+----------------+------------+
| 1781 | Aden       | YEM         | Aden           |     398300 |
| 1784 | al-Mukalla | YEM         | Hadramawt      |     122400 |
|  721 | Alberton   | ZAF         | Gauteng        |     410102 |
|  724 | Benoni     | ZAF         | Gauteng        |     365467 |
| 1792 | Beograd    | YUG         | Central Serbia |    1204000 |
| ...  | ...        | ...         | ...            | ...        |

In addition to adding indexes to the City table, what else can be done, with regard to the table’s columns, to improve performance?

Question 7:

Consider, once again, the new table structure and the sample queries shown for the City table in the previous question. How would you find out whether the new indexes on the table are actually used to resolve the queries?

Question 8:

Consider the following table:

mysql> DESCRIBE enumtest;
+-------+--------------------------------+------+-----+---------+-------+
| Field | Type                           | Null | Key | Default | Extra |
+-------+--------------------------------+------+-----+---------+-------+
| col   | enum('first','second','third') |      | PRI | first   |       |
+-------+--------------------------------+------+-----+---------+-------+
mysql> SELECT * FROM enumtest;
Empty set

Will the following statement fail or will it insert rows? What will the contents of the enumtest table be after executing the statement?

mysql> INSERT INTO enumtest VALUES
  -> ('first'),('second'),('third'),('false'),('fourth');

Question 9:

Consider the following table, which has two single-column FULLTEXT indexes:

mysql> DESCRIBE faq;
+----------+---------------+------+-----+---------+-------+
| Field    | Type          | Null | Key | Default | Extra |
+----------+---------------+------+-----+---------+-------+
| cdate    | timestamp(14) | YES  |     | NULL    |       |
| question | char(150)     |      | MUL |         |       |
| answer   | char(250)     |      | MUL |         |       |
+----------+---------------+------+-----+---------+-------+

mysql> SHOW INDEX FROM faq;
+-------+------------+----------+-  -+-------------+-  -+------------+-
| Table | Non_unique | Key_name | ...| Column_name | ...| Index_type | ...
+-------+------------+----------+-  -+-------------+-  -+------------+-
| faq   |          1 | question | ...| question    | ...| FULLTEXT   | ...
| faq   |          1 | answer   | ...| answer      | ...| FULLTEXT   | ...
+-------+------------+----------+-  -+-------------+-  -+------------+-

With MATCH … AGAINST(), you can search the answers and the questions stored in the table. How would you search for a search term 'MySQL' in the question column?

Question 10:

Consider the following table, which has two single-column FULLTEXT indexes:

mysql> DESCRIBE faq;
+----------+---------------+------+-----+---------+-------+
| Field    | Type          | Null | Key | Default | Extra |
+----------+---------------+------+-----+---------+-------+
| cdate    | timestamp(14) | YES  |     | NULL    |       |
| question | char(150)     |      | MUL |         |       |
| answer   | char(250)     |      | MUL |         |       |
+----------+---------------+------+-----+---------+-------+
mysql> SHOW INDEX FROM faq;
+-------+------------+----------+-  -+-------------+-  -+------------+-
| Table | Non_unique | Key_name | ...| Column_name | ...| Index_type | ...
+-------+------------+----------+-  -+-------------+-  -+------------+-
| faq   |          1 | question | ...| question    | ...| FULLTEXT   | ...
| faq   |          1 | answer   | ...| answer      | ...| FULLTEXT   | ...
+-------+------------+----------+-  -+-------------+-  -+------------+-

With MATCH … AGAINST(), you can search the answers and the questions stored in the table. How would you search for the search term 'Access' in either the question or the answer column?


Question 11:

Consider the following tables:

mysql> DESCRIBE City; DESCRIBE Country;
+-------------+----------+------+-----+---------+-------+
| Field       | Type     | Null | Key | Default | Extra |
+-------------+----------+------+-----+---------+-------+
| ID          |  int(11) |      |     | 0       |       |
| Name        | char(35) |      |     |         |       |
| CountryCode |  char(3) |      |     |         |       |
| District    | char(20) |      |     |         |       |
| Population  |  int(11) |      |     | 0       |       |
+-------------+----------+------+-----+---------+-------+
+----------------+------------------------+------+-----+---------+-------+
| Field          | Type                   | Null | Key | Default | Extra |
+----------------+------------------------+------+-----+---------+-------+
| Code           | char(3)                |      | PRI |         |       |
| Name           | char(52)               |      |     |         |       |
| Continent      | enum('Asia','Europe',) |      |     | Asia    |       |
| Region         | char(26)               |      |     |         |       |
| SurfaceArea    | float(10,2)            |      |     | 0.00    |       |
| IndepYear      | smallint(6)            | YES  |     | NULL    |       |
| Population     | int(11)                |      |     | 0       |       |
| LifeExpectancy | float(3,1)             | YES  |     | NULL    |       |
| GNP            | float(10,2)            | YES  |     | NULL    |       |
| GNPOld         | float(10,2)            | YES  |     | NULL    |       |
| LocalName      | char(45)               |      |     |         |       |
| GovernmentForm | char(45)               |      |     |         |       |
| HeadOfState    | char(60)               | YES  |     | NULL    |       |
| Capital        | int(11)                | YES  |     | NULL    |       |
| Code2          | char(2)                |      |     |         |       |
+----------------+------------------------+------+-----+---------+-------+

The tables are related: CountryCode in City references Code in Country. What information does the following EXPLAIN statement give you regarding possible optimization of the query?

mysql> EXPLAIN
  -> SELECT
  -> City.Name, City.Population, Country.Name
  -> FROM City INNER JOIN Country
  -> ON City.CountryCode = Country.Code
  -> WHERE City.Population > 10000000
  -> ORDER BY City.Population DESC
  -> \G
*************************** 1. row ***************************
        table: City
         type: ALL
possible_keys: NULL
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 4079
        Extra: Using where; Using filesort
*************************** 2. row ***************************
        table: Country
         type: eq_ref
possible_keys: PRIMARY
          key: PRIMARY
      key_len: 3
          ref: City.CountryCode
         rows: 1
        Extra:

Question 12:

Based on the information provided by the EXPLAIN in the previous question, what would you do to optimize the query performance?

Question 13:

Consider, once again, the EXPLAIN output for the Country and City tables from the previous two questions. How would you roughly “measure” the performance for the unoptimized query? For the optimized query?

Question 14:

Most of the time, the MySQL optimizer makes the right choice of indexes to use for a query. However, you suspect that, for a certain query, the optimizer is not making the right choice. How can you determine whether the optimizer is choosing the index you want it to use?

Question 15:

Most of the time, the MySQL optimizer makes the right choice of indexes to use for a query. However, you suspect that, for a certain query, the optimizer is not making the right choice. How could you rewrite the query to determine whether it runs faster without using an index?

Question 16:

Most of the time, the MySQL optimizer makes the right choice of indexes to use for a query. However, you suspect that, for a certain query, the optimizer is not making the right choice. How could you force MySQL to use an index that is different from the index which the optimizer would choose?

Question 17:

Consider the following table and its indexes:

mysql> DESCRIBE key1;
+-------+----------+------+-----+---------+-------+
| Field | Type     | Null | Key | Default | Extra |
+-------+----------+------+-----+---------+-------+
| col   | char(10) | YES  | MUL | NULL    |       |
+-------+----------+------+-----+---------+-------+
mysql> SHOW KEYS FROM key1;
+-------+------------+----------+--------------+-------------+-
| Table | Non_unique | Key_name | Seq_in_index | Column_name | ...
+-------+------------+----------+--------------+-------------+-
| key1  |          1 | col      |            1 | col         | ...
+-------+------------+----------+--------------+-------------+-

Which of the following queries will most likely perform faster, and why? How could you actually find out which query runs faster?

SELECT * FROM key1 WHERE col LIKE '%2%'

SELECT * FROM key1 WHERE col LIKE 'hey 2%'

Question 18:

Assume that you have a table that is subject to many read (SELECT) requests. Compared to the number of reads, you have only a few write (INSERT) requests taking place. Furthermore, you consider the reads more important than the write requests. What could you do to give read requests priority over write requests?

Question 19:

Consider the following table and its indexes:

mysql> DESCRIBE mix1;
+-------+-------------+------+-----+---------+-------+
| Field | Type        | Null | Key | Default | Extra |
+-------+-------------+------+-----+---------+-------+
| id    | int(11)     |      | PRI | 0       |       |
| name  | varchar(20) | YES  |     | NULL    |       |
| story | text        | YES  |     | NULL    |       |
+-------+-------------+------+-----+---------+-------+
mysql> SHOW KEYS FROM mix1;
+-------+------------+----------+-
| Table | Non_unique | Key_name | ...
+-------+------------+----------+-
| mix1  |          0 | PRIMARY  | ...
+-------+------------+----------+-

Assume that you have many seeks on the mix1 table, most of which use id or name as a search term. Searches are becoming considerably slower. What can you do to improve the situation?

Question 20:

Consider the following table and its indexes:

mysql> DESCRIBE mix1;
+-------+-------------+------+-----+---------+-------+
| Field | Type        | Null | Key | Default | Extra |
+-------+-------------+------+-----+---------+-------+
| id    | int(11)     |      | PRI | 0       |       |
| name  | varchar(20) | YES  |     | NULL    |       |
| story | text        | YES  |     | NULL    |       |
+-------+-------------+------+-----+---------+-------+
mysql> SHOW KEYS FROM mix1;
+-------+------------+----------+-
| Table | Non_unique | Key_name | ...
+-------+------------+----------+-
| mix1  |          0 | PRIMARY  | ...
+-------+------------+----------+-

Assume that you have many seeks on the mix1 table, most of which look for a search term in the story column. What can you do to speed up those searches?

Question 21:

Assume that you hit a filesystem limit on file size with a MyISAM table. That table contains a FULLTEXT index, so you cannot switch to another storage engine. Also, assume that it isn’t possible to change the filesystem you’re using. What else could you do to overcome the filesystem size limit?

Answers to Exercises

Answer 1:

A column or index that can contain NULL values cannot be processed as fast as one that cannot contain NULL. i1 and i2 are identical except that i1 cannot contain NULL values, so i1 should be faster to process. Therefore, this query should be faster:

SELECT i1 FROM fastindex WHERE i1 LIKE 'mid%';

Answer 2:

SELECT i1 FROM fastindex WHERE i1 LIKE 'mid%';

would probably perform faster because the index on i1 uses only the first three characters of each value as its prefix. MySQL can search that index faster because its entries are at most three characters long, whereas the entries in the index on i2 can be up to ten characters long.

Answer 3:

Insert, delete, and update operations will become slower when the table has indexes, because those operations require the indexes to be updated, too.

Answer 4:

You can use EXPLAIN to determine the number of rows MySQL must inspect to calculate the result sets:

mysql> EXPLAIN SELECT * FROM City WHERE Name BETWEEN 'E' AND 'G'
  -> ORDER BY Name\G
*************************** 1. row ***************************
        table: City
         type: ALL
possible_keys: NULL
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 4079
        Extra: Using where; Using filesort

mysql> EXPLAIN SELECT * FROM City WHERE CountryCode >= 'Y'
  -> ORDER BY Name\G
*************************** 1. row ***************************
        table: City
         type: ALL
possible_keys: NULL
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 4079
        Extra: Using where; Using filesort

The EXPLAIN output shows that MySQL would not use indexes to process the queries. All rows (4,079) must be scanned to calculate the results. This is indicated by the ALL value in the type column as well.

See section A.1.15, “EXPLAIN.”

Answer 5:

To improve performance, indexes should be added to the Name and CountryCode columns because those are the columns used in the comparisons that determine which rows to return. Also, because Name is used in the ORDER BY clause, an index on Name can speed up sorting operations.

For the Name column, the results of the queries in question indicate that an index with a prefix length that is shorter than the full column length is likely to improve performance even more. However, the prefix length should be long enough to differentiate cities that begin with words like “East Lo…”, so we choose a prefix length of 10:

mysql> ALTER TABLE City
  -> ADD INDEX (Name(10)),
  -> ADD INDEX (CountryCode)
  -> ;

Answer 6:

Another means of making table lookups faster is to declare the table’s columns to be NOT NULL. Assume that City must contain a city name in each row, as well as a country code for each city. To disallow NULL values in the Name and CountryCode columns, you could alter the table with this SQL statement:

mysql> ALTER TABLE City
  -> MODIFY Name CHAR(35) NOT NULL,
  -> MODIFY CountryCode CHAR(3) NOT NULL
  -> ;
Query OK, 4079 rows affected (0.21 sec)
Records: 4079 Duplicates: 0 Warnings: 0

mysql> DESCRIBE City;
+-------------+----------+------+-----+---------+----------------+
| Field       | Type     | Null | Key | Default | Extra          |
+-------------+----------+------+-----+---------+----------------+
| ID          | int(11)  |      | PRI | NULL    | auto_increment |
| Name        | char(35) |      | MUL |         |                |
| CountryCode | char(3)  |      | MUL |         |                |
| District    | char(20) | YES  |     | NULL    |                |
| Population  | int(11)  | YES  |     | 0       |                |
+-------------+----------+------+-----+---------+----------------+

See section A.1.1, “ALTER TABLE.”

Answer 7:

To check whether MySQL actually uses the new indexes to resolve the queries, use EXPLAIN once again:

mysql> EXPLAIN SELECT * FROM City WHERE Name BETWEEN 'E' AND 'G'
  -> ORDER BY Name\G
*************************** 1. row ***************************
        table: City
         type: range
possible_keys: Name
          key: Name
      key_len: 5
          ref: NULL
         rows: 146
        Extra: Using where; Using filesort

mysql> EXPLAIN SELECT * FROM City WHERE CountryCode >= 'Y'
  -> ORDER BY Name\G
*************************** 1. row ***************************
        table: City
         type: range
possible_keys: CountryCode
          key: CountryCode
      key_len: 3
          ref: NULL
         rows: 76
        Extra: Using where; Using filesort

The EXPLAIN output shows that the indexes you would expect to be used actually are used by MySQL to resolve the queries. Compared to the previous results from EXPLAIN (three questions previously), the number of rows inspected drops dramatically from 4,079 to 146 and 76 due to the use of indexes.

See section A.1.15, “EXPLAIN.”

Answer 8:

Table enumtest has a primary key on its only column, col, so the column can hold only unique values. Because of the ENUM column type, this means that the column can hold only four distinct values: the three enumeration members and the empty string that is used for invalid values. 'false' is not a member of the ENUM list, so it is converted to '' (the empty string). The last value ('fourth') is not in the ENUM list either, so it too is converted to the error value ''. The primary key, however, prevents that same value from being stored again, which leads to a duplicate-key error:

mysql> INSERT INTO enumtest VALUES
  -> ('first'),('second'),('third'),('false'),
('fourth');
ERROR 1062: Duplicate entry '' for key 1

For a multiple-row INSERT statement, rows are inserted as long as no error occurs. If a row fails, that row and any following rows are not inserted. As a result, the table contents are:

mysql> SELECT * FROM enumtest;
+--------+
| col    |
+--------+
|        |
| first  |
| second |
| third  |
+--------+
4 rows in set
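The interaction between ENUM coercion and the primary key can be sketched in a few lines of Python. This is a hypothetical simulation of the behavior just described, not how MySQL implements ENUM storage; the member list is assumed from the answer:

```python
# Hypothetical sketch of MySQL's ENUM coercion combined with a primary key.
ENUM_MEMBERS = {'first', 'second', 'third'}    # assumed enumeration list

def coerce(value):
    # Values not in the ENUM list are stored as '' (the error value).
    return value if value in ENUM_MEMBERS else ''

inserted = []                                  # rows already in the table
for value in ['first', 'second', 'third', 'false', 'fourth']:
    v = coerce(value)
    if v in inserted:                          # primary key rejects duplicates;
        break                                  # this row and all later rows fail
    inserted.append(v)

print(inserted)   # ['first', 'second', 'third', ''] -- the four surviving rows
```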

See section A.3, “Column Types.”

Answer 9:

A search for 'MySQL' in the question column only could be performed as follows:

mysql> SELECT
  -> LEFT(question,20), LEFT(answer,20)
  -> FROM faq
  -> WHERE MATCH(question) AGAINST('MySQL')
  -> ;

The result of the query could look like this:

+------------------------+-----------------------+
| LEFT(question,20)      | LEFT(answer,20)       |
+------------------------+-----------------------+
| Does MySQL support t   | Yes, as of version 3  |
| When will MySQL supp   | This is on the TODO   |
| Does MySQL support f   | Yes, as of version 3  |
| Does MySQL support s   | Not yet, but stored   |
| Is MySQL available u   | Yes, you can buy a l  |
| When was MySQL relea   | MySQL was first rele  |
+------------------------+-----------------------+

See sections A.1.39, “SHOW INDEX,” and A.1.11, “DESCRIBE.”

Answer 10:

A search for 'Access' in either the question or the answer column could be performed as follows:

mysql> SELECT
  -> LEFT(question,20), LEFT(answer,20)
  -> FROM faq
  -> WHERE MATCH(question) AGAINST('Access')
  -> OR MATCH(answer) AGAINST('Access')
  -> ;

The result of the query could look like this:

+----------------------+-----------------------+
| LEFT(question,20)    | LEFT(answer,20)       |
+----------------------+-----------------------+
| Is there a database  | Access will most pro  |
| Is Microsoft Access  | It's sold as a datab  |
+----------------------+-----------------------+

Note that OR in the preceding query means that you’re looking for the word “Access” whether it appears only in the question, only in the answer, or in both the question and the answer. To find records that contain “Access” in both the question and the answer, you would use AND instead of OR in the query.

See sections A.1.39, “SHOW INDEX,” and A.1.11, “DESCRIBE.”


Answer 11:

EXPLAIN provides the following information:

  • For table City, EXPLAIN indicates that all table rows must be scanned to find the desired information (ALL). There are no keys on the columns that should be retrieved, nor on the column mentioned in the ORDER BY clause, so no keys are used as indicated by the NULL entries for possible_keys, key, key_len, and ref. Therefore, all 4,079 table rows are scanned. Using filesort indicates that MySQL needs to do an extra pass to find out how to retrieve the rows in sorted order.

  • For table Country, EXPLAIN shows a join type of eq_ref. This is the best possible join type; it means that only one row is read from this table for each row from the previous table. This join type is possible because the index used for table Country is a primary key, as indicated by the PRIMARY entries for possible_keys and key. The key_len value of 3 shows that the full length of the key (the 3-byte Code column) is used. ref shows which column is used with the key to select rows from the table: the CountryCode column of the City table. The rows entry of 1 thus indicates that MySQL must examine only one row of the Country table to find the match for each CountryCode value selected from the City table.

See section A.1.15, “EXPLAIN.”

Answer 12:

To optimize the query shown by the EXPLAIN in the last question, you could create an index for the Population column of the City table because it is used both in the WHERE clause to determine which rows to retrieve and in the ORDER BY clause, to sort the result:

mysql> ALTER TABLE City
    -> ADD INDEX (Population);
Query OK, 4079 rows affected (0.68 sec)
Records: 4079  Duplicates: 0  Warnings: 0

With the new index, EXPLAIN displays the following for the same query:

mysql> EXPLAIN
    -> SELECT
    -> City.Name, City.Population, Country.Name
    -> FROM City INNER JOIN Country
    -> ON City.CountryCode = Country.Code
    -> WHERE City.Population > 10000000
    -> ORDER BY City.Population DESC
    -> \G
*************************** 1. row ***************************
        table: City
         type: range
possible_keys: Population
          key: Population
      key_len: 4
          ref: NULL
         rows: 9
        Extra: Using where
*************************** 2. row ***************************
        table: Country
         type: eq_ref
possible_keys: PRIMARY
          key: PRIMARY
      key_len: 3
          ref: City.CountryCode
         rows: 1
        Extra:

The EXPLAIN output for the Country table is unchanged, but the output for the City table indicates a much improved search. It shows that only rows within a given range of Population values will be retrieved (type: range), using an index to select the rows. The possible key Population is actually used with its full key length (4). Due to the use of the new index, MySQL now has to inspect only nine rows to resolve the WHERE clause.

See section A.1.15, “EXPLAIN.”

Answer 13:

As a rough measure of performance, take the product of the rows output of the EXPLAIN statements before and after the addition of the index: In the original, unoptimized situation, the product of the rows values is 4,079 * 1 = 4,079. With the index added to optimize the query, the product is only 9 * 1 = 9. This lower value indicates that performance is better with the new index.
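The rough cost measure described above is simply the product of the per-table rows estimates from EXPLAIN. A quick sketch (not part of the book's text) computes it:

```python
# Multiply EXPLAIN's per-table `rows` estimates to get a rough cost figure
# for a join: each row from one table is matched against the next table.
def join_cost(rows_estimates):
    cost = 1
    for rows in rows_estimates:
        cost *= rows
    return cost

before = join_cost([4079, 1])   # without the Population index
after = join_cost([9, 1])       # with the Population index
print(before, after)            # 4079 9
```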

Answer 14:

To find out which indexes the optimizer will use, prefix your query with EXPLAIN. For example:

EXPLAIN SELECT Name FROM City;

See section A.1.15, “EXPLAIN.”

Answer 15:

To rewrite a query that forces MySQL not to use a specific index that the optimizer would otherwise choose, you would use the IGNORE INDEX (or IGNORE KEY) option. For example:

SELECT Name FROM City IGNORE INDEX (idx_name);

See section A.1.29, “SELECT.”

Answer 16:

To force the optimizer to use a specific index, you would use the FORCE INDEX (or FORCE KEY) option. For example:

SELECT Name FROM City FORCE INDEX (idx_name);

Another option is USE INDEX (or USE KEY), but this provides only a hint, whereas FORCE INDEX requires the index to be used.

See section A.1.29, “SELECT.”

Answer 17:

To find out which query runs faster, you could look at the query execution times that the server reports to the client (for example, mysql). These values can, however, be affected by factors other than the actual server execution time. More reliable information can be obtained with EXPLAIN, which shows that the MySQL optimizer can use the index more efficiently for the second query:

mysql> EXPLAIN SELECT * FROM key1 WHERE col LIKE '%2%'\G
*************************** 1. row ***************************
        table: key1
         type: index
possible_keys: NULL
          key: col
      key_len: 11
          ref: NULL
         rows: 3599
        Extra: Using where; Using index
1 row in set (0.05 sec)

mysql> EXPLAIN SELECT * FROM key1 WHERE col LIKE 'hey2%'\G
*************************** 1. row ***************************
        table: key1
         type: range
possible_keys: col
          key: col
      key_len: 11
          ref: NULL
         rows: 1
        Extra: Using where; Using index
1 row in set (0.27 sec)

The listing shows, among other things, that MySQL will have to examine 3,599 rows for the first query, but only one for the second. This occurs because MySQL can use an index for a LIKE pattern match if the pattern begins with a literal value, but not if it begins with a wildcard character.
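The same optimizer rule, that a LIKE pattern anchored by a literal prefix can use an index while a leading wildcard cannot, can be demonstrated outside MySQL as well. The following sketch uses SQLite's EXPLAIN QUERY PLAN purely as a stand-in; the table and column names mirror the answer, but the data and setup are hypothetical:

```python
import sqlite3

con = sqlite3.connect(':memory:')
con.execute('PRAGMA case_sensitive_like = ON')  # lets LIKE use the index in SQLite
con.execute('CREATE TABLE key1 (col TEXT)')
con.execute('CREATE INDEX idx_col ON key1 (col)')
con.executemany('INSERT INTO key1 VALUES (?)',
                [('hey%d' % i,) for i in range(1000)])

def plan(sql):
    # Concatenate the plan's detail column into one readable string.
    return ' '.join(row[-1] for row in
                    con.execute('EXPLAIN QUERY PLAN ' + sql))

print(plan("SELECT * FROM key1 WHERE col LIKE '%2%'"))    # SCAN: no range possible
print(plan("SELECT * FROM key1 WHERE col LIKE 'hey2%'"))  # SEARCH: index range used
```

A leading-wildcard pattern forces a scan of every entry; a literal prefix converts the LIKE into a range comparison that the index can satisfy, just as the MySQL EXPLAIN output above shows.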

See section A.1.15, “EXPLAIN.”

Answer 18:

To give read requests higher priority than write requests, you can use either of the following strategies:

  • INSERT DELAYED causes the server to queue the rows and write them only when no other threads are using the table, so pending read requests are served first; the client does not wait for the insert to complete.

  • SELECT HIGH_PRIORITY will give a SELECT statement priority over write requests.

See section A.1.18, “INSERT.”

Answer 19:

To improve searches on the id and name columns, you essentially have two choices:

  • You could add an index to the name column, thus improving searches for names.

  • You could split the table into two separate tables, thus avoiding disk I/O caused by the TEXT column when MySQL has to scan the table. The mix1 table could be split as shown here:

    mysql> DESCRIBE mix1; DESCRIBE mix2;
    +-------+-------------+------+-----+---------+-------+
    | Field | Type        | Null | Key | Default | Extra |
    +-------+-------------+------+-----+---------+-------+
    | id    | int(11)     |      | PRI | 0       |       |
    | name  | varchar(20) | YES  |     | NULL    |       |
    +-------+-------------+------+-----+---------+-------+
    +---------+---------+------+-----+---------+-------+
    | Field   | Type    | Null | Key | Default | Extra |
    +---------+---------+------+-----+---------+-------+
    | mix1_id | int(11) |      |     | 0       |       |
    | story   | text    | YES  |     | NULL    |       |
    +---------+---------+------+-----+---------+-------+

You could also combine both of the strategies just described.
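The split itself is plain DDL. The following sketch (using SQLite via Python only for illustration, with sample data that is not from the book) shows the two-table layout from the answer and how the story is joined back in on demand:

```python
import sqlite3

con = sqlite3.connect(':memory:')
con.executescript('''
    CREATE TABLE mix1 (id INTEGER PRIMARY KEY, name VARCHAR(20));
    CREATE TABLE mix2 (mix1_id INTEGER, story TEXT);
''')
con.execute("INSERT INTO mix1 VALUES (1, 'Anna')")
con.execute("INSERT INTO mix2 VALUES (1, 'A long story about Anna')")

# Name searches now touch only the narrow mix1 table ...
row_id = con.execute("SELECT id FROM mix1 WHERE name = 'Anna'").fetchone()[0]

# ... and the bulky TEXT column is fetched only when actually needed.
story = con.execute(
    "SELECT story FROM mix1 JOIN mix2 ON mix2.mix1_id = mix1.id"
    " WHERE mix1.id = ?", (row_id,)).fetchone()[0]
print(story)   # A long story about Anna
```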

See sections A.1.11, “DESCRIBE,” and A.1.39, “SHOW INDEX.”

Answer 20:

To improve searches on the story column, you could add a FULLTEXT index to that column, like this:

mysql> ALTER TABLE mix1 ADD FULLTEXT (story);
mysql> SHOW KEYS FROM mix1;
+-------+------------+----------+-   -+------------+
| Table | Non_unique | Key_name | ... | Index_type |
+-------+------------+----------+-   -+------------+
| mix1  |     0      | PRIMARY  | ... | BTREE      |
| mix1  |     1      | story    | ... | FULLTEXT   |
+-------+------------+----------+-   -+------------+

See sections A.1.11, “DESCRIBE,” A.1.39, “SHOW INDEX,” and A.1.1, “ALTER TABLE.”

Answer 21:

In that scenario, the only solution would be to use a MERGE table: split the data into a number of smaller MyISAM tables, none of which exceeds the filesystem size limit, and combine them under a single MERGE table.
