HomeMySQL Page 4 - SQL Performance and Tuning Considerations, concluded
Displaying Execution Plans Using SQL Query Analyzer - MySQL
This article, the second of two parts, will show you how to make the SQL statements you write run faster and more efficiently. It is excerpted from chapter 11 of the book SQL DeMYSTiFied, written by Andy Oppel (McGraw-Hill/Osborne, 2005; ISBN: 0072262249).
Microsoft provides a facility for explaining query execution plans just as Oracle and DB2 do. In Microsoft SQL Server 2000, the SQL Query Analyzer tool has a button labeled “Display Estimated Execution Plan” that graphically displays how the SQL statement will be executed. This feature is also accessible from the Query menu item as the option Show Execution Plan. These items may have different names in other versions of Microsoft SQL Server. In SQL Server 2005, there does not appear to be a tool called SQL Query Analyzer, and it is not clear what the tool will be called when the product is formally released.
In the Query Analyzer tool, the default mode is Object Browser. To display an execution plan, click the Display Estimated Execution Plan icon or press CTRL-L. The execution plan is displayed graphically as shown in Figure 11-1. Each step is shown as an icon and the number near the icon shows the percentage of the total query cost that applies to that step. You can place your cursor pointer over any icon to see a pop-up window showing more detailed information for that step.
Figure 11-1.SQL Query Analyzer output
DB2 UDB Considerations
Like Oracle and Microsoft SQL Server, DB2 also provides a utility to display query execution plans. IBM provides a script called EXPLAIN.DDL that creates the tables needed to hold the output of the explain facility. Once the explain tables are created, the explain facility can be used to create execution plans for queries, and the plans can then be displayed using either Visual Explain in the Control Center or by querying the explain tables using SQL.
Tuning DML Statements
DML (Data Manipulation Language) statements generally produce fewer performance problems than query statements. However, there can be issues.
For INSERT statements, there are two main considerations:
Ensuring adequate free space for new rows Tablespaces that are short on space present problems as the DBMS searches for free space to hold rows being inserted. Moreover, inserts do not usually put rows into the table in primary key sequence because there usually isn’t free space in exactly the right places. Therefore, reorganizing the table, which is essentially a process of unloading the rows to a flat file, re-creating the table, and then reloading the table can improve both insert and query performance.
Index maintenance Every time a row is inserted into a table, a corresponding entry must be inserted into every index built on the table (although null values are usually not indexed). The more indexes, the more overhead every insert will require. Index free space can usually be tuned just as table free space can.
UPDATE statements have the following considerations:
Index maintenance If columns that are indexed are updated, the corresponding index entries must also be updated. In general, updating primary key values has particularly bad performance implications, so much so that some RDBMSs prohibit them.
Row expansion When columns are updated in such a way that the row grows significantly in size, the row may no longer fit in its original location, and there may not be free space around the row for it to expand in place (other rows might be right up against the one just updated). When this occurs, the row must either be moved to another location in the data file where it will fit or be split with the expanded part of the row placed in a new location, connected to the original location by a pointer. Both of these situations are not only expensive when they occur but are also detrimental to the performance of subsequent queries that touch those rows. Table reorganizations can resolve the issue, but it’s better to prevent the problem by designing the application so that rows tend not to grow in size after they are inserted.
DELETE statements are the least likely to present performance issues. However, a table that participates as a parent in a relationship that is defined with the ON DELETE CASCADE option can perform poorly if there are many child rows to delete. Moreover, index maintenance also comes into play here because index entries must be removed for any deleted rows.