MySQL Optimization, part 1 - 6.2.6 How MySQL Optimizes IS NULL (
Page 5 of 8 )
MySQL can do the same optimization on col_name IS NULL that
it can do with col_name = constant_value. For example,
MySQL can use indexes and ranges to search for NULL with IS
NULL.
SELECT * FROM tbl_name WHERE key_col IS NULL;
SELECT * FROM tbl_name WHERE key_col <=> NULL;
SELECT * FROM tbl_name
WHERE key_col=const1 OR key_col=const2 OR key_col IS NULL;
If a WHERE clause includes a col_name IS NULL
condition for a column that is declared as NOT NULL, that expression
will be optimized away. This optimization does not occur in cases when the
column might produce NULL anyway; for example, if it comes from a table
on the right side of a LEFT JOIN.
MySQL 4.1.1 and up can additionally optimize the combination
col_name = expr AND col_name IS NULL, a form that
is common in resolved subqueries. EXPLAIN will show
ref_or_null when this optimization is used.
This optimization can handle one IS NULL for any key part.
Some examples of queries that are optimized, assuming that there is an index
on columns a and b or table t2:
SELECT * FROM t1 WHERE t1.a=expr OR t1.a IS NULL;
SELECT * FROM t1,t2 WHERE t1.a=t2.a OR t2.a IS NULL;
SELECT * FROM t1,t2
WHERE (t1.a=t2.a OR t2.a IS NULL) AND t2.b=t1.b;
SELECT * FROM t1,t2
WHERE t1.a=t2.a AND (t2.b=t1.b OR t2.b IS NULL);
SELECT * FROM t1,t2
WHERE (t1.a=t2.a AND t2.a IS NULL AND ...)
OR (t1.a=t2.a AND t2.a IS NULL AND ...);
ref_or_null works by first doing a read on the reference key, and
then a separate search for rows with a NULL key value.
Note that the optimization can handle only one IS NULL level. In the
following query, MySQL will use key lookups only on the expression
(t1.a=t2.a AND t2.a IS NULL) and not be able to use the key part on
b:
SELECT * FROM t1,t2
WHERE (t1.a=t2.a AND t2.a IS NULL)
OR (t1.b=t2.b AND t2.b IS NULL) ;
6.2.7 How MySQL Optimizes DISTINCT
DISTINCT combined with ORDER BY will need a temporary table
in many cases.
Note that because DISTINCT may use GROUP BY, you should be
aware of how MySQL works with columns in ORDER BY or HAVING
clauses that are not part of the selected columns.
MySQL extends the use of GROUP BY so that you can use columns or
calculations in the SELECT list that don't appear in the GROUP
BY clause. This stands for any possible value for this group. You
can use this to get better performance by avoiding sorting and grouping on
unnecessary items. For example, you don't need to group on
customer.name in the following query:
mysql> SELECT order.custid, customer.name, MAX(payments)
-> FROM order,customer
-> WHERE order.custid = customer.custid
-> GROUP BY order.custid;
In standard SQL, you would have to add customer.name to the
GROUP BY clause. In MySQL, the name is redundant if you don't run in
ANSI mode.
Do not use this feature if the columns you omit from the GROUP
BY part are not unique in the group! You will get unpredictable
results.
In some cases, you can use MIN() and MAX() to obtain a
specific column value even if it isn't unique. The following gives the value of
column from the row containing the smallest value in the sort
column:
SUBSTR(MIN(CONCAT(RPAD(sort,6,' '),column)),7)
When combining LIMIT row_count with DISTINCT, MySQL
stops as soon as it finds row_count unique rows.
If you don't use columns from all tables named in a query, MySQL stops
scanning the not-used tables as soon as it finds the first match. In the
following case, assuming that t1 is used before t2 (which you
can check with EXPLAIN), MySQL stops reading from t2 (for any
particular row in t1) when the first row in t2 is found:
SELECT DISTINCT t1.a FROM t1,t2 where t1.a=t2.a;
6.2.8 How MySQL Optimizes LEFT JOIN and RIGHT
JOIN
A LEFT JOIN B join_condition is implemented in MySQL as
follows:
-
Table B is set to depend on table A and all tables on which
A depends.
-
Table A is set to depend on all tables (except B) that are
used in the LEFT JOIN condition.
-
The LEFT JOIN condition is used to decide how to retrieve rows from
table B. (In other words, any condition in the WHERE clause is
not used.)
-
All standard join optimizations are done, with the exception that a table is
always read after all tables on which it depends. If there is a circular
dependence, MySQL issues an error.
-
All standard WHERE optimizations are done.
-
If there is a row in A that matches the WHERE clause, but
there is no row in B that matches the ON condition, an extra
B row is generated with all columns set to NULL.
-
If you use LEFT JOIN to find rows that don't exist in some table and
you have the following test: col_name IS NULL in the
WHERE part, where col_name is a column that is declared
as NOT NULL, MySQL stops searching for more rows (for a particular key
combination) after it has found one row that matches the LEFT JOIN
condition.
RIGHT JOIN is implemented analogously to LEFT JOIN, with
the roles of the tables reversed.
The join optimizer calculates the order in which tables should be joined. The
table read order forced by LEFT JOIN and STRAIGHT_JOIN helps
the join optimizer do its work much more quickly, because there are fewer table
permutations to check. Note that this means that if you do a query of the
following type, MySQL will do a full scan on b because the LEFT
JOIN forces it to be read before d:
SELECT *
FROM a,b LEFT JOIN c ON (c.key=a.key) LEFT JOIN d ON (d.key=a.key)
WHERE b.key=d.key;
The fix in this case is to rewrite the query as follows:
SELECT *
FROM b,a LEFT JOIN c ON (c.key=a.key) LEFT JOIN d ON (d.key=a.key)
WHERE b.key=d.key;
Starting from 4.0.14, MySQL does the following LEFT JOIN
optimization: If the WHERE condition is always false for the generated
NULL row, the LEFT JOIN is changed to a normal join.
For example, the WHERE clause would be false in the following query
if t2.column1 would be NULL:
SELECT * FROM t1 LEFT JOIN t2 ON (column1) WHERE t2.column2=5;
Therefore, it's safe to convert the query to a normal join:
SELECT * FROM t1,t2 WHERE t2.column2=5 AND t1.column1=t2.column1;
This can be made faster because MySQL can now use table t2 before
table t1 if this would result in a better query plan. To force a
specific table order, use STRAIGHT_JOIN.
This article is excerpted from MySQL
Administrator's Guide, by MySQL AB (editor) (Sams, 2004;
ISBN 0672326345). Check it out at your
favorite bookstore today. Buy
this book now. |