What’s New In MySQL 4.1 Part One: Overview and Subqueries
Scalar and Correlated Subqueries
The current release of MySQL, version 4.1.10, offers significant improvements over version 4.0. While it still has some room for improvement, its new features and capabilities should silence the critics who have until now regarded it as little more than a toy. In this article, the first of two parts, David Fells covers scalar and correlated subqueries, derived tables, and row-level subqueries.
Subqueries and derived tables are arguably the biggest and most important change in MySQL 4.1. A subquery is, as the name suggests, a query within a query. There are five general types of subqueries in standard SQL, all of which are supported. A subquery may be used as a scalar, as a row, as a table to select from, as a table to test membership against, or as a correlated subquery.
A scalar subquery is a subquery that returns a single value of a basic data type, meaning that the result is one value rather than a set of rows or columns. This can also be thought of as one column from one row. Scalar subqueries may be used in the field list of a SELECT statement, for comparison in a WHERE clause, or in the VALUES list of an INSERT statement. Scalar subqueries are the kind most easily replaced by a join, whereas some of the other subquery types cannot be replicated with joins. Here is an example of a scalar subquery.
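The original listing is not reproduced here, so what follows is only a minimal sketch consistent with the description, assuming a table named table1 with a single column, column1, that holds one row with the value 1. The column type and the setup statements are assumptions made for illustration.

```sql
-- Assumed setup: one table, one column, one row.
CREATE TABLE table1 (column1 INT);
INSERT INTO table1 VALUES (1);

-- The parenthesized SELECT is the scalar subquery:
-- it yields one column from one row, used here in the field list.
SELECT (SELECT column1 FROM table1);
```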
The sample query would return a 1, because the value selected from table1 would be 1. If there were multiple rows in table1, the subquery would fail unless a WHERE clause restricted the result to a single row. This example is academic, but it demonstrates the nature of a scalar subquery. Scalar subqueries can typically be replaced with joins, which are usually more efficient in terms of processing time. Keep this in mind when constructing queries: the fact that subqueries are available does not make them the right tool for the job.
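As a rough illustration of both points, the sketch below compares a scalar subquery in a WHERE clause with the join that replaces it, then shows the multiple-row failure. The customers and orders tables, their columns, and the value 'Acme' are hypothetical and not part of the original example.

```sql
-- Hypothetical schema, for illustration only.
CREATE TABLE customers (id INT PRIMARY KEY, name VARCHAR(50) UNIQUE);
CREATE TABLE orders (id INT PRIMARY KEY, customer_id INT, total DECIMAL(10,2));

-- Scalar subquery in a WHERE comparison. The UNIQUE constraint on name
-- guarantees that the subquery yields at most one value.
SELECT id, total
FROM orders
WHERE customer_id = (SELECT id FROM customers WHERE name = 'Acme');

-- The same result expressed as a join.
SELECT o.id, o.total
FROM orders AS o
INNER JOIN customers AS c ON c.id = o.customer_id
WHERE c.name = 'Acme';

-- Without a restricting WHERE clause the subquery yields every customer id.
-- If customers holds more than one row, MySQL rejects the statement with
-- ERROR 1242 (21000): Subquery returns more than 1 row
SELECT id, total
FROM orders
WHERE customer_id = (SELECT id FROM customers);
```

Both of the first two statements return the same rows. Which one runs faster depends on the data and the optimizer, so it is worth checking each with EXPLAIN.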
The next type of subquery is possibly the most useful, and the most taxing on the server: the correlated subquery. A correlated subquery is a subquery that refers to one or more tables outside of the subquery expression. Consider this example:
SELECT * FROM table1 WHERE column1 = ( SELECT column1 FROM table2 WHERE table1.column1 = table2.column1 )
In this example, the subquery's FROM clause does not mention table1, so the engine looks to the outer query, where it finds table1. If table1 were not in the outer query’s FROM clause, this statement would produce an error. You may have noticed that this subquery is also a scalar subquery: for each row of the outer query, it must return a single value. Depending on its complexity, this type of query often cannot be replicated with a join.
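A somewhat more practical sketch of a correlated subquery is shown below. The employees table and its columns are hypothetical, chosen only to show the inner query reading a value from the current row of the outer query.

```sql
-- Hypothetical schema, for illustration only.
CREATE TABLE employees (
    id INT PRIMARY KEY,
    name VARCHAR(50),
    department_id INT,
    salary DECIMAL(10,2)
);

-- Employees who earn more than the average salary of their own department.
SELECT e.name, e.salary
FROM employees AS e
WHERE e.salary > (
    SELECT AVG(e2.salary)
    FROM employees AS e2
    WHERE e2.department_id = e.department_id  -- e is resolved in the outer query
);
```

Because the inner query depends on e.department_id, it is logically evaluated once for each row of the outer query, which is why correlated subqueries can be taxing on the server.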
Subqueries can be nested to 63 levels in theory, though the query optimizer will likely fall apart after four or five if they are correlated. It is important to remember that nested subqueries are resolved from the innermost outward. To get the data you expect, you will often need to alias tables, and sometimes columns, when a subquery refers to a column in a table beyond the first outer query.
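The aliasing point can be sketched with a two-level nesting in which the innermost subquery refers to a table two levels out. The schema is hypothetical, and the aliases c, o, and i are illustrative names only.

```sql
-- Hypothetical schema, for illustration only.
CREATE TABLE customers (id INT PRIMARY KEY, name VARCHAR(50), favorite_product_id INT);
CREATE TABLE orders (id INT PRIMARY KEY, customer_id INT);
CREATE TABLE order_items (order_id INT, product_id INT);

-- Customers who have at least one order containing their favorite product.
SELECT c.name
FROM customers AS c
WHERE EXISTS (
    SELECT 1
    FROM orders AS o
    WHERE o.customer_id = c.id                          -- one level out
      AND EXISTS (
          SELECT 1
          FROM order_items AS i
          WHERE i.order_id = o.id                       -- one level out
            AND i.product_id = c.favorite_product_id    -- two levels out
      )
);
```

Both customers and orders have a column named id, so without the aliases an unqualified id inside the nested queries would resolve to the nearest enclosing table, which is not always the one intended.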