Using Subqueries In MySQL (part 2) - A Solitary Existence (
Page 4 of 9 )
In addition to comparison operators and the IN membership test, MySQL also
comes with a special EXISTS operator, designed specifically for use with
subqueries. The EXISTS operator is used to test if a subquery generates any
result rows, and executes the outer query only if it does.
Here's a
simple example:
mysql> SELECT * FROM clients WHERE EXISTS (SELECT bid FROM
branches_services GROUP BY bid HAVING COUNT(sid) >= 5);
Empty set (0.17 sec)
In this case, since the subquery returns an empty result set - there are no
branches using five or more services - the EXISTS test will return false and the
outer query will not execute.
Look what happens when you doctor the subquery above to return something:
mysql> SELECT * FROM clients WHERE EXISTS (SELECT bid FROM
branches_services GROUP BY bid HAVING COUNT(sid) >= 4);
+-----+-----------------------------+
| cid | cname |
+-----+-----------------------------+
| 101 | JV Real Estate |
| 102 | ABC Talent Agency |
| 103 | DMW Trading |
| 104 | Rabbit Foods Inc |
| 110 | Sharp Eyes Detective Agency |
+-----+-----------------------------+
5 rows in set (0.27 sec)
In this case, since there are some branches using four or more services, the
inner query will return a result set consisting of at least one row, the EXISTS
test will return true, and the outer query will be executed.
It's important to note that the result set of the outer query above
does
*not* list which customers have more than four services. Rather, it is
simply a list of all customers, and it is returned only because the inner query
generated a result set. In the example above, the result set generated by the
inner query is itself immaterial; I could accomplish the same thing with the
following query:
mysql> SELECT * FROM clients WHERE EXISTS (SELECT 1);
+-----+-----------------------------+
| cid | cname |
+-----+-----------------------------+
| 101 | JV Real Estate |
| 102 | ABC Talent Agency |
| 103 | DMW Trading |
| 104 | Rabbit Foods Inc |
| 110 | Sharp Eyes Detective Agency |
+-----+-----------------------------+
5 rows in set (0.16 sec)
With this in mind, you might be wondering what use the EXISTS operator really
is. Good question.
You see, the EXISTS operator is most often used in the context of what SQL
gurus like to call an "outer reference" - that is, a reference within the
subquery to a field in one or more of the queries enclosing it. When such an
outer reference is used, MySQL has to run the subquery once for every record
generated by the outer query, and therefore test the subquery as many times as
there are records in the outer query's result set, since the inner query
contains a reference to a field (whose value is different for every record) in
the outer query.
An example might make this clearer. Consider the following, which re-runs the
previous example with an outer reference to make the result set
more
useful:
mysql> SELECT bid, bdesc FROM branches WHERE EXISTS (SELECT bid FROM
branches_services WHERE branches.bid = branches_services.bid GROUP BY bid HAVING
COUNT(sid) >= 4);
+------+--------------+
| bid | bdesc |
+------+--------------+
| 1011 | Corporate HQ |
+------+--------------+
1 row in set (0.22 sec)
And if you look at the data, you'll see that there is, in fact, only one
branch using four or more services (branch ID 1011).
In this case, since the inner query contains a reference to a field in the
outer query, MySQL cannot run the inner query just once (as it usually does).
Rather, it has to run it over and over, once for every row in the outer table,
then substitute the value of the named field from that row in the subquery, and
then decide whether or not to include that outer row in the final result set on
the basis of whether or not the corresponding subquery returns a result set or
not.
Needless to say, this is expensive in terms of performance, and outer
references should hence be avoided unless absolutely necessary. Usually, you can
accomplish the same thing faster and more optimally with a join,
as
below:
mysql> SELECT branches.bid, branches.bdesc FROM branches_services,
mysql> branches
WHERE branches_services.bid = branches.bid GROUP BY bid HAVING COUNT(sid)
>= 4;
+------+--------------+
| bid | bdesc |
+------+--------------+
| 1011 | Corporate HQ |
+------+--------------+
1 row in set (0.28 sec)