In addition to comparison operators and the IN membership test, MySQL also comes with a special EXISTS operator, designed specifically for use with subqueries. The EXISTS operator is used to test if a subquery generates any result rows, and executes the outer query only if it does. Here's a simple example:
In this case, since the subquery returns an empty result set - there are no branches using five or more services - the EXISTS test will return false and the outer query will not execute. Look what happens when you doctor the subquery above to return something:
In this case, since there are some branches using four or more services, the inner query will return a result set consisting of at least one row, the EXISTS test will return true, and the outer query will be executed. It's important to note that the result set of the outer query above does
With this in mind, you might be wondering what use the EXISTS operator really is. Good question. You see, the EXISTS operator is most often used in the context of what SQL gurus like to call an "outer reference" - that is, a reference within the subquery to a field in one or more of the queries enclosing it. When such an outer reference is used, MySQL has to run the subquery once for every record generated by the outer query, and therefore test the subquery as many times as there are records in the outer query's result set, since the inner query contains a reference to a field (whose value is different for every record) in the outer query. An example might make this clearer. Consider the following, which re-runs the previous example with an outer reference to make the result set more
And if you look at the data, you'll see that there is, in fact, only one branch using four or more services (branch ID 1011). In this case, since the inner query contains a reference to a field in the outer query, MySQL cannot run the inner query just once (as it usually does). Rather, it has to run it over and over, once for every row in the outer table, then substitute the value of the named field from that row in the subquery, and then decide whether or not to include that outer row in the final result set on the basis of whether or not the corresponding subquery returns a result set or not. Needless to say, this is expensive in terms of performance, and outer references should hence be avoided unless absolutely necessary. Usually, you can accomplish the same thing faster and more optimally with a join, as
blog comments powered by Disqus |