HomeMySQL Optimizing Queries with Operators, Branching and Functions
Optimizing Queries with Operators, Branching and Functions
This article will give you a good grounding in operators, branching and functions in MySQL, so you can make the database, instead of your own code, do the bulk of the work. It is the first of three parts, and excerpted from chapter four of Beginning MySQL Database Design and Optimization: From Novice to Professional, written by Jon Stephens and Chad Russell (Apress; ISBN: 1590593324).
IN THIS CHAPTER, we’re going to explore the central premise of this book: It is almost always the best course of action to make the database perform as much work as possible, thus minimizing the amount of filtering and manipulating data that you do in your programming code. Another way of saying this is that by replacing program logic with SQL logic, it is possible to gain significant optimization benefits. A strong working knowledge of operators, branching, and functions in MySQL, with this optimization goal in mind, is a key part of your database skills.
While the SQL dialect supported in MySQL isn’t a complete programming language in and of itself, it does provide many capabilities for manipulating and filtering data. These include logical and arithmetic operators, functions for working with strings and numbers, and branching operators that allow you to make and act on decisions within queries. As you’ll see in this chapter, using these, you can create powerful queries and open new optimization opportunities.
We’ll start out the chapter by demonstrating that it’s possible to replace a lot of your program logic with SQL logic and achieve optimization benefits. Then we’ll cover the following MySQL features:
Logical, arithmetic, and comparison operators
Math functions
Functions for manipulating strings
Date and time functions
User-defined variables
Decision-making and flow-control constructs
With the possible exception of the basic operators, these are among MySQL’s most underused and underappreciated features.
By the time you’ve finished this chapter, you’ll be familiar with most of these features, and you’ll be able to understand their role in optimizing your own projects to improve their performance.
Replacing Program Logic with SQL Logic: A Demonstration
Many application developers don’t seem to realize that SQL, while not a complete programming language in its own right, still has a wealth of constructs that can be used to fine-tune queries so that only necessary data is returned by them. Quite often, we’ve seen cases where programmers will return overly large result-sets, and then filter or modify them in program code.
For example, consider a scenario in which we wish to output to a web page a list of the members of some organization. Let’s suppose the member data is stored in a memberstable that was created using this statement:
CREATE TABLE members ( firstname VARCHAR(30) NOT NULL, lastname VARCHAR(30) NOT NULL, # possibly more column definitions that don’t relate to this example... dob DATE NOT NULL );
We want to format the listing nicely as an HTML table with the full name of each member in one of two columns and an age range in the other. Here’s one way we could do that using PHP 4:
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd"> <html> <head> <meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1"> <title>List All Members With Age Ranges [PHP4 / Logic In PHP Code]</title> </head> <body> <?php // establish a connection to MySQL, and select the proper database.. . $connection = mysql_connect("localhost", "jon", "mypass"); mysql_select_db("mydb", $connection); // get the current year for the age calculation below... $year = (int) date("Y"); // submit a query... $query = "SELECT firstname, lastname, dob FROM members ORDER BY lastname"; $result = mysql_query($query); ?> <!-- begin table display --> <table cellpadding="5" cellspacing="0" border="1"> <tr><th colspan="2">MEMBERS BY AGE GROUP</th></tr> <tr><th>Name</th><th>Age Group</th></tr> <?php // for each row returned from the database... while($row = mysql_fetch_assoc($result)) { extract($row); // get the year of birth from the birthdate column // get the age by subtracting it from the current year $age = $year - (int) substr($dob, 0, 4); // determine the age range based on the year if($age >= 65) $age_range = "Over 65"; elseif($age >= 45) $age_range = "45-64"; elseif($age >= 30) $age_range = "30-44"; elseif($age >= 18) $age_range = "18-29"; else $age_range = "Under 18"; // output the resulting age range along with the member's name echo "<tr><td>$firstname $lastname</td><td>$age_range</td></tr>\n"; } ?> </table> <!-- end table display --> </body> </html>
NOTE Anyone who is familiar with PHP might wonder why we didn’t use a switch ... case block instead of the if ... elseif ... else. The reason is that Python doesn’t have an analogue to the former. And the reason why we care about this will become apparent shortly. So please bear with us.
There’s nothing really wrong with this approach (it does work, after all), but notice that we need to process each row of the result set as part of the output loop. Wouldn’t it be nice if we could obtain exactly the data we needed from the database, formatted exactly as we would like to use it in the output?
It turns out that we can do exactly that, by moving the formatting and decision-making functionality into the query that we send to MySQL:
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd"> <html> <head> <meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1"> <title>List All Members With Age Ranges [PHP4 / Logic in SQL]</title> </head> <body> <?php // connect and select.. . $connection = mysql_connect("localhost", "jon", "mypass"); mysql_select_db("mydb", $connection); // new and improved query, which handles the logic and formatting $query = "SELECT CONCAT(firstname, ' ', lastname) AS name, CASE WHEN (@age := YEAR(CURRENT_DATE) - YEAR(dob)) > 65 THEN 'Over 65' WHEN @age >= 45 THEN '45-64' WHEN @age >= 30 THEN '30-44' WHEN @age >= 18 THEN '18-19' ELSE 'Under 18' END AS age_range FROM members ORDER BY lastname"; $result = mysql_query($query); ?> <!-- begin table display --> <table cellpadding="5" cellspacing="0" border="1"> <tr><th colspan="2">MEMBERS BY AGE GROUP</th></tr> <tr><th>Name</th><th>Age Group</th></tr> <?php // for each record returned by the query... while($row = mysql_fetch_assoc($result)) { // extract and output the data extract($row); echo "<tr><td>$name</td><td>$age_range</td></tr> \n"; } ?> </table> <!-- end table display --> </body> </html>
By the time you reach the end of this chapter, you’ll be able to come back to this example and see exactly what the complex query does, as well as write your own. For now, just assume that we know what we’re doing, and that the end result is the same as in the first example.
This second method has three main optimization advantages. We’ll cover each advantage one by one now in some detail, as they support the central premise that we are conveying to you in this chapter. We’ll also revisit this example at the end of the chapter.