It’s hard to admit, but sooner or later we have to face the facts: most web developers (including myself) are reluctant to write SQL queries from scratch. To be honest, this happens for a logical reason: having so many languages and technologies to deal with at the same time on the client and server-side, it’s perfectly understandable that we don’t want to waste our “valuable” time hard coding boring SQL commands that in most cases have been neatly wrapped inside a framework or a library.
Having such an evasive attitude with SQL is simply to defer an unavoidable fact: even if you use a framework that does the hard work for you and runs a bunch of SQL statements behind a fancy API, eventually you’ll be confronted with a project where you’ll find yourself writing fine-tuned SQL code that (hopefully) will improve the performance of the database trips performed by your application.
If you want to take your current programming skills to the next level, you must have at least an average background in SQL. One of the best ways to embrace this challenge is by learning the basics of SQL joins, as they permit you to get the most out of any RDBMS.
Therefore, if you’re still wondering how SQL joins can help you in your web development endeavors, or simply want to fill some gaps in your knowledge, in this two-part tutorial I’ll be providing you with some easy-to-grasp examples, which will show you how to perform the most common types of joins, including FULL, INNER, LEFT and RIGHT joins in MySQL. Keep in mind, however, that you’ll be able to port with little or no modifications the entirety of the examples to other popular RDBMS’ as well.
Taking the First Step: Defining Some Sample MySQL Tables
As its name suggests, SQL joins allow you to retrieve data from two or more tables in one go. While the driving logic is that simple, the functionality of joins is indeed a powerful one, as it permits you to optimize the number (and the quality) of queries that are executed against a database.
Since I want to keep the examples included in this tutorial approachable and easy to follow, to best show how to work with joins in MySQL, I’m going to use only two sample tables (even though it’s possible to join more tables in a fairly easy fashion).
Having said that, it’s time to show the definition of the first table. This is a “classic”, and comes in handy for storing information about some books. Its definition is as follows:
DROP TABLE IF EXISTS `test`.`books`;
CREATE TABLE `test`.`books` (
`id` int(4) unsigned NOT NULL auto_increment,
`title` varchar(100) NOT NULL,
`author` varchar(100) NOT NULL,
`category` varchar(100) NOT NULL,
PRIMARY KEY (`id`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8;
There’s not much that can be said about the above “books” table, except that it defines some text columns (aside from the primary key “id”), which allow you to save the title, the author, and the category of a given book. Admittedly, the structure of this table is nothing complicated, so the next thing that needs to be done is to populate it with some actual records. The following batch of INSERTS do exactly that:
INSERT INTO books (id, title, author, category) VALUES (NULL, 'PHP and MySQL Web Development (4th Edition)', 'Luke Welling and Laura Thomson', 'PHP Programming'); INSERT INTO books (id, title, author, category) VALUES (NULL, 'PHP Solutions: Dynamic Web Design Made Easy', 'David Powers', 'PHP Programming'); INSERT INTO books (id, title, author, category) VALUES (NULL, 'Learning Java', 'Patrick Niemeyer and Jonathan Knudsen', 'Java Programming'); INSERT INTO books (id, title, author, category) VALUES (NULL, 'Effective Java (2nd Edition)', 'Joshua Bloch', 'Java Programming'); INSERT INTO books (id, title, author, category) VALUES (NULL, 'C++ Primer Plus (5th Edition)', 'Stephen Prata', 'C++ Programming'); INSERT INTO books (id, title, author, category) VALUES (NULL, 'C++ How to Program (7th Edition)', 'Paul Deitel and Harvey M. Deitel', 'C++ Programming'); INSERT INTO books (id, title, author, category) VALUES (NULL, 'Pro C# 2010 and the .NET 4 Platform', 'Andrew W. Troelsen', 'C# Programming'); INSERT INTO books (id, title, author, category) VALUES (NULL, 'C# 4.0 in a Nutshell: The Definitive Reference', 'Joseph Albahari and Ben Albahari', 'C# Programming'); INSERT INTO books (id, title, author, category) VALUES (NULL, 'C# in Depth, Second Edition', 'Jon Skeet', 'C# Programming'); INSERT INTO books (id, title, author, category) VALUES (NULL, 'Microsoft Visual C# 2010 Step by Step', 'John Sharp', 'C# Programming');
Mission accomplished. Now that the “books” table contains data about some real programming books, it’s time to create a second table. In this way we can play around with both of them and have some fun with joins. This brand new table is called “customers” and – not surprisingly – will be used for storing information about the customers that purchased (or not) some of the previous books.
Here’s the definition of this additional sample table:
DROP TABLE IF EXISTS `test`.`customers`;
CREATE TABLE `test`.`customers` (
`id` int(4) unsigned NOT NULL auto_increment,
`first_name` varchar(45) default NULL,
`last_name` varchar(45) default NULL,
`book_id` int(4) unsigned default NULL,
PRIMARY KEY (`id`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8;
Even though the structure of the “customers” table is self-explanatory, it’s worth stressing that it defines a foreign key called “book_id”, which comes in handy for showing what book has been purchased by a customer (in this case, our costumers have a rather small wallet, so they can buy only one book).
With the previous table already defined, the last step that must be taken is to fill it with a few records. The below code snippet performs this task in a jiffy:
INSERT INTO users (first_name, last_name, book_id) VALUES (NULL, 'Sandra', 'Smith', 1); INSERT INTO users (first_name, last_name, book_id) VALUES (NULL, 'Daniel', 'Norton', 2); INSERT INTO users (first_name, last_name, book_id) VALUES (NULL, 'Susan', 'Wilson', 5); INSERT INTO users (first_name, last_name, book_id) VALUES (NULL, 'Jennifer', 'Adams', 8); INSERT INTO users (first_name, last_name, book_id) VALUES (NULL, 'Joan', 'Wilkinson');
Done. If at this moment, you take a look at the sample tables just defined, they should look like this:
So far, so good, right? Now that the “books” and “customers” tables are up and running, it’s time to play around a bit and retrieve some data from them through some simple joins.
Playing Around with Joins: Performing a FULL Join
As explained in the introduction, the beauty of joins resides in its ability to fetch data from two or more tables using a single query. Considering that I defined only two tables, the first example that I want to show you is how to use the tables in question for performing the simplest kind of join, which turns out to be one called the FULL join or Cartesian product.
Simply put, a FULL join returns a new table containing all the possible combinations between the rows of the first and the second table. Since the earlier “customers” table has five rows, while its cousin “books” contains 10 records, the result of a FULL join would generate a table with a total of 50 (5*10) records. Of course, the best way to understand this process is with functional code, so be sure to check the following query, which performs a FULL join with the aforementioned tables:
SELECT customers.first_name, customers.last_name, books.title FROM customers, books
As you can see, the syntax of a FULL join is pretty intuitive, as you only need to specify which columns should be retrieved, followed by the names of the joined tables separated by a comma (notice the use of the dot notation, in order to avoid possible clashing with fields that have been named the same).
In addition, it’s possible to run the previous join using explicitly the JOIN keyword, something depicted in the following example:
SELECT customers.first_name, customers.last_name, books.title FROM customers JOIN books
Effectively, both expressions are equivalent, even when the latter uses the JOIN clause instead of a comma to specify which tables should be joined. What’s more, regardless of the syntax that you pick up for executing the query, it should produce the table shown in the below snapshot:
There’s no need to be a rocket scientist to realize that FULL joins aren’t very useful in practice, as they only return a bunch of rows that in most cases don’t meet any specified condition. But fear not, as you can easily address this issue by using a new kind of join called INNER join.
Being More Specific with Returned Rows: Working with INNER Joins
As I said in the earlier segment, FULL joins are per se pretty useless, since most of the times you’ll need to retrieve data from two tables that match a given condition. In a case like this, you’ll want to use another type of join called “INNER” join, which will return the rows from both tables that meet the specified condition.
Again, an example is the most useful ally to make things clear; so take a peek at the one below, which uses an INNER join to retrieve the customers that purchased a book along with the book’s title:
SELECT customers.first_name, customers.last_name, books.title FROM customers INNER JOIN books ON customers.book_id=books.id ORDER BY customers.first_name
That was pretty easy to code and read, wasn’t it? As one might expect, the above query will return only the rows in both tables that match the given condition. Moreover, if you’re curious to see the result that this joins produces, the following screen capture should be pretty helpful. Check it out:
At this point, it should be clear to you why our friend Joan Wilkinson doesn’t appear in the result, as she (for unknown reasons) didn’t buy a programming book. Or expressed in other words, her associated row didn’t match the specified condition.
As with a FULL join, it’s possible to use an alternative syntax for an INNER join. Since in this particular case, we’re looking for customers that did purchase a book, the previous query could be rewritten by using a WHERE clause, as shown below:
SELECT customers.first_name, customers.last_name, books.title FROM customers, books WHERE customers.book_id=books.id ORDER BY customers.first_name
There you have it. While the “INNER” keyword hasn’t been explicitly coded in the above statement, if you try it using your own MySQL client program, you should get exactly the same output that you saw before. Considering that both expressions are equivalent, simply feel free to use the one that best suits your preferences.
And with this final example, we’ve come to the end of this first chapter of this hopefully instructive journey on MySQL joins. Of course, there’re still a few other types of joins to study. But these ones will be discussed in the upcoming episode. Meanwhile, stop by the article’s closing thoughts.
In this introductory part of this two-part tutorial, you learned the basic concepts that surround the use of joins in MySQL. So far, I covered FULL and INNER joins, which as you just saw are indeed a breeze to grasp. As I said before, there’re still some additional types of joins that need to discussed in detail. With that idea in mind, in the last installment I’ll be showing you how to retrieve table rows that don’t match a given condition by using the functionality of LEFT and RIGHT joins.
As usual, don’t miss the last part!