MySQL Table Joins

One of the great characteristics of SQL (Structured Query Language) is the fact that one can develop, access and modify data across various tables. There are several benefits to this, including greater ease of manipulation, increased speed of access, and reduced data redundancy. In MySQL (as well as many other SQL languages), this is accomplished via the join command.

One of the great characteristics of SQL (Structured Query Language) is the fact that one can develop access and modify data across various tables. There are several benefits to this, including greater ease of manipulation, increased speed of access, and reduced data redundancy. In MySQL (as well as many other SQL languages), this is accomplished via the join command.

So what does a join do, anyway? In short, a join command unites some or all of the data from two or more tables into one comprehensive structure. If this capability was not available, then the administrator would be forced to create ever-expanding, all-encompassing tables that would result in the reduction of overall efficiency and speed of data access and manipulation, even eventually possibly causing the database structure to fail entirely. The joining capability allows the administrator to create small, compact relational tables that can greatly influence database performance. This is especially useful among larger organizations, given the fact that they have the opportunity to divide their database into several departmental sections, allowing each department to effectively manage their relevant data. The alternative to this would be to force one administrator to manage all departmental sections, something that probably does not bode well both in terms of overall organization and efficiency.

For sake of illustration, let’s assume a computer manufacturer is in need of a database in which it can efficiently mine data. This data will be divided into various compact tables, each specifying data relative to a certain concept, such as orders, clients and products. After the tables are created, various examples with be illustrated, showing how this manufacturer can use joins to make the most of the database.

(Note to reader: While the concepts to follow could be applied to practically all SQL-based databases, it should be noted that the examples are illustrated using MySQL syntax. Therefore, there could be minor differences between this syntax and other sql-based languages.)

The first table will hold the various types of PCs that the manufacturer distributes:


mysql> create table pcs (     -> pid INT, // product id     -> spec char(3),     -> os char(10),     -> ram INT,     -> hd char(4)     -> );
The second table will contain data regarding the various clients of the manufacturer:


mysql> create table clients (     -> name char(25),     -> cid char(8), // client id     -> email char(25),     -> tel char(10)     -> );

The third table will contain data regarding order information:


mysql> create table orders (     -> order_date date,     -> pid INT,     -> cid char(8)     -> );

Data Insertion

We next insert data into each table:

pcs
 
pid spec os ram hd
1 386 linux 64 3.1
2 386 linux 128 4.2
3 486 WinNT 64 3.1
4 586 Linux 128 4.2
5 586 Win98 128 6.4

clients
 
name cid email tel
acme, inc. acm-042 acme@acme.com 123-456-7890
widgets-r-us, inc. wig-043 widgets@rus.com 421-555-3434
italimp, inc. imp-042 italian@imports.com 459-555-3212
fedey, inc. fed-043 fed@ey.com 439-555-8899

orders
 
date pid cid
1999-12-05 2 acm-042
1999-12-04 3 wig-043
1999-12-04 1 acm-042
1999-12-05 2 acm-042
1999-12-12 5 fed-043
1999-12-05 5 imp-042


So, how does one go about pulling useful data from these tables? What if the boss wanted a report listing the email addresses of all clients who ordered a particular pc on a particular date? Or perhaps the amount of RAM used in filling orders placed by a particular client? This is easily accomplished via the various joins available to us. Let’s use the tables we have created and construct our first join statement. {mospagebreak title=The Cross Join} The cross-join is a very basic type of join that simply matches each row from one table to every row from another table. While not very efficient, it does illustrate the common characteristic of all joins: table unification.

An example of a cross-join would be as follows:


mysql> SELECT * FROM pcs, clients;

So why are there so many rows? Remember that a cross-join matches up each row of one table with every row of the second table. Therefore, a join involving two tables in which one table has 3 rows and the second table has 4 rows would result in 12 rows.

Perhaps it is easy to remember the cross-join by putting the result in the perspective that each row in table1 crosses every row in table 2 once.

VoilĂ ! You have successfully performed your first join! Not so hard after all, is it?

Now try the following and see what happens:


mysql> select c.name, o.cid from orders o, clients c where o.cid = “acm-042″;

name cid
acme, Inc. acm-042
acme, Inc. acm-042
acme, Inc. acm-042
widgets-r-us, inc. acm-042
widgets-r-us, inc. acm-042
widgets-r-us, inc. acm-042
italimp, inc. acm-042
italimp, inc. acm-042
italimp, inc. acm-042
fedey, inc. acm-042
fedey, inc. acm-042
fedey, inc. acm-042

So what did you think about the result? Was it what you expected it to be? This time, each name in table clients was matched to every row in orders containing “acm-042″. It still fits our above explanation of a cross-join, yet it is more selective in which rows are “crossed”. Try playing around with the order of the query, and see what happens. Notice that when the table order is switched, the output will change accordingly.

Note: Why are aliases used when naming tables? Aliases are used as a method of saving repetitious keystrokes when entering queries. Thus, instead of repeatedly spelling out ‘clients’ when specifying columns, we could use ‘c’ and then state ‘from clients c’ within our query.

Although the cross-join does in fact bring our tables together, it is not very efficient. So how can we sparingly choose the data we want from each table? Read on to find out more. {mospagebreak title=The Equi-join} The equi-join is performed when certain values contained in one table are matched with values contained within a second (or multiple) table(s).

Let’s assume that we want a list of clients who ordered a pc with the product id 1:


mysql> select p.os, c.name from orders o, pcs p, clients c
where p.pid=o.pid and o.pid = 1 and o.cid=c.cid”;

os name
linux acme, Inc.

The Non-Equi-Join

The equi-join works great when we want to match corresponding data between multiple tables. But what if there was a need to display data according to their characteristic of inequality? For example, the boss requires a list of all operating systems (os) and their corresponding order id numbers where the order pid is greater than the product pid. The solution? The aptly named non-equi join.


mysql> SELECT p.os, o.pid from orders o, pcs p where o.pid > p.pid;

os pid
linux 2
linux 3
linux 2
linux 5
linux 5
linux 3
linux 5
linux 5
WinNT 5
WinNT 5
linux 5
linux 5

This matches all rows where the order table pid is greater than the product table pid. Look carefully, as you will realize that this is simply a cross-join with various restrictions placed upon it. While this may not be exceptionally useful to the boss, it does set the stage for a very useful function, the left join. Let’s move on, concentrating upon the various options available when using the left join. {mospagebreak title=The Left Join} The left join allows the user to pull out all sorts of interesting data based upon certain restrictions. This is an extremely powerful option of table joins, and greatly facilitates table manipulation.

Now, the boss is screaming for some details. Details, details, details! It’s no problem however, as the left join will solve our problem.

Assume that


mysql> select * from orders left join pcs on orders.pid = pcs.pid”;

order_date pid cid pid spec os ram hd
1999-12-05 2 acm-042 2 386 linux 128 4.2
1999-12-04 3 wig-043 3 486 WinNT 64 3.1
1999-12-04 1 acm-042 1 386 linux 64 3.1
1999-12-05 2 acm-042 2 386 linux 128 4.2
1999-12-12 5 fed-043 5 586 Win98 128 6.4
1999-12-05 5 imp-042 5 586 Win98 128 6.4

We now have a informative listing of all pcs ordered by our clients! Using a PHP3 or Perl script, one could see how this could be used to print out receipts, for example. We could combine this with the client table for reason of emailing the client an occasional email with a list of all products he has purchased from our company.

Perhaps another useful report we could generate would involve learning of the number of pcs ordered that had product id (pid) number 3.


mysql> select * from orders left join pcs on pcs.pid = 3 and orders.pid = pcs.pid;

order_date pid cid pid spec os ram hd
1999-12-05 2 acm-042 2 null null null null
1999-12-04 3 wig-043 3 486 WinNT 64 3.1
1999-12-04 1 acm-042 1 null null null null
1999-12-05 2 acm-042 2 null null null null
1999-12-12 5 fed-043 5 null null null null
1999-12-05 5 imp-042 5 null null null null

The Using Clause

A variation to the left join allows us to further correlate identical columns residing in multiple tables. This is the using option. Instead of the following:


mysql> SELECT * from clients join on orders where clients.cid = orders.cid;

We could state:


mysql> SELECT * from clients join on orders using (cid);

However, both would produce the same result:

name cid email tel order_date pid cid
acme, inc. acm-042 acme@acme.com 123-465-789 1999-12-05 2 acm-042
acme, inc. acm-042 acme@acme.com 123-465-789 1999-12-04 1 acm-042
acme, inc. acm-042 acme@acme.com 123-465-789 1999-12-05 2 acm-042
widgets-r-us, inc. wig-043 widgets@rus.com 421-555-3434 1999-12-04 3 wig-043
italimp, inc. imp-042 widgets@rus.com 459-555-3434 1999-12-05 5 imp-042
fedey, inc. fed-043 fed@ey.com 439-555-8899 1999-12-12 5 fed-043

There you have it. Table joins made easy. Try playing around with variations of the commands highlighted within this article to gain a clear understanding of the syntax. Once this is understood, you will find that table joins will play an integral part in your development activities. Be sure to check out MySQL’s various discussion groups (http://www.mysql.com), as there is usually quite a bit of information exchanged regarding table joins. {mospagebreak title=Self-joins} The self-join provides the administrator with a powerful method of centralizing relational data to a single table. In fact, the self-join is performed by joining a particular table to itself. Let’s illustrate this concept with an example:

Suppose we are in control of a large database containing information regarding various pieces of hardware used to build a computer workstation. A workstation may consist of a desk, pc, monitor, keyboard and mouse. Furthermore, the desk can be considered the ‘parent’ of all other parts of the workstation. We want to keep accurate records of each workstation, so we will correlate all parts of a specific workstation together via a unique id number. Actually, each part will contain two id numbers, one unique to that specific item, and one identifying its’ parent (the desk) id number.

Assume that this is our table:

uniq_id name parent_id
d001 desktop null
m4gg monitor d001
k245 keyboard d001
pc345 200mhz pc d001
d002 desktop null
m156 monitor d002
k9334 keyboard d002
pa556 350 mhz pc d002

Notice that the desktop does not have a parent_id, since it is in fact the parent for all of its’ corresponding parts. With the table filled with data, we can now begin querying it for useful information. Also note that while our table is simple for reason of best illustration of use of the self-join, one could provide significantly more useful information regarding each item.


mysql> select t1.*, t2.* from page5 as t1, page5 as t2;
So what is the outcome? Like previously seen with such as join regarding two tables, each row from the first table will be matched with every row in the second table. Try it and see. Again, however, this is not very useful to us. Let’s look at a more interesting example:

We are interested in viewing information regarding a specific workstation in which we several technical support calls had been made. We know what the particular workstation id is (the desk id). Let’s query the database to pull up all relevant pieces of this workstation:


mysql> select parent.uniqid, parent.name, child.uniqid, child.name     -> from page5 as parent, page5 as child     -> where child.parent_id = parent.uniqid AND parent.uniqid = “d001″;
This provides a much more interesting outcome, displayed as follows:
uniqid name uniqid name
d001 desktop m4gg monitor
d001 desktop k245 keyboard
d001 desktop pc345 200 mhz pc

The self-join is also used as an efficient method of verifying table data. Since the uniqid column within the table is intended to be unique, it would not be good if the data-entry dept. accidentally entered two items with the same uniqid into the database. This could be periodically checked by using a self-join. Assume that we modified the 350 mhz pc uniqid to be ‘m156′ (which is incidentally the uniqid value of the monitor belonging to workstation ‘d002′). Consider the following example:


mysql> select parent.uniqid, parent.name, child.uniqid, child.name     -> from page5 as parent, page5 as child     -> where parent.uniqid = child.uniqid AND parent.name <> child.name
This would result in the following:

uniqid name uniqid name
m156 350 mhz pc m156 monitor
m156 monitor m156 350 mhz pc

There you have it. Table joins made easy. Try playing around with variations of the commands highlighted within this article to gain a clear understanding of the syntax. Once this is understood, you will find that table joins will play an integral part in your development activities. Be sure to check out MySQL’s various discussion groups (http://www.mysql.com), as there is usually quite a bit of information exchanged regarding table joins.

[gp-comments width="770" linklove="off" ]
antalya escort bayan antalya escort bayan