Perl and DBI - The Relational of Relational Database
Normally when we create a database of information, we spread the data out among several different tables. These tables will relate to one another in some way, usually by a key or other field in the table.
As an example, let’s expand our information about musicians to describe which instruments each musician plays and some important facts about those instruments. We could add each instrument to that musician’s row in the musicians table, but that would repeat a lot of information. For instance, three of our musicians play the guitar, so any information we provide for a guitar would have to be repeated for each of the three musicians. Also, several of our musicians play more than one instrument (for instance, Thom Yorke plays guitar, sings vocals, and also plays keyboards). If we listed every instrument that Thom plays, our table would become big and difficult to work with.
Instead, let’s create another table, named instruments, that will have this information:
| inst_id | instrument | type | difficulty |
|---|---|---|---|
| 1 | bagpipes | reed | 9 |
| 2 | oboe | reed | 9 |
| 3 | violin | string | 7 |
| 4 | harp | string | 8 |
| 5 | trumpet | brass | 5 |
| 6 | bugle | brass | 6 |
| 7 | keyboards | keys | 1 |
| 8 | timpani | percussion | 4 |
| 9 | drums | percussion | 0 |
| 10 | piccolo | flute | 5 |
| 11 | guitar | string | 4 |
| 12 | bass | string | 3 |
| 13 | conductor | for-show-only | 0 |
| 14 | vocals | vocal | 5 |
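To make the table concrete, here is a minimal sketch that builds it in an in-memory SQLite database. The choice of Python's standard sqlite3 module, the column types, and the PRIMARY KEY on inst_id are our assumptions for illustration; the rows are copied from the table above.

```python
import sqlite3

# Rows copied from the instruments table: (inst_id, instrument, type, difficulty)
INSTRUMENTS = [
    (1, "bagpipes", "reed", 9),
    (2, "oboe", "reed", 9),
    (3, "violin", "string", 7),
    (4, "harp", "string", 8),
    (5, "trumpet", "brass", 5),
    (6, "bugle", "brass", 6),
    (7, "keyboards", "keys", 1),
    (8, "timpani", "percussion", 4),
    (9, "drums", "percussion", 0),
    (10, "piccolo", "flute", 5),
    (11, "guitar", "string", 4),
    (12, "bass", "string", 3),
    (13, "conductor", "for-show-only", 0),
    (14, "vocals", "vocal", 5),
]

conn = sqlite3.connect(":memory:")

# Column types and the primary key are assumptions, not taken from the article.
conn.execute(
    """CREATE TABLE instruments (
           inst_id    INTEGER PRIMARY KEY,
           instrument TEXT NOT NULL,
           type       TEXT NOT NULL,
           difficulty INTEGER NOT NULL
       )"""
)
conn.executemany("INSERT INTO instruments VALUES (?, ?, ?, ?)", INSTRUMENTS)

# Each instrument's facts now live in exactly one row.
strings = [
    row[0]
    for row in conn.execute(
        "SELECT instrument FROM instruments WHERE type = ? ORDER BY inst_id",
        ("string",),
    )
]
print(strings)
```

Because inst_id identifies each row uniquely, other tables can refer to an instrument by that number alone.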
Now that we have defined some instruments and our opinions of their relative difficulty, we somehow need to map the instrument information to the information stored in the musicians table. In other words, we need to indicate how the instruments table relates to the musicians table. We could simply add the inst_id value to the musicians table like this:
| player_id | name | phone | inst_id |
|---|---|---|---|
| 1 | Roger Waters | 555-1212 | 12 |
and so on, but remember that many of our musicians play more than one instrument. We would then need two rows for Roger Waters (he sings, too) and three rows for Thom Yorke. Repeating their information wastes space and makes the database needlessly complex. Instead, let’s create another table that will connect these two tables. We will call it what_they_play, and it will have two fields: player_id and inst_id.
| player_id | inst_id |
|---|---|
| 1 | 11 |
| 1 | 14 |
| 2 | 12 |
| 2 | 14 |
| 3 | 14 |
| 4 | 7 |
| 4 | 11 |
| 4 | 14 |
| 5 | 11 |
| 5 | 14 |
| 6 | 9 |
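A linking table like this can be sketched as follows, again using Python's sqlite3 module as a stand-in for whatever database you use. The composite primary key on (player_id, inst_id) is our own assumption, added so the same pairing cannot be stored twice; the article itself only names the two fields.

```python
import sqlite3

# (player_id, inst_id) pairs copied from the what_they_play table
PAIRS = [
    (1, 11), (1, 14),
    (2, 12), (2, 14),
    (3, 14),
    (4, 7), (4, 11), (4, 14),
    (5, 11), (5, 14),
    (6, 9),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE what_they_play (
           player_id INTEGER NOT NULL,
           inst_id   INTEGER NOT NULL,
           PRIMARY KEY (player_id, inst_id)  -- assumption: one row per pairing
       )"""
)
conn.executemany("INSERT INTO what_they_play VALUES (?, ?)", PAIRS)

# How many instruments does each player have? One row per pairing makes this a count.
counts = dict(
    conn.execute(
        "SELECT player_id, COUNT(*) FROM what_they_play GROUP BY player_id"
    )
)
print(counts)

# Inserting an existing pairing again violates the composite key.
try:
    conn.execute("INSERT INTO what_they_play VALUES (?, ?)", (2, 12))
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True
print(duplicate_rejected)
```

Note that the table holds nothing but ID numbers: a musician who plays three instruments costs three short rows, and no name or phone number is repeated.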
To read all this information and make sense of how it relates, we would first look in the musicians table and find the musician we want, for instance Geddy Lee. We find his player_id, 2, and use that value to look in the what_they_play table. We find two entries in that table for his player_id, which map to two inst_id values: 12 and 14. Taking those two values, we use them as keys in the instruments table and find that Geddy Lee plays the bass and sings for his band.
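The three lookups just described can be sketched as three separate queries. This is an illustration under assumptions: it uses Python's sqlite3 module, loads only the rows needed for Roger Waters and Geddy Lee, and reduces the musicians and instruments tables to the columns the lookup touches.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Simplified subset of the three tables (only the columns and rows this lookup needs).
conn.executescript(
    """
    CREATE TABLE musicians (player_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE instruments (inst_id INTEGER PRIMARY KEY, instrument TEXT);
    CREATE TABLE what_they_play (player_id INTEGER, inst_id INTEGER);
    INSERT INTO musicians VALUES (1, 'Roger Waters'), (2, 'Geddy Lee');
    INSERT INTO instruments VALUES (11, 'guitar'), (12, 'bass'), (14, 'vocals');
    INSERT INTO what_they_play VALUES (1, 11), (1, 14), (2, 12), (2, 14);
    """
)

# Step 1: find the musician's player_id in the musicians table.
player_id = conn.execute(
    "SELECT player_id FROM musicians WHERE name = ?", ("Geddy Lee",)
).fetchone()[0]

# Step 2: use the player_id to collect inst_id values from what_they_play.
inst_ids = [
    row[0]
    for row in conn.execute(
        "SELECT inst_id FROM what_they_play WHERE player_id = ? ORDER BY inst_id",
        (player_id,),
    )
]

# Step 3: use each inst_id as a key into the instruments table.
played = [
    conn.execute(
        "SELECT instrument FROM instruments WHERE inst_id = ?", (inst_id,)
    ).fetchone()[0]
    for inst_id in inst_ids
]

print(player_id, inst_ids, played)
```

Each step feeds its result into the next query, which mirrors reading the tables by hand.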
This example illustrates that the musicians table relates to the instruments table through the what_they_play table. Breaking up the data in our database into separate tables allows us to list each piece of information only once, and it is often more logical than listing all the information in a single table. This is called normalization.
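As a rough sketch of what this buys us, the relationship can be followed in a single SQL JOIN, and a fact stored once needs to be changed in only one place. The code below assumes Python's sqlite3 module and a reduced subset of the tables; the new difficulty value of 6 for vocals is made up purely to demonstrate the update.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Simplified subset of the three tables, for illustration only.
conn.executescript(
    """
    CREATE TABLE musicians (player_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE instruments (
        inst_id INTEGER PRIMARY KEY, instrument TEXT, difficulty INTEGER
    );
    CREATE TABLE what_they_play (player_id INTEGER, inst_id INTEGER);
    INSERT INTO musicians VALUES (1, 'Roger Waters'), (2, 'Geddy Lee');
    INSERT INTO instruments VALUES
        (11, 'guitar', 4), (12, 'bass', 3), (14, 'vocals', 5);
    INSERT INTO what_they_play VALUES (1, 11), (1, 14), (2, 12), (2, 14);
    """
)

# musicians -> what_they_play -> instruments, followed in one statement.
QUERY = """
    SELECT m.name, i.instrument, i.difficulty
    FROM musicians AS m
    JOIN what_they_play AS w ON w.player_id = m.player_id
    JOIN instruments AS i ON i.inst_id = w.inst_id
    ORDER BY m.player_id, i.inst_id
"""
before = conn.execute(QUERY).fetchall()

# The difficulty of vocals is stored in one row, so one UPDATE changes it
# for every musician who sings (6 is a hypothetical value).
conn.execute("UPDATE instruments SET difficulty = 6 WHERE instrument = 'vocals'")
after = conn.execute(QUERY).fetchall()

for row in after:
    print(row)
```

Had the instrument details been copied into every musician's row, the same change would have required touching one row per singer.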