MySQL
  Home arrow MySQL arrow Page 3 - An Introduction to Database Normalizat...
Dev Shed Forums 
Administration  
Apache  
BrainDump  
DHTML  
Flash  
Java  
JavaScript  
Multimedia  
MySQL  
Oracle  
Perl  
PHP  
Practices  
Python  
Reviews  
Security  
Style-Sheets  
Web Services  
XML  
Zend  
Zope  
Forums Sitemap 
IBM® developerWorks 
Dedicated Servers 
E-Commerce Hosting 
Linux Web Hosting 
Managed Hosting 
Small Business Hosting 
Download TestComplete 
VPS Hosting 
Weekly Newsletter

 
Developer Updates  
Free Website Content 
IBM Rational Software Development Conference
 RSS  Articles
 RSS  Forums
 RSS  All Feeds
Write For Us Get Paid 
Request Media Kit
Contact Us 
Site Map 
Privacy Policy 
Support 
 USERNAME
 
 PASSWORD
 
 
  >>> SIGN UP!  
  Lost Password? 
MYSQL

An Introduction to Database Normalization
By: W.J. Gilmore
  • Search For More Articles!
  • Disclaimer
  • Author Terms
  • Rating: 4 stars4 stars4 stars4 stars4 stars / 114
    2000-11-27

    Table of Contents:
  • An Introduction to Database Normalization
  • Preliminary Definitions
  • So Why Normalize?
  • The Three Normal Forms
  • What's Next

  • Rate this Article: Poor Best 
      ADD THIS ARTICLE TO:
      Del.ici.ous Digg
      Blink Simpy
      Google Spurl
      Y! MyWeb Furl
    Email Me Similar Content When Posted
    Add Developer Shed Article Feed To Your Site
    Email Article To Friend
    Print Version Of Article
    PDF Version Of Article
     
     
     
    ADVERTISEMENT

    TestComplete™ automates software testing for a fraction of what the big guys charge. Easy functional and load testing for all Windows, .NET, Java and Web apps. Download a free trial now.

    An Introduction to Database Normalization - So Why Normalize?
    (Page 3 of 5 )

    Thus far the only thing that I have really stated about database normalization is that it provides for table optimization through the investigation of entity relationships. But why is this necessary? In this section, I’ll elaborate a bit upon why normalization is necessary when creating commercial database applications.
    Essentially, table optimization is accomplished through the elimination of all instances of data redundancy and unforeseen scaleability issues.

    Redundancy

    Data redundancy is exactly what you think it is; the repetition of data. One obvious drawback of data repetition is that it consumes more space and resources than is necessary. Consider the following table:

    Table 1-1: Poorly defined table

    student_id

    class_name

    time

    location

    professor_id

    999-40-9876

    Math 148

    MWF 11:30

    Rm. 432

    prof145

    999-43-0987

    Physics 113

    TR 1:30

    Rm. 12

    prof143

    999-42-9842

    Botany 42

    F 12:45

    Rm. 9

    prof167

    999-41-9832

    Matj 148

    MWF 11:30

    Rm. 432

    prof145


    Basically this table is a mapping of various students to the classes found within their schedule. Seems logical enough, right? Actually, there are some serious issues with the choice to store data in this format. First of all, assuming that the only intention of this table is to create student-class mappings, then there really is no need to repeatedly store the class time and professor ID. Just think that if there are 30 students to a class, then the class information would be repeated 30 times over!

    Moreover, redundancy introduces the possibility for error. You might have noticed the name of the class found in the final row in the table (Matj 148). Given the name of the class found in the first row, chances are that Matj 148 should actually be Math 148! While this error is easily identifiable when just four rows are present in the table, imagine finding this error within the rows representing the 60,000 enrolled students at my alma mater, The Ohio State University. Chances that you’ll find these errors are unlikely, at best. And the cost of even attempting to find them will always be high.

    Unforeseen Scaleability Issues

    Unforeseen scaleability issues generally arise due to lack of forethought pertaining to just how large a database might grow. Of course, as a database grows in size, initial design decisions will continue to play a greater role in the speed of and resources allocated to this database. For example, it is typically a very bad idea to limit the potential for expansion of the information that is to be held within the db, even if there are currently no plans to expand. For example, structurally limiting the database to allot space for only three classes per student could prove deadly if next year the school board decides to permit all students to schedule three classes. This also works in the opposite direction; What if the school board subsequently decides to only allow students to schedule two classes? Have you allowed for adequate flexibility in the design so as to easily adapt to these new policies?

    The remedy to these problems is through the use of a process known as database normalization. A subject of continued research and debate over the years, several general rules have been formulated that layout the process one should follow in the quest to normalize a database. I’ll discuss these rules in the next section, “The Three Normal Forms”.

    More MySQL Articles
    More By W.J. Gilmore


     

       

    MYSQL ARTICLES

    - MySQL Table Prefix Changer Tool in PHP
    - Using the SIGNAL Statement for Error Handling
    - Error Handling Examples
    - Error Handling
    - Completing a Search Engine with MySQL and PH...
    - Paginating Result Sets for a Search Engine B...
    - Building a Search Engine with MySQL and PHP 5
    - Using Boolean Operators for Full Text and Bo...
    - PHP, MySQL and the PEAR Database
    - Working with PHP and MySQL
    - Getting PHP to Talk to MySQL
    - Creating an RSS Reader: the Reader
    - MySQL Security Overview
    - Creating the Admin Script for a PHP/MySQL Bl...
    - Creating the Blog Script for a PHP/MySQL Blo...




    © 2003-2008 by Developer Shed. All rights reserved. DS Cluster 2 hosted by Hostway