A quick and dirty approach to prevent record duplication: using the “PRIMARY KEY” clause
One of the simplest approaches that can be used for preventing duplicated records consists of defining some of the fields of the target table (or all of them, if you want to) as primary keys. Considering that a column which is a primary key can have only one value assigned to it, the problem can be solved in a jiffy.
With that being said, let me show you how to implement this approach with the “users” table defined before. Here’s how its enhanced definition looks:
DROP TABLE IF EXISTS `test`.`users`; CREATE TABLE `test`.`users` ( `first_name` varchar(45) NOT NULL, `last_name` varchar(45) NOT NULL, `email` varchar(45) NOT NULL, PRIMARY KEY (`first_name`,`last_name`,`email`) ) ENGINE=MyISAM DEFAULT CHARSET=utf8;
As seen above, now the table’s “first_name”, “last_name” and “email” columns have been created as primary keys (even though this may sound rather strange at first, a table can have multiple primary keys), which implies that the values assigned to the columns in question can’t be duplicated.
But what impact does this have in the practice? Well, now it’s possible to run the below SQL snippet any number of times and feel confident that the INSERT statements won’t produce record duplication:
INSERT INTO users (first_name, last_name, email) VALUES ('Sandra', 'Smith', 'email@example.com'); INSERT INTO users (first_name, last_name, email) VALUES ('Daniel', 'Norton', 'firstname.lastname@example.org'); INSERT INTO users (first_name, last_name, email) VALUES ('Susan', 'Wilson', 'email@example.com'); INSERT INTO users (first_name, last_name, email) VALUES ('Jenny', 'Adams', 'firstname.lastname@example.org'); INSERT INTO users (first_name, last_name, email) VALUES ('Thomas', 'Everson', 'email@example.com'); INSERT INTO users (first_name, last_name, email) VALUES ('Mary', 'Brown', 'firstname.lastname@example.org'); INSERT INTO users (first_name, last_name, email) VALUES ('Julie', 'Owen', 'email@example.com'); INSERT INTO users (first_name, last_name, email) VALUES ('Matthew', 'Burke', 'firstname.lastname@example.org'); INSERT INTO users (first_name, last_name, email) VALUES ('Alfred', 'Wayne', 'email@example.com'); INSERT INTO users (first_name, last_name, email) VALUES ('Margaret', 'White', 'firstname.lastname@example.org');
That seems to work like a charm! But before you launch your MySQL client and try out this method, it’s fair to stress that it has a tiny - yet important - catch. If you attempt to assign the same values to the table’s primary keys, they won’t be duplicated at all, but MySQL will issue an error.
In short, we’ve solved one problem, only to face a new one. The question that comes up here is: how can we use our shiny multiple primary keys without getting an error from the database server? This problem can be solved using an “INSERT” statement along with the “IGNORE” clause, which makes MySQL behave silently when attempting to assign multiple values to a primary key.
Avoiding MySQL errors: using the “INSERT IGNORE” statement
As I explained in the preceding segment, it’s really simple to make MySQL discard the errors that it generates when trying to insert multiple values into a column defined as a primary key. In reality, the fix is as easy as performing the insertion process using the INSERT IGNORE statement instead of a common INSERT. As one might expect, the “IGNORE” clause tells the database server to ignore any error raised during the insertion operation(s), something demonstrated by the following code fragment:
If you execute the previous set of commands several times (and assuming that the definition of the “users” table remains the same), no record duplication will occur, and best of all, MySQL won’t issue any errors due to the use of the “IGNORE” clause. In short, we’ve managed to implement a simple method that prevents our tables from having duplicated rows, even at expenses of using multiple primary keys.
It’s valid to note that there exists a few other methods that can be used for fixing record duplication issues - we'll discuss these in the last part of this tutorial series.
That’s all for the moment. Over this introductory part of this two-part tutorial, I discussed a couple of approachable methods that can be used for preventing record duplication in a MySQL table. As you just saw, the use of multiple primary keys along with the “INSERT IGNORE” statement addresses this issue in a pretty efficient manner, and most importantly, without making the database server to throw ugly and unexpected errors.
Nonetheless, the above ones aren’t the only approaches that you can utilize for avoiding record duplication at database level. In fact, it’s possible to solve the problem by using the “REPLACE INTO” statement and even the simple “UNIQUE” clause.
However, the fine details concerning the use of these alternative methods will be covered in the final tutorial.