Data Definition Language, Part 1

Studying for the MySQL Certification exam? This article, the first of two parts, covers roughly 10 percent of the material that will appear on the exam, including general database and table properties, storage engines and table types, and more. It is excerpted from chapter four of the book MySQL Certification Guide written by Paul Dubois et. al. (Sams, 2004, ISBN: 0672326329).

This chapter covers the following exam topics:

  • General database and table properties

  • Storage engines and table types

  • Limits on number and size of database components

  • Identifier syntax

  • The CREATE DATABASE and DROP DATABASE statements

  • The CREATE TABLE, ALTER TABLE, and DROP TABLE statements

  • The CREATE INDEX and DROP INDEX statements; specifying indexes at table-creation time

  • Creating and using primary keys

  • Column types

  • Using AUTO_INCREMENT

  • String and number formats

  • Using SHOW and DESCRIBE to review table structures

Questions on the material in this chapter make up approximately 20% of the exam.

Several of MySQL’s SQL statements comprise the Data Definition Language (DDL) that is used to define the structural characteristics of your databases. The following statements create or remove databases and tables or modify the structure of tables:

  • CREATE DATABASE creates a new database.

  • DROP DATABASE removes a database and any tables it contains.

  • CREATE TABLE creates a new table.

  • DROP TABLE removes a table and any data it contains.

  • ALTER TABLE modifies the structure of an existing table.

  • CREATE INDEX adds an index to a table.

  • DROP INDEX removes an index from a table.

Several of the table-related DDL statements require you to provide column definitions. MySQL allows several different types of data to be stored, and it’s important to understand what column datatypes are available so that you can define your tables appropriately for the information they’ll contain.

This chapter provides a general overview of how MySQL manages databases and tables and a discussion of the syntax of legal names that can be used to refer to them. It also describes how to use each of the DDL statements and discusses the available column datatypes, their properties, how to use them, and the syntax for writing column definitions.

Related to the DDL statements, MySQL supports several statements that are helpful for checking what databases or tables exist and for getting information about the internal column and index structure of tables. These statements include SHOW and DESCRIBE; they are discussed at the end of this chapter.

This chapter excerpt is from MySQL Certification Guide by Paul Dubois et al. (Sams, 2004, ISBN: 0672326329 ). Check it out at your favorite bookstore today. Buy this book now.

{mospagebreak title=4.1 General Database and Table Properties}

Every MySQL server has a data directory under which it manages the contents of databases and tables. The server represents these using directories and files under the data directory as follows:

  • MySQL associates each database with a directory under the data directory. (This means that the data directory is the parent of all database directories.) A database directory has the same name as the database that it represents. For example, a database named world corresponds to a directory named world under the data directory. MySQL uses the database directory to manage the components of the database—that is, its tables and indexes. A database may be empty, or have one or more tables. Databases cannot be nested; one database cannot contain another.

  • Each table in a database consists of rows and columns. A table can be empty (it can have zero rows of data), but it must have at least one column. A table may also be indexed to improve query performance. Every table is associated with a format file in the database directory that contains the definition, or structure, of the table. The format filename is the same as the table name, plus an .frm suffix. For example, the format file for a table named Country in the world database is named Country.frm and is located in the world directory under the server’s data directory. Depending on the table type, the storage engine for a table might create additional files for the table. If Country is a MyISAM table, the MyISAM storage engine creates data and index files named Country.MYD and Country.MYI to store data rows and indexes (respectively) for the table. If Country is an InnoDB table, MySQL still creates a Country.frm format file in the database directory, but the InnoDB storage engine stores the table data and index information elsewhere, in the InnoDB tablespace.

4.2 Storage Engines and Table Types

The MySQL server uses storage engines to manage data in tables.
Each storage engine handles a particular table type. Each table type has differing characteristics and features; these are summarized in this section as an overview. Elsewhere, this study guide concentrates primarily on the MyISAM and InnoDB table types, which are also discussed in more detail in the “Professional Study Guide.” For
additional information on all table types, see the MySQL Reference Manual.

4.2.1 MyISAM Tables

The MyISAM storage engine manages tables that have the following characteristics:

  • Each MyISAM table is represented on disk by an .frm format file, as well as an .MYD datafile and an .MYI index file. All these files are located in the database directory.

  • MyISAM has the most flexible AUTO_INCREMENT column handling of all the table types.

  • MyISAM tables can be used to set up MERGE tables.

  • MyISAM tables can be converted into fast, compressed, read-only tables.

  • MyISAM supports FULLTEXT searching.

  • MySQL manages contention between queries for MyISAM table access using table-level locking. Query performance is very fast for retrievals. Multiple queries can read the same table simultaneously. For a write query, an exclusive table-level lock is used to prevent use of the table by other read or write queries, leading to reduced performance in environments with a mix of read and write queries. Deadlock cannot occur with table-level locking. (Deadlock occurs when two or more queries are blocked, or stopped from completing, because each is waiting for one of the others to finish.)

4.2.2 InnoDB Tables

The InnoDB storage engine manages tables that have the following characteristics:

  • Each InnoDB table is represented on disk by an .frm format file in the database directory, as well as data and index storage in the InnoDB tablespace. The InnoDB tablespace is a logical single storage area that is made up of one or more files or partitions on disk. The tablespace is shared by all InnoDB tables.

  • InnoDB supports transactions (using the SQL COMMIT and ROLLBACK statements) with full ACID compliance.

  • InnoDB provides auto-recovery after a crash of the MySQL server or the host where the server runs.

  • InnoDB supports foreign keys and referential integrity, including cascaded deletes and updates.

  • MySQL manages query contention for InnoDB tables using multi-versioning and row-level locking. Multi-versioning gives each transaction its own view of the database. This, combined with row-level locking, keeps contention to a minimum. The result is good concurrency even in an environment consisting of mixed reads and writes. However, it’s possible for deadlock to occur. Multi-versioning is discussed further in the “Professional Study Guide.”

4.2.3 MERGE Tables

The MERGE storage engine manages tables that have the following characteristics:

  • A MERGE table is a collection of identically structured MyISAM tables. Each MERGE table is represented on disk by an .frm format file and an .MRG file that lists the names of the constituent MyISAM files. Both files are located in the database directory.

  • Logically, a query on a MERGE table acts as a query on all the MyISAM tables of which it consists.

  • A MERGE table creates a logical entity that can exceed the maximum MyISAM table size.

4.2.4 BDB (Berkeley DB) Tables

The BDB storage engine manages tables that have the following characteristics:

  • Each BDB table is represented on disk by an .frm format file and a .db file that stores data and index information. Both files are located in the database directory.

  • BDB supports transactions (using the SQL COMMIT and ROLLBACK statements) with full ACID compliance.

  • BDB provides auto-recovery after a crash of the MySQL server or the host where the server runs.

  • MySQL manages query contention for BDB tables using page-level locking. This locking level provides concurrency performance that is intermediate to that of row-level and table-level locking. It’s possible for deadlock to occur.

4.2.5 HEAP (MEMORY) Tables

The HEAP storage engine manages tables that have the following characteristics:

  • Each HEAP table is represented on disk by an .frm format file in the database directory. Table data and indexes are stored in memory.

  • In-memory storage results in very fast performance.

  • HEAP table contents do not survive a restart of the server. The table structure itself survives, but the table contains zero data rows after a restart.

  • HEAP tables use up memory, so they should not be used for large tables.

  • MySQL manages query contention for HEAP tables using table-level locking. Deadlock cannot occur.

This chapter excerpt is from MySQL Certification Guide by Paul Dubois et al. (Sams, 2004, ISBN: 0672326329 ). Check it out at your favorite bookstore today. Buy this book now.

{mospagebreak title=4.3 Limits on Number and Size of Database Components}

The MySQL server can manage multiple databases, each of which may contain multiple tables. MySQL does not place any limits on the number of databases, although your operating system or filesystem might. Each database is represented as a directory under the server’s data directory, so if the filesystem on which the data directory resides has a limit on the number of subdirectories a directory may contain, MySQL can create no more than that number of databases.

The MySQL server places no limits on the number of tables in a database. The InnoDB storage engine, on the other hand, allows a maximum of two billion tables to exist within the InnoDB tablespace. This places a limit (albeit a rather high one) on the number of InnoDB tables that can be created among all databases combined. (The limit isn’t enforced on a per-database basis because the InnoDB tablespace is shared among all databases.)

Your operating system or filesystem might also impose limits on the maximum number of tables allowed. For example, the MyISAM storage engine places no limits on the number of tables in a database. However, MyISAM tables are represented by files in the directory that MySQL associates with the database, so a limit on the number of tables in the database might arise from factors external to MySQL:

  • If the operating system or filesystem places a limit on the number of files in a directory, MySQL is bound by that constraint.

  • The efficiency of the operating system in handling large numbers of files in a directory can place a practical limit on the number of tables in a database. If the time required to open a file in the directory increases significantly as the number of files increases, database performance can be adversely affected.

  • The amount of available disk space limits the number of tables.

MySQL storage engines do place limits on the allowable maximum size of individual tables. These limits vary per storage engine, but they tend to be rather high. Another factor that limits table size is the maximum file size allowed by your operating system or filesystem. An operating system may support different types of filesystems, each of which may have a different maximum file size.

For large tables, you might find that you run up against operating system or filesystem limits on file sizes before you reach MySQL’s internal table size limits. Several strategies can be used for working around file size limits:

  • Exploit any features allowed by a given table storage manager for increasing table size. For example, the contents of a MyISAM table can sometimes be distributed into several smaller tables, which can then be treated as a single logical unit by combining them into a MERGE table. This effectively multiplies the maximum table size by the number of component MyISAM tables in the MERGE table.

  • The MyISAM storage engine supports a software RAID feature that partitions data storage for a table into a set of files under the database directory. This has the effect of breaking the single-file size barrier, although only for the datafile. Indexes are still stored in a single file, so software RAID might not be feasible for a heavily indexed table.

  • Convert the table for use with a storage engine that allows larger tables. For example, convert a MyISAM table to an InnoDB table. The InnoDB storage engine manages tables within a tablespace that can be configured to be much larger than the size of a single file, and InnoDB tables can grow as large as the available storage within the tablespace.

  • Modify your operating system. A factor external to MySQL that can be used to allow larger tables is to modify your operating system to support larger files. This might be possible by using a different filesystem type, or by using a newer version of the operating system that relaxes the limits on file sizes compared to an older version. You might also consider switching to an operating system that supports larger files than does your current operating system.

This chapter excerpt is from MySQL Certification Guide by Paul Dubois et al. (Sams, 2004, ISBN: 0672326329 ). Check it out at your favorite bookstore today. Buy this book now.

{mospagebreak title=4.4 Identifier Syntax}

When you write SQL statements, you use names to refer to databases and tables as well as to elements of tables such as columns and (sometimes) indexes. It’s also possible to create aliases, which act as synonyms for table and column names. All of these types of names are known as identifiers; they identify a specific database element. This section describes the rules for writing identifiers.

4.4.1 Legal Characters for Identifiers

Identifiers for databases, tables, columns, and indexes may be unquoted or quoted. An unquoted identifier must follow these rules:

  • An identifier may contain all alphanumeric characters, the underline character (_), and the dollar sign ($).

  • An identifier may begin with any of the legal characters, even a digit. However, it’s best to avoid identifiers that might be misinterpreted as constants. For example, 1e3 might be taken as a number in scientific notation, whereas 0x1 might be interpreted as a hex constant, so neither is a good choice for an identifier.

  • An identifier cannot consist entirely of digits.

An identifier may be quoted, in which case it can contain characters such as spaces or dashes that aren’t otherwise legal. To quote an identifier, you may enclose it within backtick (´) characters. If the server was started with the –ansi or –sql-mode=ANSI_QUOTES option, you may also quote an identifier by enclosing it within double quotes ( ). Quoting causes the identifier syntax rules to be relaxed as follows:

  • Any character may be used in a quoted identifier except characters with a numeric value of 0 or 255. For database and table names, other illegal characters are ., /, and .

  • A quoted identifier may consist entirely of digits.

An alias identifier can include any character, but should be quoted if it’s a reserved word (such as SELECT or DESC), contains special characters, or consists entirely of digits. Aliases may be quoted within single quotes ( ), double quotes, or backticks.

4.4.2 Using Qualifiers for Table and Column Names

Column and table identifiers can be written in qualified form—that is, together with the identifier of a higher-level element, with a period (.) separator.

A table name may be qualified with the name of the database to which it belongs. For example, the Country table in the world database may be referred to as world.Country (note the . separating the two identifiers in the name). If world is the default database, these statements are equivalent:

SELECT * FROM Country;
SELECT * FROM world.Country;

A column name may be qualified with the name of the table to which it belongs. For example, the Name column in the Country table may be referred to as Country.Name.

A further level of column qualification is possible because a table name may be qualified with a database name. So, another way to refer to the Name column is world.Country.Name. If world is the default database, the following statements are equivalent. They differ only in having successively more specific levels of name qualification:

SELECT Name FROM Country;
SELECT Country.Name FROM Country;
SELECT world.Country.Name FROM world.Country;

Sometimes qualifiers are necessary to resolve ambiguity. Other times you may elect to use them to make a statement clearer or more precise.

This chapter excerpt is from MySQL Certification Guide by Paul Dubois et al. (Sams, 2004, ISBN: 0672326329 ). Check it out at your favorite bookstore today. Buy this book now.

{mospagebreak title=4.5 CREATE DATABASE and DROP DATABASE}

To create a new database, use the CREATE DATABASE statement. The following statement creates a database named mydb:

CREATE DATABASE mydb;

After a database has been created, you can create new tables in it using the CREATE TABLE statement, which is described in section 4.6, “CREATE TABLE.”

If you try to create a database that already exists, an error occurs. If you simply want to ensure that the database exists, add an IF NOT EXISTS clause to the statement:

CREATE DATABASE IF NOT EXISTS mydb;

With the additional clause, the statement creates the database only if it does not already exist. Otherwise, the statement does nothing and no error occurs. This can be useful in applications that need to ensure that a given database is available, without disrupting any existing database with the same name.

Creating a database has no effect on the database that’s currently selected as the default database. To make the new database the default database, issue a USE statement:

USE mydb;

To see a list of available databases, use the SHOW DATABASES statement. This statement will not show the names of databases to which you have no access. To see a list of tables in a database, use SHOW TABLES FROM db_name (or just SHOW TABLES if db_name is the name of the current database). The SHOW command is described in more detail in section 4.11, “Using SHOW and DESCRIBE to Review Table Structures.”

When you no longer need a database, you can remove it with DROP DATABASE:

DROP DATABASE mydb;

It’s unnecessary to remove the tables in a database before dropping it. DROP DATABASE does not require the database to be empty, so it does not fail if the database contains tables. DROP DATABASE removes the tables in the process of removing the database.

DROP DATABASE is a dangerous statement and you should use it with care. There is no statement to “undo” DROP DATABASE. If you drop a database by mistake, your only option is to recover it from your backups.

4.6 CREATE TABLE

Use the CREATE TABLE statement to create a new table. A table’s definition includes its name and a list of columns, each of which has a name and a definition. The table definition may also include index definitions.

This section describes basic CREATE TABLE syntax using simple column definitions. More information on column datatypes and properties can be found in section 4.10, “Column Types.”

To create a table, give its name followed by a list of column definitions within parentheses:

CREATE TABLE table_name (definition1, definition2, …);

In the simplest case, a table contains only one column. The following statement creates a table named t with a single column named id that will contain INT (integer) values:

CREATE TABLE t (id INT);

A column definition may include options to define the column data more precisely. For example, to disallow NULL values in the column, include NOT NULL in the definition:

CREATE TABLE t (id INT NOT NULL);

More complex tables have multiple columns, with the column definitions separated by commas. The following table definition includes, in addition to the id column, two 30-byte character columns for storing last names and first names, and a column for storing date values. All columns are declared NOT NULL to indicate that they require non-NULL values.

CREATE TABLE t
(
  id     INT NOT NULL,
  last_name CHAR(30) NOT NULL,
  first_name CHAR(30) NOT NULL,
  d     DATE NOT NULL
);

Every table must belong to a database. That is, you cannot create a table that is not located within some database. If the table named in the CREATE TABLE statement isn’t qualified with a database name, the table is created in the default database. To indicate explicitly where to create the table, you can qualify the table name with the name of the desired database, using db_name.table_name syntax. For example, if you want to create a table called mytable in the test database, write the CREATE TABLE statement like this:

CREATE TABLE test.mytable (i INT);

The qualified identifier syntax is helpful when there’s no default database or when some other database is currently selected as the default. (If test happens to be the default database, the statement still works. In that case, the database name is unnecessary but harmless.)

When you create a table, you can provide index definitions in addition to the column definitions. Indexes are useful for increasing query performance by reducing lookup time. Here’s a simple example that includes two index definitions. The first creates an index on the id column and requires each id value to be unique. The second index definition creates a two- column index on the last_name and first_name columns of the table:

CREATE TABLE t
(
  id     INT NOT NULL,
  last_name CHAR(30) NOT NULL,
  first_name CHAR(30) NOT NULL,
  UNIQUE (id),
  INDEX (last_name, first_name)
);

Section 4.9, “Creating and Dropping Indexes,” discusses index creation further.

If you try to create a table that already exists, an error occurs. If you simply want to ensure that the table exists, add an IF NOT EXISTS clause to the statement:

CREATE TABLE IF NOT EXISTS t (i INT);

Note, however, that MySQL does not perform any check on the table structure when you add this clause. In particular, MySQL will issue no warning if a table with the given name exists but has a structure different from the one you’ve defined in the CREATE TABLE statement.

A temporary table can be created by adding the keyword TEMPORARY to the CREATE TABLE statement:

CREATE TEMPORARY TABLE t (i INT);

Temporary tables exist only for the duration of the current connection. The server drops temporary tables when you disconnect, if you haven’t already dropped them explicitly. This is convenient because you need not remember to remove the table yourself. A temporary table is visible only to the client that creates it, so different clients can create temporary tables in the same database, using the same name, without conflicting with one another.

This chapter excerpt is from MySQL Certification Guide by Paul Dubois et al. (Sams, 2004, ISBN: 0672326329 ). Check it out at your favorite bookstore today. Buy this book now.

{mospagebreak title=4.7 DROP TABLE}

When you no longer need a table, you can destroy it with the DROP TABLE statement:

DROP TABLE t;

In MySQL, a single DROP TABLE statement can name several tables to be dropped simultaneously:

DROP TABLE t1, t2, t3;

Normally, an error occurs if you attempt to drop a table that does not exist:

mysql>
DROP TABLE no_such_table;ERROR 1051: Unknown table ‘no_such_table’

To prevent an error from occurring if a table does not exist when you attempt to drop it, add an IF EXISTS clause to the statement:

mysql>
DROP TABLE IF EXISTS no_such_table;

If you drop a table by mistake, you must recover it from backups, so be careful. (This is the same principle as the one mentioned earlier for databases: If you drop a database, you cannot undo the action. A dropped database can only be recovered from your backups.)

4.8 ALTER TABLE

After creating a table, you might discover that its structure is not quite right for its intended use. If that happens, you can change the table’s structure. One way to do this is to remove the table with DROP TABLE and then issue another CREATE TABLE statement that defines the table correctly. This can be a drastic method: If the table already contains data, dropping and re-creating the table destroys its contents unless you first make a backup. To change a table “in place,” use the ALTER TABLE statement. With ALTER TABLE, you can modify a table’s structure in the following ways:

  • Add or drop columns

  • Change the name or definition of a column

  • Add or drop indexes

  • Sort the table’s rows in a particular order

  • Rename the table

This section describes how to perform all the possible changes except for adding and dropping indexes. Adding and dropping of indexes is covered in a later section that focuses specifically on indexing issues. (See section 4.9, “Creating and Dropping Indexes.”)

Most of the examples shown in this section use a table named HeadOfState, designed to keep track of world leaders. Assume that the table initially has the following structure:

CREATE TABLE HeadOfState
(
  ID     INT NOT NULL,
  LastName  CHAR(30) NOT NULL,
  FirstName  CHAR(30) NOT NULL,
  CountryCode CHAR(3) NOT NULL,
);

The corresponding DESCRIBE output for the table is as follows:

mysql>
DESCRIBE HeadOfState;+————-+———-+—–+—
| Field       | Type     | Null | Key | Default | Extra |
+————-+———-+——+—–+———+——-+
| ID          | int(11)  |      |     | 0       |       |
| LastName    | char(30) |      |     |         |       |
| FirstName   | char(30) |      |     |         |       |
| CountryCode | char(3)  |      |     |         |       |
+————-+———-+——+—–+———+——-+
4.8.1 Adding and Dropping Columns

To add a new column to a table, use ALTER TABLE with an ADD clause that specifies the column’s definition. A column definition uses the same syntax for ALTER TABLE as for CREATE TABLE. For example, to add a DATE column named Inauguration to record the date the leaders listed in the table assumed office, you can issue this statement:

ALTER TABLE HeadOfState ADD Inauguration DATE NOT NULL;

This ALTER TABLE changes the table structure as follows:

mysql>
DESCRIBE HeadOfState;+————–+———-+——+—–+
| Field        | Type     | Null | Key | Default    | Extra |
+————–+———-+——+—–+————+——-+
| ID           | int(11)  |      |     | 0          |       |
| LastName     | char(30) |      |     |            |       |
| FirstName    | char(30) |      |     |            |       |
| CountryCode  | char(3)  |      |     |            |       |
| Inauguration | date     |      |     | 0000-00-00 |       |
+————–+———-+——+—–+————+——-+

As shown in the DESCRIBE output, when you add a new column to a table, MySQL places it after all existing columns. This is the default placement unless you specify otherwise. To indicate that MySQL should place the new column in a specific position within the table, append either the keyword FIRST or the keyword-identifier combination AFTER column_name to the column definition. For example, assume that you had executed this ALTER TABLE statement instead of the previous one:

ALTER TABLE HeadOfState ADD Inauguration DATE NOT NULL FIRST;

The FIRST keyword tells ALTER TABLE to place the new column before all existing columns (in the “first” position), resulting in the following table structure:

mysql>
DESCRIBE HeadOfState;+————–+———-+——+—–+
| Field        | Type     | Null | Key | Default    | Extra |
+————–+———-+——+—–+————+——-+
| Inauguration | date     |      |     | 0000-00-00 |       |
| ID           | int(11)  |      |     | 0          |       |
| LastName     | char(30) |      |     |            |       |
| FirstName    | char(30) |      |     |            |       |
| CountryCode  | char(3)  |      |     |            |       |
+————–+———-+——+—–+————+——-+

Using AFTER column_name tells ALTER TABLE to place the new column after a specific existing column. For example, to place the new Inauguration column after the existing FirstName column, you would issue this statement:

ALTER TABLE HeadOfState ADD Inauguration DATE NOT NULL AFTER FirstName;

This ALTER TABLE statement would result in a table structure that looks like this:

mysql>
DESCRIBE HeadOfState;+————–+———-+——+—–+
| Field        | Type     | Null | Key | Default    | Extra |
+————–+———-+——+—–+————+——-+
| ID           | int(11)  |      |     | 0          |       |
| LastName     | char(30) |      |     |            |       |
| FirstName    | char(30) |      |     |            |       |
| Inauguration | date     |      |     | 0000-00-00 |       |
| CountryCode  | char(3)  |      |     |            |       |
+————–+———-+——+—–+————+——-+

You cannot add a column with the same name as one that already exists in the table; column names within a table must be unique. Column names are not case sensitive, so if the table already contains a column named ID, you cannot add a new column using any of these names: ID, id, Id, or iD. They all are considered to be the same name.

To drop a column, use a DROP clause. In this case, it’s necessary only to name the column you want to drop:

ALTER TABLE table_name DROP column_name;
4.8.2 Modifying Existing Columns

There are two ways to change the definition of an existing column within a table. One of these also enables you to rename the column.

The first way to alter a column definition is to use a MODIFY clause. You must specify the name of the column that you want to change, followed by its new definition. Assume that you want to change the ID column’s datatype from INT to BIGINT, to allow the table to accommodate larger identification numbers. You also want to make the column UNSIGNED to disallow negative values. The following statement accomplishes this task:

ALTER TABLE HeadOfState MODIFY ID BIGINT UNSIGNED NOT NULL;

DESCRIBE now shows the table structure to be as follows:

mysql>
DESCRIBE HeadOfState;+————–+———————+——+—–+
| Field        | Type                | Null | Key | Default    | Extra |
+————–+———————+——+—–+————+——-+
| ID           | bigint(20) unsigned |      |     | 0          |       |
| LastName     | char(30)            |      |     |            |       |
| FirstName    | char(30)            |      |     |            |       |
| Inauguration | date                |      |     | 0000-00-00 |       |
| CountryCode  | char(3)             |      |     |            |       |
+————–+———————+——+—–+————+——-+

Note that if you want to disallow NULL in the column, the column definition provided for MODIFY must include the NOT NULL option, even if the column was originally defined with NOT NULL. This is true for other column options as well; if you don’t specify them explicitly, the new definition won’t carry them over from the old definition.

The second way to alter a column definition is to use a CHANGE clause. CHANGE enables you to modify both the column’s definition and its name. To use this clause, specify the CHANGE keyword, followed by the column’s existing name, its new name, and its new definition, in that order. Note that this means you must specify the existing name twice if you want to change only the column definition (and not the name). For example, to change the LastName column from CHAR(30) to CHAR(40) without renaming the column, you’d do this:

ALTER TABLE HeadOfState CHANGE LastName LastName CHAR(40) NOT NULL;

To change the name as well (for example, to Surname), provide the new name following the existing name:

ALTER TABLE HeadOfState CHANGE LastName Surname CHAR(40) NOT NULL;
4.8.3 Renaming a Table

Renaming a table changes neither a table’s structure nor its contents. The following statement renames table t1 to t2:

ALTER TABLE t1 RENAME TO t2;

Another way to rename a table is by using the RENAME TABLE statement:

RENAME TABLE t1 TO t2;

RENAME TABLE has an advantage over ALTER TABLE in that it can perform multiple table renaming operations in a single statement. One use for this feature is to swap the names of two tables:

RENAME TABLE t1 TO tmp, t2 TO t1, tmp TO t2;
4.8.4 Specifying Multiple Alterations

You can specify multiple alterations for a table with a single ALTER TABLE statement. Just separate the actions by commas:

ALTER TABLE HeadOfState RENAME TO CountryLeader,
  MODIFY ID BIGINT UNSIGNED NOT NULL,
  ORDER BY LastName, FirstName;

This chapter excerpt is from MySQL Certification Guide by Paul Dubois et al. (Sams, 2004, ISBN: 0672326329 ). Check it out at your favorite bookstore today. Buy this book now.

{mospagebreak title=4.9 Creating and Dropping Indexes}

Tables in MySQL can grow very large, but as a table gets bigger, retrievals from it become slower. To keep your queries performing efficiently, it’s essential to index your tables. Indexes allow column values to be found more quickly, so retrievals based on indexes are faster than those that are not. Another reason to use indexes is that they can enforce uniqueness to ensure that duplicate values do not occur and that each row in a table can be distinguished from every other row.

MySQL supports four types of indexes:

  • A nonunique index is an index in which any key value may occur multiple times.

  • A UNIQUE index is unique-valued; that is, every key value is required to be different from all other keys.

  • A PRIMARY KEY is a unique-valued index that’s similar to a UNIQUE index but has additional restrictions (the major one being that no NULL values are allowed).

  • A FULLTEXT index is specially designed for text searching.

This section discusses the following index-related topics:

  • Defining indexes at table creation time with CREATE TABLE

  • Using primary keys

  • Adding indexes to existing tables with ALTER TABLE or CREATE INDEX

  • Dropping indexes from tables with ALTER TABLE or DROP INDEX

The discussion here does not consider in any depth indexing topics such as query optimization, assessing how well indexes are used, or FULLTEXT searching. The “Professional Study Guide” covers those topics in more detail.

4.9.1 Defining Indexes at Table-Creation Time

To define indexes for a table at the time you create it, include the index definitions in the CREATE TABLE statement along with the column definitions. An index definition consists of the appropriate keyword or keywords to indicate the index type, followed by a list in parentheses that names the column or columns that comprise the index. Suppose that the definition of a table HeadOfState without any indexes looks like this:

CREATE TABLE HeadOfState
(
  ID      INT NOT NULL,
  LastName   CHAR(30) NOT NULL,
  FirstName  CHAR(30) NOT NULL,
  CountryCode CHAR(3) NOT NULL,
  Inauguration DATE NOT NULL
);

To create the table with the same columns but with a nonunique index on the date-valued column Inauguration, include an INDEX clause in the CREATE TABLE statement as follows:

CREATE TABLE HeadOfState
(
  ID      INT NOT NULL,
  LastName   CHAR(30) NOT NULL,
  FirstName  CHAR(30) NOT NULL,
  CountryCode CHAR(3) NOT NULL,
  Inauguration DATE NOT NULL,
  INDEX (Inauguration)
);

The keyword KEY may be used instead of INDEX.

To include multiple columns in an index (that is, to create a composite index), list all the column names within the parentheses, separated by commas. For example, a composite index that includes both the LastName and FirstName columns can be defined as follows:

CREATE TABLE HeadOfState
(
  ID      INT NOT NULL,
  LastName   CHAR(30) NOT NULL,
  FirstName  CHAR(30) NOT NULL,
  CountryCode CHAR(3) NOT NULL,
  Inauguration DATE NOT NULL,
  INDEX (LastName, FirstName)
);

Composite indexes can be created for any type of index.

For all index types other than PRIMARY KEY, you can name an index by including the name just before the column list. If you don’t provide a name, MySQL assigns a name for you based on the name of the first column in the index. For a PRIMARY KEY, you don’t specify a name because its name is always PRIMARY. (A consequence of this fact is that you cannot define more than one PRIMARY KEY per table.)

The preceding indexing examples each include just one index in the table definition, but a table can have multiple indexes. The following table definition includes two indexes:

CREATE TABLE HeadOfState
(
  ID      INT NOT NULL,
  LastName   CHAR(30) NOT NULL,
  FirstName  CHAR(30) NOT NULL,
  CountryCode CHAR(3) NOT NULL,
  Inauguration DATE NOT NULL,
  INDEX (LastName, FirstName),
  INDEX (Inauguration)
);

To create a unique-valued index, use the UNIQUE keyword instead of INDEX. For example, if you want to prevent duplicate values in the ID column, create a UNIQUE index on it like this:

CREATE TABLE HeadOfState
(
  ID      INT NOT NULL,
  LastName   CHAR(30) NOT NULL,
  FirstName  CHAR(30) NOT NULL,
  CountryCode CHAR(3) NOT NULL,
  Inauguration DATE NOT NULL,
  UNIQUE (ID)
);

There’s one exception to the uniqueness of values in a UNIQUE index: If a column in the index may contain NULL values, multiple NULL values are allowed. This differs from the behavior for all non-NULL values.

A PRIMARY KEY is a type of index that’s similar to a UNIQUE index. The differences between the two are as follows:

  • A UNIQUE index can contain NULL values; a PRIMARY KEY cannot. If a unique-valued index might be required to contain NULL values, you must use a UNIQUE index, not a PRIMARY KEY.

  • It’s possible to have multiple UNIQUE indexes for a table, but each table may have only one index defined as a PRIMARY KEY. (The internal name for a PRIMARY KEY is always PRIMARY, and there can be only one index with a given name.)

To index a column as a PRIMARY KEY, just use the keywords PRIMARY KEY rather than UNIQUE and declare the column NOT NULL to make sure that it cannot contain NULL values.

The use of PRIMARY KEY and UNIQUE to create indexes that ensure unique identification for any row in a table is discussed in the next section.

4.9.2 Creating and Using Primary Keys

The most common reason for creating an index is that it decreases lookup time for operations that search the indexed columns, especially for large tables. Another important use for indexing is to create a restriction that requires indexed columns to contain only unique values.

An index with unique values allows you to identify each record in a table as distinct from any other. This kind of index provides a primary key for a table. Without a primary key, there might be no way to identify a record that does not also identify other records at the same time. That is a problem when you need to retrieve, update, or delete a specific record in a table. A unique ID number is a common type of primary key.

Two of MySQL’s index types can be used to implement the concept of a primary key:

  • An index created with a PRIMARY KEY clause

  • An index created with the UNIQUE keyword

In both cases, the column or columns in the index should be declared as NOT NULL. For a PRIMARY KEY, this is a requirement; MySQL won’t create a PRIMARY KEY from any column that may be NULL. For a UNIQUE index, declaring columns as NOT NULL is a logical requirement if the index is to serve as a primary key. If a UNIQUE index is allowed to contain NULL values, it may contain multiple NULL values. As a result, some rows might not be distinguishable from others and the index cannot be used as a primary key.

A PRIMARY KEY is a type of unique-valued index, but a UNIQUE index isn’t necessarily a primary key unless it disallows NULL values. If it does, a UNIQUE index that cannot contain NULL is functionally equivalent to a PRIMARY KEY.

The following definition creates a table t that contains an id column that’s NOT NULL and declared as a primary key by means of a PRIMARY KEY clause:

CREATE TABLE t
(
  id  INT NOT NULL,
  name CHAR(30) NOT NULL,
  PRIMARY KEY (id)
);

A primary key on a column also can be created by replacing PRIMARY KEY with UNIQUE in the table definition, provided that the column is declared NOT NULL:

CREATE TABLE t
(
  id  INT NOT NULL,
  name CHAR(30) NOT NULL,
  UNIQUE (id)
);

An alternative syntax is allowed for the preceding two statements. For a single-column primary key, you can add the keywords PRIMARY KEY or UNIQUE directly to the end of the column definition. The following CREATE TABLE statements are equivalent to those just shown:

CREATE TABLE t
(
  id  INT NOT NULL PRIMARY KEY,
  name CHAR(30) NOT NULL
);

CREATE TABLE t
(
  id  INT NOT NULL UNIQUE,
  name CHAR(30) NOT NULL
);

Like other indexes, you can declare a PRIMARY KEY or UNIQUE index as a composite index that spans multiple columns. In this case, the index must be declared using a separate clause. (You cannot add the PRIMARY KEY or UNIQUE keywords to the end of a column definition because the index would apply only to that column.) The following definition creates a primary key on the last_name and first_name columns using a PRIMARY KEY clause:

CREATE TABLE people
(
  last_name CHAR(30) NOT NULL,
  first_name CHAR(30) NOT NULL,
  PRIMARY KEY (last_name, first_name)
);

This primary key definition allows any given last name or first name to appear multiple times in the table, but no combination of last and first name can occur more than once.

You can also create a multiple-column primary key using UNIQUE, if the columns are declared NOT NULL:

CREATE TABLE people
(
  last_name CHAR(30) NOT NULL,
  first_name CHAR(30) NOT NULL,
  UNIQUE (last_name, first_name)
);

Primary keys are an important general database design concept because they allow unique identification of each row in a table. For MySQL in particular, primary keys are frequently defined as columns that are declared with the AUTO_INCREMENT option. AUTO_INCREMENT columns provide a convenient way to automatically generate a unique sequence number for each row in a table and are described in section 4.10, “Column Types.”

4.9.3 Modifying Indexes of Existing Tables

To add an index to a table, you can use ALTER TABLE or CREATE INDEX. To drop an index from a table, you can use ALTER TABLE or DROP INDEX. Of these statements, ALTER TABLE is the most flexible, as will become clear in the following discussion.

To add an index to a table with ALTER TABLE, use ADD followed by the appropriate index-type keywords and a parenthesized list naming the columns to be indexed. For example, assume that the HeadOfState table used earlier in this chapter is defined without indexes as follows:

CREATE TABLE HeadOfState
(
  ID      INT NOT NULL,
  LastName   CHAR(30) NOT NULL,
  FirstName  CHAR(30) NOT NULL,
  CountryCode CHAR(3) NOT NULL,
  Inauguration DATE NOT NULL
);

To create a PRIMARY KEY on the ID column and a composite index on the LastName and FirstName columns, you would issue these statements:

ALTER TABLE HeadOfState ADD PRIMARY KEY (ID);
ALTER TABLE HeadOfState ADD INDEX (LastName,FirstName);

MySQL allows multiple actions to be performed with a single ALTER TABLE statement. One common use for multiple actions is to add several indexes to a table at the same time, which is more efficient than adding each one separately. The preceding two ALTER TABLE statements can be combined as follows:

ALTER TABLE HeadOfState ADD PRIMARY KEY (ID), ADD INDEX (LastName,FirstName);

To drop an index with ALTER TABLE, use a DROP clause and name the index to be dropped. Dropping a PRIMARY KEY is easy:

ALTER TABLE HeadOfState DROP PRIMARY KEY;

To drop another kind of index, you must specify its name. If you don’t know the name, you can use SHOW CREATE TABLE to see the table’s structure, including any index definitions, as shown here:

mysql>
SHOW CREATE TABLE HeadOfStateG*************************** 
1. row *************************** Table: HeadOfState Create Table: CREATE TABLE ´HeadOfState´ ( ´ID´ int(11) NOT NULL default ‘0’, ´LastName´ char(30) NOT NULL default ”, ´FirstName´ char(30) NOT NULL default ”, ´CountryCode´ char(3) NOT NULL default ”, ´Inauguration´ date NOT NULL default ‘0000-00-00′, PRIMARY KEY (´ID´), KEY ´LastName´ (´LastName´,´FirstName´) ) TYPE=MyISAM

The KEY clause of the output shows that the index name is LastName, so you can drop the index using the following statement:

ALTER TABLE HeadOfState DROP INDEX LastName;

After you’ve dropped an index, you can recover it merely by re-creating it. This differs from dropping a database or a table, which cannot be undone except by recourse to backups. The distinction is that when you drop a database or a table, you’re removing data. When you drop an index, you aren’t removing table data, you’re merely removing a structure that’s derived from the data. The act of removing an index is a reversible operation as long as the columns from which the index was constructed have not been removed.

CREATE INDEX and DROP INDEX provide alternatives to ALTER TABLE for index manipulation.

The syntax for CREATE INDEX is as follows, where the statements shown create a single- column UNIQUE index and a multiple-column nonunique index, respectively:

CREATE UNIQUE INDEX IDIndex ON HeadOfState (ID);
CREATE INDEX NameIndex ON HeadOfState (LastName,FirstName);

Note that with CREATE INDEX, it’s necessary to provide a name for the index, whereas ALTER TABLE creates an index name automatically if you don’t provide one.

To drop an index with DROP INDEX, indicate the index name and table name:

DROP INDEX IDIndex ON t;
DROP INDEX NameIndex ON t;

Unlike ALTER TABLE, the CREATE INDEX and DROP INDEX statements can operate only on a single index per statement. In addition, neither statement supports the use of PRIMARY KEY. This is the reason that ALTER TABLE is more flexible.

This chapter excerpt is from MySQL Certification Guide by Paul Dubois et al. (Sams, 2004, ISBN: 0672326329 ). Check it out at your favorite bookstore today. Buy this book now.

{mospagebreak title=4.10 Column Types}

MySQL can work with many different kinds of data. Generally speaking, data values can be grouped into three categories:

  • Numeric values. Numbers may or may not have a fractional part and may have a leading sign. For example, 14, -428.948, and +739 all are legal numbers. Integer values have no fractional part; columns for values with a fractional part can be declared to have either a fixed or variable number of decimal places. Numeric columns can be declared to be unsigned to prevent negative values from being accepted in the column.

  • String values. Strings may be case sensitive or case insensitive. Strings may store characters or raw data values that contain arbitrary byte values. Strings are written within quotes (for example, I am a string or I am a string ). String columns can be declared as either fixed length or variable length.

  • Temporal values. Temporal values include dates (such as 2005-11-03 ), times (such as 14:23:00 ), and values that have both a date and a time part ( 2005-11-03 14:23:00 ). MySQL also supports a special temporal type that represents year-only values efficiently. Date and time values can be written as quoted strings and may sometimes be written as numbers in contexts where numeric temporal values are understood.

When you create a table, the declaration for each of its columns includes the column name, a datatype specification that indicates what kind of values the column may hold, and possibly some options that more specifically define how MySQL should handle the column. For example, the following statement creates a table named people, which contains a numeric column named id and two 30-byte string columns named first_name and last_name:

CREATE TABLE people
(
  id     INT,
  first_name CHAR(30),
  last_name CHAR(30)
);

The column definitions in this CREATE TABLE statement contain only names and column datatype specifications. To control the use of a column more specifically, options may be added to its definition. For example, to disallow negative values in the id column, add the UNSIGNED option. To disallow NULL (missing or unknown) values in any of the columns, add NOT NULL to the definition of each one. The modified CREATE TABLE statement looks like this:

CREATE TABLE people
(
  id     INT UNSIGNED NOT NULL,
  first_name CHAR(30) NOT NULL,
  last_name CHAR(30) NOT NULL
);

For each of the general datatype categories (number, string, date, and time), MySQL has several specific column types from which to choose. It’s important to properly understand the datatypes that are available for representing data, to avoid choosing a column type that isn’t appropriate. The following sections provide a general description of the column datatypes and their properties. For additional details, the MySQL Reference Manualprovides an extensive discussion on column datatypes.

4.10.1 Numeric Column Types

MySQL provides numeric column types for integer values, values with a fixed number of decimal places, and floating-point values that have a variable number of decimal places. When you choose a numeric column datatype, consider the following factors:

  • The range of values the datatype represents

  • The amount of storage space that column values require

  • The display width indicating the maximum number of characters to use when presenting column values in query output

  • The column precision (number of digits before the decimal) for values with a scale

4.10.1.1 Integer Column Types

Integer datatypes include TINYINT, SMALLINT, MEDIUMINT, INT, and BIGINT. Smaller datatypes require less storage space, but are more limited in the range of values they represent. For example, a TINYINT column has a small range (–128 to 127), but its values take only one byte each to store. INT has a much larger range (–2,147,483,648 to 2,147,483,647) but its values require four bytes each. The integer datatypes are summarized in the following table, which indicates the amount of storage each type requires as well as its range. For integer values declared with the UNSIGNED option, negative values are not allowed, and the high end of the range shifts upward to approximately double the maximum value.

Type

Storage Required

Signed Range

Unsigned Range

TINYINT

1 byte

-128 to 127

0 to 255

SMALLINT

2 bytes

-32,768 to 32,767

0 to 65,535

MEDIUMINT

3 bytes

-8,388,608 to 8,388,607

0 to 16,777,215

INT

4 bytes

-2,147,683,648 to 2,147,483,647

0 to 4,294,967,295

BIGINT

8 bytes

-9,223,372,036,854,775,808 to 9,223,372,036,854,775,807

0 to 18,446,744,073,709,551,615


Integer datatypes may be declared with a display width, which affects the number of characters used to display column values in query output. For example, assume that you declare an INT column with a display width of 4 like this:

century INT(4)

The result is that values in the century column will usually be displayed four digits wide.

It’s important to remember that the display width is unrelated to the range of the datatype. The display width you define for a column affects only the maximum number of digits MySQL will use to display column values. Values shorter than the display width are padded with spaces as necessary. Note also that the display width is not a hard limit; it won’t cause output truncation of a value that’s too long to fit within the width. Instead, the full value will be shown. For example, assume that you’ve inserted the number 57622 into the century column. When you SELECT the column in a query, MySQL will display the entire value (57622) rather than just the first four digits of the value.

If you don’t specify a display width for an integer type, MySQL chooses a default based on the number of characters needed to display the full range of values for the type (including the minus sign). For example, SMALLINT has a default display width of 6 because the widest possible value is -32768.

4.10.1.2 Floating-Point and Fixed-Decimal Column Types

The floating-point datatypes include FLOAT and DOUBLE. The fixed-point datatype is DECIMAL. Each of these types may be used to represent numeric values that have a scale, or fractional part. FLOAT and DOUBLE datatypes represent floating-point values in the native binary format used by the server host’s CPU. This is a very efficient type for storage and computation, but values are subject to rounding error. DECIMAL uses a fixed-decimal storage format: All values in a DECIMAL column have the same number of decimal places. Values are stored in string format using one byte per digit. Numbers represented as strings cannot be processed as quickly as numbers represented in binary, so operations on DECIMAL columns are slower than operations on FLOAT and DOUBLE columns. DECIMAL values are not subject to rounding error when stored, which makes the DECIMAL column type a popular choice for financial applications involving currency calculations. However, be aware that currently MySQL does internal calculations using floating-point arithmetic, which can produce rounding error in the result.

FLOAT and DOUBLE are used to represent single-precision and double-precision floating-point values. They use 4 and 8 bytes each for storage, respectively. By default, MySQL represents values stored in FLOAT and DOUBLE columns to the maximum precision allowed by the hardware, but you can specify a display width and precision in the column definition. The following single-precision column definition specifies a display width of 10 digits, with a precision of 4 decimals:

avg_score FLOAT(10,4)

DECIMAL columns may also be declared with a display width and scale. If you omit them, the defaults are 10 and 0, so the following declarations are equivalent:

total DECIMAL
total DECIMAL(10)
total DECIMAL(10,0)

If you want to represent values such as dollar-and-cents currency figures, you can do so using a two-digit scale:

total DECIMAL(10,2)

The amount of storage required for DECIMAL column values depends on the type. Normally, the number of bytes of storage required per value is equal to the display width plus 2. For example, DECIMAL(6,3) requires 8 bytes: the display width is 6 and 2 bytes are needed to store the sign and decimal point. If the scale is 0, no decimal point needs to be stored, so one fewer byte is required. If the column is UNSIGNED, no sign character needs to be stored, also requiring one fewer byte.

This chapter excerpt is from MySQL Certification Guide by Paul Dubois et al. (Sams, 2004, ISBN: 0672326329 ). Check it out at your favorite bookstore today. Buy this book now.

{mospagebreak title=4.10.2 String Column Types}

The string column types are listed in the following table:

Type

Description

CHAR

Fixed-length string

VARCHAR

Variable-length string

BLOB

Variable-length binary string

TEXT

Variable-length nonbinary string

ENUM

Enumeration consisting of a fixed set of legal values

SET

Set consisting of a fixed set of legal values


When you choose a string datatype, consider the following factors:

  • The maximum length of values you need to store.

  • Whether to use a fixed or variable amount of storage.

  • Whether you need to store binary or nonbinary strings.

  • The number of distinct values required; ENUM or SET may be useful if the set of values is fixed.

The following discussion first describes the general differences between binary and nonbinary strings, and then the specific characteristics of each of the string column datatypes.

4.10.2.1 Binary and Nonbinary String Characteristics

Strings in MySQL may be treated as binary or nonbinary. The two types are each most suited to different purposes.

Binary strings have the following characteristics:

  • A binary string is treated as a string of byte values.

  • Comparisons of binary strings are performed on the basis of those byte values. This has the following implications:

    • Uppercase and lowercase versions of a given character have different byte values, so binary string comparisons are case sensitive.

    • Versions of a character that have different accent marks have different byte values, so binary string comparisons are also accent sensitive.

  • A multi-byte character, if stored as a binary string, is treated simply as multiple individual bytes. Character boundaries of the original data no longer apply.

Nonbinary strings are associated with a character set. The character set affects interpretation of string contents and sorting as follows:

  • A nonbinary string is a string of characters, all of which must belong to a specific character set. Characters may consist of a single byte, or multiple bytes if the character set allows it. For example, each character in the MySQL default character set (Latin-1, also known as ISO-8859-1) requires one byte to store. In contrast, the Japanese SJIS character set contains so many characters that they cannot all be represented in a single byte, so each character requires multiple bytes to store.

  • Nonbinary comparisons are based on the collating (sorting) order of the character set associated with the string.

  • Collating orders, or collations, sometimes treat uppercase and lowercase versions of a given character as equivalent. This means that comparisons using such collations are not case sensitive, so that, for example, ABC , Abc , and abc are all considered equal.

  • Collations may also treat a given character with different accent marks as equivalent. The result is that comparisons of nonbinary strings may not be accent sensitive. For example, an a with no accent may be considered the same as the á and à characters.

  • Multi-byte character comparisons are performed in character units, not in byte units.

The preceding remarks regarding case and accent sensitivity are not absolute, just typical. A given character set can be defined with a collating order that’s case or accent sensitive, or both. MySQL takes care to create character sets that correspond to the sorting order rules of different languages.

String comparison rules are addressed in more detail in section 6.1.1, “Case Sensitivity in String Comparisons.”

The different treatment of binary and nonbinary strings in MySQL is important when it comes to choosing datatypes for table columns. If you want column values to be treated as case and accent insensitive, you should choose a nonbinary column type. Conversely, if you want case and accent sensitive values, choose a binary type. You should also choose a binary type for storing raw data values that consist of untyped bytes.

The CHAR and VARCHAR string column types are nonbinary by default, but can be made binary by including the keyword BINARY in the column definition. Other string types are inherently binary or nonbinary. BLOB columns are always binary, whereas TEXT columns are always nonbinary.

You can mix binary and nonbinary string columns within a single table. For example, assume that you want to create a table named auth_info, to store login name and password authorization information for an application. You want login names to match in any lettercase but passwords to be case sensitive. This statement would accomplish the task:

CREATE TABLE auth_info
(
  login  CHAR(16),     # not case sensitive
  password CHAR(16) BINARY  # case sensitive
);
4.10.2.2 The CHAR and VARCHAR Column Types

The CHAR and VARCHAR column types hold strings up to the maximum length specified in the column definition. To define a column with either of these datatypes, provide the column name, the keyword CHAR or VARCHAR, the maximum length of acceptable values in parentheses, and possibly the keyword BINARY. The maximum length should be a number from 0 to 255. (One of the sample exercises at the end of this chapter discusses why you might declare a zero-length column.) By default, CHAR and VARCHAR columns contain nonbinary strings. The BINARY modifier causes the values they contain to be treated as binary strings.

The CHAR datatype is a fixed-length type. Values in a CHAR column always take the same amount of storage. A column defined as CHAR(30), for example, requires 30 bytes for each value, even empty values. In contrast, VARCHAR is a variable-length datatype. A VARCHAR column takes only the number of bytes required to store each value, plus one byte per value to record the value’s length.

For MySQL 4.0, the length for CHAR and VARCHAR columns is measured in bytes, not characters. There’s no difference for single-byte character sets, but the two measures are different for multi-byte character sets. In MySQL 4.1, this will change; column lengths will be measured in characters. For example, CHAR(30) will mean 30 characters, even for multi-byte character sets.

4.10.2.3 The BLOB and TEXT Column Types

The BLOB and TEXT datatypes each come in four different sizes, differing in the maximum length of values they can store. All are variable-length types, so an individual value requires storage equal to the length (in bytes) of the value, plus 1 to 4 bytes to record the length of the value. The following table summarizes these datatypes; L represents the length of a given value.

Type

Storage Required

Maximum Length

TINYBLOB, TINYTEXT

L + 1 byte

255 bytes

BLOB, TEXT

L + 2 bytes

65,535 bytes

MEDIUMBLOB, MEDIUMTEXT

L + 3 bytes

16,777,215 bytes

LONGBLOB, LONGTEXT

L + 4 bytes

4,294,967,295 bytes


BLOB column values are always binary and TEXT column values are always nonbinary. When deciding which of the two to choose for a column, you would normally base your decision on whether you want to treat column values as case sensitive or whether they contain raw bytes rather than characters. BLOB columns are more suitable for case-sensitive strings or for raw data such as images or compressed data. TEXT columns are more suitable for case- insensitive character strings such as textual descriptions.

4.10.2.4 The ENUM and SET Column Types

Two of the string column types, ENUM and SET, are used when the values to be stored in a column are chosen from a fixed set of values. You define columns for both types in terms of string values, but MySQL represents them internally as integers. This leads to very efficient storage, but can have some surprising results unless you keep this string/integer duality in mind.

ENUM is an enumeration type. An ENUM column definition includes a list of allowable values; each value in the list is called a “member” of the list. Every value stored in the column must equal one of the values in the list. A simple (and very common) use for ENUM is to create a two-element list for columns that store yes/no or true/false choices. The following table shows how to declare such columns:

CREATE TABLE booleans
(
  yesno   ENUM(‘Y’,’N’),
  truefalse ENUM(‘T’,’F’)
);

Enumeration values aren’t limited to being single letters or uppercase. The columns could also be defined like this:

CREATE TABLE booleans
(
  yesno   ENUM(‘yes’,’no’),
  truefalse ENUM(‘true’,’false’)
);

An ENUM column definition may list up to 65,535 members. Enumerations with up to 255 members require one byte of storage per value. Enumerations with 256 to 65,535 members require two bytes per value. The following table contains an enumeration column continent that lists continent names as valid enumeration members:

CREATE TABLE Countries
(
  name char(30),
  continent ENUM (‘Asia’,’Europe’,’North America’,’Africa’,
          ‘Oceania’,’Antarctica’,’South America’)
);

The values in an ENUM column definition are given as a comma-separated list of quoted strings. Internally, MySQL stores the strings as integers, using the values 1 through n for a column with n enumeration members. The following statement assigns the enumeration value Africa to the continent column; MySQL actually stores the value 4 because Africa is the fourth continent name listed in the enumeration definition:

INSERT INTO Countries (name,continent) VALUES(‘Kenya’,’Africa’);

MySQL reserves the internal value 0 as an implicit member of all ENUM columns. It’s used to represent illegal values assigned to an enumeration column. For example, if you assign USA to the continent column, MySQL will store the value 0, rather than any of the values 1 through 7, because USA is not a valid enumeration member. If you select the column later, MySQL displays 0 values as (the empty string).

The SET datatype, like ENUM, is declared using a comma-separated list of quoted strings that define its valid members. But unlike ENUM, a given SET column may be assigned a value consisting of any combination of those members. The following definition contains a list of symptoms exhibited by allergy sufferers:

CREATE TABLE allergy
(
  symptom SET(‘sneezing’,’runny nose’,’stuffy head’,’red eyes’)
);

A patient may have any or all (or none) of these symptoms, and symptom values therefore might contain zero to four individual SET members, separated by commas. The following statements set the symptom column to the empty string (no SET members), a single SET member, and multiple SET members, respectively:

INSERT INTO allergy (symptom) VALUES(”);
INSERT INTO allergy (symptom) VALUES(‘stuffy head’);
INSERT INTO allergy (symptom) VALUES(‘sneezing,red eyes’);

MySQL represents SET columns as a bitmap using one bit per member, so the elements in the symptom definition have internal values of 1, 2, 4, and 8 (that is, they have the values of bits 0 through 3 in a byte). Internally, MySQL stores the values shown in the preceding INSERT statements as 0 (no bits set), 4 (bit 2 set), and 9 (bits 0 and 3 set; that is, 1 plus 8).

A SET definition may contain up to 64 members. The internal storage required for set values varies depending on the number of SET elements (1, 2, 3, 4, or 8 bytes for sets of up to 8, 16, 24, 32, or 64 members).

If you try to store an invalid list member into a SET column, it’s ignored because it does not correspond to any bit in the column definition. For example, setting a symptom value to coughing,sneezing,wheezing results in an internal value of 1 ( sneezing ). The coughing and wheezing elements are ignored because they aren’t listed in the column definition as legal set members.

As mentioned earlier in this section, the conversion between string and numeric representations of ENUM and SET values can result in surprises if you aren’t careful. For example, although you would normally refer to an enumeration column using the string forms of its values, you can also use the internal numeric values. The effect of this can be very subtle if the string values look like numbers. Suppose that you define a table t like this:

CREATE TABLE t (age INT, siblings ENUM(‘0′,’1′,’2′,’3′,’>3′));

In this case, the enumeration values are the strings 0 , 1 , 2 , 3 , and >3 , and the matching internal numeric values are 1, 2, 3, 4, and 5, respectively. Now suppose that you issue the following statement:

INSERT INTO t (age,siblings) VALUES(14,’3′);

The siblings value is specified here as the string 3 , and that is the value assigned to the column in the new record. However, you can also specify the siblings value as a number, as follows:

INSERT INTO t (age,siblings) VALUES(14,3);

But in this case, 3 is interpreted as the internal value, which corresponds to the enumeration value 2 ! The same principle applies to retrievals. Consider the following two statements:

SELECT * FROM t WHERE siblings = ‘3’;
SELECT * FROM t WHERE siblings = 3;

In the first case, you get records that have an enumeration value of 3 . In the second case, you get records where the internal value is 3; that is, records with an enumeration value of 2 .

This chapter excerpt is from MySQL Certification Guide by Paul Dubois et al. (Sams, 2004, ISBN: 0672326329 ). Check it out at your favorite bookstore today. Buy this book now.

{mospagebreak title=4.10.3 Date and Time Column Types}

MySQL provides column types for storing different kinds of temporal information. In the following descriptions, the terms YYYY, MM, DD, hh, mm, and ss stand for a year, month, day of month, hour, minute, and second value, respectively.

The storage requirements and ranges for the date and time datatypes are summarized in the following table:

Type

Storage Required

Range

DATE

3 bytes

‘1000-01-01′ to ‘9999-12-31′

TIME

3 bytes

‘-838:59:59′ to ‘838:59:59′

DATETIME

8 bytes

‘1000-01-01 00:00:00′ to ‘9999-12-31 23:59:59′

TIMESTAMP

4 bytes

‘1970-01-01 00:00:00′ to mid-year 2037

YEAR

1 byte

1901 to 2155 (YEAR(4)), 1970 to 2069 (YEAR(2))


For TIMESTAMP, MySQL 4.0 displays values such as 1970-01-01 00:00:00 as 19700101000000. In MySQL 4.1, this changes so that TIMESTAMP display format is the same as DATETIME.

Each of these temporal datatypes also has a “zero” value that’s used when you attempt to store an illegal value. The “zero” value is represented in a format appropriate for the type (such as 0000-00-00 for DATE and 00:00:00 for TIME).

4.10.3.1 The DATE and TIME Column Types

The DATE datatype represents date values in YYYY-MM-DD format. This representation corresponds to the ANSI SQL date format, also known as ISO 8601 format.

The supported range of DATE values is 1000-01-01 to 9999-12-31 . You might be able to use earlier dates than that, but it’s better to stay within the supported range to avoid unexpected behavior.

MySQL always represents DATE values in ISO 8601 format when it displays them. If necessary, you can reformat DATE values into other display formats using the DATE_FORMAT() function.

For date entry, MySQL also expects to receive dates in ISO format, or at least close to ISO format. That is, date values must be given in year-month-day order, but some deviation from strict ISO format is allowed:

  • Leading zeros on month and day values may be omitted. For example, 2000-1-1 and 2000-01-01 are both accepted as legal.

  • The delimiter between date parts need not be -; you can use other punctuation characters, such as /.

  • Two-digit years are converted to four-digit years. You should be aware that this conversion is done based on the rule that year values from 70 to 99 represent the years 1970 to 1999, whereas values from 00 to 69 represent the years 2000 to 2069. It’s better to provide values with four-digit years to avoid problems with conversion of values for which the rule does not apply.

Instead of attempting to load values that aren’t in an acceptable format into a DATE column, you should convert them into ISO format. An alternative approach that’s useful in some circumstances is to load the values into a string column and perform reformatting operations using SQL string functions to produce ISO format values that can be assigned to a DATE column.

The TIME datatype represents time values in hh:mm:ss format. TIME values may represent elapsed time, and thus might be outside the range of time-of-day values. They may even be negative values. (The actual range of TIME values is -838:59:59 to 838:59:59 .)

MySQL represents TIME values in hh:mm:ss format when displaying them. If necessary, you can reformat TIME values into other display formats using the TIME_FORMAT() function.

For TIME value entry, some variation on this format is allowed. For example, leading zeros on TIME parts may be omitted.

4.10.3.2 The TIMESTAMP and DATETIME Column Types

The DATETIME column type stores date-and-time values in YYYY-MM-DD hh:mm:ss format. It’s similar to a combination of DATE and TIME values, but the TIME part represents time of day rather than elapsed time and has a range limited to 00:00:00 to 23:59:59 . The date part of DATETIME columns has the same range as DATE columns; combined with the TIME part, this results in a DATETIME range from 1000-01-01 00:00:00 to 9999-12-31 23:59:59 .

The TIMESTAMP type, like DATETIME, stores date-and-time values, but has a different range and some special properties that make it especially suitable for tracking data modification times. TIMESTAMP also has a different display format from DATETIME prior to MySQL 4.1:

  • Until MySQL 4.1, TIMESTAMP values are represented as numbers in YYYYMMDDhhmmss format. The default display width is 14 digits, but you can specify an explicit width of any even number from 2 to 14. The display width affects only how MySQL displays TIMESTAMP values, not how it stores them. Stored values always include the full 14 digits.

  • From MySQL 4.1 on, the TIMESTAMP format is YYYY-MM-DD hh:mm:ss , just like DATETIME. Display widths are not supported.

The range of TIMESTAMP values begins at 1970-01-01 00:00:00 (GMT) and extends partway into the year 2037. TIMESTAMP values actually represent the number of seconds elapsed since the beginning of 1970 and are stored using four bytes. This provides room for sufficient seconds to represent a date in the year 2037. Note that TIMESTAMP values are stored using the server’s local timezone.

TIMESTAMP columns have the following special properties:

  • Storing NULL into a TIMESTAMP column sets it to the current date and time. Updating a TIMESTAMP column to NULL also sets it to the current date and time.

  • If you omit a TIMESTAMP column from an INSERT statement, MySQL inserts the current date and time if the column is the first TIMESTAMP column in the table, and inserts zero if it is not.

  • MySQL automatically updates the first TIMESTAMP column in a table to the current date and time when you update (change the existing data in) any other column in the table. (Setting a column to its current value doesn’t count as updating it.) Only the first TIMESTAMP column is subject to automatic updating. All other TIMESTAMP columns do not change unless you update them explicitly.

It’s important to know about the automatic-update property. It’s what makes TIMESTAMP columns useful for tracking record modification times, but is a source of confusion if you’re not aware of it. People who choose TIMESTAMP for a column on the basis of the fact that it stores date-and-time values become dismayed and mystified when they discover that the column’s values change unexpectedly.

4.10.3.3 The YEAR Column Type

The YEAR column type represents year-only values. You can declare such columns as YEAR(4) or YEAR(2) to obtain a four-digit or two-digit display format. If you don’t specify any display width, the default is four digits.

If you don’t need a full date and the range of values you need to store falls into the YEAR range, consider using YEAR to store temporal values. It’s a very space-efficient datatype because values require only one byte of storage each.

This chapter excerpt is from MySQL Certification Guide by Paul Dubois et al. (Sams, 2004, ISBN: 0672326329 ). Check it out at your favorite bookstore today. Buy this book now.

{mospagebreak title=4.10.4 Column Options}

The final part of a column definition (following the datatype) can include optional modifiers. These options are described in the following list. Note that many of them apply only to certain column types.

  • UNSIGNED applies to numeric datatypes and causes negative values to be disallowed. If you attempt to store a negative value in an UNSIGNED column, MySQL stores zero instead.

  • ZEROFILL applies to integer numeric column types. It causes retrieved values to be left-padded with leading zeros up to the column’s display width. For example, if you store the values 0, 14, and 1234 in a column that’s defined as INT(5) ZEROFILL, MySQL displays them as 00000, 00014, and 01234 when you retrieve them.

    Using the ZEROFILL option for a column causes it to be UNSIGNED as well.

  • AUTO_INCREMENT applies to integer numeric column types. It’s used to generate sequences of successive unique values. Defining a column with AUTO_INCREMENT causes a special behavior: When you insert NULL into the column, MySQL generates the next value in the sequence automatically and stores that in the column instead. Use of AUTO_INCREMENT carries with it other requirements: There may be only one AUTO_INCREMENT column per table, the column must be indexed, and the column must be defined as NOT NULL. Section 4.10.5, “Using the AUTO_INCREMENT Column Option,” provides specific details on the use of AUTO_INCREMENT columns.

  • BINARY applies to the CHAR and VARCHAR datatypes. CHAR and VARCHAR columns are nonbinary by default; adding BINARY to the definition causes column values to be treated as binary strings.

    Beginning with MySQL 4.1, BINARY may also be applied to ENUM and SET columns to cause case-sensitive treatment of column values.

  • NULL and NOT NULL apply to all column types. They indicate whether or not a column can contain NULL values. If you specify neither option, the default is NULL, which allows NULL values in the column.

  • DEFAULT value provides the column with a default value to be used when you create a new record but don’t explicitly specify a value for the column (for example, when you execute an INSERT statement that doesn’t provide values for all columns in the table). This attribute applies to all column types except BLOB and TEXT. A default value must be a constant; it cannot be an expression whose value is calculated at record-creation time.

    If you don’t specify a DEFAULT value for a column, MySQL chooses a default for you. The value is NULL if the column may contain NULL; otherwise, the value depends on the column type:

    • For numeric columns, the default is zero.

    • For string columns other than ENUM, the default is the empty string. For ENUM columns, the default is the first enumeration member.

    • For temporal columns, the default value is the “zero” value for the datatype, represented in whatever format is appropriate to the column type (for example, 0000-00-00 for DATE and 00:00:00 for TIME).

    The exceptions to the preceding are the first TIMESTAMP column in a table and integer columns that have the AUTO_INCREMENT attribute. For such columns, MySQL uses a default value of the current date and time, and the next sequence number, respectively. Furthermore, if you supply a DEFAULT option for these column types, MySQL ignores it or produces an error.

    It’s an error to specify a default value of NULL for a NOT NULL column.

  • PRIMARY KEY and UNIQUE may be given at the end of a column definition, for all datatypes except BLOB and TEXT. They cause the creation of a PRIMARY KEY or UNIQUE index for the column. Adding either of these options to a column definition is the same as defining the index in a separate clause. For example, the following table definitions are equivalent:

    CREATE TABLE t (i INT NOT NULL PRIMARY KEY);
    
    CREATE TABLE t (i INT NOT NULL, PRIMARY KEY (i));

This chapter excerpt is from MySQL Certification Guide by Paul Dubois et al. (Sams, 2004, ISBN: 0672326329 ). Check it out at your favorite bookstore today. Buy this book now.

{mospagebreak title=4.10.5 Using the AUTO_INCREMENT Column Option}

The AUTO_INCREMENT option may be added to an integer column definition to create a column for which MySQL automatically generates a new sequence number each time you create a new row. The option is used in conjunction with an index (usually a primary key) and provides a mechanism whereby each value is a unique identifier that can be used to refer unambiguously to the row in which it occurs. MySQL also provides a LAST_INSERT_ID() function that returns the most recently generated AUTO_INCREMENT value. This function is useful for determining the identifier when you need to look up the record just created, or when you need to know the identifier to create related records in other tables.

The following scenario illustrates how you can set up and use an AUTO_INCREMENT column. Assume that you’re organizing a conference and need to keep track of attendees and the seminars for which each attendee registers. (When someone submits a registration form for the conference, the form must indicate which of the available seminars the person wants to attend.)

Your task is to record seminar registrations and associate them with the appropriate attendee. Unique ID numbers provide a way to keep track of attendees and an AUTO_INCREMENT column makes the implementation for the task relatively easy:

  1. Set up an attendee table to record information about each person attending the conference. The table shown here includes columns for ID number, name, and job title:

    mysql>
    CREATE TABLE attendee
      -> (
      ->   att_id   INT UNSIGNED NOT NULL AUTO_INCREMENT,
      ->   att_name  CHAR(100),
      ->   att_title  CHAR(40),
      ->   PRIMARY KEY (att_id)
      -> );

    The att_id column is created as a PRIMARY KEY because it must contain unique values, and as an AUTO_INCREMENT column because it’s necessary for MySQL to generate values for the column automatically.

  2. Set up a seminar table to record the seminars for which each attendee registers. Assume that there are four seminars: Database Design, Query Optimization, SQL Standards, and Using Replication. There are various ways in which these seminars can be represented; an ENUM column is one that works well because the seminar titles form a small fixed list of values. The table must also record the ID of each attendee taking part in the seminar. The table can be created with this statement:

    mysql>
    CREATE TABLE seminar
      -> (
      ->   att_id   INT UNSIGNED NOT NULL,
      ->   sem_title ENUM(‘Database Design’,’Query Optimization’,
      ->           ‘SQL Standards’,’Using Replication’),
      ->   INDEX (att_id)
      -> );

    Note both the differences and similarities of the att_id column declarations in the two tables. In attendee, att_id is an AUTO_INCREMENT column and is indexed as a PRIMARY KEY to ensure that each value in the column is unique. In seminar, att_id is indexed for faster lookups, but it isn’t indexed as a PRIMARY KEY. (There might be multiple records for a given attendee and a PRIMARY KEY does not allow duplicates.) Nor is the column declared in the seminar table with the AUTO_INCREMENT option because ID values should be tied to existing IDs in the attendee table, not generated automatically. Aside from these differences, the column is declared using the same datatype (INT) and options (UNSIGNED, NOT NULL) as the att_id column in the attendee table.

  3. Each time a conference registration form is received, enter the attendee information into the attendee table. For example:

    mysql>
    INSERT INTO attendee (att_name,att_title)
      -> VALUES(‘Charles Loviness’,’IT Manager’);

    Note that the INSERT statement doesn’t include a value for the att_id column. Because att_id is an AUTO_INCREMENT column, MySQL generates the next sequence number (beginning with 1) and sets the att_id column in the new row to that value. You can use the new att_id value to look up the record just inserted, but how do you know what value to use? The answer is that you don’t need to know the exact value. Instead, you can get the ID by invoking the LAST_INSERT_ID() function, which returns the most recent AUTO_INCREMENT value generated during your current connection with the server. Thus, the record for Charles Loviness can be retrieved like this:

    mysql>
    SELECT * FROM attendee WHERE att_id = LAST_INSERT_ID();
    +——–+——————+————+ | att_id | att_name | att_title | +——–+——————+————+ | 3 | Charles Loviness | IT Manager | +——–+——————+————+

    This output indicates that the Loviness form was the third one entered.

  4. Next, enter new records into the seminar table for each seminar marked on the entry form. The att_id value in each of these records must match the att_id value in the newly created attendee record. Here again, the LAST_INSERT_ID() value can be used. If Loviness will participate in Database Design, SQL Standards, and Using Replication, create records for those seminars as follows:

    mysql>
    INSERT INTO seminar (att_id,sem_title)
      -> VALUES(LAST_INSERT_ID(),’Database Design’);mysql>
    INSERT INTO seminar (att_id,sem_title)
      -> VALUES(LAST_INSERT_ID(),’SQL Standards’);mysql>
    INSERT INTO seminar (att_id,sem_title)
      -> VALUES(LAST_INSERT_ID(),’Using Replication’);

    To see what the new seminar records look like, use the LAST_INSERT_ID() value to retrieve them:

    mysql>
    SELECT * FROM seminar WHERE att_id = LAST_INSERT_ID();
    +——–+——————-+ | att_id | sem_title | +——–+——————-+ | 3 | Database Design | | 3 | SQL Standards | | 3 | Using Replication | +——–+——————-+
  5. When you receive the next registration form, repeat the process just described. For every new attendee record, the value of LAST_INSERT_ID() will change to reflect the new value in the att_id column.

The preceding description shows how to use an AUTO_INCREMENT column—how to declare the column, how to generate new ID values when inserting new records, and how to use the ID values to tie together related tables. However, the description glosses over some of the details. These are presented in the following discussion, beginning with declaration syntax and then providing further information about how AUTO_INCREMENT columns work.

The att_id-related declarations in the attendee table look like this:

att_id INT UNSIGNED NOT NULL AUTO_INCREMENT,
PRIMARY KEY (att_id)

These declarations involve the following factors, which you should consider when creating an AUTO_INCREMENT column:

  • The column must be an integer type. Choose the specific datatype based on the number of values the column must be able to hold. For the largest range, use BIGINT. However, BIGINT requires 8 bytes per value. If you want to use less storage, INT requires only 4 bytes per value and provides a range that’s adequate for many applications. You can use integer types smaller than INT as well, but it’s a common error to choose one that’s toosmall. For example, TINYINT has a range that allows very few unique numbers, so you’ll almost certainly run into problems using it as an AUTO_INCREMENT column for identification purposes.

  • An AUTO_INCREMENT sequence contains only positive values. For this reason, it’s best to declare the column to be UNSIGNED. Syntactically, it isn’t strictly required that you declare the column this way, but doing so doubles the range of the sequence because an UNSIGNED integer column has a larger maximum value. Defining the column as UNSIGNED also serves as a reminder that you should never store negative values in an AUTO_INCREMENT column.

  • The most common way to use an AUTO_INCREMENT column is as a primary key, which ensures unique values and prevents duplicates. The column should thus be defined to contain unique values, either as a PRIMARY KEY or a UNIQUE index. (MySQL allows you to declare an AUTO_INCREMENT column with a nonunique index, but this is less common.)

  • An AUTO_INCREMENT column defined as a PRIMARY KEY must also be NOT NULL.

After setting up an AUTO_INCREMENT column, use it as follows:

  • Inserting NULL into an AUTO_INCREMENT column causes MySQL to generate the next sequence value and store it in the column. Omitting the AUTO_INCREMENT column from an INSERT statement is the same as inserting NULL explicitly. In other words, an INSERT statement that does not provide an explicit value for an AUTO_INCREMENT column also generates the next sequence value for the column. For example, if id is an AUTO_INCREMENT column in the table t, the following two statements are equivalent:

    INSERT INTO t (id,name) VALUES(NULL,’Hans’);
    INSERT INTO t (name) VALUES(‘Hans’);
  • Currently, inserting 0 into an AUTO_INCREMENT column has the same effect as inserting NULL: the next sequence value is generated. However, it isn’t recommended that you rely on this behavior because it might change in the future.

  • A positive value can be inserted explicitly into an AUTO_INCREMENT column if the value isn’t already present in the column. If this value is larger than the current sequence counter, subsequent automatically generated values begin with the value plus one:

    mysql>
    CREATE TABLE t (id INT AUTO_INCREMENT, PRIMARY KEY (id));mysql>
    INSERT INTO t (id) VALUES(NULL),(NULL),(17),(NULL),(NULL);mysql>
    SELECT id FROM t;
    +—-+ | id | +—-+ | 1 | | 2 | | 17 | | 18 | | 19 | +—-+
  • After an AUTO_INCREMENT value has been generated, the LAST_INSERT_ID() function returns the generated value. LAST_INSERT_ID() will continue to return the same value, regardless of the number of times it’s invoked, until another AUTO_INCREMENT value is generated.

  • The value returned by LAST_INSERT_ID() is specific to the client that generates the AUTO_INCREMENT value. That is, it’s connection-specific, so the LAST_INSERT_ID() value is always correct for the current connection, even if other clients also generate AUTO_INCREMENT values of their own. Another client cannot change the value that LAST_INSERT_ID() returns to you, nor can one client use LAST_INSERT_ID() to determine the AUTO_INCREMENT value generated by another.

  • AUTO_INCREMENT behavior is the same for REPLACE as it is for INSERT. Any existing record is deleted, and then the new record is inserted. Consequently, replacing an AUTO_INCREMENT column with NULL or 0 causes it to be set to the next sequence value.

  • If you update an AUTO_INCREMENT column to NULL or 0 in an UPDATE statement, the column is set to 0.

  • If you delete rows containing values at the high end of a sequence, those values are not reused for MyISAM or InnoDB tables when you insert new records. For example, if an AUTO_INCREMENT column contains the values from 1 to 10 and you delete the record containing 10, the next sequence value is 11, not 10. (This differs from ISAM and BDB tables, for which values deleted from the high end of a sequence arereused.)

The MyISAM storage engine supports composite indexes that include an AUTO_INCREMENT column. This allows creation of independent sequences within a single table. Consider the following table definition:

CREATE TABLE multisequence
(
  name   CHAR(10) NOT NULL,
  name_id INT UNSIGNED NOT NULL AUTO_INCREMENT,
  PRIMARY KEY (name, name_id)
);

Inserting name values into the multisequence table generates separate sequences for each distinct name:

mysql>
INSERT INTO multisequence (name)
  -> VALUES(‘Petr’),(‘Ilya’),(‘Ilya’),(‘Yuri’),(‘Ilya’),(‘Petr’);mysql>
SELECT * FROM multisequence ORDERBYname, name_id;
+——+———+ | name | name_id | +——+———+ | Ilya | 1 | | Ilya | 2 | | Ilya | 3 | | Petr | 1 | | Petr | 2 | | Yuri | 1 | +——+———+

Note that for this kind of AUTO_INCREMENT column, values deleted from the high end of any sequence are reused. This differs from MyISAM behavior for single-column AUTO_INCREMENT sequences.

This chapter excerpt is from MySQL Certification Guide by Paul Dubois et al. (Sams, 2004, ISBN: 0672326329 ). Check it out at your favorite bookstore today. Buy this book now.

{mospagebreak title=4.10.6 Automatic Type Conversion and Value Clipping}

For historical reasons, MySQL is forgiving about signaling an error if a given value doesn’t match the datatype of the column that is the insert target. Instead, MySQL does its best to perform automatic type conversion. For example, if you attempt to store a negative value in an UNSIGNED integer column, MySQL silently converts it to zero, which is the nearest legal value for the column. In other words, the MySQL server converts input values to the types expected from the column definitions, inserts the result, and continues on its way.

If you need to prevent attempts to insert invalid values into a table, you should first validate the values on the client side; however, because that isn’t an exam topic, it isn’t discussed further here.

This section describes the kinds of conversions that MySQL performs and the circumstances under which they occur. After you know these principles, you’ll know what types of validation are necessary before trying to store your data in a MySQL database.

In many cases, type conversion affords you the flexibility to write a statement different ways and get the same result. For example, if i is an integer column, the following statements both insert 43 into it, even though the value is specified as a number in one statement and as a string in the other:

INSERT INTO t (i) VALUES(43);
INSERT INTO t (i) VALUES(’43’);

MySQL performs automatic string-to-number conversion for the second statement.

In other cases, the effects of type conversion might be surprising, particularly if you’re unaware that these conversions occur. You can avoid such surprises by understanding the conditions under which conversion takes place. In general, MySQL performs type conversion based on the constraints implied by a column’s definition. These constraints apply in several contexts:

  • When you insert or update column values with statements such as INSERT, REPLACE, UPDATE, or LOAD DATA INFILE.

  • When you change a column definition with ALTER TABLE.

  • When you specify a default value using a DEFAULT value option in a column definition. (For example, if you specify a negative default for an UNSIGNED column, the value is converted, resulting in a default of zero.)

The following list discusses some of the conversions that MySQL performs. It isn’t exhaustive, but is sufficiently representative to provide you with a good idea of how MySQL treats input values and what you’ll be tested on in the exam. Circumstances under which automatic type conversion occurs include the following:

  • Conversion of out-of-range values to in-range values.If you attempt to store a value that’s smaller than the minimum value allowed by the range of a column’s datatype, MySQL stores the minimum value in the range. If you attempt to store a value that’s larger than the maximum value in the range, MySQL stores the range’s maximum value. Some examples of this behavior are as follows:

    • TINYINT has a range of –128 to 127. If you attempt to store values less than –128 in a TINYINT column, MySQL stores –128 instead. Similarly, MySQL stores values greater than 127 as 127.

    • If you insert a negative value into an UNSIGNED integer column, MySQL converts the value to 0.

    • When you reach the upper limit of an AUTO_INCREMENT column, an attempt to generate the next sequence value results in a duplicate-key error. This is a manifestation of MySQL’s general out-of-range value clipping behavior. For example, assume that you have a TINYINT UNSIGNED column as an AUTO_INCREMENT column and that it currently contains 254 as the maximum sequence value. The upper limit for this column type is 255, so the next insert generates a sequence value of 255 and successfully stores it in the new record. However, the insert after that fails because MySQL generates the next sequence value, which is 256. Because 256 is higher than the column’s upper limit of 255, MySQL clips 256 down to 255 and attempts to insert that value. But because 255 is already present in the table, a duplicate-key error occurs.

  • Conversion to datatype default.If you attempt to store a value for which MySQL cannot decide on an appropriate conversion, it stores the default value for the datatype of the target column. For example, if you try to store the value Sakila in an INT column, MySQL stores the value 0. For dates, the “zero” value is 0000-00-00 and for time columns 00:00:00. More details on the default for each column type are given in section 4.10.4, “Column Options.”

  • String truncation.If you attempt to store a string value into a VARCHAR or CHAR column with a defined length that’s shorter than the string, the string is truncated to fit the column’s length. That is, only the leading characters that fit into the column are stored. The remaining characters are discarded. For example, if you try to store the value Sakila into a column defined as CHAR(4), MySQL stores the value Saki .

  • Date and time interpretation.The server performs streamlined checking of temporal values. It looks at individual components of date and time values, but does not perform an exhaustive check of the value as a whole. For example, day values may be considered valid as long as they’re within the range 1 to 31. This means you can specify a date such as 2000-04-31 and MySQL will store it as given. However, a DATETIME value such as 2000-01-01 24:00:00 contains an hour value of 24, which is never valid as a time of day. Consequently, MySQL stores the “zero” value in DATETIME format ( 0000-00-00 00:00:00 ).

  • Addition of century for two-digit years.Like many other computer programs, MySQL converts two-digit years to four-digit years. Values 00 to 69 are converted to 2000-2069; values 70 to 99 are converted to 1970-1999.

  • Enumeration and set value conversion.If a value that’s assigned to an ENUM column isn’t listed in the ENUM definition, MySQL converts it to (the empty string). If a value that’s assigned to a SET column contains elements that aren’t listed in the SET definition, MySQL discards those elements, retaining only the legal elements.

  • Handing assignment ofNULL toNOT NULL columns.The effect of assigning NULL to a NOT NULL column depends on whether the assignment occurs in a single-row or multiple-row INSERT statement. For a single-row INSERT, the statement fails. For a multiple-row INSERT, the column is assigned the default value for the column type.

  • Conversion of fixed-point values.Conversion, into numbers, of string values that can be interpreted as numbers is different for DECIMAL than for other numeric datatypes. This occurs because DECIMAL values are represented as strings rather than in native binary format. For example, if you assign 0003 to an INT or FLOAT, it’s stored as 3 in integer or floating-point binary format. In contrast, if you assign 0003 to a DECIMAL column, it’s stored without change, including the leading zeros, even though it will behave identically to a 3 in mathematical operations. If the DECIMAL column isn’t wide enough to accommodate the leading zeros, as many are stored as possible. If you store 0003 into a DECIMAL(2,0) UNSIGNED column, it’s converted to 03 .

Using ALTER TABLE to change a column’s datatype maps existing values to new values according to the constraints imposed by the new datatype. This might result in some values being changed. For example, if you change a TINYINT to an INT, no values are changed because all TINYINT values fit within the INT range. However, if you change an INT to a TINYINT, any values that lie outside the range of TINYINT are clipped to the nearest endpoint of the TINYINT range. Similar effects occur for other types of conversions, such as TINYINT to TINYINT UNSIGNED (negative values are converted to zero), and converting a long string column to a shorter one (values that are too long are truncated to fit the new size).

If a column is changed to NOT NULL using ALTER TABLE, MySQL converts NULL values to the default value for the column type.

The following table shows how several types of string values are handled when converted to date or numeric datatypes. It demonstrates several of the points just discussed. Note that only string values that look like dates or numbers convert properly without loss of information. Note too that leading zeros are retained for the DECIMAL column during conversion.

String Value

Converted to DATE

Converted to INT

Converted to DECIMAL

2010-12-03

2010-12-03

2010

2010

zebra

0000-00-00

0

0

500 hats

0000-00-00

500

500

1978-06-12

1978-06-12

1970

1970

06-12-1978

0000-00-00

6

06

0017

0000-00-00

17

0017

 

This chapter excerpt is from MySQL Certification Guide by Paul Dubois et al. (Sams, 2004, ISBN: 0672326329 ). Check it out at your favorite bookstore today. Buy this book now.

[gp-comments width="770" linklove="off" ]

chat