In this article, Vikram gives us a sneak-peek under the hood of MySQL to see what makes it tick, all the while explaining the various MySQL subsystems and how they interact with each other. This excerpt comes from Chapter two of MySQL: The Complete Reference (McGraw-Hill/Osborne, ISBN 0-07-222477-0, 2004).
Atomicity A transaction is defined as an action, or a series of actions, that can access or change the contents of a database. In SQL terminology, a transaction occurs when one or more SQL statements operate as one unit. Each SQL statement in such a unit is dependent on the others; in other words, if one statement does not complete, the entire unit will be rolled back, and all the affected data will be returned to the state it was in before the transaction was started. Grouping the SQL statements as part of a single unit (or transaction) tells MySQL that the entire unit should be executed atomically.
Atomic execution is an all-or-nothing proposition. All of the SQL statements must be completed for the database to maintain a state of data integrity; otherwise, none of the statements will be finalized and committed to disk. In MySQL, the beginning of a transaction is marked with a BEGIN statement. The transaction (or unit of work) will not be considered complete until a COMMIT command is issued to tell MySQL to complete the action. When necessary, the ROLLBACK command will initiate a rolling back of all changes to the state before the BEGIN statement.
An everyday real-world example of this can be found in the banking business. By debiting and crediting your bank account, your bank adds and subtracts money from the account within one transaction. These updates usually involve multiple tables. The bank would not be able to maintain data integrity without guaranteeing that the entire transaction will take place, not just part of it.
Transaction management is particularly important to client-server systems that perform data entry or to any application that must be able to count on a high degree of safety from undetected data loss, such as the banking example described here.
Consistency Consistency exists when every transaction leaves the system in a consistent state, regardless of whether the transaction completes successfully or fails midway.
For example, imagine that your bank uses a transaction that is supposed to transfer money from one bank account to another. If the transaction debits one bank account for the requisite amount but fails to credit the other account with a corresponding amount, the system would no longer be in a consistent state. In this case, the transaction would violate the consistency constraint, and the system would no longer be ACID-compliant.
In MySQL, consistency is primarily handled by MySQL’s logging mechanisms, which record all changes to the database and provide an audit trail for transaction recovery. If the system goes down in the middle of a transaction, the MySQL recovery process will use these logs to discover whether or not the transaction was successfully completed and roll it back if required.
In addition to the logging process, MySQL also provides locking mechanisms that ensure that all of the tables, rows, and indexes that make up the transaction are locked by the initiating process long enough to either commit the transaction or roll it back.
Isolation Isolation implies that every transaction occurs in its own space, isolated from other transactions that may be occurring in the system, and that the results of a transaction are visible only once the entire sequence of events making up the transaction has been fully executed. Even though multiple transactions may be occurring simultaneously in such a system, the isolation principle ensures that the effects of a particular transaction are not visible until the transaction is fully complete.
This is particularly important when the system supports multiple simultaneous users and connections (as MySQL does); systems that do not conform to this fundamental principle can cause massive data corruption, as the integrity of each transaction’s individual space will be quickly violated by other competing, often conflicting, transactions.
Interestingly, MySQL offers the use of server-side semaphore variables that act as traffic managers to help programs manage their own isolation mechanisms. These variables are useful in cases for which you prefer not to incur the overhead of the Transaction Managers, or when a recovery plan is possible outside of the confines of log recovery. MySQL InnoDB tables offer isolation in transactions involving multiple queries, while MyISAM tables allow you to simulate isolation via the LOCK TABLES command.
Durability Durability, which means that changes from a committed transaction persist even if the system crashes, comes into play when a transaction has completed and the logs have been updated in the database. Most RDBMS products ensure data consistency by keeping a log of all activity that alters data in the database in any way. This database log keeps track of any and all updates made to tables, queries, reports, and so on. If you have turned on the database log , you already know that using it will slow down the performance of your database when it comes to writing data. (It will not, however, affect the speed of your queries.)
In MySQL, you can specify whether or not you wish to use transactions by choosing the appropriate table handlers, depending on your application. The InnoDB table handler performs logging a bit differently than BDB does, while MyISAM does not support the type of logs that would permit you to be assured of a durable database. By default, InnoDB tables are 100% durable to the last second prior to a crash. MyISAM tables offer partial durability—all changes committed to the system prior to the last FLUSH TABLES command are guaranteed to be saved to disk.
SQL Server and Oracle, for instance, are able to restore a database to a previous state by restoring a previously backed-up database and, in essence, “replaying” all subsequent transactions up until the point of failure. These database products do not encourage the direct use of—nor do they expose the inner data structures of—the log files, because those files form part of the database engine’s recovery mechanism.
MySQL also keeps a binary log of all data manipulation activity in sequential order. However, unlike the logs used in other databases, this log is easy to read, which means that it’s a relatively straightforward task to recover lost data by using the last backup in combination with the log.
Remember: this is chapter two of MySQL: The Complete Reference, by Vikram Vaswani (McGraw-Hill/Osborne, ISBN 0-07-222477-0, 2004). Vikram is the founder of Melonfire, and has had numerous articles featured on Dev Shed. Buy this book now.