This chapter covers the MySQL architecture, locking and concurrency, and transactions. It also discusses how to select the right engine and looks at each of MySQL's storage engines in detail. (From the book High Performance MYSQL: Optimization, Backups, Replication and Load Balancing, by Jeremy Zawodny and Derek Balling, ISBN: 0596-003064, O'Reilly Media, 2004.)
There is a final technique for increasing concurrency: Multi-Version Concurrency Control (MVCC). Often referred to simply as versioning, MVCC is used by Oracle, by PostgreSQL, and by MySQL’s InnoDB storage engine. MVCC can be thought of as a new twist on row-level locking. It has the added benefit of allowing nonlocking reads while still locking the necessary records only during write operations. Some of MVCC’s other properties will be of particular interest when we look at transactions in the next section.
So how does this scheme work? Conceptually, any query against a table will actually see a snapshot of the data as it existed at the time the query began—no matter how long it takes to execute. If you’ve never experienced this before, it may sound a little crazy. But give it a chance.
In a versioning system, each row has two additional, hidden values associated with it. These values represent when the row was created and when it was expired (or deleted). Rather than storing the actual time at which these events occur, the database stores the version number at the time each event occurred. The database version (or system version) is a number that increments each time a query* begins. We’ll call these two values the creation id and the deletion id.
* That’s not quite true. As you’ll see when we start talking about transactions later, the version number is incremented for each transaction rather than each query.
Under MVCC, a final duty of the database server is to keep track of all the running queries (with their associated version numbers). Let’s see how this applies to particular operations:
SELECT
When records are selected from a table, the server must examine each row to ensure that it meets several criteria:
Its creation id must be less than or equal to the system version number. This ensures that the row was created before the current query began.
Its deletion id, if not null, must be greater than the current system version. This ensures that the row wasn’t deleted before the current query began.
Its creation id can’t be in the list of running queries. This ensures that the row wasn’t added or changed by a query that is still running.
Rows that pass all of these tests may be returned as the result of the query.
INSERT
When a row is added to a table, the database server records the current version number along with the new row, using it as the row’s creation id.
DELETE
To delete a row, the database server records the current version number as the row’s deletion id.
UPDATE
When a row is modified, the database server writes a new copy of the row, using the version number as the new row’s creation id. It also writes the version number as the old row’s deletion id.
The result of all this extra record keeping is that read queries never lock tables, pages, or rows. They simply read data as fast as they can, making sure to select only rows that meet the criteria laid out earlier. The drawbacks are that the server has to store a bit more data with each row and do a bit more work when examining rows. Table 2-1 summarizes the various locking models and concurrency in MySQL.
Locking strategy
Concurrency
Overhead
Engines
Table locks
Lowest
Lowest
MyISAM, Heap, Merge
Page locks
Modest
Modest
BDB
Multiversioning
Highest
High
InnoDB
Table 2-1.Locking models and concurrency in MySQL
If you've enjoyed what you've seen here, or to get more information, click on the "Buy the book!" graphic. Pick up a copy today!