A Technical Tour of MySQL

In this article, Vikram gives us a sneak-peek under the hood of MySQL to see what makes it tick, all the while explaining the various MySQL subsystems and how they interact with each other. This excerpt comes from Chapter two of MySQL: The Complete Reference (McGraw-Hill/Osborne, ISBN 0-07-222477-0, 2004).

MySQL: The Complete ReferenceThe bulk of this book addresses the capabilities of the MySQL engine and illustrates the uses of MySQL through sample applications that manipulate data stored in various types of databases. Before we get into that, though, it’s important that you have a sound technical understanding of how MySQL works, so that you can make informed decisions about how best to deploy and optimize the application for your needs.

To that end, this chapter is primarily focused on giving you an overall view of the components and functionality that make up MySQL. Understanding these features will be helpful as you read further through the book and apply your knowledge to the development of MySQL-based applications.

An Overview of MySQL Architecture

MySQL is based on a tiered architecture, consisting of both primary subsystems and support components that interact with each other to read, parse, and execute queries, and to cache and return query results.

Primary Subsystems
The MySQL architecture consists of five primary subsystems that work together to respond to a request made to the MySQL database server:

  • The Query Engine
  • The Storage Manager
  • The Buffer Manager
  • The Transaction Manager
  • The Recovery Manager

The organization of these features is shown in Figure 1.We’ll explain each one briefly to help you gain a better understanding of how the parts fit together.

FIGURE 1 MySQL subsystems

The Query Engine
This subsystem contains three interrelated components:

  • The Syntax Parser
  • The Query Optimizer
  • The Execution Component

The Syntax Parser decomposes the SQL commands it receives from calling programs into a form that can be understood by the MySQL engine. The objects that will be used are identified, along with the correctness of the syntax. The Syntax Parser also checks the objects being referenced to ensure that the privilege level of the calling program allows it to use them.

The Query Optimizer then streamlines the syntax for use by the Execution Component, which then prepares the most efficient plan of query execution. The Query Optimizer checks to see which index should be used to retrieve the data as quickly and efficiently as possible. It chooses one from among the several ways it has found to execute the query and then creates a plan of execution that can be understood by the Execution Component.

The Query Optimizer uses probability-based induction, so you may want to override it if you think that you already know the ideal way to access your query results; this will prevent the engine from using another, less optimal plan. MySQL provides you the option of giving the engine certain “hints” if you want it to use specific indexes without checking the Optimizer.

The Execution Component then interprets the execution plan and, based on the information it has received, makes requests of the other components to retrieve the records.

Adapting to the Environment

MySQL takes advantage of the separation between the Query Engine and the rest of the components to provide table handlers in addition to the native MyISAM handler, such as handlers for HEAP and MERGE table types. It also provides a new Storage Manager and a new Transaction Manager for each table handler. Since InnoDB and BDB table types have transaction features not present in the native MySQL types, these were modified by their respective third-party creators to adapt to being plugged into MySQL’s existing Query Engine.

You can read more about the various table types supported by MySQL in Chapter 8.

Remember: this is chapter two of MySQL: The Complete Reference, by Vikram Vaswani (McGraw-Hill/Osborne, ISBN 0-07-222477-0, 2004). Vikram is the founder of Melonfire, and has had numerous articles featured on Dev Shed. 
Buy this book now.

{mospagebreak title=An Overview of MySQL Architecture (cont’d)}

The Storage Manager
The Storage Manager interfaces with the operating system (OS) to write data to the disk efficiently. Because the storage functions reside in a separate subsystem, the MySQL engine operates at a level of abstraction away from the operating system. This means that if you port to a different operating system that uses a different storage mechanism, for example, you can rewrite only the storage portion of the code while leaving the rest of the engine as is. With the help of MySQL’s Function Libraries (discussed shortly in the section “Support Components”), the Storage Manager writes to disk all of the data in the user tables, indexes, and logs as well as the internal system data.

The Query Cache   If a query returns a given set of records, repeating the same query should return the same set of records unless the underlying data has somehow changed. As obvious as this sounds, few of the other major relational database management system (RDBMS) vendors provide features that take advantage of this principle. Other database products are efficient in storing optimized access plans that detail the process by which data is retrieved; such plans allow queries similar to those that have been issued previously to bypass the process of analyzing indexes yet again to get to the data.

Result set caching takes this principle a step further by storing the result sets themselves in memory, thus circumventing the need to search the database at all.  The data from a query is simply placed in a cache, and when a similar query is issued, this data is returned as if in response to the query that created it in the first place.

The MySQL engine uses an extremely efficient result set caching mechanism, known as the Query Cache, that dramatically enhances response times for queries that are called upon to retrieve the exact same data as a previous query.

This mechanism is so efficient that a major computing publication declared MySQL queries to be faster than those of Oracle and SQL Server (which are both known for their speed). If implemented properly, decision support systems using MySQL with canned reports or data-driven web pages can provide response speeds far beyond those that would be expected without the Query Cache.

The Buffer Manager
This subsystem handles all memory management issues between requests for data by the Query Engine and the Storage Manager. MySQL makes aggressive use of memory to cache result sets that can be returned as-is rather than making duplicate requests to the Storage Manager; this cache is maintained in the Buffer Manager.

This is also the area where new records can be cached while waiting for availability of targeted tables and indexes. If any new data is needed, it’s requested from the Storage Manager and placed in the buffer before then being sent to the Query Engine.

The Transaction Manager
The function of the Transaction Manager is to facilitate concurrency in data access. This subsystem provides a locking facility to ensure that multiple simultaneous users access the data in a consistent way, without corrupting or damaging the data in any way. Transaction control takes place via the Lock Manager subcomponent, which places and releases locks on various objects being used in transactions. Each transactional table handler implements its own Transaction Manager to handle all locking and concurrency needs.

The Recovery Manager
The Recovery Manager’s job is to keep copies of data for retrieval later, in case of a loss of data. It also logs commands that modify the data and other significant events inside the database.

So far, only the InnoDB and BDB table handlers provide recovery management. The MyISAM handler doesn’t have transactional recovery procedures, but it does provide mechanisms that apply certain recovery features in case of a server outage; these features “fix” any internal inconsistencies that might occur as the result of such a crash. Such inconsistencies are usually related to indexes not being properly updated to reflect the contents of a table or records being incompletely written to a database.

Remember: this is chapter two of MySQL: The Complete Reference, by Vikram Vaswani (McGraw-Hill/Osborne, ISBN 0-07-222477-0, 2004). Vikram is the founder of Melonfire, and has had numerous articles featured on Dev Shed. 
Buy this book now.

{mospagebreak title=Support Components}

In addition to the five primary subsystems, the MySQL architecture contains the following two support components:

  • The Process Manager
  • Function Libraries

The Process Manager
This component performs two functions in the system. First, it manages user connections, via modules for network connection management with clients. Second, it synchronizes competing tasks and processes, via modules for multi-threading, thread locking, and performing thread-safe operations.

Function Libraries
This component contains general-purpose routines that are used by all the other subsystems. It includes routines for string manipulation, sorting operations, and such operating-system-specific functions as memory management and file I/O.

Subsystem/Component Interaction and Control Flow
The Query Engine requests that data be read from or written to the Buffer Manager to satisfy a user query. It depends on the Transaction Manager to request the locking of data so that concurrency is ensured. To perform table creation and drop operations, the Query Engine accesses the Storage Manager directly, bypassing the Buffer Manager, to create or delete files in the file system.

The Buffer Manager caches data from the Storage Manager for efficient retrieval by the Query Engine. It depends on the Transaction Manager to check the locking status of the data before it performs any modification operations.

The Transaction Manager depends on the Query Cache and the Storage Manager to place locks on data in memory and in the file system, respectively.

The Recovery Manager uses the Storage Manager to store command/event logs and backups of the data in the file system. It depends on the Transaction Manager to obtain locks on the log files being written. The Recovery Manager also needs to use the Buffer Manager during recovery from crashes.

The Storage Manager depends on the operating system file system for persistent storage and retrieval of data. It depends on the Transaction Manager to obtain locking status information.

Remember: this is chapter two of MySQL: The Complete Reference, by Vikram Vaswani (McGraw-Hill/Osborne, ISBN 0-07-222477-0, 2004). Vikram is the founder of Melonfire, and has had numerous articles featured on Dev Shed. 
Buy this book now.

{mospagebreak title=The MySQL Engine}

MySQL supports small, embedded kiosk-style applications as well as the occasional five billion-record data warehouse. This versatility is possible in part because of the MySQL engine, which has been designed for maximum scalability, maximum resource efficiency, and easy portability to various platforms and architectures. This section will discuss the important characteristics of this engine in greater detail.

MySQL is designed on the assumption that the vast majority of its applications will be running on a TCP/IP (Transmission Control Protocol/Internet Protocol) network. This is a fairly good assumption, given that TCP/IP is not only highly robust and secure, but is also common to UNIX, Windows, OS/2, and almost any other serious operating system you’ll likely encounter. When the client and the server are on the same UNIX machine, MySQL uses TCP/IP with UNIX sockets, which operate in the UNIX domain; that is, they are generally used between processes on the same UNIX system (as opposed to Internet sockets, which operate between networks).

That being said, MySQL does allow named-pipe connections, which were designed mainly to support network connections in earlier non-TCP/IP networks, such as LAN Manager and Windows NETBEUI. (NETBEUI uses an addressing scheme based on the NETBIOS machine name rather than a routable IP address.)

The Structured Query Language (SQL) is an open standard that has been maintained by the American National Standards Institute (ANSI) since 1986. Although it’s true that the implementation of this standard does differ in varying degrees from vendor to vendor, it’s fair to say that SQL is today one of the most widely used cross-vendor languages. As with other implementations, such as SQL Server’s T-SQL (Transact-SQL) and Oracle’s SQL, MySQL has its own variations of the SQL standard that add power beyond what is available within the standard. Although MySQL’s SQL queries will be explored in much greater detail in subsequent chapters, we’ll introduce their use in this section to provide a framework for your understanding of the data-retrieval and decision-support capabilities of MySQL.

One thing that can be said about SQL is that it’s easy to get started. With a simple statement like this one, you can begin retrieving data:

mysql> SELECT year, make, model, price FROM vehiclesales LIMIT 10;
| year | make            | model               | price |
| 1982 | AMERICAN MOTORS | EAGLE 50-4WD-6 CYL. | 575   |
| 1983 | BUICK           | CENTURY LIMITED-V6  | 1000  |
| 1983 | BUICK           | REGAL LIMITED       | 525   |
| 1983 | BUICK           | REGAL LIMITED       | 1250  |
| 1983 | BUICK           | REGAL LIMITED       | 2250  |
| 1983 | BUICK           | REGAL LIMITED       | 700   |
| 1983 | BUICK           | SKYLARK LIMITED-V6  | 500   |
| 1983 | BUICK           | LESABRE CUSTOM-V8   | 550   |
| 1983 | BUICK           | ELECTRA LIMITED-V8  | 500   |
| 1984 | BUICK           | CENTURY CUSTOM-V6   | 850   |
10 rows in set (0.00 sec)

As you can see, the syntax isn’t arcane, nor is it difficult to understand—in fact, it looks almost like spoken English (which is great for English speakers). If you wanted to sort the above result set by price, you would simply add the phrase ORDER BY price to the end of the statement and run it.

Compared to C++, Java, and Perl, SQL at first glance seems easy to learn and hard to obfuscate, which often makes newcomers to SQL think that the learning curve will be much shorter than it actually is. In reality, it can take just as long to master SQL as it does other languages—people have written entire books about using SQL for data extraction without even addressing the other data manipulation features that the language offers. Take a look at the following query, which is perhaps more representative of the type you’ll be called upon to perform:

mysql> SELECT
AVG(a.price) AS s_price,
AVG(b.price) AS a_price
FROM vehiclesales a INNER JOIN vehiclesales b
a.year = (b.year + 1) and
a.make = b.make and
a.model = b.model and
a.trim = b.trim
HAVING AVG(a.price) > (AVG(b.price)* 1.25);

This query retrieves the names of vehicles that have an average price that has gone over 25 percent of the average price of the same type of vehicle for a previous year.

Data Integrity
MySQL supports engine-level data integrity through the use of primary key and foreign key constraints. Columns can be defined so that explicit NULL values cannot be entered into them. To prevent empty columns, MySQL supports the use of default values, which, when combined with NOT NULL properties, ensure that valid data is entered into a column that would otherwise be left blank.

Until recently, MySQL was not known for its transaction-handling capabilities; however, since version 3.23, MySQL has been providing table handlers, such as InnoDB and BDB, that manage transactions in much the same manner as other commercial RDBMS products. A transaction-safe database system must pass what is known as the ACID test to qualify for compliance. An ACID-compliant database must support the following characteristics:

  • Atomicity
  • Consistency
  • Isolation
  • Durability

Let’s look at these characteristics in detail.

Coming Soon…
While future releases of MySQL are expected to have foreign keys in all table types, these referential integrity declarations are currently available only with the InnoDB table type.

Remember: this is chapter two of MySQL: The Complete Reference, by Vikram Vaswani (McGraw-Hill/Osborne, ISBN 0-07-222477-0, 2004). Vikram is the founder of Melonfire, and has had numerous articles featured on Dev Shed. 
Buy this book now.

{mospagebreak title=Characteristics in Detail}

A transaction is defined as an action, or a series of actions, that can access or change the contents of a database. In SQL terminology, a transaction occurs when one or more SQL statements operate as one unit. Each SQL statement in such a unit is dependent on the others; in other words, if one statement does not complete, the entire unit will be rolled back, and all the affected data will be returned to the state it was in before the transaction was started. Grouping the SQL statements as part of a single unit (or transaction) tells MySQL that the entire unit should be executed atomically.

Atomic execution is an all-or-nothing proposition. All of the SQL statements must be completed for the database to maintain a state of data integrity; otherwise, none of the statements will be finalized and committed to disk. In MySQL, the beginning of a transaction is marked with a BEGIN statement. The transaction (or unit of work) will not be considered complete until a COMMIT command is issued to tell MySQL to complete the action. When necessary, the ROLLBACK command will initiate a rolling back of all changes to the state before the BEGIN statement.

An everyday real-world example of this can be found in the banking business. By debiting and crediting your bank account, your bank adds and subtracts money from the account within one transaction. These updates usually involve multiple tables. The bank would not be able to maintain data integrity without guaranteeing that the entire transaction will take place, not just part of it.

Transaction management is particularly important to client-server systems that perform data entry or to any application that must be able to count on a high degree of safety from undetected data loss, such as the banking example described here.

Consistency exists when every transaction leaves the system in a consistent state, regardless of whether the transaction completes successfully or fails midway.

For example, imagine that your bank uses a transaction that is supposed to transfer money from one bank account to another. If the transaction debits one bank account for the requisite amount but fails to credit the other account with a corresponding amount, the system would no longer be in a consistent state. In this case, the transaction would violate the consistency constraint, and the system would no longer be ACID-compliant.

In MySQL, consistency is primarily handled by MySQL’s logging mechanisms, which record all changes to the database and provide an audit trail for transaction recovery. If the system goes down in the middle of a transaction, the MySQL recovery process will use these logs to discover whether or not the transaction was successfully completed and roll it back if required.

In addition to the logging process, MySQL also provides locking mechanisms that ensure that all of the tables, rows, and indexes that make up the transaction are locked by the initiating process long enough to either commit the transaction or roll it back.

Isolation implies that every transaction occurs in its own space, isolated from other transactions that may be occurring in the system, and that the results of a transaction are visible only once the entire sequence of events making up the transaction has been fully executed. Even though multiple transactions may be occurring simultaneously in such a system, the isolation principle ensures that the effects of a particular transaction are not visible until the transaction is fully complete.

This is particularly important when the system supports multiple simultaneous users and connections (as MySQL does); systems that do not conform to this fundamental principle can cause massive data corruption, as the integrity of each transaction’s individual space will be quickly violated by other competing, often conflicting, transactions.

Interestingly, MySQL offers the use of server-side semaphore variables that act as traffic managers to help programs manage their own isolation mechanisms. These variables are useful in cases for which you prefer not to incur the overhead of the Transaction Managers, or when a recovery plan is possible outside of the confines of log recovery. MySQL InnoDB tables offer isolation in transactions involving multiple queries, while MyISAM tables allow you to simulate isolation via the LOCK TABLES command.

Durability, which means that changes from a committed transaction persist even if the system crashes, comes into play when a transaction has completed and the logs have been updated in the database. Most RDBMS products ensure data consistency by keeping a log of all activity that alters data in the database in any way. This database log keeps track of any and all updates made to tables, queries, reports, and so on. If you have turned on the database log , you already know that using it will slow down the performance of your database when it comes to writing data. (It will not, however, affect the speed of your queries.)

In MySQL, you can specify whether or not you wish to use transactions by choosing the appropriate table handlers, depending on your application. The InnoDB table handler performs logging a bit differently than BDB does, while MyISAM does not support the type of logs that would permit you to be assured of a durable database. By default, InnoDB tables are 100% durable to the last second prior to a crash. MyISAM tables offer partial durability—all changes committed to the system prior to the last FLUSH TABLES command are guaranteed to be saved to disk.

SQL Server and Oracle, for instance, are able to restore a database to a previous state by restoring a previously backed-up database and, in essence, “replaying” all subsequent transactions up until the point of failure. These database products do not encourage the direct use of—nor do they expose the inner data structures of—the log files, because those files form part of the database engine’s recovery mechanism.

MySQL also keeps a binary log of all data manipulation activity in sequential order. However, unlike the logs used in other databases, this log is easy to read, which means that it’s a relatively straightforward task to recover lost data by using the last backup in combination with the log.

Remember: this is chapter two of MySQL: The Complete Reference, by Vikram Vaswani (McGraw-Hill/Osborne, ISBN 0-07-222477-0, 2004). Vikram is the founder of Melonfire, and has had numerous articles featured on Dev Shed. 
Buy this book now.

{mospagebreak title=Extensibility}

In most RDBMS products, you can extend the capabilities of the database by using stored procedures. The programmability is usually further extended by enhancements to SQL that contains control-of-flow statements and conditional logic, as SQL Server does with T-SQL and Oracle with PL/SQL.

As of yet, MySQL includes no support for stored procedures, but one of the great benefits of this RDBMS is its extensibility. In keeping with its open-source roots, MySQL makes the original source code available as part of the distribution, which permits developers to add new functions and features that are compiled into the engine as part of the core product. MySQL also allows separate C and C++ libraries to be loaded in the same memory space as the engine when MySQL starts up.

Either of these methods will allow users to interact with your functions in the same way as they would with any of the already built-in functions available in MySQL, such as SUM() or AVG(). Because these functions run in the same memory space as MySQL, and because they execute on the server where MySQL is located, using them minimizes network traffic between the calling program and the server, thereby vastly increasing performance.

You can add functions to MySQL through a special user-defined function interface. User-defined functions are created initially as special C/C++ libraries and are then added and removed dynamically by means of the CREATE FUNCTION and DROP FUNCTION statements. User-defined functions come with the added burden of having to install all your libraries with every installation of MySQL; however, this does make deployment faster, since you can load these functions into the binary distribution rather than having to go through the trouble of compiling MySQL all over again to incorporate the new functions. In addition, there’s generally a good chance that your libraries will continue to work unchanged as new versions of MySQL are released.

You can also add functions as native (built-in) MySQL functions. Native functions are compiled into the MySQL server engine and become a permanent part of the MySQL installation itself. This makes new installations much easier than using libraries, because once you have defined your distribution, you can be sure that all of your subsequent installations will contain the functions that you have added. On the other hand, you must recompile and re-create your distribution for every new release of MySQL.

Getting Creative

You can simulate the encapsulation and centralization of a stored procedure in MySQL by storing code in a table and then retrieving the text of that code to be executed by the calling program. For example, a Perl program could have a table that includes rows containing nothing but Perl code. The client program would retrieve the needed code block and evaluate the code at run time.

This technique lets you place logic at the server without requiring that client programs be aware of any code changes. Because of the multiple round trips to and from the server, and because this approach works only with interpreted languages, it is not a true replacement for all the capabilities that one gets from stored procedures in other RDBMS implementations…but it’s close.

Remember: this is chapter two of MySQL: The Complete Reference, by Vikram Vaswani (McGraw-Hill/Osborne, ISBN 0-07-222477-0, 2004). Vikram is the founder of Melonfire, and has had numerous articles featured on Dev Shed.
Buy this book now.

{mospagebreak title=Symmetric Multiprocessing with MySQL}

To take advantage of multiprocessor architecture, MySQL is built using a multi-threaded design, which allows threads to be allocated between processors to achieve a higher degree of parallelism. This is important to know not only for the database administrator, who needs to understand how MySQL takes best advantage of processing power, but also for developers, who can extend MySQL with custom functions. All custom functions must be thread safe—that is, that they must not interfere with the workings of other threads in the same process as MySQL.

Threading the Needle

To better understand threads in general, we must look at the relationship between threads and processes. A process contains both an executing program and a collection of resources, such as the file map and address space. All threads associated with a given task share the task’s resources. Thus, a thread is essentially a program counter, a stack, and a set of registers; all the other data structures belong to the task. A process effectively starts out as a task with a single thread.

A thread is a miniprocess that has its own stack and that executes a given piece of code. A thread normally shares its memory with other threads, unlike true processes, which will usually have a different memory area for each one. A thread group is a set of threads all executing inside the same process. They all share the same memory and thus can access the same global variables, the same heap memory, the same set of file descriptors, and so on. All the threads in a thread group execute either by using time slices when running on a single processor or in parallel if multiple processors are available.

The advantage of using a thread group over using a process group is that context switching between threads is much faster than context switching between processes; in other words, the system switches from running one thread to running another thread much faster than it can switch from one process to another. Also, communication between two threads is usually faster and easier to implement than communication between two processes, since the threads already share common address space in which to share variables.

The POSIX thread libraries are a standards-based thread API for C and C++. They are most effectively used on multiprocessor systems, where the process flow can be scheduled to run on another processor, thus increasing speed through parallel or distributed processing. Threads require less overhead than forking, or spawning a new process, because the system will not initialize a new virtual memory space and environment for the process. While POSIX is most beneficial on a multiprocessor system, gains are also found on single processor systems, which exploit latency in input/output (I/O) and other system functions that can halt process execution.

MySQL makes use of various thread packages, depending on the platform. POSIX threads are used on most UNIX variants, such as FreeBSD and Solaris. LinuxThreads are used for Linux distributions, while, for efficiency reasons, Windows threads are used on the Windows platform (but the code that handles them is designed to simulate POSIX threads).

Because MySQL is a threaded application, it is able to let the operating system take over the task of coordinating the allocation of threads to balance the workload across multiple processors. MySQL uses these threads to do the following:

  • A global connection thread handles all connection requests and creates a new dedicated thread to handle authentication and SQL query processing for each connection. This type of thread works on both TCP/IP and namedpipe connections.

  • Internal semaphores and alarm listening functions are handled by a separate thread.

  • When requested, a dedicated thread is created to flush all tables at the specified interval.

  • Every single table on which a delayed INSERT is being used gets its own thread.

  • In replication, master-host synchronization is handled by separate threads.

Of course, another way to take advantage of multiprocessing is to run multiple instances of MySQL on the same machine, thereby spawning a separate process for each instance. This approach is especially practical for hosting companies and even for internal hosting within corporate environments. By running multiple instances of MySQL on the same computer, you can easily accommodate multiple user bases that need different configuration options.

Remember: this is chapter two of MySQL: The Complete Reference, by Vikram Vaswani (McGraw-Hill/Osborne, ISBN 0-07-222477-0, 2004). Vikram is the founder of Melonfire, and has had numerous articles featured on Dev Shed. 
Buy this book now.

{mospagebreak title=Security}

The process of accessing a MySQL database can be broken down into two tasks: connecting to the MySQL server itself, and accessing individual objects, such as tables or columns, in a database. MySQL has built-in security to verify user credentials at both stages.

MySQL manages user authentication through user tables, which check not only that a user has logged on correctly with the proper username and password, but also that the connection is originating from an authorized TCP/IP address.

Once a user is connected, a system administrator can to bestow user-specific privileges on objects and on the actions that can be taken in MySQL. For example, you might allow fred@thiscompany.com to perform only SELECT queries against an inventory table, while allowing anna@thatcompany.net to run INSERT, UPDATE, and DELETE statements against the same table.

Security is also an issue for connections. Passwords and other important data transmitted across a network are vulnerable to interception by any one of the many network packet analyzers available on the market. By default, older versions of MySQL used a rather simple encryption method for usernames and passwords, which might have discouraged the casual hacker but would not have stopped someone who was willing to spend a little time cracking the encryption algorithms. As of version 4.1, a more secure protocol has been implemented that makes MySQL much more difficult to break into, even if a hacker has the ability to sniff and decode network traffic.

The actual data that travels over a network, such as query results, isn’t encrypted and is therefore open to viewing by a hacker. To secure your data, you can use one of the SSH (Secure Shell) protocols; you’ll need to install it on both the client applications and the operating system you’re using. If you’re using MySQL 4.0 or later, you can also use the SSL (Secure Socket Layer) encryption protocol, which can be configured to work from within MySQL, making it safe for use over the Internet or other public network infrastructures.

Remember: this is chapter two of MySQL: The Complete Reference, by Vikram Vaswani (McGraw-Hill/Osborne, ISBN 0-07-222477-0, 2004). Vikram is the founder of Melonfire, and has had numerous articles featured on Dev Shed. 
Buy this book now.

{mospagebreak title=Replication}

Replication is a data distribution mechanism that allows you to place copies of tables and databases in remote locations so that users can more easily access them.

Product databases are typical of such replication mechanisms—for example, a national or global company might have a common product database that is updated centrally but that is used locally by each office. Rather than forcing applications to query this table remotely every time it’s needed, it is more cost effective to distribute a copy to everyone, thus incurring the transmission overhead only once for each office.

Prior to release 3.23.15, MySQL had no replication capabilities; subsequent to this release, and continuing to the present day, MySQL supports one-way replication, with one database as the master and one or more databases as the slaves.

The replication mechanism relies on a MySQL log that tracks all changes made to a database. The master ships this log to the slave, which then applies the log to its own data. Nothing about a slave prevents users from updating it outside the context of replication, so great care must be taken to ensure that this doesn’t happen, since the relative assurance of synchronization will otherwise be lost and subsequent log playbacks might even fail.

Remember: this is chapter two of MySQL: The Complete Reference, by Vikram Vaswani (McGraw-Hill/Osborne, ISBN 0-07-222477-0, 2004). Vikram is the founder of Melonfire, and has had numerous articles featured on Dev Shed.
Buy this book now.

{mospagebreak title=Application Programming Interfaces and Add-On Tools}

For application developers, MySQL provides a client library that is written in the C programming language and a set of APIs that provide an understandable set of rules by which host languages can connect to MySQL and send commands. Using an API protects client programs from any underlying changes in MySQL that could affect connectivity.

Many APIs are available, one or more for almost every programming language. Most of them use the C client library, with the exception of the Java driver, which is written in 100 percent native Java. Currently, MySQL APIs are available for Perl, PHP, C and C++, Java, Visual Basic, Python, Ruby, and .NET.

Add-On Tools
The list of software programs that work in tandem with MySQL is growing. Here’s a brief list of the better ones:

  • MySQL CC, available from http://www.mysql.com/products/mysqlcc/index.html, is an excellent front-end query and database management tool for MySQL. Currently Windows, UNIX, and Linux versions are available, and a version of Mac OS X may be available by the time you read this.

  • DBTools Manager Professional, available from http://www.dbtools.com.br, is a graphical client used to manage MySQL databases, tables, and indices; import data from other RDBMS; and provide a point-and-click interface to query and report design.

  • For web-based MySQL management, you should try phpMyAdmin, available from http://www.phpmyadmin.org.

To help you make the best decisions concerning the installation of the MySQL database sever, database and table design, and eventual optimization efforts, it is of paramount importance that you have a clear understanding of the underlying architecture governing the behavior of tables, rows, indexes, and databases. This knowledge is the basis from which database administrators make their tuning decisions, and from which application designers optimize their code.

In this chapter, we’ve sought to give you a basic understanding of all the elements of MySQL, both from the standpoint of the database administrator/developer and for those who may eventually create their own libraries or even add to MySQL’s source code. The rest of this book will address these concepts in far greater detail.

Remember: this is chapter two of MySQL: The Complete Reference, by Vikram Vaswani (McGraw-Hill/Osborne, ISBN 0-07-222477-0, 2004). Vikram is the founder of Melonfire, and has had numerous articles featured on Dev Shed.
Buy this book now.

Google+ Comments

Google+ Comments