Administering MySQL: International Usage and Log Files

If you need to administer MySQL, this article gets you off to a good start. In this section, we discuss localization and international usage, as well as the MySQL log files. The sixth of a multi-part series, it is excerpted from chapter four of the book MySQL Administrator’s Guide, written by Paul Dubois (Sams; ISBN: 0672326345).

4.7 MySQL Localization and International Usage

4.7.1 The Character Set Used for Data and Sorting

By default, MySQL uses the ISO-8859-1 (Latin1) character set with sorting according to Swedish/Finnish rules. These defaults are suitable for the United States and most of western Europe.

All MySQL binary distributions are compiled with --with-extra-charsets=complex. This adds code to all standard programs that enables them to handle latin1 and all multi-byte character sets within the binary. Other character sets will be loaded from a character-set definition file when needed.

The character set determines what characters are allowed in names. It also determines how strings are sorted by the ORDER BY and GROUP BY clauses of the SELECT statement.

You can change the character set with the --default-character-set option when you start the server. The character sets available depend on the --with-charset=charset and --with-extra-charsets= list-of-charsets | complex | all | none options to configure, and the character set configuration files listed in SHAREDIR/charsets/Index. See Section 2.3.2, “Typical configure Options.”

As of MySQL 4.1.1, you can also change the character set collation with the --default-collation option when you start the server. The collation must be a legal collation for the default character set. (Use the SHOW COLLATION statement to determine which collations are available for each character set.) See Section 2.3.2, “Typical configure Options.”

If you change the character set when running MySQL, that may also change the sort order. Consequently, you must run myisamchk -r -q --set-character-set=charset on all tables, or your indexes may not be ordered correctly.

When a client connects to a MySQL server, the server indicates to the client what the server’s default character set is. The client will switch to use this character set for this connection.

You should use mysql_real_escape_string() when escaping strings for an SQL query. mysql_real_escape_string() is identical to the old mysql_escape_string() function, except that it takes the MYSQL connection handle as the first parameter so that the appropriate character set can be taken into account when escaping characters.

If the client is compiled with different paths than where the server is installed and the user who configured MySQL didn’t include all character sets in the MySQL binary, you must tell the client where it can find the additional character sets it will need if the server runs with a different character set than the client.

You can do this by specifying a --character-sets-dir option to indicate the path to the directory in which the dynamic MySQL character sets are stored. For example, you can put the following in an option file:

[client]
character-sets-dir=/usr/local/mysql/share/mysql/
charsets

You can force the client to use a specific character set as follows:

[client]
default-character-set=charset

This is normally unnecessary, however.

4.7.1.1 Using the German Character Set

To get German sorting order, you should start mysqld with a --default-character-set=latin1_de option. This affects server behavior in several ways:

  • When sorting and comparing strings, the following mapping is performed on the strings before doing the comparison:

    ä -> ae
    ö -> oe
    ü -> ue
    ß -> ss
  • All accented characters are converted to their unaccented uppercase counterpart. All letters are converted to uppercase.

  • When comparing strings with LIKE, the one-character to two-character mapping is not done. All letters are converted to uppercase. Accents are removed from all letters except Ü, ü, Ö, ö, Ä, and ä.

4.7.2 Setting the Error Message Language

By default, mysqld produces error messages in English, but they can also be displayed in any of these other languages: Czech, Danish, Dutch, Estonian, French, German, Greek, Hungarian, Italian, Japanese, Korean, Norwegian, Norwegian-ny, Polish, Portuguese, Romanian, Russian, Slovak, Spanish, or Swedish.

To start mysqld with a particular language for error messages, use the --language or -L option. The option value can be a language name or the full path to the error message file. For example:

shell> mysqld --language=swedish

Or:

shell> mysqld --language=/usr/local/share/swedish

The language name should be specified in lowercase.

The language files are located (by default) in the share/LANGUAGE directory under the MySQL base directory.

To change the error message file, you should edit the errmsg.txt file, and then execute the following command to generate the errmsg.sys file:

shell> comp_err errmsg.txt errmsg.sys

If you upgrade to a newer version of MySQL, remember to repeat your changes with the new errmsg.txt file.

4.7.3 Adding a New Character Set

This section discusses the procedure for adding another character set to MySQL. You must have a MySQL source distribution to use these instructions.

To choose the proper procedure, decide whether the character set is simple or complex:

  • If the character set does not need to use special string collating routines for sorting and does not need multi-byte character support, it is simple.

  • If it needs either of those features, it is complex.

For example, latin1 and danish are simple character sets, whereas big5 and czech are complex character sets.

In the following procedures, the name of your character set is represented by MYSET.

For a simple character set, do the following:

  1. Add MYSET to the end of the sql/share/charsets/Index file. Assign a unique number to it.

  2. Create the file sql/share/charsets/MYSET.conf. (You can use a copy of sql/share/charsets/latin1.conf as the basis for this file.)

    The syntax for the file is very simple:

    • Comments start with a ‘#‘ character and proceed to the end of the line.

    • Words are separated by arbitrary amounts of whitespace.

    • When defining the character set, every word must be a number in hexadecimal format.

    • The ctype array takes up the first 257 words. The to_lower[], to_upper[] and sort_order[] arrays take up 256 words each after that.

  3. See Section 4.7.4, “The Character Definition Arrays.”

  4. Add the character set name to the CHARSETS_AVAILABLE and COMPILED_CHARSETS lists in configure.in.

  5. Reconfigure, recompile, and test.

For a complex character set, do the following:

  1. Create the file strings/ctype-MYSET.c in the MySQL source distribution.

  2. Add MYSET to the end of the sql/share/charsets/Index file. Assign a unique number to it.

  3. Look at one of the existing ctype-*.c files (such as strings/ctype-big5.c) to see what needs to be defined. Note that the arrays in your file must have names like ctype_MYSET, to_lower_MYSET, and so on. These correspond to the arrays for a simple character set. See Section 4.7.4, “The Character Definition Arrays.”

  4. Near the top of the file, place a special comment like this:

    /*
     * This comment is parsed by configure to
    create ctype.c, * so don't change it unless you know what
    you are doing. * * .configure. number_MYSET=MYNUMBER
    * .configure. strxfrm_multiply_MYSET=N
    * .configure. mbmaxlen_MYSET=N
    */
  5. The configure program uses this comment to include the character set into the MySQL library automatically.

    The strxfrm_multiply and mbmaxlen lines are explained in the following sections. You need include them only if you need the string collating functions or the multi-byte character set functions, respectively.

  6. You should then create some of the following functions:

    • my_strncoll_MYSET()

    • my_strcoll_MYSET()

    • my_strxfrm_MYSET()

    • my_like_range_MYSET()

  7. See Section 4.7.5, “String Collating Support.”

  8. Add the character set name to the CHARSETS_AVAILABLE and COMPILED_CHARSETS lists in configure.in.

  9. Reconfigure, recompile, and test.

The sql/share/charsets/README file includes additional instructions.

If you want to have the character set included in the MySQL distribution, mail a patch to the MySQL internals mailing list. See Section 1.7.1.1, “The MySQL Mailing Lists.”

{mospagebreak title=4.7.4 The Character Definition Arrays}

to_lower[] and to_upper[] are simple arrays that hold the lowercase and uppercase characters corresponding to each member of the character set. For example:

to_lower['A'] should contain 'a'
to_upper['a'] should contain 'A'

sort_order[] is a map indicating how characters should be ordered for comparison and sorting purposes. Quite often (but not for all character sets) this is the same as to_upper[], which means that sorting will be case-insensitive. MySQL will sort characters based on the values of sort_order[] elements. For more complicated sorting rules, see the discussion of string collating in Section 4.7.5, “String Collating Support.”

ctype[] is an array of bit values, with one element for one character. (Note that to_lower[], to_upper[], and sort_order[] are indexed by character value, but ctype[] is indexed by character value + 1. This is an old legacy convention to be able to handle EOF.)

You can find the following bitmask definitions in m_ctype.h:

#define _U   01   /* Uppercase */
#define _L   02   /* Lowercase */
#define _N   04   /* Numeral (digit) */
#define _S   010   /* Spacing character */
#define _P   020   /* Punctuation */
#define _C   040   /* Control character */
#define _B   0100  /* Blank */
#define _X   0200  /* heXadecimal digit */

The ctype[] entry for each character should be the union of the applicable bitmask values that describe the character. For example, 'A' is an uppercase character (_U) as well as a hexadecimal digit (_X), so ctype['A'+1] should contain the value:

_U + _X = 01 + 0200 = 0201

4.7.5 String Collating Support

If the sorting rules for your language are too complex to be handled with the simple sort_order[] table, you need to use the string collating functions.

Right now the best documentation for this is the character sets that are already implemented. Look at the big5, czech, gbk, sjis, and tis160 character sets for examples.

You must specify the strxfrm_multiply_MYSET=N value in the special comment at the top of the file. N should be set to the maximum ratio the strings may grow during my_strxfrm_MYSET (it must be a positive integer).

4.7.6 Multi-Byte Character Support

If you want to add support for a new character set that includes multi-byte characters, you need to use the multi-byte character functions.

Right now the best documentation on this consists of the character sets that are already implemented. Look at the euc_kr, gb2312, gbk, sjis, and ujis character sets for examples. These are implemented in the ctype-'charset'.c files in the strings directory.

You must specify the mbmaxlen_MYSET=N value in the special comment at the top of the source file. N should be set to the size in bytes of the largest character in the set.

4.7.7 Problems with Character Sets

If you try to use a character set that is not compiled into your binary, you might run into the following problems:

  • Your program has an incorrect path to where the character sets are stored. (Default /usr/local/mysql/share/mysql/charsets). This can be fixed by using the --character-sets-dir option when you run the program in question.

  • The character set is a multi-byte character set that can’t be loaded dynamically. In this case, you must recompile the program with support for the character set.

  • The character set is a dynamic character set, but you don’t have a configure file for it. In this case, you should install the configure file for the character set from a new MySQL distribution.

  • If your Index file doesn’t contain the name for the character set, your program will display the following error message:

    ERROR 1105: File '/usr/local/share/mysql/
    charsets/?.conf' not found (Errcode: 2)
  • In this case, you should either get a new Index file or manually add the name of any missing character sets to the current file.

For MyISAM tables, you can check the character set name and number for a table with myisamchk -dvv tbl_name.

{mospagebreak title=4.8 The MySQL Log Files}

MySQL has several different log files that can help you find out what’s going on inside mysqld:

Log File Types of Information Logged to File
The error log Logs problems encountered starting, running, or stopping mysqld.
The isam log Logs all changes to the ISAM tables. Used only for debugging the isam code.
The query log Logs established client connections and executed statements.
The update log Logs statements that change data. This log is deprecated.
The binary log Logs all statements that change data. Also used for replication.
The slow log Logs all queries that took more than long_query_time seconds to execute or didn’t use indexes.

By default, all logs are created in the mysqld data directory. You can force mysqld to close and reopen the log files (or in some cases switch to a new log) by flushing the logs. Log flushing occurs when you issue a FLUSH LOGS statement or execute mysqladmin flush-logs or mysqladmin refresh.

If you are using MySQL replication capabilities, slave replication servers maintain additional log files called relay logs. These are discussed in Chapter 5, “Replication in MySQL.”

4.8.1 The Error Log

The error log file contains information indicating when mysqld was started and stopped and also any critical errors that occur while the server is running.

If mysqld dies unexpectedly and mysqld_safe needs to restart it, mysqld_safe will write a restarted mysqld message to the error log. If mysqld notices a table that needs to be automatically checked or repaired, it writes a message to the error log.

On some operating systems, the error log will contain a stack trace if mysqld dies. The trace can be used to determine where mysqld died.

Beginning with MySQL 4.0.10, you can specify where mysqld stores the error log file with the --log-error[=file_name] option. If no file_name value is given, mysqld uses the name host_name.err and writes the file in the data directory. (Prior to MySQL 4.0.10, the Windows error log name is mysql.err.) If you execute FLUSH LOGS, the error log is renamed with a suffix of -old and mysqld creates a new empty log file.

In older MySQL versions on Unix, error log handling was done by mysqld_safe which redirected the error file to host_name.err. You could change this filename by specifying a --err-log=filename option to mysqld_safe.

If you don’t specify --log-error, or (on Windows) if you use the --console option, errors are written to stderr, the standard error output. Usually this is your terminal.

On Windows, error output is always written to the .err file if --console is not given.

4.8.2 The General Query Log

If you want to know what happens within mysqld, you should start it with the --log[=file_name] or -l [file_name] option. If no file_name value is given, the default name is host_name.log This will log all connections and statements to the log file. This log can be very useful when you suspect an error in a client and want to know exactly what the client sent to mysqld.

Older versions of the mysql.server script (from MySQL 3.23.4 to 3.23.8) pass a --log option to safe_mysqld to enable the general query log. If you need better performance when you start using MySQL in a production environment, you can remove the --log option from mysql.server or change it to --log-bin. See Section 4.8.4, “The Binary Log.”

mysqld writes statements to the query log in the order that it receives them. This may be different from the order in which they are executed. This is in contrast to the update log and the binary log, which are written after the query is executed, but before any locks are released.

Server restarts and log flushing do not cause a new general query log file to be generated (although flushing closes and reopens it). On Unix, you can rename the file and create a new one by using the following commands:

shell> mv hostname.log hostname-old.log
shell> mysqladmin flush-logs
shell> cp hostname-old.log to-backup-directory
shell> rm hostname-old.log

On Windows, you cannot rename the log file while the server has it open. You must stop the server and rename the log. Then restart the server to create a new log.

4.8.3 The Update Log

Note: The update log has been deprecated and replaced by the binary log. See Section 4.8.4, “The Binary Log.” The binary log can do anything the old update log could do, and more. The update log is unavailable as of MySQL 5.0.0.

When started with the --log-update[=file_name] option, mysqld writes a log file containing all SQL statements that update data. If no file_name value is given, the default name is name of the host machine. If a filename is given, but it doesn’t contain a leading path, the file is written in the data directory. If file_name doesn’t have an extension, mysqld creates log files with names of the form file_name.###, where ### is a number that is incremented each time you start the server or flush the logs.

Note: For this naming scheme to work, you must not create your own files with the same names as those that might be used for the log file sequence.

Update logging is smart because it logs only statements that really update data. So, an UPDATE or a DELETE with a WHERE that finds no rows is not written to the log. It even skips UPDATE statements that set a column to the value it already has.

The update logging is done immediately after a query completes but before any locks are released or any commit is done. This ensures that statements are logged in execution order.

If you want to update a database from update log files, you could do the following (assuming that your update logs have names of the form file_name.###):

shell> ls -1 -t -r file_name.[0-9]* | xargs cat |
mysql

ls is used to sort the update log filenames into the right order.

This can be useful if you have to revert to backup files after a crash and you want to redo the updates that occurred between the time of the backup and the crash.

{mospagebreak title=4.8.4 The Binary Log}

The binary log has replaced the old update log, which is unavailable starting from MySQL 5.0. The binary log contains all information that is available in the update log in a more efficient format and in a manner that is transactionally safe.

The binary log, like the old update log, logs only statements that really update data. So an UPDATE or a DELETE with a WHERE that finds no rows is not written to the log. It even skips UPDATE statements that set a column to the value it already has.

The binary log also contains information about how long each statement took that updated the database. It doesn’t contain statements that don’t modify any data. If you want to log all statements (for example, to identify a problem query) you should use the general query log. See Section 4.8.2, “The General Query Log.”

The primary purpose of the binary log is to be able to update the database during a restore operation as fully as possible, because the binary log will contain all updates done after a backup was made.

The binary log is also used on master replication servers as a record of the statements to be sent to slave servers. See Chapter 5, “Replication in MySQL.”

Running the server with the binary log enabled makes performance about 1% slower. However, the benefits of the binary log for restore operations and in allowing you to set up replication generally outweigh this minor performance decrement.

When started with the --log-bin[=file_name] option, mysqld writes a log file containing all SQL commands that update data. If no file_name value is given, the default name is the name of the host machine followed by -bin. If the file name is given, but it doesn’t contain a path, the file is written in the data directory.

If you supply an extension in the log name (for example, --log-bin=file_name.extension), the extension is silently removed and ignored.

mysqld appends a numeric extension to the binary log name. The number is incremented each time you start the server or flush the logs. A new binary log also is created automatically when the current log’s size reaches max_binlog_size. A binary log may become larger than max_binlog_size if you are using large transactions: A transaction is written to the binary log in one piece, never split between binary logs.

To be able to know which different binary log files have been used, mysqld also creates a binary log index file that contains the name of all used binary log files. By default this has the same name as the binary log file, with the extension '.index'. You can change the name of the binary log index file with the --log-bin-index=[file_name] option. You should not manually edit this file while mysqld is running; doing so would confuse mysqld.

You can delete all binary log files with the RESET MASTER statement, or only some of them with PURGE MASTER LOGS.

You can use the following options to mysqld to affect what is logged to the binary log. See also the discussion that follows this option list.

  • --binlog-do-db=db_name

    Tells the master that it should log updates to the binary log if the current database (that is, the one selected by USE) is db_name. All other databases that are not explicitly mentioned are ignored. If you use this, you should ensure that you only do updates in the current database.

    An example of what does not work as you might expect: If the server is started with binlog-do-db=sales, and you do USE prices; UPDATE sales.january SET amount=amount+1000;, this statement will not be written into the binary log.

  • --binlog-ignore-db=db_name

    Tells the master that updates where the current database (that is, the one selected by USE) is db_name should not be stored in the binary log. If you use this, you should ensure that you only do updates in the current database.

    An example of what does not work as you might expect: If the server is started with binlog-ignore-db=sales, and you do USE prices; UPDATE sales.january SET amount=amount+1000;, this statement will be written into the binary log.

To log or ignore multiple databases, specify the appropriate option multiple times, once for each database.

The rules for logging or ignoring updates to the binary log are evaluated in the following order:

  1. Are there binlog-do-db or binlog-ignore-db rules?

    • No: Write the statement to the binary log and exit.

    • Yes: Go to the next step.

  2. There are some rules (binlog-do-db or binlog-ignore-db or both). Is there a current database (has any database been selected by USE?)?

    • No: Do not write the statement, and exit.

    • Yes: Go to the next step.

  3. There is a current database. Are there some binlog-do-db rules?

    • Yes: Does the current database match any of the binlog-do-db rules?

      • Yes: Write the statement and exit.

      • No: Do not write the statement, and exit.

    • No: Go to the next step.

  4. There are some binlog-ignore-db rules. Does the current database match any of the binlog-ignore-db rules?

    • Yes: Do not write the statement, and exit.

    • No: Write the query and exit.

For example, a slave running with only binlog-do-db=sales will not write to the binary log any statement whose current database is different from sales (in other words, binlog-do-db can sometimes mean “ignore other databases”).

If you are using replication, you should not delete old binary log files until you are sure that no slave still needs to use them. One way to do this is to do mysqladmin flush-logs once a day and then remove any logs that are more than three days old. You can remove them manually, or preferably using PURGE MASTER LOGS, which will also safely update the binary log index file for you (and which can take a date argument since MySQL 4.1).

A client with the SUPER privilege can disable binary logging of its own statements by using a SET SQL_LOG_BIN=0 statement.

You can examine the binary log file with the mysqlbinlog utility. This can be useful when you want to reprocess statements in the log. For example, you can update a MySQL server from the binary log as follows:

shell> mysqlbinlog log-file | mysql -h server_name

See Section 7.5, “The mysqlbinlog Binary Log Utility,” for more information on the mysqlbinlog utility and how to use it.

If you are using transactions, you must use the MySQL binary log for backups instead of the old update log.

The binary logging is done immediately after a query completes but before any locks are released or any commit is done. This ensures that the log will be logged in the execution order.

Updates to non-transactional tables are stored in the binary log immediately after execution. For transactional tables such as BDB or InnoDB tables, all updates (UPDATE, DELETE, or INSERT) that change tables are cached until a COMMIT statement is received by the server. At that point, mysqld writes the whole transaction to the binary log before the COMMIT is executed. When the thread that handles the transaction starts, it allocates a buffer of binlog_cache_size to buffer queries. If a statement is bigger than this, the thread opens a temporary file to store the transaction. The temporary file is deleted when the thread ends.

The max_binlog_cache_size (default 4GB) can be used to restrict the total size used to cache a multiple-statement transaction. If a transaction is larger than this, it will fail and roll back.

If you are using the update log or binary log, concurrent inserts will be converted to normal inserts when using CREATE ... SELECT or INSERT ... SELECT. This is to ensure that you can re-create an exact copy of your tables by applying the log on a backup.

The binary log format is different in versions 3.23, 4.0, and 5.0.0. Those format changes were required to implement enhancements to replication. MySQL 4.1 has the same binary log format as 4.0. See Section 5.5, “Replication Compatibility Between MySQL Versions.”

{mospagebreak title=4.8.5 The Slow Query Log}

When started with the --log-slow-queries[=file_name] option, mysqld writes a log file containing all SQL statements that took more than long_query_time seconds to execute. The time to acquire the initial table locks are not counted as execution time.

If no file_name value is given, the default is the name of the host machine with a suffix of -slow.log. If a filename is given, but doesn’t contain a path, the file is written in the data directory.

A statement is logged to the slow query log after it has been executed and after all locks have been released. Log order may be different from execution order.

The slow query log can be used to find queries that take a long time to execute and are therefore candidates for optimization. However, examining a long slow query log can become a difficult task. To make this easier, you can pipe the slow query log through the mysqldumpslow command to get a summary of the queries that appear in the log.

If you also use the --log-long-format when logging slow queries, then queries that are not using indexes are logged as well. See Section 4.2.1, “mysqld Command-Line Options.”

4.8.6 Log File Maintenance

The MySQL Server can create a number of different log files that make it easy to see what is going on. See Section 4.8, “The MySQL Log Files.” However, you must clean up these files regularly to ensure that the logs don’t take up too much disk space.

When using MySQL with logging enabled, you will want to back up and remove old log files from time to time and tell MySQL to start logging to new files. See Section 4.6.1, “Database Backups.”

On a Linux (Red Hat) installation, you can use the mysql-log-rotate script for this. If you installed MySQL from an RPM distribution, the script should have been installed automatically. You should be careful with this script if you are using the binary log for replication! (You should not remove binary logs until you are certain that their contents have been processed by all slaves.)

On other systems, you must install a short script yourself that you start from cron to handle log files.

You can force MySQL to start using new log files by using mysqladmin flush-logs or by using the SQL statement FLUSH LOGS. If you are using MySQL 3.21, you must use mysqladmin refresh.

A log flushing operation does the following:

  • If standard logging (--log) or slow query logging (--log-slow-queries) is used, closes and reopens the log file (mysql.log and ´hostname´-slow.log as default).

  • If update logging (--log-update) or binary logging (--log-bin) is used, closes the log and opens a new log file with a higher sequence number.

If you are using only an update log, you only have to rename the old log file and then flush the logs before making a backup. For example, you can do something like this:

shell> cd mysql-data-directory
shell> mv mysql.log mysql.old
shell> mysqladmin flush-logs

Then make a backup and remove mysql.old.

Please check back next week for the conclusion of this article.

[gp-comments width="770" linklove="off" ]

antalya escort bayan antalya escort bayan Antalya escort diyarbakir escort