Take Some Load off MySQL with MemCached

While the execution speed of your codebase can be a factor in the overall scalability of your application, more often than not, your database will become a bottleneck first. Modern web development environments can generally serve many page loads per second, and each of these pages will often make many requests to the database for fresh information. These pages may also be rendered by an easily expandable pool of web servers. While databases, including MySQL, are adequately designed to handle a significant number of queries, eventually, the load from all these requests can become too much to handle.

The Overworked Database

A web page may generate dozens of queries on each page load – user credentials, session information, system notifications, new messages, latest headlines, configuration information, and content areas may be loaded on every single page and displayed to the user or acted upon by the page logic. As this data is read by the rendering process, other web requests or background processes may be modifying the data in these tables by inserting new headlines, publishing new content areas, and creating sessions for other users. Often, this contention between database reads and writes is where the bottlenecks occur. In order to maintain database integrity, protections are generally in place to prevent the same data from being read as it is written.

In a typical MySQL installation, this problem can become quite apparent. MyISAM, the default table format before MySQL 5.5, locks an entire database table while data is being updated in any row. (InnoDB, the default since MySQL 5.5, uses row-level locking and is much less prone to this problem.) Table locking can create a cascading effect when one process writes to an important table. Suppose the sessions table is updated whenever a user logs in or logs out, and every page load reads from the same table to determine the authentication status of a given user (based on their cookies). Once you reach a tipping point, the minor pauses caused by the writes to the table can pile up like a traffic jam, slowing the entire site down simply because of writes to a single database table.

In a situation where you notice intermittent slowdowns of your site, you can attempt to troubleshoot the cause from the MySQL command line:

mysql> show processlist;
+----+------+-----------+------+---------+------+--------+----------------+
| Id | User | Host      | db   | Command | Time | State  | Info           |
+----+------+-----------+------+---------+------+--------+----------------+
|  1 | root | localhost | test | Query   |    3 | Locked | select * from  |
|  2 | root | localhost | test | Query   |    6 | Locked | insert into ON |

As you can see, the State field tells you when queries are waiting on a locked table, and the Time field shows how many seconds each query has been running.
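If you collect the processlist rows in your own code, a small filter makes long lock waits easy to spot. The following is only a minimal sketch: findLockedQueries() is a hypothetical helper, the rows are assumed to arrive as associative arrays keyed by column name, and the sample data simply mirrors the output above. The list of state strings is a parameter because your server may report lock waits with a different State value.

```php
<?php
// Hypothetical helper: pick out queries that are waiting on a lock.
// $rows is assumed to hold SHOW PROCESSLIST rows as associative arrays.
function findLockedQueries(array $rows, int $minSeconds = 0, array $lockStates = ['Locked']): array
{
    $locked = [];
    foreach ($rows as $row) {
        $isLocked = in_array($row['State'] ?? '', $lockStates, true);
        if ($isLocked && (int) ($row['Time'] ?? 0) >= $minSeconds) {
            $locked[] = $row;
        }
    }
    return $locked;
}

// Sample rows mirroring the processlist output above.
$rows = [
    ['Id' => 1, 'Time' => 3, 'State' => 'Locked', 'Info' => 'select * from'],
    ['Id' => 2, 'Time' => 6, 'State' => 'Locked', 'Info' => 'insert into ON'],
];

// Report only the queries that have waited at least five seconds.
foreach (findLockedQueries($rows, 5) as $row) {
    echo "Query {$row['Id']} has waited {$row['Time']}s: {$row['Info']}\n";
}
```

Raising the threshold keeps brief, harmless waits out of the report, so only the queries that are really piling up remain.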

A Solution – Caching

The bottleneck in many modern web sites is the database. This may seem like a daunting problem because database hardware can be expensive and rearchitecting the application with a different database structure can require a large time commitment. Luckily, most of the data presented on the internet is ideal for caching.

Let us consider our example set of queries from before: user credentials, session information, system notifications, new messages, latest headlines, configuration information, and content areas. Of those, latest headlines, configuration information, system notifications, and content areas are not likely to change very often or vary per user. Digging down further, we’ll consider news headlines specifically and build up a solution to dramatically reduce the database queries to this table.

For a typical website, news headlines will be generated with a query like this:

SELECT *
FROM news
WHERE category_id=21 AND status=1
ORDER BY publish_date DESC
LIMIT 5;

Each time this query is executed, MySQL will (assuming the query cache is not enabled, which is a separate topic) first parse the query, decide how to locate the data, read the table index, filter rows by status and category_id, order the remaining rows, and return the first five. With proper indexes, this query may take only a fraction of a second, but enough such queries will eventually bog down the database. And 99% of the time, that query will return the same data it returned the last time.

So here, we’ve found a query that runs on nearly every page load, is the same for every visitor to the site, and returns a relatively small result. Five news headlines, perhaps with the links to the full articles, the URL for a photo, and a published date, can likely fit in under one kilobyte of memory.

Now that we’ve identified a candidate for caching, we simply need a place to cache it. PHP and the Linux/Apache environment provide numerous ways to do this, but we’ll focus on one solution here, MemCached.

Enter MemCached

MemCached is designed as a solution for low-complexity caching. Data is stored as simple key-value pairs, with two primary operations, “get” and “set.” All data is stored in RAM allocated to the MemCached server. This allows for very fast access times, generally beating those of a database handily.

Once you’ve installed MemCached on your system (from source, RPM, deb, or tarball), you’ll want to fire up an instance and allocate it some memory:

sudo /usr/bin/memcached -m 8 -p 11211 -u nobody -l 127.0.0.1

MemCached is now running, listening on port 11211 of your localhost (127.0.0.1) interface, and using up to 8 megabytes of memory. It’s running as user nobody to limit the damage if the process is ever compromised. (At some point, you’ll want to add this command to your start-up scripts, so that MemCached starts up with your server.)

While you’ll generally access MemCached with a higher-level client API, you can get a feel for the system and the protocol used with a telnet client.

> telnet 127.0.0.1 11211
set mykey 0 0 13
Hello, World!
STORED
get mykey
VALUE mykey 0 13
Hello, World!
END
quit

The “set” line and the “Hello, World!” line that follows it together make up a basic set command, which tells MemCached to store some data under a given key:

set [KEYNAME] [FLAGS] [EXPIRES] [BYTES]\r\n
[BYTES OF DATA]\r\n

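KEYNAME can be any string of up to 250 characters that contains no whitespace or control characters. FLAGS is an arbitrary integer that client libraries use to note characteristics of the stored data (for example, whether it was serialized or compressed). EXPIRES is 0 if the data should never expire, a number of seconds to retain the data (up to 30 days), or, for larger values, a Unix timestamp at which the data should expire. BYTES is the size of the data in bytes, not counting the trailing \r\n, and the data itself is sent on the next line.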

get [KEYNAME]\r\n

“get” is much simpler; it takes one parameter and returns the stored data.
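To make the framing concrete, here is a minimal sketch that builds a set command and reads the data block out of a get response. Both function names are hypothetical, no network connection is made, and the parser only handles a single-key response. It is meant to illustrate the text protocol described above, not to replace a real client library.

```php
<?php
// Build the text-protocol "set" command for a key and value.
function buildSetCommand(string $key, string $data, int $flags = 0, int $expires = 0): string
{
    // strlen() counts bytes in PHP, which is what the BYTES field needs.
    return sprintf("set %s %d %d %d\r\n%s\r\n", $key, $flags, $expires, strlen($data), $data);
}

// Extract the data block from a single-key "get" response, or null on a miss.
function parseGetResponse(string $response): ?string
{
    if (!preg_match('/^VALUE (\S+) (\d+) (\d+)\r\n/', $response, $m)) {
        return null; // a miss contains no VALUE line, only "END"
    }
    // $m[3] is the BYTES field; read exactly that many bytes after the header.
    return substr($response, strlen($m[0]), (int) $m[3]);
}

echo buildSetCommand('mykey', 'Hello, World!');
var_dump(parseGetResponse("VALUE mykey 0 13\r\nHello, World!\r\nEND\r\n"));
```

Reading the data by its declared byte count, rather than searching for the next line break, matters because stored values may themselves contain line breaks.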

PHP MemCache PECL Extension

Obviously, you are not going to want to interact with MemCached on a regular basis by using telnet and by typing out the protocol commands. Instead, client libraries are provided for nearly every popular web development language: PHP, Perl, Ruby, Python, Java, and more. Here, we’ll focus on the PHP client.

The MemCache extension is a PECL module and may need to be installed separately from your PHP installation. In most cases, the PECL module can be installed and MemCache can be enabled without rebuilding or installing a new version of PHP. You can verify that the installation was successful by viewing the output of phpinfo(). You’ll get a section of the output with the MemCache status, number of active connections, and PECL extension version:

memcache support                 enabled
Active persistent connections    0
Revision                         $Revision: 1.80 $

Once you’ve verified the installation, you can start using your cache! Here’s an obligatory “Hello, World!” using MemCache.

$memcache = new Memcache;
$memcache->connect('localhost', 11211) or die('Memcache Connection Failed');

$memcache->set('mykey', 'Hello, World!');

$myvar = $memcache->get('mykey');

print $myvar; // Outputs: Hello, World!

Using our news headlines example from before, here’s a better example:

// Obtain an array of news headlines.
// Each headline is an array of headline, link and img.
function fetchHeadlines($category_id) {
    // Load cached headlines
    $memcache = getMemcache(); // wraps creation and connection of the MemCache object
    $key = 'articles:' . $category_id;
    $articles = $memcache->get($key);
    if ($articles !== false) {
        // Cache hasn't expired (get() returns false on a miss)
        return $articles;
    }

    // The cache entry doesn't exist (probably expired).
    // Load articles from the database.
    $dbh = getMysql(); // wraps the MySQL server connection; assumed here to return a mysqli object
    $result = mysqli_query($dbh, 'SELECT * FROM news WHERE category_id=' . (int) $category_id . ' AND status=1 ORDER BY publish_date DESC LIMIT 5');
    if ($result === false) {
        return array(); // query failed; don't cache the failure
    }

    // Build an array of articles
    $articles = array();
    while ($article = mysqli_fetch_assoc($result)) {
        $articles[] = $article;
    }

    // Store back into the cache
    $memcache->set($key, $articles, 0, 60 * 15); // no flags; store for 15 minutes

    return $articles;
}

This example illustrates a couple of things. First, the value stored with the PHP library can be nearly any PHP variable. Strings and integers are stored as they are, and other values are converted into a string with the serialize() function, so numbers, strings, arrays, and most objects are stored easily. Only resources (file descriptors, database connections, MemCache connections) cannot be stored effectively. Second, the key works like an entry in an arbitrary lookup table: an ad-hoc string built from the prefix ‘articles:’ and the category id forms a simple but effective key.
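The round trip through serialize() and the key naming can be tried without a cache server at all. This is only a sketch: cacheKey() is a hypothetical helper, and the headline data is made up for illustration.

```php
<?php
// Hypothetical helper: build a namespaced cache key such as "articles:21".
function cacheKey(string $namespace, int $id): string
{
    return $namespace . ':' . $id;
}

// Made-up sample data in the shape fetchHeadlines() returns.
$headlines = [
    ['headline' => 'Example headline', 'link' => '/news/1', 'img' => '/img/1.jpg'],
];

// Roughly what the client library does for you on set() and get().
$stored   = serialize($headlines);  // array -> string
$restored = unserialize($stored);   // string -> array

echo cacheKey('articles', 21), "\n";
echo strlen($stored), " bytes stored\n";
var_dump($restored === $headlines);
```

Prefixing every key with the kind of data it holds keeps unrelated entries, such as articles and sessions, from colliding when they share the same numeric id.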

At this point, if your site is loaded once per second, you’ve decreased the number of queries against the news table from 900 every fifteen minutes to one every fifteen minutes. That’s not bad for a handful of lines of code. And even if your site traffic increases from one hit per second to 100, you’ll still only be querying the news table about once every fifteen minutes.
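The arithmetic behind those numbers is easy to check. The sketch below uses a hypothetical queriesPerWindow() function and an idealized model: it assumes exactly one request repopulates the cache after each expiry, whereas in practice several simultaneous requests may each hit the database in that moment.

```php
<?php
// Queries sent to the news table during one cache lifetime.
function queriesPerWindow(int $hitsPerSecond, int $ttlSeconds, bool $cached): int
{
    // Idealized: with the cache, only the first request after expiry reaches MySQL.
    return $cached ? 1 : $hitsPerSecond * $ttlSeconds;
}

$ttl = 60 * 15; // 15 minutes, as in fetchHeadlines()
foreach ([1, 100] as $hits) {
    printf(
        "%3d hits/s: %5d queries uncached, %d cached\n",
        $hits,
        queriesPerWindow($hits, $ttl, false),
        queriesPerWindow($hits, $ttl, true)
    );
}
```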

It is important to keep in mind that if the MemCached server crashes or the process dies, all your cache data will be lost. Always store the actual data in a database server or another persistent storage system.
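One way to respect this rule is to treat every cache miss the same way, whatever caused it, and fall back to the database. The sketch below is a generic version of the pattern used in fetchHeadlines(). The cachedFetch() helper and its callable parameters are hypothetical, and in-memory stand-ins replace MemCached and MySQL so that the example runs on its own.

```php
<?php
// Hypothetical read-through helper. $cacheGet and $cacheSet stand in for
// Memcache::get() and Memcache::set(); $loadFromDb stands in for the SQL query.
function cachedFetch(string $key, callable $cacheGet, callable $cacheSet, callable $loadFromDb)
{
    $value = $cacheGet($key);
    if ($value !== false) {
        return $value;        // cache hit
    }
    $value = $loadFromDb();   // miss: expired, evicted, or the cache restarted
    $cacheSet($key, $value);  // repopulate for later requests
    return $value;
}

// In-memory stand-ins so the sketch runs without a MemCached server.
$store = [];
$get = function (string $k) use (&$store) { return $store[$k] ?? false; };
$set = function (string $k, $v) use (&$store) { $store[$k] = $v; };
$dbCalls = 0;
$load = function () use (&$dbCalls) { $dbCalls++; return ['headline one', 'headline two']; };

cachedFetch('articles:21', $get, $set, $load); // miss: loads from the "database"
cachedFetch('articles:21', $get, $set, $load); // hit
$store = [];                                   // simulate a MemCached crash
cachedFetch('articles:21', $get, $set, $load); // miss again, data still available
echo "Database was queried $dbCalls times\n";
```

Because the database remains the source of truth, losing the cache costs only a temporary increase in queries, not any data.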

Conclusion

While the example we created is fairly trivial, it’s easy to see how this technique can be expanded throughout your site. Other queries and other data can be cached with similar benefits. Obviously, MemCached isn’t the only solution to web scaling and data caching, but it can be a pretty useful one. Often, as seen in our example, a MemCached solution can be implemented and put in place with minimal code changes, minimal time, and, hopefully, minimal downtime for your users!
