There are a number of ways to get your database-backed Web application to run faster on the same hardware--allowing you to postpone upgrades for at least a little while, thus cutting costs. One way involves examining how your applications are interacting with the database. Tom Copeland explains how to "tune the queries" for a PostgreSQL database.
Now you've got a pile of data in your syslog file. I usually run the SQL analysis on another machine, so I'll tar up the syslog file and copy it to my workstation:
Of course, you can run the analysis on the database server as well; just do whatever's appropriate for your environment.
The next step is to analyze the data and see what we find. We could just scroll through the raw data, but that would take forever. So I wrote a little Ruby (http://ruby-lang.org/) library called PostgreSQL Query Analyzer (PQA) (http://rubyforge.org/projects/pqa/) to parse the data file and analyze the queries. If you're not familiar with Ruby, it's an object-oriented scripting language that's great for system administration and data manipulation tasks. To run PQA, you'll need to have Ruby installed.
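To give a feel for what this kind of analysis involves, here is a minimal sketch of counting how often each query appears. It is not PQA's actual code: the method names `normalize` and `top_queries` are made up for illustration, and it assumes each input line already holds exactly one SQL statement, with any syslog prefix stripped.

```ruby
# Minimal sketch of frequency analysis over logged SQL; not PQA's actual code.
# Assumption: each input line already holds exactly one SQL statement.

# Collapse runs of whitespace so trivially different spellings count together.
def normalize(sql)
  sql.strip.gsub(/\s+/, ' ')
end

# Return the n most frequent queries as [query, count] pairs, most frequent first.
def top_queries(lines, n)
  counts = Hash.new(0)
  lines.each do |line|
    query = normalize(line)
    counts[query] += 1 unless query.empty?
  end
  # Sort by descending count, breaking ties alphabetically for stable output.
  counts.sort_by { |query, count| [-count, query] }.first(n)
end

sample = [
  "SELECT * FROM plugins",
  "SELECT  *  FROM plugins ",
  "BEGIN;ROLLBACK;",
  "SELECT * FROM plugins",
  "BEGIN;ROLLBACK;",
  "select * from supported_languages where language_code = 'en'"
]

top_queries(sample, 2).each do |query, count|
  puts "#{count} times: #{query}"
end
```

A real log needs more care than this (multi-line statements, for instance), which is exactly the sort of detail a dedicated tool takes off your hands.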
On to the results. Here's a sample run showing the top 8 queries:
[tom@hal pqa]$ ./pqa.rb -top 8 -file overnight_query_data.txt
10077 queries (2826 unique) parsed in 1.984141 seconds
8 most frequent queries
931 times: BEGIN;ROLLBACK;
780 times: SELECT * FROM plugins
574 times: SELECT language_code FROM supported_languages WHERE language_id='1'
574 times: SELECT language_id FROM supported_languages WHERE classname='English'
295 times: select * from supported_languages where language_code = 'en'
162 times: select * from supported_languages where language_code = 'en-us'
85 times: SELECT total FROM forum_group_list_vw WHERE group_forum_id='721'
[tom@hal pqa]$
Already we see something very interesting: two rather similar queries against the supported_languages table each run exactly 574 times. This is an indication that there might be a sequence of two queries that could be combined into one, which is a big win. We'd eliminate 574 of the 10,077 queries, more than five percent of the total, in one fell swoop. Looks like our analysis efforts have been repaid already. And if we were to look at the top 30 or 40 queries, we could probably find some more instances of this sort of thing.
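Spotting such pairs by eye gets harder as the list grows, so here is a rough sketch of automating the check. It assumes the counts are already available as query/count pairs; `table_of` and `combine_candidates` are hypothetical helper names, not part of PQA, and the table name is extracted with a simple regular expression rather than a real SQL parser.

```ruby
# Sketch: flag queries that run equally often against the same table.
# Assumption: counts are already available as [query, count] pairs.

# Pull the first table name after FROM, lowercased; nil when there is none.
def table_of(query)
  match = query.match(/\bfrom\s+([a-z_][a-z0-9_]*)/i)
  match && match[1].downcase
end

# Group by [table, count] and keep only groups holding more than one query.
def combine_candidates(counted)
  with_table = counted.select { |query, _| table_of(query) }
  groups = with_table.group_by { |query, count| [table_of(query), count] }
  groups.select { |_, members| members.size > 1 }
        .map { |(table, count), members| [table, count, members.map(&:first)] }
end

counted = [
  ["SELECT * FROM plugins", 780],
  ["SELECT language_code FROM supported_languages WHERE language_id='1'", 574],
  ["SELECT language_id FROM supported_languages WHERE classname='English'", 574],
  ["select * from supported_languages where language_code = 'en'", 295]
]
total = 10_077 # total queries parsed in the sample run

combine_candidates(counted).each do |table, count, queries|
  share = (100.0 * count / total).round(1)
  puts "#{table}: #{queries.size} queries run #{count} times each (#{share}% of all queries apiece)"
  queries.each { |q| puts "  #{q}" }
end
```

Matching counts are only a hint. Whether the pair can really be merged, perhaps into a single lookup that returns both language_id and language_code, depends on how the application code issues the two queries.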
We can also see that we're selecting the entire contents of the plugins table. Is that moving a lot of data around? Is there a way we can add some constraints to the query to limit the results? In this specific case it isn't a problem, since the plugins table is very small. Still, this is a good thing to be aware of: if we start adding plugins, this query may be one to watch.
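As a rough illustration of keeping an eye on queries like this, the sketch below flags SELECT * statements that have no WHERE or LIMIT clause. The method name `unconstrained_select_star?` is hypothetical, and the check is a regular-expression heuristic that assumes simple single-statement queries; it is not a SQL parser and not a PQA feature.

```ruby
# Sketch: flag SELECT * statements that have no WHERE or LIMIT clause.
# Assumption: simple single-statement queries; this is a regex heuristic.

def unconstrained_select_star?(query)
  sql = query.strip
  # Only consider statements of the form "SELECT * FROM ...".
  return false unless sql.match?(/\Aselect\s+\*\s+from\b/i)
  # Flag the statement when nothing narrows the result set.
  !sql.match?(/\b(where|limit)\b/i)
end

queries = [
  "SELECT * FROM plugins",
  "select * from supported_languages where language_code = 'en'",
  "SELECT total FROM forum_group_list_vw WHERE group_forum_id='721'"
]

queries.select { |q| unconstrained_select_star?(q) }.each do |q|
  puts "watch: #{q}"
end
```

A flagged query is not necessarily a problem, as the small plugins table shows; it's simply a candidate to re-check as the table grows.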