There are a number of ways to get your database-backed Web application to run faster on the same hardware--allowing you to postpone upgrades for at least a little while, thus cutting costs. One way involves examining how your applications are interacting with the database. Tom Copeland explains how to "tune the queries" for a PostgreSQL database.
But let's not stop there. Let's see what happens if we "normalize" the SQL queries. By that I mean we blank out the literal values, so that select language_id FROM supported_languages WHERE classname='English' gets turned into select language_id FROM supported_languages WHERE classname=''. Queries that differ only in their values then count as the same query, which gives us an even better picture of which queries it might be helpful to rewrite as stored procedures. Let's see:
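The normalization step described above can be sketched in a few lines. This is only an illustration in Python, not PQA's actual code (PQA is written in Ruby). The names normalize and top_queries and the sample queries are made up for the example, and it assumes that only single-quoted string literals are blanked while numeric literals are left alone.

```python
import re
from collections import Counter

# Matches a single-quoted SQL string literal, including the doubled-quote
# escape used inside literals (for example 'O''Brien').
_STRING_LITERAL = re.compile(r"'(?:[^']|'')*'")


def normalize(query):
    """Replace every quoted string literal with an empty literal ('')."""
    return _STRING_LITERAL.sub("''", query.strip())


def top_queries(queries, n=8):
    """Count queries after normalization and return the n most frequent."""
    counts = Counter(normalize(q) for q in queries)
    return counts.most_common(n)


# Hypothetical sample input: two queries that differ only in a literal.
sample = [
    "SELECT language_id FROM supported_languages WHERE classname='English'",
    "SELECT language_id FROM supported_languages WHERE classname='French'",
    "SELECT * FROM plugins",
]

for query, count in top_queries(sample):
    print(f"{count} times: {query}")
```

Because the sketch does not fold upper and lower case, two queries that differ only in keyword case would still be counted separately.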
[tom@hal pqa]$ ./pqa.rb -top 8 -normalize -file overnight_query_data.txt
10077 queries (260 unique) parsed in 1.977645 seconds
8 most frequent queries
931 times: BEGIN;ROLLBACK;
780 times: SELECT * FROM plugins
780 times: INSERT INTO activity_log (day,hour,group_id,browser,ver,platform,time,page,type) VALUES (20040225,'','','','','','','','');
688 times: SELECT language_id FROM supported_languages WHERE classname=''
688 times: SELECT language_code FROM supported_languages WHERE language_id=''
644 times: select * from supported_languages where language_code = ''
634 times: SELECT total FROM forum_group_list_vw WHERE group_forum_id=''
[tom@hal pqa]$
Note that the number of unique queries has been reduced from 2826 to 260. Normalizing them shows that there really aren't that many unique queries. Now we can also see that combining those two supported_languages queries will save us even more than we thought: each one runs 688 times out of 10077 queries, so folding them into a single query eliminates almost 7% of our queries. Obviously that activity_log query is a frequently used one, but with differing values each time, which is why it only showed up when we normalized. It might be a candidate for a stored procedure.
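As a quick check on those figures, here is the arithmetic in Python, using the numbers from the PQA output above. It assumes that combining the two supported_languages queries removes exactly one of the two 688-count statements. The variable names are just for illustration.

```python
total_queries = 10077      # total queries parsed, from the PQA report
eliminated = 688           # one of the two supported_languages queries goes away
unique_before = 2826       # unique queries before normalizing
unique_after = 260         # unique queries after normalizing

# Share of all queries saved by combining the two queries into one.
share_saved = eliminated / total_queries * 100

# How much normalizing shrank the list of unique queries.
unique_reduction = (unique_before - unique_after) / unique_before * 100

print(f"queries eliminated: {share_saved:.1f}%")
print(f"unique queries reduced by: {unique_reduction:.1f}%")
```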
We've discussed a couple of different types of optimizations and when they are useful. We've poked around the PostgreSQL database configuration file and learned how to log SQL statements. We've seen how we can use the open source PostgreSQL Query Analyzer (http://rubyforge.org/projects/pqa/) utility to analyze the data and report useful results, and we've done a bit of reasoning about the results to see which queries might actually be worth rewriting. Development on PQA will continue, so drop by the project site if you have any suggestions, bug reports, or comments. Thanks!
Many thanks to the folks on the #ruby-lang Internet Relay Chat (IRC) channel for helping me figure out some regular expressions that PQA uses. Thanks also to the folks on the #pgsql channel for a philosophical discussion on general optimization techniques. Finally, thanks to my co-workers Joe Coffman, Dave Craine, and Rich Kilmer, who helped me work through various technical issues in the development of PQA.