Counting Queries: PostgreSQL Analysis - Normalize the queries
But let's not stop there - let's see what happens if we "normalize" the SQL queries. By that I mean we blank out all the literal values, so that select language_id FROM supported_languages WHERE classname='English' gets turned into select language_id FROM supported_languages WHERE classname=''. Queries that differ only in their values then count as the same query, which gives us an even better picture of which queries it might be helpful to rewrite as stored procedures. Let's see:
[tom@hal pqa]$ ./pqa.rb -top 8 -normalize -file overnight_query_data.txt
10077 queries (260 unique) parsed in 1.977645 seconds
8 most frequent queries
931 times: BEGIN;ROLLBACK;
780 times: SELECT * FROM plugins
780 times: INSERT INTO activity_log (day,hour,group_id,browser,ver,platform,time,page,type) VALUES (20040225,'','','','','','','','');
688 times: SELECT language_id FROM supported_languages WHERE classname=''
688 times: SELECT language_code FROM supported_languages WHERE language_id=''
644 times: select * from supported_languages where language_code = ''
634 times: SELECT total FROM forum_group_list_vw WHERE group_forum_id=''
[tom@hal pqa]$
Note that the number of unique queries has been reduced from 2826 to 260 - normalizing them shows that there really aren't that many distinct queries. Now we can also see that combining those two supported_languages queries will save us even more than we thought - at 688 of 10077 queries, we'll be able to eliminate almost 7% of our queries. The activity_log query is obviously a frequently used one too, but its values differ on every call, which is why it only showed up once we normalized. It might be a candidate for a stored procedure.
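If you're curious what that normalizing step might look like, here is a minimal Ruby sketch. It is not PQA's actual code: the method names normalize and top_queries are made up for illustration, and it assumes that only single-quoted string literals get blanked, which is what the output above suggests (the unquoted 20040225 survives). It also leaves letter case alone, since the report lists "SELECT" and "select" queries separately.

```ruby
# Hypothetical helper, not taken from PQA: replace every single-quoted
# string literal (including ones with doubled '' escapes) with ''.
def normalize(sql)
  sql.gsub(/'(?:[^']|'')*'/, "''").strip
end

# Hypothetical helper: count how often each normalized query occurs and
# return the `limit` most frequent ones as [sql, count] pairs.
def top_queries(queries, limit)
  counts = Hash.new(0)
  queries.each { |q| counts[normalize(q)] += 1 }
  counts.sort_by { |_sql, count| -count }.first(limit)
end

# Made-up sample input standing in for a parsed query log.
sample = [
  "SELECT language_id FROM supported_languages WHERE classname='English'",
  "SELECT language_id FROM supported_languages WHERE classname='French'",
  "SELECT * FROM plugins"
]

top_queries(sample, 2).each do |sql, count|
  puts "#{count} times: #{sql}"
end
```

The two supported_languages lookups differ only in the classname value, so after normalizing they are tallied as a single query with a count of 2. That is the same effect that shrank the unique query count in the report above.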
We've discussed a couple of different types of optimizations and when they are useful. We've poked around the PostgreSQL database configuration file and learned how to log SQL statements. We've seen how we can use the open source PostgreSQL Query Analyzer (http://rubyforge.org/projects/pqa/) utility to analyze the data and report useful results, and we've done a bit of reasoning about the results to see which queries might actually be worth rewriting. Also, development on PQA will continue - drop by the project site if you have any suggestions, bug reports, or comments. Thanks!
Many thanks to the folks on the #ruby-lang Internet Relay Chat (IRC) channel for helping me figure out some regular expressions that PQA uses. Thanks also to the folks on the #pgsql channel for a philosophical discussion on general optimization techniques. Finally, thanks to my co-workers Joe Coffman, Dave Craine, and Rich Kilmer, who helped me work through various technical issues in the development of PQA.