Counting Queries: PostgreSQL Analysis - How to get the data
Now that we've narrowed our focus a bit, we need to figure out how to get some data to analyze. A list of the SQL queries that the database has executed is a good place to start, and fortunately PostgreSQL provides an easy way to get one. Note that I'm running PostgreSQL 7.3 installed from an RPM on a Red Hat Linux machine - option names and file locations may be different on another version or platform, but hopefully the principles will be the same.
To turn on query logging, edit your postgresql.conf - mine is at the path /var/lib/pgsql/data/postgresql.conf. We'll need to set the following options:
log_statement = true
syslog = 2
syslog_facility = 'LOCAL0'
syslog_ident = 'postgres'
For log_statement, uncommenting the line isn't enough; you also need to set it to "true". Setting syslog to 2 sends the log output to syslog only, rather than to standard output. There are some other logging options, like log_duration, that can supply more information, but we'll leave those for later. Now, tell PostgreSQL to reload the configuration file by issuing a "reload" command:
[root@rubyforge tom]# /etc/rc.d/init.d/postgresql reload
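If the reload doesn't seem to change anything, one quick check is whether the four options are really uncommented in the file. Here's a minimal sketch, assuming the configuration path used above; the function name active_logging_options is made up for this example:

```shell
# Hypothetical helper: print the uncommented logging options in postgresql.conf.
# The default path is the one from this article; pass another path if yours differs.
active_logging_options() {
    conf="${1:-/var/lib/pgsql/data/postgresql.conf}"
    # Lines that start with '#' never match, so commented-out options are skipped.
    grep -E '^[[:space:]]*(log_statement|syslog|syslog_facility|syslog_ident)[[:space:]]*=' "$conf"
}
```

Called with no arguments, this should print all four lines once logging is switched on; any option that doesn't show up is probably still commented out.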
Click around your application a bit and PostgreSQL will start sending data to syslog. You can confirm this is happening by checking the last few lines of the syslog file:
[root@rubyforge tom]# tail -5 /var/log/messages
Mar 4 16:40:06 rubyforge postgres[24574]: [78] LOG: query: SELECT * FROM groups WHERE group_id='1'
Mar 4 16:40:06 rubyforge postgres[24574]: [79-1] LOG: query: SELECT * FROM user_group
Mar 4 16:40:06 rubyforge postgres[24574]: [79-2] ^I^I^IWHERE user_id='102'
Mar 4 16:40:06 rubyforge postgres[24574]: [79-3] ^I^I^IAND group_id='1'
Mar 4 16:40:06 rubyforge postgres[24574]: [80] LOG: query: BEGIN;ROLLBACK;
[root@rubyforge tom]#
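On a busier machine, the last five lines of the syslog file may well come from other daemons. A small sketch that filters on the syslog_ident value first; it assumes the 'postgres' ident configured above, and the function name recent_pg_lines is only illustrative:

```shell
# Hypothetical helper: show the last N syslog lines written by PostgreSQL.
# Assumes syslog_ident = 'postgres', so each line carries "postgres[<pid>]: ".
recent_pg_lines() {
    count="${1:-5}"
    log_file="${2:-/var/log/messages}"
    # Keep only PostgreSQL's lines, then take the tail of what is left.
    grep -E ' postgres\[[0-9]+\]: ' "$log_file" | tail -n "$count"
}
```

For example, recent_pg_lines 20 would show the twenty most recent PostgreSQL entries from /var/log/messages.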
Yep, looks like we're collecting data. Now we need to leave it running for a while to collect "live" data. This is a bit of a balancing act - we don't want to let it run too long, since it's logging a lot of data and that slows things down, but at the same time we need to collect a fair bit of data to ensure we have useful results. In my case, I let the logging run for 4 hours, during which it collected about 2.5 MB of data representing around 10,000 SQL queries.
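To judge whether enough data has piled up, you can count the logged queries from time to time. This is only a rough sketch: it assumes the log line layout shown in the sample output above, and count_logged_queries is a hypothetical name:

```shell
# Hypothetical helper: count query log entries in the syslog file.
# Continuation lines such as "[79-2]" carry no "LOG: query:" text, so a
# multi-line query is counted once. An entry like "BEGIN;ROLLBACK;" also
# counts once, even though it holds two statements.
count_logged_queries() {
    log_file="${1:-/var/log/messages}"
    # grep -c prints the number of matching lines (and exits non-zero when it is 0).
    grep -E -c 'postgres\[[0-9]+\]: \[[0-9]+(-1)?\] LOG:[[:space:]]+query:' "$log_file"
}
```

Run against the five sample lines shown earlier, this counts three entries.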
When you've got enough data, you can turn off logging by editing your postgresql.conf and commenting out the changes you made earlier:
#log_statement = true
#syslog = 2
#syslog_facility = 'LOCAL0'
#syslog_ident = 'postgres'
After you've commented those lines out, run the "reload" command again - i.e., /etc/rc.d/init.d/postgresql reload. Of course, if you're going to do this a few times, it's worth making both a postgresql.conf.nologging and a postgresql.conf.logging file and copying the appropriate one on top of the active postgresql.conf.
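The copy step can be wrapped in a small function so there's less to mistype. This is a sketch under the assumption that both variant files sit next to postgresql.conf in the data directory; switch_pg_conf is a made-up name:

```shell
# Hypothetical helper: copy a prepared postgresql.conf variant over the active file.
# Usage: switch_pg_conf logging|nologging [data_dir]
switch_pg_conf() {
    mode="$1"
    data_dir="${2:-/var/lib/pgsql/data}"
    case "$mode" in
        logging|nologging) ;;
        *) echo "usage: switch_pg_conf logging|nologging [data_dir]" >&2
           return 2 ;;
    esac
    src="$data_dir/postgresql.conf.$mode"
    # Refuse to touch the active file if the requested variant is missing.
    if [ ! -f "$src" ]; then
        echo "missing $src" >&2
        return 1
    fi
    cp "$src" "$data_dir/postgresql.conf"
}
```

The function only copies the file, so you'd still follow it with the reload, for example: switch_pg_conf logging && /etc/rc.d/init.d/postgresql reload.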
More By Tom Copeland