Now you've got a pile of data in your syslog file. I usually run the SQL analysis on another machine, so I tar up the syslog file and copy it to my workstation.
Of course, you can run the analysis on the database server as well; just do whatever's appropriate for your environment.

The next step is to actually analyze the data and see what we find. We could just scroll through the raw data, but that would take forever. So I wrote a little Ruby (http://ruby-lang.org/) library called PostgreSQL Query Analyzer (PQA) (http://rubyforge.org/projects/pqa/) to parse the data file and analyze the queries. If you're not familiar with Ruby, it's an object-oriented scripting language that's great for system administration and data manipulation tasks. To run PQA, you'll need to have Ruby installed.

On to the results. Here's a sample run showing the top 8 queries:
Already we see something very interesting: the supported_languages table is queried the same number of times by two rather similar queries. This suggests that a sequence of two queries could be combined into one, which is a big win, since we'd eliminate five percent of the total number of queries in one fell swoop. Looks like our analysis efforts have been repaid already. And if we were to look at the top 30 or 40 queries, we could probably find some more instances of this sort of thing.

We can also see that we're selecting the entire contents of the plugins table. Is that moving a lot of data around? Is there a way we can add some constraints to the query to limit the results? In this specific case it isn't a problem, since the plugins table is very small. Still, this is a good thing to be aware of if we start adding plugins; this query may be one to watch.
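To make the idea behind this kind of analysis concrete, here is a minimal Ruby sketch of counting queries and flagging ones that run equally often. It is not PQA's actual code or API: the helper names (normalize, query_counts, top_queries, same_count_groups) are hypothetical, the regular expressions are a crude way of stripping literals, and the sample queries at the bottom are invented for illustration.

```ruby
# Hypothetical helpers for illustration only; this is not how PQA is implemented.

# Replace literal values with "?" so that queries differing only in their
# parameters are counted as the same query.
def normalize(sql)
  sql.strip
     .gsub(/\s+/, ' ')                # collapse runs of whitespace
     .gsub(/'(?:[^']|'')*'/, '?')     # quoted strings, including '' escapes
     .gsub(/\b\d+\b/, '?')            # bare integers
end

# Map each normalized query to the number of times it appears.
def query_counts(queries)
  counts = Hash.new(0)
  queries.each { |q| counts[normalize(q)] += 1 }
  counts
end

# The n most frequent queries as [query, count] pairs, ties broken by query text.
def top_queries(queries, n = 8)
  query_counts(queries).sort_by { |query, count| [-count, query] }.first(n)
end

# Distinct queries that ran exactly the same number of times (at least
# min_count each). These are only candidates for being combined; equal
# counts can also be a coincidence.
def same_count_groups(queries, min_count = 2)
  query_counts(queries)
    .select { |_, count| count >= min_count }
    .group_by { |_, count| count }
    .select { |_, pairs| pairs.size > 1 }
    .transform_values { |pairs| pairs.map(&:first).sort }
end

# Invented sample input; real input would come from the parsed syslog file.
sample = [
  "select * from plugins",
  "select name from supported_languages where language_id = 1",
  "select language_code from supported_languages where language_id = 1",
  "select name from supported_languages where language_id = 22",
  "select language_code from supported_languages where language_id = 22"
]

top_queries(sample).each { |query, count| puts "#{count}  #{query}" }
same_count_groups(sample).each do |count, group|
  puts "Ran #{count} times each:"
  group.each { |query| puts "  #{query}" }
end
```

Queries flagged this way still need a human look: two queries with matching counts are worth reading side by side, but only the application code can tell you whether they are really issued as a pair.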