HomeMySQL SQL Performance and Tuning Considerations
SQL Performance and Tuning Considerations
This article, the first of two parts, will show you how to make the SQL statements you write run faster and more efficiently. It is excerpted from chapter 11 of the book SQL DeMYSTiFied, written by Andy Oppel (McGraw-Hill/Osborne, 2005; ISBN: 0072262249).
This ﬁnal chapter covers performance and tuning considerations for making the SQL statements you write run faster and more efﬁciently. First, we'll have a look at general things that can be done at the DBMS level, followed by guidelines for writing efﬁcient SQL. The many differences across RDBMSs (even different versions from the same vendor) make universal solutions to performance issues difﬁcult to develop. What works well in one DBMS often has a lesser, or even negative, effect on another. While I have made every attempt to include guidelines in this chapter that are universally applicable, I have also included some of the most important vendor-speciﬁc guidelines. In all cases, be sure to check your RDBMS documentation for current information.
Any seasoned DBA (Database Administrator) will tell you that database performance tuning is a never ending task. It seems there is always something that can be tweaked to make it run better. The key to success is managing your time and the expectations of the database users and setting the performance requirements for an application before it is even written. Simple statements such as "every database update must complete within four seconds" are usually the best. With that done, performance tuning becomes a simple matter of looking for things that do not conform to the performance requirement and tuning them until they do. The law of diminishing returns applies to database tuning, and you can put lots of effort into tuning a database process for little or no gain. The beauty of having a standard performance requirement is that you can stop when the process meets the requirement and then move on to the next problem.
General RDBMS Tuning Considerations
Most RDBMS performance problems are the result of poorly written SQL statements, but there are things that the DBA can do to make the entire database system run more efﬁciently. So, before we move on to tuning SQL queries, the sections that follow offer some general considerations for tuning the RDBMS.
Minimize Disk Reads and Writes
This may seem obvious, but the slowest operation on most computer systems is I/O (input/output), such as reading data from and writing data to the storage system (the hard disks). Moving data around in memory and performing various calculations and transformations on the data are both much faster than an I/O operation. So why wouldn't you just put the entire database in memory and keep it there? Actually, that may happen someday. However, for now, memory is too expensive to be a complete replacement for disks, and most memory is volatile, meaning that the data held in it is lost when the computer system is shut down. There is nonvolatile memory, but it costs considerably more. Therefore, the best thing the DBA can do is to make efﬁcient use of the available memory to minimize I/O operations and the time spent waiting for them to complete. Here are some ways to do that:
Allocate correctly sized buffers. You may recall that a buffer is an area of memory used to hold data that has been recently read from or is destined to be written to the storage system. With properly sized buffers, recently read data will stay in memory for a reasonable time, improving the probability that a new query will ﬁnd the data it needs already sitting in a buffer. On the output side, if there is room in the correct buffer, the RDBMS can write the data to the buffer and copy the buffer from memory to the storage system for permanent storage at a later time. This is called asynchronous I/O, which is fundamental to an efﬁcient RDBMS.
Spread out disk I/O. If you have multiple physical disk drives, spread the database ﬁles out to allow for parallel I/O. (Multiple partitions on a personal computer hard drive don't count because those partitions are all on the same physical drive.) In general, a disk drive can access only one spot on the drive at a time, so multiple I/O operations on a single drive must be handled serially. By spreading ﬁles out, multiple I/O operations can happen at the same time (one per drive at any instant).
Tune the Computer System and Environment
It should be obvious that the computer system on which the DBMS runs should be as fast and efﬁcient as possible. Here are some general considerations:
Select fast and reliable hardware. The faster the hardware on your computer system, particularly disk drives and processors, the faster your database queries will run.
Tune the operating system. While there is a lot of variation across vendors and operating system versions, there are usually somethings that can be done to improve performance. For example, in Microsoft Windows the size of the swap ﬁle can have a signiﬁcant effect on operating system performance. Consult a tuning manual or white paper on your operating system for speciﬁcs.