Measuring web server performance is a daunting task. We give it some attention here and supply pointers to more detailed works, but there are far too many variables involved to do it full justice. Most measuring strategies involve a “client” program that pretends to be a browser but, in fact, sends a huge number of requests, more or less concurrently, and measures the response times.
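As a rough illustration of such a client, here is a minimal Python sketch that sends many requests from a pool of threads and records how long each took. It is an assumption-laden toy rather than a replacement for a real load tool: the names `fetch` and `run_load` are hypothetical, and any failed request simply raises an exception.

```python
import time
import urllib.request
from concurrent.futures import ThreadPoolExecutor


def fetch(url, timeout=10.0):
    """Fetch one URL and return (status_code, elapsed_seconds)."""
    start = time.perf_counter()
    with urllib.request.urlopen(url, timeout=timeout) as response:
        response.read()  # drain the body so the whole response is timed
        status = response.status
    return status, time.perf_counter() - start


def run_load(url, total_requests, concurrency, fetch_one=fetch):
    """Send total_requests requests using `concurrency` worker threads.

    Returns a list of (status_code, elapsed_seconds) tuples. Errors are not
    handled: urlopen raises on timeouts and on 4xx/5xx responses, and the
    exception propagates to the caller.
    """
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        return list(pool.map(fetch_one, [url] * total_requests))
```

For example, `run_load("http://localhost:8080/", total_requests=200, concurrency=10)` would send 200 requests using 10 threads; the URL is a placeholder for your own test server. The `fetch_one` parameter exists only so that the fetching step can be swapped out, for instance with a stub when trying the code without a server.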
You’ll need to choose how to run your performance tests and what exactly you’ll test. For example, should the load test client and server software packages run on the same machine? We strongly advise against it: the client then competes with the server for the machine’s resources, which is bound to change and destabilize your results. Is the server machine running anything else at the time of the tests? Should the client and server be connected via gigabit Ethernet, or 100baseT, or 10baseT? In our experience, if your load test client machine is connected to the server machine via a link slower than gigabit Ethernet, the network link itself can become the bottleneck, which changes the results.
Should the client ask for the same page over and over again, mix several different kinds of requests concurrently, or pick randomly from a large list of pages? This can affect the server’s caching and multithreading performance. What you do here depends on what kind of client load you’re simulating. If you are simulating human users, they would likely request various pages and not one page repeatedly. If you are simulating programmatic HTTP clients, they may request the same page repeatedly, so your test client should probably do the same. Characterize your client traffic, and then have your load test client behave as your actual clients would.
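The three request patterns mentioned above can be sketched as Python generators that a test client draws its next URL from. The function names, the example paths, and the weights are all hypothetical; real weights should come from your own traffic analysis.

```python
import itertools
import random


def same_page(url):
    """Yield the same URL forever, like a programmatic client polling one resource."""
    return itertools.repeat(url)


def round_robin(urls):
    """Cycle through the URLs in a fixed order."""
    return itertools.cycle(urls)


def weighted_random(urls, weights, seed=None):
    """Yield URLs at random, in proportion to the given weights."""
    rng = random.Random(seed)  # a fixed seed makes a test run repeatable
    while True:
        yield rng.choices(urls, weights=weights, k=1)[0]
```

For instance, `itertools.islice(weighted_random(["/index.jsp", "/search", "/cart"], [8, 1, 1], seed=1), 1000)` produces 1000 paths in which the home page is requested roughly eight times as often as either of the others.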
Should the test client send requests regularly or in bursts? For benchmarking, when you want to know how fast your server is capable of completing requests, you should make your test client send requests in rapid succession without pausing between requests. Are you running your server in its final configuration, or is there still some debugging enabled that might cause extraneous overhead? For benchmarks, you should turn off all debugging, and you may also want to turn off some logging. Should the HTTP client request images or just the HTML page that embeds them? That depends on how closely you want to simulate human web traffic. We hope you see the point: there are many different kinds of performance tests you could run, and each will yield different (and probably interesting) results.
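To make the regular-versus-bursts choice concrete, the following Python sketch computes when each request would start under the two patterns. Both function names and their parameters are hypothetical. A pure benchmark, as described above, would skip scheduling altogether and send each request as soon as the previous one completes.

```python
def steady_schedule(total_requests, rate_per_second):
    """Start offsets, in seconds, for evenly spaced requests."""
    interval = 1.0 / rate_per_second
    return [i * interval for i in range(total_requests)]


def burst_schedule(total_requests, burst_size, gap_seconds):
    """Start offsets for groups of burst_size requests sent together.

    Every request in a burst shares one start time, and consecutive bursts
    start gap_seconds apart.
    """
    return [(i // burst_size) * gap_seconds for i in range(total_requests)]
```

A client could sleep until each offset before sending the corresponding request. Comparing results under the two schedules gives some idea of how the server copes with sudden spikes as opposed to an even flow.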
The point of most web load measuring tools is to request one or more resources from the web server a certain (large) number of times, and to tell you exactly how long it took from the client’s perspective (or how many times per second the page could be fetched). There are many web load measuring tools available on the Web; see http://www.softwareqatest.com/qatweb1.html#LOAD for a list of some of them. A few measuring tools of note are the Apache Benchmark tool (ab, included with distributions of the Apache httpd web server at http://httpd.apache.org), Siege (see http://www.joedog.org/JoeDog/Siege), and JMeter from Apache Jakarta (see http://jakarta.apache.org/jmeter).
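The numbers such tools report can be derived from the per-request timings. Here is a minimal Python sketch, assuming you already have a list of response times in seconds and the wall-clock duration of the whole run. The function name `summarize` and the choice of statistics are ours, not those of any particular tool.

```python
import math
import statistics


def summarize(latencies, wall_clock_seconds):
    """Summarize one test run.

    latencies: per-request response times in seconds, as seen by the client.
    wall_clock_seconds: elapsed time for the whole run.
    """
    ordered = sorted(latencies)

    def percentile(p):
        # Nearest-rank method: the smallest sample such that at least
        # p percent of all samples are less than or equal to it.
        rank = max(1, math.ceil(p / 100 * len(ordered)))
        return ordered[rank - 1]

    return {
        "requests": len(ordered),
        "requests_per_second": len(ordered) / wall_clock_seconds,
        "mean": statistics.mean(ordered),
        "p50": percentile(50),
        "p95": percentile(95),
        "max": ordered[-1],
    }
```

Percentiles are included alongside the mean because a few very slow responses can hide behind a healthy-looking average.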
Of those three load-testing tools, JMeter is the most featureful. It is implemented in pure multiplatform Java, sports a nice graphical user interface that is used for both configuration and load graphing, is flexible for web testing and report generation, can be used in a text-only mode, and has detailed online documentation showing how to configure and use it. In our experience, JMeter gave the most reporting options for the test results, was the most portable to different operating systems, and supported the most features. But, for some reason, JMeter was not able to request and complete as many HTTP requests per second as ab and siege did. If you’re not trying to find out how many requests per second your Tomcat can serve, JMeter works well because it probably implements all of the features you’ll need. But, if you are trying to determine the maximum number of requests per second your server can successfully handle, you should instead use ab or siege.
If you are looking for a command-line benchmark tool, ab works wonderfully. It is only a benchmarking tool, so you probably won’t be using it for regression testing. It does not have a graphical user interface, nor can it be given a list of more than one URL to benchmark at a time, but it does exceptionally well at benchmarking one URL and giving sharply accurate and detailed results. On most non-Windows operating systems, ab is preinstalled with Apache httpd, or there is an official Apache httpd package to install that contains ab, making the installation of ab the easiest of all of the web load-testing tools.
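As a sketch of how ab is typically driven, the Python helper below assembles an ab command line from a URL, a total request count (ab's `-n` option), and a concurrency level (`-c`), with optional HTTP keep-alive (`-k`). The helper names are hypothetical, and `run_ab` assumes that ab is installed and on your PATH.

```python
import subprocess


def ab_command(url, total_requests, concurrency, keep_alive=False):
    """Build the argument list for one ab run against a single URL."""
    command = ["ab", "-n", str(total_requests), "-c", str(concurrency)]
    if keep_alive:
        command.append("-k")  # reuse connections via HTTP keep-alive
    command.append(url)
    return command


def run_ab(url, total_requests, concurrency, keep_alive=False):
    """Run ab and return its text report; raises if ab exits with an error."""
    completed = subprocess.run(
        ab_command(url, total_requests, concurrency, keep_alive),
        capture_output=True,
        text=True,
        check=True,
    )
    return completed.stdout
```

For example, `ab_command("http://localhost:8080/", 1000, 10)` corresponds to typing `ab -n 1000 -c 10 http://localhost:8080/` at a shell prompt; the URL is a placeholder for your own server.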
Siege is another good command-line (no GUI) web load tester. It does not come preinstalled in most operating systems, but its build and install instructions are straightforward and about as easy as they can be, and Siege’s code is highly portable C code. Siege supports many different authentication features, can perform benchmark testing and regression testing, and also supports an “Internet” mode that attempts to more closely simulate the load your webapp would get with many real users over the Internet. With other, less featureful tools, there seems to be spotty support for webapp authentication. They support sending cookies, but some may not support receiving them. And, while Tomcat supports several different authentication methods (basic, digest, form, and client-cert), some of these less featureful tools support only HTTP basic authentication. Form-based authentication is testable with any tool that is able to submit the login form, which means the tool must support sending an HTTP POST request. Only some tools do (JMeter, ab, and siege each support sending POST requests like this). Being able to closely simulate the production user authentication is an important part of performance testing because the authentication itself is often a heavyweight operation and does change the performance characteristics of a web site. Depending on which authentication method you are using in production, you may need to find different tools that support it.
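To show what a test client must do for form-based authentication, here is a minimal Python sketch of a login against a webapp that uses the servlet container's standard form login, where the form posts `j_username` and `j_password` to `j_security_check`. The helper names, the base URL, and the protected path are assumptions for illustration. The sketch also assumes, as is usual with Tomcat, that the client first requests a protected page so that it has a session cookie to send back with the login POST.

```python
import http.cookiejar
import urllib.parse
import urllib.request


def make_session():
    """Build an opener that stores cookies it receives and resends them."""
    jar = http.cookiejar.CookieJar()
    return urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))


def login_request(base_url, username, password):
    """Build the POST request that submits the login form fields."""
    form = urllib.parse.urlencode({"j_username": username, "j_password": password})
    return urllib.request.Request(
        base_url.rstrip("/") + "/j_security_check",
        data=form.encode("ascii"),
        method="POST",
    )


def form_login(opener, base_url, protected_path, username, password):
    """Request a protected page, then submit the login form.

    Returns the HTTP status of the response that follows the login POST.
    """
    # The first request lets the server create a session and remember
    # which page was asked for.
    opener.open(base_url.rstrip("/") + protected_path).read()
    with opener.open(login_request(base_url, username, password)) as response:
        return response.status
```

After a successful `form_login`, further requests made through the same opener carry the session cookie, so the timed portion of a test can exercise protected pages just as a logged-in user would.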
As this book was going to print, a new benchmarking software package became available: Faban (http://faban.sunsource.net). Faban is written in pure Java 1.5+ by Sun Microsystems and is open source under the CDDL license. Faban appears to be focused on nothing but careful benchmarking of servers of various types, including web servers. Faban is carefully written for high performance and tight timing so that any measurements will be as close as possible to the server’s real performance. For instance, the benchmark timing data is collected when no other Faban code is running, and analysis of the data happens only after the benchmark has concluded. For best accuracy, this is the way all benchmarks should be run. Faban also has a very nice configuration and management console in the form of a web application. In order to serve that console webapp, Faban comes with its own integrated Tomcat server! Yes, Tomcat is a part of Faban. Any Java developers interested in both Tomcat and benchmarking can read Faban’s documentation and source code and optionally also participate in Faban’s development. If you are a Java developer, and you are looking for the most featureful, long-term benchmarking solution, Faban is probably what you should use. We did not have enough time to write more about it in this book, but luckily Faban’s web site has excellent documentation.