Tomcat Benchmark Procedure

In this third part of a five-part series focusing on Tomcat performance tuning, you will learn benchmarking procedures and some of the qualities of the application that you can benchmark. This article is excerpted from chapter four of Tomcat: The Definitive Guide, Second Edition, written by Jason Brittain and Ian F. Darwin (O’Reilly; ISBN: 0596101066). Copyright © 2008 O’Reilly Media, Inc. All rights reserved. Used with permission from the publisher. Available from booksellers or direct from O’Reilly Media.

Benchmark procedure

We benchmarked two different types of static resource requests: small text files and 9k image files. For both of these types of benchmark tests, we set the server to be able to handle at least 150 concurrent client connections, and set the benchmark client to open no more than 149 concurrent connections so that it never attempted to use more concurrency than the server was configured to handle. We set the benchmark client to use HTTP keep-alive connections for all tests.

For the small text files benchmark, we’re testing the server’s ability to read the HTTP request and write the HTTP response where the response body is very small. This mainly tests the server’s ability to respond fast while handling many requests concurrently. We set the benchmark client to request the file 100,000 times, with a possible maximum of 149 concurrent connections. This is how we created the text file:

  $ echo 'Hello world.' > test.html

We copied this file into Tomcat’s ROOT webapp and also into Apache httpd’s document root directory.

Here is the ab command line showing the arguments we used for the small text file benchmark tests:

  $ ab -k -n 100000 -c 149 http://192.168.1.2/test.html

We changed the requested URL appropriately for each test so that it made requests that would benchmark the server we intended to test each time.

For the 9k image files benchmark, we’re testing the server’s ability to serve a larger amount of data in the response body to many clients concurrently. We set the benchmark client to request the file 20,000 times, with a possible maximum of 149 concurrent connections. We specified a lower total number of requests for this test because each response carried more data, but we still left the count high enough to place a significant load on the server. This is how we created the image file:

  $ dd if=a-larger-image.jpg of=9k.jpg bs=1 count=9126

We chose a size of 9k because if we went much higher, both Tomcat and Apache httpd would easily saturate our 1 Mb Ethernet link between the client machine and the server machine. Again, we copied this file into Tomcat’s ROOT webapp and also into Apache httpd’s document root directory.
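As a quick sanity check (our own addition, not part of the book’s procedure), note that with `bs=1`, dd copies exactly `count` bytes, so the truncated copy can be verified before it is placed in both document roots. Here `/dev/zero` stands in for the source image, since any input at least that large behaves the same:

```shell
# Our sanity check: with bs=1, dd copies exactly `count` bytes, so the
# output file should be 9,126 bytes. /dev/zero stands in for the larger
# source image used in the book's command.
dd if=/dev/zero of=9k.jpg bs=1 count=9126 2>/dev/null
size=$(wc -c < 9k.jpg)
echo "9k.jpg is $size bytes"
```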

Here is the ab command line showing the arguments we used for the 9k image file benchmark tests:

  $ ab -k -n 20000 -c 149 http://192.168.1.2/9k.jpg

For each invocation of ab, we obtained the benchmark results by following this procedure:

  1. Configure and restart the Apache httpd and/or Tomcat instances that are being tested.
  2. Make sure the server(s) do not log any startup errors. If they do, fix the problem before proceeding.
  3. Run one invocation of the ab command line to get the servers serving their first requests after the restart.
  4. Run the ab command line again as part of the benchmark.
  5. Make sure that ab reports zero errors and zero non-2xx responses when all requests are complete.
  6. Wait a few seconds between invocations of ab so that the servers go back to an idle state.
  7. Note the requests per second in the ab statistics.
  8. Go back to step 4 if the requests per second change significantly; otherwise, this iteration’s requests per second are the result of the benchmark. If the numbers continue to change significantly, give up after 10 iterations of ab, and record the last requests per second value as the benchmark result.
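The numbered procedure above can be sketched as a small shell loop. This sketch is ours, not the book’s: the 3 percent stability threshold, the `IDLE_WAIT` pause, and the function names are all assumptions.

```shell
#!/bin/sh
# Sketch of steps 3-8: warm up, then repeat the measurement until two
# consecutive requests-per-second readings agree, giving up after 10
# iterations. The 3% threshold and IDLE_WAIT default are our assumptions.
benchmark_until_stable() {
  measure=$1                  # a command that prints one requests/sec number
  $measure > /dev/null        # step 3: warm-up run, result discarded
  prev=$($measure)            # step 4: first measured run
  i=0
  while [ "$i" -lt 10 ]; do   # step 8: give up after 10 iterations
    sleep "${IDLE_WAIT:-5}"   # step 6: let the servers go back to idle
    cur=$($measure)
    # steps 7-8: stop once consecutive readings differ by less than 3%
    if awk -v a="$prev" -v b="$cur" \
         'BEGIN { d = a - b; if (d < 0) d = -d; exit !(d / a < 0.03) }'; then
      break
    fi
    prev=$cur
    i=$((i + 1))
  done
  echo "$cur"                 # the benchmark result, in requests/second
}

# Example measurement command for the small text files test:
small_files_run() {
  ab -k -n 100000 -c 149 http://192.168.1.2/test.html |
    awk '/^Requests per second:/ { print $4 }'
}
```

Running `benchmark_until_stable small_files_run` would then print the settled requests-per-second figure for that server.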

The idea here is that the servers will be inefficient for the first few invocations of ab, but then the server software arrives at a state where everything is well initialized. The Tomcat JVM begins to profile itself and natively compile the most heavily used code for that particular use of the program, which further speeds response time. It takes a few ab invocations for the servers to settle into their more optimal runtime state, and it is this state that we should be benchmarking: the state the servers would be in if they had been serving for many hours or days, as production servers tend to do.

Benchmark results and summary

We ran the benchmarks and graphed the results data as bar charts, listing the web server configurations in descending performance order (one graph per test per computer). First, we look at how the machines did in the small text files benchmark (see Figures 4-3 and 4-4).

Notice that Figures 4-3 and 4-4 look very similar. On both machines, Tomcat standalone JIO is the fastest web server for serving these static text files, followed by APR, followed by NIO. The two build configurations of Apache httpd came in fourth and fifth fastest, followed by all of the permutations of Apache httpd connected to Tomcat via a connector module. And, dominating the slow end of the graphs is mod_jk.


Figure 4-3.  Benchmark results for serving small text files on the AMD64 laptop


Figure 4-4.  Benchmark results for serving small text files on the EM64T tower

It is also interesting to compare the requests per second results for one web server configuration across both graphs. The AMD64 laptop has one single-core processor, and the EM64T tower has two single-core processors; thus, if the dual-processor EM64T computer works efficiently, and if the operating system and JVM can effectively take advantage of both processors, it should be able to sustain slightly less than double the requests per second of the single-processor AMD64 machine. Of course, this assumes that the two processor models are equally fast at executing instructions, which they may not be. Comparing the results for the two computers, the same web server configuration on the dual EM64T computer does sustain nearly double the requests per second, minus a small percentage for the overhead of the two processors sharing one set of system resources, such as RAM and the data and I/O buses. This one computer with two processors can handle nearly the same number of requests as two single-processor computers, and both Tomcat and Apache httpd are able to take advantage of that.

Next, we examine the results of the 9k image files benchmark on both machines. Figures 4-5 and 4-6 show the results for the AMD64 computer and the dual EM64T computer, respectively.

Figure 4-5.  Benchmark results for serving 9k image files on the AMD64 laptop

In Figure 4-5, you can see that on AMD64, Tomcat standalone JIO wins again, with Apache httpd worker MPM trailing close behind. In this benchmark, their performance is nearly identical, with Tomcat standalone APR in a very close third place. Tomcat standalone NIO is in fourth place, trailing a little behind APR. Apache httpd prefork MPM is again fifth fastest, behind all of the Tomcat standalone configurations. Slower still are all of the permutations of Apache httpd connecting to Tomcat via connector modules. This time, we observed mod_jk perform about average among the connector modules, with some configurations of mod_proxy_http performing the slowest.

Figure 4-6 is somewhat different, showing that on the dual EM64T, Apache httpd edges slightly ahead of Tomcat standalone’s fastest connector, JIO. The difference in performance between the two is very small: about 1 percent. This may hint at a difference in how EM64T behaves versus AMD64. It appears that Apache httpd is 1 percent faster than Tomcat on EM64T when serving the image files, at least on the computers we benchmarked. You should not assume this is the case with newer computers, as many hardware details change! Also, we observed all three Tomcat standalone connectors performing better than Apache httpd prefork in this set of benchmarks. The configurations where Apache httpd connects to Tomcat via a connector module were again the slowest performing, with mod_jk performing the slowest.

 
Figure 4-6.  Benchmark results for serving 9k image files on the EM64T tower

Does the dual EM64T again serve roughly double the number of requests per second as the single processor AMD64 when serving the image files? No. For some reason, it’s more like four times the number of requests per second. How could it be possible that by adding one additional processor, the computer can do four times the work? It probably can’t. The only explanation we can think of is that something is slowing down the AMD64 laptop’s ability to serve the image files to the processor’s full potential. This isn’t necessarily a hardware problem; it could be that a device driver in this version of the kernel is performing inefficiently and slowing down the benchmark. This hints that the benchmark results for the 9k image benchmark on the AMD64 computer may not be accurate due to a slow driver. However, this is the observed performance on that computer. Until and unless a different kernel makes it perform better, this is how it will perform. Knowing that, it is unclear whether Tomcat or Apache httpd is faster serving the 9k image files, although we would guess that the EM64T benchmark results are more accurate.

Here is a summary of the benchmark results, including some important stats:

  1. Tomcat standalone was faster than Apache httpd compiled for worker MPM in all of our benchmark tests except the 9k image benchmark test on the Intel 64-bit Xeon, and even in that benchmark, httpd was only 1 percent faster than Tomcat. We observed that Tomcat standalone JIO was almost always the fastest way to serve static resources; Tomcat served them between 3 percent and 136 percent faster than Apache httpd in our benchmarks. For 9k image files, Tomcat standalone JIO was a minimum of 3 percent faster than Apache httpd (worker MPM), except in the Intel 64-bit Xeon benchmark, where httpd appeared to perform 1 percent faster than Tomcat. In the small files benchmark, Tomcat was a minimum of 99 percent and a maximum of 136 percent faster than Apache httpd.
  2. Apache httpd built to use worker MPM was the fastest configuration of Apache httpd we tested; Apache httpd built to use prefork MPM was slower than worker MPM in all of our standalone tests. We observed worker MPM serving a minimum of 0.4 percent and a maximum of 26 percent faster than prefork MPM. There was almost no difference in performance between the two in our small text files benchmarks, but in the 9k image files benchmark, the difference was at least 22 percent.
  3. Tomcat standalone (configured to use any of its HTTP connector implementations) was always faster than Apache httpd built and configured for prefork MPM. For 9k image files, Tomcat standalone was a minimum of 21 percent and a maximum of 30 percent faster than Apache httpd, and for small files, Tomcat was a minimum of 103 percent and a maximum of 136 percent faster than Apache httpd prefork MPM.
  4. Apache httpd was quite a bit slower at serving small files. Tomcat standalone’s JIO, APR, and NIO connectors were each faster than Apache httpd: Tomcat’s JIO connector performed as much as 136 percent faster than Apache httpd’s fastest configuration, Tomcat’s APR connector performed 89 percent faster than Apache httpd, and Tomcat 6.0’s NIO connector performed 25 percent faster than Apache httpd. In this common use case benchmark, Apache httpd dropped to fourth place behind all three of Tomcat standalone’s HTTP connectors.
  5. Serving Tomcat’s resources through Apache httpd was very slow compared to serving them directly from Tomcat. When we compared the benchmark results between Tomcat standalone (using only Tomcat’s JIO connector, without Apache httpd) and Tomcat serving through Apache httpd (via any of the three connector modules: mod_jk, mod_proxy_ajp, and mod_proxy_http), Tomcat standalone consistently served at least 51 percent faster. In the small text files benchmark, Tomcat standalone was a minimum of 168 percent faster than the Apache httpd-to-Tomcat configurations and a maximum of 578 percent faster! That’s not a misprint: it’s really 578 percent faster. For the 9k image files benchmark, Tomcat standalone was at least 51 percent faster and at most 274 percent faster.
  6. AJP outperformed HTTP when using mod_proxy. The benchmark results show that mod_proxy_ajp was consistently faster than mod_proxy_http. The margin between the two protocols was as low as 1 percent and as high as 30 percent when using the same Tomcat connector design, but it was usually smaller, with mod_proxy_ajp averaging about 13 percent faster than mod_proxy_http.
  7. Serving Tomcat’s static resources through an Apache httpd connector module was never faster than serving the same static resources through Apache httpd by itself. The benchmark results of serving the resources through an httpd connector module (from Tomcat) were always somewhat slower than serving the static resources straight from Apache httpd. This means that benchmarking Apache httpd standalone gives a number slightly higher than the theoretical maximum you could get by serving the same resource(s) through an httpd connector module. It also means that no matter how performant Tomcat is, serving its files through Apache httpd throttles Tomcat down so that Tomcat is slower than Apache httpd alone.
  8. mod_jk was not faster than mod_proxy, except in the 9k image benchmark, and then only on AMD64. In our tests, serving Tomcat’s resources through Apache httpd via mod_jk was faster than using mod_proxy only on the AMD64 laptop and only in the 9k image benchmark. In all the other benchmarks, mod_jk was slower than mod_proxy_ajp.

How is it possible for pure-Java Tomcat to serve static resources faster than Apache httpd? The main reason we can think of is this: because Java bytecode can be natively compiled and highly optimized at runtime, well-written Java code can run very fast on a mature Java VM that implements many runtime optimizations, such as the Sun HotSpot JVM. After the server has run and served many requests, the JVM knows how to optimize it for that particular use on that particular hardware. By contrast, Apache httpd is written in C, which is completely compiled ahead of runtime. Even though you can tell the compiler to heavily optimize the binaries, no runtime optimizations can take place. So there is no opportunity for Apache httpd to take advantage of the many runtime optimizations that Tomcat enjoys.

Another potential reason Tomcat serves faster than Apache httpd is that every release of Sun’s JVM seems to run Java code faster, a trend that has continued over many JVM release cycles. That means that even if you’re not actively changing your Java program to perform better, it will likely keep improving every time you run it on a newer, faster JVM, if the same progress on JVM performance continues. This does, however, assume that newer JVMs will be compatible enough to run your Java program’s bytecode without any modifications.

What else we could have benchmarked

In this benchmark, we tested the web servers’ performance when serving HTTP. We did not benchmark HTTPS (encrypted HTTP). The performance characteristics are probably significantly different between HTTP and HTTPS, because with HTTPS both the server and the client must encrypt and decrypt the data flowing in both directions over the network. The overhead of the encryption slows down requests and responses to varying degrees in different implementations of the crypto code. Without benchmarking it, many people believe that Apache httpd’s HTTPS performance is higher than Tomcat’s, usually basing that belief on the idea that C code is faster than Java code. Our HTTP benchmark disproves that idea in three of our four benchmark scenarios, and in the fourth, the C side is not significantly faster. We do not know which web server configuration would serve HTTPS fastest without benchmarking them. But whichever of the C or Java encryption code turns out to be significantly faster, Tomcat can use either: you can configure Tomcat’s APR connector to use OpenSSL for HTTPS encryption, the same C library that Apache httpd uses.
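As a sketch of that last point, an HTTPS connector backed by the APR/native library is declared in Tomcat’s server.xml roughly as follows. This is our illustration, not the book’s configuration: the port, thread count, and certificate paths are placeholders, and it assumes the APR native library is installed and loaded.

```xml
<!-- Hypothetical sketch: an HTTPS connector using the APR/native
     library, which performs its encryption with OpenSSL, the same C
     library Apache httpd uses. Port, maxThreads, and the certificate
     paths are placeholders. -->
<Connector port="8443" maxThreads="150"
           scheme="https" secure="true" SSLEnabled="true"
           SSLCertificateFile="/path/to/server.crt"
           SSLCertificateKeyFile="/path/to/server.key" />
```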

We could have benchmarked other metrics such as throughput; there are many more interesting things to learn by watching any particular metric that ab reports. For this benchmark, we define greater performance to mean a higher number of requests per second being handled successfully (a 2xx response code).
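Both the "zero errors and zero non-2xx responses" check from step 5 of the procedure and the requests-per-second metric defined here can be pulled out of a saved ab report mechanically. This is our own sketch; the field positions reflect our reading of ab’s plain-text output:

```shell
#!/bin/sh
# Our sketch: validate a saved `ab` report (step 5 of the procedure) and,
# if it is clean, print its requests-per-second figure, the metric this
# benchmark defines performance by.
check_ab_report() {
  report=$1
  failed=$(awk '/^Failed requests:/ { print $3 }' "$report")
  non2xx=$(awk '/^Non-2xx responses:/ { print $3 }' "$report")
  # ab omits the Non-2xx line entirely when every response was 2xx
  [ "${failed:-0}" -eq 0 ] && [ "${non2xx:-0}" -eq 0 ] || return 1
  awk '/^Requests per second:/ { print $4 }' "$report"
}
```

Typical use would be `ab -k -n 100000 -c 149 http://192.168.1.2/test.html > report.txt` followed by `check_ab_report report.txt`.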

We could have benchmarked other static file sizes, including files larger than 9k in size, but with files as large as 100k, all of the involved server configurations saturate the bandwidth of a megabit Ethernet network. This makes it impossible to measure how fast the server software itself could serve the files because the network was not fast enough. For our test, we did not have network bandwidth greater than 1 Mb Ethernet.

We could have tested with mixed file sizes per HTTP request, but what mixture would we choose, and what use case would that particular mixture represent? The results of benchmarks such as these would only be interesting if your own web traffic had a similar enough mixture, which is unlikely. Instead, we focused on benchmarking two file sizes, one file size per benchmark test.

We could have tested with a different number of client threads, but 150 threads is the default (as of this writing) on both Tomcat and Apache httpd, which means many administrators will use these settings—mainly due to lack of time to learn what the settings do and how to change them in a useful way. We ended up raising some of the limits on the Apache httpd side to find a way to make httpd perform better when the benchmark client sends a maximum of 149 concurrent requests; it worked.

There are many other things we could have benchmarked and many other ways we could have benchmarked. Even covering other common use cases is beyond the scope of this book. We’re trying to show only one example of a benchmark that yields some useful information about how the performance of Tomcat’s web server implementations compares with that of Apache httpd  in a specific limited environment and for specific tests.
