
Simple server load test with cron and ab (Linux)

Load testing "refers to the practice of modeling the expected usage of a software program by simulating multiple users accessing the program concurrently. As such, this testing is most relevant for multi-user systems; often one built using a client/server model, such as web servers."

I've found many articles online explaining how ApacheBench lets you "load test" with a single command from a Linux terminal, but is that a realistic load test? A single execution of ab is a very limited simulation of what actually happens when multiple users try to access your web application. A server may perform well if it has to work hard for 30 seconds (a possible execution time of an ab command), but what happens when 20000 extra requests hit your web app after it's already been stressed for hours?
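For context, the kind of one-off run those articles describe looks something like this (the hostname is a placeholder; -c is the number of concurrent clients and -n the total number of requests):

ab -c 50 -n 10000 http://server-hostname/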


Apache HTTP server benchmarking tool (ApacheBench, or ab) is a simple yet powerful tool "designed to give you an impression of how your current Apache installation performs." It can be leveraged to load test any web server setup, but we need to think for a minute about what exactly we're simulating. Here are a few examples:

  1. An average of 1000 requests per minute by 30 different users reach a web server, with spikes of up to 5000 requests by 100 users every hour or so.
  2. We expect 15000 requests every five minutes (by 50-100 different users), which doubles from 7 to 10pm on weekdays.
  3. Up to 10 other systems access my REST API, each sending anywhere from 500,000 to a million requests per hour.

This is where cron comes into play. Cron is a time-based job scheduler in Linux, which means you can use it to schedule commands to run in the background at specific times, including recurring runs of the same command (for example, on minute 15 of every hour). Like ab, it's a pretty simple tool: the crontab -e command opens your preferred editor (typically nano) so you can enter single-line, 6-field entries in the cron format, which may vary slightly among Linux distributions: m h dom mon dow command (minute, hour, day of month, month, day of week, command). Going back to the 3 examples:

  1. We need 2 entries in crontab:
    * * * * * ab -k -c 30 -n 1000 http://server-hostname/ # every minute
    0 * * * * ab -k -c 70 -n 4000 http://server-hostname/ # every hour
  2. Now we may need 3 entries:
    */5 0-18 * * * ab -k -c `shuf -i 50-100 -n 1` -n 15000 http://webapp-hostname/ # every 5 min during normal hours (12am to 7pm)
    */5 19-21 * * * ab -c `shuf -i 50-100 -n 1` -n 30000 http://webapp-hostname/path/ # every 5 min during "rush hours" (7 to 10pm)
    */5 22,23 * * * ab -k -c `shuf -i 50-100 -n 1` -n 15000 http://webapp-hostname/path/ # every 5 min during the remaining normal hours (10pm and 11pm)
  3. A single entry will do here:
    30 * * * * ab -c 10 -n `shuf -i 500000-1000000 -n 1` http://api-hostname/get?query # every hour (at minute 30)
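Once the entries are saved, listing the installed schedule is a quick way to catch syntax slips before waiting an hour for a run:

crontab -l # prints the current crontab; for example #1 you should see the two ab lines shown above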
Notes:
I use -k in some of the ab commands aimed at web applications; this enables HTTP keep-alive, which reuses connections and is meant to simulate individual users making several requests in a row, as browsers do.
The Linux shuf command generates random numbers within a given range (-i); -n sets how many numbers to output (here, a single one).
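If you're curious what one of those backtick substitutions expands to, you can run shuf on its own; the output varies from run to run:

shuf -i 50-100 -n 1
# prints one random integer between 50 and 100, e.g. 73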

These are simple use cases which begin to approximate complete load tests, but they don't take into account factors such as multiple URL paths or POST requests. Also, in order to see the output of the ab commands executed by cron, we need to add log files to the mix. I'll leave the details for you to figure out, but here's a tip, on example #3:

30 * * * * ab -c 10 -n `shuf -i 500000-1000000 -n 1` http://api-hostname/get?query >> /home/myuser/my/api/tests/load/cronab.log
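Appending 2>&1 to that redirect would also capture anything ab prints to stderr, which helps when a run fails outright.

As for the POST-request limitation mentioned above, ab can also send request bodies with its -p (file containing the POST data) and -T (Content-Type) flags. A minimal sketch, assuming a hypothetical payload.json file and endpoint:

ab -c 10 -n 1000 -p payload.json -T application/json http://api-hostname/resource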

ab's output is a report that looks like this:

Concurrency Level:      35
Time taken for tests:   38.304 seconds
Complete requests:      220000
Failed requests:        0
Keep-Alive requests:    217820
Total transferred:      70609100 bytes
HTML transferred:       18480000 bytes
Requests per second:    5743.58 [#/sec] (mean)
Time per request:       6.094 [ms] (mean)
Time per request:       0.174 [ms] (mean, across all concurrent requests)
Transfer rate:          1800.20 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    0   0.0      0       5
Processing:     0    6   1.4      6      25
Waiting:        0    6   1.4      6      25
Total:          0    6   1.4      6      25

...

Final tip: How to determine infrastructural limits?

Every server's capacity is different. Reports from trial-and-error executions of ab can give you an idea of where a web application's infrastructure (servers) starts to falter (mean response times go up exponentially), but the best way is to use a visual monitoring tool such as Amazon CloudWatch in AWS. Watching graphs of different metrics over time (e.g. requests handled, errors, dropped connections, CPU utilization, memory, or swap usage) once you have let ab run on cron for hours or even days lets you better adjust the number of requests and the concurrency of future ab commands. Try to find that breaking point!
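A low-tech complement to those graphs is to skim the cron log itself (same placeholder path as in the tip above) and watch how the mean response time evolves across runs:

grep "Time per request" /home/myuser/my/api/tests/load/cronab.log | grep "(mean)"
# one "Time per request ... (mean)" line per ab run; if these keep climbing, you're nearing the breaking point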

Thanks for reading (=
