JMeter distributed testing with Amazon EC2

Recently I had to set up a performance testing infrastructure for one of my clients. The goal was to put their web application under heavy load to prove it could scale properly and to do some capacity planning.

I chose Apache JMeter to generate the load, created a few test plans and started to nudge the application. Of course, I quickly understood that my MacBook alone would not be enough to make the server sweat.

To serve this application we are using Amazon EC2 instances managed with the Scalr.net service. One day I should write something about Scalr, but for now, suffice it to say that a Scalr farm defines the configuration of a cluster of EC2 instances. Each instance in a farm belongs to a role (an EC2 AMI), and the farm defines how many instances of each role should be launched.

Since JMeter can be used in a master/slave mode (see details here) I decided to use Scalr to create a farm of JMeter slaves that would put the application under pressure.

The first problem I faced is that the JMeter master and its slaves must be in the same sub-network to be able to communicate, so my JMeter farm had to define two different roles: one for the master (jmeter-master) with a single instance, and one for the slaves (jmeter-slave) with as many instances as necessary.

The second problem concerned the IP addresses of the slaves: I did not want to write down the slaves’ IPs and enter them manually on the JMeter command line. Luckily, with Scalr, each instance in a farm is informed of its peers’ IP addresses, so I wrote a small Python script that gets those IPs and launches the JMeter master with a given test plan.

#! /usr/bin/python
import os, sys, subprocess, datetime

JMETER_CMD = '/usr/share/jmeter/bin/jmeter'
SCRIPTS_ROOT = '/var/testing/'
# Instance IPs for a given role are filenames in the '/etc/aws/hosts' folder
SLAVES = os.listdir('/etc/aws/hosts/jmeter-slave')

def jmeter(script):
    # Timestamped results file, e.g. 20091006153000.log
    logname = datetime.datetime.now().strftime('%Y%m%d%H%M%S') + '.log'
    script = os.path.join(SCRIPTS_ROOT, script)
    cmd = [ JMETER_CMD, '-n' ]          # non-GUI mode
    cmd += [ '-t', script ]             # test plan to run
    cmd += [ '-l', logname ]            # where to store the results
    cmd += [ '-R', ','.join(SLAVES) ]   # comma-separated slave IPs
    cwd = SCRIPTS_ROOT
    subprocess.check_call(cmd, cwd=cwd, stderr=sys.stderr, stdout=sys.stdout)

if __name__ == '__main__':
    # Pass the test plan filename (relative to SCRIPTS_ROOT) as first argument
    jmeter(sys.argv[1])

This was working pretty nicely for my simpler test plans (like the one that only GETs the home page) but as soon as I tried to POST (like during the login process) this was not enough. The thing is that the POST data that JMeter is using are not stored in the test plan itself but in companion .binary files, and those files are not sent by the master to the slaves like the test plans are.

I thus had to find a way to send those files by myself before the launch of the test plans. Rsync seemed the easiest thing to do, so I wrote another Python script to synchronize the slaves.

#! /usr/bin/python
import os, sys, subprocess

SCRIPTS_ROOT = '/var/testing/'
# Instance IPs for a given role are filenames in the '/etc/aws/hosts' folder
SLAVES = os.listdir('/etc/aws/hosts/jmeter-slave')

def sync():
    for slave in SLAVES:
        dest = '%s:/var/testing' % slave
        cmd = ( 'rsync', '-r', '-e', 'ssh -q -i /var/testing/farm.key', SCRIPTS_ROOT, dest)
        subprocess.check_call(cmd, stderr=sys.stderr, stdout=sys.stdout)

if __name__ == '__main__':
    sync()

The above script requires only three things:

  • a valid RSA private key (here /var/testing/farm.key), which you can download using the Scalr.net farm’s menu
  • the /var/testing folder must already exist on the slaves
  • and, of course, you need to initially get the files on the master. I use svn up.
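Putting it all together, the whole run boils down to “sync, then launch”. Here is a minimal driver sketch under the same assumptions as the two scripts above (same paths, same /etc/aws/hosts convention); the wiring itself is hypothetical, not something I actually run as a single script:

```python
#!/usr/bin/python
# Hypothetical driver tying the two scripts together: push the test plans
# and .binary files to every slave, then start the JMeter master.
# Paths and the /etc/aws/hosts convention are the ones used above.
import os, subprocess, sys

SCRIPTS_ROOT = '/var/testing/'
JMETER_CMD = '/usr/share/jmeter/bin/jmeter'
SSH_CMD = 'ssh -q -i /var/testing/farm.key'

def rsync_cmd(slave_ip):
    # One rsync invocation per slave, mirroring SCRIPTS_ROOT to /var/testing
    return ['rsync', '-r', '-e', SSH_CMD, SCRIPTS_ROOT,
            '%s:/var/testing' % slave_ip]

def jmeter_cmd(script, slaves):
    # Non-GUI master run of one test plan against all the slave IPs
    return [JMETER_CMD, '-n',
            '-t', os.path.join(SCRIPTS_ROOT, script),
            '-R', ','.join(slaves)]

if __name__ == '__main__':
    slaves = os.listdir('/etc/aws/hosts/jmeter-slave')
    for ip in slaves:
        subprocess.check_call(rsync_cmd(ip))
    subprocess.check_call(jmeter_cmd(sys.argv[1], slaves), cwd=SCRIPTS_ROOT)
```

The command-building is split into pure functions so the slave loop stays a one-liner and each invocation is easy to inspect before running it.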

Once you have prepared and tested everything using one master and one slave, you can rebundle the instances you used and then start spawning tens of slaves to stress your application.

If you have already done something similar or have ideas for improving my setup, do not hesitate to let me know in the comments :)

UPDATE: With the release of the Amazon Virtual Private Cloud, it should now be possible to have the slaves running in the cloud and the master running on your workstation, all within your own sub-network. You will still need to find another way to synchronize the POST data with the slaves, though.