Archive for the 'technology' Category

JMeter distributed testing with Amazon EC2

Recently I had to set up a performance testing infrastructure for one of my clients. The goal was to put their web application under heavy load to prove it was able to scale properly and to do some capacity planning.

I chose Apache JMeter to generate the load, created a few test plans and started to nudge the application. Of course I quickly understood that my MacBook alone would not be enough to make the server sweat.

To serve this application we are using Amazon EC2 instances managed with the Scalr.net service. One day I should write something about Scalr, but for now, suffice it to say that a Scalr farm defines the configuration of a cluster of EC2 instances. Each instance in a farm belongs to a role (an EC2 AMI) and the farm defines how many instances of each role should be launched.

Since JMeter can be used in a master/slave mode (see details here) I decided to use Scalr to create a farm of JMeter slaves that would put the application under pressure.

The first problem I faced is that the JMeter master and its slaves must be in the same sub-network to be able to communicate, so my JMeter farm had to define two different roles, one for the master (jmeter-master) with only one instance and one for the slaves (jmeter-slave) with as many instances as necessary.

The second problem concerned the IP addresses of the slaves: I did not want to write down the slaves’ IPs and manually enter them on the JMeter command line. Luckily, with Scalr, each instance in a farm is informed of its peers’ IP addresses, so I wrote a small Python script that gets those IPs and launches the JMeter master with a given test plan.

#! /usr/bin/python
import os, sys, subprocess, datetime
 
JMETER_CMD = '/usr/share/jmeter/bin/jmeter'
SCRIPTS_ROOT = '/var/testing/'
# Instance IPs for a given role are filenames in the '/etc/aws/hosts' folder
SLAVES = os.listdir('/etc/aws/hosts/jmeter-slave')
 
def jmeter(script):
    logname = datetime.datetime.now().strftime('%Y%m%d%H%M%S') + '.log'
    script = os.path.join(SCRIPTS_ROOT, script)
    cmd = [ JMETER_CMD, '-n' ]
    cmd += [ '-t', script ]
    cmd += [ '-R', ','.join(SLAVES) ]
    cwd = SCRIPTS_ROOT
    subprocess.check_call(cmd, cwd=cwd, stderr=sys.stderr, stdout=sys.stdout)
 
if __name__ == '__main__':
    jmeter(sys.argv[1])
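
The command line assembled by the script above can be checked without launching JMeter. The following is only a sketch: the helper names list_slaves and build_command, the test plan name login.jmx and the IP addresses are all made up for illustration. The script above computes a timestamped log name but never passes it to JMeter; feeding it to the -l option (the file JMeter logs samples to) is my assumption about the intent, not something the script does.

```python
import datetime
import os

JMETER_CMD = '/usr/share/jmeter/bin/jmeter'
SCRIPTS_ROOT = '/var/testing/'


def list_slaves(hosts_dir):
    """Return the sorted entries of hosts_dir, skipping hidden files.

    Each remaining filename is assumed to be the IP address of one slave.
    """
    return sorted(name for name in os.listdir(hosts_dir)
                  if not name.startswith('.'))


def build_command(script, slaves, results_file=None):
    """Assemble the non-GUI JMeter command line without running it."""
    if not slaves:
        raise ValueError('no slave found, nothing to drive')
    cmd = [JMETER_CMD, '-n', '-t', os.path.join(SCRIPTS_ROOT, script)]
    if results_file is not None:
        # Assumption: the timestamped name is meant for -l (sample results).
        cmd += ['-l', results_file]
    cmd += ['-R', ','.join(slaves)]
    return cmd


if __name__ == '__main__':
    stamp = datetime.datetime.now().strftime('%Y%m%d%H%M%S')
    slaves = ['10.0.0.11', '10.0.0.12']  # placeholder addresses
    print(' '.join(build_command('login.jmx', slaves, stamp + '.log')))
```

Failing early on an empty slave list avoids handing JMeter an empty -R value when the farm has not finished booting.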

This was working pretty nicely for my simpler test plans (like the one that only GETs the home page) but as soon as I tried to POST (like during the login process) this was not enough. The thing is that the POST data that JMeter is using are not stored in the test plan itself but in companion .binary files, and those files are not sent by the master to the slaves like the test plans are.

I thus had to find a way to send those files by myself before the launch of the test plans. Rsync seemed the easiest thing to do, so I wrote another Python script to synchronize the slaves.

#! /usr/bin/python
import os, sys, subprocess
 
SCRIPTS_ROOT = '/var/testing/'
# Instance IPs for a given role are filenames in the '/etc/aws/hosts' folder
SLAVES = os.listdir('/etc/aws/hosts/jmeter-slave')
 
def sync():
    for slave in SLAVES:
        dest = '%s:/var/testing' % slave
        cmd = ( 'rsync', '-r', '-e', 'ssh -q -i /var/testing/farm.key', SCRIPTS_ROOT, dest)
        subprocess.check_call(cmd, stderr=sys.stderr, stdout=sys.stdout)
 
if __name__ == '__main__':
    sync()

The above script requires only three things:

  • a valid RSA private key (here /var/testing/farm.key), which you can download using the Scalr.net farm’s menu
  • the /var/testing folder must already exist on the slaves
  • and, of course, you need to initially get the files on the master. I use svn up.
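
The requirements listed above can be verified on the master before any file is copied. This is a minimal sketch under my own assumptions: the function names rsync_command and preflight are hypothetical, and the permission check relies on the fact that ssh refuses private keys that are readable by group or others.

```python
import os
import stat

SCRIPTS_ROOT = '/var/testing/'
KEY_FILE = '/var/testing/farm.key'


def rsync_command(slave, root=SCRIPTS_ROOT, key=KEY_FILE):
    """Build the rsync command that copies root to the same path on slave."""
    # A trailing slash on the source copies the folder's contents,
    # not the folder itself.
    source = root.rstrip('/') + '/'
    dest = '%s:%s' % (slave, root.rstrip('/'))
    return ['rsync', '-r', '-e', 'ssh -q -i %s' % key, source, dest]


def preflight(root=SCRIPTS_ROOT, key=KEY_FILE):
    """Return a list of human-readable problems found on the master."""
    problems = []
    if not os.path.isdir(root):
        problems.append('missing folder: %s' % root)
    if not os.path.isfile(key):
        problems.append('missing private key: %s' % key)
    elif stat.S_IMODE(os.stat(key).st_mode) & 0o077:
        # ssh refuses private keys readable by group or others.
        problems.append('key permissions too open: %s' % key)
    return problems
```

An empty list from preflight() means the master side looks ready; whether /var/testing exists on each slave still has to be checked on the slaves themselves.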

Once you have prepared and tested everything, using one master and one slave, you can rebundle the instances you used and then start to spawn tens of slaves to stress your application.

If you have already done something similar or have ideas for improving my setup, do not hesitate to let me know in the comments :)

UPDATE: With the release of the Amazon Virtual Private Cloud it should now be possible to have slaves running in the cloud and a master running on your workstation, since they would all be in your own sub-network. However, you will need to find another way to synchronize the POST data with the slaves.

Posting multipart form data using PHP

@shvi asked me for this code over Twitter, and I thought it would be a good idea to share it here.

The following code posts two different fields: a simple text field named “somedata” and a file named “somefile”.
Hope it helps :)

$destination = "http://yourdomain.com/yoururl";
 
$eol = "\r\n";
$data = '';
 
$mime_boundary=md5(time());
 
$data .= '--' . $mime_boundary . $eol;
$data .= 'Content-Disposition: form-data; name="somedata"' . $eol . $eol;
$data .= "Some Data" . $eol;
$data .= '--' . $mime_boundary . $eol;
$data .= 'Content-Disposition: form-data; name="somefile"; filename="filename.ext"' . $eol;
$data .= 'Content-Type: text/plain' . $eol;
$data .= 'Content-Transfer-Encoding: base64' . $eol . $eol;
$data .= chunk_split(base64_encode("Some file content")) . $eol;
$data .= "--" . $mime_boundary . "--" . $eol . $eol; // finish with two eol's!!
 
$params = array('http' => array(
                  'method' => 'POST',
                  'header' => 'Content-Type: multipart/form-data; boundary=' . $mime_boundary . $eol,
                  'content' => $data
               ));
 
$ctx = stream_context_create($params);
$response = @file_get_contents($destination, false, $ctx);

I Can Haz Java?

They announced it yesterday at the Google Campfire ’09 (here and here) and it is today on the Google App Engine blog: Java is now supported on Google App Engine!

It comes with a set of Eclipse plugins to test and deploy Java servlets, using JDO or JPA to support database access. Of course, the database behind this is BigTable, which means that a lot of relational features are not available, but it scales!

Go there to get started, or, if you want to know whether your preferred framework will play well with GAE, go to the “Will it play in App Engine” page.

That’s good news! Especially because we may start having more and more Java applications outside of the corporate walls.

Yes Google, YES!

vedovini.net is on Facebook

As you might have noticed already, Facebook has been on vedovini.net since last December. For this purpose I have been using the Sociable! fbConnect plugin for WordPress, which enables visitors to log in using their Facebook credentials, comment using their Facebook identity, and feature those comments on their Facebook stream.

However, I wanted a deeper integration between this blog and my Facebook profile. Here is what I have done.

In the process of installing the fbConnect plugin you have to create a Facebook application and, among other things, Facebook applications feature a canvas page and an application tab. The canvas page is the main application page, the tab can be added to any user’s or page’s profiles.

To set up the canvas page you specify a URL that will serve either FBML (Facebook markup) or pure HTML. In the latter case the page is displayed in an IFRAME.

Initially, I used the IFRAME version to display the home page but I found it awkward to have my blog design mixed with the Facebook design. Additionally, this technique cannot be used for tabs, which only support FBML.

Finally, I crafted special pages on this blog that serve only FBML with a special Facebook styling extracted from Foxinni’s Facebook WordPress Theme. The resulting canvas page can be seen here and the corresponding profile tab is now featured on my own Facebook profile, here (you may not be able to see this one because of Facebook privacy control so I inserted a screenshot below).

If this is getting enough interest I might package it as a WordPress plugin. Leave a comment if you are interested or if you have additional ideas.

UPDATE (2010-03-13): This is now a WordPress plugin, see my plugins page.

What BOINC are you?

I have been participating in the SETI@Home project since January 6th, 2000 (at that time I was a huge fan of The X-Files and I thought it might help Fox Mulder find his lost sister…). Since then I have installed the SETI@Home client on every desktop or laptop computer I have used.

For those who do not know what the SETI@Home project is, the goal is to sort out radio signals from the Arecibo radio telescope using grid computing in order to find extraterrestrial signals.

As far as I know this was the very first large-scale implementation of grid computing and today, with 556,888 machines and 1,393.74 TeraFLOPS per day, I think it can be considered the most powerful super-computer in the world.

We still have not found any trace of Samantha Mulder, but what this project has actually proven is that it is possible to use desktop computers’ idle time to do useful scientific research.

In 2005, the SETI@Home software became BOINC (Berkeley Open Infrastructure for Network Computing) and mutated into a platform which now supports not only the SETI@Home project but also many other scientific projects.

When I am not using them, my laptop and my desktop are running the following projects:

What are you doing with your computers’ idle time? If you have BOINC installed, list the projects you are supporting in the comments; if not, install it and become part of the most powerful super-computer in the Solar System ;)