Category Archives: Code & Technology

And you think 160 is not enough?

An SMS is not 160 characters long, it is 140 bytes long! This is what I discovered today after my SO complained that her mobile operator was charging her for SMS she never sent…

And when you know how computers work, it totally makes sense!

“So what?” you are going to ask. Well, this is yet another nice example of how character encodings can drive you crazy. According to Wikipedia there are 3 encodings used in text messages, which use 7 bits, 8 bits and 16 bits respectively to encode a single character.

Depending on the characters you use in your message, your phone decides which encoding to use, thus reducing the maximum number of characters to 160, 140 and 70 respectively (and even less, see later). Any extra character will cause your message to be split into multiple SMS and, obviously, a rise in your bill.

By default the 7-bit encoding used is GSM 03.38, which has the following 128-character alphabet: @, £, $, ¥, è, é, ù, ì, ò, Ç, LF, Ø, ø, CR, Å, å, Δ, _, Φ, Γ, Λ, Ω, Π, Ψ, Σ, Θ, Ξ, ESC, Æ, æ, ß, É, SP, !, ", #, ¤, %, &, ', (, ), *, +, ,, -, ., /, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, :, ;, <, =, >, ?, ¡, A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P, Q, R, S, T, U, V, W, X, Y, Z, Ä, Ö, Ñ, Ü, §, ¿, a, b, c, d, e, f, g, h, i, j, k, l, m, n, o, p, q, r, s, t, u, v, w, x, y, z, ä, ö, ñ, ü, à

If you use only those characters, then your text messages can have 160 characters; however, any character outside of this alphabet will force the use of a different encoding. And if you are using exotic scripts, your messages will be encoded in UTF-16, where a Chinese character, for example, can take up to 4 bytes, reducing the maximum length of your Chinese message to as few as 35 characters.
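
To make this concrete, here is a small Python sketch (a hypothetical helper, and simplified: it ignores the GSM extension table, where characters like the euro sign cost two septets) that tells you which encoding a message would use and how many characters fit in a single SMS:

# -*- coding: utf-8 -*-
# The GSM 03.38 basic alphabet (LF, CR, ESC and space included)
GSM_BASIC = set(
    u"@£$¥èéùìòÇ\nØø\rÅåΔ_ΦΓΛΩΠΨΣΘΞ\x1bÆæßÉ !\"#¤%&'()*+,-./0123456789:;<=>?"
    u"¡ABCDEFGHIJKLMNOPQRSTUVWXYZÄÖÑÜ§¿abcdefghijklmnopqrstuvwxyzäöñüà"
)

def sms_limit(text):
    """Return (encoding, max characters in a single SMS)."""
    if all(c in GSM_BASIC for c in text):
        return 'GSM 03.38', 160  # 140 bytes * 8 bits / 7 bits per character
    return 'UCS-2', 70           # 140 bytes / 2 bytes per character

print sms_limit(u'Hello!')  # ('GSM 03.38', 160)
print sms_limit(u'你好')    # ('UCS-2', 70)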

I guess that now that smartphones support international scripts and transparently split text messages, a lot of people get trapped. The only recommendation I can think of is to make your phone display the character count when you type text messages. I noticed that my iPhone changes the maximum number of characters according to the encoding it's going to use to send my message.

If you want to know more about character encodings I absolutely recommend the following article by Joel Spolsky: The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (No Excuses!)

UPDATE: At Stephanie’s request, here is how to activate the character count on your iPhone (at least on my 3GS with iOS 4.1).

Go to your iPhone settings, scroll down to “Messages” then toggle “Character Count” on. When you write a text message the count will show up only if you have at least two lines of text :)

Image Credits: Steve Webel

What’s your problem, exactly?

When you have a problem with your car and you go to the garage, you usually say something along the lines of “I’ve got a strange noise when I do this or that”. The guy (or, in my dreams, the gal) never says “I don’t understand, come back later with a better description of your problem”.

To my fellow developers: this is the same when a user comes to you with a bug. Believe me, there’s no such thing as an under-specified bug.

There’s a rampant habit among developers of being condescending and asking for precisely specified bug reports. I know it, I do it as well. Developers have a lot of reasons to do that, but mostly it’s because we think we are smarter than average (which is true, most of the time) and you don’t deserve our attention if you’re too stupid to understand the way we work. The other thing is that we don’t like bugs, and one way to avoid bugs is to make it difficult to report them.

When a user (or client, which is worse) takes the time to come to you and say there’s a problem with your software, take their word for it. Even if it’s not a technical bug, it can be a documentation or an education bug. Make sure it’s easy and worthwhile for them to report that bug because, above all, it’s in your own interest.

Reporting a bug must be a conversation. I mean, you would go to another garage if the guy was condescending, disrespectful or simply oblivious, wouldn’t you?

Image Credits: Rich Nacmias

Facebook Pages Notifications

If you administer community or professional pages, or have developed Facebook applications, you might have noticed already that, contrary to your own personal wall, you never get notified when someone posts a status or writes a comment on your pages.

This is a problem for most page administrators, and there is a 32-page-long (and growing) thread of people complaining about this missing feature. It is a problem because, without such a feature, you have to periodically crawl your own pages to check whether anyone posted anything (status or comment) so you get a chance to respond to it, or flag it as spam. I guess that since they won’t get notified about this thread, the Facebook people will never notice the problem…

One proposed solution to this issue is to “like” each and every status update you post on your pages’ walls; however, besides the fact that liking everything you post may look a bit awkward, this does not get you notified when someone else posts a new status.

I have some pages I need to watch, so missing this feature was really a problem to me. And when it itches, I scratch… Besides, I wanted to experiment with the new Facebook Graph API.
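
Under the hood the idea is simply to poll each page’s feed through the Graph API and e-mail whatever is new. Here is a minimal sketch of that polling step (the page ID and access token are placeholders, and all the “what have I already seen” bookkeeping is left out):

import json, urllib2

PAGE_ID = 'your_page_id'            # placeholder
ACCESS_TOKEN = 'your_access_token'  # placeholder

# The page feed lists the statuses posted on the wall
url = 'https://graph.facebook.com/%s/feed?access_token=%s' % (PAGE_ID, ACCESS_TOKEN)
feed = json.load(urllib2.urlopen(url))
for post in feed['data']:
    print post.get('from', {}).get('name'), ':', post.get('message')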

So I created an application called “Watch My Pages!” which sends users a daily e-mail notification when someone posts a status or writes a comment on their pages’ walls. If you like it, have a problem with it or think of a feature, just drop a message on its wall, I’ll get notified ;)

How to manage Google AppEngine maintenance periods

Here is a small snippet of code that I use on applications deployed on Google AppEngine to inform users that the application is in maintenance mode.

This usually happens when the AppEngine team puts the datastore in read-only mode for maintenance purposes, but other capabilities can be tested as well (see the example after the template below).

def requires_datastore_write(view):
    """Decorate views that need datastore write access; shows a maintenance page when writes are disabled."""
    def newview(request, *args, **kwargs):
        from google.appengine.api import capabilities
        # Ask App Engine whether datastore writes are currently enabled
        datastore_write_enabled = capabilities.CapabilitySet('datastore_v3', capabilities=['write']).is_enabled()

        if datastore_write_enabled:
            return view(request, *args, **kwargs)
        else:
            # The datastore is read-only: render the maintenance page instead
            from django.shortcuts import render_to_response
            from django.template import RequestContext
            return render_to_response('maintenance.html', context_instance=RequestContext(request))

    return newview

This is a Python decorator and you can use it to decorate views that require write access to the datastore. For example:

@requires_datastore_write
def update(request):
    # Do something that requires writing to the datastore
    pass

You will need to create a Django template named maintenance.html to display a warning to your users. Mine looks like this:

<h2>Application Maintenance</h2>
<p>The LibraryThing for Facebook application is currently
in maintenance mode and some operations are temporarily unavailable.</p>
<p>Please try back later. Sorry for the inconvenience.</p>
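
As mentioned above, other capabilities can be tested the same way. For example, here is a minimal sketch (not part of the decorator above) checking whether memcache is available:

from google.appengine.api import capabilities

# True unless App Engine has disabled memcache, e.g. during maintenance
memcache_enabled = capabilities.CapabilitySet('memcache').is_enabled()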

Properly uploading files to Amazon S3

Here is a little script I wrote that I thought ought to be shared. I use it to upload static files like images, CSS and JavaScript so that they can be served by Amazon S3 instead of the main application server (like Google App Engine).

It’s written in Python and does interesting things like compressing and minifying what needs to be. It takes 3 arguments and has 2 options:

Usage: s3uploader.py [-xm] src_folder destination_bucket_name prefix

  • src_folder: path to the local folder containing the static files to upload
  • destination_bucket_name: name of the S3 bucket to upload to (e.g. static.example.com)
  • prefix: a prefix to use for the destination key (kind of a folder on the destination bucket; I use it to specify a release version to defeat browser caching)
  • -x: if set, the script will set a far-future expiry for all files, otherwise the S3 default will be used (one day, if I remember correctly)
  • -m: if set, the script will minify CSS and JavaScript files
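
For example, a typical invocation (with a hypothetical source folder, bucket and release prefix) would be:

python s3uploader.py -xm static static.example.com r42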

First you will have to install some dependencies, namely boto, jsmin and cssmin. The installation procedure will depend on your OS, but on my Mac I do the following:

sudo easy_install boto
sudo easy_install jsmin
sudo easy_install cssmin

And here is the script itself:

#! /usr/bin/env python
import os, sys, boto, mimetypes, zipfile, gzip
from io import BytesIO
from optparse import OptionParser
from jsmin import JavascriptMinify
from cssmin import cssmin

# Boto picks up configuration from the env.
os.environ['AWS_ACCESS_KEY_ID'] = 'Your AWS access key id goes here'
os.environ['AWS_SECRET_ACCESS_KEY'] = 'Your AWS secret access key goes here'

# The list of content types to gzip, add more if needed
COMPRESSIBLE = [ 'text/plain', 'text/csv', 'application/xml',
                'application/javascript', 'text/css' ]

def main():
    parser = OptionParser(usage='usage: %prog [options] src_folder destination_bucket_name prefix')
    parser.add_option('-x', '--expires', action='store_true', help='set far future expiry for all files')
    parser.add_option('-m', '--minify', action='store_true', help='minify javascript and css files')
    (options, args) = parser.parse_args()
    if len(args) != 3:
        parser.error("incorrect number of arguments")
    src_folder = os.path.normpath(args[0])
    bucket_name = args[1]
    prefix = args[2]

    conn = boto.connect_s3()
    bucket = conn.get_bucket(bucket_name)

    namelist = []
    for root, dirs, files in os.walk(src_folder):
        if files and not '.svn' in root:
            path = os.path.relpath(root, src_folder)
            namelist += [os.path.normpath(os.path.join(path, f)) for f in files]

    print 'Uploading %d files to bucket %s' % (len(namelist), bucket.name)
    for name in namelist:
        content = open(os.path.join(src_folder, name), 'rb')
        key = bucket.new_key(os.path.join(prefix, name))
        type, encoding = mimetypes.guess_type(name)
        type = type or 'application/octet-stream'
        headers = { 'Content-Type': type, 'x-amz-acl': 'public-read' }
        states = [type]

        if options.expires:
            # We only use HTTP 1.1 headers because they are relative to the time of download
            # instead of being hardcoded.
            headers['Cache-Control'] = 'max-age=%d' % (3600 * 24 * 365)

        if options.minify and type == 'application/javascript':
            outs = BytesIO()
            JavascriptMinify().minify(content, outs)
            content.close()
            content = outs.getvalue()
            if len(content) > 0 and content[0] == '\n':
                content = content[1:]
            content = BytesIO(content)
            states.append('minified')

        if options.minify and type == 'text/css':
            outs = cssmin(content.read())
            content.close()
            content = outs
            if len(content) > 0 and content[0] == '\n':
                content = content[1:]
            content = BytesIO(content)
            states.append('minified')

        if type in COMPRESSIBLE:
            headers['Content-Encoding'] = 'gzip'
            compressed = BytesIO()
            gz = gzip.GzipFile(filename=name, fileobj=compressed, mode='w')
            gz.writelines(content)
            gz.close()
            content.close()
            content = BytesIO(compressed.getvalue())
            states.append('gzipped')

        states = ', '.join(states)
        print '- %s => %s (%s)' % (name, key.name, states)
        key.set_contents_from_file(content, headers)
        content.close()

if __name__ == '__main__':
    main()

Thanks to Nico for the expiry trick :)

Spare me the talk about privacy, they’re all clueless anyway…

With all the talks and posts and whatnot about privacy on the Internet, it’s easy for anyone to turn into a privacy control freak.

And I really was starting to freak out myself. After all, a good chunk of my own life is on the Net: Facebook, Twitter, Flickr, LinkedIn, this blog, all the Google applications and all the other services I use or test… But this morning I received a letter, not an e-mail, a paper letter. From Google AdWords. Sent from France. In German!

I guess they just assumed that since I live in Switzerland I speak German, like when ebay.com redirects me to ebay.de, but I neither speak nor read German.

And it reminded me of something I learned a long time ago, when I was working for Singularis, a now-defunct start-up that was collecting users’ preferences about TV programs: you can collect as much data as you want, but if you don’t know how to use it, it’s only worth the cost of its storage.

And the more you have, the harder it is.

Image Credits: Michell Zappa

My Fontself is better than your font

For those of you who were at the Lift conference in 2008, you might remember Fontself. Franz Hoffman and Marc Escher, the two founders of the company, were there to offer everyone the opportunity to fill in a grid with their own handwriting, scan it, and use it on the Lift website.

Today, the Fontself team has grown and is celebrating the first release of a product. Together with Netlog, the European online social portal, they are now giving Netlog community members the opportunity to send messages, post blog entries or post comments using personalized character fonts.

Congratulations to them; they have been working long and hard for their ideas to come out, and I am proud I helped them make their dream come true.

And this also gives me some advantages, like being able to use a Fontself font on my own blog and give you a glimpse at what the future of web fonts might be!
Among other things, you will appreciate the ability to select, copy and paste the text :P
For now, the feature is only available on the French version of the platform, but there is no doubt that it will rapidly extend to the rest of the 35 million Netlog members throughout Europe and that the Fontself team will continue to develop their technology and enhance the web.

If you want to stay informed about Fontself and their technology, you can subscribe to their newsletter, become a friend of their Netlog page, follow them on Twitter or keep following this blog…

Image Credits: Fontself

Sidewiki RSS

Last week Google announced Google Sidewiki, a new service that enables anyone to comment on any page.

There have been a lot of comments about Sidewiki already, but the thing that instantly struck me is that there’s no easy way to keep up with what others are saying about your own pages. So I took a look at the Sidewiki API and built the Sidewiki RSS service.

This free service (I hope you won’t mind the Google Ads) lets webmasters get the URL of an RSS feed of the recent Sidewiki entries for their pages. There’s even a bookmarklet that you can drop in your browser’s toolbar and use to get the feed of the page you are browsing.

Hope you will like it ;)

JMeter distributed testing with Amazon EC2

Recently I had to set up a performance-testing infrastructure for one of my clients. The goal was to put their web application under heavy load to prove it was able to scale properly and to do some capacity planning.

I chose Apache JMeter to generate the load, created a few test plans and started to nudge the application. Of course, I quickly understood that my MacBook alone wouldn’t be enough to make the server sweat.

To serve this application we are using Amazon EC2 instances managed with the Scalr.net service. One day I should write something about Scalr, but for now, suffice it to say that a Scalr farm defines the configuration of a cluster of EC2 instances. Each instance in a farm belongs to a role (an EC2 AMI) and the farm defines how many instances of each role should be launched.

Since JMeter can be used in a master/slave mode (see details here) I decided to use Scalr to create a farm of JMeter slaves that would put the application under pressure.
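
In that mode you launch JMeter on the master in non-GUI mode and pass it the list of slave hosts, along the lines of (test plan name and host names are placeholders):

jmeter -n -t plan.jmx -R slave1,slave2

The script below builds exactly that kind of command line from the farm’s configuration.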

The first problem I faced was that the JMeter master and its slaves must be in the same sub-network to be able to communicate, so my JMeter farm had to define two different roles: one for the master (jmeter-master) with only one instance, and one for the slaves (jmeter-slave) with as many instances as necessary.

The second problem concerned the IP addresses of the slaves: I did not want to write down the slaves’ IPs and manually enter them on the JMeter command line. Luckily, with Scalr, each instance in a farm is informed of its peers’ IP addresses, so I wrote a small Python script that gets those IPs and launches the JMeter master with a given test plan.

#! /usr/bin/python
import os, sys, subprocess, datetime

JMETER_CMD = '/usr/share/jmeter/bin/jmeter'
SCRIPTS_ROOT = '/var/testing/'
# Instance IPs for a given role are filenames in the '/etc/aws/hosts' folder
SLAVES = os.listdir('/etc/aws/hosts/jmeter-slave')

def jmeter(script):
    logname = datetime.datetime.now().strftime('%Y%m%d%H%M%S') + '.log'
    script = os.path.join(SCRIPTS_ROOT, script)
    cmd = [ JMETER_CMD, '-n' ]
    cmd += [ '-t', script ]
    cmd += [ '-R', ','.join(SLAVES) ]
    # Write the test results to the timestamped log file
    cmd += [ '-l', logname ]
    cwd = SCRIPTS_ROOT
    subprocess.check_call(cmd, cwd=cwd, stderr=sys.stderr, stdout=sys.stdout)

if __name__ == '__main__':
    jmeter(sys.argv[1])

This worked pretty nicely for my simpler test plans (like the one that only GETs the home page), but as soon as I tried to POST (like during the login process) this was not enough. The thing is that the POST data that JMeter uses is not stored in the test plan itself but in companion .binary files, and those files are not sent by the master to the slaves like the test plans are.

I thus had to find a way to send those files myself before launching the test plans. Rsync seemed the easiest thing to do, so I wrote another Python script to synchronize the slaves.

#! /usr/bin/python
import os, sys, subprocess

SCRIPTS_ROOT = '/var/testing/'
# Instance IPs for a given role are filenames in the '/etc/aws/hosts' folder
SLAVES = os.listdir('/etc/aws/hosts/jmeter-slave')

def sync():
    for slave in SLAVES:
        dest = '%s:/var/testing' % slave
        cmd = ( 'rsync', '-r', '-e', 'ssh -q -i /var/testing/farm.key', SCRIPTS_ROOT, dest)
        subprocess.check_call(cmd, stderr=sys.stderr, stdout=sys.stdout)

if __name__ == '__main__':
    sync()

The above script requires only three things:

  • a valid RSA private key (here /var/testing/farm.key), which you can download using the Scalr.net farm’s menu
  • the /var/testing folder must already exist on the slaves
  • and, of course, you need to initially get the files on the master. I use svn up.

Once you have prepared and tested everything, using one master and one slave, you can rebundle the instances you used and then start to spawn tens of slaves to stress your application.
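
On the master, a test run then boils down to two commands (assuming you saved the two scripts above as, say, sync_slaves.py and run_test.py, and have a test plan named loadtest.jmx; all three names are made up):

./sync_slaves.py
./run_test.py loadtest.jmx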

If you have already done something similar or have ideas for improving my setup, do not hesitate to let me know in the comments :)

UPDATE: With the release of the Amazon Virtual Private Cloud it should now be possible to have the slaves running in the cloud and the master running on your workstation; they would all be in your own sub-network. However, you will need to find another way to synchronize the POST data with the slaves.

Posting multipart form data using PHP

@shvi asked me for this code over Twitter; I thought it would be a good idea to share it here.

The following code posts two different fields: a simple text field named “somedata” and a file named “somefile”.
Hope it helps :)

<?php
$destination = "http://yourdomain.com/yoururl";

$eol = "\r\n";
$data = '';

$mime_boundary = md5(time());

$data .= '--' . $mime_boundary . $eol;
$data .= 'Content-Disposition: form-data; name="somedata"' . $eol . $eol;
$data .= "Some Data" . $eol;
$data .= '--' . $mime_boundary . $eol;
$data .= 'Content-Disposition: form-data; name="somefile"; filename="filename.ext"' . $eol;
$data .= 'Content-Type: text/plain' . $eol;
$data .= 'Content-Transfer-Encoding: base64' . $eol . $eol;
$data .= chunk_split(base64_encode("Some file content")) . $eol;
$data .= "--" . $mime_boundary . "--" . $eol . $eol; // finish with two eol's!!

$params = array('http' => array(
                  'method' => 'POST',
                  'header' => 'Content-Type: multipart/form-data; boundary=' . $mime_boundary . $eol,
                  'content' => $data
               ));

$ctx = stream_context_create($params);
// Second argument is use_include_path (a boolean), not a flags bitmask
$response = @file_get_contents($destination, false, $ctx);