
Properly uploading files to Amazon S3

Here is a little script I wrote that I thought ought to be shared. I use it to upload static files like images, CSS and JavaScript so that they can be served by Amazon S3 instead of the main application server (like Google App Engine).

It’s written in Python and does interesting things like compressing and minifying what needs to be. It takes three arguments and has two options:

  • src_folder: path to the local folder containing the static files to upload
  • destination_bucket_name: name of the S3 bucket to upload to (e.g. static.example.com)
  • prefix: a prefix to use for the destination keys (a kind of folder on the destination bucket; I use it to specify a release version to defeat browser caching)
  • x: if set, the script sets a far-future expiry on all files, otherwise the S3 default is used (one day, if I remember correctly)
  • m: if set, the script minifies CSS and JavaScript files

First you will have to install some dependencies, namely boto, jsmin and cssmin. The installation procedure will depend on your OS, but on my Mac I do the following:
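
For example, with setuptools or pip (the exact command will depend on your setup, so treat this as one plausible way rather than the original one):

```sh
# One plausible way to get the dependencies, not necessarily the original commands;
# with pip it would be: sudo pip install boto jsmin cssmin
sudo easy_install boto jsmin cssmin
```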

And here is the script itself:
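
What follows is only a minimal sketch of what such a script could look like with boto, jsmin and cssmin; the -x / -m flag spelling, the credential handling and every other implementation detail are assumptions based on the description above, not the original code.

```python
#!/usr/bin/env python
# Sketch of an S3 upload script in the spirit of the description above (Python 2 era,
# boto + jsmin + cssmin). The -x/-m flag spelling, the credential handling and every
# other implementation detail are assumptions, not the original code.
import gzip
import io
import mimetypes
import os
from datetime import datetime, timedelta
from optparse import OptionParser

from boto.s3.connection import S3Connection
from boto.s3.key import Key
from cssmin import cssmin
from jsmin import jsmin

# Text formats worth gzipping before upload
COMPRESSIBLE = ('.css', '.js', '.html', '.htm', '.txt', '.xml')


def main():
    parser = OptionParser(usage='%prog [-x] [-m] src_folder destination_bucket_name prefix')
    parser.add_option('-x', dest='expires', action='store_true',
                      help='set a far future expiry on all files')
    parser.add_option('-m', dest='minify', action='store_true',
                      help='minify css and javascript files')
    options, args = parser.parse_args()
    src_folder, bucket_name, prefix = args

    # boto picks up AWS_ACCESS_KEY_ID / AWS_SECRET_ACCESS_KEY from the environment
    bucket = S3Connection().get_bucket(bucket_name)

    for dirpath, _, filenames in os.walk(src_folder):
        for filename in filenames:
            path = os.path.join(dirpath, filename)
            ext = os.path.splitext(filename)[1].lower()
            data = open(path, 'rb').read()

            # Minify JavaScript and CSS when asked to
            if options.minify and ext == '.js':
                data = jsmin(data)
            elif options.minify and ext == '.css':
                data = cssmin(data)

            headers = {'Content-Type': mimetypes.guess_type(filename)[0]
                       or 'application/octet-stream'}

            # Gzip the text formats and tell S3 to serve them as such
            if ext in COMPRESSIBLE:
                buf = io.BytesIO()
                gz = gzip.GzipFile(fileobj=buf, mode='wb')
                gz.write(data)
                gz.close()
                data = buf.getvalue()
                headers['Content-Encoding'] = 'gzip'

            # Far future expiry: ten years ahead, the prefix acts as a release number anyway
            if options.expires:
                expiry = datetime.utcnow() + timedelta(days=10 * 365)
                headers['Expires'] = expiry.strftime('%a, %d %b %Y %H:%M:%S GMT')

            key = Key(bucket)
            key.key = '/'.join([prefix, os.path.relpath(path, src_folder)])
            key.set_contents_from_string(data, headers=headers, policy='public-read')
            print('uploaded ' + key.key)


if __name__ == '__main__':
    main()
```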

Thanks to Nico for the expiry trick :)

Spare me the talk about privacy, they’re all clueless anyway…

With all the talks and posts and whatnot about privacy on the Internet it’s easy for anyone to turn into a privacy control freak.

And I really was starting to freak out myself. After all, a good bunch of my own life is on the Net: Facebook, Twitter, Flickr, LinkedIn, this blog, all the Google applications and all the other services I use, or I test… But this morning I received a letter, not an e-mail, a paper letter. From Google AdWords. Sent from France. In German!

I guess they just assumed that since I was living in Switzerland I spoke German, like when ebay.com redirects me to ebay.de, but I neither speak nor read German.

And it reminded me of something I learned a long time ago, when I was working for Singularis, a now defunct start-up that collected users’ preferences about TV programs: you can collect as much data as you want, but if you don’t know how to use it, it’s only worth the cost of its storage.

And the more you have the harder it is.

Image Credits: Michell Zappa

After-Sales Abuse (Sévices Après Vente)

In a world that is a little more digital every day, let’s not forget that it’s still the same old tricks that work: si vis pacem, para bellum!

!!Warning!! This post is long and boring…

Those who follow me on Twitter or Facebook will remember that last November (on Friday the 13th, to be precise) my apartment was burgled while I was at the dentist’s (that’s a lot for a Friday the 13th). Among the things that were stolen was my MacBook Pro, which is my one and only work tool.

So I quickly needed a new machine. After touring the region’s Apple resellers only to discover that nothing but base configurations were in stock, I went, without much hope, to the FNAC in Lausanne.

I never buy electronics in this price range at the FNAC: below CHF 200 the price-to-convenience ratio is good enough that I don’t look any further, but above that I have always been able to find things cheaper elsewhere. Yet, on that Saturday the 14th, I was delighted to discover that the FNAC had in stock a MacBook Pro whose configuration came very close to the one I was looking for: 15″, 3.06 GHz, 4 GB of RAM and a 500 GB hard drive at 7200 rpm. So I bought it, for CHF 3299 (minus the members’ discount).

On December 29th (a month and a half later), as I was going to bed, I decided to leave my MacBook running on the living-room table so it could take part in the BOINC network and dedicate a few cycles to the search for extraterrestrial intelligence. I have no pets, no children, and the machine was sitting in a clear spot where its ventilation was not obstructed.

The next morning, after breakfast, I went to check my e-mail. Strangely, while a simple caress is usually enough, my Mac would not wake up. Surprised, I checked that I had not left it unplugged: no, the cord was right there, plugged in and powered, so the batteries had not run down. Moreover, a very faint purr told me the machine still seemed to be running. So I forced a shutdown by holding down the power button, and I distinctly heard that small, characteristic noise of an electric motor stopping somewhere inside the machine. I turned it on again and a motor noise could be heard once more, but apart from that, nothing: the screen remained desperately blank. After one or two more equally fruitless attempts, I decided to take the machine to the FNAC’s after-sales service.


Experiencing viral growth

Viral growth is one thing to hear and talk about, but it is something totally different to experience: it’s thrilling, even on a modest scale.

Since my LibraryThing application for Facebook came out, it has clearly followed a viral growth curve. So far there are only 435 users, and every week I look for an inflection in this trend. I know there will be one, because there is a limited number of LibraryThing users on Facebook. My goal, right now, is to attract as many of them as possible to this application.

The next step will be to attract Facebook users to LibraryThing. But I know that for this I will need help from Tim Spalding and the LibraryThing team. I have always been grateful for their work, but I must admit I have been quite disappointed recently: I have been trying to contact them and they have constantly ignored me.

I am also thinking about open-sourcing the application, because I think it is a good use case both for people developing Python/Django applications on Google App Engine and for those developing for the Facebook platform. I still have to choose a license, but the GNU Affero General Public License seems like a good match.

Anyway, if you love books, have plenty of them and want to share what you read, do not forget to give LibraryThing a try and, once you are convinced, join the Facebook application. With this application you can:

  • Add a tab and a box to your profile, listing your most recent books
  • Choose the number of books to display in your profile tab
  • Choose whether you want to display them with covers only or as a list which will include your ratings and reviews
  • If you grant the application the right to publish to your stream it will publish books you add to LibraryThing on your wall
  • It will also publish reviews as you write them on LibraryThing

You can also:

  • Browse your Facebook friends’ books
  • Find books on the search page
  • Share a book you like or comment on it (these are Facebook-only features and will not appear in LibraryThing)
  • Add a book to your LibraryThing collection with a single click

Enjoy :)

A year in review

As this year is coming to an end, I thought I should do a post-mortem, like at the end of a project, to see what went well and what did not.

  • I left the office I was renting downtown and started working from home: It was a good move from a work perspective, as being alone at home allows me to be really productive. However, as the second part of the year got busier it became difficult to set limits and my work/life balance suffered.
  • I worked on the map editor of the DITA-OP but did not finish it: Not good at all, I have not been able to do a release this year. The other problem with the DITA-OP is that I don’t know my users. I know they are out there, somewhere, and I really need to find a way to gather the community.
  • I started two toy projects, SidewikiRSS.com and a Facebook application for LibraryThing: SidewikiRSS.com is rolling on its own, it does not cost me anything besides the domain name (thanks Google App Engine), it is used regularly and brings some traffic here. fbLibraryThing is slowly but steadily growing, but I am wondering whether I will be able to add new features: I am completely dependent on the LibraryThing API and I will need help from the LibraryThing team if I want to go further.
  • I completely put aside my super-secret Babelizr project: That’s not a good thing, for sure, but at least it was because of too much paid work. The positive side is that I greatly improved my Python and Django skills on other projects, and that will pay off for Babelizr.
  • I can now consider myself an Amazon Web Services and Google App Engine expert: That is a tremendous addition to my résumé. I now need to dedicate more time to their respective communities.
  • I accepted too many projects in the second part of the year: The beginning of the year had been slow and I thought I needed as many contacts as possible to build a sustainable business. Overall this is a good thing, especially since I exceeded my financial goals. The other positive side is that I only accepted interesting projects and that I met really nice people. But I was under a lot of pressure in the last quarter, and that was definitely not the point of becoming a freelancer: “working more to earn more” is not my motto.
  • I did not blog enough: Especially since I gained a lot of experience in many fields and with many tools, I should definitely have written more about it.

And last, but not least:

  • I swam with sharks: Biggest thrill ever! I swam with two oceanic whitetips (Carcharhinus longimanus) in the Red Sea. My only regret is that I was totally unable to take a decent picture or video of the event.
  • I skied almost every week-end of the winter season and hiked almost every week-end of the summer season: This prepared me really well for our two-week vacation in Peru.
  • I went to the movie theater more than 80 times: Thanks to the Pathé Pass Yang offered me last Christmas. It allowed me to see movies (good and bad) that I would not have seen otherwise.

I think I can say it was a good year: tiring, a bit stressful near the end, but a good year. However, I must say it did not bring me any closer to my biggest goal, which is to find ways to automate my revenue stream, so I really need to work on that next year.

My other plans for next year are:

  • Releasing the latest version of the DITA-OP and finding a way to build and animate the community.
  • More blogging (like every other blogger) and tweeting, and finding a better way to organize my Facebook presence.
  • Connecting with the LibraryThing team, although this has proved difficult so far.
  • Coming back to Babelizr, maybe starting by building external interest around the project in order to force commitment.
  • Dedicating more time to online communities: Google App Engine, AWS, Drupal, Django, etc. Maybe through Stack Overflow.
  • Planning a four-week vacation dedicated to hiking or diving.
  • Watching as many movies as possible with my renewed Pathé Pass.

And, of course, keeping my clients happy :)

I worked on only two public projects this year (the others are either private or still in stealth mode, so I cannot talk about them):

  • Fontself, a startup company which provides a revolutionary new experience of text through digital text personalization. It provides digital fonts that preserve the gestures of a given handwriting and the original look of the writing instrument (ball-point pen, pencil, ink, paper, etc.). I participated in the design of the font distribution system and its implementation on Amazon’s cloud infrastructure using Python and Django.
  • nouvo.ch, the multimedia magazine of the French-speaking Swiss television channel, which asked me to redevelop their website using the Drupal CMS and various media-management modules.

My Fontself is better than your font

Those of you who were at the Lift conference in 2008 might remember Fontself. Franz Hoffman and Marc Escher, the two founders of the company, were there to offer everyone the opportunity to fill in a grid with their own handwriting, scan it, and use it on the Lift website.

Today, the Fontself team has grown and is celebrating the first release of a product. Together with Netlog, the European online social portal, they are now giving Netlog community members the opportunity to send messages, post blog entries or post comments using personalized character fonts.

Congratulations to them: they have worked long and hard to get their ideas out, and I am proud I helped them make their dream come true.

And this also gives me some advantages, like being able to use a Fontself font on my own blog and give you a glimpse of what the future of web fonts might be!
Among other things, you will appreciate the ability to select, copy and paste the text :P

For now, the feature is only available in the French version of the platform, but there is no doubt that it will rapidly extend to the rest of the 35 million Netlog members throughout Europe and that the Fontself team will continue to develop their technology and enhance the web.

If you want to stay informed about Fontself and their technology, you can subscribe to their newsletter, become a friend of their Netlog page, follow them on Twitter or keep following this blog…

Image Credits: Fontself

Sidewiki RSS

Last week Google announced Google Sidewiki, a new service that enables anyone to comment on any web page.

There have been a lot of comments about Sidewiki already, but the thing that instantly struck me is that there’s no easy way to keep up with what others are saying about your own pages. So I took a look at the Sidewiki API and built the Sidewiki RSS service.

This free service (I hope you won’t mind the Google ads) enables webmasters to get a feed URL for the recent Sidewiki entries on their pages. There’s even a bookmarklet that you can drop into your browser’s toolbar and use to get the feed for the page you are browsing.

Hope you will like it ;)

JMeter distributed testing with Amazon EC2

Recently I had to set up a performance-testing infrastructure for one of my clients. The goal was to put their web application under heavy load, to prove it was able to scale properly and to do some capacity planning.

I chose Apache JMeter to generate the load, created a few test plans and started to nudge the application. Of course, I quickly understood that my MacBook alone would not be enough to make the server sweat.

To serve this application we are using Amazon EC2 instances managed with the Scalr.net service. One day I should write something about Scalr but, for now, suffice it to say that a Scalr farm defines the configuration of a cluster of EC2 instances. Each instance in a farm belongs to a role (an EC2 AMI), and the farm defines how many instances of each role should be launched.

Since JMeter can be used in a master/slave mode (see details here) I decided to use Scalr to create a farm of JMeter slaves that would put the application under pressure.

The first problem I faced is that the JMeter master and its slaves must be on the same sub-network to be able to communicate, so my JMeter farm had to define two different roles: one for the master (jmeter-master), with only one instance, and one for the slaves (jmeter-slave), with as many instances as necessary.

The second problem concerned the IP addresses of the slaves: I did not want to write down the slaves’ IPs and enter them manually on the JMeter command line. Luckily, with Scalr, each instance in a farm is informed of its peers’ IP addresses, so I wrote a small Python script that gets those IPs and launches the JMeter master with a given test plan.
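
A minimal sketch of the idea could look like the following; it assumes the slaves’ internal IPs can be read from a file Scalr maintains on the instance (the /etc/aws/hosts/jmeter-slave path, like every other detail below, is an assumption rather than the original script):

```python
#!/usr/bin/env python
# Sketch of the master-side launcher, not the original script. It assumes Scalr
# maintains a file on the instance listing the internal IPs of the jmeter-slave
# role, one per line; adjust the path to however your Scalr version exposes peers.
import subprocess
import sys

SLAVES_FILE = '/etc/aws/hosts/jmeter-slave'  # assumption: Scalr-maintained peer list


def slave_ips():
    with open(SLAVES_FILE) as peers:
        return [line.strip() for line in peers if line.strip()]


def main():
    test_plan = sys.argv[1]
    remotes = ','.join(slave_ips())
    # -n: non-GUI mode, -t: test plan, -R: comma-separated list of remote (slave) hosts
    subprocess.call(['jmeter', '-n', '-t', test_plan, '-R', remotes])


if __name__ == '__main__':
    main()
```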

This worked pretty nicely for my simpler test plans (like the one that only GETs the home page), but as soon as I tried to POST (like during the login process) it was not enough. The thing is that the POST data JMeter uses is not stored in the test plan itself but in companion .binary files, and those files are not sent from the master to the slaves the way the test plans are.

I thus had to find a way to send those files over myself before launching the test plans. Rsync seemed the easiest way to do it, so I wrote another Python script to synchronize the slaves.
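
Here again, only a sketch under the same assumptions (the Scalr-maintained peer list, plus the /var/testing folder and farm.key described just below) rather than the original script:

```python
#!/usr/bin/env python
# Sketch of the synchronization script, not the original one. The farm key and the
# /var/testing folder come from the requirements listed below; the peer-list file
# and the root login are assumptions.
import subprocess

SLAVES_FILE = '/etc/aws/hosts/jmeter-slave'  # assumption: Scalr-maintained peer list
FARM_KEY = '/var/testing/farm.key'           # RSA key downloaded from the Scalr.net farm menu
TESTING_DIR = '/var/testing/'                # must already exist on the slaves


def slave_ips():
    with open(SLAVES_FILE) as peers:
        return [line.strip() for line in peers if line.strip()]


def main():
    for ip in slave_ips():
        # -a: archive mode, -z: compress; ssh with the farm key, no host key prompt
        subprocess.check_call([
            'rsync', '-az',
            '-e', 'ssh -i %s -o StrictHostKeyChecking=no' % FARM_KEY,
            TESTING_DIR,
            'root@%s:%s' % (ip, TESTING_DIR),
        ])


if __name__ == '__main__':
    main()
```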

The above script requires only three things:

  • a valid RSA private key (here /var/testing/farm.key), which you can download using the Scalr.net farm’s menu
  • the /var/testing folder must already exist on the slaves
  • and, of course, you need to initially get the files on the master. I use svn up.

Once you have prepared and tested everything, using one master and one slave, you can rebundle the instances you used and then start to spawn tens of slaves to stress your application.

If you have already done something similar or have ideas for improving my setup, do not hesitate to let me know in the comments :)

UPDATE: With the release of the Amazon Virtual Private Cloud, it should now be possible to have the slaves running in the cloud and the master running on your workstation, since they would all be in your own sub-network. However, you will need to find another way to synchronize the POST data with the slaves.