Tag Archives: java

I Can Haz Java?

They announced it yesterday at the Google Campfire ’09 (here and here) and it is today on the Google App Engine blog: Java is now supported on Google App Engine!

It comes with a set of Eclipse plugins to test and deploy Java servlets, using JDO or JPA to support database access. Of course, the database behind this is BigTable, which means that a lot of relational features are not available, but it scales!

Go there to get you started, or, if you want to know if your preferred framework will play well with GAE, go to the “Will it play in App Engine” page.

That’s good news! Especially because we may start having more and more Java applications outside of the corporate walls.

Yes Google, YES!

XML schemas compatibility

Photo by psd

This is the fourth installment of this series about managing backward compatibility in software development. Here I talk about what makes an XML Schema backward incompatible.

I specifically address W3C XML Schemas but general principles applies regardless of the schema language you use.

But first, why bother about XML Schemas compatiblility?

Actually, in enterprise applications, XML is often used either to specify configuration files or interchange formats. With the rise of WebServices and RESTFull applications on the Internet there is also an increase in the use of XML.

Thus, making sure that existing configuration files still work with your new software or, more importantly,  that other applications can still communicate with yours can really make a difference.

So, what makes a schema incompatible?

  • Changing an element or attribute type to a more restricted type (like adding constraints on a xs:string)
  • Changing the order of a sequence in a complex element
  • Removing or renaming an element or attribute from a complex type
  • Adding a mandatory element or attribute to a complex type without providing a default

Removing complex or simple types will also make it incompatible if:

  • Your schema is included or imported by other schemas or
  • You do not replace them by compatible anonymous types (compatible meaning equivalent or less strict, e.g. if one defines a simple type JavaClass, which is a xs:string with a constraint, and replaces it with xs:string).

Then, how to preserve backward compatibility?

If some elements of the schema are becoming obsolete, do not remove them. Instead, mark them as deprecated in the schema documentation and, if applicable, remove their mapping to the object model (that way you will not have to maintain the code equivalent of the deprecated elements).

The best strategy I came across so far is using namespacing: If a given schema must be refactored, create a new one, changing its namespace (a good practice is to include the major version of the schema in the namespace).

You then have two options:

  1. provide an XSL stylesheet that enables the migration of XML documents from the old schema the new one
  2. provide support code to be able to read both document structures

Of course, the second solution is the most desirable from the operational point of view (and the first one is not always applicable). However, the trade-off is that it is more expensive from the development point of view. Once again, choosing between who is going to do the work (the guy who develops or the guy who installs your application) is a matter of project management.

Database schemas compatibility

by gnizrThis is the third post about software compatibility, the previous ones were talking about project management and bugs and this one deals with database schemas compatibility (I will deal with stored procedures in the chapters about code compatibility).

First of all, what does backward compatibility means when talking about the database?

  1. Being able to retain data stored in one schema into a new one.
  2. Preserving compatibility with external systems (like report engines) that may be accessing the database directly.

Point #1 is achieved through migration tools that update the database schema, in some cases such tools may be very tricky.

Point #2 is a bigger challenge. Changes that may break the database compatibility are:

  1. Removing a table or changing its name.
  2. Removing a column, changing its type (including its precision or length) or changing its name.
  3. Changing the semantic of a column (e.g. changing the valid values).
  4. Adding foreign keys.

In case #1 and #2, if such changes cannot be avoided, a good enough solution is to implement database views that mockup the old tables based on the new ones.

The thing is that for #2 you will need to rename the actual table which will force an update of the foreign keys in other tables and surely more code update than what was initially expected. Leaving an unused column in the table may be a better solution. As usual, this is a trade-off that should be discussed at the project level.

Point #3 is more tricky because it really depends on the change and the usage of the column. Most of the time transforming a “change” into a “remove and add new” will enable to refer to #2. Triggers can then be used to update the old column or it can just be left unused.

Point #4 is a problem when there are scripts that delete entries in a table. If all of a sudden there is a new foreign key that depends on this table then the script will fail, thus breaking the compatibility. I actually have no technical solution for this one. I think that only documentation can be given, but if any of you has an idea please share it with us :)

Nevertheless, one should recall to never do any incompatible change without a good enough reason.

About bugs and software compatibility

This is my second post about backward compatibility in software, the first one was dealing with the project management aspect of software compatibility, this one talks about bugs and how, sometimes, correcting a bug can break compatibility.

First of all, coming back to my previous post on the subject, deciding whether or not to break the backward compatibility of an application is a project management matter. The decision that correcting a bug will break compatibility must not be left solely to the developer, sometimes the company may decide that compatibility should be preserved even when it comes to bugs.

Raymond Chen, a well-known developer at Microsoft, has some good examples on his blog, The Old New Thing, to illustrate this. Raymond actually gives us a good insight at Microsoft policy concerning backward compatibility of its OS.

This post, from Joel Spolsky (another well-known ex-employee of MS) gives another good example with this leap-year-bug deliberately created for Excel/1-2-3 compatibility.

So, to make it short, when you correct a bug, incompatibilities can appear because:

  • Either the bug as been detected and a workaround as been put in place. This workaround will have to be removed once the bug is corrected.
  • Or this was not initially considered a bug and the behavior is going to unexpectedly change.

As an example, if an interface exchanges strings representing date and time and you later discover that the time zone is omitted from the format. If someone developed a parser for this date and time but never expected a time zone information, then the application will break. This is a semantic incompatibility, but one that is brought by a bug fix.

In the case where your management decided that bug for bug compatibility was not necessary, the incompatibility and its potential impacts should be documented in the migration release notes.

In the case where you have to maintain the bug to maintain the compatibility, I recommend you subscribe to Raymond Chen’s blog or stop writing bugs.

How to manage software compatibility

For most software companies the ability to ship new versions of a product that will preserve clients’ data and customizations is a matter of market share. Still, this is often an afterthought and there seems to be little documentation available.

This article is the first of a serie about managing backward compatibility in enterprise applications. This will not be a definitive guide but I will try to spot the common areas where incompatibilities can appear and give guidelines about managing them.

This first post is about the project management side of backward compatibility.

One of the most important thing to remember about backward incompatibility is that it is mostly a matter of process and project management.

In order to find the most accurate way of solving a compatibility issue you need to talk about it because the solution can be driven by technical, business or project considerations. Once a solution is accepted, the reason as to why this as been done that way must be properly advertised (this is of uttermost importance when only documentation is provided) and rolled-out.

As backward compatibility is a project concern it must be:

  1. Listed in the project risks list
  2. Considered at the project level
  3. Optionally considered at the product level (mostly when it has business impacts)

There are three ways to solve backward incompatibilities, they are listed from the most desirable to the one that requires the less developer work:

  1. Ensure binary compatibility – Work is done at the development’s level.
  2. Provide migration tools – Work is split between development and services but emphasis is put on development.
  3. Provide thorough documentation of incompatibilities and ways to overcome them – Work is split between development and services but emphasis is put on services
  4. Reject or postpone the change – Work is then at the product management level

Like for bugs, backward compatibility cannot be guaranteed at 100{5f676304cfd4ae2259631a2f5a3ea815e87ae216a7b910a3d060a7b08502a4b2}, the best thing a project manager can provide is a good measure of the risk upon it for a given version.

When a new version is released, incompatibilities, those that have not been foreseen or at least documented, must then be treated like any other bug and become part of the maintenance process.

In the following posts I will focus on what can make an application backward incompatible and give some guidelines in order to limit those issues and ensure binary compatibility.

See also Backward Compatibility on Wikipedia.

Fun with Java files encoding

Have you ever tried to write Java code with non-ASCII characters? Like having French method names?

The other day I stumbled upon Java classes written in French. Class names like “Opération”, methods names like “getRéalisateur” and embedded log messages and comments all the same.

At first you say “not common but cool” (and you start thinking about writing code in Chinese because your boss always wondered how we could forbid clients from decompiling our classes without using an obfuscator).

But cool it is not!

Why? Because of encoding!

Here is a quiz, what is the encoding those Java files were saved in?

  1. UTF-8 (after all this is how strings are encoded in the JVM)
  2. ASCII (come-on, everybody is writing code in English)
  3. MacRoman (why not?)

Just wonder for a while.

Answer is #3 because the Java IDE (Eclipse in this case) is by default using the platform encoding to save files. And those classes have been created on a Mac.

I actually had no problem reading and compiling them because I also use Eclipse on a Mac and because the Java compiler is also assuming the source files are in the platform encoding.

So what, nothing wrong then? Yeah, except the integration server is running on Ubuntu and sometimes I work on Windows as well. And on those platforms the default encoding is not MacRoman…

Something interesting is that it is always like that! I mean, even when you code in plain English there are chances that your IDE is going to write the files in the platform encoding. But nobody notices because as long as you only use characters in the ASCII-7 range, then they will be encoded the same in almost all encodings.

So what is the solution? Well it depends if you really want to code in French (or in Chinese). My advice anyway is “don’t do that” and externalize localized strings. However, if you really insist you have two solutions:

  1. Make the whole production chain encoding-explicit: Configure your IDE to use UTF-8 and specify in your build that the Java compiler is going to deal with UTF-8 encoded files (UTF-8 is better in most cases).
  2. Make sure you only use ASCII-7 characters in your files and replace all non-ASCII-7 characters with their \uXXXX equivalent (even in comments).

However, be aware that #1 is not always possible, you might be using processing tools that do not offer you the option to use something else than the platform encoding.

Have fun with encoding :)

Image Credits: Arite

What is a lightweight application server?

A colleague of mine just sent me a link to this article from Jeff Hanson on JavaWorld: Is Tomcat an application server?.

That’s funny because another colleague, yesterday during the lunch, asked us if instead of developing for a JEE container it would not be better to adopt a lightweight container like Tomcat? Using frameworks like Spring?

My answer was actually another question (as often): What is a lightweight container?

When frameworks like Spring and Hibernate started, their purpose was to add functionalities that did not exist or were badly designed: flows, inversion of control and injection or entity management. People were complaining about JEE and some switched to Tomcat plus Spring and Hibernate. Some of them did so because at this moment they did not need the other JEE services.

Hanson concludes his article with the following:

When attempting to determine the server environment best suited to a particular application or system, it is helpful to break down the requirements of the system and determine which Java EE components will need to be supported.

I could not agree more with this. However, requirements evolve and people switch to new projects but they usually continue to use the same frameworks.

The result is that when the need for new services increases (transaction, security, messaging, administration) the pressure on frameworks increases and they add those services to their stack because their clients ask for it and that is fun to code.

Then, what is the difference between a JEE server and Tomcat+Spring? I mean at which point a lightweight container is not lightweight anymore? When you add transactions? And in that case why not using JEE? Because it is JEE and it is said to be heavyweight?

My answer is to always use JEE when it offers the services you need. If it does not? Use something else but otherwise use JEE. Today if I was creating a new application I would not use Hibernate for entity persistence, I would use EJB3.

Windows Integrated Security and Java Web Applications

In my previous post I was explaining how to use an Active Directory server to authenticate a user. Indeed, I was trying to make the system authenticate the user using the Windows credentials that she already entered when logging onto her workstation.

Some years ago I was working with IIS and it was only a matter of configuration of the server to enable that for browsers that were supporting the appropriate protocol (others would be using HTTP basic).
One of the advantages of that protocol is that the user’s password is never sent over the wire. I found out this protocol is named SPNEGO and is an extension to the HTTP Negotiate protocol.

Since negotiation must occur between the browser and the server, if the server does not natively implement that protocol you cannot use the standard security APIs like custom registries or JAAS.
The solution is then to disable the server standard authentication mechanism and implement a filter that will negotiate, using SPNEGO, with the browser.

In the principle it looks easy but one still need to implement SPNEGO and bridge with Windows, because it’s Windows that finally authenticates the user.

After some goggling I found that the jCIFS library and its extension jCIFS-Ext have the necessary support to help me do the job. In fact everything is already there, even the filter: jcifs.http.AuthenticationFilter.

So first, let’s configure the security constraints for our web-app. In the web.xml we must have the following:

I do not define any role nor any authentication method because I don’t actually want the server to do the authentication by himself. Nevertheless, I define that I want confidentiality on those URLs.
I do that because I will configure my filter to fall-back to HTTP Basic if the browser does not support SPNEGO or HTTP Negotiate and I do not want the password to travel unencrypted on the net.

I hope this imply that if the application is not served over HTTPS there will be a problem, but I actually correctly configured my server to serve the application over HTTPS so I did not test this behaviour.

The second step is to configure the filter itself, the jCIFS-Ext filter has undocumented parameters so I had to go through the code to find them:

“Et voilà“, now your application should automatically authenticate the user based on its Windows credentials. I said should because there are some prerequisites:

  • on the browser side, Windows integrated security must be enabled
  • on the server side your platform must actually support Kerberos for the filter to properly work.

However, the former is a matter of configuration and the latter is a matter of slightly changing the code of the filter.

Configuring an Internet Explorer Browser

To configure an Internet Explorer browser to use Windows authentication, follow these procedures in Internet Explorer:

  1. Configure Local Intranet Domains
    1. In Internet Explorer, select Tools > Internet Options.
    2. Select the Security tab.
    3. Select Local intranet and click Sites.
    4. In the Local intranet popup, ensure that the “Include all sites that bypass the proxy server” and “Include all local (intranet) sites not listed in other zones” options are checked.
    5. Click Advanced.
    6. In the Local intranet (Advanced) dialog box, add all relative domain names that will be used for Integrator server instances participating in the SSO configuration (for example, myhost.example.com) and click OK.
  2. Configure Intranet Authentication
    1. Select Tools > Internet Options.
    2. Select the Security tab.
    3. Select Local intranet and click Custom Level…
    4. In the Security Settings dialog box, scroll to the User Authentication section.
    5. Select Automatic logon only in Intranet zone. This option prevents users from having to re-enter logon credentials, which is a key piece to this solution.
    6. Click OK.
  3. Verify the Proxy Settings (If you have a proxy server enabled)
    1. Select Tools > Internet Options.
    2. Select the Connections tab and click LAN Settings.
    3. Verify that the proxy server address and port number are correct.
    4. Click Advanced.
    5. In the Proxy Settings dialog box, ensure that all desired domain names are entered in the Exceptions field.
    6. Click OK to close the Proxy Settings dialog box.
  4. Set Integrated Authentication for Internet Explorer 6.0 (In addition to the previous settings, one additional setting is required if you are running Internet Explorer 6.0)
    1. In Internet Explorer, select Tools > Internet Options.
    2. Select the Advanced tab.
    3. Scroll to the Security section.
    4. Make sure that Enable Integrated Windows Authentication option is checked and click OK.
    5. If this option was not checked, restart the computer.

Despite all of this configuration I encountered some cases where this was not working at all in IE and I was unable to spot the problem, so you might be falling into this category. The symptoms are that the negociation process takes place but the browser does not answer the last challenge and no error message is displayed at all.

Configuring a Mozilla Firefox Browser

To configure an Mozilla Firefox browser to use Windows authentication, follow these procedures in Mozilla Firefox:

  1. Type about:config in the address bar of the browser and press return (a big list of properties should be displayed in the browser window).
  2. Type “network” in the filter box.
  3. Double-click on the network.automatic-ntlm-auth.trusted-uris property and enter “mydomain.com” (if there is already a value you can add a comma to separate both entries)

The value for this preference is a comma-separated list of URI fragments. This sample string shows the three legal kinds of fragments: https://, http://www.example.com, test.com

The first fragment says, “Trust all URLs with an https scheme.” The second fragment (a full URL) says, “Trust this particular web site.” The third fragment is interpreted to mean http://anything.test.com, so any web site that is a subdomain of test.com, including test.com itself, will also be trusted.

I did not encounter any problem with Firefox which is what I call a paradox…

Changing the filter to use NTLM instead of Kerberos

Actually the change must not occur in the filter but in the class jcifs.spnego.Authentication which comes with jCIFS-Ext. This class tries to determine if the system supports Kerberos but uses introspection, looking for some Java classes that enable Kerberos support in Java.
Nevertheless, those classes can be there without the actual system supporting Kerberos (which is the case where I work).

Fortunately, modifying the behaviour is not too much complicated, just change line 57 of this class:

to the following:

And then the filter will use NTLM instead of Kerberos.

I hope the next posts will be shorter :-P

ActiveDirectory authentication in Java

Recently I came into trying to authenticate users of an intranet web-application against the ActiveDirectory server that is used to authenticate them on their Windows desktop. Here is some code I used to achieve this.

I went into several steps, the first of them being creating a custom user registry to interface my web server and the AD server.

I was using Jetty as the web container so I had to develop an implementation of Jetty’s UserRealm but in any other web container or application things should be the same.
Mostly you need to do two things:

  1. Authenticate the user’s credentials
  2. Retrieve the user’s roles

1. Authenticating the user

Once you have created the initial context, the user has been authenticated by the AD server and everything is fine (creating the inital context will throw a NamingException otherwise).

However, since you are going to send the user’s credentials over the network you may want to have some confidence in the protocole that is used to negociate the connection. The javax.security.sasl.qop and other properties may be set to ensure that the protocole is safe.

This code adds the domain name to the username, that way the user doesn’t have to enter domain\username as its credentials but only its username.
You may want to force her to enter the domain or do some autodetection… as you like.

2. Retrieving the user’s roles

The roles that are returned are distinguished names, like cn=Joe Smith,ou=Sales,dc=mydomain,dc=com so that may be another issue to map them to simpler names. Fortunately I didn’t need these roles for my application.

The second step for me was to actually enable single sign-on (authentication without asking for credentials).

I quickly discovered that the previous code was totally useless for that purpose. But I keep this one for a later post ;-)