Thursday, September 08, 2011

The happiness of failing installations

When you set up the same software several times (for yourself or for your customers), you want that software to install quickly and reliably, and you are generally happy when everything works as expected.
In this context, a failing installation is one where the installation process exits unexpectedly, and you are left with an error message and the prospect of looking at the manual to find out what went wrong.

A failing installation is unpleasant, you'd say, and I concur. But do you know what's more unpleasant than a failing installation? It's an installation that succeeds, only to fail silently the first time you try using the application.

Looking at this enhanced definition, it is no surprise that I claim to find happiness in failure. And I have practical reasons for my claim. When I first tried installing Tungsten Replicator, the installation succeeded. And to my chagrin, the application did not work. I had to dig the reason out of the logs, and from that reason I had to figure out what I had done wrong. For example, the log might say "file not found mysql-bin.000003", and from that piece of information I had to figure out that I had forgotten to make the binary logs directory group readable, so that the 'tungsten' user could see the logs.
But a "successful" installation with a later failure often meant that a clean shutdown was not possible, and then I had to become an expert at cleaning up messy installations.
The next installation might get past the failure point, and possibly fail (again silently) for a different reason. Sometimes, I had to install four or five times before I reached a working and stable point. And then I'd install on another server, make a different mistake (or forget to apply the cure for a known mistake), and the stream of successful installations with hidden failures would continue for a while.

With the above reminiscences in mind, I am very happy to report that now you can install Tungsten Replicator with the near assurance that if something is wrong, the installation does not start, and you are given a clear list of what needs fixing.
The installer runs a long list of validation probes, and it doesn't stop at the first validation failure. It will try its best to tell you what you should do to reach a satisfactory installation, giving you a detailed list of everything that doesn't match up.
Not only that: the installer checks the requirements on all the servers in your intended cluster, and the installation does not start anywhere until you meet all the requirements on all the servers.
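
To make the idea concrete, here is a minimal sketch in Python of the "check everything, everywhere, before touching anything" approach. It is not Tungsten's actual installer code: the names (`validate_cluster`, `install`, the two probes, the `binlog_dir` and `user` settings) are hypothetical, and for simplicity the directory check runs on the local machine, whereas a real installer would run it on each server.

```python
import os
import stat
from typing import Callable, Dict, List, Optional

# A probe inspects one host's settings and returns None on success,
# or a human-readable description of the problem. (Hypothetical design.)
Probe = Callable[[dict], Optional[str]]


def check_user_defined(host: dict) -> Optional[str]:
    """Fail if no installation user was given for this host."""
    if not host.get("user"):
        return "no installation user defined"
    return None


def check_binlog_dir_group_readable(host: dict) -> Optional[str]:
    """Fail if the binary logs directory is missing or not group readable."""
    path = host["binlog_dir"]
    if not os.path.isdir(path):
        return f"binary logs directory {path} does not exist"
    mode = os.stat(path).st_mode
    # A directory needs both group read and group execute to be listed.
    if not (mode & stat.S_IRGRP and mode & stat.S_IXGRP):
        return f"binary logs directory {path} is not group readable"
    return None


PROBES: List[Probe] = [check_user_defined, check_binlog_dir_group_readable]


def validate_cluster(hosts: Dict[str, dict],
                     probes: List[Probe]) -> Dict[str, List[str]]:
    """Run every probe on every host; never stop at the first failure."""
    failures: Dict[str, List[str]] = {}
    for name, host in hosts.items():
        problems = [msg for probe in probes
                    if (msg := probe(host)) is not None]
        if problems:
            failures[name] = problems
    return failures


def install(hosts: Dict[str, dict], probes: List[Probe],
            do_install: Callable[[str], None]) -> Dict[str, List[str]]:
    """Install on all hosts only if no host reported any problem."""
    failures = validate_cluster(hosts, probes)
    if failures:
        return failures  # nothing has been installed anywhere
    for name in hosts:
        do_install(name)
    return {}
```

The point of the design is that `install` returns the complete list of problems for every server in one pass, and `do_install` is never called for any host while a single problem remains.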

That's why, when my installation fails, I feel very happy, knowing that I won't have to clean up a messy server, and when I fix the problem that made the installation fail, my application will most certainly work.
