Sunday, July 07, 2013

RFC - DBdeployer : Bringing MySQL Sandbox to a new level

MySQL Sandbox is growing old

The MySQL Sandbox project has been around for 8 years, and it has gained considerable attention from the community. I have seen it mentioned in books and articles, used in other projects, and widely adopted by testers and bug reporters.
I have used it for more than testing, and it has saved me many hours of labor by allowing me to create database servers in a few seconds.
Yet, I have gathered a long list of grievances about it, both from my own experience and from other users' feedback. Here goes.

  • MySQL Sandbox is not easy to install. For people used to installing Perl modules, it feels natural. For experienced Perl users, installing it in user space without root access is feasible and even easy. But for the vast majority of users, it is often a nightmare. Even when the method is well understood, there are issues that baffle even the experts. Let’s see them:
    • You can have several Perl versions in your system. How do you get them? By just following the instructions to install some other project, which will silently install a competing Perl interpreter on your laptop.
    • Your operating system can upgrade Perl and mix up the paths where the module is installed.
    • In both the above cases, you may end up with MySQL Sandbox not being the latest version or the help not being available.
  • MySQL Sandbox is not intuitive to use. This is a consequence of the project being extended and stretched beyond its early releases, which were really arcane and difficult to get started with. As a result, I made the call to the sandboxing tools brief, but not really easy. It’s a defect that often happens when the designer and the developer are the same person. Now I know better, but there is more.
  • MySQL Sandbox is not easy to extend. This, too, is a consequence of the project having evolved from its initial roots. The first version of the application was a Swiss Army knife with many sharp blades and a tiny handle. With the latest releases I created an easy wrapper around the initial tool (which is now labeled as low-level-make-sandbox) at the price of using an awkward syntax. I made common operations really easy to use, and uncommon ones unnecessarily hard. But the worst consequence is that the features I wanted to develop are still on my wish list, because the code underneath is not flexible.

My wish list

Having used the project for almost everything that came my way during my job, I came to appreciate its versatility, but at the same time I wished I could do more to make the tool meet my needs. In the past years I have extended MySQL Sandbox with many tiny improvements, but the core remains the same. Here’s what I would like to do:

  • Deploy for production, which includes the ability to deploy safely with root access. You can do that now with MySQL Sandbox, but it is not a good fit, since it was designed on purpose for deployment in user space. If you want to create production-ready deployments and make them maintainable, there are many tasks that you should manage, which are taken care of when you use .rpm or .deb based deployments, but that are not that easy with a custom deployment.
  • Deploy remotely, which you can do now, using a shell script that I have added to the build, but it is kind of a hack: a wrapper on top of other wrappers. While it makes the deployment really easy, it does not do enough checking to guarantee that it will work well in most cases.
  • A GUI. I know MySQL Sandbox intimately. Yet, when I need to do something unusual, I need to look at the help, and sometimes at the docs, to remind me of what needs to be done. A web-based (or even a text-based) menu would make the application friendlier. The main obstacle to this is that the internal workflow has not been designed for interactivity.
  • Easy installation. This includes the ability to be installed through a package manager (apt-get, yum) or a language-specific manager (Perl CPAN, RubyGems, Python PyPI), but also a simple way of using it out of the box without installing at all. This feature (or lack thereof) is what makes current remote deployment so fragile.
  • Deploy on Windows. I don’t like Windows, and I don’t want to have anything to do with it, but I realize that for many users it is the only operating system available. So I would make the next application modular, in such a way that someone else can just derive classes from my abstract ones, and implement sandboxes in a different setup.
  • Make sandboxes for more than MySQL. Databases are all different. People have asked me to create sandboxes for Postgres, Drizzle, and Oracle, and I have declined to even try. But if the framework is abstract enough, it could allow subclasses that handle different operating systems and different database servers.
  • Create an API that can be used from a programming language instead of using the tools directly. This requires some clever redesign but it is feasible.
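The modular design mentioned in the last items of this list (abstract classes that others can subclass for a different operating system or database server) could be sketched roughly as follows. This is only an illustration under stated assumptions: the class names, the `plan` method, and the paths are hypothetical and are not part of any existing DBdeployer code. The sketch only builds command strings and does not run them.

```ruby
# Hypothetical sketch: class and method names are illustrative only,
# not an existing DBdeployer API.

# Abstract base class: it fixes the order of the deployment steps
# and leaves the content of each step to concrete subclasses.
class AbstractDeployer
  attr_reader :port, :data_dir

  def initialize(port:, data_dir:)
    @port     = port
    @data_dir = data_dir
  end

  # Template method: returns the commands to run, in order.
  def plan
    [init_command, start_command]
  end

  def init_command
    raise NotImplementedError, "#{self.class} must implement init_command"
  end

  def start_command
    raise NotImplementedError, "#{self.class} must implement start_command"
  end
end

# One subclass per database server; an operating-system variant
# could override the same two methods.
class MySQLDeployer < AbstractDeployer
  def init_command
    "mysql_install_db --datadir=#{data_dir}"
  end

  def start_command
    "mysqld --datadir=#{data_dir} --port=#{port}"
  end
end

class PostgresDeployer < AbstractDeployer
  def init_command
    "initdb -D #{data_dir}"
  end

  def start_command
    "postgres -D #{data_dir} -p #{port}"
  end
end

# The same calling code works for every subclass.
deployers = [
  MySQLDeployer.new(port: 5612, data_dir: "/tmp/sandbox/mysql"),
  PostgresDeployer.new(port: 5924, data_dir: "/tmp/sandbox/pg")
]
deployers.each { |deployer| puts deployer.plan }
```

Because callers only depend on `plan`, the same objects could also serve as the programmatic API wished for in the last item, with the command-line tools becoming a thin layer on top.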

Meet DBdeployer

The next incarnation of MySQL Sandbox is named DBdeployer. Its features so far include a development plan on GitHub and a dedicated domain (with a twin .com) that so far redirects to the MySQL Sandbox site.
As you can see from the development plan, there is quite a lot to do.

Why GitHub?

During the lifetime of MySQL Sandbox I have changed from Savannah to SourceForge to the current Launchpad.
For the work at my company, I have also used Google Code extensively. In each place I found good and bad parts, and I kept looking around for alternatives. The fastest growing project hosting service that I have noticed is GitHub, where I find all the main features I need.
What I loved in Launchpad was the ability to write blueprints with their dependencies. What I have realized later is that I can write a development plan using Test Driven Development (TDD) and/or Behavior Driven Development (BDD) for my working documents. If you look at the MySQL Sandbox code, you will see that the largest piece of the code base is made of tests, and yet I wrote most of those tests after writing the features, just because of how the application was designed. Given my love for testing, it makes sense that in my next project I prioritize testing by embracing a development method that combines tests, design, and development in one harmonious loop.
What does this have to do with GitHub? It has everything I need except a graphical blueprint designer, and since the design is going to be done with the BDD tests, GitHub, with its vibrant community and the tools that make production easy, is the tool that I have chosen.
I should also mention that I have grown tired of Bazaar, which comes with Launchpad. While I like its principles, it is really not as widespread or as well maintained as git. When git was a Linux-only application, it was not a good choice, but now not only is it universal, it also comes pre-installed in every major operating system (no, Windows is not one of them, but despite having mentioned it in my wish list, I still don’t care).

Why not Perl?

Going with GitHub is a hint that I won’t be using Perl for my next project. A few years ago I stated my intention of using Python for a new MySQL Sandbox, but then I changed jobs and I focused on other priorities. While I kept up to date with Python, I also rediscovered an old love: Ruby.
Ruby is everything that Perl wanted to be but could not. It has taken from Perl most of its core principles and made them extensible with a true object-oriented implementation, and with the latest release the language has gained a stability and expressiveness that strike a chord with my love for technology. Not only that, but there are testing techniques that the Ruby community has embraced and enhanced more than other language communities have, and since I am a QA developer at heart, Ruby sounds like a logical choice.
I could still use Perl, and GitHub, and BDD together. But let’s be frank: Perl’s object-oriented abilities are poor. While it has finally come up with a new OOP framework, it is still an addition that feels foreign to the language. Compared to Ruby’s terse OOP implementation (I love the everything-is-an-object paradigm), Perl feels old and awkward when I try to do some innovative design.

What, you may ask, is this fixation with OOP? It's an early infatuation of mine. I started using OOP in the early 1990s, with C++, before it became as useful as it is nowadays thanks to the Standard Template Library. And since C++ compilers were not widely available, I implemented my own OOP flavor using standard C, which allowed me to develop a whole database engine interpreter and to expand it with new features quite easily. Since then, I have always liked OOP. With Perl, I had to bargain. On the one hand, it allows me to save thousands of lines of code compared to C; on the other, it does not encourage using OOP. As a result, I wrote code faster, but not the kind of reusable code that with hindsight I would have liked to have produced.

So, where is the code?

Here’s the thing: there is no code yet. Although my fingers are itching to get started, I want to avoid getting stuck with a prototype that grows into a non-scalable application, and so I will put together as many BDD tests as I can before writing any code.
The first step was writing the development plan, which is more a wish list than a proper plan. In the same vein, there is a possible description of the application at Introducing DBdeployer, from which I excerpt a tentative dummy interface.

$ dbdeployer
dbdeployer - installs single and composite MySQL instances

dbdeployer [global options] command [command options] [arguments...]


--help                - Show this message
--hosts=hostname list - Install on one or more remote hosts (default: localhost)
-s, --[no-]sandbox    - Creates a sandboxed instance instead of a production-ready one
--version             - Display the program version

single      - Deploys a single instance
replication - Deploys a composite instance in replication
multiple    - Deploys a composite instance of isolated servers
check       - Checks if an instance is working correctly
clone       - Clone an instance
move        - Move an instance
remove      - Remove an instance
help        - Shows a list of commands or help for one command
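The calling convention shown above (global options first, then a command, then whatever belongs to the command) can be sketched with Ruby's standard `optparse` library. This is a minimal illustration, not the tool's implementation: the `parse_cli` helper is a hypothetical name, only two of the global options are handled, and the `--nodes=3` command option in the usage line is invented for the example.

```ruby
require "optparse"

# Command names copied from the tentative interface above.
COMMANDS = %w[single replication multiple check clone move remove help].freeze

# Hypothetical helper: parses the global options, then the command name,
# and leaves everything after the command untouched.
# Returns [options, command, remaining_arguments].
def parse_cli(argv)
  options = { hosts: ["localhost"], sandbox: false }

  parser = OptionParser.new do |opts|
    opts.banner = "dbdeployer [global options] command [command options] [arguments...]"
    # Array makes optparse split a comma-separated list of host names.
    opts.on("--hosts=LIST", Array, "Install on one or more remote hosts") do |list|
      options[:hosts] = list
    end
    # --sandbox and -s yield true, --no-sandbox yields false.
    opts.on("-s", "--[no-]sandbox", "Create a sandboxed instance") do |flag|
      options[:sandbox] = flag
    end
  end

  # order stops at the first non-option argument, which is the command;
  # options that follow it are left for the command to interpret.
  rest = parser.order(argv)
  command = rest.shift
  unless COMMANDS.include?(command)
    raise ArgumentError, "unknown command: #{command.inspect}"
  end

  [options, command, rest]
end

# Usage: --nodes=3 is a made-up command option, passed through unparsed.
options, command, args = parse_cli(%w[--hosts=db1,db2 -s replication --nodes=3])
puts "command: #{command}"
puts "hosts:   #{options[:hosts].join(', ')}"
puts "sandbox: #{options[:sandbox]}"
puts "args:    #{args.join(' ')}"
```

Stopping at the first non-option argument is what keeps global options and command options separate, so each command could later define its own parser without clashing with the global one.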

Next: requesting comments

The next step is collecting advice, comments, and wishes from users and seeing where it goes.
I would like to have a version 0.1 ready for MySQL Connect 2013, where I have a talk about running MySQL on your laptop (obviously it covers MySQL Sandbox). If the design goes well, I should be able to get a working application by mid-September. It probably won’t include all the features, but if I have an extensible framework and the implementation of the most basic features in place, I will be satisfied.
If you are interested in the future of MySQL Sandbox, read the development plan, and comment on this article with advice or wishes.


Thanks to my current company: Continuent, Inc (and the previous one: MySQL AB) for allowing me to keep maintaining MySQL Sandbox and its next revolutionary evolution. And thanks to all the people who have contributed with bug reports, advice, features, comments, reviews, and of course by using it and spreading it around.


Fernando said...

Giuseppe, thanks a lot for your work on this. I use mysql sandbox daily on my laptop. Even though I've been working with MySQL one way or the other since about 2000, I just can't imagine my workflow without a tool like this now!

I can understand the reasons for the replacement, and I think they're good. Anything that lowers the barrier of adoption of MySQL should be welcomed by the community.

Moritz said...

Just wanted to mention that I really appreciate your work! Keep up the good work! Really looking forward to dbdeployer.

Sheeri K. Cabral said...

This is great news! I hope our strong developer community will be able to give back to this project and make it a ton better.