The Data Charmer: 2011

Monday, December 19, 2011

Testing new builds with MySQL-Sandbox 3.0.24

MySQL::Sandbox 3.0.24 was released yesterday, with many new features.

More than vanilla MySQL

If you have missed my previous announcement, here's the gist of it. MySQL Sandbox can now deal with tarballs from either Percona Server or MariaDB. The main difference after this change is that you can now create a directory called <PREFIX>5.5.16 and make_sandbox will recognize it as well as the plain 5.5.16.

$ make_sandbox --export_binaries --add_prefix=ps \
   Percona-Server-5.5.11-rel20.2-114.Darwin.i386.tar.gz  \
   -- --sandbox_directory=msb_ps5_5_11
unpacking Percona-Server-5.5.11-rel20.2-114.Darwin.i386.tar.gz
[…]
installing with the following parameters:
upper_directory                = /Users/gmax/sandboxes
sandbox_directory              = msb_ps5_5_11
[…]
basedir                        = $HOME/opt/mysql/ps5.5.11
tmpdir                         = 
[…]
Your sandbox server was installed in $HOME/sandboxes/msb_ps5_5_11

After the binary export, subsequent installations will be easier:

$ make_sandbox ps5.5.11

The same commands can be used for MariaDB. At the moment, make_sandbox does not recognize other packages, but adding them should not be a big deal, provided that such packages look like MySQL. It wouldn't work with Drizzle, because it lacks the main ingredients for MySQL installation.

High Performance sandboxes

While testing parallel replication and prefetch slaves with Tungsten Replicator, I realized that I was doing too much manual fiddling with my scripts. Since I need more performant servers, I added the basic items that I need to modify to enable a faster server. Now, using the '--high_performance' option with make_sandbox, you get a server that is much better than out-of-the-box MySQL. To avoid problems with too much RAM, I am using a default of 512 MB for InnoDB, which is not enough for really demanding tests, but at least it is a good placeholder in the sandbox configuration file, should you need to modify it.

$ make_sandbox 5.1.60 -- --high_performance
[…]
innodb-flush-method=O_DIRECT ; \
innodb-log-file-size=50M ; \
innodb_buffer_pool_size=512M ; \
max_allowed_packet=48M ; \
max-connections=350 ; \
innodb-additional-mem-pool-size=50M ; \
innodb-log-buffer-size=50M ; sync_binlog=0 ; \
innodb-thread-concurrency=0 ; log-error=msandbox.err
[…]

Standalone masters and slaves

MySQL Sandbox has had the ability of creating replicated systems for years. Yet, sometimes you need a stand-alone master server that you want to use for some odd experiment. Similarly, you may want to create a slave of a specific master without having a full replication system. One case where you would like this ability is when you want to try replicating between servers of different versions.

$ make_sandbox 5.1.57 -- --master
[…]
my_clause                      = server-id=5157 ; log-bin=mysql-bin ; log-error=msandbox.err
[…]
Your sandbox server was installed in $HOME/sandboxes/msb_5_1_57

$ make_sandbox 5.5.10 -- --slaveof='master_port=5157' 
[…]
my_clause                      = server-id=5510 ; log-bin=mysql-bin ; log-error=msandbox.err
[…]
Your sandbox server was installed in $HOME/sandboxes/msb_5_5_10


$ ~/sandboxes/msb_5_1_57/use -e 'show master status'
+------------------+----------+--------------+------------------+
| File             | Position | Binlog_Do_DB | Binlog_Ignore_DB |
+------------------+----------+--------------+------------------+
| mysql-bin.000001 |      106 |              |                  |
+------------------+----------+--------------+------------------+

$ ~/sandboxes/msb_5_5_10/use -e 'show slave status\G'
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: 127.0.0.1
                  Master_User: rsandbox
                  Master_Port: 5157
                Connect_Retry: 60
              Master_Log_File: mysql-bin.000001
          Read_Master_Log_Pos: 106
               Relay_Log_File: mysql_sandbox5510-relay-bin.000002
                Relay_Log_Pos: 252
        Relay_Master_Log_File: mysql-bin.000001
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
              Replicate_Do_DB: 
          Replicate_Ignore_DB: 
           Replicate_Do_Table: 
       Replicate_Ignore_Table: 
      Replicate_Wild_Do_Table: 
  Replicate_Wild_Ignore_Table: 
                   Last_Errno: 0
                   Last_Error: 
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 106
              Relay_Log_Space: 420
              Until_Condition: None
               Until_Log_File: 
                Until_Log_Pos: 0
           Master_SSL_Allowed: No
           Master_SSL_CA_File: 
           Master_SSL_CA_Path: 
              Master_SSL_Cert: 
            Master_SSL_Cipher: 
               Master_SSL_Key: 
        Seconds_Behind_Master: 0
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 0
                Last_IO_Error: 
               Last_SQL_Errno: 0
               Last_SQL_Error: 
  Replicate_Ignore_Server_Ids: 
             Master_Server_Id: 5157

You can download MySQL::Sandbox from either launchpad or CPAN.

Saturday, December 10, 2011

How to submit a good database bug report

When an open source project becomes popular, bug reports start flocking in. This is both good and bad news for the project developers. The good news is that someone is using the product, and they are finding ways of breaking it that we didn't think of. The bad news is that most of the times the reporters assume that the developers have super human powers, and that they will find what's wrong by the simple mentioning that a given feature is not working as expected. Unfortunately, it doesn't work that way. An effective bug report should have enough information that the ones in charge will be able to reproduce it and examine in lab conditions to find the problem. When dealing with databases and database tools, there are several cases, from simple to complex. Let's cover them in order.

Installation issues

This is often a straightforward case of lack of functionality. When a tool does not install what it is supposed to, it is a show stopper, and a solution must be found. In this case, it's often enough to state the environment where the installation is happening (operating system, version of the tool, version of core components such as the MySQL database used) and the command used to start the installation. The error message could be an expected failure, when the installation procedure checks for requirements and fails if they are not met. For example: "Missing Ruby interpreter". The message tells (or suggests) you what to do. Filing a bug report on an expected failure is a waste of time. You should install the missing part and try again. Even if the message is about an unexpected failure (e.g. you get a stack trace from Ruby or Java), usually the first message tells you enough to be able to find a workaround. For example, if you get an exception from Ruby complaining about a missing 'curl' command, you can file a bug report to ask for the installer to check for 'curl' in the requirements, but if you install 'curl' yourself, the installation should continue.

Simple database issues

Reporting a database bug means complaining that the DBMS is not behaving as advertising by the documentation, or as common usage dictates. If it is a missing or misbehaving functionality, the best way of showing the problem is by starting with an empty DBMS, followed by the creation of the objects needed to reproduce the issue (CREATE SCHEMA, TABLE, INDEX) and by a minimal amount of data that triggers the problem. Some information about what operating system and database version was used is probably necessary to reproduce the problem consistently.

Simple database replication issues

By simple replication we mean a vanilla master/slave topology. In this scheme, data inserted in the master will eventually end up in the slave. Bugs in this category may fail to replicate the data totally or partially, or they may cause a break in the replication flow. Reproducing them is almost as easy as with simple database bugs. If you start with an empty system and manage to reproduce the error with a short sequence of commands, it should probably reproducible by a third party. Sometimes, settings in the master and the slave are essential to reproduce the problem. In MySQL, the format of binary logs, the default database engine and SQL modes can affect replication and produce different results with the same stream of SQL commands.

Complex database replication issues

The most difficult bugs to report are the ones where the error shows up only in a given topology. While MySQL native replication offers only few options to pipe data around (single, circular, hierarchical), Tungsten replicator allows a rich set of combined pipelines that can change the outcome of a data change event, depending on the originating node and the direction it took. In these case, information o how the cluster was installed becomes essential.

Concurrency issues

This is one of the most difficult bugs to report. When an error happens only because of the contemporary action of two or more threads, there is no easy way of reporting it in a way that it can be easily reproduced. Three methods are possible:

Describe the action of the first thread, then mark the change of thread and describe the actions of the second thread, continuing in this way until you reach the error point.
If you are a developer and feel comfortable with multi thread applications, write a script that reproduces the error by running several threads (Perl, Python, and Ruby offer the best environment for this kind of tests).
If the database offers a tool to write such multi-threading tests, consider using it.

Heavy load issues

This is a more complex case of the above one. Not only you need concurrency, but a lot of it happening at the same time. Reproducing this kind or error is challenging. If you have a support agreement with the provider of the database or the tool, you may let the support engineer have a look at your running environment, to find some clues. But even in this case, the support engineers or yourself need to ultimately reproduce the case in such a way that a developer can fix the problem and test the fix. There are two methods to report this problem:

Simplification: if you can reduce the concurrency to the elements that are misbehaving, the methods for concurrency issues will apply also in this case.
Enabling a query log could lead to identifying the sequence of events that have generated the error. The log should be integrated by the DDL of the objects involved in the action (schemas, tables, triggers, views, etc).

Large data issues

If your error only shows itself with large data, there is often a logistical problem, as you can't easily provide gigabytes of data, even if there privacy and security issues weren't in the way (which usually are). There are three strategies that you can use to report such bugs:

If only the size matters, then describing the kind of data used could be enough. E.g. When a table with such fields and such indexes grows beyond X GB, then the optimizer warp drive explodes. (Don't try this at home)
Create a script to generate the data that will ultimately trigger the error. This method requires both skills and an understanding of what the error is.
You may use a publicly available database to reproduce the error (for example, the Sample database with test suite.) Just mention in the bug report where to find the database, eventually how to load it if it is not simple, and then describe the steps needed to reproduce the bug after loading it.

DOs and DONTs

DO
- Search the bug repository and the mailing lists (or forums) before submitting yours. Someone may have had the same problem before you did. (Thanks, Robert Hodges, for this important reminder.)
- Put yourself in the receiver's position, and try to reproduce the problem from a clean initial state.
- If there is a workaround, mention it: it might give a good clue to the developers.
- More information is better. Anything that can improve the identification of the bug root cause will be welcome. (But don't overdo: see below)
- If you feel that a missing feature should be useful, report it as a feature request. (Even better: suggest a patch)

DON'T
- Don't report a missing feature as a bug, unless the docs say that the feature should be there.
- Don't just send the error message without the events that generated it.
- Don't include SQL commands embedded in code.
- Don't say "my application doesn't work anymore," assuming it's the database's fault. Remember The First Rule of Programming: It's Always Your Fault
- Less is better. If there is a long way and a short way of reproducing the bug, the shorter one is better. Don't send more info just for the sake of it.
- Don't tell the developers that they are retarded. This will not increase the priority given to your bug, or your credibility.

Friday, December 02, 2011

Call for participation for MySQL events - MySQL conference and FOSDEM

It's that time of the year where MySQL would be speakers are called to action. As usual, the main event is the MySQL spring conference, this year hosted by Percona. The call for participation to the MySQL Conference And Expo 2012 is still open until December 5th. To submit a proposal, you should register as a speaker and then fill in the form.
There is a conference committee which is already busy evaluating the proposals that have been submitted so far. The committee is demanding (I know for a fact, since I am in it!) and therefore, if you want to submit something, be very critical with yourself and polish your proposal as if your job depended on it!
Please read an update on Percona Live MySQL Conference & Expo 2012.

Speakers in Europe have some more duty, though. This is also the time to submit talks for FOSDEM MySQL DevRoom. Thanks to Frédéric Descamps and Sergey Petrunia, we have Room H.1309 with 150 seats on Sunday 5th February 2012, all day. The deadline to submit a talk proposal is December 26th. There is no review committee. Like we did on past editions, as soon as the talks are submitted, we will ask everyone to vote on the talks via Twitter or email. More updates will come soon. Of course, participation to FOSDEM DevRoom is not limited to European speakers. There have been several brave speakers who have willingly crossed the pond to offer their services at European conferences before, and they are welcome to repeat the experience. Submit your talk proposals now.

Thursday, December 01, 2011

Never say "there is no way"

Reading a recent MySQL book, I saw an example of SHOW CREATE TABLE that comes with backticks (`) around the table and column names, and a comment:

Unfortunately, there is no way to remove this from generated syntax with this command.

(Emphasis mine).
Here's how it goes:

mysql> show create table mytest\G
*************************** 1. row ***************************
       Table: mytest
Create Table: CREATE TABLE `mytest` (
  `id` int(11) NOT NULL,
  `description` varchar(50) DEFAULT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1
1 row in set (0.00 sec)

Of course, there is a way!

mysql> pager tr -d '`'
PAGER set to 'tr -d '`''
mysql> show create table mytest\G
*************************** 1. row ***************************
       Table: mytest
Create Table: CREATE TABLE mytest (
  id int(11) NOT NULL,
  description varchar(50) DEFAULT NULL,
  PRIMARY KEY (id)
) ENGINE=InnoDB DEFAULT CHARSET=latin1
1 row in set (0.00 sec)

Never say "there is no way!"

Instead of "tr -d '`'", you can use "sed -e 's/`//g'", which does the same thing.

If you are running the query at the command line, you may use the pipe directly:

$ mysql -e 'show create table test.mytest\G' | tr -d '`'
*************************** 1. row ***************************
       Table: mytest
Create Table: CREATE TABLE mytest (
  id int(11) NOT NULL,
  description varchar(50) DEFAULT NULL,
  PRIMARY KEY (id)
) ENGINE=InnoDB DEFAULT CHARSET=latin1

Friday, November 04, 2011

Replication stars

Working with replication, you come across many topologies, some of them sound and established, some of them less so, and some of them still in the realm of the hopeless wishes. I have been working with replication for almost 10 years now, and my wish list grew quite big during this time. In the last 12 months, though, while working at Continuent, some of the topologies that I wanted to work with have moved from the cloud of wishful thinking to the firm land of things that happen. My quest for star replication starts with the most common topology. One master, many slaves.

Fig 1. Master/Slave topology

Legend

It looks like a star, with the rays extending from the master to the slaves. This is the basis of most of the replication going on mostly everywhere nowadays, and it has few surprises. Setting aside the problems related to failing over and switching between nodes, which I will examine in another post, let's move to another star.

Fig 2. Fan-in slave, or multiple sources

The multiple source replication, also known as fan-in topology, has several masters that replicate to the same slave. For years, this has been forbidden territory for me. But Tungsten Replicator allows you to create multiple source topologies easily. This is kind of uni-directional, though. I am also interested in topologies where I have more than one master, and I can retrieve data from multiple points.

Fig 3. all-to-all three nodes

Fig 4. All-to-all four nodes

Tungsten Multi-Master Installation solves this problem. It allows me to create topologies where every node replicates to every other node. Looking at the three-node scheme, it appears a straightforward solution. When we add one node, though, we see that the amount of network traffic grows quite a lot. The double sided arrows mean that there is a replication service at each end of the line, and two open data channels. When we move from three nodes to four, we double the replication services and the channels needed to sustain the scheme. For several months, I was content with this. I thought: it is heavy, but it works, and it's way more than what you can do with native replication, especially if you consider that you can have a practical way of preventing conflicts using Shard Filters. But that was not enough. Something kept gnawing at me, and from time to time I experimented with Tungsten Replicator huge flexibility to create new topologies. But the star kept eluding me. Until … Until, guess what? a customer asked for it. The problem suddenly ceased to be a personal whim, and it became a business opportunity. Instead of looking at the issue in the idle way I often think about technology, I went at it with practical determination. What failed when I was experimenting in my free time was that either the pieces did not glue together the way I wanted, or I got an endless loop. Tungsten Replicator has a set of components that are conceptually simple. You deploy a pipeline between two points, open the tap, and data starts flowing in one direction. Even with multiple masters replication, the principle is the same. You deploy many pipes, and each one has one purpose only.

Fig 5. All-masters star topology

In the star topology, however, you need to open more taps, but not too many, as you need to avoid the data looping around. The recipe, as it turned out, is to create a set of bi-directional replication systems, where you enable the central node slave services to get changes only from a specific master, and the slave services on the peripheral nodes to accept changes from any master. It was as simple as that. There are, of course, benefits and drawbacks with a star topology, compared to a all-replicate-to-all design. In the star topology, we create a single point of failure. If the central node fails, replication stops, and the central node needs to be replaced. Instead, the all-to-all design has no weaknesses. Its abundance of connections makes sure that, if a node fails, the system continues working without any intervention. There is no need for fail-over.

Fig 6. extending an all-to-all topology

Fig 7. Extending a star topology

However, there is a huge benefit in the node management. If you need to add a new node, it costs two services and two connections, while the same operation in the all-to-all replication costs 8 services and 8 connections. With the implementation of this topology, a new challenge has arisen. While conflict prevention by sharding is still possible, this is not the kind of scenario where you want to apply it. We have another conflict prevention mechanism in mind, and this new topology is a good occasion make it happen. YMMV. I like the additional choice. There are cases where a all-replicate-to-all topology is still the best option, and there are cases where a star topology is more advisable.

Tuesday, October 18, 2011

Tungsten Replicator and MySQL Sandbox at Percona Live London 2011

I will be a speaker at Percona Live - London 2011, and I am looking forward to the event, which is packed with great content. A whopping 40 session of MySQL content, plus 3 keynotes and 14 tutorials. It's enough to keep every MySQL enthusiast busy. Continuent speakers will be particularly busy, as between me and Robert Hodges, we will be on stage four times on Tuesday, October 25th.

Robert, at 10:15am: Teaching an Old Dog New Tricks: Tungsten Enterprise Clusters for MySQL, Bishopsgate Suite
Robert, at 1:30pm: MySQL Parallel Replication in 5 Minutes or Less, Newgate Suite
me, at 2:30pm: MySQL Replication outside the box : multiple masters, fan-in, parallel apply, Ludgate Suite
me, at 4:30pm: MySQL Sandbox: a framework for productive laziness,Ludgate Suite

This event feels good from the beginning. There are plenty of participants, many names from all over the MySQL community, covering large and small companies, experienced speakers, well known names in the MySQL engineering arena, and a wealth of topics that will make me feel sorry for not being able to attend them all. It's the usual dilemma that attendees have at this kind of conferences. Not so much at Oracle Open World 2011, where there weren't that many MySQL sessions to choose from, although it was great for networking.

Our talks

Robert will open the dances with Teaching an Old Dog New Tricks: Tungsten Enterprise Clusters for MySQL, a talk about Tungsten Enterprise, my company's commercial product, which is a professional managing tool for demanding companies. Robert, again in the afternoon, with one of the most amazing features of our open source product, Tungsten Replicator: MySQL Parallel Replication in 5 Minutes or Less. This is a feature for large replication systems where the slave can't cope with large data streams, due to the singled-thread MySQL slave. This talk will show how easy is it to plug Tungsten Replicator to a lagging slave, start parallel replication until the lag has been zeroed, and then hand over the control to the native replication again. Then it will be my turn, with a general presentation about Tungsten Replicator, the open source product. I like the idea of calling it MySQL Replication outside the box : multiple masters, fan-in, parallel apply. The reasoning is that MySQL replication, although wildly successful in the web economy of the last decade, it is also constrained by several limits, which Tungsten, acting outside the boundaries, sets free. This will be a quick intro to Tungsten and its new user-friendly installation, with a few demos. Finally, a classic presentation with some new content, on MySQL Sandbox: a framework for productive laziness. The news is that MySQL Sandbox now supports Percona and MariaDB builds. Again, some demos will be shown, with old and new features mixed together.

Thursday, September 22, 2011

Upgrading Tungsten Replicator: as easy as ...

When I talked about the usability improvements of Tungsten Replicator, I did not mention the procedure for upgrading. I was reminded about it by a question in the TR mailing list, and since the question was very relevant, I updated the Tungsten Cookbook with some quick upgrading instructions. A quick upgrading procedure is as important as the installer. Since we release software quite often, either because we have scheduled features to release or because of bug fixes, users want to apply a new release to an existing installation without much fuss. You can do the upgrade with a very quick and painless procedure. Let's suppose that you have installed one Tungsten Replicator cluster using this command:

#
# using tungsten-replicator 2.0.4
#
TUNGSTEN_HOME=/home/tungsten/installs/master_slave
./tools/tungsten-installer \
  --master-slave \
  --master-host=r1 \
  --datasource-user=tungsten \
  --datasource-password=secret \
  --service-name=dragon \
  --home-directory=$TUNGSTEN_HOME \
  --cluster-hosts=r1,r2,r3,r4 \
  --start-and-report

If you want to upgrade to the very latest Tungsten Replicator 2.0.5, build 321, this is what you need to do.

Get the latest tarball, and expand it;
Stop the replicator;
Run the update command (this will also restart the replicator)
Check that the replicator is running again.

The actual upgrade command is in bold in the following script.

#
# using tungsten-replicator 2.0.5-321 (get it from bit.ly/tr20_builds) 
#
TUNGSTEN_HOME=/home/tungsten/installs/master_slave
HOSTS=(r1 r2 r3 r4)
for HOST in ${HOSTS[*]} 
do 
   ssh $HOST $TUNGSTEN_HOME/tungsten/tungsten-replicator/bin/replicator stop 
   ./tools/update --host=$HOST --user=tungsten --release-directory=$TUNGSTEN_HOME -q 
   $$TUNGSTEN_HOME/tungsten/tungsten-replicator/bin/trepctl -host $HOST services
done

One benefit of this procedure, in addition to being brief and effective, is that the previous binaries are preserved. Before the upgrade, you will see:

$ ls -lh ~/installs/master_slave/ ~/installs/master_slave/releases
/home/tungsten/installs/master_slave/:
total 32K
drwxrwxr-x 3 tungsten tungsten 4.0K Sep 22 22:03 backups
drwxrwxr-x 2 tungsten tungsten 4.0K Sep 22 22:03 configs
drwxrwxr-x 3 tungsten tungsten 4.0K Sep 22 22:03 relay
drwxrwxr-x 4 tungsten tungsten 4.0K Sep 22 22:06 releases
drwxrwxr-x 2 tungsten tungsten 4.0K Sep 22 22:03 service-logs
drwxrwxr-x 2 tungsten tungsten 4.0K Sep 22 22:03 share
drwxrwxr-x 3 tungsten tungsten 4.0K Sep 22 22:03 thl
lrwxrwxrwx 1 tungsten tungsten   75 Sep 22 22:06 tungsten -> /home/tungsten/installs/master_slave/releases/tungsten-replicator-2.0.4
/home/tungsten/installs/master_slave/releases:
total 8.0K
drwxr-xr-x 6 tungsten tungsten 4.0K Sep 22 22:03 tungsten-replicator-2.0.4

The 'tungsten' directory is a symlink to the actual binaries inside the 'releases' directory. After the upgrade, the same directory looks like this:

ls -lh ~/installs/master_slave/ ~/installs/master_slave/releases
/home/tungsten/installs/master_slave/:
total 32K
drwxrwxr-x 3 tungsten tungsten 4.0K Sep 22 22:03 backups
drwxrwxr-x 2 tungsten tungsten 4.0K Sep 22 22:03 configs
drwxrwxr-x 3 tungsten tungsten 4.0K Sep 22 22:03 relay
drwxrwxr-x 4 tungsten tungsten 4.0K Sep 22 22:06 releases
drwxrwxr-x 2 tungsten tungsten 4.0K Sep 22 22:03 service-logs
drwxrwxr-x 2 tungsten tungsten 4.0K Sep 22 22:03 share
drwxrwxr-x 3 tungsten tungsten 4.0K Sep 22 22:03 thl
lrwxrwxrwx 1 tungsten tungsten   75 Sep 22 22:06 tungsten -> /home/tungsten/installs/master_slave/releases/tungsten-replicator-2.0.5-321

/home/tungsten/installs/master_slave/releases:
total 8.0K
drwxr-xr-x 6 tungsten tungsten 4.0K Sep 22 22:03 tungsten-replicator-2.0.4
drwxr-xr-x 6 tungsten tungsten 4.0K Sep 22 22:06 tungsten-replicator-2.0.5-321

If you did some manual change to the files in 2.0.4, you will be able to retrieve them. Upgrading from earlier versions of Tungsten Replicator is not as smooth. Since we changed the installation format, it has become incompatible from previous versions. Clusters running TR 2.0.3 need to be reinstalled manually. The next upgrade, though, will be much faster!

My three MySQL sessions at OOW 2011 - and much more

Oracle Open World 2011 is approaching. MySQL is very well represented. Sheeri has put together a simple table of all the MySQL sessions at OOW, which is more handy than the Oracle schedule. I will be speaking in three sessions on Sunday, October 2^nd.

Sunday, 9am MySQL: Don't Be a Rookie Forever—Be in Command (Line) I have given this talk before, as a tutorial at the UC in 2010 and at FrOSCon one month ago. It is one of the most rewarding sessions ever. The attendee were very interested. This will be a short version of the tutorial.
Sunday, 10:15am >MySQL: Jailbreaking MySQL Replication. This is related to my job at Continuent. It will a showcase of Tungsten Replicator, with quick examples of how to build replication clusters with multiple masters, multiple sources, chained clusters, parallel replication.
Sunday, 2:45pm Bridging MySQL and Oracle Databases with Open Source Replication. In this session, together with Robert Hodges, I will show how you can replicate from MySQL to Oracle and PostgreSQl using Tungsten Replicator.

There are 47 MySQL sessions in total. You can see them in Technocation summary or get the Oracle focus on mysql pdf. There are huge expo halls at OOW. Among them, there is also MySQL. The MySQL Community booth, manned by volunteers, is at Moscone West, Level 2 Lobby. Other MySQL booths are listed in the Technocation summary. On the social side, Oracle ACEs will have a dinner on Sunday evening, and MySQL Oracle ACEs will have another gathering on Monday evening. On Tuesday, October 4th, there is a MySQL Community reception. It's free. You don't need a OOW pass to attend, but registration is required.

Monday, September 19, 2011

Chaining Replication Clusters

MySQL built-in replication includes a concept called relay slave, which allows you to create hierarchical database clusters. You can do the same thing with Tungsten, and this can be done in more than one way. Let's start with two distinct clusters. We can follow the recipes in the Tungsten Cookbook to install a master / slave cluster in three separate hosts and a Tungsten sandbox containing another master/slave cluster. Now, we want to make the master in the sandbox a slave of the master in the first cluster, as illustrated in the figure below. Chaning clusters master to master

Notice that the recipe works in exactly the same way for two distinct clusters on separate hosts. We are using one sandbox to minimize the number of hosts involved. To install the bridge between the two cluster, you go to the directory where Tungsten was installed for the master in the sandbox, and run ./tools/configure-service. The purpose of this command is to create a second service in the master, a service that will act as a slave, and fetch data from the other cluster. To do so, we need to provide some information to the service installer, the most important of which are:

the local-service-name, a.k.a. who's the boss, will tell the replicator that this service will live with the 'tsandbox' service, which is the local master.
the role tells the replicator that this service will fetch data;
the master-thl-host is the address where to find a master capable of feeding data to this new slave;
the master-thl-port and thl-port options make sure that the service uses one port for its own dispatching and another one to get data from the master.

cd $HOME/tsb2/db1
./tungsten/tools/configure-service -C \
  --local-service-name=tsandbox \
  --thl-port=12111 \
  --role=slave \
  --service-type=remote \
  --master-thl-host=r1 \
  --master-thl-port=2112 \
  --datasource=127_0_0_1 \
  --svc-start \  
  dragon

After this connection, every change in the first cluster master will be replicated to all its slaves, one of which happens to be a master, which will then distribute the same data to all its slaves. So we have a cascade hierarchical replication cluster, similar to what we can have with MySQL native replication. But Tungsten can do something more than that. In MySQL replication, you need to enable a slave to become a relay-slave. In Tungsten, you don't need to do it. Chaning clusters slave to master

Using a very similar command, I can connect to a slave of the first cluster, instead of the master, and the final result will be exactly the same.

cd $HOME/tsb2/db1
./tungsten/tools/configure-service -C \
  --local-service-name=tsandbox 
  --thl-port=12111 \
  --role=slave \
  --service-type=remote \
  --master-thl-host=r3 \
  --master-thl-port=2112 \
  --datasource=127_0_0_1 \
  --svc-start \  
  dragon

In my presentations, I call this feature "slave with an attitude". Thanks to Tungsten global transaction ID, a slave can request data to any host. Since the data is not labeled in terms of log files and position (as it is in MySQL), but in terms of sequence numbers, a slave ch ask any server for a given sequence number, and that number identifies a transaction unequivocally.

Friday, September 16, 2011

Quick recipes for database cluster building

One lesson learned in more than two decades working in this industry is that most of the IT professionals are impatient, want to achieve results immediately, and, most importantly, they don't read documentation. Much as the average geek is happy to answer many requests with a dismissive RTFM, the same geeks are not as diligent when it comes to learning about new or updated technologies. For this reason, there is a kind of documentation that is very much appreciated by busy and impatient professionals: cookbooks. And I am not talking about food. Geeks are not known for being cooks ⁽¹⁾ and they like fast food. I am talking about collection of technical recipes, short articles where a problem is briefly stated, and a direct solution is shown. Working with Tungsten Replicator, I am constantly amazed at all the things you can do with it, and at the same time, I am amazed at how so few people read the documentation. Since I want more users to be aware of the goodies, and being aware of the geeks' aversion to regular docs, I have started putting together a Tungsten Replicator Cookbook, where users can quickly find the recipe to build their cluster of choice. The problem is stated in one short paragraph, and the solution is outlined with code in the shortest possible way. Can't get any lazier than this! Or maybe you can, but I haven't reached that level yet. I hope I have found a good balance. Some of the recipes that the cookbook offers are:

Install a master / slave cluster
Install a master slave directory with customized parameters
Install more than one Tungsten Replicator in one host
Install a direct slave with parallel replication
Taking over replication from a MySQL slave in direct mode
Install bi-directional replication
Install bi-directional replication with additional slave
Install a three masters replication
Install a four masters replication
Modify one or more properties with the installer
Add one slave to an existing master

In addition to the cookbook, we have inaugurated a sample of another popular literary genre, namely Troubleshooting recipes. When things go wrong (and they usually do when you are dealing with something new, you want a quick answer to your problem. These troubleshooting items are aimed at making such quick answer readily available. Both projects are moving targets. We will adjust as the project grows. Contributions and comments are welcome. If you have suggestions on how to improve these documents, you can use the mailing list. ⁽¹⁾ With some notable exception, I must say. I have a reputation as a good cook. But then, I represent a minority in so many ways.

Welcome, MySQL commercial extensions

I saw yesterday that MySQL has finally done the right thing, and announced new commercial extensions.
What this means is that paying customers receive something more than users who get the community edition for free.
Believe it or not, when I was working in the community team at MySQL, I was already an advocate of this solution. You may see a contradiction, but there isn't. I would like to explain how this works.

An open source product needs to be developed. And the developers need to get paid. Ergo, the company needs to make money from that product if it wants to continue developing it. Either that, or the company needs to sell something else to pay the bills. (Let's not get into the argument that a pure open source project with universal participation is better, faster, or more marvelous: MySQL was never that, not with Oracle, not with Sun, and not when it was an independent company. If you want a extra-super-ultra open project, go with Drizzle. With MySQL, we need to continue reasoning with the raw material at hand.)
When MySQL was scaling its business, it realized that many customers were not willing to pay for support and consulting alone. To expand the business beyond the simple offer of services, the company created MySQL Network, which soon evolved into MySQL Enterprise, with the addition of the MySQL Monitoring tools and a fast upgrade plan. This was a good proposal for small to medium customers, but not as good for customers with large installations. When you deploy thousands of MySQL servers, you really don't want to upgrade every month. Anyway, for some time, the value proposition from MySQL was that the community users would get one release twice a year, and the paying customers would get one every month.
As a community manager, I strongly objected to that formula, not only because it hurts the community, but also because it hurts customers. When the release is exposed to millions of free users before it goes to the paying customers, chances are that serious bugs are discovered by the community and fixed in due time, before it hurts a high profile customer and needs to be fixed in a hurry at higher cost. One of the main values of MySQL is that it's that its large community adoption and feedback increases stability. Fortunately, I was not the only one who believed that larger distribution is valuable for customers, and the decision was reversed at the end of 2008.
In that period, I and other employees recommended a different value proposition for our customers. Instead of selling fast upgrade plans (which become a liability), MySQL could develop some reserved features that would be given only to paying customers.
There are two problems with reserved features, though: you need to develop them internally. You can't start them in the open, asking the community to test them for bugs, and then give them only to customers when they are ready (There was a faux pas in that direction in early 2008, but it was promptly retracted). These features must be developed as closed source, and tested only internally. The second problem is that MySQL had little internal QA manpower when this discussion arose.
There was another issue, namely that the code base for the next intended version (the fabled 6.0) was brittle. After 2 years in alpha stage, there was little to show for the effort. In parallel to the Oracle acquisition, two important things happened: version 6 was abandoned, and a new effort was started, using the more stable version 5.x as a code base, and a new developing model was launched, based on milestones and robustness.
This new foundation, combined with the injection of experienced QA personnel from the ranks of Sun and Oracle, made the project ready to offer reserved features to customers, while continuing the development of a lot more features for the community edition.
From a community standpoint, I welcome the commercial extensions. It means that the MySQL project will get more revenues, and be able to sustain the public release more easily. In my opinion it's the logical evolution of the project, and it's what MySQL should have done already years ago if it had had the manpower.
There are already detractors who see this release as a sign of the apocalypse. They probably want to see only this one feature in the commercial arena, dismissing the dozens of new features released to the general public under Oracle stewardship. I refuse to enroll in the chorus of critics who judge Oracle on prejudice. I am so far satisfied with what Oracle has done with MySQL. My opinion is based on facts, and since the facts are in favor of the community, I am pleased to say so. If future facts will make me change my opinion, I will say it. But now, I want to say thank you to the MySQL team at Oracle!

Thursday, September 08, 2011

The happiness of failing installations

When you set-up the same software several times (for you or for your customers), you want that software to install quickly and reliably, and you are generally happy when everything works as expected.
In this context, a failing installation is when the installation process exits unexpectedly, and you are left with an error message and the prospect of looking at the manual to find out what was it.

A failing installation is unpleasant, you'd say, and I concur. But do you know what's more unpleasant than a failing installation? It's an installation that succeeds, only to fail silently the first time you try using the application.

Looking at this enhanced definition, it is no surprise that I assert to find happiness in failure. And I have practical reasons for my claim. When I first tried Tungsten Replicator installation, it succeeded. And to my chagrin, the application did not work. I had to dig the reason for not working from the logs, and from that reason I had to figure out what I had done wrong. For example, the log might say "file not found mysql-bin.000003", and from that piece of information I had to figure out that I forgot to make the binary logs directory group readable, so that the 'tungsten' user could see the logs.
But a "successful" installation with later failure often meant that a clean shut down was not possible, and then I had to become an expert at cleaning up messy installations.
The next installation may get past the failure point, and possibly fail (again silently) for a different reason. Sometimes, I had to install four or five times until I get to the working and stable point. And then I'd install on another server, and I made a different mistake (or I forgot to apply the cure for a known mistake) and the stream of successful installations with hidden failures continued for a while.

With the above reminiscences, I am very happy to report that now you can install Tungsten Replicator with the near assurance that when something goes wrong, the installation does not start, and you are given a clear list of what was wrong.
The installer runs a long list of validation probes, and it doesn't stop at the first validation failure. It will try its best to tell you what you should do to reach a satisfactory installation, giving you a detailed list of everything that doesn't match up.
Not only that: the installer checks the requirements on all the servers in your intended cluster, and the installation does not start anywhere until you meet all the requirements in all the servers.

That's why, when my installation fails, I feel very happy, knowing that I won't have to clean up a messy server, and when I fix the problem that made the installation fail, my application will most certainly work.

Wednesday, September 07, 2011

Tungsten Replicator 2.0.4 released: usability and power

It has been a bumpy ride, with dozens of issues opened and resolved, but we finally feel that Tungsten Replicator 2.0.4 is ready for prime time. There have been quite a lot of changes. Most notably, the replicator is much faster, especially when it comes to parallel replication, and it is much easier to install, thanks to its new integrated installer, which can validate all the requirements to install the replicator, and suggest remedies when the requirements aren't met. This new installer is so good, in fact, that calling it installer is an insult. It is a legitimate cluster builder, able to install a full fledged cluster from a central location.

Probably equally important, we have caught up with the documentation, and now you can install several replication topologies following detailed instructions from the docs. You will find both HTML and PDF guides, with the steps to install straight master/slave systems, or direct slave takeover, or bi-directional replication. The binaries are available in the project's Downloads page. Later on, you will find the most updated (and possibly less bug-infested) binaries in our build server list. The Release_Notes list all the issues that have been closed since we released 2.0.3. The advanced users will especially appreciate an innovation introduced in the installer, which now allows users to define one or more of --property=key=value. Using this option wisely, you can now customize the replication properties straight at the start. What used to require several commands and a restart of the replicator right after the installation, now flows smoothly and quickly with one single command. With this release, Tungsten Replicator is closer to become a tool for mass consumption. The old installation method (which we have deprecated and renamed, to discourage anyone from using it) required time, constant attention, and it was unforgiving. The new one will let you make your mistakes freely. If something is amiss anywhere in all the servers where you are installing, it won't install and it will tell you what went wrong. This is probably my favorite feature, because it allows Tungsten to be used by less experienced users. Now it's up to the users. We have no illusion that the product is bug free, and we want to hear from users who try it and report on Issues.

Sunday, September 04, 2011

Finding tables without primary keys

I was checking a third party server, and I needed to find if there were tables without primary keys. This is important to know, not only because the lack of primary keys affects performance and data accuracy in general, but also because in row-based replication performance can degrade beyond belief when updating tables without primary keys. Anyway, I did not remember off the bat any method to get this information from a server with thousands of tables, and thus I went to find a solution on my own. My first instinct called for using the COLUMNS table from the INFORMATIOn_SCHEMA, and so I came up with this query, where I sum the number of columns that are inside either a PRIMARY or UNIQUE key and filter only the ones where such sum is zero (i.e. no primary or unique keys):

select 
    table_schema,table_name 
from  
    information_schema.columns  
group by 
    table_schema,table_name   
having 
    sum(if(column_key in ('PRI','UNI'), 1,0)) = 0;

This query got the job done, and it was quite quick as well. Then, since I was chatting with Sheeri Cabral about other matters, I asked her if she could come up with an alternative solution. She suggested a LEFT JOIN between the information_schema.tables and information_schema.statistics, which I translated into this query:

select 
    t.table_schema, t.table_name 
from 
    information_schema.tables  t 
    left join information_schema. statistics s 
       on t.table_schema=s.table_schema and t.table_name=s.table_name 
       and s.non_unique=0 
where 
    s.table_name is null;

This query works on the principle that it removes from the tables list all the ones for which there is no corresponding table in the statistics table. This query also works. Using both queries in a relatively empty server did not show any significant difference. But since I knew that I had to use this method on a very busy server, with a lot of tables, I quickly created 1,000 databases, each containing 5 tables, two of which did not have any primary or unique key. Now came the first surprise. The query with GROUP BY took about 0.5 seconds, while the one using LEFT JOIN used 11 seconds. I was about to congratulate myself for my acumen, when I realized that, in addition to schema and table names, I also needed the table engine. For the second query, that is not a problem. Adding the engine to the columns list works OK, and the query runs in 11 seconds like before. The first query, though, can't list the engine. There is no 'engine' in the COLUMNS table. So I needed a JOIN. Thus my query became

select 
    t.table_schema,t.table_name,engine 
from 
    information_schema.tables t 
    inner join information_schema .columns c  
        on t.table_schema=c.table_schema and t.table_name=c.table_name 
group by 
    t.table_schema,t.table_name   
having 
    sum(if(column_key in ('PRI','UNI'), 1,0)) =0;

Guess what? This query ran in 17 seconds (!). So much for my instinct! Joins without keys are not efficient in MySQL, and tables in the information schema are no exception. If anyone has a more efficient method of getting a list of tables without primary key (the list including schema name, table name, and engine), I am curious to know.

Friday, September 02, 2011

Primary keys from experience

From time to time I see articles in defense of natural primary keys against surrogate keys. I don't take an immovable stand on either side, as I have seen good cases for both. In general, I like the idea of a natural primary key, when I see one that it is really natural. Quite often, though, a natural primary key has proved itself faulty. And most of the times, when such faults happened, it was because of limited understanding of the data. I like to describe this kind of misinformation as data analysis culture clash. When choosing a natural primary key, one should consider which element, or group of elements, are unique in a given set. Not only that, they must be immutable, at least within that set. For example, in a group of people, we may assume that a combination of name, surname, date and place of birth is a good natural primary key. Well, no. It isn't, for several reasons. Read on for a few real cases. If we rule out the above combination, perhaps we could use the Social Security Number as a valid key? Not really. The fact is that the above assumptions work well if we consider only people from the same country in current times. If we extend our data definition to include people from different countries, or historical records, then the assumption collapses. A practical case: The birth place. This is a fair assumption. In combination with other elements (e.g.: date of birth, name, and surname) it can provide good basis for unique and immutable records. If you consider people in the United States today, you are right. Even in the UK, or Italy, or France. But try applying this method to places with recent political changes due to war or revolutions, and suddenly the name of the town may suddenly change. What was before VictoryBurg in Oppresslandia is now known as Heroes City in Freelandia. And this happens now, in the 21st century. If your set includes historical data, these occurrences may become frightfully frequent. Speaking of historical times, if you are dealing with really old records, you may want to consider that dates are less than immutable. The way we count days in most Western countries is called the Gregorian calendar, which was adopted by a handful of countries in 1582. Dates before October 5, 1582 and after October 14, 1582 are recorded using two different calendars. It is inconvenient, but if you know that you may avoid wrong date calculations. Simple, isn't it? Not really. If your records include data from different countries, you will have to take into account when this calendar was adopted. Just to give a few examples, England adopted the Gregorian calendar in 1752, Japan in 1873, Russia in 1918, Turkey in 1926. When dealing with people from different countries, you may be tempted to use citizenship as an immutable property. That may work, if you consider citizenship at birth and are prepared to keep the names of not currently existing states. If, instead, you get the citizenship from the employee's passport, you may incur in one or more of the following cases (all happened in practice during my work with an international organization):

Some employees came to work with the passport they had before the political change, and now that the country has split, they have different passports, depending on which side of the new border they live.
Some employees have parents from different countries and are entitled to more than one passport. They came to work initially with one passport, and came again a few year later with a different one because of different benefits.
Some employees started working with a given citizenship, and then they got a different one because of marriage, political asylum, or other legal means.
Some employees became stateless. The equivalent of NULL in a table field.

Coming to the Social Security Number, the assumption of uniqueness fails for the same reason that citizenship does. It's even more frequent, because of people working abroad and paying their taxes in a foreign country, so they had two or more SSN or equivalent to show. But this element failed for another reason, i.e. because it is not immutable. It is supposed to be, but in some cases it happens that, due to clerical mistakes, the SSN issued in a given country is wrong, and needs to be changed. It happens in Italy, where your SSN (called "codice fiscale", or fiscal code) must match your name, surname, place, and date of birth. If any of these elements was wrong when the code was generated, the code needs to be done again. It's a painful process that requires a court order, but the result is that the item is not immutable. Other elements that I have seen used wrongly for primary keys, either standalone or as key components, are: telephone numbers (they can change, and after being changed they can be assigned to other people), email addresses (they are unique, but they can easily change, or they can be abandoned when the person changes provider, or company, or both), gender (it can change), surname (it can change, legally, for male and female, depending on country and conditions). Summing up, a sane amount of skepticism should be used when considering if an element can be used in a primary key. Depending on the environment, the element can be safe or it can become a nightmare when the database grows from a neighborhood business to an international venture.

Monday, August 15, 2011

CodeBits - An event of competitive innovation

It was my pleasure and privilege to attend Codebits in 2009. As Roland Bouman says, its talk choice method is based on public voting, and therefore everyone cha have contribute to the schedule. But that is not the main reason for attending this extraordinary event. It is not just a conference. It's an innovation fest. For 1 and 1/2 days, it's a conference, where the speakers are encouraged to bring to their audience the most innovative and inspiring talks. In the afternoon of the second day, the event becomes a competition, where the teams that have registered will have 24 hours to bring a project to completion, and they have to start and finish within the allotted time. The project can be anything, and I have seen quite a lot of exciting stuff rolling live in the huge pavilion: I could hardly ignore robotics, as these little mechanical smurfs were running all over the place and you would have to be careful not to squash them when you walked. There was plenty of occasions for planning of great projects, together with attempts at improving social relations, and mixing up with big brother. There were projects based on 3D printing, and less broad projects like all-seasons keyboards. A very popular session, followed by practical workshops was lock picking. I attended one of them, learned how to pick simple and less simple locks, and I brought home some lockpicking tools. On a more technical level, I was there with Lenz Grimmer and Kai Seidler, we spoke about MySQL and other cool things, and we had lots of fun for three days. Besides the teams hacking away at their projects, there were several teams showcasing technology that had been developed by winners of the previous years, such as 3D television and intelligent phone networks. In short, This was an inspiring event, which I can warmly recommend.

Thursday, August 11, 2011

Usability improvements in Tungsten Replicator 2.0.4

If you love a software product, you should try to improve it, and not be afraid of criticizing it. This principle has guided me with MySQL (where I have submitted many usability bugs, and discussed interface with developers for years), and it proves true for Tungsten Replicator as well. When I started working at Continuent, while I was impressed by the technology, I found the installation procedure and the product logs quite discouraging. I would almost say disturbing. Fortunately, my colleagues have agreed on my usability focus, and we can enjoy some tangible improvements. I have already mentioned the new installation procedure, which requires just one command to install a full master/slave cluster. I would like to show how you can use the new installer to deploy a multiple source replication topology like the following:

The first step is to install one master in each node. I can run the commands from node #4, which is the one that will eventually receive the updates from the remote masters, and where I need to install the slave services:

TUNGSTEN_BASE=$HOME/newinst
SERVICES=(alpha bravo charlie delta)
REPLICATOR=$TUNGSTEN_BASE/tungsten/tungsten-replicator/bin/replicator

for N in 1 2 3 4
do
    INDEX=$(($N-1))

  ./tools/tungsten-installer \
    --master-slave \
    --master-host=qa.r$N.continuent.com \
    --datasource-user=tungsten \
    --datasource-password=secret \
    --service-name=${SERVICES[$INDEX]} \
    --home-directory=$TUNGSTEN_BASE \
    --cluster-hosts=qa.r$N.continuent.com \
    --start-and-report
done

The above loop will install a master (remotely or locally) in the four servers. Then I need to create the slave services. To do it, I use the updated configure-service in the tools directory.

TUNGSTEN_TOOLS=$TUNGSTEN_BASE/tungsten/tools
COMMON_OPTIONS='-C -q 
    --local-service-name=delta 
    --role=slave 
    --service-type=remote 
    --allow-bidi-unsafe=true 
    --datasource=qa_r4_continuent_com' 

$TUNGSTEN_TOOLS/configure-service $COMMON_OPTIONS --master-host=qa.r1.continuent.com  alpha 
$TUNGSTEN_TOOLS/configure-service $COMMON_OPTIONS --master-host=qa.r2.continuent.com  bravo
$TUNGSTEN_TOOLS/configure-service $COMMON_OPTIONS --master-host=qa.r3.continuent.com  charlie 

$TUNGSTEN_BASE/tungsten/tungsten-replicator/bin/replicator restart
$TUNGSTEN_BASE/tungsten/tungsten-replicator/bin/trepctl services

These commands create the slave services locally in Delta. After restarting the replicator, a simple test will be creating something different in each master, and check that the data has replicated to the single slave. The latest improvement in matter of usability is the simplification of the replicator logs. Until a few days ago, if you had an error in the replicator, you would get a long list of not exactly helpful stuff. For example, if I create a table in a slave, and then create the same table in the master, I will break replication. The extended log would produce something like this:

INFO   | jvm 1    | 2011/08/11 18:10:52 | 2011-08-11 18:10:52,216 [tsandbox - q-to-dbms-0] ERROR pipeline.SingleThreadStageTask Event application failed: seqno=1 fragno=0 message=java.sql.SQLException: Statement failed on slave but succeeded on master
INFO   | jvm 1    | 2011/08/11 18:10:52 | 2011-08-11 18:10:52,217 [tsandbox - Event dispatcher thread] ERROR management.OpenReplicatorManager Received error notification, shutting down services: Event application failed: seqno=1 fragno=0 message=java.sql.SQLException: Statement failed on slave but succeeded on master
INFO   | jvm 1    | 2011/08/11 18:10:52 | com.continuent.tungsten.replicator.applier.ApplierException: java.sql.SQLException: Statement failed on slave but succeeded on master
INFO   | jvm 1    | 2011/08/11 18:10:52 |       at com.continuent.tungsten.replicator.applier.MySQLDrizzleApplier.applyStatementData(MySQLDrizzleApplier.java:183)
INFO   | jvm 1    | 2011/08/11 18:10:52 |       at com.continuent.tungsten.replicator.applier.JdbcApplier.apply(JdbcApplier.java:1233)
INFO   | jvm 1    | 2011/08/11 18:10:52 |       at com.continuent.tungsten.replicator.applier.ApplierWrapper.apply(ApplierWrapper.java:101)
INFO   | jvm 1    | 2011/08/11 18:10:52 |       at com.continuent.tungsten.replicator.pipeline.SingleThreadStageTask.runTask(SingleThreadStageTask.java:498)
INFO   | jvm 1    | 2011/08/11 18:10:52 |       at com.continuent.tungsten.replicator.pipeline.SingleThreadStageTask.run(SingleThreadStageTask.java:155)
INFO   | jvm 1    | 2011/08/11 18:10:52 |       at java.lang.Thread.run(Unknown Source)
INFO   | jvm 1    | 2011/08/11 18:10:52 | Caused by: java.sql.SQLException: Statement failed on slave but succeeded on master
INFO   | jvm 1    | 2011/08/11 18:10:52 |       at com.continuent.tungsten.replicator.applier.MySQLDrizzleApplier.applyStatementData(MySQLDrizzleApplier.java:139)
INFO   | jvm 1    | 2011/08/11 18:10:52 |       ... 5 more
INFO   | jvm 1    | 2011/08/11 18:10:52 | Caused by: java.sql.SQLSyntaxErrorException: Table 't1' already exists
INFO   | jvm 1    | 2011/08/11 18:10:52 |       at org.drizzle.jdbc.internal.SQLExceptionMapper.get(SQLExceptionMapper.java:78)
INFO   | jvm 1    | 2011/08/11 18:10:52 |       at org.drizzle.jdbc.DrizzleStatement.executeBatch(DrizzleStatement.java:930)
INFO   | jvm 1    | 2011/08/11 18:10:52 |       at com.continuent.tungsten.replicator.applier.MySQLDrizzleApplier.applyStatementData(MySQLDrizzleApplier.java:125)
INFO   | jvm 1    | 2011/08/11 18:10:52 |       ... 5 more
INFO   | jvm 1    | 2011/08/11 18:10:52 | Caused by: org.drizzle.jdbc.internal.common.QueryException: Table 't1' already exists
INFO   | jvm 1    | 2011/08/11 18:10:52 |       at org.drizzle.jdbc.internal.mysql.MySQLProtocol.executeQuery(MySQLProtocol.java:500)
INFO   | jvm 1    | 2011/08/11 18:10:52 |       at org.drizzle.jdbc.internal.mysql.MySQLProtocol.executeBatch(MySQLProtocol.java:546)
INFO   | jvm 1    | 2011/08/11 18:10:52 |       at org.drizzle.jdbc.DrizzleStatement.executeBatch(DrizzleStatement.java:917)
INFO   | jvm 1    | 2011/08/11 18:10:52 |       ... 6 more
INFO   | jvm 1    | 2011/08/11 18:10:52 | 2011-08-11 18:10:52,218 [tsandbox - Event dispatcher thread] WARN  management.OpenReplicatorManager Performing emergency service shutdown
INFO   | jvm 1    | 2011/08/11 18:10:52 | 2011-08-11 18:10:52,219 [tsandbox - Event dispatcher thread] INFO  pipeline.Pipeline Shutting down pipeline: slave
INFO   | jvm 1    | 2011/08/11 18:10:52 | 2011-08-11 18:10:52,219 [tsandbox - q-to-dbms-0] INFO  pipeline.SingleThreadStageTask Terminating processing for stage task thread
INFO   | jvm 1    | 2011/08/11 18:10:52 | 2011-08-11 18:10:52,219 [tsandbox - q-to-dbms-0] INFO  pipeline.SingleThreadStageTask Last successfully processed event prior to termination: seqno=0 eventid=mysql-bin.000002:0000000000000426;20

Did you see the reason for the error? No? Neither did I. I would need to open the THL, look for event #1, and determine what it was. Instead, the new user.log looks like this:

2011-08-11 18:10:52,216 ERROR Received error notification: Event application failed: seqno=1 fragno=0 message=java.sql.SQLException: Statement failed on slave but succeeded on master
Caused by : java.sql.SQLException: Statement failed on slave but succeeded on master
Caused by : Statement failed on slave but succeeded on master
Caused by : Table 't1' already exists
Caused by : Table 't1' already exists
2011-08-11 18:10:54,721 INFO  State changed ONLINE -> OFFLINE:ERROR
2011-08-11 18:10:54,721 WARN  Received irrelevant event for current state: state=OFFLINE:ERROR event=OfflineNotification

That's much better. It is not perfect yet, but it will be soon. Right now, it tells me what is wrong without forcing me to go hunting for it amid hundreds of stack trace lines. Give it a try, using the latest replicator build.

Wednesday, August 10, 2011

Call for disclosure on MySQL Conference 2012

Percona has announced Percona Live MySQL Conference and Expo 2012. Kudos for their vision and entrepreneurship. I have seen comments praising their commitment to the community and their willingness to filling a void. I have to dot a few i's and cross some t's on this matter.

That was not the only game in town.

By the end of June, there were strong clues that O'Reilly was not going to organize a conference. The question of who could fill the void started to pop up. The MySQL Council started exploring the options for a community-driven conference to replace the missing one. The general plan was along the lines of "let's see who is in, and eventually run a conference without the big organizer. If nobody steps up, the IOUG can offer a venue in Las Vegas for an independent MySQL conference". The plan required general consensus among the major players, and therefore we started asking around about availability and participation. Percona did not answer our requests. They delayed the meeting, and in the meantime we continued preparing for the plan B of a conference in Vegas. Then some of us received a message from Percona, pre-announcing a conference in Santa Clara. No offer to gather a broad participation from other entities. No sign of wanting to do a neutral event, i.e. an event not tied to a single company.

Some background

That was puzzling, because I recall vividly how Baron Schwartz and Peter Zaitzev advocated strongly in favor of an independent conference, not so long ago:

The conference is organized and owned by MySQL, not the users. It isn’t a community event. It isn’t about you and me first and foremost. It’s about a company trying to successfully build a business, and other companies paying to be sponsors and show their products in the expo hall.
Baron Schwartz, April 23, 2008.

I would like to see the conference which is focused on the product users interests rather than business interests of any particular company (or personal interests of small group of people), I would like it to be affordable so more people can attend and I’d like to see it open so everyone is invited to contribute and process is as open as possible.
Peter Zaitzev, April 23, 2008.

A call to disclosure

I understand the business motivation to organize a conference with your company name in the title, while at the same time leveraging the wide MySQL community. However, if I have to judge by the organization of previous Percona Live events, I don't see any of the benefits that were advocated three years ago. I see a business conference that is inspired to the same principles that Percona was criticizing in 2008. What is it then? If it is supposed to be a community conference, let's call it "MySQL Conference" and ask for broad participation. There are plenty of people in the community who are willing to help and make the event a success, not only for the benefit of Percona, but for the global benefit of everyone in the ecosystem, including Oracle, the IOUG, and every company with a business related to MySQL. If it is not a community conference, let's state it clearly, so that people can set their expectations accordingly.

Unintended consequences

Someone may think it's a good thing to have a MySQL conference without Oracle participation but I am sure most will agree that it is not desirable. Much as I admire Percona's technical merits, if I go to the conference I want to hear from a wide range of participants. Specifically, I would like to know what's in the pipeline, and I want to hear that from the engineers in the MySQL team, i.e. from Oracle. I doubt that Oracle would send engineers and VPs to talk to a conference that is named after a competitor, and that may be true for other entities, which I (and many others) would like to hear from.

In short

Is this the conference of Baron's and Peter's earlier dreams or is it the fulfillment of their current business strategy?
Please, let the community know.

Disclaimer

The opinions in this post are my own. My employer does not censor my writings and gives me full freedom of expression, but my opinion does not necessarily match that of my company.

Wednesday, June 22, 2011

Less known facts about MySQL user grants

Reading MySQL security: inconsistencies I remembered a few related experiments that I did several years ago when I was studying for the MySQL certification. The first fact that came to mind is about the clause "WITH GRANT OPTION", which can only be given on the full set of options, not on a single grant. For example

GRANT INSERT,DELETE,UPDATE on world.* to myuser identified by 'mypass';
GRANT SELECT on world.* to myuser identified by 'mypass' WITH GRANT OPTION;
show grants for myuser\G
*************************** 1. row ***************************
Grants for myuser@%: GRANT USAGE ON *.* TO 'myuser'@'%' IDENTIFIED BY PASSWORD '*6C8989366EAF75BB670AD8EA7A7FC1176A95CEF4'
*************************** 2. row ***************************
Grants for myuser@%: GRANT SELECT, INSERT, UPDATE, DELETE ON `world`.* TO 'myuser'@'%' WITH GRANT OPTION

If you are surprised about the "WITH GRANT OPTION" clause applying to all grants instead of only applying to the SELECT, you forgot to consider how the grants are stored. All the grants for a given user (and a user is the combination of a name and a provenience) are stored in a single record in the mysql.user table. The GRANT OPTION is a column in that record. It is either set or not. You can't assign this option for only one attribute in the same record. It's either all the flagged grants or nothing. If you want to assign the "with grant option" on a single column, you must change either the provenience or the name of the user (thus opening another record). But also this addition may not be enough to reach your goal, as you can see in the next section. The other fact that came to mind about the "WITH GRANT OPTION" clause is that, in the examples given, it is ineffective. I dare say illusory. Let's start. As root user, we create this user:

root> grant all on granted.* to grantee identified by 'happyuser' with grant option;

The granted database exists, and now we have an user that can modify it, and, we think, delegate some functions to someone else.

grantee> grant select on granted.* to delegated identified by 'happy';
ERROR 1410 (42000): You are not allowed to create a user with GRANT
grantee> create user delegated;
ERROR 1227 (42000): Access denied; you need (at least one of) the CREATE USER privilege(s) for this operation

Right. I can't create a new user, but only transfer my superpowers to someone else. I will ask root to create the user, and then I will give it another try.

root>  create user delegated;
Query OK, 0 rows affected (0.00 sec)

grantee> grant select on granted.* to delegated identified by 'happy';
ERROR 1044 (42000): Access denied for user 'grantee'@'%' to database 'mysql'

Ouch! Since the grant tables are in the 'mysql' database, I don't have access. I will ask root to give me access to the mysql 'user' and 'db' tables.

root>  grant insert on mysql.user to grantee ;
Query OK, 0 rows affected (0.00 sec)
root>  grant insert on mysql.db to grantee ;
Query OK, 0 rows affected (0.00 sec)

grantee> grant select on granted.* to delegated identified by 'happy';
ERROR 1044 (42000): Access denied for user 'grantee'@'%' to database 'mysql'

Not good. I tried then to get SELECT,INSERT,UPDATE,DELETE for all the grant tables inside 'mysql'. Still, I could not exercise my grant options. Finally, the only solution was to get privilegs on the whole mysql database.

root> grant insert,select,delete,update on mysql.* to grantee;
Query OK, 0 rows affected (0.00 sec)

grantee> grant select on granted.* to delegated identified by 'happy';
Query OK, 0 rows affected (0.00 sec)

At last, I can grant something to someone.
But wait! Now that I can modify the 'mysql' database ...perhaps I could ...

grantee> update user set Select_priv ='Y',
 Insert_priv ='Y', Update_priv ='Y', Delete_priv ='Y',
 Create_priv ='Y', Drop_priv ='Y', Reload_priv ='Y',
 Shutdown_priv ='Y', Process_priv ='Y', File_priv ='Y',
 Grant_priv ='Y', References_priv ='Y', Index_priv ='Y',
 Alter_priv ='Y', Show_db_priv ='Y', Super_priv ='Y',
 Create_tmp_table_priv ='Y', Lock_tables_priv ='Y', Execute_priv ='Y',
 Repl_slave_priv ='Y', Repl_client_priv ='Y', Create_view_priv ='Y',
 Show_view_priv ='Y', Create_routine_priv ='Y', Alter_routine_priv ='Y',
 Create_user_priv ='Y', Event_priv ='Y', Trigger_priv ='Y',
 Create_tablespace_priv ='Y' where user = 'grantee';

This does not enhance my current grants, because I don't have the SUPER privilege (yet), but I can wait until the server restarts or until someone issues a 'flush privileges'. An then I will have full access to the server. Obviously, this situation is not what the DBA had in mind when the user 'grantee' was created.

Update The habit of always seeing the password set as integral part of the GRANT command has made me err on judgment.
As noted in one of the comments, the "grantee" user could have granted privileges to "delegated" without assigning a password. In this case,"grantee" does not need separate grants to the mysql database, which were apparently needed only if you wanted to set the password with the GRANT command.
All the above post is a miscalculation. The additional grants are not needed, provided that you don't include a password clause in your GRANT command.

Monday, June 20, 2011

Introducing the Tungsten-toolbox

After the public home for Tungsten Replicator, we needed another place where to host complementary tools. We discussed the pros and cons of hosting these tools in the same place where we publish Tungsten, but in the end we decided that it's more practical to have a separate project, where we can publish tools related to database replication, no matter if they are dedicated to Tungsten or if they can work with other replication systems. So, here it is. We have now Tungsten Toolbox, a site where we will collect our tools and accept contributions from others.

We have already a few tools that, after being cleaned up, will find their way to this repository. The one that was more or less ready for prime time is the Tungsten Sandbox, a tool that installs more than one Tungsten instance with a backend database in the same host. It is, as you can imagine, based on the MySQL Sandbox and it works reasonably well. All the tools in this toolbox are released under the New BSD License. What can you expect to see in the near future? We have a few ideas already:

A binary log analyzer
A tool that changes properties on-the-fly
A JSON API for Tungsten
Sandboxes and deployers for complex topologies (multiple masters, fan-in)
Sandboxes and deployers for direct slaves
Deployers for a mix of MySQL native and Tungsten replication
A PostgreSQL sandbox

We are, of course, open to contributions. If you have a tool that is useful for database replication and want to release it under a BSD license, feel free to propose it in the Google Group discussion on Tungsten Replicator.

Tuesday, June 14, 2011

What happened to MySQL Forge?

Update
Soon after I posted this article, the Forge came back online! Thanks!

MySQL Forge has been offline for two days now. (2011-06-14) No sign of acknowledgement of this problem from the MySQL team. What is happening? For those not well acquainted with MySQL Forge, here are the facts. The MySQL Forge is a site that was intended to contain all community contributions. The reality did not follow the plans very closely, and some sections of the forge ended up with less contents than what should be useful. However, there are a few sections of the forge that are extremely useful to users:

The Wiki, which is full of irreplaceable documentation, such as the MySQL internals, description of preview features, slides and recordings of MySQL University, manual of many tools (such as the Random Query Generator, and more
The worklogs, or the blueprints for most MySQL features, past, present, and future
The tools/snippets, which is a collection of community contributed scripts. Not many, but most of them of high quality and quite useful.

I know that there was a lot of spam in the forge, but I really hope that it was taken off line on purpose. If its maintenance is a problem, I am sure we can find plenty of volunteers in the community that want to host the wiki and the rest of the useful stuff. Hello? Anybody there at Oracle/MySQL? Please give us back the forge. Thanks.

Thursday, June 09, 2011

Getting started with Tungsten Replicator and Tungsten Sandbox

We have been busy, at Continuent. In addition to our usual work with high performance replication, we have addressed usability issues, since we know that a hard-to-use problem, no matter how powerful, has low adoption. Thus, is with some personal satisfaction that I can announce the release of Tungsten Replicator 2.0.3, which comes with several huge improvements in matter of user friendliness. The new installation procedure is so user friendly, in fact, that I was able to build a sophisticated tungsten-sandbox with a 150-line shell script. (The corresponding features for MySQL Sandbox required 4,500 lines of Perl).

Enough self celebration, though. Let's get started, as the title of this post suggests, with the practical steps.

Requirements

Before we begin, there are a few requirements to meet.

You need to be on a Unix-like operating system. Our main choice is Linux. If you want to test on Mac OSX, it works, but we won't recommend it.
Java JRE must be installed. And it must be the original one, not the Open JDK. Update: The requirement against OpenJDK has been lifted. It works fine in my tests.
Ruby 1.8 must be installed. This is mainly needed during the installation phase only, but it is required nonetheless.
The user account that will install and run Tungsten must have ssh access to the other hosts involved in the cluster
The above mentioned user must have sudo access. This is only needed if you want to use Tungsten Replicator to run backups that involve root access (like xtrabackup). We may lift this requirement later, but for now you need to enable it, at least during the installation, and remove the access when you are done.
This user must also have read access to MySQL binary logs. Usually you achieve this by making sure that the binary logs are readable by users belonging to the "mysql" group, and by adding such group to your user.
There must be a MySQL users for Tungsten replication. This user must have full access to the database server, with grant option.
The MySQL server must have binary logging enabled.
If you have MySQL native replication running, you must stop it.

Getting the code and install

The code is released in the downloads section of Tungsten's home. The current recommended version is 2.0.3, but if you like to be really up to date, we also publish a list of recent builds from our build server, which you can use to have a go at the replicator. For this simple installation, I will use four servers from our server farm. The servers are named R1, R2, R3, and R4. The first good news of the new installation process is this: you need to install in one server only!. More details follow. First off, create a directory where you want to install. Use a non-root account. Just make sure that it's the same user in all the servers, and that such user can access the directory where you want to install. I am going to call this directory planet.

cd $HOME
for N in 1 2 3 4 ; do ssh r$N mkdir planet ; done
cd planet 
wget http://tungsten-replicator.googlecode.com/files/tungsten-replicator-2.0.3.tar.gz
tar -xzf tungsten-replicator-2.0.3.tar.gz
cd tungsten-replicator-2.0.3

I have already a MySQL user named tungsten with password "mypwd" (but it can be anything you like, as long as it has the required privileges). Now we have all the components. If you have read the Tungsten documentation, please ignore the ./configure script. That is left for compatibility reasons, and will be deprecated soon. Instead, to install the cluster of our 4 servers, let's do the following:

export TUNGSTEN_BASE=$HOME/planet

./tools/tungsten-installer \
    --master-slave \
    --master-host=r1 \
    --datasource-user=tungsten \
    --datasource-password=mypwd \
    --service-name=dragon \
    --home-directory=$TUNGSTEN_BASE \
    --cluster-hosts=r1,r2,r3,r4 \
    --start

Some comment on this command: --master-slave is the installation mode (see below for more info). --service-name can be anything you want. --home-directory is where all the installation sub directories will go. --cluster-hosts is the list of servers you want to install, and finally, --master-host is the host that will be installed as a master, while all the others will be slaves of that one. If you have followed the instructions carefully, the installer will bring up the Tungsten cluster without any fuss, Unix style. If you hate silent installations, you can get the full monty by adding some options:

./tools/tungsten-installer \
    --verbose \
    --master-slave \
    --master-host=r1 \
    --datasource-user=tungsten \
    --datasource-password=mypwd \
    --service-name=dragon \
    --home-directory=$TUNGSTEN_BASE \
    --cluster-hosts=r1,r2,r3,r4 \
    --start-and-report

If you run the installer in verbose mode, you will see an extremely long list of validation checks that the installed does on your current servers and on the ones that are listed in the --cluster-hosts option. If everything went well, you will find the following directories in $HOME/planet (for all servers in your cluster):

configs, containing the configuration file created by the installer. This file describes your cluster
releases, containing the Tungsten binaries.
thl, containing Tungsten's Transaction History Logs. These logs are like MySQL binary logs, but with much more metadata, including a global transaction ID, which is missing in MySQL native replication.
relay, which should be empty, unless you install in "direct" mode (see below.)
tungsten, which is a symlink to the Tungsten directory inside releases.

In addition to the above mentioned directories, Tungsten Replicator creates a database for each service. Since we have only one service in this topology, you will find a database named "tungsten_dragon". (If you have called your service "bunny", you will instead find "tungsten_bunny"). Inside this database there is the replication metadata necessary for making the servers fault tolerant. Only a small amount of data is kept on that database. It's roughly corresponding to what you get from the .info files in MySQL native replication. To test that the system is OK, let's find our tools. The first one is trepctl, which, among other things, can give us an overview of the running services.

cd $HOME/planet
./tungsten/tungsten-replicator/bin/trepctl services
Processing services command...
NAME              VALUE
----              -----
appliedLastSeqno: 0
appliedLatency  : 1.152
role            : slave
serviceName     : dragon
serviceType     : local
started         : true
state           : ONLINE
Finished services command...

./tungsten/tungsten-replicator/bin/trepctl -host r1 services
Processing services command...
NAME              VALUE
----              -----
appliedLastSeqno: 0
appliedLatency  : 0.936
role            : master
serviceName     : dragon
serviceType     : local
started         : true
state           : ONLINE
Finished services command...

mysql -h r1 -e 'create schema if not exists test'
mysql -h r1 -e 'create table test.t1 (i int)'

./tungsten/tungsten-replicator/bin/trepctl services
Processing services command...
NAME              VALUE
----              -----
appliedLastSeqno: 2
appliedLatency  : 0.155
role            : slave
serviceName     : dragon
serviceType     : local
started         : true
state           : ONLINE
Finished services command...

./tungsten/tungsten-replicator/bin/trepctl -host r1 services
Processing services command...
NAME              VALUE
----              -----
appliedLastSeqno: 2
appliedLatency  : 0.044
role            : master
serviceName     : dragon
serviceType     : local
started         : true
state           : ONLINE
Finished services command...

After the installation, trepctl reported the last applied sequence number (appliedLastSeqno) as 0. Following the execution of two commands in the master, such number became 2. If you want to know more of what was happening, you can use the thl command. This corresponds roughly to using mysqlbinlog with MySQL native replication logs.

/tungsten/tungsten-replicator/bin/thl -service dragon list |less
SEQ# = 1 / FRAG# = 0 (last frag)
- TIME = 2011-06-09 14:51:23.0
- EVENTID = 000002:0000000000000514;197609
- SOURCEID = qa.r1.continuent.com
- STATUS = COMPLETED(2)
- METADATA = [mysql_server_id=10;service=dragon;shard=test]
- TYPE = com.continuent.tungsten.replicator.event.ReplDBMSEvent
- OPTIONS = [##charset = ISO8859_1, createOrDropDB = , autocommit = 1, sql_auto_is_null = 1, foreign_key_checks = 1, unique_checks = 1, sql_mode = '', character_set_client = 8, collation_connection = 8, collation_server = 8]
- SQL(0) = create schema if not exists test /* ___SERVICE___ = [dragon] */
SEQ# = 2 / FRAG# = 0 (last frag)
- TIME = 2011-06-09 14:51:30.0
- EVENTID = 000002:0000000000000601;197610
- SOURCEID = qa.r1.continuent.com
- STATUS = COMPLETED(2)
- METADATA = [mysql_server_id=10;service=dragon;shard=test]
- TYPE = com.continuent.tungsten.replicator.event.ReplDBMSEvent
- OPTIONS = [##charset = ISO8859_1, autocommit = 1, sql_auto_is_null = 1, foreign_key_checks = 1, unique_checks = 1, sql_mode = '', character_set_client = 8, collation_connection = 8, collation_server = 8]
- SCHEMA = 
- SQL(0) = create table test.t1 (i int) /* ___SERVICE___ = [dragon] */

Once we are satisfied that replication is working, we can clean up the cluster and try other installation experiments. To clean up a cluster, you need to do the following:

stop the replicator in all servers.
for N in 1 2 3 4; do $PWD/tungsten/tungsten-replicator/bin/replicator stop; done
remove the thl files from all servers.
remove the tungsten_SERVICE_NAME database from all mysql servers
run a "reset master" in the master database
remove the directories created by the installer in all servers

Installation types

The procedure described above was, until a few months ago, the only thing you could do with Tungsten. Now you can broaden your horizons with a wider range of possibilities.

Master/slave is of course the main option, and it's the one that you have seen in the previous section. This method gives you the full set of Tungsten features and performance. It is the recommended method for production use and for benchmarking. In this scenario, the Tungsten replicator on the master will extract transactions from the binary log, transfer them to the THL, and share it with the slaves. The slaves will read from the THL and apply the transactions to the database. There are a few steps more in between, but for the sake of brevity I will skip them You can have a look at Robert Hodges blog for more info.

Slave "direct" is the alternative that you can use in production, and it's been designed to satisfy users who only want some particular benefits on the slave side, and don't care about global transaction IDs. If you are looking at parallel apply, this is probably a setup that you want to try. In this scenario, there is no replicator on the master. The slave pulls data remotely from the binary logs, copies them locally, and extracts data to the THL. Here's an example of how to start a slave-direct system:

./tools/tungsten-installer \
    --direct \
    --master-host=r1 \
    --slave-host=r4 \
    --master-user=tungsten \
    --slave-user=tungsten \
    --master-password=secret \
    --slave-password=secret \
    --service-name=Castor \
    --thl-port=21124 \
    --rmi-port=10104 \
    --channels=5 \
    --home-directory=$TUNGSTEN_BASE \
    --start-and-report

If your purpose is testing Tungsten, probably the Tungsten Sandbox is what you should try. This system is based on MySQL Sandbox, a framework that lets you install more than one MySQL server in the same host. Building on top of MySQL Sandbox, and leveraging the new flexibility in Tungsten installer, tungsten-sandbox allows you to build a master/slave system inside a single host. Let's give it a try. You need to have MySQL Sandbox installed, and at least one MySQL tarball expanded under $HOME/opt/mysql/X.X.XX (where X.X.XX is the MySQL version, such as 5.5.12).

cd $HOME/planet
mkdir sb
cd tungsten-replicator-2.0.3
wget http://tungsten-replicator.googlecode.com/files/tungsten-sandbox
./tungsten-sandbox -h
USAGE: ./tungsten-sandbox [flags] args
flags:
  -n,--nodes:  how many nodes to install (default: 3)
  -m,--mysql_version:  which MySQL version to use (default: '5.1.56')
  -t,--tungsten_base:  where to install the sandbox (default: '/home/tungsten/tsb2')
  -d,--group_dir:  sandbox group directory name (default: 'tr_dbs')
  -s,--service:  how the service is named (default: 'tsandbox')
  -P,--base_port:  port base for MySQL sandbox nodes (default: 710)
  -l,--thl_port:  port for the THL service (default: 1211)
  -r,--rmi_port:  port for the RMI service (default: 1010)
  -v,--[no]version:  show Tungsten sandbox version (default: false)
  -h,--[no]help:  show Tungsten sandbox help (default: false)

In my server, I have already expanded MySQL 5.5.10, and I want to install inside $HOME/tsb. So, here is what I do:

./tungsten-sandbox -m 5.5.10 -t ~/tsb

This command installs three instances of MySQL under $HOME/sandboxes and three of Tungsten under $HOME/tsb. Inside this directory, in addition to the running instances, we find some more goodies:

clear_all send_kill_all start_all status_all stop_all use_all : symlinks to the corresponding commands in MySQL sandbox
db1 db2 db3: these are the tungsten instances.
n1 n2 n3: quick links to access each MySQL node
replicator_all trepctl_all: utilities that run "replicator" or "trepctl" for each node with the arguments provided on the command line

Additionally, there are a few scripts inside each Tungsten instance in the sandbox:

trepctl: a link to the deep down trepctl
thl: a link to the thl utility
show_log: a quick way of showing the replicator log

Since this is a tool for testing, removing it is going to be easy.

~/tsb/replicator_all stop
~/tsb/clear_all
rm -rf ~/tsb/*

A final method of installing is a sandbox with ths slave-direct method. There is no dedicated script for this method, but thanks to the new installer, you can get the job done quite easily:

export NODE_OPTIONS='-c innodb_flush_log_at_trx_commit=2 -c max_allowed_packet=48M'
make_multiple_sandbox --group_directory=tr_dbs --sandbox_base_port=7100 5.5.10

TUNGSTEN_BASE=$HOME/tinstall/tsb/
./tools/tungsten-installer \
    --direct \
    --master-host=127.0.0.1 \
    --master-port=7101 \
    --slave-host=db2 \
    --slave-port=7102 \
    --master-user=root \
    --slave-user=root \
    --master-password=msandbox \
    --slave-password=msandbox \
    --master-log-directory=$HOME/sandboxes/tr_dbs/node1/data \
    --service-name=Castor \
    --thl-port=12112 \
    --channels=5 \
    --rmi-port=20000 \
    --home-directory=$TUNGSTEN_BASE \

./tools/tungsten-installer \
    --direct \
    --master-host=127.0.0.1 \
    --master-port=7101 \
    --slave-host=db3 \
    --slave-port=7103 \
    --master-user=root \
    --slave-user=root \
    --master-password=msandbox \
    --slave-password=msandbox \
    --master-log-directory=$HOME/sandboxes/tr_dbs/node1/data \
    --service-name=Pollux \
    --thl-port=22112 \
    --channels=1 \
    --rmi-port=20000 \
    --home-directory=$TUNGSTEN_BASE \
    --relay-directory=$TUNGSTEN_BASE/relay --start-and-report

This script creates two services (Castor and Pollux), with only one instance of Tungsten replicator, with all the servers (MySQL and Tungsten ones) in the same host.

Conclusions

There should be much more to say, but I will leave it for the coming days. In the meantime, I encourage everyone to try the new Tungsten and submit bug reports when things don't work as expected. As always, happy hacking!
P.S. Today at 10am PT there is a webinar on this very topic!