Thursday, September 22, 2011

Upgrading Tungsten Replicator: as easy as ...

When I talked about the usability improvements of Tungsten Replicator, I did not mention the procedure for upgrading. I was reminded about it by a question on the TR mailing list, and since the question was very relevant, I updated the Tungsten Cookbook with some quick upgrading instructions. A quick upgrading procedure is as important as the installer. Since we release software quite often, either because we have scheduled features to release or because of bug fixes, users want to apply a new release to an existing installation without much fuss. The upgrade is a very quick and painless procedure. Let's suppose that you have installed a Tungsten Replicator cluster using this command:
#
# using tungsten-replicator 2.0.4
#
TUNGSTEN_HOME=/home/tungsten/installs/master_slave
./tools/tungsten-installer \
  --master-slave \
  --master-host=r1 \
  --datasource-user=tungsten \
  --datasource-password=secret \
  --service-name=dragon \
  --home-directory=$TUNGSTEN_HOME \
  --cluster-hosts=r1,r2,r3,r4 \
  --start-and-report
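Thanks to --start-and-report, the installer starts the replicator and reports its state. If you want to double-check at any time, trepctl can list the running services. A minimal sketch, using the same paths as the command above:

# ask the replicator on the master for its list of services
$TUNGSTEN_HOME/tungsten/tungsten-replicator/bin/trepctl -host r1 services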
If you want to upgrade to the very latest Tungsten Replicator 2.0.5, build 321, this is what you need to do.
  • Get the latest tarball, and expand it;
  • Stop the replicator;
  • Run the update command (this will also restart the replicator);
  • Check that the replicator is running again.
The actual upgrade command is the ./tools/update call in the following script.
#
# using tungsten-replicator 2.0.5-321 (get it from bit.ly/tr20_builds) 
#
TUNGSTEN_HOME=/home/tungsten/installs/master_slave
HOSTS=(r1 r2 r3 r4)
for HOST in ${HOSTS[*]}
do
   # stop the running replicator on the remote host
   ssh $HOST $TUNGSTEN_HOME/tungsten/tungsten-replicator/bin/replicator stop
   # run the update from the new tarball directory
   # (this also restarts the replicator)
   ./tools/update --host=$HOST --user=tungsten --release-directory=$TUNGSTEN_HOME -q
   # check that the replicator is running again
   $TUNGSTEN_HOME/tungsten/tungsten-replicator/bin/trepctl -host $HOST services
done
One benefit of this procedure, in addition to being brief and effective, is that the previous binaries are preserved. Before the upgrade, you will see:
$ ls -lh ~/installs/master_slave/ ~/installs/master_slave/releases
/home/tungsten/installs/master_slave/:
total 32K
drwxrwxr-x 3 tungsten tungsten 4.0K Sep 22 22:03 backups
drwxrwxr-x 2 tungsten tungsten 4.0K Sep 22 22:03 configs
drwxrwxr-x 3 tungsten tungsten 4.0K Sep 22 22:03 relay
drwxrwxr-x 4 tungsten tungsten 4.0K Sep 22 22:06 releases
drwxrwxr-x 2 tungsten tungsten 4.0K Sep 22 22:03 service-logs
drwxrwxr-x 2 tungsten tungsten 4.0K Sep 22 22:03 share
drwxrwxr-x 3 tungsten tungsten 4.0K Sep 22 22:03 thl
lrwxrwxrwx 1 tungsten tungsten   75 Sep 22 22:06 tungsten -> /home/tungsten/installs/master_slave/releases/tungsten-replicator-2.0.4
/home/tungsten/installs/master_slave/releases:
total 8.0K
drwxr-xr-x 6 tungsten tungsten 4.0K Sep 22 22:03 tungsten-replicator-2.0.4
The 'tungsten' directory is a symlink to the actual binaries inside the 'releases' directory. After the upgrade, the same directory looks like this:
$ ls -lh ~/installs/master_slave/ ~/installs/master_slave/releases
/home/tungsten/installs/master_slave/:
total 32K
drwxrwxr-x 3 tungsten tungsten 4.0K Sep 22 22:03 backups
drwxrwxr-x 2 tungsten tungsten 4.0K Sep 22 22:03 configs
drwxrwxr-x 3 tungsten tungsten 4.0K Sep 22 22:03 relay
drwxrwxr-x 4 tungsten tungsten 4.0K Sep 22 22:06 releases
drwxrwxr-x 2 tungsten tungsten 4.0K Sep 22 22:03 service-logs
drwxrwxr-x 2 tungsten tungsten 4.0K Sep 22 22:03 share
drwxrwxr-x 3 tungsten tungsten 4.0K Sep 22 22:03 thl
lrwxrwxrwx 1 tungsten tungsten   75 Sep 22 22:06 tungsten -> /home/tungsten/installs/master_slave/releases/tungsten-replicator-2.0.5-321

/home/tungsten/installs/master_slave/releases:
total 8.0K
drwxr-xr-x 6 tungsten tungsten 4.0K Sep 22 22:03 tungsten-replicator-2.0.4
drwxr-xr-x 6 tungsten tungsten 4.0K Sep 22 22:06 tungsten-replicator-2.0.5-321
If you made any manual changes to the files in 2.0.4, you will be able to retrieve them. Upgrading from earlier versions of Tungsten Replicator is not as smooth. Since we changed the installation format, it is incompatible with previous versions. Clusters running TR 2.0.3 need to be reinstalled manually. The next upgrade, though, will be much faster!
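For example, a recursive diff of the two conf directories (a quick sketch, assuming the usual tungsten-replicator/conf layout inside each release) shows at a glance whether any of your manual changes need to be carried over:

cd ~/installs/master_slave/releases
# compare the configuration of the old and new releases
diff -r tungsten-replicator-2.0.4/tungsten-replicator/conf \
        tungsten-replicator-2.0.5-321/tungsten-replicator/conf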

My three MySQL sessions at OOW 2011 - and much more

Oracle Open World 2011 is approaching. MySQL is very well represented. Sheeri has put together a simple table of all the MySQL sessions at OOW, which is handier than the Oracle schedule. I will be speaking in three sessions on Sunday, October 2nd.
There are 47 MySQL sessions in total. You can see them in the Technocation summary or get the Oracle "Focus on MySQL" PDF. There are huge expo halls at OOW, and among the booths there is also MySQL. The MySQL Community booth, manned by volunteers, is at Moscone West, Level 2 Lobby. Other MySQL booths are listed in the Technocation summary. On the social side, Oracle ACEs will have a dinner on Sunday evening, and MySQL Oracle ACEs will have another gathering on Monday evening. On Tuesday, October 4th, there is a MySQL Community reception. It's free: you don't need an OOW pass to attend, but registration is required.

Monday, September 19, 2011

Chaining Replication Clusters

MySQL built-in replication includes a concept called relay slave, which allows you to create hierarchical database clusters. You can do the same thing with Tungsten, and this can be done in more than one way. Let's start with two distinct clusters. We can follow the recipes in the Tungsten Cookbook to install a master / slave cluster on three separate hosts and a Tungsten sandbox containing another master/slave cluster. Now, we want to make the master in the sandbox a slave of the master in the first cluster, as illustrated in the figure below.

(Figure: chaining clusters, master to master)

Notice that the recipe works in exactly the same way for two distinct clusters on separate hosts. We are using one sandbox to minimize the number of hosts involved. To install the bridge between the two clusters, you go to the directory where Tungsten was installed for the master in the sandbox, and run ./tools/configure-service. The purpose of this command is to create a second service in the master: a service that will act as a slave and fetch data from the other cluster. To do so, we need to provide some information to the service installer, the most important of which are:
  • the local-service-name, a.k.a. who's the boss, will tell the replicator that this service will live with the 'tsandbox' service, which is the local master.
  • the role tells the replicator that this service will fetch data;
  • the master-thl-host is the address where to find a master capable of feeding data to this new slave;
  • the master-thl-port and thl-port options make sure that the service uses one port for its own dispatching and another one to get data from the master.
cd $HOME/tsb2/db1
./tungsten/tools/configure-service -C \
  --local-service-name=tsandbox \
  --thl-port=12111 \
  --role=slave \
  --service-type=remote \
  --master-thl-host=r1 \
  --master-thl-port=2112 \
  --datasource=127_0_0_1 \
  --svc-start \
  dragon
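To confirm that the new service came up, you can ask the replicator for its list of services. A quick check, assuming the sandbox layout used above:

# from $HOME/tsb2/db1: both the local master service (tsandbox)
# and the new slave service (dragon) should appear
./tungsten/tungsten-replicator/bin/trepctl services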
After this connection, every change in the first cluster's master will be replicated to all its slaves, one of which happens to be a master itself, which will then distribute the same data to all its slaves. So we have a cascading hierarchical replication cluster, similar to what we can have with MySQL native replication. But Tungsten can do something more than that. In MySQL replication, you need to configure a slave explicitly (with binary logging and log-slave-updates) before it can act as a relay slave. In Tungsten, you don't need to do anything of the sort.

(Figure: chaining clusters, slave to master)

Using a very similar command, I can connect to a slave of the first cluster, instead of the master, and the final result will be exactly the same.
cd $HOME/tsb2/db1
./tungsten/tools/configure-service -C \
  --local-service-name=tsandbox \
  --thl-port=12111 \
  --role=slave \
  --service-type=remote \
  --master-thl-host=r3 \
  --master-thl-port=2112 \
  --datasource=127_0_0_1 \
  --svc-start \
  dragon
In my presentations, I call this feature "slave with an attitude". Thanks to Tungsten global transaction IDs, a slave can request data from any host. Since the data is not labeled in terms of log files and positions (as it is in MySQL), but in terms of sequence numbers, a slave can ask any server for a given sequence number, and that number identifies a transaction unequivocally.
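As a quick illustration, you can inspect the very same transaction on any host in the chain by its sequence number with the thl utility. A sketch (the seqno and the service name are just examples; check thl help for the exact options of your version):

# from the sandbox master: show the event with sequence number 15
# in the local master service
./tungsten/tungsten-replicator/bin/thl list -service tsandbox -seqno 15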

Friday, September 16, 2011

Quick recipes for database cluster building

One lesson learned in more than two decades working in this industry is that most IT professionals are impatient, want to achieve results immediately, and, most importantly, don't read documentation. Much as the average geek is happy to answer many requests with a dismissive RTFM, the same geeks are not as diligent when it comes to learning about new or updated technologies.

For this reason, there is a kind of documentation that is very much appreciated by busy and impatient professionals: cookbooks. And I am not talking about food. Geeks are not known for being cooks (1), and they like fast food. I am talking about collections of technical recipes: short articles where a problem is briefly stated and a direct solution is shown.

Working with Tungsten Replicator, I am constantly amazed at all the things you can do with it, and at the same time at how few people read the documentation. Since I want more users to be aware of the goodies, and being aware of the geeks' aversion to regular docs, I have started putting together a Tungsten Replicator Cookbook, where users can quickly find the recipe to build their cluster of choice. The problem is stated in one short paragraph, and the solution is outlined with code in the shortest possible way. Can't get any lazier than this! Or maybe you can, but I haven't reached that level yet. I hope I have found a good balance. Some of the recipes that the cookbook offers are:
  • Install a master / slave cluster
  • Install a master / slave cluster with customized parameters
  • Install more than one Tungsten Replicator in one host
  • Install a direct slave with parallel replication
  • Taking over replication from a MySQL slave in direct mode
  • Install bi-directional replication
  • Install bi-directional replication with additional slave
  • Install replication with three masters
  • Install replication with four masters
  • Modify one or more properties with the installer
  • Add one slave to an existing master
In addition to the cookbook, we have inaugurated a sample of another popular literary genre, namely troubleshooting recipes. When things go wrong (and they usually do when you are dealing with something new), you want a quick answer to your problem. These troubleshooting items are aimed at making such quick answers readily available. Both projects are moving targets. We will adjust as the project grows. Contributions and comments are welcome. If you have suggestions on how to improve these documents, you can use the mailing list.

(1) With some notable exceptions, I must say. I have a reputation as a good cook. But then, I represent a minority in so many ways.

Welcome, MySQL commercial extensions

I saw yesterday that MySQL has finally done the right thing, and announced new commercial extensions.
What this means is that paying customers receive something more than users who get the community edition for free.
Believe it or not, when I was working in the community team at MySQL, I was already an advocate of this solution. You may see a contradiction, but there isn't. I would like to explain how this works.

An open source product needs to be developed. And the developers need to get paid. Ergo, the company needs to make money from that product if it wants to continue developing it. Either that, or the company needs to sell something else to pay the bills. (Let's not get into the argument that a pure open source project with universal participation is better, faster, or more marvelous: MySQL was never that, not with Oracle, not with Sun, and not when it was an independent company. If you want an extra-super-ultra open project, go with Drizzle. With MySQL, we need to continue reasoning with the raw material at hand.)
When MySQL was scaling its business, it realized that many customers were not willing to pay for support and consulting alone. To expand the business beyond the simple offer of services, the company created MySQL Network, which soon evolved into MySQL Enterprise, with the addition of the MySQL Monitoring tools and a fast upgrade plan. This was a good proposal for small to medium customers, but not as good for customers with large installations. When you deploy thousands of MySQL servers, you really don't want to upgrade every month. Anyway, for some time, the value proposition from MySQL was that the community users would get one release twice a year, and the paying customers would get one every month.
As a community manager, I strongly objected to that formula, not only because it hurt the community, but also because it hurt customers. When a release is exposed to millions of free users before it goes to the paying customers, chances are that serious bugs are discovered by the community and fixed in due time, before they hurt a high-profile customer and need to be fixed in a hurry at higher cost. One of the main values of MySQL is that its large community adoption and feedback increase stability. Fortunately, I was not the only one who believed that larger distribution is valuable for customers, and the decision was reversed at the end of 2008.
In that period, I and other employees recommended a different value proposition for our customers. Instead of selling fast upgrade plans (which become a liability), MySQL could develop some reserved features that would be given only to paying customers.
There are two problems with reserved features, though. The first is that you need to develop them internally: you can't start them in the open, asking the community to test them for bugs, and then give them only to customers when they are ready (there was a faux pas in that direction in early 2008, but it was promptly retracted). These features must be developed as closed source, and tested only internally. The second problem is that MySQL had little internal QA manpower when this discussion arose.
There was another issue, namely that the code base for the next intended version (the fabled 6.0) was brittle. After two years in alpha stage, there was little to show for the effort. In parallel to the Oracle acquisition, two important things happened: version 6 was abandoned, and a new effort was started, using the more stable 5.x line as a code base, with a new development model based on milestones and robustness.
This new foundation, combined with the injection of experienced QA personnel from the ranks of Sun and Oracle, made the project ready to offer reserved features to customers, while continuing the development of a lot more features for the community edition.
From a community standpoint, I welcome the commercial extensions. It means that the MySQL project will get more revenues, and will be able to sustain the public releases more easily. In my opinion it's the logical evolution of the project, and it's what MySQL should have done years ago, had it had the manpower.
There are already detractors who see this release as a sign of the apocalypse. They probably want to see only this one feature in the commercial arena, dismissing the dozens of new features released to the general public under Oracle stewardship. I refuse to enroll in the chorus of critics who judge Oracle on prejudice. I am so far satisfied with what Oracle has done with MySQL. My opinion is based on facts, and since the facts are in favor of the community, I am pleased to say so. If future facts make me change my opinion, I will say so. But for now, I want to say thank you to the MySQL team at Oracle!

Thursday, September 08, 2011

The happiness of failing installations

When you set up the same software several times (for yourself or for your customers), you want that software to install quickly and reliably, and you are generally happy when everything works as expected.
In this context, a failing installation is when the installation process exits unexpectedly, and you are left with an error message and the prospect of digging through the manual to find out what it was.

A failing installation is unpleasant, you'd say, and I concur. But do you know what's more unpleasant than a failing installation? It's an installation that succeeds, only to fail silently the first time you try using the application.

Looking at this enhanced definition, it is no surprise that I claim to find happiness in failure. And I have practical reasons for my claim. When I first tried a Tungsten Replicator installation, it succeeded. And to my chagrin, the application did not work. I had to dig the reason for the failure out of the logs, and from that reason I had to figure out what I had done wrong. For example, the log might say "file not found mysql-bin.000003", and from that piece of information I had to figure out that I had forgotten to make the binary logs directory group-readable, so that the 'tungsten' user could see the logs.
But a "successful" installation with later failure often meant that a clean shut down was not possible, and then I had to become an expert at cleaning up messy installations.
The next installation might get past the failure point, and possibly fail (again silently) for a different reason. Sometimes I had to install four or five times until I got to a working and stable state. And then I'd install on another server, make a different mistake (or forget to apply the cure for a known one), and the stream of successful installations with hidden failures continued for a while.

With the above reminiscences, I am very happy to report that now you can install Tungsten Replicator with the near assurance that when something goes wrong, the installation does not start, and you are given a clear list of what was wrong.
The installer runs a long list of validation probes, and it doesn't stop at the first validation failure. It will try its best to tell you what you should do to reach a satisfactory installation, giving you a detailed list of everything that doesn't match up.
Not only that: the installer checks the requirements on all the servers in your intended cluster, and the installation does not start anywhere until you meet all the requirements on all of them.

That's why, when my installation fails, I feel very happy, knowing that I won't have to clean up a messy server, and when I fix the problem that made the installation fail, my application will most certainly work.

Wednesday, September 07, 2011

Tungsten Replicator 2.0.4 released: usability and power

It has been a bumpy ride, with dozens of issues opened and resolved, but we finally feel that Tungsten Replicator 2.0.4 is ready for prime time. There have been quite a lot of changes. Most notably, the replicator is much faster, especially when it comes to parallel replication, and it is much easier to install, thanks to its new integrated installer, which can validate all the requirements to install the replicator and suggest remedies when the requirements aren't met. This new installer is so good, in fact, that calling it an installer is an insult. It is a legitimate cluster builder, able to install a full-fledged cluster from a central location.
Probably equally important, we have caught up with the documentation, and now you can install several replication topologies following detailed instructions from the docs. You will find both HTML and PDF guides, with the steps to install straight master/slave systems, direct slave takeover, or bi-directional replication. The binaries are available in the project's Downloads page. Later on, you will find the most updated (and possibly less bug-infested) binaries in our build server list. The Release_Notes list all the issues that have been closed since we released 2.0.3.

Advanced users will especially appreciate an innovation introduced in the installer, which now lets you define one or more --property=key=value options. Used wisely, this option lets you customize the replication properties right from the start. What used to require several commands and a restart of the replicator right after the installation now flows smoothly and quickly with one single command.

With this release, Tungsten Replicator is closer to becoming a tool for mass consumption. The old installation method (which we have deprecated and renamed, to discourage anyone from using it) required time and constant attention, and it was unforgiving. The new one will let you make your mistakes freely. If something is amiss anywhere in all the servers where you are installing, it won't install, and it will tell you what went wrong. This is probably my favorite feature, because it allows Tungsten to be used by less experienced users. Now it's up to the users. We have no illusion that the product is bug free, and we want to hear from users who try it and report on Issues.
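As a sketch of how this looks in practice, here is the master/slave installation command from the upgrade post above, with a property override applied at install time. The property key is only an illustration; check the shipped .properties files for the exact keys you need:

# the --property option sets a replication property at install time
# (illustrative key: trims THL logs after three days)
./tools/tungsten-installer \
  --master-slave \
  --master-host=r1 \
  --datasource-user=tungsten \
  --datasource-password=secret \
  --service-name=dragon \
  --home-directory=$TUNGSTEN_HOME \
  --cluster-hosts=r1,r2,r3,r4 \
  --property=replicator.store.thl.log_file_retention=3d \
  --start-and-report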

Sunday, September 04, 2011

Finding tables without primary keys

I was checking a third party server, and I needed to find out whether there were tables without primary keys. This is important to know, not only because the lack of primary keys affects performance and data accuracy in general, but also because in row-based replication performance can degrade beyond belief when updating tables without primary keys. Anyway, I did not remember off the bat any method to get this information from a server with thousands of tables, and thus I went to find a solution on my own. My first instinct called for using the COLUMNS table from the INFORMATION_SCHEMA, and so I came up with this query, where I sum the number of columns that are inside either a PRIMARY or UNIQUE key and filter only the ones where such a sum is zero (i.e. no primary or unique keys):
select 
    table_schema,table_name 
from  
    information_schema.columns  
group by 
    table_schema,table_name   
having 
    sum(if(column_key in ('PRI','UNI'), 1,0)) = 0;
This query got the job done, and it was quite quick as well. Then, since I was chatting with Sheeri Cabral about other matters, I asked her if she could come up with an alternative solution. She suggested a LEFT JOIN between the information_schema.tables and information_schema.statistics, which I translated into this query:
select 
    t.table_schema, t.table_name 
from 
    information_schema.tables  t 
    left join information_schema.statistics s
       on t.table_schema=s.table_schema and t.table_name=s.table_name 
       and s.non_unique=0 
where 
    s.table_name is null;
This query works on the principle of an exclusion join: it removes from the tables list all the ones that have a corresponding unique index in the statistics table, keeping only those without one. This query also works. Using both queries in a relatively empty server did not show any significant difference. But since I knew that I had to use this method on a very busy server, with a lot of tables, I quickly created 1,000 databases, each containing 5 tables, two of which did not have any primary or unique key.

Now came the first surprise. The query with GROUP BY took about 0.5 seconds, while the one using LEFT JOIN took 11 seconds. I was about to congratulate myself for my acumen, when I realized that, in addition to schema and table names, I also needed the table engine. For the second query, that is not a problem: adding the engine to the columns list works fine, and the query runs in 11 seconds as before. The first query, though, can't list the engine. There is no 'engine' column in the COLUMNS table. So I needed a JOIN. Thus my query became:
select 
    t.table_schema,t.table_name,engine 
from 
    information_schema.tables t 
    inner join information_schema.columns c
        on t.table_schema=c.table_schema and t.table_name=c.table_name 
group by 
    t.table_schema,t.table_name   
having 
    sum(if(column_key in ('PRI','UNI'), 1,0)) =0;
Guess what? This query ran in 17 seconds (!). So much for my instinct! Joins without keys are not efficient in MySQL, and the tables in the information schema are no exception. If anyone has a more efficient method of getting a list of tables without primary keys (the list including schema name, table name, and engine), I am curious to know.
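For what it's worth, one variant would be to keep the fast GROUP BY query as a derived table and fetch the engine from TABLES only for the matching tables. I haven't benchmarked it on the same dataset, so treat it as a sketch:

-- tables with no PRIMARY or UNIQUE key, with their engine;
-- the inner query is the fast GROUP BY from above
select
    t.table_schema, t.table_name, t.engine
from
    information_schema.tables t
    inner join (
        select table_schema, table_name
        from information_schema.columns
        group by table_schema, table_name
        having sum(if(column_key in ('PRI','UNI'),1,0)) = 0
    ) nopk
       on t.table_schema = nopk.table_schema
      and t.table_name = nopk.table_name;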

Friday, September 02, 2011

Primary keys from experience

From time to time I see articles in defense of natural primary keys against surrogate keys. I don't take an immovable stand on either side, as I have seen good cases for both. In general, I like the idea of a natural primary key, when I see one that is really natural. Quite often, though, a natural primary key has proved itself faulty. And most of the times, when such faults happened, it was because of limited understanding of the data. I like to describe this kind of misinformation as data analysis culture clash.

When choosing a natural primary key, one should consider which element, or group of elements, is unique in a given set. Not only that: they must also be immutable, at least within that set. For example, in a group of people, we may assume that a combination of name, surname, date and place of birth is a good natural primary key. Well, no. It isn't, for several reasons. Read on for a few real cases. If we rule out the above combination, perhaps we could use the Social Security Number as a valid key? Not really. The fact is that the above assumptions work well if we consider only people from the same country in current times. If we extend our data definition to include people from different countries, or historical records, then the assumption collapses.

A practical case: the birth place. This seems a fair assumption. In combination with other elements (e.g. date of birth, name, and surname) it can provide a good basis for unique and immutable records. If you consider people in the United States today, you are right. Even in the UK, or Italy, or France. But try applying this method to places with recent political changes due to wars or revolutions, and suddenly the name of the town may change. What was before VictoryBurg in Oppresslandia is now known as Heroes City in Freelandia. And this happens now, in the 21st century. If your set includes historical data, these occurrences may become frightfully frequent.

Speaking of historical times, if you are dealing with really old records, you may want to consider that dates are less than immutable. The way we count days in most Western countries is called the Gregorian calendar, which was adopted by a handful of countries in 1582. Dates before October 5, 1582 and after October 14, 1582 are recorded using two different calendars. It is inconvenient, but if you know that, you may avoid wrong date calculations. Simple, isn't it? Not really. If your records include data from different countries, you will have to take into account when this calendar was adopted. Just to give a few examples, England adopted the Gregorian calendar in 1752, Japan in 1873, Russia in 1918, and Turkey in 1926.

When dealing with people from different countries, you may be tempted to use citizenship as an immutable property. That may work, if you consider citizenship at birth and are prepared to keep the names of no longer existing states. If, instead, you get the citizenship from the employee's passport, you may run into one or more of the following cases (all happened in practice during my work with an international organization):
  • Some employees came to work with the passport they had before the political change, and now that the country has split, they have different passports, depending on which side of the new border they live.
  • Some employees have parents from different countries and are entitled to more than one passport. They came to work initially with one passport, and came again a few years later with a different one because of different benefits.
  • Some employees started working with a given citizenship, and then they got a different one because of marriage, political asylum, or other legal means.
  • Some employees became stateless. The equivalent of NULL in a table field.
Coming to the Social Security Number, the assumption of uniqueness fails for the same reason that citizenship does. It happens even more frequently, because people working abroad pay their taxes in a foreign country, and so they have two or more SSNs (or equivalents) to show. But this element also fails for another reason: it is not immutable. It is supposed to be, but in some cases it happens that, due to clerical mistakes, the SSN issued in a given country is wrong and needs to be changed. It happens in Italy, where your SSN (called "codice fiscale", or fiscal code) must match your name, surname, place, and date of birth. If any of these elements was wrong when the code was generated, the code needs to be issued again. It's a painful process that requires a court order, but the result is that the item is not immutable.

Other elements that I have seen used wrongly for primary keys, either standalone or as key components, are: telephone numbers (they can change, and after being changed they can be assigned to other people), email addresses (they are unique, but they can easily change, or they can be abandoned when the person changes provider, or company, or both), gender (it can change), and surname (it can change, legally, for both men and women, depending on country and conditions).

Summing up, a sane amount of skepticism should be used when considering whether an element can be used in a primary key. Depending on the environment, the element can be safe, or it can become a nightmare when the database grows from a neighborhood business to an international venture.
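To make the conclusion concrete, here is a minimal sketch (table and column names are my own invention) of the surrogate-key alternative: the would-be natural key columns stay in the table as plain, freely updatable attributes, while the primary key carries no meaning at all:

-- hypothetical schema: a surrogate primary key, with the
-- mutable "natural" candidates kept as ordinary columns
CREATE TABLE person (
    person_id    BIGINT UNSIGNED NOT NULL AUTO_INCREMENT,
    name         VARCHAR(100)    NOT NULL,
    surname      VARCHAR(100)    NOT NULL,  -- can change legally
    birth_date   DATE            NULL,      -- calendar caveats apply
    birth_place  VARCHAR(100)    NULL,      -- towns get renamed
    citizenship  VARCHAR(50)     NULL,      -- passports change
    ssn          VARCHAR(30)     NULL,      -- not unique, not immutable
    PRIMARY KEY (person_id)
) ENGINE=InnoDB;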