Tuesday, December 29, 2009

Filtering mysqldump output


MySQLdump filter

Several people have asked for a more flexible way of handling user privileges in mysqldump output.
When you dump the data structure for views, triggers, and stored routines, you also dump the permissions related to such objects, with the DEFINER clause.
It would be nice to have such DEFINER clauses removed, or even replaced with the appropriate users in the new database.

The mysqldump filter was created with this need in mind. It lets you remove all DEFINER clauses or, if you prefer, replace them with a more suitable one.
For example:

mysqldump --no-data sakila | dump_filter --delete > sakila_simple.sql
mysqldump --no-data sakila | dump_filter --replace='newuser@`10.%`' > sakila_secure.sql

The first example removes all references to DEFINER, while the second one replaces every definer with a new user name.
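
To make the effect concrete, here is an illustrative DEFINER line of the kind that mysqldump writes for a view, followed by what each filter mode would turn it into. The definer root@localhost and the version number in the comment are assumptions made for the example; the exact text depends on your server version.

```sql
-- As dumped (illustrative; the definer and the version number are assumed):
/*!50013 DEFINER=`root`@`localhost` SQL SECURITY DEFINER */

-- The same line after dump_filter --delete:
/*!50013  SQL SECURITY DEFINER */

-- The same line after dump_filter --replace='newuser@`10.%`':
/*!50013 DEFINER= newuser@`10.%` SQL SECURITY DEFINER */
```

Only the DEFINER clause itself is touched; the rest of the versioned comment is passed through unchanged.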

UPDATE Since MySQL Forge is no longer available, here's the code:


#!/usr/bin/perl
use strict;
use warnings;
use Getopt::Long qw(:config no_ignore_case );

my $replace = undef;
my $delete  = undef;
my $help    = 0;

GetOptions (
   'replace|r=s' => \$replace,
   'delete|d'    => \$delete,
   'help|h'      => \$help,
) or help('unrecognized options');

help() if $help;

if ($delete and $replace) {
    help( 'you must choose EITHER "delete" OR "replace". Not both');
}

$delete = 1 unless $replace;

# read the dump from standard input, one line at a time
while (my $line = <STDIN>) {
    if ($delete) {
        $line =~ s{(/\*!\d+ )\s*definer\s*=\s*\S+}{$1}i;
    }
    elsif ($replace) {
        $line =~ s{(/\*!\d+\s+definer\s*=\s*)(\S+)}{$1 $replace}i;
    } 
    print $line;
}

sub help {
    my ($msg) = @_;
    if ($msg) {
        print "*** $msg\n";
    }
    print "dump_filter - mysqldump filter \n",
          "(C) Giuseppe Maxia, 2009\n",
          "removes/changes DEFINER clauses from MySQL dumps\n",
          "USAGE: dump_filter [options]\n",
          "    -d|--delete    removes the DEFINER clauses\n",
          "    -r|--replace=s replaces every DEFINER clause with the \n",
          "                   new value provided\n",
          "    -h|--help      This text\n";
    exit(1);
}


__END__
notes: MySQLdump filter.
Removes or replaces the DEFINER clauses from a dump.
It is necessary when you want to migrate a database to a new server with different users.

Thursday, December 24, 2009

Holiday gift - A deep look at MySQL 5.5 partitioning enhancements


A deep look at MySQL 5.5 partitioning enhancements

Half a day into my vacation, I managed to finish an article on a topic that has been intriguing me for a while.
Since several colleagues were baffled by the semantics of the new enhancements of MySQL 5.5 partitions, after talking at length with the creator and the author of the manual pages, I produced this article: A deep look at MySQL 5.5 partitioning enhancements.
Happy holidays!

UPDATE This matter was trickier than it appeared at first sight. As Bug#49861 shows, several MySQL engineers were initially fooled by the multiple column partitions. I also wrote something wrong in the article, and I have updated the text to explain the behavior of the partitioning engine more accurately.

Sunday, December 20, 2009

MySQL Conference 2010 - The call for participation is open


MySQL Conference 2010

The MySQL Conference 2010, with Sun Microsystems as founding sponsor, has opened its Call for participation.
There is already an impressive lineup of tutorials, and I don't say that only because I am on that list. You will find the usual suspects (Replication, Cluster, Certification) and several new ones: Partitioning (covering 5.5), Drizzle replication plugins and core development, Dual master setup, Scaling Applications, Diagnosing and fixing performance, Inspecting variables, command line magic.
The names next to the above topics are all well known: Sheeri K. Cabral, Brian Aker, Jay Pipes, Toru Maesaka, Baron Schwartz, Andrew Hutchings, Andrew Morgan, Geert Vanderkelen, Arjen Lentz, Morgan Tocker, Kai Voigt, Alan Kasindorf, Padraig O'Sullivan, Mats Kindahl, Lars Thalmann, and yours truly.
If you want to join the ranks of these already confirmed folks, it's now time to gather all your skills and make a proposal for a talk.
The Call for Participation opens today, and it's open till the end of January. Not much time, but not a bad deal either. You certainly have enough time to pull together an excellent proposal. If you don't remember the rules, you can read again what I wrote about the past conference. The review committee is different, but the current one cares about quality as much as the previous ones.
And consider how much better it is this year for innovative speakers: due to the delay in the conference announcement (on the reasons for which I am not going to speculate), you are in a unique position, since you can propose talks based on technology that has been released in the past two months. There are a whole lot of talks to be created around MySQL 5.5 and other fresh releases.
Don't be lazy. Start writing your proposal now, and you may be in the conference schedule by February!

Tuesday, December 15, 2009

Getting started with MySQL 5.5


MySQL 5.5

Some time ago, we announced a new release model for MySQL. Like all new things, it had some initial hiccups (with MySQL 5.4 we were still getting acquainted with the new model), but now it seems to be in full swing.
By the time you read these lines, MySQL 5.5 will be available. If the mirrors aren't seeded yet, the impatient can compile and use the new version from the Launchpad source tree.

Overview

What's this new release anyway? I'll leave it to Kaj's blog to give you the full description. Suffice it to say that this release is the second milestone of the current installment. It is of beta quality, and it will mature to RC quality. There will be yet another milestone before we release a GA in mid 2010.
One thing that this milestone shows is that there are no dead weights. If a feature doesn't make the deadline, i.e. it doesn't reach beta quality by the scheduled date, it will be dropped, and possibly rescued at the next milestone.
With the introduction of the milestone model, we have also increased our internal QA, especially thanks to the Random Query Generator, which finds bugs in early stages of the code faster than any other method. (1)

Built-in InnoDB plugin

The InnoDB plugin 1.0.5 is included in the distribution, and, unlike MySQL 5.1, it's built-in. There is no need to load and register the plugin. The performance enhancements developed for MySQL 5.4 are now available together with the other enhancements available with the InnoDB plugin. This was already available in the previous milestone, but it's worth mentioning it now, because not many people are aware of that.

Semi-synchronous replication

Of all the new features, this one is probably the most relevant. It is based on a patch made by Google to the InnoDB engine, and adapted by MySQL developers to make it engine-independent.
In short, it's a safety device that establishes some internal communication between master and slaves, and makes sure that at least one slave has received the changes being committed. That is, before committing, the master waits until at least one slave has acknowledged that it has received the portion of binary log necessary to reproduce the transaction.
UPDATE As Harrison Fisk rightly notes, there is a mistake in my description. The acknowledgment of the transaction being sent to the slave only happens after the master's commit to the binary log.
Some caveats apply:
  • It's called semi-synchronous replication, because it doesn't necessarily apply to all the slaves. Although you can manually check if the replication has worked for all the slaves, it's enough for the master to make sure that at least one slave has got the goods.
  • Received by a slave doesn't mean "executed". The slave acknowledges when it has got the binary log, even if the SQL thread is busy or stopped.
  • If there is no slave that can acknowledge receipt (e.g. slaves are all down or stopped), then the master reverts to the normal asynchronous operation.

To use this feature, you need to install two plugins: one for the master and one for each slave. No need to compile anything, though. They are provided with the binaries. All you need to do is load the appropriate plugin for each server.
master >  INSTALL PLUGIN rpl_semi_sync_master SONAME 'libsemisync_master.so';

slave1 > INSTALL PLUGIN rpl_semi_sync_slave SONAME 'libsemisync_slave.so';

Additionally, there are a few variables that you must set, either in the options file or online.
master > SET GLOBAL rpl_semi_sync_master_enabled=1;
slave1 > SET GLOBAL rpl_semi_sync_slave_enabled=1;
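
Before relying on the feature, it may be worth checking that both sides are really active and how long the master will wait for an acknowledgment. The statements below are a sketch based on the variable names documented for MySQL 5.5; an early milestone build may differ, so treat the names as assumptions to verify on your own server.

```sql
-- On the master: is semi-synchronous replication currently in effect,
-- and how many semi-synchronous slaves are connected?
SHOW STATUS LIKE 'Rpl_semi_sync_master_status';
SHOW STATUS LIKE 'Rpl_semi_sync_master_clients';

-- On the master: how long (in milliseconds) to wait for a slave
-- acknowledgment before falling back to asynchronous replication.
SHOW VARIABLES LIKE 'rpl_semi_sync_master_timeout';

-- On each slave: confirm that the slave side is enabled.
SHOW VARIABLES LIKE 'rpl_semi_sync_slave_enabled';
```

If a slave was already replicating when its plugin was enabled, the manual indicates that its I/O thread has to be restarted (STOP SLAVE IO_THREAD; START SLAVE IO_THREAD;) before it registers with the master as a semi-synchronous slave.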

Now that the system is ready, let's see how to use it.
Before doing anything, we ask for the value of two important status variables:
SHOW STATUS LIKE 'Rpl_semi_sync%tx';
+-----------------------------+-------+
| Variable_name               | Value |
+-----------------------------+-------+
| Rpl_semi_sync_master_no_tx  | 0     |
| Rpl_semi_sync_master_yes_tx | 0     |
+-----------------------------+-------+

The first one is the number of failed synchronized transactions, the second one is the number of successful ones. Since nothing has happened so far, they are both zero.
create table t1 (i int not null primary key) engine=innodb;
Query OK, 0 rows affected (0.13 sec)

SHOW STATUS LIKE 'Rpl_semi_sync%tx';
+-----------------------------+-------+
| Variable_name               | Value |
+-----------------------------+-------+
| Rpl_semi_sync_master_no_tx  | 0     |
| Rpl_semi_sync_master_yes_tx | 1     |
+-----------------------------+-------+
The first operation (a table creation) was successfully transferred to a slave. Let's do one more.
set autocommit=0;
Query OK, 0 rows affected (0.00 sec)

insert into t1 values (1);
Query OK, 1 row affected (0.00 sec)

COMMIT;
Query OK, 0 rows affected (0.00 sec)

SHOW STATUS LIKE 'Rpl_semi_sync%tx';
+-----------------------------+-------+
| Variable_name               | Value |
+-----------------------------+-------+
| Rpl_semi_sync_master_no_tx  | 0     |
| Rpl_semi_sync_master_yes_tx | 2     |
+-----------------------------+-------+
This one was successful too.
Now, let's try something sneaky. On each slave, we execute "STOP SLAVE SQL_THREAD". The slaves will no longer apply changes, but semi-synchronous replication will go on, because each slave's I/O thread still receives the binary log and acknowledges it.

insert into t1 values (2);
Query OK, 1 row affected (0.01 sec)

SHOW STATUS LIKE 'Rpl_semi_sync%tx';
+-----------------------------+-------+
| Variable_name               | Value |
+-----------------------------+-------+
| Rpl_semi_sync_master_no_tx  | 0     |
| Rpl_semi_sync_master_yes_tx | 3     |
+-----------------------------+-------+
2 rows in set (0.00 sec)

The semi-synchronous replication has worked. However, if we query both master and slaves, only the master has the new record. The slaves have it only in their relay logs, which you can easily verify with mysqlbinlog.

Enhanced partitioning syntax

About one year ago, I briefly announced that this feature was in the making. With some interface improvements, it is now part of regular partitioning. It's an extension of partitioning BY RANGE. As you know, until now you could only partition on one column value, and only on INTEGER columns. Both restrictions are lifted in 5.5, with a syntax change that makes the code more readable and the overall feature more usable.
You can now partition by date, datetime, varchar, and char columns, not just integers, and you can use more than one column in your list. The most immediate usage of this extension is the ability of using dates without resorting to functions that convert the dates into integers. For example:
CREATE TABLE t2 
(dt date,a int, b int, c int)
PARTITION BY RANGE COLUMNS (dt)
(
PARTITION p0 VALUES LESS THAN ('2007-01-01'),
PARTITION p1 VALUES LESS THAN ('2008-01-01'),
PARTITION p2 VALUES LESS THAN ('2009-01-01'),
PARTITION p3 VALUES LESS THAN (MAXVALUE)
);
The COLUMNS keyword does the trick. The manual has more examples.
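
The multi-column case is the one that tends to surprise people, so here is a minimal sketch of it. The table t3 and its values are made up for illustration. The point to keep in mind is that the column list is compared as a whole tuple, left to right, and not column by column.

```sql
-- Hypothetical table, partitioned on two columns at once.
CREATE TABLE t3
(a INT, b INT, c CHAR(3))
PARTITION BY RANGE COLUMNS (a, b)
(
PARTITION p0 VALUES LESS THAN (10, 20),
PARTITION p1 VALUES LESS THAN (20, 30),
PARTITION p2 VALUES LESS THAN (MAXVALUE, MAXVALUE)
);

-- (5, 100) goes to p0: 5 < 10 decides the comparison, b is not examined.
INSERT INTO t3 VALUES (5, 100, 'abc');
-- (10, 5) goes to p0: a ties at 10, then 5 < 20.
INSERT INTO t3 VALUES (10, 5, 'def');
-- (10, 25) goes to p1: a ties at 10, but 25 is not less than 20.
INSERT INTO t3 VALUES (10, 25, 'ghi');

-- Row counts per partition (exact or estimated, depending on the engine).
SELECT partition_name, table_rows
FROM information_schema.partitions
WHERE table_schema = DATABASE() AND table_name = 't3';
```

The first insert is the counterintuitive one: a value of 100 in column b does not stop the row from landing in p0, because the first column already settles the comparison.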

The partition helper has been updated to handle this new feature and generate partitions accordingly.

SIGNAL and RESIGNAL


If you have used stored routines extensively, you will certainly have asked yourself "why isn't there any way of raising an exception?" In the SQL standard, exception handling is implemented using the SIGNAL and RESIGNAL keywords, which were notably missing in MySQL 5.0 and 5.1 stored routines.
There have been many clever hacks by several community members to emulate the missing SIGNAL, but none were quite satisfactory. After a long wait, here we have SIGNAL and RESIGNAL, which make stored routine programming much more robust and easier to debug. An authoritative example on how to use the new syntax is available in Roland Bouman's blog.
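
For a quick taste of the syntax, here is a minimal sketch. The procedure names and the message text are invented for this example; SQLSTATE '45000' is the generic class for an unhandled user-defined exception.

```sql
DELIMITER //

-- Hypothetical routine: refuse non-positive input by raising an error.
CREATE PROCEDURE check_positive(IN val INT)
BEGIN
    IF val <= 0 THEN
        SIGNAL SQLSTATE '45000'
            SET MESSAGE_TEXT = 'val must be positive';
    END IF;
END //

-- Hypothetical wrapper: intercept the error, do some cleanup,
-- then pass the same error on to the caller with RESIGNAL.
CREATE PROCEDURE checked_call(IN val INT)
BEGIN
    DECLARE EXIT HANDLER FOR SQLEXCEPTION
    BEGIN
        -- cleanup or logging would go here
        RESIGNAL;
    END;
    CALL check_positive(val);
END //

DELIMITER ;

-- Fails with SQLSTATE 45000 and the message set above.
CALL checked_call(-1);
```

Without SIGNAL, the usual workaround was to provoke an unrelated error on purpose, such as calling a routine that does not exist, which left the caller with a misleading message.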

There is more. For the complete list of features, have a look at the official manual.
Happy hacking!

UPDATE Added more partitions to the example, as suggested by Jon.

(1) For the more technologically savvy, here's how Philip Stoev, one of my distinguished QA colleagues, describes the enhancements:
Historically, most of the MySQL tests have been manually created, however a modern database is so complex that it is impossible to test manually even a tiny percentage of the available functionality. Therefore for Betony [codename for MySQL 5.5], and the upcoming Celosia [5.6], the majority of our testing effort was concentrated around stochastic testing, using random data and millions of random queries to validate the behavior of the server across a wide range of scenarios and workloads.
For each new feature, and some existing ones, we automatically generated tests that attempt to cover all relevant SQL constructs, including the interaction between the feature being tested and existing code within the server. For features that have concurrency implications, we ran the random queries as a stress test or along with concurrent DDL statements. For areas such as the partitioning, we used the random queries to functionally validate the new code, by comparing the result from each query to a reference source, such as a previous version of the server.

Monday, December 07, 2009

MySQL user groups in Dubai and Sydney, on my way to NZ


Travel to LCA 2009

In January 2010 I will attend Linux.Conf.Au, which this year is held in Wellington, New Zealand.
It's a long way from Europe to New Zealand, and so I will take a few stops.
On January 13 I will be in Dubai, UAE. If you are around, I would love to organize a MySQL meeting. I have heard back from the local user group and it seems that a meeting will take place. Stay tuned for more.
On January 15th I will be in Sydney. The organizers are already at work. We will definitely have a user group meeting. I am open to suggestions about the topics.

From Sydney, I will continue to Wellington, where I will attend LCA2010 and then DrupalSouth before coming back to my usual time zone.

Friday, December 04, 2009

Gearman: distributed computing and Codebits pictures


Gearman

The first Codebits day lasted until long past midnight. So the attendees were a bit sleepy today, but they were brave and got up early enough for my session.
The presentation covered the basics of Gearman, some advanced magic to install remote MySQL servers, and more magic to enable MySQL users to shoot themselves in the foot repeatedly by combining a gearman/MySQL UDF and some clever scripts.
As usual, the slides are available on slideshare.
Some pictures from codebits 2009 are on Flickr.

Codebits 2009 - worshiping technology
Worshiping Technology.
Codebits 2009
Cool technology to worship.

Codebits 2009 - José, the mastermind
José, the mastermind

Codebits 2009 - Josette, the culture provider
Josette, the culture provider

Codebits 2009 - Gonçalo
Gonçalo, the friendly face who met me at the airport
Codebits 2009 - Pedro Moura Pinheiro
Pedro, magnificent guide of town and technology.

MySQL schema maintenance


Workbench

At CodeBits I had my first session about MySQL schema maintenance. I covered the basic command line possibilities before coming to the recommended tool, MySQL Workbench.
The slides are available at slideshare.


Interesting questions ([updated] with answers from the development team):
  • [Q] Are there plans to administer MySQL Cluster with Workbench?
    [A] Not that we know of.

  • [Q] Can Workbench deal with user permission maintenance across servers? (especially in cases where development and production users can't have the same privileges)
    [A] YES. It's in the roadmap

  • [Q] Can MySQL Workbench help with editing stored routines? Apparently, you can't change the routine code with ALTER PROCEDURE/FUNCTION; you need to drop it, recreate it, and possibly change privileges.
    [A] Not that we know of, although something can be done with Proxy to ease this problem.

  • [Q] Is the code for the plugin documented for contributors?
    [A] Not yet, but it's going to be soon.