The Data Charmer: July 2009

Monday, July 27, 2009

Automating MySQL Librarian tasks

The MySQL Librarian is a collection of community generated content, easy to search and to extend. If you have missed the announcement, you may look at Introducing the MySQL Librarian..
To add a new link, you have several ways. You can just use the Add A Link feature inside the Librarian. If the link to be added comes from Planet MySQL, every post has a link to facilitate the insertion to the Librarian. For everything else, adding a link is a manual task. Until today.

Adding easily to the Librarian

Diego Medina, who should be well known in the community as a very active promoter of MySQL Proxy, and one of the most creative bug finders, has made two additions that have already been incorporated into the Librarian.
If you go to the MySQL Librarian now, you will find a "bookmarklet" that you can drag and drop to your browser toolbar (if your browser is either FireFox or Safari, that is). After that, you have a button in your toolbar, which you can use to quickly add a new link.

For example, let's consider an article that we may want to add to the Librarian. Something like MockLoad on Launchpad - MySQL Proxy.

If you have already dragged and dropped the bookmarklet, you can click on your new button, and get to the Librarian page, with almost all the detailed already filled in for you.

Search engines

Diego didn't limit his contribution to adding links. He has also created a new search engine plugin for FireFox, which lets you use the MySQ Librarian as your search engine.


Adding the Librarian engine	Using the Librarian engine

If your default search engine is the Librarian, your search will be sent to MySQL Librarian, as easily as you search in Google or in any other engine.

Thanks, Diego!

Wednesday, July 22, 2009

OpenSQLCamp democracy

We have seen this before. Actually, we got the idea from Drupal, where talk proposals are public, and the most voted ones get in the schedule. Nonetheless, it's a pleasure to see that a transparent voting system is accepted and used.

The OpenSQLCamp 2009 European edition, is under scrutiny. There are 27 session proposals, from which we will need to get 12 in the schedule.
The open voting is done via Twitter or the mailing list.
I have a good feeling about it. Since I am proposing a public talk, I must be prepared for public scrutiny. It's only fair that my proposal is evaluated by the same audience that will later decide to come see my presentation. It's an incentive to write a clear and compelling proposal. I can't get away with "I am well known. I will say something about replication."
If you are attending OpenSQLCamp at FrOSCon, let us know what you would like to see. Let the best proposals win.

Monday, July 06, 2009

Sharding for the masses: the spider storage engine

In my previous article about the Spider storage engine, I made some tests and I saw that the engine has potential. I did also identify some problems, which were promptly addressed by the author. I have looked at the latest version (0.12), and the results are very much encouraging.

Installing the Spider storage engine is not trivial but not extremely difficult either. My previous article has a step-by-step explanation. Before installing, though, you need to apply one patch to the server, to enable condition pushdown to the partitions engine. What is it about? In short, when you issue a query from the Spider table, with a WHERE clause, the query is passed to a backend database without the WHERE clause. True, it's filtered by partition, but each partition may have a large data set, and this means a full table scan on the backend server. In a few cases, this is not a big deal, because the general performance of the Spider engine is better than a non-partitioned table, but it can be improved.
The lack of conditions pushdown is a problem already known in the Federated engine. With the above patch applied, the Spider engine pushes the condition down to the back end, and the query is executed with the WHERE clause intact.
Let's consider this simple query:

select * from salaries where from_date='1996-01-10' limit 10;

Before applying the patch, the backend database would execute this:

select `emp_no`,`salary`,`from_date`,`to_date`
from `employees`.`salaries` limit 0,9223372036854775807

After the patch, the backend receives and execute the more sensible

select `emp_no`,`salary`,`from_date`,`to_date`
from `employees`.`salaries`
where (`from_date` = '1996-01-10')
limit 0,9223372036854775807

A final observation. While I was testing the latest version, I noticed something that should have been clear from the beginning, i.e. that a spider engine set of backends can have multiple front-ends. Meaning that you can create a Spider table from a different host, and access the same backends that another host is already using. In theory, it means more flexibility when dealing with large amount of data. In practice, we'll have to test it more and see what happens.

The Data Charmer

Monday, July 27, 2009

Automating MySQL Librarian tasks

Adding easily to the Librarian

Search engines

Wednesday, July 22, 2009

OpenSQLCamp democracy

Monday, July 06, 2009

Sharding for the masses: the spider storage engine

Pages

Giuseppe Maxia

Like my posts? Buy me a book!

Julian's force

Yes, I am one of those

Sapere, saper fare, fare, far sapere

Blog Archive

Planet MySQL