Sunday, November 27, 2005

MySQL 5 general purpose routine library - I : general concepts

A few days ago I started a forum thread that got quite a good acceptance.

There are rumours from the MySQL community that some sort of routine repository is going to be organized sooner or later, and after I heard them I almost dismissed my initial idea.
However, some people convinced me that starting such an initiative could always be beneficial for the community, and after all, if the official repository shows up, it would be easy to merge into it.

Therefore, here it goes. This is the first article of a series that will explain what I have made in the field of general purpose routines.

The project is available at savannah.org. The documentation is till in the making, so I will start with a general overview.

PURPOSE


This routine library will collect general purpose MySQL 5 stored
procedures and functions. What is to be considered "general purpose"?
Any routine that enhances the language itself or that can be useful
in an abstract database fits the bill.

Since its appearance, MySQL stored routines language has proved itself
useful and powerful, and yet it lacks many features that we have become
accustomed to in other languages, and we would like to see in MySQL::SP
as well. Stored routines that improve the language espressiveness and
usability fall in this category.

Furthermore, there are common tasks that can get accomplished by a stored
routine, and are not related to a particular database. Also these routines
are good candidates for this library.

CURRENT CATEGORIES


At the very beginning of this project, I have put some categories that got
populated faster than others, and thus they became, in my view, the ones with
the highest priority.

ARRAYS AND BASIC DATA STRUCTURES


Perhaps the biggest disappointment when MySQL::SP language was disclosed was
the lack of arrays. Not as a SQL type, of which frankly I can do without, as
it breaks normalization, but as a language component. Without arrays, and
without things like queues and stacks, I feel that my programming abilities
are severely cut down.
Thus this first component, which inmplements named arrays where you access
items by index, but that can also be treated like Perl hashes, where items
are accessed by name. Built-in with them are some stack and queue features,
so that an array is automatically a queue, if accessed with shift/unshift,
or a stack, if accessed with push/pop.
Further routines to transform arrays to strings and strings to arrays make
this library ready for prime time.

"FOR EACH" LOOPS


I don't know you, but I find myself doing quite often things like
$ for $X in `mysql -B -N -e 'show tables'` ;  \
do perl -pe "something with $X" | \
mysql -e "something else with $X" ; done

i.e. I get a list out of mysql, treat the list with some sort of filter,
and then feed it back to the database.
Sometimes I fire up a quick Perl script to do the whole task, but most of the
times the whole thing would be just a simple matter if MySQL had a FOR loop
that you can use from a regular SQL statement.
This component came out of this need. Now I have routines that accept
some parameters and perform for me a loop with a counter, or using a
dataset of table values, or walk through an array.

NAMED PARAMETERS


In the process of making the "for each" routines, I came to the disappointing
discovery that I had to use quite a lot of parameters if I wanted to achieve
some powerful results. When you have routines that need more than three
parameters, as every programmer knows, it is quite difficult to get the order
right, and many mistakes can come from this situation. Thus, I wanted to
implement a simple way of calling my complex routines with maned parameters,
something that in Perl I do like this:
function_name(
db_name => 'world',
table_name => 'City',
expression => 'where CountryCode = "ITA"'
);

This way, if I misplace one of the parameters, it does not matter, because
they are retrieved by name. My routine does almost this ("almost", because
lack of language power prevents an exact implementation). It lets you pass
an array of named parameters (key + value) and a hidden mechanism will
sort out the parameters and pass them to the receiving routine in the
correct order.
The corresponding MySQL code would be:
  select array_set('mypar', 'db_name', 'world'),
array_set('mypar', 'table_name', 'City'),
array_set('mypar', 'expression', 'where CountryCode = "ITA"')\G
call simple_sp('function_name', 'mypar');

As a bonus, I now have an array of parameters that I can reuse when I need
them, perhaps after changing one or two parameter values.

SYNTAX HELPERS


As I said, my routines can grow their parameters quite fast, and remembering
all of them can be difficult even for their own author. Here comes another
component that registers all the routines syntax, and has a way of showing
you a quick excerpt of their usage.
This helper is also a key component of the routines that deal with named
parameters.
Just as an example the reference guide that is currently in the repository
was created by just these two calls:
call for_each_table_value_simple(
database(),
'_routine_syntax',
'routine_name',
'routine_type="function"',
'select fsyntax("$I1")')\G
call for_each_table_value_simple(
database(),
'_routine_syntax',
'routine_name',
'routine_type="procedure"',
'select psyntax("$I1")')\G


Curious for more? Visit the project site and if you are interested in contributing, subscribe to the developers mailing list.

2 comments:

Mike Kruckenberg said...

This is an excellent idea, and one I had been wondering about myself. As a person who's done a lot of Perl there's nothing better than finding a well-developed module on CPAN that can be downloaded and used.

kl said...

Thank you. These routines definitely helped. (myself, and others I've recommended them to.)