Friday, January 27, 2006

MySQL 5 general purpose routine library - VII: Testing and best practices

(Seventh part of the MySQL 5 general purpose routine library presentation.)

Test units

Testing is often considered a boring activity, something that is not cool enough for a developer to take into consideration.
Sometimes, it's true. Testing is almost as exciting as having a haircut or trimming your nails.
It needs to be done, but you hardly welcome the task.
However, that happens when you don't test with method. Because testing can be fun and it can be as rewarding as any other programming activity.

What you need to do, is to move your testing schedule up to the development schedule and put it in front of the code.That way, you don't have just "some testing", which is usually fumbling with a few examples to ensure that your routine does what you thought it should do.
That's not testing. That's sampling.
And if that fails, you will test even less after you fix the bug (or you think you did),
because that sort of testing is really boring and tiresome.

What you should do instead is to create testing units.
The practice of unit testing (http://en.wikipedia.org/wiki/Unit_testing) calls for the creation of a set of tests to probe all facets of the code you are about to check.
This practice is well known to Perl programmers, since almost all modules from Perl itself and from CPAN come with a standard set of tests. Some modules have even thousands of them.
The clever addition of recent practice, though, is not just writing the testing units,
but writing them before the actual code.
The idea is this: write your specifications as testing units, i.e. put down a stream of cases to prove that the code being produced meets your expectations. If your specifications are detailed and complex, so will be the testing units, which will include as many detail as your specification requires.
Using this approach, you will know that your routine works when it passes all the tests.

Unit testing in MySQL

Testing in Perl is a powerful experience. There are so many tools available, that testing it's becoming an art in itself. There are modules to help you test just about everything. Writing your tests in Perl becomes a very easy task, even pleasant sometimes.
Coming from such experience, the testing environment for MySQL stored routines looks just, well, lacking.
Therefore my first thought when I was facing the task was "let's do it in Perl", and so I did it.
My first test unit (for the arrays module) was in Perl. It did not take long to write, but I found myself writing quite a lot to describe how the tests work in Perl.
And then I realized that, for a non Perl programmer, installing the Perl DBI and DBD::mysql modules is not an easy task, and it can be actually a frightening experience.
The same goes for another well established testing environment for C and C++. Who is not used to compiling and linking a C program will find even a simple task very demanding.
Thus I came to the logical conclusion that testing for stored routines has to be done with MySQL resources alone. It is not very advanced, and it pales compared to Perl, but it can be enhanced with some well thought structures and routines.
So, here goes.
Let's see how unit testing can be performed with MySQL and the general purpose stored routines.

Testing support in mysql-sr-lib

To ease the task of testing, there are a few routines. The most important one, and perhaps the only one that you may need in setting a test, is log_test, which expects four parameters

procedure log_test(
p_description varchar(200),
p_result text,
p_expected text,
p_outcome boolean)
The first one is a description of what we are testing, perhaps the routine name, with some additional short comment. The second one is the result we are inspecting. The third one is a description of what we are expecting. The last one is a boolean, and it should receive an expression, the evaluation of which will tell us whether the test succeeded or not. I know that it sounds obscure, but it will become clearer with the examples below.
Furthermore, we have some generalized tests for something that is quite common during our testing tasks. It is very likely that, testing stored routines, we want to test if such routine exists. And it is equally common to test if a given table exists. So, there are procedures to perform these simple repetitive tasks: check_routine and check_table do the job for you, and if you look at the source code for them, you will see just a clear example of how a test should be carried out. If you have several routines to check at once (as it is the case of this collection, for instance) you can benefit from check_routine_simple and check_for_routines_existence. The first one will let you set a user variable @database and use it to check your routines, so you will set the database parameter only once. The second one gets the routine names form a table (_routine_list) and calls check_routine for each of them. This latest case will raise a question: "how do I know how the table structure?" Don't worry. Before starting your testing work, call initialize_tests, and that, among other hidden tasks, will be done for you. To see different uses of these routines, inspect the source code of test_*.mysql in the "tests" directory. for now, it's enough to say that these three uses are equivalent:
 call initialize_tests();

connect;
call check_routine(database(), 'myp1', 'procedure');
call check_routine(database(), 'myf2', 'function');

connect;
set @database = database();
call check_routine_simple('myp1', 'procedure');
call check_routine_simple('myf2', 'function');

connect;
insert into _routine_list (routine_name, routine_type) values
('myp1', 'procedure'),
('myf2', 'function');
call check_for_routines_existence(database());
Last, there is show_test_results, which gives you an account of what you have tested, listing the result of each test, a summary of passed and failed ones, and a further list of failed test for your easy review. And now it's time for some hands on.

Short tutorial: Writing a testing unit

For this brief exercise, we want to create two stored routines. A procedure that creates a table with a character column and two numeric columns, and a function that will return the sum of the two numbers given the string.
Some facts: The name of the table is 'my_data'. The procedure, 'make_table,' won't have any parameters, the function, 'get_result,' will have one. These facts are enough to set our tests.
   -- test_tutorial.mysql
call initialize_tests();

call check_routine(database(), 'make_table', 'procedure');
call check_routine(database(), 'get_result', 'function');

connect;
call make_table();
call check_table(database(), 'my_data');

connect;
insert into my_data values('first', 1, 2), ('second',10,20),('third', 100, 200);
set @my_res = get_result('first');
call log_test('get_result first', @my_res, '@my_res = 3', @my_res = 3 );

set @my_res = get_result('second');
call log_test('get_result second', @my_res, '@my_res = 30', @my_res = 30 );

set @my_res = get_result('third');
call log_test('get_result third', @my_res, '@my_res = 300', @my_res = 300 );

set @my_res = get_result('fourth');
call log_test('get_result non-existent', @my_res, '@my_res is null',
@my_res is null );

call show_test_results();
This test is what we expect the actual routines to do. It can be part of the user's requirements to signify what you want to achieve. That done, it's time to write the actual code:
   -- test_t1.mysql

delimiter //

drop procedure if exists make_table //
create procedure make_table ()
deterministic
begin
create table my_data
(
name char(10) not null primary key,
val1 int,
val2 int
);
end//

drop function if exists get_result //
create function get_result ( p_name char(10))
returns int
reads sql data
begin
declare mysum int;
select val1*val2 into mysum -- intentional bug
from my_data
where name = p_name;
return mysum;
end//

delimiter ;
Now we have all the components. To perform the tests, we need to install the test utilities and the globals module.


$ mysql databasename < globals.mysql
$ mysql databasename < test_utilities.mysql


and then we can install the two routines:

   $ mysql databasename < test_t1.mysql


Finally, we run the test. Don't forget to add the '-t' option to mysql.


$ mysql -t databasename < test_tutorial.mysql
+---------+-------------------------+--------+-----------------------+------------+
| test no | description | result | expected | outcome |
+---------+-------------------------+--------+-----------------------+------------+
| 1 | make_table p exists | 1 | routine_exists = true | pass |
| 2 | get_result f exists | 1 | routine_exists = true | pass |
| 3 | table my_data exists | 1 | table_exists = true | pass |
| 4 | get_result first | 2 | @my_result = 3 | ** fail ** |
| 5 | get_result second | 200 | @my_result = 30 | ** fail ** |
| 6 | get_result third | 20000 | @my_result = 300 | ** fail ** |
| 7 | get_result non-existent | NULL | @my_result is null | pass |
+---------+-------------------------+--------+-----------------------+------------+
7 rows in set (0.00 sec)

+-------------+--------+--------+-------------------------+
| total tests | passed | failed | passed tests percentage |
+-------------+--------+--------+-------------------------+
| 7 | 4 | 3 | 57.14 |
+-------------+--------+--------+-------------------------+
1 row in set (0.00 sec)

+---------+-------------------+--------+------------------+------------+
| test no | description | result | expected | outcome |
+---------+-------------------+--------+------------------+------------+
| 4 | get_result first | 2 | @my_result = 3 | ** fail ** |
| 5 | get_result second | 200 | @my_result = 30 | ** fail ** |
| 6 | get_result third | 20000 | @my_result = 300 | ** fail ** |
+---------+-------------------+--------+------------------+------------+


The tests worked, but something surely went wrong, because three tests were unsuccessful. We expected the result to be 3, 30,300, and instead we got 2,200,20000. Time to inspect the source, and we find quickly that get_result has val1*val1 where it should have been val1+val2. Let's correct the mistake, and run the test again

   $ mysql databasename < test_t1.mysql
$ mysql -t databasename < test_tutorial.mysql
+---------+-------------------------+--------+-----------------------+---------+
| test no | description | result | expected | outcome |
+---------+-------------------------+--------+-----------------------+---------+
| 1 | make_table p exists | 1 | routine_exists = true | pass |
| 2 | get_result f exists | 1 | routine_exists = true | pass |
| 3 | table my_data exists | 1 | table_exists = true | pass |
| 4 | get_result first | 3 | @my_result = 3 | pass |
| 5 | get_result second | 30 | @my_result = 30 | pass |
| 6 | get_result third | 300 | @my_result = 300 | pass |
| 7 | get_result non-existent | NULL | @my_result is null | pass |
+---------+-------------------------+--------+-----------------------+---------+
7 rows in set (0.00 sec)

+-------------+--------+--------+-------------------------+
| total tests | passed | failed | passed tests percentage |
+-------------+--------+--------+-------------------------+
| 7 | 7 | 0 | 100.00 |
+-------------+--------+--------+-------------------------+


Now we got what we wanted. The routine works reasonably well. Notice that we also included a test for failure, when we asked for a value 'fourth', which does not exist in the table.

best practices



Enlightened by the testing example, we can draw some guidelines for developing stored routines. Actually, this advice can be successfully applied to any programming language, and I owe this great advice to Damian Conway (Ten essential development practices). Follow these simple rules, and you'll see the quality of your routines improving soon.


Write the interface

Before writing any code, write down the interface, i.e. how you would like to call the routine, with the necessary parameters. Possibly, write down the documentation for such a routine, describing how to invoke it and what it should do. Write down every detail of the implementation such as environment variables, dependencies, side effects, and so on.

Write a test unit for your routine

Before writing the code, there is yet another step. Write some tests for your routine. You already know the interface, so you can create the tests for your routine. For every feature described in the documentation, you should prepare at least one test. Make tests for success and for failure. For example, if your routine has some pre-requisites, test that it fails when such requisites aren't met.


As soon as you implement one feature of the routine, test it

When finally writing the code, run the tests on your routine, to see if that feature works as expected. If the test for that feature fails, don't go on developing further features until the failing one is fixed.

When your routine passes all tests, you're done

Your work is finished as soon as your routine passes all the tests. You'll be amazed at the low number of bugs that this approach leaves behind. Most of them are caught during the test.

2 comments:

rpbouman said...

Hi Giuseppe,


It's always a pleasure to read your articles, but this time I enjoyed it so much, I want to thank you explicitly!

Now, maybe it's because I am not familiar with perl and unit testing that I am so enthusiastic - the result is what counts, and the result here is that I now have a clear ideat about what unit testing is, and how it is supposed to fit in as a part of the development process.

I think it's also great you are releasing a module that helps mysql developers devise test. It may pale as compared to pearl - it shines very bright as compared to nothing at all when you stick to default functionality. So, this is definitely something I am going to try out.

Thanks again for the clear insights provided in your article

Roland

Giuseppe Maxia said...

Hi, mysquser.
I don't see the problem.
You are comparing a string with two spaces in it to a string with only one space. If they were not datetime columns, you would expect the comparison to fail.
Thus, if you were expecting a datetime in your variable, you must enforce it, e.g.:
set @some_date = '2006-01-18 14:10:03' + interval 0 day;
After this, the comparison works as you expected.