Thursday, February 14, 2008

Chasing elusive bugs

Since the announcement of Maria, I have been trying to run the same crash test against Falcon, to establish conclusively whether its crash recovery features are as reliable as I hope.

I tried the same test used for Maria, and after a crash, Falcon did not recover nicely. In fact, the server crashed on restart. On Linux, I have reproduced that behavior only on a remote server running an earlier revision. That server is not available to me at the moment, so I am waiting until I am back home to do some more testing with the latest tree.
However, I managed to repeat the problem on Mac OS X.
I ran this script on a 6.0.5 server:

set storage_engine=falcon;
drop table if exists t1;
create table t1 (id int, b longblob) ;
insert into t1 values (1, repeat('a',1000000));
insert into t1 select * from t1;
insert into t1 select * from t1;
insert into t1 select * from t1;
insert into t1 select * from t1;
insert into t1 select * from t1;
insert into t1 select * from t1;
insert into t1 select * from t1;
insert into t1 select * from t1;
insert into t1 select * from t1;
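
Each insert ... select above doubles the number of rows in t1, so the script is mostly the same statement repeated. As a convenience, the same script could be generated from the shell. This is only a sketch: the helper name gen_script is made up for illustration, and the usage line assumes a mysql client on the PATH and a schema called test, neither of which comes from the test itself.

```shell
# Hypothetical helper: print the test script with N doubling statements.
# N defaults to 9, matching the script shown above.
gen_script() {
  n=${1:-9}
  echo "set storage_engine=falcon;"
  echo "drop table if exists t1;"
  echo "create table t1 (id int, b longblob);"
  echo "insert into t1 values (1, repeat('a',1000000));"
  i=0
  while [ "$i" -lt "$n" ]; do
    # Each of these statements doubles the row count of t1.
    echo "insert into t1 select * from t1;"
    i=$((i + 1))
  done
}

# Usage (assumed client and schema name): gen_script 9 | mysql test
```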

When the script reaches the last line (by which point it has loaded 256 records and is inserting the next batch), I stop the server with a vicious kill -9.
Upon restarting, the system looks under control, and I run the command that was previously interrupted:

insert into t1 select * from t1;

Here, the server crashes. I have repeated the above steps several times, and it crashes consistently on Mac OS X. On Linux, which I have on a virtual machine only (I am still on the road!), I can't repeat the crash with the current revision.
So, this is a bug, no question about that, but not a serious one, unless I manage to prove otherwise.
Perhaps this bug is related to Bug#33517. We'll see.
If anyone has noticed something similar, please let me know.
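
For anyone who wants to try this, the kill and retry steps described above could be scripted roughly as follows. This is a sketch under assumptions, not the exact procedure I used: the pid file path, the schema name test, and the helper names step, crash_server and retry_insert are all made up for illustration. By default it only prints what it would do; set DRY_RUN=0 to run the commands for real.

```shell
# Assumption: location of the server's pid file (adjust to your installation).
PID_FILE=${PID_FILE:-/tmp/mysqld.pid}

# Print a command instead of running it unless DRY_RUN=0 is set.
step() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "would run: $*"
  else
    "$@"
  fi
}

# Kill the server hard; call this while the last insert is still running.
crash_server() {
  step sh -c "kill -9 \$(cat '$PID_FILE')"
}

# After restarting the server by hand, repeat the interrupted statement.
# Assumes a mysql client on the PATH and a schema named test.
retry_insert() {
  step mysql test -e "insert into t1 select * from t1;"
}
```

The restart itself is left out on purpose, since how the server is started differs from one installation to another. Call crash_server during the last insert, restart the server, then call retry_insert.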
