Showing posts with label crash. Show all posts
Showing posts with label crash. Show all posts

Thursday, February 14, 2008

Chasing elusive bugs

Since the announcement of Maria, I have been trying to use the same crashing test with Falcon, to establish conclusively if its crash recovery features are as reliable as I hope.

I tried the same test used for Maria, and after a crash, Falcon did not recover nicely. In fact, the server crashed on restart. I have not been able to repeat that behavior on Linux. Actually, I did, on a remote server that is not available to me at the moment, using an earlier revision, and I am waiting until I am back home to do some more testing with the latest tree.
However, I managed to repeat the problem with Mac OS X.
I run this script on a 6.0.5 server:

set storage_engine=falcon;
drop table if exists t1;
create table t1 (id int, b longblob) ;
insert into t1 values (1, repeat('a',1000000));
insert into t1 select * from t1;
insert into t1 select * from t1;
insert into t1 select * from t1;
insert into t1 select * from t1;
insert into t1 select * from t1;
insert into t1 select * from t1;
insert into t1 select * from t1;
insert into t1 select * from t1;
insert into t1 select * from t1;

When the script reaches the last line (has loaded 128 records and it's inserting the next batch), I stop the server with a vicious kill -9.
Upon restarting, the system looks under control, and I run the command that was previously interrupted:

insert into t1 select * from t1;

Here, the server crashes. I have repeated the above steps several times, and it crashes consistently on Mac OS X. On Linux, which I have on a virtual machine only (I am still on the road!) I can't repeat the crash with the current revision.
So, this is a bug, no questions about that, but not a serious one, unless I manage to prove otherwise.
Perhaps this bug is related to Bug#33517. We'll see.
If anyone has noted something similar, please let me know.

Tuesday, January 29, 2008

Introducing Maria, the new tough storage engine

Over the weekend, MySQL made available a new storage engine, codenamed Maria.
It is an improvement of MyISAM, the flagship engine, adding crash recovery to the already appreciated features of MyISAM. This release is a preview, based on MySQL 5.1 code base. The actual final implementation has not been announced yet, although it is safe to assume that it will happen in the 6.x series.
The source code is available as a bitkeeper tree.
There is some documentation, and more will follow.
What can we do with the new engine? The immediate answer is "use it as we use MyISAM, with crash recovery features".

Let's show an example, using MyISAM: (this demonstration was given live at the Linux Conference Australia during MySQL Mini-conf)

mysql [localhost] {msandbox} (test) > drop table if exists t1;
Query OK, 0 rows affected (1.14 sec)

mysql [localhost] {msandbox} (test) > create table t1 (id int, b longblob) engine=myisam;
Query OK, 0 rows affected (0.00 sec)

mysql [localhost] {msandbox} (test) > insert into t1 values (1, repeat('a',1000000));
Query OK, 1 row affected (0.02 sec)

mysql [localhost] {msandbox} (test) > insert into t1 select * from t1;
Query OK, 1 row affected (0.01 sec)
Records: 1 Duplicates: 0 Warnings: 0

mysql [localhost] {msandbox} (test) > insert into t1 select * from t1;
Query OK, 2 rows affected (0.10 sec)
Records: 2 Duplicates: 0 Warnings: 0

mysql [localhost] {msandbox} (test) > insert into t1 select * from t1;
Query OK, 4 rows affected (0.13 sec)
Records: 4 Duplicates: 0 Warnings: 0

mysql [localhost] {msandbox} (test) > insert into t1 select * from t1;
Query OK, 8 rows affected (0.26 sec)
Records: 8 Duplicates: 0 Warnings: 0

mysql [localhost] {msandbox} (test) > insert into t1 select * from t1;
Query OK, 16 rows affected (1.17 sec)
Records: 16 Duplicates: 0 Warnings: 0

mysql [localhost] {msandbox} (test) > insert into t1 select * from t1;
Query OK, 32 rows affected (2.13 sec)
Records: 32 Duplicates: 0 Warnings: 0

mysql [localhost] {msandbox} (test) > insert into t1 select * from t1;
Query OK, 64 rows affected (4.77 sec)
Records: 64 Duplicates: 0 Warnings: 0

mysql [localhost] {msandbox} (test) > insert into t1 select * from t1;

At this point, while the server is inserting data, from a different shell, we enter a vicious command:

$ kill -9 `pgrep mysqld`

which is as close as you can get to pulling your server plug. The result is disastrous.

ERROR 2013 (HY000): Lost connection to MySQL server during query
mysql [localhost] {msandbox} (test) > exit
Bye

Let's restart the server and see what has happened.

[sandbox@dsl093-167-171 maria_5_1_23]$ ./start
sandbox server started
[sandbox@dsl093-167-171 maria_5_1_23]$ ./use test
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A

Welcome to the MySQL monitor. Commands end with ; or \g.
Your MySQL connection id is 1
Server version: 5.1.23-maria-alpha Source distribution

Type 'help;' or '\h' for help. Type '\c' to clear the buffer.

mysql [localhost] {msandbox} (test) > check table t1;
+---------+-------+----------+-----------------------------------------------------------+
| Table | Op | Msg_type | Msg_text |
+---------+-------+----------+-----------------------------------------------------------+
| test.t1 | check | warning | 1 client is using or hasn't closed the table properly |
| test.t1 | check | warning | Size of datafile is: 256004096 Should be: 128002048 |
| test.t1 | check | error | Record-count is not ok; is 256 Should be: 128 |
| test.t1 | check | warning | Found 256 parts Should be: 128 parts |
| test.t1 | check | error | Corrupt |
+---------+-------+----------+-----------------------------------------------------------+
5 rows in set (0.62 sec)

As you can see, the table got corrupted, because we pulled the plug during an insertion.
Let's do the same operation with a Maria table.

mysql [localhost] {msandbox} (test) > drop table if exists t1;
Query OK, 0 rows affected (0.15 sec)

mysql [localhost] {msandbox} (test) > create table t1 (id int, b longblob) engine=maria;
Query OK, 0 rows affected (0.01 sec)

mysql [localhost] {msandbox} (test) > insert into t1 values (1, repeat('a',1000000));
Query OK, 1 row affected (0.03 sec)

mysql [localhost] {msandbox} (test) > insert into t1 select * from t1;
Query OK, 1 row affected (0.01 sec)
Records: 1 Duplicates: 0 Warnings: 0

mysql [localhost] {msandbox} (test) > insert into t1 select * from t1;
Query OK, 2 rows affected (0.09 sec)
Records: 2 Duplicates: 0 Warnings: 0

mysql [localhost] {msandbox} (test) > insert into t1 select * from t1;
Query OK, 4 rows affected (0.29 sec)
Records: 4 Duplicates: 0 Warnings: 0

mysql [localhost] {msandbox} (test) > insert into t1 select * from t1;
Query OK, 8 rows affected (0.82 sec)
Records: 8 Duplicates: 0 Warnings: 0

mysql [localhost] {msandbox} (test) > insert into t1 select * from t1;
Query OK, 16 rows affected (1.61 sec)
Records: 16 Duplicates: 0 Warnings: 0

mysql [localhost] {msandbox} (test) >
mysql [localhost] {msandbox} (test) > insert into t1 select * from t1;
Query OK, 32 rows affected (3.16 sec)
Records: 32 Duplicates: 0 Warnings: 0

mysql [localhost] {msandbox} (test) > insert into t1 select * from t1;
Query OK, 64 rows affected (6.54 sec)
Records: 64 Duplicates: 0 Warnings: 0

mysql [localhost] {msandbox} (test) > insert into t1 select * from t1;
#
# kill -9 from another terminal
#
ERROR 2013 (HY000): Lost connection to MySQL server during query
mysql [localhost] {msandbox} (test) > exit
Bye

Here we pull the plug as in the MyISAM case, and the server dies.
On restart, the situation looks much different:


[sandbox@dsl093-167-171 maria_5_1_23]$ ./start
sandbox server started
[sandbox@dsl093-167-171 maria_5_1_23]$ ./use test
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A

Welcome to the MySQL monitor. Commands end with ; or \g.
Your MySQL connection id is 1
Server version: 5.1.23-maria-alpha Source distribution

Type 'help;' or '\h' for help. Type '\c' to clear the buffer.

mysql [localhost] {msandbox} (test) > check table t1;
+---------+-------+----------+----------+
| Table | Op | Msg_type | Msg_text |
+---------+-------+----------+----------+
| test.t1 | check | status | OK |
+---------+-------+----------+----------+
1 row in set (0.73 sec)

So, Maria survived the crash, while MyISAM was corrupted. There is still much to do to make this engine ready for production, but it looks like an excellent start.