Tuesday, September 30, 2008

A cool idea - Revision engine


Yesterday I saw an announcement in the MySQL internals@ list, about a new storage engine being released. DDengine has created a revision engine, a sort of embedded proxy inside MySQL that keeps track of the changes you do to your data.
The idea is clever. You write to your table, update and delete without any concern, and the revision engine stores your changes in the background.
I wanted to tried the engine on my laptop (Mac OSX), but there was no binary available for this architecture. I contacted the authors and I received prompt assistance until I had the product installed. Unfortunately, it crashes immediately. Well, it's to be expected for a version 0.1.
I then tried on Linux, and also here I received very quick assistance from Peter Benjamin Volk, Project Head at DDengine.
It now works on Linux, although it's very basic at the moment.
The documentation says that there are two ways of using the revision engine. One with embedded changes, which involves a silent alteration of your table structure to include the revision info, and one with two tables, one of which is created behind the scenes.
For example:
create table t1 (
id int not null primary key,
c char(10)
) engine=revision comment="InnoDB:DOUBLE";
show tables;
+----------------+
| Tables_in_test |
+----------------+
| t1 |
| t1_revision |
+----------------+

desc t1;
+-------+----------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+-------+----------+------+-----+---------+-------+
| id | int(11) | NO | PRI | NULL | |
| c | char(10) | YES | | NULL | |
+-------+----------+------+-----+---------+-------+

desc t1_revision;
+--------------------+---------------------+------+-----+
| Field | Type | Null | Key |
+--------------------+---------------------+------+-----+
| id | int(11) | NO | PRI |
| c | char(10) | YES | |
| revision_id | int(10) unsigned | NO | PRI |
| revision_timestamp | timestamp | NO | |
| revision_deleted | tinyint(3) unsigned | NO | |
+--------------------+---------------------+------+-----+
The "InnoDB:DOUBLE" tells the engine to use a second table for revision info.
Operations on the table are transparent.
insert into t1 (id,c) values (1, 'aaa'), (2, 'bbb');
Query OK, 2 rows affected (0.01 sec)
Records: 2 Duplicates: 0 Warnings: 0

insert into t1 (id, c) values (3, 'ccc'), (4, 'ddd');
Query OK, 2 rows affected (0.01 sec)
Records: 2 Duplicates: 0 Warnings: 0

select * from t1;
+----+------+
| id | c |
+----+------+
| 1 | aaa |
| 2 | bbb |
| 3 | ccc |
| 4 | ddd |
+----+------+
4 rows in set (0.00 sec)

select * from t1_revision;
Empty set (0.00 sec)
So far, nothing unusual. Let's try some changes.
update t1 set c ='changed' where id = 3;
Query OK, 1 row affected (0.00 sec)
Rows matched: 1 Changed: 1 Warnings: 0

select * from t1;
+----+---------+
| id | c |
+----+---------+
| 1 | aaa |
| 2 | bbb |
| 3 | changed |
| 4 | ddd |
+----+---------+
4 rows in set (0.00 sec)

show variables like '%revision%';
+----------------------+---------+
| Variable_name | Value |
+----------------------+---------+
| revision_select_mode | current |
+----------------------+---------+
1 row in set (0.00 sec)

select * from t1_revision;
+----+------+-------------+---------------------+------------------+
| id | c | revision_id | revision_timestamp | revision_deleted |
+----+------+-------------+---------------------+------------------+
| 3 | ccc | 1 | 2008-09-30 05:45:49 | 0 |
+----+------+-------------+---------------------+------------------+
1 row in set (0.00 sec)

delete from t1 where id = 2;
Query OK, 1 row affected (0.01 sec)

select * from t1_revision;
+----+------+-------------+---------------------+------------------+
| id | c | revision_id | revision_timestamp | revision_deleted |
+----+------+-------------+---------------------+------------------+
| 3 | ccc | 1 | 2008-09-30 05:45:49 | 0 |
+----+------+-------------+---------------------+------------------+
1 row in set (0.00 sec)

set revision_select_mode = 'deleted';
Query OK, 0 rows affected (0.01 sec)

select * from t1_revision;
+----+------+-------------+---------------------+------------------+
| id | c | revision_id | revision_timestamp | revision_deleted |
+----+------+-------------+---------------------+------------------+
| 2 | bbb | 1 | 2008-09-30 05:47:14 | 1 |
+----+------+-------------+---------------------+------------------+
1 row in set (0.00 sec)

select * from t1;
+----+---------+
| id | c |
+----+---------+
| 1 | aaa |
| 3 | changed |
| 4 | ddd |
+----+---------+
3 rows in set (0.00 sec)
The session variable revision_select_mode changes the revision data that to be shown.
At the moment, there is no revert mechanism (or if there is, it's undocumented), but the idea is cool, and I think that this engine will become very useful.

1 comment:

kjordan said...

Much better than having to write a trigger for each table or having to put it into your application logic.