Sunday, January 30, 2011

A first look at delayed replication in MySQL 5.6

Delayed replication If you like fresh features, you should not miss this one. MySQL 5.6.2 includes, among other improvements, the implementation of Time delayed replication, a feature that lets you tell the slave not to apply changes from the master immediately, but to wait N seconds.
The feature is documented in WL#344. (There was a manual online as well together with the binaries for MySQL 5.6.0, but they were removed after a few days for a good reason. I am confident that both the manual and some binaries will eventually show up soon).
Since as of today there are no binaries for MySQL 5.6.x, you need to get the code and compile it yourself. Just get the code from https://code.launchpad.net/mysql-server and compile it using the instructions in building MySQL 5.5 with cmake.
To get a taste of this new feature, the quickest way is to set up replication using the binaries that you have built and MySQL Sandbox.
make_replication_sandbox  mysql-5.6.2-m5-osx10.6-.tar.gz 
# the file name may change, depending on the operating system you are using
Soon you will have one master and two slaves in $HOME/sandboxes/rsandbox_5_6_2.
What you have to do is connect to one of the slaves and enter these commands:

STOP SLAVE;
change master to master_delay=60;
START SLAVE;
Let's say that you did this to slave #2.
Now whatever you do in the master will be replicated immediately in slave #1, but it will executed with 60 seconds delay in slave #2.
To be clear, the IO_THREADs of both slaves keep getting data from the master as fast as they can, same as they did until version 5.5, but slave #2 will hold the SQL_THREAD for the defined amount of seconds.
This new state is visible in the output of the SHOW SLAVE STATUS command, which lists this information after you do something in the master like creating a table or inserting data:

               SQL_Delay: 60
     SQL_Remaining_Delay: 43
 Slave_SQL_Running_State: Waiting until MASTER_DELAY seconds after master executed event
The main purpose of delayed replication is to protect the server against human mistakes. If I accidentally drop a table, the statement is instantly replicated to all the slaves, but it is not executed to the delayed slaves.

$ ./m -e 'drop table test.t1 '
$ ./use_all 'show tables from test'
# master  
# server: 1: 
# server: 2: 
Tables_in_test
t1
The table is gone in the master, and it is gone in the regular slave, but it is still there in the delayed slave. And if I detect the problem before the delayed statement gets executed (a delay time longer than 60 seconds would be advisable in this case, 3600=1 hour, seems healthier), then I may be able to recover the data.

I notice en passant that there is much more than delayed replication going on in MySQL 5.6. For example, the information_schema tables related to InnoDB have increased from 7 to 18:

 show tables from information_schema like 'innodb%';
+----------------------------------------+
| Tables_in_information_schema (innodb%) |
+----------------------------------------+
| INNODB_CMPMEM                          |
| INNODB_TRX                             |
| INNODB_BUFFER_PAGE                     | *
| INNODB_LOCK_WAITS                      |
| INNODB_SYS_TABLESTATS                  | *
| INNODB_CMP                             |
| INNODB_SYS_COLUMNS                     | *
| INNODB_CMPMEM_RESET                    |
| INNODB_SYS_FOREIGN_COLS                | *
| INNODB_BUFFER_PAGE_LRU                 | * 
| INNODB_BUFFER_POOL_STATS               | *
| INNODB_CMP_RESET                       |
| INNODB_SYS_FOREIGN                     | *
| INNODB_METRICS                         | *
| INNODB_SYS_INDEXES                     | *
| INNODB_LOCKS                           |
| INNODB_SYS_FIELDS                      | *
| INNODB_SYS_TABLES                      | *
+----------------------------------------+
18 rows in set (0.00 sec)
# (*) new tables marked with a star
What they do and how to play with them will be matter for some more investigation.

6 comments:

Pavel said...

Nice feature :) BTW, do you happen to know if there are any plans for some helper scripts (thinking maatkit, openark, etc.) to take advantage of this?

Anonymous said...

Do the queries still get written to the binary log right away? Does the delay have any affect there?

Giuseppe Maxia said...

@pavel,
I am not aware of any tools that take advantage of this feature yet. You can safely assume you'd be the first: code away!

@id,
the queries are written immediately to the master's binary log, and they travel from the master to the slave immediately. In fact, delayed replication and semi-synchronous replication work together. (this is a good subject for another post, as combining the two features may hide some bugs).
It's the execution in the slave that is delayed.

Shlomi Noach said...

Just as a quick note: such functionality is provided today by maatkit's mk-slave-delay.

It's great that it is becoming a MySQL feature. I'm using this functionality a lot, and have met a lots of customers who wished they had been using it (a seconds after doing a DROP DATABASE or being hit by some SQL injection).

Giuseppe Maxia said...

@Shlomi,
Indeed. I forgot to mention that you can achieve delayed replication with external tools.
It's nice to see this feature in the server, though.

LenZ said...

Thanks for highlighting this feature, which was kindly contributed by Kay Roepke a while ago. They actually implemented this functionality to test their application's behavior in case of replication lag on the slaves. But of course this feature can serve other purposes as well, e.g. if you have multiple replication stages with different delay times...