Sunday, January 15, 2017

MySQL group replication: installation with Docker

Overview

MySQL Group Replication was released as GA with MySQL 5.7.17. It is essentially a plugin that, when enabled, allows users to set replication with this new way.

There has been some confusion about the stability and usability of this release. Until recently, MySQL Group Replication (MGR) was only available in the Labs, which traditionally denotes a preview or an use-at-your-own-risk feature. Several months ago we saw the release of Group Replication as a Docker image, which allowed users to deploy a peer-to-peer cluster (every node is a master.) However, about one month after such release, word came from Oracle discouraging this setup, and inviting users to use Group Replicator in Single Primary mode which is functionally equivalent to traditional replication, with just some synchronous component more. There hasn't been an update of MGR for Docker since.

BTW, some more confusion came from the use of "synchronous replication" to refer to Group Replication operations. In reality, what in many presentations was called synchronous replication is only a synchronous transfer of binary logs data. The replication itself, i.e. the operation that makes a node able to retrieve the data inserted in the master, is completed asynchronously. Therefore, if you looked at MGR as a way of using multiple masters without conflicts, this is not the solution.

What we have is a way of replicating from a node that is the Primary in the group, with some features designed to facilitate high availability solutions. And all eyes are on the next product, which is based on MGR, named MySQL InnoDB Cluster which is MGR + an hormone pumped MySQL Shell (released with the same version number 1.0.5 in two different packages,) and MySQL-Router.

MGR has several limitations, mostly related to multi-primary mode.

Another thing that users should know is that the performance of MGR is inferior to that of asynchronous replication, even in Single-Primary mode. As an example, loading the test employees database takes 92 seconds in MGR, against 49 seconds in asynchronous replication (same O.S., same MySQL version, same server setup.)

Installing MySQL Group Replication

One of the biggest issue with MGR has been the quality of its documentation, which for a while was just lack of documentation altogether. What we have now has a set of instructions that refers to installing group replication in three nodes on the same host. You know, sandboxes, although without the benefit of using a tool to simplify operations. It's just three servers on the same host, and you drive with stick shift.

What we'll see in this post is how to set group replication using three servers in Docker. The advantage of using this approach is that the servers look and feel like real ones. Since the instructions assume that you are only playing with sandboxes (an odd assumption for a GA product) we lack the instructions for a real world setup. The closest thing to a useful manual is the tutorial given by Frédéric Descamps and Kenny Gryp at PerconaLive Amsterdam in October. The instructions, however, are muddled up by the fact that they were using the still unreliable InnoDB Cluster instead of a bare bones Group Replicator. What follows is my own expansion of the sandboxed rules as applied to distinct servers.

The environment:

I am using Docker 1.12.6 on Linux, and the image for mysql/mysql-server:5.7.17. I deploy three containers, with a customized my.cnf containing the bare minimum options to run Group Replication. Here's the template for the configuration files:

$ cat my-template.cnf
[mysqld]
user=mysql
server_id=_SERVER_ID_
gtid_mode=ON
enforce_gtid_consistency=ON
master_info_repository=TABLE
relay_log_info_repository=TABLE
binlog_checksum=NONE
log_slave_updates=ON
log_bin=mysql-bin
relay-log=relay
binlog_format=ROW
log-error=mysqld.err

transaction_write_set_extraction=XXHASH64
loose-group_replication_group_name="aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee"
loose-group_replication_start_on_boot=off
loose-group_replication_local_address= "172.19.0._IP_END_:6606"
loose-group_replication_group_seeds= "172.19.0.2:6606,172.19.0.3:6606,172.19.0.4:6606"
loose-group_replication_ip_whitelist="172.19.0.2,172.19.0.3,172.19.0.4,127.0.0.1"
loose-group_replication_bootstrap_group= off

Here I take a shortcut. Recent versions of Docker assign a predictable IP address to new containers. To make sure I get the right IPs, I use a private network to deploy the containers. In a perfect world, I should use the container names for this purpose, but the manual lacks the instructions to set up the cluster progressively. For now, this method requires full knowledge about the IPs of the nodes, and I play along with what I have.

This is the deployment script:

#!/bin/bash
exists_net=$(docker network ls | grep -w group1 )
if [ -z "$exists_net" ]
then
    docker network create group1
fi
docker network ls

for node in 1 2 3
do
    export SERVERID=$node
    export IPEND=$(($SERVERID+1))
    perl -pe 's/_SERVER_ID_/$ENV{SERVERID}/;s/_IP_END_/$ENV{IPEND}/' my-template.cnf > my${node}.cnf
    datadir=ddnode${node}
    if [ ! -d $datadir ]
    then
        mkdir $datadir
    fi
    unset SERVERID
    docker run -d --name=node$node --net=group1 --hostname=node$node \
        -v $PWD/my${node}.cnf:/etc/my.cnf \
        -v $PWD/data:/data \
        -v $PWD/$datadir:/var/lib/mysql \
        -e MYSQL_ROOT_PASSWORD=secret \
        mysql/mysql-server:5.7.17

    ip=$(docker inspect --format '{{ .NetworkSettings.Networks.group1.IPAddress}}' node${node})
    echo "${node} $ip"
done

This script deploys three nodes, called node1, node2, and node3. For each one, the template is modified to use a different server ID. They use an external data directory created on the current directory (see Customizing MYSQL in Docker for more details on this technique.) Moreover, each node can access the folder /data, which contains this set of SQL commands:

reset master;
SET SQL_LOG_BIN=0;
CREATE USER rpl_user@'%';
GRANT REPLICATION SLAVE ON *.* TO rpl_user@'%' IDENTIFIED BY 'rpl_pass';
SET SQL_LOG_BIN=1;
CHANGE MASTER TO MASTER_USER='rpl_user', MASTER_PASSWORD='rpl_pass' FOR CHANNEL 'group_replication_recovery';
INSTALL PLUGIN group_replication SONAME 'group_replication.so';

Operations

After deploying the containers using the above script, I wait a few seconds to give time to the servers to be ready. I can peek at the error logs, which are in the directories ddnode1, ddnode2, and ddnode3, as defined in the installation command. Then I run the SQL code:

$ for N in 1 2 3; do docker exec -ti node$N bash -c 'mysql -psecret < /data/user.sql' ; done

At this stage, the plugin is installed in all three nodes. I can start the cluster:

$ docker exec -ti node1 mysql -psecret
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 5
Server version: 5.7.17-log MySQL Community Server (GPL)

Copyright (c) 2000, 2016, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql> SET GLOBAL group_replication_bootstrap_group=ON;
Query OK, 0 rows affected (0.00 sec)

mysql> START GROUP_REPLICATION;
Query OK, 0 rows affected (1.14 sec)

mysql>SET GLOBAL group_replication_bootstrap_group=OFF;
Query OK, 0 rows affected (0.00 sec)

mysql> select * from performance_schema.replication_group_members;
+---------------------------+--------------------------------------+-------------+-------------+--------------+
| CHANNEL_NAME              | MEMBER_ID                            | MEMBER_HOST | MEMBER_PORT | MEMBER_STATE |
+---------------------------+--------------------------------------+-------------+-------------+--------------+
| group_replication_applier | ecba1582-db68-11e6-a492-0242ac130002 | node1       |        3306 | ONLINE       |
+---------------------------+--------------------------------------+-------------+-------------+--------------+
1 row in set (0.00 sec)

The above operations have started the replication with the bootstrap, an operation that must be executed only once, and that defines the primary node.

After setting the replication, I can enter some data, and then see what happens in the other nodes:

mysql> create schema test;
Query OK, 1 row affected (0.01 sec)

mysql> use test
Database changed
mysql> create table t1 (id int not null primary key, msg varchar(20));
Query OK, 0 rows affected (0.06 sec)

mysql> insert into t1 values (1, 'hello from node1');
Query OK, 1 row affected (0.01 sec)

mysql> show binlog events;
+------------------+------+----------------+-----------+-------------+----------------------------------------------------------------------------+
| Log_name         | Pos  | Event_type     | Server_id | End_log_pos | Info                                                                       |
+------------------+------+----------------+-----------+-------------+----------------------------------------------------------------------------+
| mysql-bin.000001 |    4 | Format_desc    |         1 |         123 | Server ver: 5.7.17-log, Binlog ver: 4                                      |
| mysql-bin.000001 |  123 | Previous_gtids |         1 |         150 |                                                                            |
| mysql-bin.000001 |  150 | Gtid           |         1 |         211 | SET @@SESSION.GTID_NEXT= 'aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee:1'          |
| mysql-bin.000001 |  211 | Query          |         1 |         270 | BEGIN                                                                      |
| mysql-bin.000001 |  270 | View_change    |         1 |         369 | view_id=14845163185775300:1                                                |
| mysql-bin.000001 |  369 | Query          |         1 |         434 | COMMIT                                                                     |
| mysql-bin.000001 |  434 | Gtid           |         1 |         495 | SET @@SESSION.GTID_NEXT= 'aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee:2'          |
| mysql-bin.000001 |  495 | Query          |         1 |         554 | BEGIN                                                                      |
| mysql-bin.000001 |  554 | View_change    |         1 |         693 | view_id=14845163185775300:2                                                |
| mysql-bin.000001 |  693 | Query          |         1 |         758 | COMMIT                                                                     |
| mysql-bin.000001 |  758 | Gtid           |         1 |         819 | SET @@SESSION.GTID_NEXT= 'aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee:3'          |
| mysql-bin.000001 |  819 | Query          |         1 |         912 | create schema test                                                         |
| mysql-bin.000001 |  912 | Gtid           |         1 |         973 | SET @@SESSION.GTID_NEXT= 'aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee:4'          |
| mysql-bin.000001 |  973 | Query          |         1 |        1110 | use `test`; create table t1 (id int not null primary key, msg varchar(20)) |
| mysql-bin.000001 | 1110 | Gtid           |         1 |        1171 | SET @@SESSION.GTID_NEXT= 'aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee:5'          |
| mysql-bin.000001 | 1171 | Query          |         1 |        1244 | BEGIN                                                                      |
| mysql-bin.000001 | 1244 | Table_map      |         1 |        1288 | table_id: 219 (test.t1)                                                    |
| mysql-bin.000001 | 1288 | Write_rows     |         1 |        1341 | table_id: 219 flags: STMT_END_F                                            |
| mysql-bin.000001 | 1341 | Xid            |         1 |        1368 | COMMIT /* xid=144 */                                                       |
+------------------+------+----------------+-----------+-------------+----------------------------------------------------------------------------+
19 rows in set (0.00 sec)

The binary log events show that we are replicating using the ID of the group, instead of the ID of the single server.

In the other two nodes I run the operation a bit differently:

$ docker exec -ti node2 mysql -psecret
mysql> select * from performance_schema.global_variables where variable_name in ('read_only', 'super_read_only');
+-----------------+----------------+
| VARIABLE_NAME   | VARIABLE_VALUE |
+-----------------+----------------+
| read_only       | OFF            |
| super_read_only | OFF            |
+-----------------+----------------+
2 rows in set (0.01 sec)

mysql>  START GROUP_REPLICATION;
Query OK, 0 rows affected (5.62 sec)

mysql> select * from performance_schema.global_variables where variable_name in ('read_only', 'super_read_only');
+-----------------+----------------+
| VARIABLE_NAME   | VARIABLE_VALUE |
+-----------------+----------------+
| read_only       | ON             |
| super_read_only | ON             |
+-----------------+----------------+
2 rows in set (0.01 sec)

mysql> select * from performance_schema.replication_group_members;
+---------------------------+--------------------------------------+-------------+-------------+--------------+
| CHANNEL_NAME              | MEMBER_ID                            | MEMBER_HOST | MEMBER_PORT | MEMBER_STATE |
+---------------------------+--------------------------------------+-------------+-------------+--------------+
| group_replication_applier | ecba1582-db68-11e6-a492-0242ac130002 | node1       |        3306 | ONLINE       |
| group_replication_applier | ecf2eae5-db68-11e6-a492-0242ac130003 | node2       |        3306 | ONLINE       |
+---------------------------+--------------------------------------+-------------+-------------+--------------+
2 rows in set (0.01 sec)

Now the cluster has two nodes, and I've seen that the nodes are automatically defined as read-only. I can repeat the same operation in the third one.

$ docker exec -ti node2 mysql -psecret
mysql> START GROUP_REPLICATION;
Query OK, 0 rows affected (2.35 sec)

mysql> select * from performance_schema.replication_group_members;
+---------------------------+--------------------------------------+-------------+-------------+--------------+
| CHANNEL_NAME              | MEMBER_ID                            | MEMBER_HOST | MEMBER_PORT | MEMBER_STATE |
+---------------------------+--------------------------------------+-------------+-------------+--------------+
| group_replication_applier | ecba1582-db68-11e6-a492-0242ac130002 | node1       |        3306 | ONLINE       |
| group_replication_applier | ecf2eae5-db68-11e6-a492-0242ac130003 | node2       |        3306 | ONLINE       |
| group_replication_applier | ed259dfc-db68-11e6-a4a6-0242ac130004 | node3       |        3306 | ONLINE       |
+---------------------------+--------------------------------------+-------------+-------------+--------------+
3 rows in set (0.00 sec)

What about the data? It's been replicated:

mysql> show schemas;
+--------------------+
| Database           |
+--------------------+
| information_schema |
| mysql              |
| performance_schema |
| sys                |
| test               |
+--------------------+
5 rows in set (0.00 sec)

mysql> show tables from test;
+----------------+
| Tables_in_test |
+----------------+
| t1             |
+----------------+
1 row in set (0.01 sec)

Monitoring

In this flavor of replication there is no SHOW SLAVE STATUS. Everything I've got is in performanceschema tables and in mysql.slavemasterinfo and mysql.slaverelayloginfo, and sadly it is not a lot.

mysql> select * from replication_group_member_stats\G
*************************** 1. row ***************************
                      CHANNEL_NAME: group_replication_applier
                           VIEW_ID: 14845163185775300:3
                         MEMBER_ID: ecba1582-db68-11e6-a492-0242ac130002
       COUNT_TRANSACTIONS_IN_QUEUE: 0
        COUNT_TRANSACTIONS_CHECKED: 3
          COUNT_CONFLICTS_DETECTED: 0
COUNT_TRANSACTIONS_ROWS_VALIDATING: 0
TRANSACTIONS_COMMITTED_ALL_MEMBERS: aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee:1-6
    LAST_CONFLICT_FREE_TRANSACTION: aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee:5
1 row in set (0.00 sec)


mysql> select * from replication_connection_status\G
*************************** 1. row ***************************
             CHANNEL_NAME: group_replication_recovery
               GROUP_NAME:
              SOURCE_UUID:
                THREAD_ID: NULL
            SERVICE_STATE: OFF
COUNT_RECEIVED_HEARTBEATS: 0
 LAST_HEARTBEAT_TIMESTAMP: 0000-00-00 00:00:00
 RECEIVED_TRANSACTION_SET:
        LAST_ERROR_NUMBER: 0
       LAST_ERROR_MESSAGE:
     LAST_ERROR_TIMESTAMP: 0000-00-00 00:00:00
*************************** 2. row ***************************
             CHANNEL_NAME: group_replication_applier
               GROUP_NAME: aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee
              SOURCE_UUID: aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee
                THREAD_ID: NULL
            SERVICE_STATE: ON
COUNT_RECEIVED_HEARTBEATS: 0
 LAST_HEARTBEAT_TIMESTAMP: 0000-00-00 00:00:00
 RECEIVED_TRANSACTION_SET: aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee:1-6
        LAST_ERROR_NUMBER: 0
       LAST_ERROR_MESSAGE:
     LAST_ERROR_TIMESTAMP: 0000-00-00 00:00:00
2 rows in set (0.00 sec)

Compared to regular replication, we lose the ID of the node where the data was originated. Instead, we get the ID of the group replication (which we set in the configuration file.) This is useful for a smoother operation of replacing the primary node (a.k.a. the master) with another node, but we have lost some valuable information that could have been added to the output rather than simply being replaced. Another valuable piece of information that is missing is the transactions that were executed (we only see RECEIVED_TRANSACTION_SET.) As in regular replication, we can get this information with "SHOW MASTER STATUS" or "SELECT @@global.gtid_executed", but as mentioned in improving the design of MySQL replication there are several flaws in this paradigm. What we see in MGR is a reduction of replication monitoring data, while we would have expected some improvement, given the complexity of the operations for this new technology.

Summing up

MySQL Group Replication is an interesting technology. If we consider it in the framework of a component for high availability (which will be completed when the InnoDB Cluster is released) it might improve the workflow of many database users.

As it is now, however, it gives the feeling of being a rushed up piece of software that does not offer any noticeable advantage to users, especially considering that the documentation released with it is far below the standards of other MySQL products.

No comments: