Introduction: where we stand
In the latest releases of MySQL and MariaDB we have seen several replication improvements. One of the most exciting additions is the ability to enhance basic replication with multiple sources. Those who have used replication for a while will remember that one of the tenets of the “old” replication was that a slave could not have more than one master. This was The Law, and there was no escape ... until now. The only way to work around that prohibition was circular replication, also known as ring replication, where each node is a slave of the previous node and a master of the next one.
Figure 1: Circular replication
This topology can work, but it is quite fragile: if one node breaks, the replication flow breaks with it. Changes keep traveling along the chain until they reach the broken node, but they never get to the nodes beyond it.
Figure 2: Circular replication with a broken node.
Of course you can fix a broken circular replication deployment, but it is not easy, and it has enough tricky points to make it one of the tasks DBAs like least.
Despite this limitation, circular replication has sometimes been used in production, mostly because there was no alternative (well, there is Tungsten Replicator, now part of VMware Continuent, but not everybody was ready to embrace an external replicator), and because users were trying to solve the HQ-branch problem, also known as the fan-in topology.
What’s a fan-in topology
In a regular master-slave topology, we have one master and one or more slaves. This setup is useful for many scenarios: reducing the load on a database that backs a web server (load balancing), providing the basis for a rapid replacement of a failed master (disaster recovery), spreading the data to more users than one server could bear, and more.
Figure 3: master-slave topology.
One thing that regular replication cannot do is collect data from many input points. The best example is the headquarters of a company, where users need real-time data from the various branches.
Figure 4: fan-in topology.
There is a visual resemblance between regular replication and fan-in replication: one is the mirror image of the other. While in regular replication the data is produced in one node (the master) and conveyed to many consumers (the slaves), in a fan-in topology we have many producers (masters) and one consumer (slave). Things are not always this simple. We can have various degrees of fan-in, where many masters replicate to one or more slaves; we could actually have more slaves than masters, if we want. But the main criterion that defines this topology is having more than one master for each slave.
Figure 5: enhanced fan-in topology.
We will revisit the enhanced fan-in topology soon, as this is the basis for a more complex deployment.
In practice: how to set up a fan-in topology in MySQL 5.7
In a nutshell, MySQL 5.7 introduces syntax enhancements for the existing replication commands, which now accept a channel clause. For example, CHANGE MASTER TO can take an additional clause that identifies the source:
CHANGE MASTER TO MASTER_HOST='logistics.local' ... FOR CHANNEL 'logistics';
CHANGE MASTER TO MASTER_HOST='employees.local' ... FOR CHANNEL 'employees';
And that’s basically it. You define the same options that you would use to set up replication to a single master, but you add a channel definition for each master, and you run the command once for each source.
Similarly, you can start a single channel:
START SLAVE FOR CHANNEL 'logistics';
or all of them at once:
START SLAVE;
However, you can’t start multi-source replication out of the box: you need to enable table-based replication repositories first. You can do that dynamically, as in the instructions below, but it’s better to add these options to the configuration file.
SET GLOBAL master_info_repository = 'TABLE';
SET GLOBAL relay_log_info_repository = 'TABLE';
GTIDs are not required for multi-source replication, but the manual suggests that you are better off using them. Thus, the configuration file of every node involved in this topology should have at least the following:
[mysqld]
...
master-info-repository=table
relay-log-info-repository=table
gtid_mode=ON
enforce-gtid-consistency
server-id=XXX
log-bin=mysql-bin
relay-log=relay-log
relay-log-index=relay-log
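Before defining the channels, it may be worth checking that each node has actually picked up these settings. A quick sanity check (just a sketch, using the standard system variables) could be:
-- run on each node to confirm the prerequisites are in place
SELECT @@global.gtid_mode,
       @@global.enforce_gtid_consistency,
       @@global.master_info_repository,
       @@global.relay_log_info_repository;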
Let’s suppose we have four servers: host1, host2, host3, and host4. We want to set host1, host2, and host3 as masters, and host4 as the fan-in slave. Once we have set up the recommended options, we can connect to host4 and issue these commands:
CHANGE MASTER TO master_host='host1', master_port=3306, master_user='slave_user',
master_password='slavepass', master_auto_position=1
FOR CHANNEL 'NewYork';
CHANGE MASTER TO master_host='host2', master_port=3306, master_user='slave_user',
master_password='slavepass', master_auto_position=1
FOR CHANNEL 'London';
CHANGE MASTER TO master_host='host3', master_port=3306, master_user='slave_user',
master_password='slavepass', master_auto_position=1
FOR CHANNEL 'Paris';
START SLAVE for channel 'NewYork';
START SLAVE for channel 'London';
START SLAVE for channel 'Paris';
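Each channel can also be managed on its own. For example, to pause, inspect, or remove a single channel (a quick sketch, reusing the channel names defined above):
-- stop only the London channel, leaving the others running
STOP SLAVE FOR CHANNEL 'London';
-- inspect that channel alone
SHOW SLAVE STATUS FOR CHANNEL 'London'\G
-- remove its configuration entirely (the channel must be stopped first)
RESET SLAVE ALL FOR CHANNEL 'London';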
If you want to try multi-source topologies without having multiple servers at your disposal, you can use MySQL::Sandbox with several sample scripts on GitHub, as we will see a few paragraphs below.
In practice: a fan-in topology in MariaDB 10
The syntax enhancement is similar to the one in MySQL 5.7 but, sadly, it requires different wording. If you need to run multi-source topologies with both MySQL 5.7 and MariaDB 10, you will have to prepare two sets of commands.
In CHANGE MASTER TO there is no new keyword; instead, a master name can be added right after CHANGE MASTER:
CHANGE MASTER 'logistics' TO MASTER_HOST='logistics.local' ... ;
CHANGE MASTER 'employees' TO MASTER_HOST='employees.local' ... ;
and similarly for the other replication commands:
START SLAVE 'logistics';
STOP SLAVE 'logistics';
Compared to the MySQL 5.7 implementation, this one sounds wrong, or at least funny, as we first say “change master logistics” and then “start slave logistics.” Nitpicks. The good thing is that it works just as well.
You need, however, to be careful with MariaDB. Unlike MySQL 5.7, where the domain is defined implicitly by the server identifier, in MariaDB you must explicitly define a domain ID for each data source. In our case, we would run this command in each master:
set global gtid_domain_id=xxx;
where 'xxx' is a unique integer. As mentioned earlier about the crash-safe table settings, this option should also go into the configuration file of each master.
Intermezzo: spoiling multi-source
For all its simplicity, a multi-source deployment can get out of hand surprisingly quickly. You need to be aware of one important detail: in both MariaDB 10.x and MySQL 5.7 there is a hidden channel named "" (the empty string).
This means that you can mix up, inadvertently or willingly, "old" replication and multi-source replication in the same slave. For example, in a regular master/slave topology in MySQL 5.7, you can go to a slave and run these commands:
CHANGE MASTER TO master_host='host1', master_port=3306, master_user='slave_user',
master_password='slavepass', master_auto_position=1
FOR CHANNEL 'NewYork';
START SLAVE for channel 'NewYork';
You can't do the opposite, though: in a slave that was defined with the "for channel" clause, you can't run an old-fashioned CHANGE MASTER TO:
CHANGE MASTER TO MASTER_HOST='127.0.0.1', MASTER_PORT=5708, MASTER_USER='rsandbox', MASTER_PASSWORD='rsandbox', MASTER_AUTO_POSITION=1;
ERROR 3079 (HY000): Multiple channels exist on the slave. Please provide channel name as an argument.
However, you can still do what the system wants to prevent by using the empty-string channel explicitly:
CHANGE MASTER TO MASTER_HOST='127.0.0.1', MASTER_PORT=5708, MASTER_USER='rsandbox', MASTER_PASSWORD='rsandbox', MASTER_AUTO_POSITION=1 for channel '';
Query OK, 0 rows affected, 2 warnings (0.08 sec)
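If you suspect that an unnamed channel has sneaked into a slave, one way to find out (a sketch, assuming the table-based repositories enabled earlier) is to list the configured channels:
-- list the channels configured on this slave, including any unnamed '' channel
SELECT Channel_name, Host, Port, User_name FROM mysql.slave_master_info;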
In MariaDB, you can do both things, i.e. add an empty-name channel or run a plain old-fashioned CHANGE MASTER TO command, and both will be accepted:
CHANGE MASTER TO MASTER_HOST='127.0.0.1', MASTER_PORT=10020, MASTER_USER='rsandbox', MASTER_PASSWORD='rsandbox', MASTER_USE_GTID=current_pos;
# or
CHANGE MASTER '' TO MASTER_HOST='127.0.0.1', MASTER_PORT=10020, MASTER_USER='rsandbox', MASTER_PASSWORD='rsandbox', MASTER_USE_GTID=current_pos;
Even if you can do it without errors (MySQL 5.7 will eagerly throw a few warnings, but that's it), don't do it. As you will see in the next section, there are enough reasons for head scratching in multi-source monitoring without introducing difficulties of your own.
In the trenches with multi-source
To see what kind of monitoring we can get with multi-source, we will use the sample scripts from GitHub in combination with MySQL Sandbox.
To use these examples, you will need the following (a rough preparation sketch follows the list):
- MySQL Sandbox 3.0.66 installed on your server;
- A binary tarball of MariaDB 10.0.20, expanded into a directory named $HOME/opt/mysql/ma10.0.20
- A binary tarball of MySQL 5.7.8, expanded into a directory named $HOME/opt/mysql/5.7.8
- The above-mentioned MySQL replication scripts from GitHub.
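Spelled out, the preparation amounts to a few shell commands like the ones below. This is only a sketch: the tarball file names are illustrative, and the repository URL is a placeholder for the actual GitHub location of the scripts.
# install MySQL Sandbox from CPAN
$ cpan MySQL::Sandbox
# expand the tarballs under the names used in this article
$ mkdir -p $HOME/opt/mysql
$ tar -xzf mariadb-10.0.20-linux-x86_64.tar.gz -C $HOME/opt/mysql
$ mv $HOME/opt/mysql/mariadb-10.0.20-* $HOME/opt/mysql/ma10.0.20
$ tar -xzf mysql-5.7.8-rc-linux-x86_64.tar.gz -C $HOME/opt/mysql
$ mv $HOME/opt/mysql/mysql-5.7.8-* $HOME/opt/mysql/5.7.8
# fetch the replication sample scripts
$ git clone https://github.com/<account>/mysql-replication-samples.git $HOME/git/mysql-replication-samples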
Fan-in with MySQL 5.7
When all the components are in place, we can start the script multi_source.sh, which has a simple syntax:
$ ./multi_source.sh
VERSION, FLAVOR, and TOPOLOGY required
Where VERSION is an identifier like 5.7.7 or ma10.0.20
FLAVOR is either mysql or mariadb
TOPOLOGY is either FAN-IN or ALL-MASTERS
We choose to install the MySQL 5.7 FAN-IN topology first:
$ ./multi_source.sh 5.7.8 mysql FAN-IN
installing node 1
installing node 2
installing node 3
installing node 4
group directory installed in $HOME/sandboxes/multi_msb_5_7_8
# server: 1:
# server: 2:
# server: 3:
# server: 4:
# option 'master-info-repository=table' added to node1 configuration file
# option 'relay-log-info-repository=table' added to node1 configuration file
# option 'gtid_mode=ON' added to node1 configuration file
# option 'enforce-gtid-consistency' added to node1 configuration file
# option 'master-info-repository=table' added to node2 configuration file
# option 'relay-log-info-repository=table' added to node2 configuration file
# option 'gtid_mode=ON' added to node2 configuration file
# option 'enforce-gtid-consistency' added to node2 configuration file
# option 'master-info-repository=table' added to node3 configuration file
# option 'relay-log-info-repository=table' added to node3 configuration file
# option 'gtid_mode=ON' added to node3 configuration file
# option 'enforce-gtid-consistency' added to node3 configuration file
# option 'master-info-repository=table' added to node4 configuration file
# option 'relay-log-info-repository=table' added to node4 configuration file
# option 'gtid_mode=ON' added to node4 configuration file
# option 'enforce-gtid-consistency' added to node4 configuration file
# executing "stop" on $HOME/sandboxes/multi_msb_5_7_8
executing "stop" on node 1
executing "stop" on node 2
executing "stop" on node 3
executing "stop" on node 4
# executing "start" on $HOME/sandboxes/multi_msb_5_7_8
executing "start" on node 1
. sandbox server started
executing "start" on node 2
. sandbox server started
executing "start" on node 3
. sandbox server started
executing "start" on node 4
. sandbox server started
# Setting topology FAN-IN
--------------
CHANGE MASTER TO MASTER_HOST='127.0.0.1', MASTER_PORT=8379, MASTER_USER='rsandbox', MASTER_PASSWORD='rsandbox', MASTER_AUTO_POSITION=1 for channel 'node1'
START SLAVE for channel 'node1'
--------------
CHANGE MASTER TO MASTER_HOST='127.0.0.1', MASTER_PORT=8380, MASTER_USER='rsandbox', MASTER_PASSWORD='rsandbox', MASTER_AUTO_POSITION=1 for channel 'node2'
START SLAVE for channel 'node2'
--------------
CHANGE MASTER TO MASTER_HOST='127.0.0.1', MASTER_PORT=8381, MASTER_USER='rsandbox', MASTER_PASSWORD='rsandbox', MASTER_AUTO_POSITION=1 for channel 'node3'
START SLAVE for channel 'node3'
--------------
$HOME/git/mysql-replication-samples/test_multi_source_replication.sh -> $HOME/sandboxes/multi_msb_5_7_8/test_multi_source_replication.sh
The script launches MySQL Sandbox, which installs four nodes, and then restarts them with GTID and the other prerequisites for multi-source. Next, it configures node #4 to be the slave of the first three nodes. Finally, it copies a test script into the sandbox directory, so that we can check whether the replication flow (from each master to the fan-in slave) works as expected.
Let's start with that:
$ cd $HOME/sandboxes/multi_msb_5_7_8
$ ./test_multi_source_replication.sh
# Tables in server 101
test_node1
# Tables in server 102
test_node2
# Tables in server 103
test_node3
# Tables in fan-in slave
test_node1
test_node2
test_node3
+-----------+--------+--------------+
| server_id | @@port | node |
+-----------+--------+--------------+
| 104 | 8382 | fan-in slave |
+-----------+--------+--------------+
+----+----------+--------+-------+---------------------+
| id | serverid | dbport | node | ts |
+----+----------+--------+-------+---------------------+
| 1 | 101 | 8379 | node1 | 2015-08-11 18:17:44 |
+----+----------+--------+-------+---------------------+
+----+----------+--------+-------+---------------------+
| id | serverid | dbport | node | ts |
+----+----------+--------+-------+---------------------+
| 1 | 102 | 8380 | node2 | 2015-08-11 18:17:44 |
+----+----------+--------+-------+---------------------+
+----+----------+--------+-------+---------------------+
| id | serverid | dbport | node | ts |
+----+----------+--------+-------+---------------------+
| 1 | 103 | 8381 | node3 | 2015-08-11 18:17:44 |
+----+----------+--------+-------+---------------------+
The test script creates a table in each of the masters and inserts one row in each of them. After a few seconds, it retrieves the data from the three tables in the fan-in slave. The output shows that replication is working. (This is the "sentinel method" described in MySQL Replication Monitoring 101.)
Now we have something to look at in our system. But to make things more interesting, let's introduce more data in one of the masters (we'll load the Sakila database) and create a few transactions in the slave itself. Then we'll start by looking at GTIDs:
$ for N in 1 2 3 4 ; do ./n$N -e 'select @@server_id; select @@global.gtid_executed\G'; done
+-------------+
| @@server_id |
+-------------+
| 101 |
+-------------+
*************************** 1. row ***************************
@@global.gtid_executed: 1ade9710-4042-11e5-9b1c-ee8cf1128871:1-3
+-------------+
| @@server_id |
+-------------+
| 102 |
+-------------+
*************************** 1. row ***************************
@@global.gtid_executed: 1f5bc560-4042-11e5-90a6-d011e342a05a:1-197
+-------------+
| @@server_id |
+-------------+
| 103 |
+-------------+
*************************** 1. row ***************************
@@global.gtid_executed: 2329e9ce-4042-11e5-ae8d-96290ab7793a:1-3
+-------------+
| @@server_id |
+-------------+
| 104 |
+-------------+
*************************** 1. row ***************************
@@global.gtid_executed: 1ade9710-4042-11e5-9b1c-ee8cf1128871:1-3,
1f5bc560-4042-11e5-90a6-d011e342a05a:1-197,
2329e9ce-4042-11e5-ae8d-96290ab7793a:1-3,
27089360-4042-11e5-8b8b-51136eee5e0b:1-2
Looking at the result in node #4 (server_id 104), we see four series of GTIDs: one from each master, and one from the slave itself. Telling which is which can be tricky, as I warned in my previous article, where we examined GTIDs in general.
Anyway, we can get more details by looking at the monitoring facilities:
node4 [localhost] {msandbox} ((none)) > SHOW SLAVE STATUS\G
*************************** 1. row ***************************
Slave_IO_State: Waiting for master to send event
Master_Host: 127.0.0.1
Master_User: rsandbox
Master_Port: 8379
Connect_Retry: 60
Master_Log_File: mysql-bin.000002
Read_Master_Log_Pos: 891
Relay_Log_File: mysql-relay-node1.000002
Relay_Log_Pos: 1104
Relay_Master_Log_File: mysql-bin.000002
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
Replicate_Do_DB:
Replicate_Ignore_DB:
Replicate_Do_Table:
Replicate_Ignore_Table:
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
Last_Errno: 0
Last_Error:
Skip_Counter: 0
Exec_Master_Log_Pos: 891
Relay_Log_Space: 1313
Until_Condition: None
Until_Log_File:
Until_Log_Pos: 0
Master_SSL_Allowed: No
Master_SSL_CA_File:
Master_SSL_CA_Path:
Master_SSL_Cert:
Master_SSL_Cipher:
Master_SSL_Key:
Seconds_Behind_Master: 0
Master_SSL_Verify_Server_Cert: No
Last_IO_Errno: 0
Last_IO_Error:
Last_SQL_Errno: 0
Last_SQL_Error:
Replicate_Ignore_Server_Ids:
Master_Server_Id: 101
Master_UUID: 1ade9710-4042-11e5-9b1c-ee8cf1128871
Master_Info_File: mysql.slave_master_info
SQL_Delay: 0
SQL_Remaining_Delay: NULL
Slave_SQL_Running_State: Slave has read all relay log; waiting for more updates
Master_Retry_Count: 86400
Master_Bind:
Last_IO_Error_Timestamp:
Last_SQL_Error_Timestamp:
Master_SSL_Crl:
Master_SSL_Crlpath:
Retrieved_Gtid_Set: 1ade9710-4042-11e5-9b1c-ee8cf1128871:1-3
Executed_Gtid_Set: 1ade9710-4042-11e5-9b1c-ee8cf1128871:1-3,
1f5bc560-4042-11e5-90a6-d011e342a05a:1-197,
2329e9ce-4042-11e5-ae8d-96290ab7793a:1-3,
27089360-4042-11e5-8b8b-51136eee5e0b:1-2
Auto_Position: 1
Replicate_Rewrite_DB:
Channel_Name: node1
*************************** 2. row ***************************
Slave_IO_State: Waiting for master to send event
Master_Host: 127.0.0.1
Master_User: rsandbox
Master_Port: 8380
Connect_Retry: 60
Master_Log_File: mysql-bin.000002
Read_Master_Log_Pos: 1369702
Relay_Log_File: mysql-relay-node2.000002
Relay_Log_Pos: 1369915
Relay_Master_Log_File: mysql-bin.000002
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
Replicate_Do_DB:
Replicate_Ignore_DB:
Replicate_Do_Table:
Replicate_Ignore_Table:
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
Last_Errno: 0
Last_Error:
Skip_Counter: 0
Exec_Master_Log_Pos: 1369702
Relay_Log_Space: 1370124
Until_Condition: None
Until_Log_File:
Until_Log_Pos: 0
Master_SSL_Allowed: No
Master_SSL_CA_File:
Master_SSL_CA_Path:
Master_SSL_Cert:
Master_SSL_Cipher:
Master_SSL_Key:
Seconds_Behind_Master: 0
Master_SSL_Verify_Server_Cert: No
Last_IO_Errno: 0
Last_IO_Error:
Last_SQL_Errno: 0
Last_SQL_Error:
Replicate_Ignore_Server_Ids:
Master_Server_Id: 102
Master_UUID: 1f5bc560-4042-11e5-90a6-d011e342a05a
Master_Info_File: mysql.slave_master_info
SQL_Delay: 0
SQL_Remaining_Delay: NULL
Slave_SQL_Running_State: Slave has read all relay log; waiting for more updates
Master_Retry_Count: 86400
Master_Bind:
Last_IO_Error_Timestamp:
Last_SQL_Error_Timestamp:
Master_SSL_Crl:
Master_SSL_Crlpath:
Retrieved_Gtid_Set: 1f5bc560-4042-11e5-90a6-d011e342a05a:1-197
Executed_Gtid_Set: 1ade9710-4042-11e5-9b1c-ee8cf1128871:1-3,
1f5bc560-4042-11e5-90a6-d011e342a05a:1-197,
2329e9ce-4042-11e5-ae8d-96290ab7793a:1-3,
27089360-4042-11e5-8b8b-51136eee5e0b:1-2
Auto_Position: 1
Replicate_Rewrite_DB:
Channel_Name: node2
*************************** 3. row ***************************
Slave_IO_State: Waiting for master to send event
Master_Host: 127.0.0.1
Master_User: rsandbox
Master_Port: 8381
Connect_Retry: 60
Master_Log_File: mysql-bin.000002
Read_Master_Log_Pos: 891
Relay_Log_File: mysql-relay-node3.000002
Relay_Log_Pos: 1104
Relay_Master_Log_File: mysql-bin.000002
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
Replicate_Do_DB:
Replicate_Ignore_DB:
Replicate_Do_Table:
Replicate_Ignore_Table:
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
Last_Errno: 0
Last_Error:
Skip_Counter: 0
Exec_Master_Log_Pos: 891
Relay_Log_Space: 1313
Until_Condition: None
Until_Log_File:
Until_Log_Pos: 0
Master_SSL_Allowed: No
Master_SSL_CA_File:
Master_SSL_CA_Path:
Master_SSL_Cert:
Master_SSL_Cipher:
Master_SSL_Key:
Seconds_Behind_Master: 0
Master_SSL_Verify_Server_Cert: No
Last_IO_Errno: 0
Last_IO_Error:
Last_SQL_Errno: 0
Last_SQL_Error:
Replicate_Ignore_Server_Ids:
Master_Server_Id: 103
Master_UUID: 2329e9ce-4042-11e5-ae8d-96290ab7793a
Master_Info_File: mysql.slave_master_info
SQL_Delay: 0
SQL_Remaining_Delay: NULL
Slave_SQL_Running_State: Slave has read all relay log; waiting for more updates
Master_Retry_Count: 86400
Master_Bind:
Last_IO_Error_Timestamp:
Last_SQL_Error_Timestamp:
Master_SSL_Crl:
Master_SSL_Crlpath:
Retrieved_Gtid_Set: 2329e9ce-4042-11e5-ae8d-96290ab7793a:1-3
Executed_Gtid_Set: 1ade9710-4042-11e5-9b1c-ee8cf1128871:1-3,
1f5bc560-4042-11e5-90a6-d011e342a05a:1-197,
2329e9ce-4042-11e5-ae8d-96290ab7793a:1-3,
27089360-4042-11e5-8b8b-51136eee5e0b:1-2
Auto_Position: 1
Replicate_Rewrite_DB:
Channel_Name: node3
3 rows in set (0.00 sec)
The first change is that SHOW SLAVE STATUS displays three rows instead of one: we get one row for each master. (Note: we can ask for the status of a single channel with SHOW SLAVE STATUS FOR CHANNEL 'channel_name'\G.)
After that, things are not as clean-cut as what we have seen with a single master and many slaves. There we had one Retrieved_Gtid_Set and one Executed_Gtid_Set. Here, instead, we have the same group of executed sets in each row. I would have expected to see the corresponding received/executed set for each channel; instead, they are mixed up.
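One way to tell the series apart is to collect the server UUIDs from the masters and match them against the GTID sets above. In the sandbox environment, a quick check could look like this (a sketch, using the ./nN shortcuts that MySQL Sandbox creates):
# map each master's UUID to its server_id
$ for N in 1 2 3 ; do ./n$N -e 'select @@server_id, @@server_uuid'; done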
Moving on, we'll now look at the relay-log info table in the mysql database:
node4 [localhost] {msandbox} (mysql) > select * from slave_relay_log_info\G
*************************** 1. row ***************************
Number_of_lines: 7
Relay_log_name: ./mysql-relay-node1.000002
Relay_log_pos: 1104
Master_log_name: mysql-bin.000002
Master_log_pos: 891
Sql_delay: 0
Number_of_workers: 0
Id: 1
Channel_name: node1
*************************** 2. row ***************************
Number_of_lines: 7
Relay_log_name: ./mysql-relay-node2.000002
Relay_log_pos: 1362290
Master_log_name: mysql-bin.000002
Master_log_pos: 1362077
Sql_delay: 0
Number_of_workers: 0
Id: 1
Channel_name: node2
*************************** 3. row ***************************
Number_of_lines: 7
Relay_log_name: ./mysql-relay-node3.000002
Relay_log_pos: 1104
Master_log_name: mysql-bin.000002
Master_log_pos: 891
Sql_delay: 0
Number_of_workers: 0
Id: 1
Channel_name: node3
3 rows in set (0.00 sec)
Here we have something more telling: each row is identified by the node name, and the relay-log name, rather than being a plain mysql-relay.000002, includes the channel in its name: mysql-relay-nodeX.000002.
Unfortunately, this table has no information about GTIDs. The opposite happens in performance_schema:
node4 [localhost] {msandbox} (performance_schema) > select * from replication_connection_status\G
*************************** 1. row ***************************
CHANNEL_NAME: node1
GROUP_NAME:
SOURCE_UUID: 1ade9710-4042-11e5-9b1c-ee8cf1128871
THREAD_ID: 30
SERVICE_STATE: ON
COUNT_RECEIVED_HEARTBEATS: 97
LAST_HEARTBEAT_TIMESTAMP: 2015-08-11 18:50:14
RECEIVED_TRANSACTION_SET: 1ade9710-4042-11e5-9b1c-ee8cf1128871:1-3
LAST_ERROR_NUMBER: 0
LAST_ERROR_MESSAGE:
LAST_ERROR_TIMESTAMP: 0000-00-00 00:00:00
*************************** 2. row ***************************
CHANNEL_NAME: node2
GROUP_NAME:
SOURCE_UUID: 1f5bc560-4042-11e5-90a6-d011e342a05a
THREAD_ID: 34
SERVICE_STATE: ON
COUNT_RECEIVED_HEARTBEATS: 96
LAST_HEARTBEAT_TIMESTAMP: 2015-08-11 18:50:11
RECEIVED_TRANSACTION_SET: 1f5bc560-4042-11e5-90a6-d011e342a05a:1-197
LAST_ERROR_NUMBER: 0
LAST_ERROR_MESSAGE:
LAST_ERROR_TIMESTAMP: 0000-00-00 00:00:00
*************************** 3. row ***************************
CHANNEL_NAME: node3
GROUP_NAME:
SOURCE_UUID: 2329e9ce-4042-11e5-ae8d-96290ab7793a
THREAD_ID: 38
SERVICE_STATE: ON
COUNT_RECEIVED_HEARTBEATS: 97
LAST_HEARTBEAT_TIMESTAMP: 2015-08-11 18:50:15
RECEIVED_TRANSACTION_SET: 2329e9ce-4042-11e5-ae8d-96290ab7793a:1-3
LAST_ERROR_NUMBER: 0
LAST_ERROR_MESSAGE:
LAST_ERROR_TIMESTAMP: 0000-00-00 00:00:00
3 rows in set (0.00 sec)
Here we get the received transaction set, but not the executed one. As we have noted for single-master replication, the most complete source of monitoring information is still SHOW SLAVE STATUS.
Fan-in with MariaDB 10
The installation is similar to what we have seen for MySQL 5.7.
$ ./multi_source.sh ma10.0.20 mariadb FAN-IN
installing node 1
installing node 2
installing node 3
installing node 4
group directory installed in $HOME/sandboxes/multi_msb_ma10_0_20
# server: 1:
# server: 2:
# server: 3:
# server: 4:
# server: 1:
# server: 2:
# server: 3:
# server: 4:
# Setting topology FAN-IN
--------------
CHANGE MASTER 'node1' TO MASTER_HOST='127.0.0.1', MASTER_PORT=19021, MASTER_USER='rsandbox', MASTER_PASSWORD='rsandbox', MASTER_USE_GTID=current_pos
START SLAVE 'node1'
--------------
CHANGE MASTER 'node2' TO MASTER_HOST='127.0.0.1', MASTER_PORT=19022, MASTER_USER='rsandbox', MASTER_PASSWORD='rsandbox', MASTER_USE_GTID=current_pos
START SLAVE 'node2'
--------------
CHANGE MASTER 'node3' TO MASTER_HOST='127.0.0.1', MASTER_PORT=19023, MASTER_USER='rsandbox', MASTER_PASSWORD='rsandbox', MASTER_USE_GTID=current_pos
START SLAVE 'node3'
--------------
The main differences in the installation are that we don't have to enable GTIDs or crash-safe tables, as they are on by default, and that each node has a different domain ID (set by the installation script). The domain ID is arbitrary: it can be any number, provided that each master has a different one. Of course, it's better to choose numbers that can easily be attributed to each server. In this case, it's the server ID multiplied by 10.
$ for N in 1 2 3 4; do ./n$N -e 'select @@server_id, @@gtid_domain_id' ; done
+-------------+------------------+
| @@server_id | @@gtid_domain_id |
+-------------+------------------+
| 101 | 1010 |
+-------------+------------------+
+-------------+------------------+
| @@server_id | @@gtid_domain_id |
+-------------+------------------+
| 102 | 1020 |
+-------------+------------------+
+-------------+------------------+
| @@server_id | @@gtid_domain_id |
+-------------+------------------+
| 103 | 1030 |
+-------------+------------------+
+-------------+------------------+
| @@server_id | @@gtid_domain_id |
+-------------+------------------+
| 104 | 1040 |
+-------------+------------------+
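If you set up the masters by hand rather than through the script, the domain ID belongs in each master's configuration file, along these lines (a sketch mirroring the values shown above for node 1):
[mysqld]
# values for node 1; use 102/1020, 103/1030, and so on for the other nodes
server-id=101
gtid_domain_id=1010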
Here, as well, we run the test script, then load the Sakila database in one node and insert a few transactions in the slave node. Now we can look at the GTID situation:
$ for N in 1 2 3 4; do ./n$N -e 'select @@server_id; select @@global.gtid_current_pos\G' ; done
+-------------+
| @@server_id |
+-------------+
| 101 |
+-------------+
*************************** 1. row ***************************
@@global.gtid_current_pos: 1010-101-3
+-------------+
| @@server_id |
+-------------+
| 102 |
+-------------+
*************************** 1. row ***************************
@@global.gtid_current_pos: 1020-102-119
+-------------+
| @@server_id |
+-------------+
| 103 |
+-------------+
*************************** 1. row ***************************
@@global.gtid_current_pos: 1030-103-3
+-------------+
| @@server_id |
+-------------+
| 104 |
+-------------+
*************************** 1. row ***************************
@@global.gtid_current_pos: 1010-101-3,1030-103-3,1020-102-119,1040-104-2
Then we look at SHOW SLAVE STATUS, and we find something odd:
node4 [localhost] {msandbox} ((none)) > SHOW SLAVE STATUS\G
Empty set (0.00 sec)
The reason for this oddity is that SHOW SLAVE STATUS only works with the default, nameless channel. For named ones, we need to use a new command:
node4 [localhost] {msandbox} ((none)) > show ALL SLAVES status\G
*************************** 1. row ***************************
Connection_name: node1
Slave_SQL_State: Slave has read all relay log; waiting for the slave I/O thread to update it
Slave_IO_State: Waiting for master to send event
Master_Host: 127.0.0.1
Master_User: rsandbox
Master_Port: 19021
Connect_Retry: 60
Master_Log_File: mysql-bin.000001
Read_Master_Log_Pos: 882
Relay_Log_File: mysql-relay-node1.000002
Relay_Log_Pos: 1169
Relay_Master_Log_File: mysql-bin.000001
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
Replicate_Do_DB:
Replicate_Ignore_DB:
Replicate_Do_Table:
Replicate_Ignore_Table:
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
Last_Errno: 0
Last_Error:
Skip_Counter: 0
Exec_Master_Log_Pos: 882
Relay_Log_Space: 1468
Until_Condition: None
Until_Log_File:
Until_Log_Pos: 0
Master_SSL_Allowed: No
Master_SSL_CA_File:
Master_SSL_CA_Path:
Master_SSL_Cert:
Master_SSL_Cipher:
Master_SSL_Key:
Seconds_Behind_Master: 0
Master_SSL_Verify_Server_Cert: No
Last_IO_Errno: 0
Last_IO_Error:
Last_SQL_Errno: 0
Last_SQL_Error:
Replicate_Ignore_Server_Ids:
Master_Server_Id: 101
Master_SSL_Crl:
Master_SSL_Crlpath:
Using_Gtid: Current_Pos
Gtid_IO_Pos: 1010-101-3
Retried_transactions: 0
Max_relay_log_size: 1073741824
Executed_log_entries: 14
Slave_received_heartbeats: 0
Slave_heartbeat_period: 1800.000
Gtid_Slave_Pos: 1010-101-3,1030-103-3,1020-102-119,1040-104-2
*************************** 2. row ***************************
Connection_name: node2
Slave_SQL_State: Slave has read all relay log; waiting for the slave I/O thread to update it
Slave_IO_State: Waiting for master to send event
Master_Host: 127.0.0.1
Master_User: rsandbox
Master_Port: 19022
Connect_Retry: 60
Master_Log_File: mysql-bin.000001
Read_Master_Log_Pos: 3230973
Relay_Log_File: mysql-relay-node2.000002
Relay_Log_Pos: 3231260
Relay_Master_Log_File: mysql-bin.000001
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
Replicate_Do_DB:
Replicate_Ignore_DB:
Replicate_Do_Table:
Replicate_Ignore_Table:
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
Last_Errno: 0
Last_Error:
Skip_Counter: 0
Exec_Master_Log_Pos: 3230973
Relay_Log_Space: 3231559
Until_Condition: None
Until_Log_File:
Until_Log_Pos: 0
Master_SSL_Allowed: No
Master_SSL_CA_File:
Master_SSL_CA_Path:
Master_SSL_Cert:
Master_SSL_Cipher:
Master_SSL_Key:
Seconds_Behind_Master: 0
Master_SSL_Verify_Server_Cert: No
Last_IO_Errno: 0
Last_IO_Error:
Last_SQL_Errno: 0
Last_SQL_Error:
Replicate_Ignore_Server_Ids:
Master_Server_Id: 102
Master_SSL_Crl:
Master_SSL_Crlpath:
Using_Gtid: Current_Pos
Gtid_IO_Pos: 1020-102-119
Retried_transactions: 0
Max_relay_log_size: 1073741824
Executed_log_entries: 263
Slave_received_heartbeats: 0
Slave_heartbeat_period: 1800.000
Gtid_Slave_Pos: 1010-101-3,1030-103-3,1020-102-119,1040-104-2
*************************** 3. row ***************************
Connection_name: node3
Slave_SQL_State: Slave has read all relay log; waiting for the slave I/O thread to update it
Slave_IO_State: Waiting for master to send event
Master_Host: 127.0.0.1
Master_User: rsandbox
Master_Port: 19023
Connect_Retry: 60
Master_Log_File: mysql-bin.000001
Read_Master_Log_Pos: 882
Relay_Log_File: mysql-relay-node3.000002
Relay_Log_Pos: 1169
Relay_Master_Log_File: mysql-bin.000001
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
Replicate_Do_DB:
Replicate_Ignore_DB:
Replicate_Do_Table:
Replicate_Ignore_Table:
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
Last_Errno: 0
Last_Error:
Skip_Counter: 0
Exec_Master_Log_Pos: 882
Relay_Log_Space: 1468
Until_Condition: None
Until_Log_File:
Until_Log_Pos: 0
Master_SSL_Allowed: No
Master_SSL_CA_File:
Master_SSL_CA_Path:
Master_SSL_Cert:
Master_SSL_Cipher:
Master_SSL_Key:
Seconds_Behind_Master: 0
Master_SSL_Verify_Server_Cert: No
Last_IO_Errno: 0
Last_IO_Error:
Last_SQL_Errno: 0
Last_SQL_Error:
Replicate_Ignore_Server_Ids:
Master_Server_Id: 103
Master_SSL_Crl:
Master_SSL_Crlpath:
Using_Gtid: Current_Pos
Gtid_IO_Pos: 1030-103-3
Retried_transactions: 0
Max_relay_log_size: 1073741824
Executed_log_entries: 14
Slave_received_heartbeats: 0
Slave_heartbeat_period: 1800.000
Gtid_Slave_Pos: 1010-101-3,1030-103-3,1020-102-119,1040-104-2
3 rows in set (0.00 sec)
Here too, as in MySQL 5.7, we have multiple rows. As we saw in the previous article, the GTIDs mentioned here are the ones that were received, while the executed ones are available in @@global.gtid_current_pos. Sadly, SHOW SLAVE STATUS (and SHOW ALL SLAVES STATUS) lists the GTIDs for all channels in each row, making monitoring unnecessarily complex.
Alongside SHOW ALL SLAVES STATUS, MariaDB introduces START ALL SLAVES and STOP ALL SLAVES, while in MySQL 5.7 the old commands, without any mention of a channel, act on all channels.
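For a single connection, MariaDB also accepts the connection name in the status command, or lets you set a default connection for the session; for example (a sketch, using the connection names of this setup):
-- status of a single named connection
SHOW SLAVE 'node2' STATUS\G
-- or set a default connection for the session and use the plain command
SET @@default_master_connection = 'node2';
SHOW SLAVE STATUS\G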
The only table that stores GTIDs shows a more organised output:
node4 [localhost] {msandbox} (mysql) > select * from gtid_slave_pos;
+-----------+--------+-----------+--------+
| domain_id | sub_id | server_id | seq_no |
+-----------+--------+-----------+--------+
| 1010 | 2 | 101 | 2 |
| 1010 | 3 | 101 | 3 |
| 1020 | 124 | 102 | 118 |
| 1020 | 125 | 102 | 119 |
| 1030 | 8 | 103 | 2 |
| 1030 | 9 | 103 | 3 |
+-----------+--------+-----------+--------+
Here we see two rows for each channel. I am not sure how many are supposed to be recorded, but for the purpose of resuming replication after a crash this is enough.
Multi-source replication: expectations and reality
Discovering the possibility of multi-source topologies unleashes, in some users, an immediate urge to apply the technology to a pet problem. Soon, however, one of the following problems arises:
- If you are trying to scale master writes by using multiple masters, it's going to fail. Due to the nature of replication, each write in one master is replicated to the other masters. Regardless of the technology used (circular replication, MySQL 5.7 or MariaDB 10 multi-source, Tungsten Replicator), all the writes in one master will go to the other masters. The only way of scaling master writes is sharding, and multi-source replication cannot help you there.
- If you are using multiple masters to solve the fan-in problem, or any case where you want to insert data from different nodes without changing your application, you may run into conflicts (see the sketch below). There are several kinds of them: I have written an article about them, and another one explaining that with vanilla asynchronous replication you cannot solve conflicts; you need to avoid them in your application. Many years ago (~2006) the MySQL team was looking into possible solutions for conflict resolution, but that project was shelved. I hope it will be resumed and implemented.
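To make the conflict problem concrete, here is a hypothetical scenario (schema, table, and values are invented for illustration):
-- on the master in New York
INSERT INTO hq.orders (id, origin) VALUES (1001, 'NewYork');
-- at roughly the same time, on the master in London
INSERT INTO hq.orders (id, origin) VALUES (1001, 'London');
-- on the fan-in slave, whichever channel applies its event second stops with a
-- duplicate-key error in Last_SQL_Error, and that channel stays stopped until
-- the DBA resolves the conflict by hand.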
Summing up
Multi-source replication with MySQL is now a reality, and it can be deployed and used relatively easily.
However, the monitoring features should be improved to meet the demands introduced by the new functionality. Using multi-source replication as it is today, with either MySQL 5.7 or MariaDB 10, is a risky proposition, as the DBA has limited visibility into the workings of the system.
Looking to the future, I see room for improvement, and I hope that the development teams will take feedback such as this into account to create better tools for DBAs.
Coming next: more topologies
When you hear “multi-source”, you mostly picture what we have described in this article: one slave, many masters. But multi-source is just the main building block that allows us to assemble more complex topologies.
In the next parts of this article we will see topologies where many nodes are masters with equal powers:
- Point-to-point all-masters replication, which is the most efficient and resilient topology with multiple masters (no SPOF).
- The star topology, a lightweight all-masters scenario with one SPOF.
- Hybrid scenarios, where we mix this and that.