Thursday, August 11, 2011

Usability improvements in Tungsten Replicator 2.0.4

If you love a software product, you should try to improve it, and not be afraid of criticizing it. This principle has guided me with MySQL (where I have submitted many usability bugs and discussed the interface with developers for years), and it proves true for Tungsten Replicator as well. When I started working at Continuent, I was impressed by the technology, but I found the installation procedure and the product logs quite discouraging. I would almost say disturbing. Fortunately, my colleagues have agreed with my focus on usability, and we can now enjoy some tangible improvements.

I have already mentioned the new installation procedure, which requires just one command to install a full master/slave cluster. I would like to show how you can use the new installer to deploy a multiple source replication topology, where the masters on three nodes all replicate to a fourth node.

The first step is to install one master on each node. I can run the commands from node #4, which is the one that will eventually receive the updates from the remote masters, and where I need to install the slave services:
TUNGSTEN_BASE=$HOME/newinst
SERVICES=(alpha bravo charlie delta)
REPLICATOR=$TUNGSTEN_BASE/tungsten/tungsten-replicator/bin/replicator

for N in 1 2 3 4
do
    INDEX=$(($N-1))

    ./tools/tungsten-installer \
        --master-slave \
        --master-host=qa.r$N.continuent.com \
        --datasource-user=tungsten \
        --datasource-password=secret \
        --service-name=${SERVICES[$INDEX]} \
        --home-directory=$TUNGSTEN_BASE \
        --cluster-hosts=qa.r$N.continuent.com \
        --start-and-report
done
The above loop will install a master (remotely or locally) on each of the four servers. Then I need to create the slave services. For that, I use the updated configure-service script in the tools directory:
TUNGSTEN_TOOLS=$TUNGSTEN_BASE/tungsten/tools
COMMON_OPTIONS='-C -q 
    --local-service-name=delta 
    --role=slave 
    --service-type=remote 
    --allow-bidi-unsafe=true 
    --datasource=qa_r4_continuent_com' 

$TUNGSTEN_TOOLS/configure-service $COMMON_OPTIONS --master-host=qa.r1.continuent.com  alpha 
$TUNGSTEN_TOOLS/configure-service $COMMON_OPTIONS --master-host=qa.r2.continuent.com  bravo
$TUNGSTEN_TOOLS/configure-service $COMMON_OPTIONS --master-host=qa.r3.continuent.com  charlie 

$TUNGSTEN_BASE/tungsten/tungsten-replicator/bin/replicator restart
$TUNGSTEN_BASE/tungsten/tungsten-replicator/bin/trepctl services
These commands create the slave services locally on delta (node #4). After restarting the replicator, a simple test is to create something different in each master and check that the data has replicated to the single slave.

The latest usability improvement is the simplification of the replicator logs. Until a few days ago, an error in the replicator produced a long list of not exactly helpful output. For example, if I create a table in a slave and then create the same table in the master, I will break replication. The extended log would produce something like this:
INFO   | jvm 1    | 2011/08/11 18:10:52 | 2011-08-11 18:10:52,216 [tsandbox - q-to-dbms-0] ERROR pipeline.SingleThreadStageTask Event application failed: seqno=1 fragno=0 message=java.sql.SQLException: Statement failed on slave but succeeded on master
INFO   | jvm 1    | 2011/08/11 18:10:52 | 2011-08-11 18:10:52,217 [tsandbox - Event dispatcher thread] ERROR management.OpenReplicatorManager Received error notification, shutting down services: Event application failed: seqno=1 fragno=0 message=java.sql.SQLException: Statement failed on slave but succeeded on master
INFO   | jvm 1    | 2011/08/11 18:10:52 | com.continuent.tungsten.replicator.applier.ApplierException: java.sql.SQLException: Statement failed on slave but succeeded on master
INFO   | jvm 1    | 2011/08/11 18:10:52 |       at com.continuent.tungsten.replicator.applier.MySQLDrizzleApplier.applyStatementData(MySQLDrizzleApplier.java:183)
INFO   | jvm 1    | 2011/08/11 18:10:52 |       at com.continuent.tungsten.replicator.applier.JdbcApplier.apply(JdbcApplier.java:1233)
INFO   | jvm 1    | 2011/08/11 18:10:52 |       at com.continuent.tungsten.replicator.applier.ApplierWrapper.apply(ApplierWrapper.java:101)
INFO   | jvm 1    | 2011/08/11 18:10:52 |       at com.continuent.tungsten.replicator.pipeline.SingleThreadStageTask.runTask(SingleThreadStageTask.java:498)
INFO   | jvm 1    | 2011/08/11 18:10:52 |       at com.continuent.tungsten.replicator.pipeline.SingleThreadStageTask.run(SingleThreadStageTask.java:155)
INFO   | jvm 1    | 2011/08/11 18:10:52 |       at java.lang.Thread.run(Unknown Source)
INFO   | jvm 1    | 2011/08/11 18:10:52 | Caused by: java.sql.SQLException: Statement failed on slave but succeeded on master
INFO   | jvm 1    | 2011/08/11 18:10:52 |       at com.continuent.tungsten.replicator.applier.MySQLDrizzleApplier.applyStatementData(MySQLDrizzleApplier.java:139)
INFO   | jvm 1    | 2011/08/11 18:10:52 |       ... 5 more
INFO   | jvm 1    | 2011/08/11 18:10:52 | Caused by: java.sql.SQLSyntaxErrorException: Table 't1' already exists
INFO   | jvm 1    | 2011/08/11 18:10:52 |       at org.drizzle.jdbc.internal.SQLExceptionMapper.get(SQLExceptionMapper.java:78)
INFO   | jvm 1    | 2011/08/11 18:10:52 |       at org.drizzle.jdbc.DrizzleStatement.executeBatch(DrizzleStatement.java:930)
INFO   | jvm 1    | 2011/08/11 18:10:52 |       at com.continuent.tungsten.replicator.applier.MySQLDrizzleApplier.applyStatementData(MySQLDrizzleApplier.java:125)
INFO   | jvm 1    | 2011/08/11 18:10:52 |       ... 5 more
INFO   | jvm 1    | 2011/08/11 18:10:52 | Caused by: org.drizzle.jdbc.internal.common.QueryException: Table 't1' already exists
INFO   | jvm 1    | 2011/08/11 18:10:52 |       at org.drizzle.jdbc.internal.mysql.MySQLProtocol.executeQuery(MySQLProtocol.java:500)
INFO   | jvm 1    | 2011/08/11 18:10:52 |       at org.drizzle.jdbc.internal.mysql.MySQLProtocol.executeBatch(MySQLProtocol.java:546)
INFO   | jvm 1    | 2011/08/11 18:10:52 |       at org.drizzle.jdbc.DrizzleStatement.executeBatch(DrizzleStatement.java:917)
INFO   | jvm 1    | 2011/08/11 18:10:52 |       ... 6 more
INFO   | jvm 1    | 2011/08/11 18:10:52 | 2011-08-11 18:10:52,218 [tsandbox - Event dispatcher thread] WARN  management.OpenReplicatorManager Performing emergency service shutdown
INFO   | jvm 1    | 2011/08/11 18:10:52 | 2011-08-11 18:10:52,219 [tsandbox - Event dispatcher thread] INFO  pipeline.Pipeline Shutting down pipeline: slave
INFO   | jvm 1    | 2011/08/11 18:10:52 | 2011-08-11 18:10:52,219 [tsandbox - q-to-dbms-0] INFO  pipeline.SingleThreadStageTask Terminating processing for stage task thread
INFO   | jvm 1    | 2011/08/11 18:10:52 | 2011-08-11 18:10:52,219 [tsandbox - q-to-dbms-0] INFO  pipeline.SingleThreadStageTask Last successfully processed event prior to termination: seqno=0 eventid=mysql-bin.000002:0000000000000426;20
Did you see the reason for the error? No? Neither did I. I would need to open the THL, look for event #1, and determine what it was. Instead, the new user.log looks like this:
2011-08-11 18:10:52,216 ERROR Received error notification: Event application failed: seqno=1 fragno=0 message=java.sql.SQLException: Statement failed on slave but succeeded on master
Caused by : java.sql.SQLException: Statement failed on slave but succeeded on master
Caused by : Statement failed on slave but succeeded on master
Caused by : Table 't1' already exists
Caused by : Table 't1' already exists
2011-08-11 18:10:54,721 INFO  State changed ONLINE -> OFFLINE:ERROR
2011-08-11 18:10:54,721 WARN  Received irrelevant event for current state: state=OFFLINE:ERROR event=OfflineNotification
That's much better. It is not perfect yet, but it will be soon. Right now, it tells me what is wrong without forcing me to go hunting for it amid hundreds of stack trace lines. Give it a try, using the latest replicator build.
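For anyone still reading the extended log from an older build, a similar condensed view can be approximated with standard tools. The following is only a sketch, not part of Tungsten: the function name summarize_errors is made up for illustration, and the prefix pattern is an assumption based on the wrapper lines shown in the sample above. It strips the wrapper prefix, then keeps the ERROR lines and the "Caused by" lines while dropping the stack frames.

```shell
# Hypothetical helper (not shipped with Tungsten): condense a verbose
# replicator log to its error messages and their causes.
summarize_errors() {
    # Step 1: strip the wrapper prefix, assumed to look like
    #         "INFO   | jvm 1    | 2011/08/11 18:10:52 | "
    # Step 2: keep ERROR lines and "Caused by:" lines; the "at ..." stack
    #         frames and the INFO/WARN lines are dropped.
    # Reads the files given as arguments, or standard input if none.
    sed -E 's#^[A-Z]+ *\| *jvm [0-9]+ *\| *[0-9]{4}/[0-9]{2}/[0-9]{2} [0-9:]{8} \| ?##' "$@" |
        grep -E '(^| )ERROR |^Caused by:'
}
```

It can be given a log file as an argument (summarize_errors /path/to/replicator.log, where the path is a placeholder) or fed through a pipe. Unlike user.log, it does not remove the repeated causes, so the output is longer than what the replicator now produces by itself.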

4 comments:

Rumbi said...

Hi

I am kindly asking that you please send me a step by step tutorial on how to set up fan in replication with two separate databases on two master servers being replicated onto one shared database on a slave machine using tungsten replicator. It is my first time to use tungsten so I need a detailed approach right from installation to configuration and lastly testing.

I have searched the internet and all I can find is the fact that it is possible but not how exactly to do so.

Your assistance will be greatly appreciated.

Rumbi
mukrue@yahoo.com

Giuseppe Maxia said...

@Rumbi,
I don't do private tutorials by email.
If the information in this article is not enough, you may supplement it with information from the Tungsten Cookbook.

Be aware that there is also a Tungsten Sandbox, which installs several test topologies (including fan-in) in a single host.

Should that not be enough, you may ask for professional services from my company (sales@continuent.com)

Rumbi said...

Thanks. I am running virtual machines for the setup. I keep getting the same error for each machine at the first step:
ERROR >> 10.1.1.204 >> root
./ruby/lib/net/ssh.rb:192:in `start'
./ruby/configurator.rb:1111:in `ssh_result'
./ruby/configure/modules/validation_deployment.rb:52:in `validate'
./ruby/configure/validation_check_interface.rb:37:in `run'
./ruby/configure/configure_validation_handler.rb:67:in `prevalidate_config'
./ruby/configure/configure_validation_handler.rb:64:in `each'
./ruby/configure/configure_validation_handler.rb:64:in `prevalidate_config'
./ruby/configure/configure_validation_handler.rb:51:in `prevalidate'
./ruby/configure/configure_validation_handler.rb:48:in `each'
./ruby/configure/configure_validation_handler.rb:48:in `prevalidate'
./ruby/configure/configure_deployment.rb:27:in `prevalidate'
./ruby/configurator.rb:235:in `run'
ruby/configure.rb:51
###################################
# Validation failed
###################################

# Errors for 10.1.1.204
###################################
ERROR >> 10.1.1.204 >> root
./ruby/lib/net/ssh.rb:192:in `start'
./ruby/configurator.rb:1111:in `ssh_result'
./ruby/configure/modules/validation_deployment.rb:52:in `validate'
./ruby/configure/validation_check_interface.rb:37:in `run'
./ruby/configure/configure_validation_handler.rb:67:in `prevalidate_config'
./ruby/configure/configure_validation_handler.rb:64:in `each'
./ruby/configure/configure_validation_handler.rb:64:in `prevalidate_config'
./ruby/configure/configure_validation_handler.rb:51:in `prevalidate'
./ruby/configure/configure_validation_handler.rb:48:in `each'
./ruby/configure/configure_validation_handler.rb:48:in `prevalidate'
./ruby/configure/configure_deployment.rb:27:in `prevalidate'
./ruby/configurator.rb:235:in `run'
ruby/configure.rb:51 (SSHLoginCheck)
root@Slave newinst/tungsten# ./tools/tungsten-installer --master-slave --master-host=10.1.1.204 --datasource-user=root --datasource-password=business08 --service-name=Master1 --cluster-hosts=10.1.1.204 --start-and-report done
ERROR >> 10.1.1.204 >> root
./ruby/lib/net/ssh.rb:192:in `start'
./ruby/configurator.rb:1111:in `ssh_result'
./ruby/configure/modules/validation_deployment.rb:52:in `validate'
./ruby/configure/validation_check_interface.rb:37:in `run'
./ruby/configure/configure_validation_handler.rb:67:in `prevalidate_config'
./ruby/configure/configure_validation_handler.rb:64:in `each'
./ruby/configure/configure_validation_handler.rb:64:in `prevalidate_config'
./ruby/configure/configure_validation_handler.rb:51:in `prevalidate'
./ruby/configure/configure_validation_handler.rb:48:in `each'
./ruby/configure/configure_validation_handler.rb:48:in `prevalidate'
./ruby/configure/configure_deployment.rb:27:in `prevalidate'
./ruby/configurator.rb:235:in `run'
ruby/configure.rb:51
###################################
# Validation failed
###################################
# Errors for 10.1.1.204
###################################
ERROR >> 10.1.1.204 >> root
./ruby/lib/net/ssh.rb:192:in `start'
./ruby/configurator.rb:1111:in `ssh_result'
./ruby/configure/modules/validation_deployment.rb:52:in `validate'
./ruby/configure/validation_check_interface.rb:37:in `run'
./ruby/configure/configure_validation_handler.rb:67:in `prevalidate_config'
./ruby/configure/configure_validation_handler.rb:64:in `each'
./ruby/configure/configure_validation_handler.rb:64:in `prevalidate_config'
./ruby/configure/configure_validation_handler.rb:51:in `prevalidate'
./ruby/configure/configure_validation_handler.rb:48:in `each'
./ruby/configure/configure_validation_handler.rb:48:in `prevalidate'
./ruby/configure/configure_deployment.rb:27:in `prevalidate'
./ruby/configurator.rb:235:in `run'
ruby/configure.rb:51 (SSHLoginCheck)

Giuseppe Maxia said...

@Rumbi,
This is not the place to report problems. There is a mailing list.
And please check the "Troubleshooting" page in the wiki.