Thursday, March 17, 2011

How fast is parallel replication? See it live today

I talked about parallel replication last month. Since then, there has been considerable interest in this feature. As far as I know, Tungsten's is the only implementation of this much-coveted feature, so I can only compare it with MySQL native replication.
The most compelling question is "how fast is it?"
That's a tricky one. The answer is the same one I give when someone asks me "how fast is MySQL?" I always say: it depends.
Running replication in a single thread is sometimes slower than the operations on the master. Many users complain that the single thread can't keep up with the master, and the slave lags behind. True. There is, however, a hidden benefit of single-threaded replication: it requires fewer resources. There is no contention for writing to disk, and no need to worry about several users blocking a table. You need to contend with the users that want to read the tables, but the lone writer has a simple job, albeit a heavy one.
When we introduce parallel replication, the simple job fades away, and we are faced with a new problem: how do I allow several writers to do the work of one? It's a nice problem to have. MySQL native replication does not allow parallel apply, but with Tungsten you can start tackling the issue of allowing several parallel threads to update the system at once. In essence, this is the same problem that you have on a server where several users are allowed to write at once. If the server has sufficient resources, the operations will be fast. If it doesn't, the operations will lag behind.
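To make the "several writers doing the work of one" idea concrete, here is a minimal sketch in Python. It is not Tungsten's implementation. It assumes that events can be partitioned by schema name and that events for different schemas are independent of each other. The function name apply_in_parallel, the channels parameter, and the (schema, statement) event format are all hypothetical.

```python
import queue
import threading
import zlib


def apply_in_parallel(events, channels=4):
    """Apply (schema, statement) events on several worker threads.

    Events for the same schema always go to the same worker, so their
    relative order is preserved; events for different schemas may be
    applied in any interleaving. Returns a dict mapping each worker
    index to the list of events it applied, in applied order.
    """
    queues = [queue.Queue() for _ in range(channels)]
    applied = {i: [] for i in range(channels)}

    def worker(index):
        while True:
            event = queues[index].get()
            if event is None:  # sentinel: no more events for this worker
                break
            # A real applier would execute the statement on the slave here.
            applied[index].append(event)

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(channels)]
    for t in threads:
        t.start()

    for schema, statement in events:
        # Stable hash, so a given schema always maps to the same channel.
        index = zlib.crc32(schema.encode("utf-8")) % channels
        queues[index].put((schema, statement))

    for q in queues:
        q.put(None)
    for t in threads:
        t.join()
    return applied
```

The sketch only shows the dispatching side. As soon as the workers really write to the server, they compete for disk and locks like any other group of concurrent users, which is why the result depends on the resources available.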
Another aspect of the question is "what kind of queries?" If your database schema is well established and set in stone, and you mostly run UPDATEs, the replication performance will depend on how well your server is tuned for concurrent writes. If you run ALTER TABLE statements on a daily basis, your queries will queue up after that ALTER TABLE no matter what. And if you have only INSERT and DELETE queries, parallel replication performance will probably depend on how fast your server is.
Ultimately, I can tell you that I have seen or experienced directly a wide range of repeatable results. I know cases where parallel replication is three times as fast as native replication. These cases usually involve huge amounts of binary logs, as when your slave needs to be taken off-line for a few hours or even days and then tries to catch up. Other cases that can be reproduced with a minimal amount of sample data show parallel replication as being 30% to 50% faster. And then there are cases when your server is so poor on resources, or the load is so unevenly distributed, that parallel replication is only as fast as native replication. I would say that these cases are easily cured by beefing up the server.
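To see what a faster apply rate means for a slave that is catching up, here is a back-of-the-envelope sketch. It assumes that the master keeps writing at a constant rate while the slave drains its backlog, and the function catch_up_hours and the sample figures are hypothetical, not measurements.

```python
def catch_up_hours(backlog_hours, apply_speed):
    """Hours needed to drain a backlog while the master keeps writing.

    backlog_hours: hours of master activity the slave has to replay.
    apply_speed:   how fast the slave applies changes, as a multiple of
                   the rate at which the master generates them.
    """
    if apply_speed <= 1:
        raise ValueError("the slave never catches up unless apply_speed > 1")
    # In t hours the slave applies apply_speed * t hours of work, while the
    # backlog grows to backlog_hours + t; solving for equality gives:
    return backlog_hours / (apply_speed - 1)


# Hypothetical figures: a 6-hour backlog, with a single thread applying at
# 1.5x the master's write rate and parallel apply three times as fast (4.5x).
native = catch_up_hours(6, 1.5)    # 12.0 hours
parallel = catch_up_hours(6, 4.5)  # about 1.7 hours
```

Under these assumptions, an apply rate three times as fast shortens the catch-up time by a factor of seven, because the slave also has to absorb what the master writes in the meantime.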
If you want to see a demo of how this replication works, you can join this webinar:
Zoom, Zoom, Zoom! MySQL Parallel Replication With Tungsten Replicator 2.0.
You can tell from the title that we are quite excited about the product that we are building.


Stewart Smith said...

for drizzle, if we enable per-catalog replication logs, doing parallel apply should be trivial. This solves the multi-tenancy problem of having other users introduce latency in replication. It doesn't (of course) solve the single db/table high workload problem though (much trickier).

Robert Hodges said...

Hi Stewart,

First of all congrats on the new release.

Are per-catalog logs necessarily separate files? If so, then having a lot of them would make logging very slow unless you are on a device that works well with random I/O. You have the same issue when reading from them.

Also, parallel apply has a number of non-trivial problems around recovery, serialization of updates between dependent databases (common in multi-tenant apps) and avoiding serialization points on I/O etc.

Robert Hodges said...

p.s., I would be very interested in talking more deeply with your team about parallel and multi-master replication features. With some small feature additions I believe drizzle could be very good at these problems.