English 中文(简体)
Articles about replication schemes/algorithms?
原标题:

I m designing a distributed system with a certain flow of data in it. I d like to guarantee that at least N nodes have almost-current data at any given time. I do not need complete consistency, only eventual consistency (t.i. for any time instant, the current snapshot of data should eventually appear on at least N nodes. It is tricky to define the term "current" here, but still). Nodes may fail and go back up at any moment, and there is no single "central" node.

O overflowers! Point me to some good papers describing replication schemes. I ve so far found one: Consistency Management in Optimistic Replication Algorithms and a more broad and recent article by the same author: Optimistic Replication.

问题回答

A lot of the trick to this is finding your exact requirements, and yours still sound pretty vague. Do you just need to support operations like this?

  • Update key K to value V.
  • Look up a somewhat-recent value of key K.

You mentioned you need eventual consistency. So if you do a single update, it will eventually replicate everywhere. If you do two nearly-simultaneous updates, do you care which one wins? If one replica reports that an update was successfully completed, do you care if the value could be lost if that replica were to temporarily crash shortly afterward? Or if that replica were permanently destroyed?

How precise should somewhat-recent be? If there s a netsplit or something, a lookup might return a very stale result or just fail. Do you care which?

Do you ever need to support fancier operations like...

  • Get the absolute latest value of key K?
  • Update the value of key K to value V provided the latest value is currently V?

Do you have rigid reliability, latency, and/or bandwidth requirements? How far apart are your replicas / how good is the network between? This impacts if you can have cross-replica communication on every update and even on every lookup; or even if you can/should fail over operations to a remote replica if the local one seems to be down.

Depending on your answers here, I ve worked with a couple different schemes that might meet your requirements. There are several possible variations on them.

  • The simplest thing is to just have the application always talk to the local replica. Replicas timestamp values (using NTP-synced clocks) and only talk to each other for asynchronous replication. Highest timestamp wins in replication. Of course, if applications on two different replicas each do a read/modify/write near simultaneously, one of the modifications can easily be lost. (In fact, without a conditional update scheme, the same is even true for near-simultaneous changes on the same replica.) If a replica permanently fails, recent-ish updates can be lost. This is more or less what Bigtable s built-in replication does. In the paper you linked, it d be the "Optimistic - Multimaster" branch but not caring too much about losing some updates makes it simpler than they suggest.
  • Some databases use the Paxos algorithm (see for example "Data Management for Internet-Scale Single-Sign-On" here to make fancier things possible. Each replica can know how far behind it might be so you can say "give me a value that s no more than 1 minute old" or "give me the absolute latest value". An update isn t considered complete until a quorum of replicas have accepted it, so "give me the absolute latest value" will definitely always return that value until another update happens. You can do the conditional update operation I mentioned to prevent simultaneous writers from tramping each other. This doesn t seem to fit neatly into either the optimistic or pessimistic category as defined by that author because updates are replicated synchronously to a quorum but replicas which didn t vote in the latest Paxos round may still be able to answer some queries. The scheme can be very complicated, though...

Not RDBMS agnostic, but SQL Server 2008 (2005 onwards) supports Peer-To-Peer Replication





相关问题
what is wrong with this mysql code

$db_user="root"; $db_host="localhost"; $db_password="root"; $db_name = "fayer"; $conn = mysqli_connect($db_host,$db_user,$db_password,$db_name) or die ("couldn t connect to server"); // perform query ...

Users asking for denormalized database

I am in the early stages of developing a database-driven system and the largest part of the system revolves around an inheritance type of relationship. There is a parent entity with about 10 columns ...

Easiest way to deal with sample data in Java web apps?

I m writing a Java web app in my free time to learn more about development. I m using the Stripes framework and eventually intend to use hibernate and MySQL For the moment, whilst creating the pages ...

join across databases with nhibernate

I am trying to join two tables that reside in two different databases. Every time, I try to join I get the following error: An association from the table xxx refers to an unmapped class. If the ...

How can I know if such value exists in database? (ADO.NET)

For example, I have a table, and there is a column named Tags . I want to know if value programming exists in this column. How can I do this in ADO.NET? I did this: OleDbCommand cmd = new ...

Convert date to string upon saving a doctrine record

I m trying to migrate one of my PHP projects to Doctrine. I ve never used it before so there are a few things I don t understand. In my current code, I have a class similar to this: class ...

热门标签