Theoretically, can one define a protocol where one machine makes remote calls on another machine (or several), and where, if at any point in the process one of the machines (or operations) fails or the communication drops, everything is rolled back? (Just like databases can.)
I ask because, at the hardware level, it is commonly said that you cannot build atomic operations (a key ingredient of transactions) without an atomic processor instruction such as test-and-set.
But since we are now talking about multiple machines, that approach doesn't fly.
As an example of how this gets tricky: say I have a protocol to issue a command on a remote machine and get a response back. It could be that the method is called, but the connection dies while the response is in transit. In that case the machine that performed the operation thinks everything went fine, while the calling machine never received the answer.
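To make the ambiguity concrete, here is a minimal sketch of the calling side (the host, port, command string and line-based wire protocol are all made up for illustration):

```python
import socket

HOST, PORT = "remote.example", 9000  # hypothetical remote machine

def issue_command(command: str, timeout: float = 2.0) -> str | None:
    """Send a command and wait for the response.

    Returns the response, or None on timeout/connection loss. On None the
    caller cannot tell whether the remote machine executed the command:
    the request may have been lost before it arrived, or the response may
    have been lost after the command already ran.
    """
    try:
        with socket.create_connection((HOST, PORT), timeout=timeout) as sock:
            sock.sendall(command.encode() + b"\n")
            data = sock.recv(4096)  # may time out even though the command ran
            return data.decode() or None
    except OSError:
        return None

result = issue_command("TRANSFER 100")
if result is None:
    # Ambiguous: did the transfer happen or not? Retrying blindly could
    # execute it twice; not retrying could mean it never ran at all.
    pass
```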
Adding acks doesn't help either, since the acks themselves might be lost in transit.
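A quick sketch of why an extra ack round only moves the problem (again with hypothetical names; `execute` stands in for whatever the remote operation actually is): now it is the executing side that is left guessing.

```python
import socket

def execute(command: str) -> str:
    # Hypothetical stand-in for the real remote operation.
    return "OK " + command

def handle_request(conn: socket.socket, command: str) -> None:
    result = execute(command)            # the operation has now happened
    conn.sendall(result.encode() + b"\n")
    conn.settimeout(2.0)
    try:
        ack = conn.recv(16)              # wait for the caller's ack
    except OSError:
        ack = b""
    if not ack:
        # The ack was lost (or never sent). The command has already run,
        # but this side can't tell whether the caller knows that. Adding
        # yet another message just repeats the same situation one level up.
        pass
```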
Interested to read others' thoughts (and probably to learn that some professor already came up with a rock-solid solution 27 years ago).
R