MPI buffered send/receive order

I'm using MPI (with Fortran, but the question is more specific to the MPI standard than to any given language), and specifically the non-blocking send/receive functions isend and irecv. Now imagine the following scenario:

Process 0:

isend(stuff1, ...)
isend(stuff2, ...)

Process 1:

wait 10 seconds
irecv(in1, ...)
irecv(in2, ...)

Are the messages delivered to Process 1 in the order they were sent, i.e. can I be sure that in1 == stuff1 and in2 == stuff2 if the tag used is the same in all cases?

Best answer

Yes, the messages are received in the order they are sent. The standard describes this as messages being non-overtaking. See the MPI Standard's section on the semantics of point-to-point communication for more details; here's an excerpt:

Order. Messages are non-overtaking: If a sender sends two messages in succession to the same destination, and both match the same receive, then this operation cannot receive the second message if the first one is still pending. If a receiver posts two receives in succession, and both match the same message, then the second receive operation cannot be satisfied by this message, if the first one is still pending. This requirement facilitates matching of sends to receives. It guarantees that message-passing code is deterministic, if processes are single-threaded and the wildcard MPI_ANY_SOURCE is not used in receives. (Some of the calls described later, such as MPI_CANCEL or MPI_WAITANY, are additional sources of nondeterminism.)
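
The matching rule in the excerpt can be illustrated with a toy model in plain Python. This is not MPI, and it says nothing about how a real implementation works internally; `ToyChannel`, `scenario`, and the tag value 7 are hypothetical names and values invented for this sketch. It only models the rule that sends and receives matching the same (source, tag) pair are paired up in the order they were posted.

```python
from collections import defaultdict, deque


class ToyChannel:
    """Toy model of non-overtaking matching for one receiver (not real MPI)."""

    def __init__(self):
        # One FIFO queue of unmatched messages per (source, tag) pair.
        self._pending = defaultdict(deque)

    def send(self, source, tag, payload):
        # Messages from the same source with the same tag queue up in send order.
        self._pending[(source, tag)].append(payload)

    def recv(self, source, tag):
        # A receive always matches the oldest pending message for (source, tag).
        queue = self._pending[(source, tag)]
        if not queue:
            raise LookupError("no pending message for this (source, tag)")
        return queue.popleft()


def scenario():
    """Replay the question's scenario: two sends, then two late receives."""
    chan = ToyChannel()
    chan.send(source=0, tag=7, payload="stuff1")
    chan.send(source=0, tag=7, payload="stuff2")
    # The receiver shows up later; the delay does not change the matching.
    in1 = chan.recv(source=0, tag=7)
    in2 = chan.recv(source=0, tag=7)
    return in1, in2
```

In this model, `scenario()` pairs the first receive with `stuff1` and the second with `stuff2`, which mirrors what the standard guarantees for a single sender, a single tag, and a single-threaded receiver.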

Answers

Yes and no.

can I be sure that in1 == stuff1 and in2 == stuff2 if the tag used is the same in all cases?

Yes. There is a deterministic 1:1 correlation between sends and recvs, so the correct input ends up in the correct recv buffer. This behavior is guaranteed by the standard and is enforced by all conforming MPI implementations.

No. The exact order of internal message progression, and the exact order in which buffers on the receiver side are populated, is somewhat of a black box, especially when RDMA-style message transfers with multiple in-flight buffers are being used (e.g. InfiniBand).

If your code uses multiple threads and inspects the buffer to determine completeness (e.g. waiting on a bit to be toggled) rather than using MPI_Test or MPI_Wait, then the messages may arrive out of order (but in the correct buffer).
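
The difference between which buffer a message lands in and when that buffer gets filled can be sketched with a small pure-Python model. It is an illustration only, not MPI code: `deliver` and its `completion_order` parameter are hypothetical names, and the caller chooses the fill order explicitly to stand in for whatever the transport happens to do.

```python
def deliver(sends, recv_names, completion_order):
    """Toy model: bind receives to sends in posting order, fill in any order.

    Not real MPI. It only separates 'which buffer gets which message'
    (fixed by matching) from 'which buffer is filled first' (not fixed).
    """
    # Matching: the i-th posted receive is bound to the i-th send.
    bindings = list(zip(recv_names, sends))
    buffers = {}
    fill_order = []
    # Progress: transfers finish in the order given by completion_order.
    for index in completion_order:
        name, payload = bindings[index]
        buffers[name] = payload
        fill_order.append(name)
    return buffers, fill_order


# The second transfer finishes first, yet each buffer holds the right data.
buffers, fill_order = deliver(["stuff1", "stuff2"], ["in1", "in2"], [1, 0])
```

Here `buffers` maps `in1` to `stuff1` and `in2` to `stuff2` even though `fill_order` lists `in2` first, which is the situation a thread polling the raw buffers could observe.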

If your code depends on in1 = stuff1 being populated BEFORE in2 = stuff2 on the receiver side, and a single rank sends both messages, then using MPI_Issend (non-blocking, synchronous send) will guarantee the messages are received in order. If you need to guarantee the buffer population order of multiple recvs from multiple sending ranks, then some kind of blocking call is required between each recv (e.g. MPI_Recv, MPI_Barrier, MPI_Wait, etc.).




