As the previous poster indicates, every modern machine type has a special class of instructions known as atomics that behave as described: they serialize execution against at least the specified memory location.
On x86, there is a LOCK assembler prefix that indicates to the machine that the next instruction should be handled atomically. When the instruction is encountered, several things effectively happen on x86.
- Pending read prefetches are canceled (this means that the CPU won't present data to the program that may be made stale across the atomic).
- Pending writes to memory are flushed.
- The operation is performed, guaranteed atomically and serialized against other CPUs. In this context, "serialized" means they happen one at a time. "Atomically" means "all the parts of this instruction happen without anything else intervening".
For x86, there are two commonly used instructions that are used to implement locks.
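- CMPXCHG. Compare and exchange (needs the LOCK prefix to be atomic across CPUs). Pseudocode:
    uint32 cmpxchg(uint32 *memory_location, uint32 old_value, uint32 new_value) {
        atomically {
            uint32 current_value = *memory_location;
            if (current_value == old_value)
                *memory_location = new_value;
            return current_value;  // the value seen before any store
        }
    }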
- XCHG. Exchange (implicitly locked when one operand is in memory). Pseudocode:
    uint32 xchg(uint32 *memory_location, uint32 new_value) {
        atomically {
            uint32 old_value = *memory_location;
            *memory_location = new_value;
            return old_value;
        }
    }
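For comparison, here is a minimal sketch of the same two primitives in portable C11, assuming a compiler that provides `<stdatomic.h>`. The names `cmpxchg32` and `xchg32` are made up for this example; on x86, compilers typically lower them to LOCK CMPXCHG and XCHG, but that is up to the compiler.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical helper: store new_value only if *loc == expected.
   Returns the value *loc held before the operation. */
static inline uint32_t cmpxchg32(_Atomic uint32_t *loc, uint32_t expected,
                                 uint32_t new_value)
{
    /* On failure, the call writes the observed value into `expected`.
       On success, `expected` already equals the previous value. */
    atomic_compare_exchange_strong(loc, &expected, new_value);
    return expected;
}

/* Hypothetical helper: unconditionally store new_value and
   return the previous contents of *loc. */
static inline uint32_t xchg32(_Atomic uint32_t *loc, uint32_t new_value)
{
    return atomic_exchange(loc, new_value);
}
```

Returning the previous value is what lets the caller tell whether the compare succeeded: it did if and only if the result equals the `expected` argument that was passed in.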
So, you can implement a lock like this:
    uint32 mylock = 0;
    while (cmpxchg(&mylock, 0, 1) != 0)
        ;
We spin, waiting for the lock, hence, spinlock.
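A rough, compilable version of that spinlock in C11 might look like the following. This is only a sketch: it assumes POSIX threads are available (build with `-pthread`), and `spin_lock`, `spin_unlock`, `worker`, and `run_workers` are names invented here. It also adds the unlock step, which is just an atomic store of 0.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

#define MAX_THREADS 16
#define ITERATIONS  100000

static atomic_uint mylock = 0;     /* 0 = free, 1 = held */
static unsigned long counter = 0;  /* only touched while holding mylock */

static void spin_lock(atomic_uint *lock)
{
    unsigned expected = 0;
    /* Keep trying to swap 0 -> 1. A failed attempt overwrites `expected`
       with the value it saw, so reset it before retrying. */
    while (!atomic_compare_exchange_strong(lock, &expected, 1))
        expected = 0;
}

static void spin_unlock(atomic_uint *lock)
{
    atomic_store(lock, 0);
}

static void *worker(void *arg)
{
    (void)arg;
    for (int i = 0; i < ITERATIONS; i++) {
        spin_lock(&mylock);
        counter++;                 /* plain increment, protected by the lock */
        spin_unlock(&mylock);
    }
    return NULL;
}

/* Runs up to MAX_THREADS workers and returns the final counter value. */
static unsigned long run_workers(int nthreads)
{
    pthread_t threads[MAX_THREADS];
    if (nthreads > MAX_THREADS)
        nthreads = MAX_THREADS;
    counter = 0;
    for (int i = 0; i < nthreads; i++)
        pthread_create(&threads[i], NULL, worker, NULL);
    for (int i = 0; i < nthreads; i++)
        pthread_join(threads[i], NULL);
    return counter;
}
```

If the lock works, `run_workers(4)` should come back with exactly 4 * ITERATIONS; drop the lock calls and increments will usually get lost.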
Now, unlocked instructions don't exhibit these nice behaviors. Depending on what machine you're on, with unlocked instructions, all sorts of violations of consistency can be observed. For example, even on x86, which has a very friendly memory consistency model, the following could be observed:
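    Thread 1          Thread 2
    mov [w], 0        mov [x], 0
    mov [w], 1        mov [x], 2
    mov eax, [x]      mov eax, [w]
    mov [y], eax      mov [z], eax
At the end of this program, y and z can both have the value 0! Each thread's store of the new value can still be sitting in its CPU's store buffer when the following load executes, so each load may see the other location's old value.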
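To see the ordered counterpart, here is a small sketch of this kind of test in C11, where each thread stores to its own location and then loads the other one. It assumes POSIX threads, and `thread1`, `thread2`, and `count_both_zero` are names invented for the example. With the default sequentially consistent atomics, the both-zero outcome is forbidden by the C11 memory model; on x86, compilers typically get that by emitting XCHG or MOV plus MFENCE for the store.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

static atomic_int w, x;  /* shared locations, reset before each round */
static int y, z;         /* results, read only after both threads are joined */

static void *thread1(void *arg)
{
    (void)arg;
    atomic_store(&w, 1);   /* sequentially consistent store */
    y = atomic_load(&x);   /* then look at the other thread's location */
    return NULL;
}

static void *thread2(void *arg)
{
    (void)arg;
    atomic_store(&x, 2);
    z = atomic_load(&w);
    return NULL;
}

/* Runs the two threads `rounds` times and counts how often
   both of them read 0. */
static int count_both_zero(int rounds)
{
    int hits = 0;
    for (int i = 0; i < rounds; i++) {
        pthread_t a, b;
        atomic_store(&w, 0);
        atomic_store(&x, 0);
        pthread_create(&a, NULL, thread1, NULL);
        pthread_create(&b, NULL, thread2, NULL);
        pthread_join(a, NULL);
        pthread_join(b, NULL);
        if (y == 0 && z == 0)
            hits++;
    }
    return hits;
}
```

Switching the stores and loads to `memory_order_relaxed` gives plain MOVs on x86 and allows the both-zero result again, though how often it actually shows up depends on timing.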
Anyway, one last note: LOCK on x86 can be applied to instructions such as ADD, OR, and AND, in order to get consistent and atomic read-modify-write semantics for the instruction. This is important for, say, setting flag variables and making sure they don't get lost. Without that, you have this problem:
    Thread 1          Thread 2
    OR [x], 0x1       OR [x], 0x2
If x starts at 0, the possible values for x at the end of this program are 1, 2, and 0x1|0x2 (3). In order to get a correct program, you need:
    Thread 1          Thread 2
    LOCK OR [x], 0x1  LOCK OR [x], 0x2
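In C11 the same idea can be sketched with `atomic_fetch_or`, which compilers typically turn into LOCK OR on x86 when the result is unused. This assumes POSIX threads, and `set_flag` and `set_flags_concurrently` are names made up for the example.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

static atomic_uint flags = 0;

static void *set_flag(void *arg)
{
    unsigned bit = *(unsigned *)arg;
    atomic_fetch_or(&flags, bit);  /* atomic read-modify-write */
    return NULL;
}

/* Sets bits 0x1 and 0x2 from two threads and returns the combined flags. */
static unsigned set_flags_concurrently(void)
{
    static unsigned bits[2] = { 0x1, 0x2 };
    pthread_t threads[2];

    atomic_store(&flags, 0);
    for (int i = 0; i < 2; i++)
        pthread_create(&threads[i], NULL, set_flag, &bits[i]);
    for (int i = 0; i < 2; i++)
        pthread_join(threads[i], NULL);
    return atomic_load(&flags);
}
```

Because each OR is a single atomic read-modify-write, neither thread can overwrite the other's bit, so the result is always 3.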
Hope this helps.