One line of background: I'm the developer of Redis, a NoSQL database. One of the new features I'm implementing is Virtual Memory, because Redis keeps all its data in memory. Thanks to VM, Redis is able to transfer rarely used objects from memory to disk. There are a number of reasons why this works much better than letting the OS do the swapping for us (Redis objects are built of many small objects allocated in non-contiguous places, when serialized to disk by Redis they take 10 times less space than the memory pages where they live, and so forth).
Now I have an alpha implementation that's working perfectly on Linux, but not so well on Mac OS X Snow Leopard. From time to time, while Redis tries to move a page from memory to disk, the Redis process enters the uninterruptible wait state for minutes. I was unable to debug this, but it happens either in a call to fseeko() or fwrite(). After minutes the call finally returns and Redis continues working without any problems: no crash.
The amount of data transferred is very small, something like 256 bytes, so it should not be a matter of a very large amount of I/O being performed.
But there is an interesting detail about the swap file that is the target of the write operation. It's a big file (26 gigabytes) created by opening a file with fopen() and then enlarging it using ftruncate(). Finally the file is unlink()ed so that Redis keeps holding a reference to it, but we are sure that when the Redis process exits the OS will really free the swap file.
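To make the sequence above concrete, here is a minimal sketch of it in C. It is not the actual Redis code: the helper names swap_open() and swap_write_page() are hypothetical, the path and sizes are up to the caller, and the fflush() after each write is my assumption rather than something the real implementation necessarily does.

```c
#define _FILE_OFFSET_BITS 64   /* 64-bit off_t on 32-bit Linux; harmless elsewhere */
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: create a swap file of `size` bytes, then unlink it
 * so the OS frees it as soon as the stream is closed or the process exits. */
FILE *swap_open(const char *path, off_t size) {
    FILE *fp = fopen(path, "w+b");
    if (fp == NULL) return NULL;

    /* Enlarge the empty file to its final size, then drop its name. */
    if (ftruncate(fileno(fp), size) == -1 || unlink(path) == -1) {
        fclose(fp);
        return NULL;
    }
    return fp;
}

/* Hypothetical helper: write one page of `len` bytes at page index `page`.
 * Returns 0 on success, -1 on error. */
int swap_write_page(FILE *fp, off_t page, const void *buf, size_t len) {
    if (fseeko(fp, page * (off_t)len, SEEK_SET) == -1) return -1;
    if (fwrite(buf, 1, len, fp) != len) return -1;
    return fflush(fp) == 0 ? 0 : -1;   /* assumption: flush after every page */
}
```

With a 26 gigabyte size and 256 byte pages, the stalls I describe would be inside the fseeko() or fwrite() call of the second function.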
OK, that's all, but I'm here for any further detail. And BTW you can even find the actual code in the Redis git repository, but it's not trivial to understand in five minutes given that it's a fairly complex system.
Thank you very much for any help.