C program stuck on uninterruptible wait while performing disk I/O on Mac OS X Snow Leopard

One line of background: I'm the developer of Redis, a NoSQL database. One of the new features I'm implementing is Virtual Memory, because Redis keeps all its data in memory. Thanks to VM, Redis is able to transfer rarely used objects from memory to disk. There are a number of reasons why this works much better than letting the OS do the swapping for us (Redis objects are built of many small objects allocated in non-contiguous places; when serialized to disk by Redis they take 10 times less space than the memory pages where they live; and so forth).

Now I have an alpha implementation that's working perfectly on Linux, but not so well on Mac OS X Snow Leopard. From time to time, while Redis tries to move a page from memory to disk, the Redis process enters the uninterruptible wait state for minutes. I was unable to debug this, but it happens in a call to either fseeko() or fwrite(). After minutes the call finally returns and Redis continues working without any problems at all: no crash.

The amount of data transferred is very small, something like 256 bytes, so it should not be a matter of a very large amount of I/O being performed.

But there is an interesting detail about the swap file that's the target of the write operation. It's a big file (26 gigabytes) created by opening a file with fopen() and then enlarging it with ftruncate(). Finally the file is unlink()ed so that Redis continues to hold a reference to it, but we are sure that when the Redis process exits the OS will really free the swap file.
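
For readers who want to reproduce the setup, the fopen() / ftruncate() / unlink() sequence described above could be sketched roughly as follows. This is not the actual Redis code; the helper name `open_unlinked_swap` and its error handling are assumptions made for illustration.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper (not taken from Redis): create a swap file of `size`
 * bytes at `path`, then unlink it so the space is released when the last
 * reference to the file goes away (for example when the process exits).
 * Returns an open stream on success, NULL on failure. */
FILE *open_unlinked_swap(const char *path, off_t size) {
    assert(path != NULL && size >= 0);

    FILE *fp = fopen(path, "w+b");
    if (fp == NULL) return NULL;

    /* Extend the empty file to its full logical size without writing data. */
    if (ftruncate(fileno(fp), size) == -1) {
        fclose(fp);
        unlink(path);
        return NULL;
    }

    /* Remove the directory entry; the open stream keeps the file alive. */
    if (unlink(path) == -1) {
        fclose(fp);
        return NULL;
    }
    return fp;
}
```

Calling `open_unlinked_swap("/tmp/swap.bin", size)` with a large `size` would give a stream comparable to the one described in the question; the path here is only an example.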

OK, that's all, but I'm here for any further detail. And BTW, you can even find the actual code in the Redis git repository, but it's not trivial to understand in five minutes given that it's a fairly complex system.

Thank you very much for any help.

Best answer

As I understand it, HFS+ has very poor support for sparse files. So it may be that your write is triggering a file expansion that is initializing/materializing a large fraction of the file.

For example, I know that mmap'ing a new large empty file and then writing at a few random locations produces a very large file on disk with HFS+. It's quite annoying, since mmap and sparse files are an extremely convenient way of working with data, and virtually every other platform/filesystem out there handles this gracefully.
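
One way to check whether a filesystem keeps such a file sparse is to compare its logical size with the space actually allocated for it. The sketch below assumes that `st_blocks` is counted in 512-byte units, as it is on Linux and Mac OS X; the helper names are hypothetical.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Create (or reset) `path` and extend it to `size` bytes with ftruncate(),
 * without writing any data. Returns 0 on success, -1 on failure. */
int make_truncated_file(const char *path, off_t size) {
    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
    if (fd == -1) return -1;
    int rc = ftruncate(fd, size);
    if (close(fd) == -1) rc = -1;
    return rc;
}

/* Report the logical size of `path` and the space allocated for it on disk.
 * Returns 0 on success, -1 on failure. */
int file_usage(const char *path, long long *logical, long long *allocated) {
    struct stat st;
    assert(logical != NULL && allocated != NULL);
    if (stat(path, &st) == -1) return -1;
    *logical = (long long)st.st_size;
    /* Assumption: st_blocks is in 512-byte units (true on Linux and Mac OS X). */
    *allocated = (long long)st.st_blocks * 512;
    return 0;
}
```

If `allocated` comes out far smaller than `logical` right after `make_truncated_file()`, the filesystem is storing the file sparsely; if the two are close, the whole file has been materialized on disk.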

Is the swap file written to linearly? Meaning we either replace an existing block or write a new block at the end and increment a free space pointer? If so, perhaps doing more frequent smaller ftruncate calls to expand the file would result in shorter pauses.
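
A minimal sketch of that idea, assuming the swap file really is filled from the start towards the end: grow the file one fixed-size chunk at a time, only when a write would go past the current end. The helper name `grow_in_chunks` and the chunk size are assumptions, and whether this actually shortens the pauses on HFS+ is untested.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: make sure the file behind `fp` holds at least
 * `needed` bytes, growing it to the next multiple of `chunk` if it is too
 * small. The file is never shrunk. Returns the resulting size, or -1 on
 * error. */
off_t grow_in_chunks(FILE *fp, off_t needed, off_t chunk) {
    struct stat st;
    assert(fp != NULL);
    if (chunk <= 0 || needed < 0) return -1;
    if (fstat(fileno(fp), &st) == -1) return -1;

    /* Already large enough: nothing to do. */
    if (st.st_size >= needed) return st.st_size;

    /* Round `needed` up to a whole number of chunks. */
    off_t target = ((needed + chunk - 1) / chunk) * chunk;
    if (ftruncate(fileno(fp), target) == -1) return -1;
    return target;
}
```

Before writing a page that ends at byte `end`, the caller would run `grow_in_chunks(fp, end, chunk)`, so each expansion covers one chunk rather than the whole 26 GB at once.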

As an aside, I'm curious why Redis VM doesn't use mmap and then just move blocks around in an attempt to concentrate hot blocks into hot pages.

Other answers

antirez, I'm not sure I'll be much help since my Apple experience is limited to the Apple ][, but I'll give it a shot.

First thing is a question. I would have thought that, for virtual memory, speed of operation would be a more important measure than disk space (especially for a NoSQL DB where speed is the whole point, otherwise you'd be using SQL, no?). But, if your swap file is 26G, maybe not :-)

Some things to try (if possible).

  1. Try to actually isolate the problem to the seek or the write. I have a hard time believing a seek could take that long since, at worst, it should be a buffer pointer change. Still, I didn't write OS X so I can't be sure.
  2. Try adjusting the size of the swap file to see if that's what is causing the problem.
  3. Do you ever dynamically expand the swap file (as opposed to pre-allocating it)? If you do, that may be what is causing the problem.
  4. Do you always write as low in the file as you can? It may be that creating a 26G file does not actually fill it with data but, if you create it and then write to the last byte, the OS may have to zero out all the bytes before it (having deferred any initialization until then).
  5. What happens if you just pre-allocate the entire file (write to every byte) and don't unlink it? In other words, leave the file there between runs of your program (creating it if it doesn't already exist, of course). Then in your startup code for Redis, just initialize the file (pointers and such). This may get rid of any problems like those in point 4 above.
  6. Ask on the various BSD sites as well. I'm not sure how much Apple changed under the covers but OS X is just BSD at the lowest level (Pax ducks for cover).
  7. Also consider asking on the Apple sites (if you haven't already done so).
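
For point 1, a rough way to tell the two calls apart is to time them separately. The sketch below uses gettimeofday() for wall-clock timing; the helper names are hypothetical. One caveat: fseeko() flushes any data still buffered from an earlier fwrite(), so time that really belongs to a previous write can show up in the seek.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <stdio.h>
#include <sys/time.h>
#include <sys/types.h>

/* Wall-clock time in seconds, via gettimeofday(). */
static double now_seconds(void) {
    struct timeval tv;
    gettimeofday(&tv, NULL);
    return (double)tv.tv_sec + (double)tv.tv_usec / 1e6;
}

/* Hypothetical helper: write `len` bytes from `buf` at `offset`, recording
 * the time spent in fseeko() and in fwrite() + fflush() separately.
 * Returns 0 on success, -1 on failure. */
int timed_write_at(FILE *fp, off_t offset, const void *buf, size_t len,
                   double *seek_secs, double *write_secs) {
    assert(fp != NULL && buf != NULL);
    assert(seek_secs != NULL && write_secs != NULL);

    double t0 = now_seconds();
    if (fseeko(fp, offset, SEEK_SET) != 0) return -1;
    double t1 = now_seconds();

    /* fflush() pushes the stdio buffer to the kernel, so the cost of the
     * underlying write is counted here rather than in a later call. */
    if (fwrite(buf, 1, len, fp) != len || fflush(fp) != 0) return -1;
    double t2 = now_seconds();

    *seek_secs = t1 - t0;
    *write_secs = t2 - t1;
    return 0;
}
```

Logging both values whenever either exceeds, say, one second should show which call the process is actually blocked in.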

Well, that's my small contribution; hopefully it'll help. Good luck with your project.

Have you turned off file caching for your file? i.e. fcntl(fd, F_GLOBAL_NOCACHE, 1)

Have you tried debugging with DTrace and/or Instruments (Apple's experimental DTrace front-end)?

Exploring Leopard with DTrace

Debugging Chrome on OS X

As Linus said once on the Git mailing list:

"I realize that OS X people have a hard time accepting it, but OS X filesystems are generally total and utter crap - even more so than Windows."




