7-point computational stencil cache access in C (or: mapping a 3D array into a 1D array)

I have a problem I am trying to tackle that involves a 7-point computational stencil. For those who may not know, this is a 3D grid, and the 7 points are the nth point plus its neighbors one point away in the x, y and z directions, both positive and negative (that is, the neighbors to the east/west, north/south and up/down).

So these 6 points, plus the 1 additional point I am working on, are used in a calculation, and all are stored in a 1-dimensional array.

Assume nx is the width of the cube and ny is the height. In memory, when I access a point in the array All_points, such as All_points[n], then to get its neighbors in each direction I also want to access All_points[n-1], All_points[n+1], All_points[n-nx], All_points[n+nx], All_points[n-nx*ny], and All_points[n+nx*ny].
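
For reference, the indexing described above can be sketched roughly as follows. This is only an illustration: the helper name `stencil_sum`, the `double` element type, and the plain sum standing in for the real calculation are all assumptions, and n is assumed to be an interior point so that every neighbor index is valid.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: combines the 7 stencil points around index n.
 * The plain sum is a placeholder for the real calculation.
 * n must be an interior point of the grid. */
static double stencil_sum(const double *all_points, size_t n,
                          size_t nx, size_t ny)
{
    const size_t plane = nx * ny;   /* stride between z-planes */
    assert(n >= plane);             /* the down neighbor must exist */

    return all_points[n]
         + all_points[n - 1]     + all_points[n + 1]      /* west / east   */
         + all_points[n - nx]    + all_points[n + nx]     /* north / south */
         + all_points[n - plane] + all_points[n + plane]; /* down / up     */
}
```

The two x neighbors are adjacent in memory, the y neighbors are nx elements away, and the z neighbors are nx*ny elements away, so the z pair is the one most likely to miss the cache on a large grid.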

So my problem with this is that I am getting a ton of cache misses. I can't seem to find any code examples that demonstrate how to avoid this problem. Ideally I'd like to split this array back up into its x, y and z coordinates, such as All_x_points[], but then I run into a problem trying to keep that updated, since All_points[n] changes, and when it does, that means for some other All_points[n'] my x, y or z value will need to be updated with it.

Anyone seen this kind of thing done before?

Best answer

What kind of access pattern does your 7-point stencil use? If you're having cache locality problems, this is the first question to ask: if the access pattern of your central (x,y,z) coordinate is completely random, you may be out of luck.

If you have some control over the access pattern, you can try to adjust it to be more cache-friendly. If not, then you should consider what kind of access pattern to expect; you may be able to arrange the data so that this access pattern is more benign. A combination of these two can sometimes be very effective.
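
As a rough sketch of what a cache-friendly access pattern can look like with the normal layout, assuming the whole grid is swept in one pass (which the question does not actually say): visit the points in memory order, with x in the innermost loop. The function name `sweep`, the separate `in`/`out` arrays, and the plain sum standing in for the real calculation are all hypothetical.

```c
#include <stddef.h>

/* Hypothetical sweep over all interior points in memory order.
 * Reads from `in` and writes to `out` so that updates do not
 * feed back into the same pass (an assumption, not a requirement). */
static void sweep(const double *in, double *out,
                  size_t nx, size_t ny, size_t nz)
{
    const size_t plane = nx * ny;   /* stride between z-planes */

    for (size_t z = 1; z + 1 < nz; ++z) {
        for (size_t y = 1; y + 1 < ny; ++y) {
            size_t n = z * plane + y * nx + 1;   /* first interior x */
            for (size_t x = 1; x + 1 < nx; ++x, ++n) {
                /* Placeholder calculation: sum of the 7 points. */
                out[n] = in[n]
                       + in[n - 1]     + in[n + 1]
                       + in[n - nx]    + in[n + nx]
                       + in[n - plane] + in[n + plane];
            }
        }
    }
}
```

With this ordering, each of the seven reads walks forward through memory one element at a time, so every cache line that is loaded gets fully used before the sweep moves on.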

There is a particular data arrangement that is frequently useful for this kind of thing: bit-interleaved array layout. Assume (for simplicity) that the size of each coordinate is a power of two. Then, a "normal" layout will build the index by concatenating the bits for each coordinate. However, a bit-interleaved layout will allocate bits to each dimension in a round-robin fashion:

3D index coords: (xxxx, yyyy, zzzz)

normal index:    data[zzzzyyyyxxxx]  (x-coord has least-significant bits, then y)
bit-interleaved: data[zyxzyxzyxzyx]  (lsb are now relatively local)

Practically speaking, there is a minor cost: instead of multiplying the coordinates by their step values, you will need to use a lookup table to find your offsets. But since you will probably only need very short lookup tables (especially for a 3D array!), they should all fit nicely into cache.

3D coords:  (x,y,z)

normal index:      data[x + y*ystep + z*zstep]  where:
  ystep= xsize (possibly aligned-up, if not a power of 2?)
  zstep= ysize * ystep

bit-interleaved:   data[xtab[x] + ytab[y] + ztab[z]]  where:
  xtab={  0,  1,  8,  9, 64, 65, 72, 73,512...}   (x has bits 0,3,6,9...)
  ytab={  0,  2, 16, 18,128,130,144,146,1024...}  (y has bits 1,4,7,10...)
  ztab={  0,  4, 32, 36,256,260,288,292,2048...}  (z has bits 2,5,8,11...)
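
The tables above could be generated along these lines. This is a minimal sketch, assuming all three dimensions share the same power-of-two size of at most 1024, so that the interleaved index fits in 32 bits; the names `spread3`, `build_tables` and `interleaved_index` are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Spread the low 10 bits of v so that bit i lands at bit 3*i. */
static uint32_t spread3(uint32_t v)
{
    uint32_t r = 0;
    for (unsigned i = 0; i < 10; ++i)
        r |= ((v >> i) & 1u) << (3 * i);
    return r;
}

/* Fill the three offset tables; each must have room for `size` entries. */
static void build_tables(uint32_t *xtab, uint32_t *ytab, uint32_t *ztab,
                         size_t size)
{
    assert(size <= 1024);   /* 10 bits per coordinate, 30 bits total */
    for (size_t i = 0; i < size; ++i) {
        uint32_t s = spread3((uint32_t)i);
        xtab[i] = s;        /* x gets bits 0,3,6,9...  */
        ytab[i] = s << 1;   /* y gets bits 1,4,7,10... */
        ztab[i] = s << 2;   /* z gets bits 2,5,8,11... */
    }
}

/* Index of (x,y,z) in the bit-interleaved layout. */
static size_t interleaved_index(const uint32_t *xtab, const uint32_t *ytab,
                                const uint32_t *ztab,
                                size_t x, size_t y, size_t z)
{
    return (size_t)xtab[x] + ytab[y] + ztab[z];
}
```

A neighbor is then reached by changing one coordinate before the lookup, for example data[xtab[x-1] + ytab[y] + ztab[z]] for the west neighbor, rather than by adding a fixed offset to n.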

Ultimately, whether this is any use depends entirely on the requirements of your algorithm. But, again, please note that if your algorithm is too demanding of your cache, you may want to look into adjusting the algorithm, instead of just the layout.

Other answers

7 points? Six defining a spatial coordinate, one defining a length? Are these... Stargate coordinates?

Why not turn your Array of Structures (AOS) into a Structure of Arrays (SOA)?

int point = points_all[i];      // index of the point you want
Vec2 nb_x = points_x[point];    // .x and .y are the neighbours left and right
Vec2 nb_y = points_y[point];    // .x and .y are the neighbours up and down
Vec2 nb_z = points_z[point];    // .x and .y are the neighbours front and back



