English 中文(简体)
MATLAB和矢量循环中的双倍加和
原标题:Double Summation in MATLAB and vectorized loops

在这里,我试图执行这个可爱的公式。

http://dl.droppox.com/u/7348856/Picture1.png

%WIGNER Computes Wigner-Distribution on an image (difference of two images).
function[wd] = wigner(difference)
%Image size
[M, N, ~] = size(difference);
%Window size (5 x 5)
Md = 5;
Nd = 5;
%Fourier Transform
F = fft2(difference);
%Initializing the wigner picture
wd = zeros(M, N,  uint8 );
lambda =0.02;
value = (4/(Md*Nd));
for x = 1+floor(Md/2):M - floor(Md/2)
    for y = 1+floor(Nd/2):N - floor(Nd/2)
        for l = -floor(Nd/2) : floor(Nd/2)
            for k = -floor(Md/2) : floor(Md/2)
                kernel = exp(-lambda * norm(k,l));
                kernel = kernel * value;
                theta = 4 * pi * ((real(F(x, y)) * (k/M) )+ (imag(F(x, y)) * (l/N)));
                wd(x, y) = (wd(x, y)) + (cos(theta) * difference(x + k, y + l) * difference(x - k, y - l) * (kernel));
            end
        end
    end
end
end

正如你所看到的,外部两个环是滑动窗口的,而其余的内环是总和变量的。

Now, my request for you my beloved stackoverflow users is: Can you help me improve these very nasty for loops that take more than its share of time, and turn it into vectorized loops? And will that improvement be of a significant change?

谢谢

最佳回答

这四个嵌套环基本上正在以滑动邻里风格处理图像中的每个像素。 我立刻想到了 < a href="http://www.mathworks.com/help/toolbox/images/ref/nlfilter.html" rel="nofollown noreferrer" >NLFILLTER 和>IM2COL 功能。

以下是我试图对代码进行传导的尝试。请注意,我对它进行了彻底测试,或对照环基解决方案比较了性能:

function WD = wigner(D, Md, Nd, lambda)
    %# window size and lambda
    if nargin<2, Md = 5; end
    if nargin<3, Nd = 5; end
    if nargin<4, lambda = 5; end

    %# image size
    [M,N,~] = size(D);

    %# kernel = exp(-lambda*norm([k,l])
    [K,L] = meshgrid(-floor(Md/2):floor(Md/2), -floor(Nd/2):floor(Nd/2));
    K = K(:); L = L(:);
    kernel = exp(-lambda .* sqrt(K.^2+L.^2));

    %# frequency-domain part
    F = fft2(D);

    %# f(x+k,y+l) * f(x-k,y-l) * kernel
    C = im2col(D, [Md Nd],  sliding );
    X1 = bsxfun(@times, C .* flipud(C), kernel);

    %# cos(theta)
    C = im2col(F, [Md Nd],  sliding );
    C = C(round(Md*Nd/2),:);    %# take center pixels
    theta = bsxfun(@times, real(C), K/M) + bsxfun(@times, imag(C), L/N);
    X2 = cos(4*pi*theta);

    %# combine both parts for each sliding-neighborhood
    WD = col2im(sum(X1.*X2,1), [Md Nd], size(F),  sliding ) .* (4/(M*N));

    %# pad array with zeros to be of same size as input image
    WD = padarray(WD, ([Md Nd]-1)./2, 0,  both );
end

以下是环状版本, 其改进之处为 建议:

function WD = wigner_loop(D, Md, Nd, lambda)
    %# window size and lambda
    if nargin<2, Md = 5; end
    if nargin<3, Nd = 5; end
    if nargin<4, lambda = 5; end

    %# image size
    [M,N,~] = size(D);

    %# frequency-domain part
    F = fft2(D);

    WD = zeros([M,N]);
    for l = -floor(Nd/2):floor(Nd/2)
        for k = -floor(Md/2):floor(Md/2)
            %# kernel = exp(-lambda*norm([k,l])
            kernel = exp(-lambda * norm([k,l]));

            for x = (1+floor(Md/2)):(M-floor(Md/2))
                for y = (1+floor(Nd/2)):(N-floor(Nd/2))
                    %# cos(theta)
                    theta = 4 * pi * ( real(F(x,y))*k/M + imag(F(x,y))*l/N );

                    %# f(x+k,y+l) * f(x-k,y-l)* kernel
                    WD(x,y) = WD(x,y) + ( cos(theta) * D(x+k,y+l) * D(x-k,y-l) * kernel );
                end
            end
        end
    end
    WD = WD * ( 4/(M*N) );
end

以及我如何测试它(基于我从paper you 之前的

%# difference between two consecutive frames
A = imread( AT3_1m4_02.tif );
B = imread( AT3_1m4_03.tif );
D = imsubtract(A,B);
%#D = rgb2gray(D);
D = im2double(D);

%# apply Wigner-Distribution
tic, WD1 = wigner(D); toc
tic, WD2 = wigner_loop(D); toc
figure(1), imshow(WD1,[])
figure(2), imshow(WD2,[])

然后可能需要对矩阵进行缩放/调整,并应用阈值...

问题回答

这或许不是你们所要求的,但似乎(一眼看)总和的顺序是独立的,而不是 {x,y,l,k},你们可以去{l,k,x,y}。这样做可以使你们通过将内核保留在外环中来评估较少的内核时间。





相关问题
Optimizing a LAN server for a game

I m the network programmer on a school game project. We want to have up to 16 players at once on a LAN. I am using the Server-Client model and am creating a new thread per client that joins. ...

SQL Table Size And Query Performance

We have a number of items coming in from a web service; each item containing an unknown number of properties. We are storing them in a database with the following Schema. Items - ItemID - ...

Most optimized way to store crawler states?

I m currently writing a web crawler (using the python framework scrapy). Recently I had to implement a pause/resume system. The solution I implemented is of the simplest kind and, basically, stores ...

Do bitwise operations distribute over addition?

I m looking at an algorithm I m trying to optimize, and it s basically a lot of bit twiddling, followed by some additions in a tight feedback. If I could use carry-save addition for the adders, it ...

Improve INSERT-per-second performance of SQLite

Optimizing SQLite is tricky. Bulk-insert performance of a C application can vary from 85 inserts per second to over 96,000 inserts per second! Background: We are using SQLite as part of a desktop ...

Profiling Vim startup time

I’ve got a lot of plugins enabled when using Vim – I have collected plugins over the years. I’m a bit fed up with how long Vim takes to start now, so I’d like to profile its startup and see which of ...

Quick padding of a string in Delphi

I was trying to speed up a certain routine in an application, and my profiler, AQTime, identified one method in particular as a bottleneck. The method has been with us for years, and is part of a "...

热门标签