Question

我在MATLAB中做了一些方案,并且按照建议,我总是试图使用病媒。但最终该方案相当缓慢。因此,我发现,在使用假体时,守则在一处大大加快(见下文例)。

I would like to know if I misinterpreted something or did something wrong, because performance is important in this case, and I don t want to keep guessing if vectorization or loops are going to be faster.

% data initialization

k = 8;
n = 2^k+1;
h = 1/(n-1);
cw = 0.1;

iter = 10000;

uloc = zeros(n);
fploc = uloc;
uloc(2:end-1,2:end-1) = 1;
vloc = uloc;
ploc = ones(n);

uloc2 = zeros(n);
fploc2 = uloc2;
uloc2(2:end-1,2:end-1) = 1;
vloc2 = uloc2;
ploc2 = ones(n);

%%%%%%%%%%%%%%%%%%%%%%
% vectorized version %
%%%%%%%%%%%%%%%%%%%%%%
tic
for it=1:iter
    il=2:4;
    jl=2:4;
    fploc(il,jl) = h/6*(-uloc(il-1,jl-1) + uloc(il-1,jl)...
        -2*uloc(il,jl-1)+2*uloc(il,jl+1)...
        -uloc(il+1,jl) + uloc(il+1,jl+1)...
        ...
        -vloc(il-1,jl-1) - 2*vloc(il-1,jl)...
        +vloc(il,jl-1) - vloc(il,jl+1)...
        + 2*vloc(il+1,jl) + vloc(il+1,jl+1))...
        ...
        +cw*h^2*(-ploc(il-1,jl)-ploc(il,jl-1)+4*ploc(il,jl)...
        -ploc(il+1,jl)-ploc(il,jl+1));
end
toc


%%%%%%%%%%%%%%%%%%%%%%
%    loop version    %
%%%%%%%%%%%%%%%%%%%%%%
tic
for it=1:iter
    for il=2:4
        for jl=2:4
            fploc2(il,jl) = h/6*(-uloc2(il-1,jl-1) + uloc2(il-1,jl)...
                -2*uloc2(il,jl-1)+2*uloc2(il,jl+1)...
                -uloc2(il+1,jl) + uloc2(il+1,jl+1)...
                ...
                -vloc2(il-1,jl-1) - 2*vloc2(il-1,jl)...
                +vloc2(il,jl-1) - vloc2(il,jl+1)...
                + 2*vloc2(il+1,jl) + vloc2(il+1,jl+1))...
                ...
                +cw*h^2*(-ploc2(il-1,jl)-ploc2(il,jl-1)+4*ploc2(il,jl)...
                -ploc2(il+1,jl)-ploc2(il,jl+1));
        end
    end
end
toc

Answer 1

I didn t go through your code, but the JIT compiler in recent versions of Matlab has improved to the point where the situation you re facing is quite common - loops can be faster than vectorized code. It is difficult to know in advance which will be faster, so the best approach is to write the code in the most natural fashion, profile it and then if there is a bottleneck, try switching from loops to vectorized (or the other way).

Answer 2

MATLAB s just in time compiler (JIT) has been improved significantly over the last couple years. And even though you are right that one should generally vectorize code, from my experience this is only true for certain operations and functions and also depends on how much data your functions are handling.

The best way for you to find out what works best, is to profile your MATLAB code with and without vectorization.

Answer 3

Maybe a matrix of a few elements is not a good test for vectorization efficiency. In the end it depends on the application on what works well.

Also, usually vectorized code looks better (more true to underlying model), but it many cases it does not and it ends up hurting the implementation. What you did is great as now you know what works best for you.

Answer 4

我不会说这种病媒化。

你似乎正在采取某种过滤行动。这种过滤器的真正病媒化版本是原始数据,乘以过滤矩阵(即代表整个流体的矩阵)。

与这些矩阵有关的问题是,这些矩阵是太少的(只有少数非零元素环绕着寺院),很难有效地使用这些结构。页: 1 即便如此,通知的用意可能不足以证明需要额外记忆。

Matlab在 lo路上很坏,因为即使是 lo门柜等也仍然被视为复杂的tric子,因此,所有对tric子的检查都经过了评估。我的猜测是,在你申请过滤系数时,所有这些检查都始终进行。

Perhaps the matlab functions filter and filter2 are useful here? You may also ant to read this post: Improving MATLAB Matrix Construction Code : Or, code Vectorization for begginers

Answer 5

One possible explanation is startup overhead. If a temporary matrix is created behind the scene, be prepared for memory allocations. Also, I guess MATLAB cannot deduce that your matrix is small so there will be loop overhead. So your vectorized version may end up in code like

double* tmp=(double*)malloc(n*sizeof(double));
for(size_t k=0;k<N;++k)
    {
//  Do stuff with elements
    }
free(tmp);

Compare this to a known number of operations:

double temp[2];
temp[0]=...;
temp[1]=...;

So JIT may be faster when the malloc-loopcounter-free time is long compared to the workload for each computation.

友情链接