How to speed up the computation of absolute loss matrix in C?

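I am running a Monte Carlo experiment and evaluating the absolute loss function described below. Since this is computationally very intensive, I would like to optimize my code to improve its speed further. My main code is in MATLAB, but I am using MATLAB's MEX functionality to evaluate the function in C.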

The mathematical problem is as follows: I have a matrix D of dimension (M × N). Usually M is around 20,000 and N takes values around {10, 30, 144}.

[Image: definition of matrix D]

Effectively, I need to obtain the column vector L, of dimension (M × 1), defined as:

[Image: definition of vector L]
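The original formula images are not reproduced here. Reading the definition back from the C function in the question (an inference from the code, not a copy of the original figure), and writing D_{mn} for the entry in row m and column n, the vector appears to be:

```latex
L_m \;=\; \frac{1}{M} \sum_{m'=1}^{M} \sum_{n=1}^{N} \left| D_{m'n} - D_{mn} \right| ,
\qquad m = 1, \dots, M .
```

In the code, `rows` plays the role of M, `cols` the role of N, and the column-major access `D[i + rows * k]` corresponds to D_{mn} with m = i + 1 and n = k + 1.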

My C function does this:

void absolute_loss(double *D, double *L, mwSize cols, mwSize rows)
{

  double aux;
  int i;
  int j;
  int k;
  for (i = 0; i < rows; i++) {
    for (j = 0; j < rows; j++){
      aux = 0;
      for  (k = 0; k < cols; k++) {
        aux = aux + fabs(D[j + rows * k] - D[i + rows * k]);
      }
      L[i] = L[i] + aux;
    }
  }

  for (i = 0; i < rows; i++) {
    L[i] /= rows;
  }
}

Any suggestions are highly appreciated.

Answers

How to speed up the computation of absolute loss matrix

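  • See the comment by Juhl: https://stackoverflow.com/questions/76316325/how-to-speed-up-the-co-of-absolute-loss-matrix-in-c/76316931#comment134576746_76316325

  • If you can, use the float type and the float functions. This is sometimes up to 4x faster. For me, it was 8% faster.

  • Use restrict to let the compiler know that the referenced data does not overlap. Otherwise the compiler must assume that L[i] = ...; could change D[], which prevents some optimizations.

  • Referenced data that is only read can be declared const.

  • Use a consistent index type.

  • Change the index increment: step the index by rows instead of multiplying in the inner loop. @DevSolar

  • Index type: unsigned short was about 5% faster.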


void absolute_loss(const float * restrict D, float * restrict L,
    mwSize cols, mwSize rows) {
  mwSize rows_cols = rows*cols;
  for (mwSize i = 0; i < rows; i++) {
    for (mwSize j = 0; j < rows; j++){
      float aux = 0.0;
      for (mwSize k = 0; k < rows_cols; k += rows) {
        aux = aux + fabsf(D[j + k] - D[i + k]); // Note: fabsf
      }
      L[i] = L[i] + aux;
    }
  }
  for (mwSize i = 0; i < rows; i++) {
    L[i] /= rows;
  }
}

Note:

I'd expect the following, or something like it, at the start of the function.

for (mwSize i = 0; i < rows; i++) {
  L[i] = 0.0;
}

Tip: rather than rows, cols, i, j, use M, N, m, n to match the formula. I am sure you have it right.


Candidate function with test code:

#include <math.h>

typedef unsigned short mwSize;

// Note re-ordered parameters.
void absolute_loss(mwSize m_rows, mwSize n_cols, //
    float D[restrict m_rows][n_cols], float L[restrict m_rows]) {

  for (mwSize ell = 0; ell < m_rows; ell++) {
    float ell_sum = 0.0;
    for (mwSize n = 0; n < n_cols; n++) {
      float d_ell_n = D[ell][n];
      for (mwSize m = 0; m < m_rows; m++) {
        ell_sum += fabsf(D[m][n] - d_ell_n);
      }
    }
    L[ell] = ell_sum / (float) m_rows;
  }
}

#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

int main(void) {
  // Usually M is around 20,000 and N takes values around {10, 30, 144}.
  mwSize m_rows = (mwSize) (rand() % 1000 + (20000 - 1000));
  mwSize n_cols = (mwSize[3]) {10, 30, 144}[rand() % 3];

  float (*D)[m_rows][n_cols] = malloc(sizeof *D);
  assert(D);
  float (*L)[m_rows] = malloc(sizeof *L);
  assert(L);
  for (mwSize m = 0; m < m_rows; m++) {
    for (mwSize n = 0; n < n_cols; n++) {
      (*D)[m][n] = (float) (rand() % 1000 + 1);
    }
  }

  clock_t t0 = clock();
  absolute_loss(m_rows, n_cols, *D, *L);
  clock_t t1 = clock();
  // Print some of L
  for (mwSize ell = 0; ell < m_rows; ell++) {
    printf(" %-7g", (*L)[ell]);
    if (ell > 10) {
      printf("\n");
      break;
    }
  }
  printf("\n%g seconds.\n", (double) (t1 - t0) / CLOCKS_PER_SEC);
  free(L);
  free(D);
}

Time: 4.906 seconds.

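Your rows and columns seem to be arranged unusually for your table D. Your function accesses the data with indices that jump around in memory, which makes poor use of the memory cache. After reorganizing the loops so that they process mostly contiguous elements, performance improves considerably. On my computer, this completes in about half the time of your function for M=10000, N=30.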

void absolute_loss2(double * restrict D, double * restrict L, mwSize cols, mwSize rows) {

  double Dtemp;
  int i, j, k, rowstimesk;
  for (i = 0; i < rows; i++) {
    L[i] = 0.0;
  }
  for (i = 0; i < rows; i++) {
    for  (k = 0; k < cols; k++) {
      rowstimesk = rows * k;
      Dtemp = D[i + rowstimesk];
      for (j = 0; j < rows; j++){
        L[j] += fabs(D[j + rowstimesk] - Dtemp);
      }
    }
  }

  for (i = 0; i < rows; i++) {
    L[i] /= rows;
  }
}



