CUDA Add Rows of a Matrix

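I'm trying to add the rows of a 4800x9600 matrix together, resulting in a 1x9600 matrix.

What I've done is split the 4800x9600 matrix into 9600 matrices of length 4800 each. I then perform a reduction on the 4800 elements.

The trouble is, this is really slow.

Does anyone have any suggestions?

Basically, I'm trying to implement MATLAB's sum(...) function.

Here is the code. I've verified that it works fine; it's just really slow: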

void reduceRows(Matrix Dresult,Matrix DA)
{
        //split DA into chunks
        Matrix Dchunk;
        Dchunk.h=1;Dchunk.w=DA.h;
        cudaMalloc((void**)&Dchunk.data,Dchunk.h*Dchunk.w*sizeof(float));

        Matrix DcolSum;
        DcolSum.h=1;DcolSum.w=1;
        //cudaMalloc((void**)&DcolSum.data,DcolSum.h*DcolSum.w*sizeof(float));

        int i;
        for(i=0;i<DA.w;i++)   //loop over each column
        {
                //printf("%d ",i);
                cudaMemcpy(Dchunk.data,&DA.data[i*DA.h],DA.h*sizeof(float),cudaMemcpyDeviceToDevice);
                DcolSum.data=&Dresult.data[i];
                reduceTotal(DcolSum,Dchunk);
        }
        cudaFree(Dchunk.data);
}

Matrix is defined as:

typedef struct{
        long w;
        long h;
        float* data;
}Matrix;

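reduceTotal() just calls the standard NVIDIA reduction: it sums all the elements in Dchunk and puts the answer in DcolSum.

I'm about to give up and do this on the CPU if I can't find an answer.

Many thanks in advance.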

Answer

I think your problem is that you are launching 9600 x 2 kernels. This should be an easy algorithm to express as a single kernel.

The most naive way to implement it would not coalesce memory accesses, but it could well be faster than the way you are doing it now.
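A minimal sketch of what such a naive single-kernel version might look like. It assumes the layout implied by the question's `&DA.data[i*DA.h]` indexing (column i occupies DA.h consecutive floats) and that Dresult.data already points to DA.w floats on the device. The names sumColumnsNaive and reduceRowsNaive are hypothetical, not part of the original code.

```cuda
#include <cassert>
#include <cuda_runtime.h>

// Same struct as in the question; column i starts at data[i * h].
typedef struct {
    long w;
    long h;
    float* data;
} Matrix;

// One thread per column: each thread walks down its own column and sums it.
// Neighbouring threads read addresses h floats apart, so the reads are not coalesced.
__global__ void sumColumnsNaive(const float* a, float* result, long h, long w)
{
    long col = (long)blockIdx.x * blockDim.x + threadIdx.x;
    if (col >= w) return;

    float sum = 0.0f;
    for (long row = 0; row < h; ++row)
        sum += a[col * h + row];
    result[col] = sum;
}

// Hypothetical replacement for reduceRows: one launch instead of a loop of launches.
void reduceRowsNaive(Matrix Dresult, Matrix DA)
{
    assert(DA.w > 0 && DA.h > 0);
    const int threads = 256;  // block size is a tunable assumption
    const int blocks = (int)((DA.w + threads - 1) / threads);
    sumColumnsNaive<<<blocks, threads>>>(DA.data, Dresult.data, DA.h, DA.w);
}
```

The whole sum is a single launch, and no temporary Dchunk buffer or device-to-device copy is needed.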

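Once you've got the naive way working, coalesce your memory reads: e.g. have every thread in a block read 16 consecutive floats into shared memory, call __syncthreads(), then accumulate the relevant 16 floats into a register, call __syncthreads() again, then repeat.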
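One possible reading of that recipe, as a hedged sketch rather than a tuned implementation. It assumes the same layout as the question (column i starts at data[i * h]), a 16x16 thread block that stages a 16x16 tile in shared memory, and one row of threads doing the accumulation. The names TILE, sumColumnsTiled and reduceRowsTiled are hypothetical.

```cuda
#include <cassert>
#include <cuda_runtime.h>

#define TILE 16  // tile edge; 16 follows the answer's example, tune as needed

// Same struct as in the question; column i starts at data[i * h].
typedef struct {
    long w;
    long h;
    float* data;
} Matrix;

// Each block handles TILE columns and steps down them TILE rows at a time.
__global__ void sumColumnsTiled(const float* a, float* result, long h, long w)
{
    // tile[c][r]: column offset c, row offset r. The +1 padding keeps the
    // strided reads in the accumulation step off the same shared-memory bank.
    __shared__ float tile[TILE][TILE + 1];

    const int tx = threadIdx.x;
    const int ty = threadIdx.y;
    const long colLoad = (long)blockIdx.x * TILE + ty;  // column this thread loads
    float sum = 0.0f;                                   // used when ty == 0

    for (long row0 = 0; row0 < h; row0 += TILE) {
        // Threads with consecutive tx read consecutive floats of one column.
        const long row = row0 + tx;
        tile[ty][tx] = (colLoad < w && row < h) ? a[colLoad * h + row] : 0.0f;
        __syncthreads();

        // Thread (tx, 0) accumulates the 16 staged floats of column tx.
        if (ty == 0) {
            for (int k = 0; k < TILE; ++k)
                sum += tile[tx][k];
        }
        __syncthreads();  // the tile is overwritten in the next iteration
    }

    const long colOut = (long)blockIdx.x * TILE + tx;
    if (ty == 0 && colOut < w)
        result[colOut] = sum;
}

// Hypothetical replacement for reduceRows using the tiled kernel.
void reduceRowsTiled(Matrix Dresult, Matrix DA)
{
    assert(DA.w > 0 && DA.h > 0);
    dim3 threads(TILE, TILE);
    dim3 blocks((unsigned)((DA.w + TILE - 1) / TILE));
    sumColumnsTiled<<<blocks, threads>>>(DA.data, Dresult.data, DA.h, DA.w);
}
```

In this sketch only 16 of the 256 threads in a block accumulate, which keeps the code short; spreading the accumulation over more threads is a natural next step and is the kind of thing the SDK reduction samples cover.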

The NVIDIA GPU Computing SDK has lots of examples of reduction techniques.




