How can I copy the members of nested structs to a CUDA device's memory space?

I'm trying to copy some structs into device memory for use by my kernels in a CUDA-accelerated neural network simulator. The code compiles and runs, but with some exceptions and errors:

typedef struct rdLayer
{
    long NeuronQty ;
    long DendriteQty ;

    cuDoubleComplex *gpuWeights ;
    cuDoubleComplex *gpuZOutputs ;
    cuDoubleComplex *gpuDeltas ;
    cuDoubleComplex *gpuUnWeights ;
} rdLayer;

typedef struct rdNetwork
{
    long SectorQty;
    double K_DIV_TWO_PI;
    double two_pi_div_sect_qty;
    cuDoubleComplex *gpuSectorBdry;
    long LayerQty;
    rdLayer *rLayer;
} rdNetwork;

struct rdLearningSet 
{
    long EvalMode ;
    long SampleQty ;
    long InputQty ;
    long OutputQty ;
    long ContOutputs ;
    long SampleIdxReq ;

    cuDoubleComplex *gpuXInputs ;
    cuDoubleComplex *gpuDOutputs ;
    cuDoubleComplex *gpuYOutputs ;
    double *gpudSE1024 ;
    cuDoubleComplex *gpuOutScalar ;
};

[...]
    struct rdLearningSet * rdLearn;
    struct rdNetwork * rdNet;
[...]
    cudaMalloc(&rdNet, sizeof(rdNetwork));
    cudaMalloc(&rdLearn, sizeof(rdLearningSet));
[...]
    cuDoubleComplex * dummy;
    struct rdLayer rdlSource, * rdldummy;
[...]
    //rdLayer *rLayer;
    cudaMalloc(&rdldummy, sizeof(rdLayer)*rSes.rNet->LayerQty);
    cudaMemcpy( &rdNet->rLayer, &rdldummy, sizeof(rdLayer*), cudaMemcpyHostToDevice);
    for (int L=1; L<rSes.rNet->LayerQty; L++){
            // construct layer to be copied
            rdlSource.NeuronQty=rSes.rNet->rLayer[L].iNeuronQty ;
            rdlSource.DendriteQty=rSes.rNet->rLayer[L].iDendriteQty ;
            cudaMalloc( &rdlSource.gpuWeights, sizeof(cuDoubleComplex) * (rSes.rNet->rLayer[L].DendriteQty+1) * (rSes.rNet->rLayer[L].NeuronQty+1) ) ;
                    mCheckCudaWorked
            cudaMalloc( &rdlSource.gpuZOutputs, sizeof(cuDoubleComplex) * (rSes.rNet->rLayer[L].DendriteQty+1) * (rSes.rNet->rLayer[L].NeuronQty+1) ) ;
                    mCheckCudaWorked
            cudaMalloc( &rdlSource.gpuDeltas, sizeof(cuDoubleComplex) * (rSes.rNet->rLayer[L].iDendriteQty+1) * (rSes.rNet->rLayer[L].iNeuronQty+1) ) ;
                    mCheckCudaWorked
            cudaMalloc( &rdlSource.gpuUnWeights, sizeof(cuDoubleComplex) * (rSes.rNet->rLayer[L].iDendriteQty+1) * (rSes.rNet->rLayer[L].iNeuronQty+1) ) ;
                    mCheckCudaWorked
            //copy layer structure to Device mem
            cudaMemcpyToSymbol( "rdNet->rLayer", &rdlSource, sizeof(rdLayer), sizeof(rdLayer) * L, cudaMemcpyHostToDevice );/*! 2D neuron cx weight matrix on GPU */
                    mCheckCudaWorked
    }
[...]   
    cudaMalloc(&dummy, sizeof(cuDoubleComplex) * (rSes.rLearn->SampleQty) * (rSes.rLearn->InputQty+1) ); /*! 2D complex input tuples in GPU. */
    cudaMemcpy( &rdLearn->gpuXInputs, &dummy, sizeof(cuDoubleComplex*), cudaMemcpyHostToDevice );
    cudaMemcpy( &dummy, &rSes.rLearn->gpuXInputs, sizeof(cuDoubleComplex) * (rSes.rLearn->SampleQty) * (rSes.rLearn->InputQty+1), cudaMemcpyHostToDevice );
            mCheckCudaWorked
    cudaMalloc(&dummy, sizeof(cuDoubleComplex) * (rSes.rLearn->SampleQty) * (rSes.rLearn->OutputQty+1) ); /*! 2D desired complex outputs in GPU. */
    cudaMemcpy( &rdLearn->gpuDOutputs, &dummy, sizeof(cuDoubleComplex*), cudaMemcpyHostToDevice );
    cudaMemcpy( &dummy, &rSes.rLearn->gpuDOutputs, sizeof(cuDoubleComplex) * (rSes.rLearn->SampleQty) * (rSes.rLearn->OutputQty+1), cudaMemcpyHostToDevice );
            mCheckCudaWorked
[...]

Unfortunately, the cudaMemcpyToSymbol call returns an error that the mCheckCudaWorked macro reports as "invalid device symbol", while the last (cudaMemcpy( &dummy, &rSes.rLearn->gpuDOutputs... )) and third-from-last (cudaMemcpy( &dummy, &rSes.rLearn->gpuXInputs... )) cudaMemcpy calls return "invalid argument".

I am at a loss as to how to proceed with copying these items into device memory so that they can be addressed from kernel code. My &rdNet and &rdLearn are coming back populated as pointers to reserved device memory, and I can write those pointers into device memory, but I can't get the bulk of the member values to go along with the pointer assignments. Help?

Best answer

Fields such as gpuXInputs need to be pointers that have been allocated with cudaMalloc, so that they are valid pointers into device memory.
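For example, a minimal sketch of that for one field (here rdLearn is the device-side rdLearningSet from the question's cudaMalloc call, while hostXInputs and elemQty are hypothetical host data and element count, not names from the original code):

    cuDoubleComplex *devXInputs = NULL;
    // reserve the device buffer that the struct member will point to
    cudaMalloc( &devXInputs, sizeof(cuDoubleComplex) * elemQty );
    // fill it from the host-side copy of the data
    cudaMemcpy( devXInputs, hostXInputs, sizeof(cuDoubleComplex) * elemQty, cudaMemcpyHostToDevice );
    // store that device pointer in the gpuXInputs field of the device-resident struct
    cudaMemcpy( &rdLearn->gpuXInputs, &devXInputs, sizeof(cuDoubleComplex*), cudaMemcpyHostToDevice );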

Typically you need a host version of your data structures, where your allocations use malloc etc, and then a mirror of these data structures on the device, which have been allocated via cudaMalloc. Any pointers within these data structures need to point to the right kind of memory - you can t "mix and match".
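As a rough sketch of that mirroring for the layer array (assuming hostNet is a fully populated host-side rdNetwork and devNet is the device copy already reserved with cudaMalloc; devLayers and staging are illustrative names, not from the original code), each layer is assembled in a host-side staging struct whose pointer members are device pointers, and the whole struct is then copied into its slot with an ordinary cudaMemcpy. cudaMemcpyToSymbol is not the right tool here, since it only addresses variables declared __device__ or __constant__ at file scope, not memory obtained from cudaMalloc:

    rdLayer *devLayers = NULL;
    cudaMalloc( &devLayers, sizeof(rdLayer) * hostNet->LayerQty );  // device array of layers
    // hook the layer array into the device-side rdNetwork
    cudaMemcpy( &devNet->rLayer, &devLayers, sizeof(rdLayer*), cudaMemcpyHostToDevice );

    for (long L = 0; L < hostNet->LayerQty; L++) {
        size_t n = (size_t)(hostNet->rLayer[L].DendriteQty + 1) * (size_t)(hostNet->rLayer[L].NeuronQty + 1);
        rdLayer staging;                                            // host-side staging copy of one layer
        staging.NeuronQty   = hostNet->rLayer[L].NeuronQty;
        staging.DendriteQty = hostNet->rLayer[L].DendriteQty;
        cudaMalloc( &staging.gpuWeights,   sizeof(cuDoubleComplex) * n );  // members become device pointers
        cudaMalloc( &staging.gpuZOutputs,  sizeof(cuDoubleComplex) * n );
        cudaMalloc( &staging.gpuDeltas,    sizeof(cuDoubleComplex) * n );
        cudaMalloc( &staging.gpuUnWeights, sizeof(cuDoubleComplex) * n );
        // plain values plus device pointers go into slot L of the device-side array
        cudaMemcpy( &devLayers[L], &staging, sizeof(rdLayer), cudaMemcpyHostToDevice );
    }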

Other answers

No other answers yet.



