Question

What is the meaning of the declaration

// create arrays of 1M elements
const int num_elements = 1<<20;

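in the code below? Is it specific to CUDA, or can it also be used in standard C?

When I printed the value, I got num_elements = 1048576

which turns out to be 2^20. So is the << operator in C a shorthand for exponentiation?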

// This example demonstrates parallel floating point vector
// addition with a simple __global__ function.

#include <stdlib.h>
#include <stdio.h>


// this kernel computes the vector sum c = a + b
// each thread performs one pair-wise addition
__global__ void vector_add(const float *a,
                           const float *b,
                           float *c,
                           const size_t n)
{
  // compute the global element index this thread should process
  unsigned int i = threadIdx.x + blockDim.x * blockIdx.x;

  // avoid accessing out of bounds elements
  if(i < n)
  {
    // sum elements
    c[i] = a[i] + b[i];
  }
}


int main(void)
{
  // create arrays of 1M elements
  const int num_elements = 1<<20;

  // compute the size of the arrays in bytes
  const int num_bytes = num_elements * sizeof(float);

  // points to host & device arrays
  float *device_array_a = 0;
  float *device_array_b = 0;
  float *device_array_c = 0;
  float *host_array_a   = 0;
  float *host_array_b   = 0;
  float *host_array_c   = 0;

  // malloc the host arrays
  host_array_a = (float*)malloc(num_bytes);
  host_array_b = (float*)malloc(num_bytes);
  host_array_c = (float*)malloc(num_bytes);

  // cudaMalloc the device arrays
  cudaMalloc((void**)&device_array_a, num_bytes);
  cudaMalloc((void**)&device_array_b, num_bytes);
  cudaMalloc((void**)&device_array_c, num_bytes);

  // if any memory allocation failed, report an error message
  if(host_array_a == 0 || host_array_b == 0 || host_array_c == 0 ||
     device_array_a == 0 || device_array_b == 0 || device_array_c == 0)
  {
    printf("couldn't allocate memory\n");
    return 1;
  }

  // initialize host_array_a & host_array_b
  for(int i = 0; i < num_elements; ++i)
  {
    // make array a a linear ramp
    host_array_a[i] = (float)i;

    // make array b random
    host_array_b[i] = (float)rand() / RAND_MAX;
  }

  // copy arrays a & b to the device memory space
  cudaMemcpy(device_array_a, host_array_a, num_bytes, cudaMemcpyHostToDevice);
  cudaMemcpy(device_array_b, host_array_b, num_bytes, cudaMemcpyHostToDevice);

  // compute c = a + b on the device
  const size_t block_size = 256;
  size_t grid_size = num_elements / block_size;

  // deal with a possible partial final block
  if(num_elements % block_size) ++grid_size;

  // launch the kernel
  vector_add<<<grid_size, block_size>>>(device_array_a, device_array_b, device_array_c, num_elements);

  // copy the result back to the host memory space
  cudaMemcpy(host_array_c, device_array_c, num_bytes, cudaMemcpyDeviceToHost);

  // print out the first 10 results
  for(int i = 0; i < 10; ++i)
  {
    printf("result %d: %1.1f + %7.1f = %7.1f\n", i, host_array_a[i], host_array_b[i], host_array_c[i]);
  }


  // deallocate memory
  free(host_array_a);
  free(host_array_b);
  free(host_array_c);

  cudaFree(device_array_a);
  cudaFree(device_array_b);
  cudaFree(device_array_c);
}

Answer 1

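The << operator is the bitshift operator. It takes the bit representation of a number, such as 00101, and shifts it to the left n places, which has the effect of multiplying the number by a power of two. This works because of the way numbers are stored internally in computers, which is in binary.

For example, the number 1, when stored as a 32-bit integer in 2's complement (which it is), looks like this: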

00000000000000000000000000000001

When you do

1 << 20

you get

00000000000100000000000000000000

which is 2^20. This also works for sign-magnitude representation, 1's complement, etc.

Another example, if you take the representation of 5:

00000000000000000000000000000101

and do 5 << 1, you get

00000000000000000000000000001010

Which is 10, or 5 * 2^1.

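Conversely, the >> operator divides a non-negative value by a power of 2 by moving the bits over to the right n places; any bits shifted off the right end are discarded, so the result is rounded down.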

Answer 2

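It's a bit shift. In binary, shifting 1 to the left by 20 positions gives 2^20.

edit: Yes, it is valid standard C, and it also tells the reader more clearly that the value is a single 1 at bit position 20 than writing int a = 1048576; would.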

Answer 3

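The (standard) C left-shift operator << shifts the bits (binary digits) of its left operand to the left by the number of places given by its right operand, filling the vacated positions on the right with zeros; 1<<20 is therefore the binary number 1 followed by 20 zeros. Since binary is base 2, each shift to the left multiplies the value by the base, i.e. doubles it.

This property of binary numbers can be exploited to multiply and divide positive integers by powers of 2, potentially faster than with general arithmetic operations. (In the same way, elementary-school arithmetic exploits the same property of decimal numbers when multiplying by powers of 10.)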
