Timing woes with clock_gettime() CUDA


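In brief, the code creates two integer vectors of length 2^23, one on the host and one on the device, identical to each other, and sorts them. It also (attempts to) measure the time taken for each sort.

On the host vector I use std::sort. On the device vector I use thrust::sort.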


The output at the terminal is:

    Host Time taken is: 19 . 224622882 seconds



The first std::cout line appears after 19.224 seconds, as reported. The second std::cout line, however, appears immediately after the first, even though it reports 19.32 seconds. Note that I used separate timespec variables for the two measurements, namely ts_host and ts_device.

I am using CUDA 4.0 on an NVIDIA GTX 570 (compute capability 2.0).


    #include <iostream>
    #include <algorithm>
    #include <cstdlib>
    //For timings
    #include <time.h>
    //Necessary thrust headers
    #include <thrust/sort.h>
    #include <thrust/copy.h>
    #include <thrust/host_vector.h>
    #include <thrust/device_vector.h>

    int main(int argc, char *argv[])
    {
      int N=23;
      thrust::host_vector<int>H(1<<N);//create a vector of 2^N elements on host
      thrust::device_vector<int>D(1<<N);//The same on the device.
      thrust::host_vector<int>dummy(1<<N);//Copy the D to dummy from GPU after sorting

      //Set the host_vector elements.
      for (size_t i = 0; i < H.size(); ++i) {
          H[i]=rand();//Set the host vector element to pseudo-random number.
      }

      //Sort the host_vector. Measure time
      // Reset the clock
      timespec ts_host;
      ts_host.tv_sec = 0;
      ts_host.tv_nsec = 0;
      clock_settime(CLOCK_PROCESS_CPUTIME_ID, &ts_host);//Start clock

      std::sort(H.begin(), H.end());

      clock_gettime(CLOCK_PROCESS_CPUTIME_ID, &ts_host);//Stop clock
      std::cout << "\nHost Time taken is: " << ts_host.tv_sec<<" . "<< ts_host.tv_nsec <<" seconds" << std::endl;

      D=H; //Set the device vector elements equal to the host_vector
      //Sort the device vector. Measure time.
      timespec ts_device;
      ts_device.tv_sec = 0;
      ts_device.tv_nsec = 0;
      clock_settime(CLOCK_PROCESS_CPUTIME_ID, &ts_device);//Start clock

      thrust::sort(D.begin(), D.end());
      thrust::copy(D.begin(), D.end(), dummy.begin());//Copy the sorted data back to the host

      clock_gettime(CLOCK_PROCESS_CPUTIME_ID, &ts_device);//Stop clock
      std::cout << "\nDevice Time taken is: " << ts_device.tv_sec<<" . "<< ts_device.tv_nsec <<" seconds" << std::endl;

      return 0;
    }

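You are not checking the return value of clock_settime. I would guess that it is failing, probably with errno set to EPERM or EINVAL. Read the documentation and always check your return values!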
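As a rough illustration of that advice, here is a minimal sketch of how the timing could check its return values. The helper names (try_reset_clock, read_clock, elapsed_seconds) are hypothetical and not part of the original code. It also assumes that, if the reset does fail, the interval can instead be measured by reading the clock before and after the work and subtracting, which does not require setting the clock at all.

```cpp
#include <cerrno>
#include <cstring>
#include <iostream>
#include <time.h>

// Hypothetical helper: attempt the reset used in the question, but check the result.
// Returns 0 on success, or the errno value reported by clock_settime on failure.
int try_reset_clock(clockid_t id) {
    timespec zero;
    zero.tv_sec = 0;
    zero.tv_nsec = 0;
    if (clock_settime(id, &zero) != 0) {
        int err = errno;  // save errno before any other call can overwrite it
        std::cerr << "clock_settime failed: " << std::strerror(err) << std::endl;
        return err;
    }
    return 0;
}

// Hypothetical helper: read a clock and report failure instead of ignoring it.
bool read_clock(clockid_t id, timespec &ts) {
    if (clock_gettime(id, &ts) != 0) {
        std::cerr << "clock_gettime failed: " << std::strerror(errno) << std::endl;
        return false;
    }
    return true;
}

// Hypothetical helper: elapsed seconds between two readings of the same clock.
double elapsed_seconds(const timespec &start, const timespec &stop) {
    return double(stop.tv_sec - start.tv_sec)
         + double(stop.tv_nsec - start.tv_nsec) * 1e-9;
}
```

With these, each measurement would call read_clock once before the sort and once after it, then print elapsed_seconds(start, stop). If try_reset_clock(CLOCK_PROCESS_CPUTIME_ID) reports an error, the clock was never reset, so both printed values in the question would simply be the CPU time consumed since the process started. That would be consistent with the two numbers being almost equal.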





