Question

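I have often read that unique_ptr is preferable to shared_ptr in most situations, because unique_ptr is non-copyable and has move semantics, whereas shared_ptr adds overhead from copying and reference counting.

But when I test unique_ptr in some situations, it appears to be noticeably slower (on access) than its counterpart.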

For example, under gcc 4.5:

Note: the print method does not actually print anything.

#include <iostream>
#include <string>
#include <memory>
#include <chrono>
#include <vector>

using namespace std;

class Print{

public:
void print(){}

};

void test()
{
 typedef vector<shared_ptr<Print>> sh_vec;
 typedef vector<unique_ptr<Print>> u_vec;

 sh_vec shvec;
 u_vec  uvec;

 // can't use initializer_list with unique_ptr
 for (int var = 0; var < 100; ++var) {

    shared_ptr<Print> p(new Print());
    shvec.push_back(p);

    unique_ptr<Print> p1(new Print());
    uvec.push_back(move(p1));

  }

 //-------------test shared_ptr-------------------------
 auto time_sh_1 = std::chrono::system_clock::now();

 for (auto var = 0; var < 1000; ++var) 
 {
   for(auto it = shvec.begin(), end = shvec.end(); it!= end; ++it)
   {
     (*it)->print();
   }
 }

 auto time_sh_2 = std::chrono::system_clock::now();

 cout <<"test shared_ptr : "<< (time_sh_2 - time_sh_1).count() << " microseconds." << endl;

 //-------------test unique_ptr-------------------------
 auto time_u_1 = std::chrono::system_clock::now();

 for (auto var = 0; var < 1000; ++var) 
 {
   for(auto it = uvec.begin(), end = uvec.end(); it!= end; ++it)
   {
     (*it)->print();
   }
 }

 auto time_u_2 = std::chrono::system_clock::now();

 cout <<"test unique_ptr : "<< (time_u_2 - time_u_1).count() << " microseconds." << endl;

}

(g++ -O0)

shared_ptr : 1480 microseconds
unique_ptr : 3350 microseconds

Where does the difference come from? Is it explainable?

Answer 1

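All you did in the timed blocks is access the pointers. That involves no additional overhead. The increased time probably comes from the console output scrolling. You should never do I/O in a timed benchmark.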

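And if you want to test the overhead of reference counting, then actually do some reference counting. How is the extra time for construction, destruction, assignment and other mutating operations of shared_ptr going to show up in your timings if you never mutate a shared_ptr?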
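A minimal sketch of what such a test could look like, assuming C++11 and std::chrono::steady_clock (gcc 4.7 or later). The helper names (make_shared_vec, make_unique_vec, time_shared_copies, time_unique_moves) are made up for illustration and are not from the question. The shared_ptr loop copies each pointer, so the reference count is incremented and decremented on every iteration, while the unique_ptr loop moves each pointer out of its slot and back.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

struct Print {
    int i = 0;
    void print() { ++i; }  // visible side effect, unlike the empty print() in the question
};

// Hypothetical helpers that build the two test vectors.
std::vector<std::shared_ptr<Print>> make_shared_vec(std::size_t n) {
    std::vector<std::shared_ptr<Print>> v;
    for (std::size_t k = 0; k < n; ++k) v.push_back(std::make_shared<Print>());
    return v;
}

std::vector<std::unique_ptr<Print>> make_unique_vec(std::size_t n) {
    std::vector<std::unique_ptr<Print>> v;
    for (std::size_t k = 0; k < n; ++k) v.emplace_back(new Print());
    return v;
}

// Copies every shared_ptr `rounds` times. Each copy increments the reference
// count and each copy's destruction decrements it. Returns elapsed microseconds.
long long time_shared_copies(const std::vector<std::shared_ptr<Print>>& src, int rounds) {
    using namespace std::chrono;
    assert(rounds > 0);
    auto start = steady_clock::now();
    for (int r = 0; r < rounds; ++r) {
        for (const auto& p : src) {
            std::shared_ptr<Print> copy = p;  // reference count goes up by one
            copy->print();
        }                                     // and back down when `copy` dies
    }
    auto stop = steady_clock::now();
    return duration_cast<microseconds>(stop - start).count();
}

// Moves every unique_ptr out of its slot and back again `rounds` times.
// No reference count exists, so only pointer swaps are measured.
long long time_unique_moves(std::vector<std::unique_ptr<Print>>& src, int rounds) {
    using namespace std::chrono;
    assert(rounds > 0);
    auto start = steady_clock::now();
    for (int r = 0; r < rounds; ++r) {
        for (auto& p : src) {
            std::unique_ptr<Print> tmp = std::move(p);  // ownership leaves the vector
            tmp->print();
            p = std::move(tmp);                         // and returns to it
        }
    }
    auto stop = steady_clock::now();
    return duration_cast<microseconds>(stop - start).count();
}
```

Calling both functions on vectors of the same size and comparing the returned values gives a rough idea of the copy cost. The numbers only mean something in an optimized build.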

Edit: If there is no I/O, then where are the compiler optimizations? They should have removed the whole thing. Even ideone threw the lot away.

Answer 2

UPDATED on Jan 01, 2014

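I know this question is pretty old, but the results are still valid with g++ 4.7.0 and libstdc++ 4.7. So I tried to find out the reason.

What you are benchmarking here is dereferencing performance at -O0. Looking at the implementations of unique_ptr and shared_ptr, your results are actually correct.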

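unique_ptr stores the pointer and the deleter in a ::std::tuple, while shared_ptr stores a naked pointer directly. So when you dereference the pointer (using *, -> or get), unique_ptr makes an extra call to ::std::get<0>(). In contrast, shared_ptr returns the pointer directly. On gcc-4.7, even when optimized and inlined, ::std::get<0>() is a bit slower than the direct pointer. When optimized and inlined, gcc-4.8.1 fully eliminates the overhead of ::std::get<0>(). On my machine, when compiled with -O3, the compiler generates exactly the same assembly code for both, which means they are literally the same.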
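The difference between the two layouts can be illustrated with a heavily simplified sketch. TupleBackedPtr, DirectPtr, NoopDeleter and bump_both are hypothetical names invented for this illustration. This is not the libstdc++ source, only the shape of the storage described above: one wrapper reaches its pointer through std::get<0>, the other returns a plain member.

```cpp
#include <cassert>
#include <tuple>

struct Print {
    int i = 0;
    void print() { ++i; }
};

// Deleter that does nothing, so the wrappers below never own the object.
struct NoopDeleter {
    void operator()(Print*) const {}
};

// Hypothetical stand-in for a smart pointer that keeps pointer and deleter
// together in a std::tuple.
class TupleBackedPtr {
public:
    explicit TupleBackedPtr(Print* p) : data_(p, NoopDeleter()) {}
    Print* get() const { return std::get<0>(data_); }  // extra call unless inlined
    Print* operator->() const { return get(); }
private:
    std::tuple<Print*, NoopDeleter> data_;
};

// Hypothetical stand-in for a smart pointer that keeps a naked pointer member.
class DirectPtr {
public:
    explicit DirectPtr(Print* p) : ptr_(p) {}
    Print* get() const { return ptr_; }                 // plain member read
    Print* operator->() const { return ptr_; }
private:
    Print* ptr_;
};

// Dereferences the same object through both wrappers and returns its counter.
int bump_both(Print& target) {
    TupleBackedPtr a(&target);
    DirectPtr b(&target);
    assert(a.get() == b.get());  // both wrappers expose the same address
    a->print();
    b->print();
    return target.i;
}
```

At -O0 the call through std::get<0> stays a real function call, which is the kind of cost the question measured. Once the compiler inlines it, both operator-> bodies can reduce to a single load.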

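~~All in all, with the current implementation, shared_ptr is slower on creation, moving, copying and reference counting, but equally fast on dereferencing.~~
NOTE: print() is empty in the question, so the compiler removes the loops when optimizing. I therefore changed the code slightly to observe the optimization results correctly: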

#include <iostream>
#include <string>
#include <memory>
#include <chrono>
#include <vector>

using namespace std;

class Print
{
public:
  void print() { i++; }

  int i{ 0 };
};

void test()
{
  typedef vector<shared_ptr<Print>> sh_vec;
  typedef vector<unique_ptr<Print>> u_vec;

  sh_vec shvec;
  u_vec uvec;

  // can't use initializer_list with unique_ptr
  for (int var = 0; var < 100; ++var) {
    shvec.push_back(make_shared<Print>());
    uvec.emplace_back(new Print());
  }

  //-------------test shared_ptr-------------------------
  auto time_sh_1 = std::chrono::system_clock::now();

  for (auto var = 0; var < 1000; ++var) {
    for (auto it = shvec.begin(), end = shvec.end(); it != end; ++it) {
      (*it)->print();
    }
  }

  auto time_sh_2 = std::chrono::system_clock::now();

  cout << "test shared_ptr : " << (time_sh_2 - time_sh_1).count()
       << " microseconds." << endl;

  //-------------test unique_ptr-------------------------
  auto time_u_1 = std::chrono::system_clock::now();

  for (auto var = 0; var < 1000; ++var) {
    for (auto it = uvec.begin(), end = uvec.end(); it != end; ++it) {
      (*it)->print();
    }
  }

  auto time_u_2 = std::chrono::system_clock::now();

  cout << "test unique_ptr : " << (time_u_2 - time_u_1).count()
       << " microseconds." << endl;
}

int main()
{
  test();
}

NOTE: This is not a fundamental problem. It can be fixed by dropping the use of ::std::tuple in the current libstdc++ implementation.

Answer 3

You're not testing anything useful here.

What you are talking about: copy

What you are testing: iteration

If you want to test copy, you actually need to perform a copy. Both smart pointers should have similar performance when it comes to reading, because good shared_ptr implementations keep a local copy of the pointer to the object.
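A small sketch of why iteration alone does not exercise the reference count: dereferencing a shared_ptr leaves use_count() unchanged, while copying changes it. The Counts struct and observe_use_count function are names invented for this example.

```cpp
#include <cassert>
#include <memory>

struct Print {
    int i = 0;
    void print() { ++i; }
};

// Reference counts observed at three points in the pointer's life.
struct Counts {
    long after_deref;
    long during_copy;
    long after_copy;
};

Counts observe_use_count() {
    std::shared_ptr<Print> p = std::make_shared<Print>();
    Counts c;

    p->print();                         // reading through the pointer...
    c.after_deref = p.use_count();      // ...does not touch the count

    {
        std::shared_ptr<Print> q = p;   // copying increments the count
        q->print();
        c.during_copy = p.use_count();
    }                                   // destroying the copy decrements it

    c.after_copy = p.use_count();
    assert(p->i == 2);                  // both print() calls hit the same object
    return c;
}
```

The count goes from 1 to 2 and back to 1, and only the copy and its destruction cause that traffic. A loop that merely calls print() through existing pointers never pays for it.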

EDIT:

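Regarding the new elements:

In general, it is not even worth talking about speed when using debug builds. If you care about performance, you will use release code (-O2 in general), so that is what should be measured, because there can be significant differences between debug and release code. Most notably, inlining of template code can seriously decrease execution time.

Regarding the benchmark: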

I would add another round of measurements: naked pointers. Normally, unique_ptr and naked pointers should have the same performance. It would be worth checking, and it need not be true in debug mode.
You might want to "interleave" the execution of the two batches or, if you cannot, take the average of each over several runs. As it is, if the computer slows down toward the end of the benchmark, only the unique_ptr batch will be affected, which will perturb the measurement.
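Both suggestions can be combined in a rough sketch, assuming C++11 and std::chrono::steady_clock. The names Result, time_pass and run_interleaved are hypothetical. Each run times the raw-pointer, unique_ptr and shared_ptr batches back to back, so a slowdown during the benchmark affects all three, and the reported times are averages over all runs. The returned checksum makes the work observable, so an optimizing compiler cannot simply discard the loops.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <memory>
#include <ratio>
#include <vector>

struct Print {
    int i = 0;
    void print() { ++i; }
};

struct Result {
    double raw_us;       // average microseconds per run, raw pointers
    double unique_us;    // average microseconds per run, unique_ptr
    double shared_us;    // average microseconds per run, shared_ptr
    long long checksum;  // sum of all counters, forces the side effects to matter
};

// Times `inner` passes over any vector of pointer-like elements.
template <typename Vec>
double time_pass(Vec& v, int inner) {
    using namespace std::chrono;
    auto start = steady_clock::now();
    for (int k = 0; k < inner; ++k)
        for (auto& p : v) p->print();
    auto stop = steady_clock::now();
    return duration<double, std::micro>(stop - start).count();
}

Result run_interleaved(std::size_t n, int inner, int runs) {
    assert(runs > 0);
    std::vector<Print> storage(n);  // backing objects for the raw pointers
    std::vector<Print*> raw;
    std::vector<std::unique_ptr<Print>> uniq;
    std::vector<std::shared_ptr<Print>> shar;
    for (std::size_t k = 0; k < n; ++k) {
        raw.push_back(&storage[k]);
        uniq.emplace_back(new Print());
        shar.push_back(std::make_shared<Print>());
    }

    Result r = {0.0, 0.0, 0.0, 0};
    for (int run = 0; run < runs; ++run) {
        // Interleave: one batch of each kind per run.
        r.raw_us += time_pass(raw, inner);
        r.unique_us += time_pass(uniq, inner);
        r.shared_us += time_pass(shar, inner);
    }
    r.raw_us /= runs;
    r.unique_us /= runs;
    r.shared_us /= runs;

    for (std::size_t k = 0; k < n; ++k)
        r.checksum += raw[k]->i + uniq[k]->i + shar[k]->i;
    return r;
}
```

Printing the three averages together with the checksum after a call such as run_interleaved(100, 1000, 20) gives a comparison that is less sensitive to the order of the batches than the original test.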

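You might be interested in learning more from Neil's The Joy of Benchmarks. It is not a definitive guide, but it is quite interesting, especially the part about forcing side effects to avoid dead-code removal.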

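Also, be careful about how you measure. The resolution of your clock might be coarser than it appears. If the clock is only refreshed every 15 microseconds, for example, then any measurement around 15 microseconds is suspicious. This can be an issue when measuring release code (you might need to add more iterations to the loop).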
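One rough way to check the resolution is to spin until the clock reports a new value and record the smallest step seen. The sketch below assumes std::chrono::steady_clock, and smallest_observed_step_ns is a name made up for this example. The result is an upper bound on the clock's granularity on the machine at hand, not a guaranteed property of the clock.

```cpp
#include <cassert>
#include <chrono>

// Returns the smallest nonzero step, in nanoseconds, observed between two
// readings of steady_clock over `samples` attempts.
long long smallest_observed_step_ns(int samples) {
    using namespace std::chrono;
    assert(samples > 0);
    long long best = -1;
    for (int s = 0; s < samples; ++s) {
        steady_clock::time_point t0 = steady_clock::now();
        steady_clock::time_point t1 = steady_clock::now();
        while (t1 == t0) t1 = steady_clock::now();  // spin until the clock advances
        long long step = duration_cast<nanoseconds>(t1 - t0).count();
        if (best < 0 || step < best) best = step;
    }
    return best;
}
```

If the timed region is not many times longer than this step, adding iterations to the loop is the safer choice. Note also that count() on a raw clock duration reports ticks in the clock's own period, which is implementation-defined, so converting with duration_cast to microseconds before printing avoids mislabelled output.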
