Memory optimization in huge data set
  • Date: 2011-11-21 12:15:06
  • Tags: c++

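Dear all, I have implemented some functions and would like to ask about some basic things, as I do not have a sound fundamental knowledge of C++. I hope you will be kind enough to tell me what a good approach would be, so that I can learn from you. (Please note that this is not homework, and I do not have any experts around me to ask.)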

What I did is this: I read the input x, y, z point data (a data set of around 3 GB) from a file, then compute one single value for each point and store it inside a vector (result). That vector is used in the next loop. After that it is not used anymore, and I need to get that memory back because it holds a huge data set. I think I can do this in two ways: (1) by just initializing a vector and later erasing it (see code-1); (2) by allocating dynamic memory and later de-allocating it (see code-2). I heard that this de-allocation is inefficient because de-allocation itself costs memory, or maybe I misunderstood.

Q1) I would like to know what would be the optimized way in terms of memory and efficiency.

Q2) Also, I would like to know whether function return by reference is a good way of giving output. (Please look at code-3)

code-1

int main(){

    //read input data (my_data)

    vector<double> result;
    for (vector<Position3D>::iterator it=my_data.begin(); it!=my_data.end(); ++it){

         // do some stuff and calculate a "double" value (say value)
         //using each point coordinate 

         result.push_back(value);
    }

    // do some other stuff

    //loop over result and use each value for some other stuff
    for (size_t i=0; i<result.size(); i++){

        //do some stuff
    }

    //result will not be used anymore and thus erase data
    result.clear();
}

code-2

int main(){

    //read input data

    vector<double> *result = new vector<double>;
    for (vector<Position3D>::iterator it=my_data.begin(); it!=my_data.end(); ++it){

         // do some stuff and calculate a "double" value (say value)
         //using each point coordinate 

         result->push_back(value);
    }

    // do some other stuff

    //loop over result and use each value for some other stuff
    for (size_t i=0; i<result->size(); i++){

        //do some stuff
    }

    //de-allocate memory
    delete result;
    result = 0;
}

code-3

vector<Position3D>& vector<Position3D>::ReturnLabel(VoxelGrid grid, int segment) const
{
  vector<Position3D> *points_at_grid_cutting = new vector<Position3D>;
  vector<Position3D>::iterator  point;

  for (point=begin(); point!=end(); point++) {

       //do some stuff         

  }
  return (*points_at_grid_cutting);
}
Best answer

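erase will not free the memory used by the vector, and neither will the clear() call in code-1. Both reduce the size but not the capacity, so the vector still holds enough memory for all those doubles.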

The best way to make the memory available again is like your code-1, but let the vector go out of scope:

int main() {
    {
        vector<double> result;
        // populate result
        // use results for something
    }
    // do something else - the memory for the vector has been freed
}

Failing that, the idiomatic way to clear a vector and free the memory is:

vector<double>().swap(result);

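This creates an empty temporary vector, then swaps its contents with result (so result is empty and has a small capacity, while the temporary holds all the data and the large capacity). Finally, it destroys the temporary, taking the large buffer with it.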
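The difference between clear() and the swap idiom can be observed through capacity(). The following is a minimal sketch; the helper name release_memory is made up for illustration, and the exact capacity of an empty vector is implementation-defined (in practice it is usually 0).

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: empties v and gives its buffer to a temporary,
// which is destroyed at the end of the full expression.
void release_memory(std::vector<double>& v) {
    std::vector<double>().swap(v);
}

// Hypothetical helper: capacity left behind after clear() alone.
std::size_t capacity_after_clear(std::vector<double> v) {
    v.clear();            // size becomes 0, the buffer is kept
    return v.capacity();
}
```

For a vector of a million doubles, capacity_after_clear typically still reports room for at least a million elements, whereas after release_memory(v) the capacity of v drops to that of a freshly constructed vector.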

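Regarding code-3: returning a dynamically allocated object by reference is not good style, since it gives the caller little indication that they are responsible for freeing it. Usually the best thing to do is to return a local variable by value: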

vector<Position3D> ReturnLabel(VoxelGrid grid, int segment) const
{
  vector<Position3D> points_at_grid_cutting;
  // do whatever to populate the vector
  return points_at_grid_cutting;
}

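The reason is that, provided the caller uses a call to this function as the initializer of their own vector, something called the "named return value optimization" (NRVO) kicks in and ensures that, although you are returning by value, no copy of the value is made.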
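A self-contained sketch of the return-by-value pattern is given here. Position3D is a stand-in struct and points_above is a hypothetical free function, not the asker's ReturnLabel; the filter condition is only an example.

```cpp
#include <cstddef>
#include <vector>

// Stand-in for the asker's point type (assumption).
struct Position3D { double x, y, z; };

// Hypothetical example: collect the points whose z lies above z_min
// and return the local vector by value.
std::vector<Position3D> points_above(const std::vector<Position3D>& in,
                                     double z_min) {
    std::vector<Position3D> out;
    for (std::size_t i = 0; i < in.size(); ++i) {
        if (in[i].z > z_min) {
            out.push_back(in[i]);
        }
    }
    return out;  // eligible for NRVO; under C++11 it is moved at worst
}
```

The caller would write std::vector<Position3D> high = points_above(cloud, 1.0); so that the call initializes high directly.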

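A compiler that does not implement NRVO is a bad compiler, and will probably have all sorts of other surprising performance failures. There are some cases where NRVO does not apply, though, most importantly when the caller assigns the value to an existing variable instead of using it in an initialization. There are three fixes for this:

1) C++11 introduces move semantics, which basically sorts it out by ensuring that assignment from a temporary is cheap.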

2) In C++03, the caller can play a trick called "swaptimization". Instead of:

vector<Position3D> foo;
// some other use of foo
foo = ReturnLabel();

write:

vector<Position3D> foo;
// some other use of foo
ReturnLabel().swap(foo);

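3) You write the function with a more complicated signature, for example taking a vector by non-const reference and filling the values into that, or taking an OutputIterator as a template parameter. The latter also gives the caller more flexibility, since they need not use a vector to store the results: they could use some other container, or even process the results one at a time without storing them all at once.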
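The OutputIterator variant from option 3 might look roughly like this. The function name copy_points_above and the filter condition are assumptions for illustration, and Position3D is a stand-in struct.

```cpp
#include <iterator>
#include <list>
#include <vector>

// Stand-in for the asker's point type (assumption).
struct Position3D { double x, y, z; };

// Hypothetical example: write every point in [first, last) whose z lies
// above z_min to the output iterator, and return the advanced iterator.
template <typename InputIt, typename OutputIt>
OutputIt copy_points_above(InputIt first, InputIt last,
                           double z_min, OutputIt out) {
    for (; first != last; ++first) {
        if (first->z > z_min) {
            *out++ = *first;
        }
    }
    return out;
}
```

A caller can pass std::back_inserter(some_list) to collect the points in a std::list, or an iterator that processes each point as it arrives, without the function knowing which.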

Other answers

For such a huge data set I would avoid using std containers altogether and make use of memory-mapped files.
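As a rough sketch of the memory-mapped approach, the following uses the POSIX mmap call. It assumes a POSIX system and, importantly, that the points are stored as tightly packed binary Position3D records, which the question does not state. The function name sum_z_mapped and the reduction (summing z) are made up for illustration.

```cpp
#include <cstddef>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

// Assumed on-disk record layout: three consecutive doubles per point.
struct Position3D { double x, y, z; };

// Hypothetical example: sum the z coordinate of every record in the file
// without copying the file into a container. Returns false on any error.
bool sum_z_mapped(const char* path, double& sum_out) {
    int fd = open(path, O_RDONLY);
    if (fd < 0) return false;

    struct stat st;
    if (fstat(fd, &st) != 0 || st.st_size <= 0) {
        close(fd);
        return false;
    }
    std::size_t bytes = static_cast<std::size_t>(st.st_size);

    void* base = mmap(nullptr, bytes, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);  // the mapping stays valid after the descriptor is closed
    if (base == MAP_FAILED) return false;

    const Position3D* pts = static_cast<const Position3D*>(base);
    std::size_t count = bytes / sizeof(Position3D);
    double sum = 0.0;
    for (std::size_t i = 0; i < count; ++i) {
        sum += pts[i].z;  // pages are brought in by the OS on demand
    }

    munmap(base, bytes);
    sum_out = sum;
    return true;
}
```

With this design the operating system decides which parts of the file are resident, so the program never has to hold the whole 3 GB in its own allocations.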

If you prefer to go on with std::vector, note that vector::clear() only resets the size; use std::vector<double>().swap(result) to actually free the allocated memory.

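Your code seems to store the value computed in the first loop only to use it in a context-insensitive way in the second loop. In other words, once you have computed the double value in the first loop, you could act on it immediately, without any need to store all the values at once.

If that is the case, you should implement it that way. No worries about large allocations, storage or anything. Better cache performance. Happiness.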

vector<double> result;
    for (vector<Position3D>::iterator it=my_data.begin(); it!=my_data.end(); it++){

         // do some stuff and calculate a "double" value (say value)
         //using each point coordinate 

         result.push_back(value);
    }

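If the "result" vector ends up holding thousands of values, this will cause many reallocations. It would be better to initialize it with a large enough capacity, or to use the reserve function: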

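// note: this form creates someSuitableNumber elements, so fill them by index; to keep push_back, call result.reserve(someSuitableNumber) on an empty vector instead
vector<double> result (someSuitableNumber,0.0);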

This will reduce the number of reallocations and may further optimize your code.
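The effect of reserve can be made visible by counting how often the capacity changes while values are appended. This is only an illustrative sketch; count_reallocations is a hypothetical helper, and the growth pattern without reserve depends on the standard library implementation.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: append n values and count how many times the
// vector's capacity changed, which corresponds to a reallocation.
std::size_t count_reallocations(std::size_t n, bool reserve_first) {
    std::vector<double> result;
    if (reserve_first) {
        result.reserve(n);  // one allocation up front, size stays 0
    }
    std::size_t reallocations = 0;
    std::size_t last_capacity = result.capacity();
    for (std::size_t i = 0; i < n; ++i) {
        result.push_back(static_cast<double>(i));
        if (result.capacity() != last_capacity) {
            ++reallocations;
            last_capacity = result.capacity();
        }
    }
    return reallocations;
}
```

Because reserve(n) guarantees room for n elements, the reserved variant reports zero reallocations during the loop, while the plain variant reallocates repeatedly as it grows.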

Also, instead of writing the following: vector<Position3D>& vector<Position3D>::ReturnLabel(VoxelGrid grid, int segment) const

I would write something like this:

void vector<Position3D>::ReturnLabel(VoxelGrid grid, int segment, vector<Position3D> & myVec_out) const //myVec_out is populated inside func

Your idea of returning a reference is right in the sense that you want to avoid copying.

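Destructors in C++ must not fail, therefore deallocation cannot allocate memory, because memory cannot be allocated with the no-throw guarantee.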

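Apart from that: it may be better to do the operations in an integrated manner, i.e. instead of loading the whole data set and then reducing the whole data set, read the points in one by one and apply the reduction directly. That is, instead of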

load_my_data()
for_each (p : my_data)
    result.push_back(p)

for_each (p : result)
    reduction.push_back (reduce (p))

just do

file f ("file")
while (f)
    Point p = read_point (f)
    reduction.push_back (reduce (p))

If you do not need to store the reductions, simply output them sequentially:

file f ("file")
while (f)
    Point p = read_point (f)
    cout << reduce (p)
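The pseudocode above could be turned into C++ along these lines. This is a sketch under assumptions: the input is taken to be whitespace-separated "x y z" text, the reduce function (distance from the origin) is a placeholder for the real per-point computation, and stream_reduce is a made-up name.

```cpp
#include <cmath>
#include <cstddef>
#include <istream>
#include <sstream>

struct Point { double x, y, z; };

// Placeholder reduction (assumption): distance of the point from the origin.
double reduce(const Point& p) {
    return std::sqrt(p.x * p.x + p.y * p.y + p.z * p.z);
}

// Hypothetical example: read points one at a time and hand each reduced
// value to `consume` immediately, so no point is ever stored.
template <typename Consumer>
std::size_t stream_reduce(std::istream& in, Consumer consume) {
    Point p;
    std::size_t count = 0;
    while (in >> p.x >> p.y >> p.z) {
        consume(reduce(p));
        ++count;
    }
    return count;  // number of complete points read
}
```

The consumer can print the value, accumulate it, or push it into a container, so the same loop covers both the storing and the non-storing variants of the pseudocode.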

code-1 will work fine, and it is almost identical to code-2, with no major advantage or disadvantage either way.

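code-3: someone else should answer this, but I believe the difference between a pointer and a reference is marginal in this case. I do prefer pointers, though.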

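That being said, I think you might be approaching the optimization from the wrong angle. Do you really need all the points to compute the output for a single point in your first loop? Or can you rewrite your algorithm to read only one point, compute the value as you would in your first loop, and then use it immediately the way you want to? Maybe not with single points, but with batches of points. This could potentially cut your memory requirement considerably, with only a small increase in processing time.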
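Processing in batches, as suggested above, could be sketched like this. The text input format, the Position3D stand-in and the name for_each_batch are assumptions for illustration; the batch size would have to be tuned for the real data.

```cpp
#include <cstddef>
#include <istream>
#include <sstream>
#include <vector>

// Stand-in for the asker's point type (assumption).
struct Position3D { double x, y, z; };

// Hypothetical example: read whitespace-separated "x y z" points and pass
// them to `handle` in batches of at most batch_size, reusing one buffer.
template <typename BatchFn>
std::size_t for_each_batch(std::istream& in, std::size_t batch_size,
                           BatchFn handle) {
    if (batch_size == 0) batch_size = 1;  // guard against an empty batch size
    std::vector<Position3D> batch;
    batch.reserve(batch_size);
    std::size_t total = 0;
    Position3D p;
    while (in >> p.x >> p.y >> p.z) {
        batch.push_back(p);
        if (batch.size() == batch_size) {
            handle(batch);
            total += batch.size();
            batch.clear();  // keeps the capacity, so the buffer is reused
        }
    }
    if (!batch.empty()) {   // final, possibly partial batch
        handle(batch);
        total += batch.size();
    }
    return total;
}
```

Here the fact that clear() keeps the capacity works in the program's favour: peak memory is bounded by one batch, and no reallocation happens between batches.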




