Assigning elements to buffer and specific position in buffer - how to speed up / why are the different implementations so slow?
  • Time: 2012-05-24 17:44:18
  • Tags: python

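I am trying to speed up the following code. Given a list of strings str_list, it converts each string into a number (using unpack) and assigns that number to the correct position in the nested list data. The dimensions of data are roughly data[4][20][1024]. Unfortunately, this function runs very slowly. Here is the code: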

from struct import unpack

for abs_idx in range(nbr_elements):

    # get string
    this_element = str_list[abs_idx]

    # convert into number ('d' = one 8-byte double)
    this_element = unpack('d', this_element)[0]

    # calculate the buffer number (// keeps the index an integer)
    buffer_nbr = abs_idx // NBR_DATA_POINTS_PER_BUFFER_INT

    # calculate the position inside the buffer
    index_in_buffer = abs_idx % NBR_DATA_POINTS_PER_BUFFER_INT

    # write data into correct position
    data[file_idx][buffer_nbr][index_in_buffer] = this_element

I also tried the following alternative, which turned out to be even slower:

# convert each string into a number
unpacked_values = [unpack('d', str_list[j])[0] for j in range(nbr_elements)]
for abs_idx in range(nbr_elements):

    # calculate the buffer number
    buffer_nbr = abs_idx // NBR_DATA_POINTS_PER_BUFFER_INT

    # calculate the position inside the buffer
    index_in_buffer = abs_idx % NBR_DATA_POINTS_PER_BUFFER_INT

    # write data into correct position
    data[file_idx][buffer_nbr][index_in_buffer] = unpacked_values[abs_idx]

To my surprise, the next implementation is the slowest (I expected it to be the fastest):

# convert each string into a number
unpacked_values = [unpack('d', str_list[j])[0] for j in range(nbr_elements)]

# calculate all buffer numbers at once
buffer_ids = np.arange(nbr_elements) // NBR_DATA_POINTS_PER_BUFFER_INT

# calculate all positions inside the buffer at once
index_in_buffer_id = np.arange(nbr_elements) % NBR_DATA_POINTS_PER_BUFFER_INT

for abs_idx in range(nbr_elements):
    data[file_idx][buffer_ids[abs_idx]][index_in_buffer_id[abs_idx]] = unpacked_values[abs_idx]

Why do the successive implementations perform worse? Where are the individual bottlenecks? How can I speed up my initial code?

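EDIT: According to my profiling tests, the following two steps are the bottlenecks: running unpack and assigning the values to data. I don't know how to speed up these steps.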
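
For readers who want to reproduce this kind of measurement, here is a minimal sketch using the standard-library cProfile module. The helper name fill, the constant Q and the sample data are all made up for illustration, and the sketch uses Python 3 bytes, whereas the code in this thread appears to target Python 2. Note that the list assignment is not a function call, so its cost shows up inside the time attributed to fill itself rather than as a separate row.

```python
import cProfile
import io
import pstats
from struct import pack, unpack

Q = 1024  # hypothetical stand-in for NBR_DATA_POINTS_PER_BUFFER_INT


def fill(str_list, buffers):
    """Element-wise conversion and assignment, as in the first listing."""
    for abs_idx in range(len(str_list)):
        value = unpack("d", str_list[abs_idx])[0]
        buffers[abs_idx // Q][abs_idx % Q] = value


# Hypothetical sample data: 20 buffers' worth of packed 8-byte doubles.
str_list = [pack("d", float(i)) for i in range(20 * Q)]
buffers = [[0.0] * Q for _ in range(20)]

profiler = cProfile.Profile()
profiler.enable()
fill(str_list, buffers)
profiler.disable()

# Collect the five most expensive entries, sorted by cumulative time.
report = io.StringIO()
pstats.Stats(profiler, stream=report).sort_stats("cumulative").print_stats(5)
print(report.getvalue())
```

The absolute numbers depend on the machine and the Python version, so they are only useful for comparing variants against each other.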

EDIT2: I need to use unpack because my strings are in hexadecimal.

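EDIT3: values = unpack("d" * n, "".join(str_list)) solved the slow unpacking. However, assigning the data through the triple (original) or double (modified) nested list still consumes 50% of the time. Is there a way to reduce this?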
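
As a small illustration of the single-call unpacking mentioned in EDIT3, the sketch below checks that one unpack call over the joined data gives the same numbers as one call per item. The sample data is invented, and the code is written for Python 3, where the packed items are bytes and are joined with b"".join. It also uses the struct repeat-count syntax, so "4d" stands for "dddd" and the format string does not have to be built by repetition.

```python
from struct import pack, unpack

# Hypothetical sample data: each item is one packed 8-byte double.
str_list = [pack("d", x) for x in (0.5, 1.5, 2.5, 3.5)]
n = len(str_list)

# One unpack call per item, as in the original loop.
one_by_one = [unpack("d", s)[0] for s in str_list]

# One unpack call for all items; "%dd" % n builds the format "4d".
all_at_once = unpack("%dd" % n, b"".join(str_list))

print(list(all_at_once) == one_by_one)
```

The single call returns a tuple, so it needs to be converted with list() if the result must be mutable.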

Best answer

Some optimizations:

  1. Unpack all strings at once
  2. Get the item data[file_idx] before the loop

Try this:

n = len(str_list)
values = unpack("d" * n, "".join(str_list))

a = data[file_idx]

# Just to shorten this code sample
q = NBR_DATA_POINTS_PER_BUFFER_INT

for i in xrange(n):
    a[i // q][i % q] = values[i]

By the way, have you profiled which part of the code takes the most time?

UPDATE:

n = len(str_list)
values = unpack("d" * n, "".join(str_list))

# Just to shorten this code sample
q = NBR_DATA_POINTS_PER_BUFFER_INT

data[file_idx] = [values[i:i+q] for i in xrange(0, n, q)]
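
The following sketch checks, on made-up data, that the slicing approach produces the same nested structure as the element-wise loop. It is written for Python 3 and uses a small q so the result is easy to inspect. All names and sizes here are illustrative assumptions. Because unpack returns a tuple, each slice is wrapped in list() so the buffers stay mutable; without that, the buffers would be tuples.

```python
from struct import pack, unpack

q = 4  # hypothetical stand-in for NBR_DATA_POINTS_PER_BUFFER_INT

# Hypothetical sample data: 12 packed doubles, i.e. 3 buffers of 4 values.
str_list = [pack("d", float(i)) for i in range(12)]
n = len(str_list)
values = unpack("%dd" % n, b"".join(str_list))

# Element-wise assignment into preallocated buffers.
looped = [[0.0] * q for _ in range(n // q)]
for i in range(n):
    looped[i // q][i % q] = values[i]

# Slice-based construction: one slice per buffer, no per-element indexing.
sliced = [list(values[i:i + q]) for i in range(0, n, q)]

print(sliced == looped)
```

The slice-based version replaces n indexed assignments with n / q slice operations, which is where the saving would be expected to come from.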

Other answers

Could this be faster? It saves a few lookups and does not need an intermediate list holding the numbers for all strings.

df = data[file_idx]
index = 0
for value in str_list:
    # not sure what unpack does... is there a faster function 
    # that does the same?
    number = unpack('d', value)[0]

    # calculate the buffer number
    buffer_nbr = index // NBR_DATA_POINTS_PER_BUFFER_INT

    # calculate the position inside the buffer
    index_in_buffer = index % NBR_DATA_POINTS_PER_BUFFER_INT

    # write data into correct position
    df[buffer_nbr][index_in_buffer] = number

    index += 1

Or how about this:

df = data[file_idx]
index = 0
bufnr = 0
buf = df[0]
for value in str_list:
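    # move on to the next buffer once the current one is full; checking
    # before the write avoids indexing past the last buffer at the end
    if index >= NBR_DATA_POINTS_PER_BUFFER_INT:
        index = 0
        bufnr += 1
        buf = df[bufnr]

    # not sure what unpack does... is there a faster function
    # that does the same?
    number = unpack('d', value)[0]

    buf[index] = number

    index += 1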

Is data a dictionary rather than a list?




