Original title: Pythonic way to create a numpy array from a list of numpy arrays

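I generate a list of one-dimensional numpy arrays in a loop and later convert this list to a 2D numpy array. If I knew the number of items ahead of time I would preallocate a 2D numpy array, but I don't, so I put everything in a list.

The mock-up is below: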

>>> from numpy import array, ones
>>> list_of_arrays = list(map(lambda x: x*ones(2), range(5)))
>>> list_of_arrays
[array([ 0.,  0.]), array([ 1.,  1.]), array([ 2.,  2.]), array([ 3.,  3.]), array([ 4.,  4.])]
>>> arr = array(list_of_arrays)
>>> arr
array([[ 0.,  0.],
       [ 1.,  1.],
       [ 2.,  2.],
       [ 3.,  3.],
       [ 4.,  4.]])

My question is the following:

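Is there a better way (performance-wise) to collect sequential numerical data (in my case numpy arrays) than putting them in a list and then making a numpy.array out of it (which creates a new object and copies the data)? Is there an "expandable" matrix data structure available in a well-tested module?

A typical size for my 2D matrix is between 100x10 and 5000x10 floats.

EDIT: In this example I use map, but in my actual application I have a for loop.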

Best answer

Suppose you know that the final array arr will never be larger than 5000x10. Then you could pre-allocate an array of maximum size, populate it with data as you go through the loop, and then use arr.resize to cut it down to the discovered size after exiting the loop.

The tests below suggest doing so will be slightly faster than constructing intermediate python lists no matter what the ultimate size of the array is.

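Also, arr.resize de-allocates the unused memory, so the final (though maybe not the intermediate) memory footprint is smaller than what python_lists_to_array uses.

numpy_all_the_way is faster: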

% python -mtimeit -s"import test" "test.numpy_all_the_way(100)"
100 loops, best of 3: 1.78 msec per loop
% python -mtimeit -s"import test" "test.numpy_all_the_way(1000)"
100 loops, best of 3: 18.1 msec per loop
% python -mtimeit -s"import test" "test.numpy_all_the_way(5000)"
10 loops, best of 3: 90.4 msec per loop

% python -mtimeit -s"import test" "test.python_lists_to_array(100)"
1000 loops, best of 3: 1.97 msec per loop
% python -mtimeit -s"import test" "test.python_lists_to_array(1000)"
10 loops, best of 3: 20.3 msec per loop
% python -mtimeit -s"import test" "test.python_lists_to_array(5000)"
10 loops, best of 3: 101 msec per loop

numpy_all_the_way uses less memory:

% test.py
Initial memory usage: 19788
After python_lists_to_array: 20976
After numpy_all_the_way: 20348

test.py:

import numpy as np
import os


def memory_usage():
    # Linux only: read this process's VmSize (in kB) from /proc/<pid>/status.
    pid = os.getpid()
    return next(line for line in open('/proc/%s/status' % pid).read().splitlines()
                if line.startswith('VmSize')).split()[-2]

N, M = 5000, 10


def python_lists_to_array(k):
    list_of_arrays = list(map(lambda x: x * np.ones(M), range(k)))
    arr = np.array(list_of_arrays)
    return arr


def numpy_all_the_way(k):
    arr = np.empty((N, M))
    for x in range(k):
        arr[x] = x * np.ones(M)
    arr.resize((k, M))
    return arr

if __name__ == '__main__':
    print('Initial memory usage: %s' % memory_usage())
    arr = python_lists_to_array(5000)
    print('After python_lists_to_array: %s' % memory_usage())
    arr = numpy_all_the_way(5000)
    print('After numpy_all_the_way: %s' % memory_usage())

Other answers

A convenient way is to use numpy.concatenate. I believe it is also faster than @unutbu's answer:

In [32]: import numpy as np 

In [33]: list_of_arrays = list(map(lambda x: x * np.ones(2), range(5)))

In [34]: list_of_arrays
Out[34]: 
[array([ 0.,  0.]),
 array([ 1.,  1.]),
 array([ 2.,  2.]),
 array([ 3.,  3.]),
 array([ 4.,  4.])]

In [37]: shape = list(list_of_arrays[0].shape)

In [38]: shape
Out[38]: [2]

In [39]: shape[:0] = [len(list_of_arrays)]

In [40]: shape
Out[40]: [5, 2]

In [41]: arr = np.concatenate(list_of_arrays).reshape(shape)

In [42]: arr
Out[42]: 
array([[ 0.,  0.],
       [ 1.,  1.],
       [ 2.,  2.],
       [ 3.,  3.],
       [ 4.,  4.]])

Even simpler than @Gill Bates' answer, here is a one-liner:

np.stack(list_of_arrays, axis=0)

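What you are doing is the standard way. A property of numpy arrays is that they need contiguous memory. The only possibility of "holes" that I can think of comes from the strides member of PyArrayObject, but that doesn't affect the discussion here. Since numpy arrays have contiguous memory and are "preallocated", adding a new row or column means allocating new memory, copying the data, and then freeing the old memory. If you do that a lot, it is not very efficient.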
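To make the "allocate, copy, free" point concrete, here is a minimal sketch (the array sizes are arbitrary choices for illustration) showing that adding a row with np.vstack returns a separate array rather than growing the original in place:

```python
import numpy as np

arr = np.zeros((3, 2))
new_row = np.ones((1, 2))

# np.vstack cannot grow `arr` in place; it builds a new array
# and copies both inputs into it.
grown = np.vstack((arr, new_row))

print(grown.shape)                   # shape of the new, larger array
print(np.shares_memory(arr, grown))  # whether the two arrays share a buffer
print(arr.shape)                     # the original array is left as it was
```

The original `arr` keeps its shape, and `grown` lives in its own buffer, which is why repeating this step once per row gets expensive.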

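One case where you might not want to build a list and convert it to a numpy array at the end is when the list contains a lot of numbers: a numpy array of numbers takes much less space than a native Python list of numbers (since the native Python list stores Python objects). For your typical array sizes, I don't think that is an issue.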
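A rough way to see the size difference, assuming CPython (sys.getsizeof is implementation-specific, so treat the numbers as approximate, and the element count here is an arbitrary choice):

```python
import sys
import numpy as np

n = 100_000
values = [float(i) for i in range(n)]  # a list of separate Python float objects
arr = np.array(values)                 # one contiguous block of float64

# The list itself stores pointers; each float is a full Python object on top of that.
list_bytes = sys.getsizeof(values) + sum(sys.getsizeof(v) for v in values)
array_bytes = arr.nbytes               # 8 bytes per float64 element

print(list_bytes, array_bytes)
```

For arrays in the 100x10 to 5000x10 range the absolute difference is small, which matches the remark that it is unlikely to matter here.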

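When you create the final array from a list of arrays, you copy all the data to a new location for the new (2D in your example) array. This is still much more efficient than keeping a numpy array and doing next = numpy.vstack((next, new_row)) every time you get new data, because vstack() copies all the existing data for every "row".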
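The two strategies can be put side by side in a small sketch. The `rows` generator is a hypothetical stand-in for the real loop, and `M = 10` follows the column count used elsewhere in this thread:

```python
import numpy as np

M = 10

def rows(k):
    # Hypothetical data source standing in for the real loop.
    for x in range(k):
        yield x * np.ones(M)

def collect_in_list(k):
    # The data is copied once, when the list is converted at the end.
    return np.array(list(rows(k)))

def grow_with_vstack(k):
    # Every iteration copies all rows gathered so far into a new array.
    arr = np.empty((0, M))
    for row in rows(k):
        arr = np.vstack((arr, row))
    return arr

a = collect_in_list(200)
b = grow_with_vstack(200)
print(a.shape, np.array_equal(a, b))
```

Both functions return the same array, but the vstack version performs on the order of k²/2 row copies in total, while the list version copies each row once.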

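There was a thread on the numpy-discussion mailing list some time ago about the possibility of adding a new numpy array type that allows efficient extending/appending. There seemed to be significant interest in it at the time, although I don't know whether anything came of it. You might want to look at that thread.

I would say that what you are doing is very Pythonic and efficient, so unless you really need something else (more space efficiency, maybe?), you should be fine. That is how I create my numpy arrays when I don't know the number of elements in advance.

I'll add my own version of ~unutbu's answer. It is similar to numpy_all_the_way, but it resizes dynamically when it hits an index error. I thought it would be a little faster for small data sets, but it is a little slower: the bounds checking slows things down too much.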

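initial_guess = 1000

# Relies on np and M from test.py above; make_test_data is not shown in the answer.
def my_numpy_all_the_way(k):
    arr = np.empty((initial_guess, M))
    for x, row in enumerate(make_test_data(k)):
        try:
            arr[x] = row
        except IndexError:
            # Ran past the end: double the number of rows, then retry the write.
            arr.resize((arr.shape[0] * 2, arr.shape[1]))
            arr[x] = row
    arr.resize((k, M))  # trim to the k rows actually filled
    return arr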
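For readers who want to run the idea end to end, here is a self-contained sketch of the same doubling strategy. It is a variant, not the answer's code: it uses an explicit capacity check instead of catching IndexError, and it allocates a new array when growing instead of calling arr.resize. `make_test_data` below is a hypothetical stand-in, since the answer does not show that helper, and `grow_by_doubling` is a name chosen for this example.

```python
import numpy as np

M = 10

def make_test_data(k):
    # Hypothetical stand-in: the original helper is not shown in the answer.
    for x in range(k):
        yield x * np.ones(M)

def grow_by_doubling(rows, initial_guess=4):
    arr = np.empty((initial_guess, M))
    count = 0
    for row in rows:
        if count == arr.shape[0]:
            # Out of room: allocate twice the capacity and copy the filled rows over.
            bigger = np.empty((2 * arr.shape[0], M))
            bigger[:count] = arr[:count]
            arr = bigger
        arr[count] = row
        count += 1
    # Trim to the rows actually filled; copy() releases the unused capacity.
    return arr[:count].copy()

result = grow_by_doubling(make_test_data(25))
print(result.shape)
```

Because the capacity doubles each time, the number of reallocations grows only logarithmically with the number of rows, so the caller does not need to know the final size in advance.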

Even simpler than @fnjn's answer:

np.vstack(list_of_arrays)
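As a quick check, the one-liners suggested in this thread can be compared on the question's sample data. This sketch only confirms that they agree for a list of equal-length 1-D arrays:

```python
import numpy as np

list_of_arrays = [x * np.ones(2) for x in range(5)]

a = np.array(list_of_arrays)
b = np.stack(list_of_arrays, axis=0)
c = np.vstack(list_of_arrays)
d = np.concatenate(list_of_arrays).reshape(len(list_of_arrays), -1)

print(a.shape)
print(all(np.array_equal(a, other) for other in (b, c, d)))
```

The calls diverge once the inputs are already 2-D: np.stack adds a new leading axis, while np.vstack joins the inputs along their existing first axis.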



