If I am using the sparse.lil_matrix format, how can I easily and efficiently remove a column from the matrix?
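I have wanted this for a long time, and there is still no good built-in way to do it. Here is one approach. I chose to create a subclass of lil_matrix and add a removecol function. If you prefer, you can add the removecol function to the lil_matrix class in your lib/site-packages/scipy/sparse/lil.py file. Here is the code: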
from bisect import bisect_left
from scipy import sparse

class lil2(sparse.lil_matrix):
    def removecol(self, j):
        # Allow negative indices, as in normal Python indexing
        if j < 0:
            j += self.shape[1]
        if j < 0 or j >= self.shape[1]:
            raise IndexError("column index out of bounds")
        rows = self.rows
        data = self.data
        for i in range(self.shape[0]):
            # Find where column j is (or would be) in this row's sorted indices
            pos = bisect_left(rows[i], j)
            if pos == len(rows[i]):
                continue
            elif rows[i][pos] == j:
                rows[i].pop(pos)
                data[i].pop(pos)
                if pos == len(rows[i]):
                    continue
            # Shift the remaining column indices one place to the left
            for pos2 in range(pos, len(rows[i])):
                rows[i][pos2] -= 1
        self._shape = (self._shape[0], self._shape[1] - 1)
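I have tried it out and did not find any bugs. I certainly think it is better than slicing the column out, because as far as I know that just creates a new matrix.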
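To make the in-place idea concrete, here is a minimal pure-Python sketch of the same bisect-and-shift step applied to a LIL-style layout: one sorted list of column indices and one parallel list of values per row. The function name `remove_col_lil` and the toy data are illustrative assumptions, not part of SciPy.

```python
from bisect import bisect_left


def remove_col_lil(rows, data, ncols, j):
    """Drop column j in place from a LIL-style structure; return the new width.

    rows[i] is the sorted list of column indices stored in row i and
    data[i] holds the matching values (hypothetical stand-ins for
    lil_matrix.rows and lil_matrix.data).
    """
    if j < 0:
        j += ncols
    if j < 0 or j >= ncols:
        raise IndexError("column index out of bounds")
    for cols, vals in zip(rows, data):
        # Position of the first stored column index >= j
        pos = bisect_left(cols, j)
        if pos < len(cols) and cols[pos] == j:
            # Column j is stored in this row: drop the entry
            cols.pop(pos)
            vals.pop(pos)
        # Every stored index to the right of j moves one column left
        for k in range(pos, len(cols)):
            cols[k] -= 1
    return ncols - 1


# Toy 3x4 matrix: [[1, 0, 2, 0], [0, 3, 0, 4], [0, 0, 5, 0]]
rows = [[0, 2], [1, 3], [2]]
data = [[1, 2], [3, 4], [5]]
ncols = remove_col_lil(rows, data, 4, 2)
```

Only the entries at or to the right of the removed column are touched, which is why this can be cheaper than building a whole new matrix when rows are short.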
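I decided to write a removerow function as well, but I don't think it is as good as removecol. I am limited by not being able to remove a row from an ndarray the way I would like. Here is removerow, which can be added to the class above.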
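    # Add to the lil2 class above; requires "import numpy" at the top of the module.
    def removerow(self, i):
        if i < 0:
            i += self.shape[0]
        if i < 0 or i >= self.shape[0]:
            raise IndexError("row index out of bounds")
        # numpy.delete returns new arrays, so this copies rows and data
        self.rows = numpy.delete(self.rows, i, 0)
        self.data = numpy.delete(self.data, i, 0)
        self._shape = (self._shape[0] - 1, self.shape[1])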
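Maybe I should submit these functions to the SciPy repository.
Much simpler and faster. You may not even need to convert to csr format, but I know it works with csr sparse matrices, and converting between the two should not be a problem.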
from scipy import sparse
# col_list holds the indices of the columns to keep
x_new = sparse.lil_matrix(sparse.csr_matrix(x)[:, col_list])
For a sparse CSR matrix (X) and a list of indices to drop (index_to_drop):
# sorted() keeps the remaining columns in their original order
to_keep = sorted(set(range(X.shape[1])) - set(index_to_drop))
new_X = X[:, to_keep]
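Converting lil_matrices to csr_matrices is easy. See tocsr() in the lil_matrix documentation.
Note, however, that converting from csr back to lil with tolil() is expensive. So this option is a good choice when you do not need the matrix to stay in lil format.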
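As a rough sketch of that route, the helper below converts to CSR once, builds a boolean keep-mask, and selects the surviving columns in their original order. The name `drop_columns` and the sample matrix are assumptions made up for illustration.

```python
import numpy as np
from scipy import sparse


def drop_columns(X, index_to_drop):
    """Return a CSR copy of X without the listed columns (order preserved)."""
    X = sparse.csr_matrix(X)
    keep = np.ones(X.shape[1], dtype=bool)
    keep[np.asarray(index_to_drop, dtype=int)] = False  # mark dropped columns
    return X[:, np.flatnonzero(keep)]


dense = np.arange(12).reshape(3, 4)
X = sparse.lil_matrix(dense)
new_X = drop_columns(X, [1, 3])
```

The result stays in CSR; calling `new_X.tolil()` is only worth it if later code really needs LIL again.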
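My answer may well be wrong, but I was wondering: why wouldn't something like the following be efficient?
Suppose your lil_matrix is called mat and you want to remove the i-th column: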
from scipy.sparse import hstack
mat = hstack([mat[:, 0:i], mat[:, i+1:]])
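The matrix will be a coo_matrix after this, but you can convert it back to a lil_matrix.
OK, I understand that this has to build the two matrices inside hstack before the result is assigned to the mat variable, so for a moment it is like holding the original matrix plus one more. But I would guess that if the sparsity is high enough there should not be any memory problems, since memory (and time) is the whole reason for using sparse matrices.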
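A small sketch of this hstack approach, using made-up sample data, might look like the following. Passing `format="lil"` to `scipy.sparse.hstack` asks for the result in LIL directly, which saves a separate conversion call.

```python
import numpy as np
from scipy import sparse

dense = np.array([[1, 0, 2, 0],
                  [0, 3, 0, 4],
                  [0, 0, 5, 0]])
mat = sparse.lil_matrix(dense)
i = 2  # column to remove

# Stitch together everything left of column i and everything right of it
mat = sparse.hstack([mat[:, :i], mat[:, i + 1:]], format="lil")
```

The two slices and the stacked result all exist at the same time, so peak memory is roughly twice the stored entries of the original matrix.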
from bisect import bisect_left

def removecols(W, col_list):
    if min(col_list) < 0 or max(col_list) >= W.shape[1]:
        raise IndexError("column index out of bounds")
    rows = W.rows
    data = W.data
    # Remove columns from highest to lowest so earlier removals
    # do not shift the indices of columns still to be removed
    cols = sorted(set(col_list), reverse=True)
    for i in range(W.shape[0]):
        for j in cols:
            pos = bisect_left(rows[i], j)
            if pos == len(rows[i]):
                continue
            elif rows[i][pos] == j:
                rows[i].pop(pos)
                data[i].pop(pos)
                if pos == len(rows[i]):
                    continue
            for pos2 in range(pos, len(rows[i])):
                rows[i][pos2] -= 1
    W._shape = (W._shape[0], W._shape[1] - len(cols))
    return W
I just rewrote your code so that it accepts col_list as input; perhaps it will be helpful to someone.
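Looking at the notes for each sparse matrix format, and in our case the csc matrix in particular, the documentation lists several advantages, including efficient column slicing.
If you have the indices of the columns you want to remove, just slice out the columns you want to keep. For removing rows, use a csr matrix, since it is efficient at row slicing.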
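A brief sketch of that advice, with hypothetical index lists and sample data: convert to CSC before selecting columns and to CSR before selecting rows.

```python
import numpy as np
from scipy import sparse

dense = np.arange(20).reshape(4, 5)
mat = sparse.lil_matrix(dense)

cols_to_remove = [0, 3]
rows_to_remove = [1]

# np.setdiff1d returns the sorted indices that are NOT being removed
keep_cols = np.setdiff1d(np.arange(mat.shape[1]), cols_to_remove)
keep_rows = np.setdiff1d(np.arange(mat.shape[0]), rows_to_remove)

without_cols = mat.tocsc()[:, keep_cols]  # column selection on CSC
without_rows = mat.tocsr()[keep_rows, :]  # row selection on CSR
```

Both results are new matrices; the original `mat` is left unchanged.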