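I have a list that contains duplicate items, and I want a list of the unique items together with their frequencies.
For example, I have ['a', 'a', 'b', 'b', 'b'] and I want [('a', 2), ('b', 3)].
I'm looking for a simple way to do this without looping twice.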
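If your items are already grouped (that is, equal items sit next to each other), the most efficient approach is itertools.groupby:
>>> import itertools
>>> [(key, len(list(group))) for key, group in itertools.groupby(['a', 'a', 'b', 'b', 'b'])]
[('a', 2), ('b', 3)]
Otherwise, take a look at the Counter recipe.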
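If the items are not grouped yet, one further option is to sort them first so that equal items become adjacent. The following is only a minimal sketch: it assumes the items can be compared with each other, and the helper name count_sorted is made up for illustration.

```python
from itertools import groupby


def count_sorted(items):
    """Return (item, count) pairs; sorting first makes equal items adjacent."""
    return [(key, sum(1 for _ in group)) for key, group in groupby(sorted(items))]


print(count_sorted(['b', 'a', 'b', 'a', 'b']))  # [('a', 2), ('b', 3)]
```

Sorting costs O(n log n), so this is mainly worth it when the data has to be sorted anyway.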
On Python 2.7+:
from collections import Counter
items = ['a', 'a', 'b', 'b', 'b']
c = Counter(items)
print(list(c.items()))
The output is:
[('a', 2), ('b', 3)]
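As a small extension, assuming you also want the pairs ordered by frequency, Counter.most_common() returns them sorted from the most common item to the least common. The variable names here are illustrative only.

```python
from collections import Counter

items = ['a', 'a', 'b', 'b', 'b']
counts = Counter(items)        # counts every item in a single pass
pairs = counts.most_common()   # (item, count) pairs, highest count first
print(pairs)  # [('b', 3), ('a', 2)]
```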
>>> mylist = ['a', 'a', 'b', 'b', 'b']
>>> [(i, mylist.count(i)) for i in set(mylist)]
[('a', 2), ('b', 3)]
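If you are willing to use a third-party library, NumPy offers a convenient solution. It is particularly efficient if your list contains only numeric data.
import numpy as np
L = ['a', 'a', 'b', 'b', 'b']
res = list(zip(*np.unique(L, return_counts=True)))
# [('a', 2), ('b', 3)]
To understand the syntax, note that np.unique here returns a tuple of the unique values and their counts:
uniq, counts = np.unique(L, return_counts=True)
print(uniq)    # ['a' 'b']
print(counts)  # [2 3]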
I know this isn't a one-liner... but I like it because it makes clear that we pass over the initial list of values only once (instead of calling count on it).
>>> from collections import defaultdict
>>> l = ['a', 'a', 'b', 'b', 'b']
>>> d = defaultdict(int)
>>> for i in l:
...     d[i] += 1
...
>>> d
defaultdict(<class 'int'>, {'a': 2, 'b': 3})
>>> list(d.items())
[('a', 2), ('b', 3)]
The "old school" way.
>>> alist = ['a', 'a', 'b', 'b', 'b']
>>> d = {}
>>> for i in alist:
...     if i not in d: d[i] = 1  # dict.has_key() was removed in Python 3
...     else: d[i] += 1
...
>>> d
{'a': 2, 'b': 3}
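Another way to do this would be:
mylist = [1, 1, 2, 3, 3, 3, 4, 4, 4, 4]
mydict = {}
for i in mylist:
    if i in mydict: mydict[i] += 1
    else: mydict[i] = 1
Then, to get the list of tuples,
mytups = [(i, mydict[i]) for i in mydict]
This goes over the list only once, but it also has to traverse the dictionary once. However, given that the list contains many duplicates, the dictionary should be much smaller and therefore faster to traverse.
Still, I must admit this code is not very pretty or concise.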
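A slightly shorter variant of the same one-pass idea can be sketched with dict.get, which returns a default value when the key is missing. The variable names follow the snippet above; this is one possible tidy-up, not the only one.

```python
mylist = [1, 1, 2, 3, 3, 3, 4, 4, 4, 4]
mydict = {}
for i in mylist:
    # get() returns 0 for a key that has not been seen yet
    mydict[i] = mydict.get(i, 0) + 1

mytups = list(mydict.items())
print(mytups)  # [(1, 2), (2, 1), (3, 3), (4, 4)]
```

On Python 3.7+ the pairs come out in first-seen order, because dictionaries preserve insertion order.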
A solution without hashing:
from functools import reduce  # reduce is no longer a builtin in Python 3

def lcount(lst):
    return reduce(lambda a, b: a[0:-1] + [(a[-1][0], a[-1][1]+1)] if a and b == a[-1][0] else a + [(b, 1)], lst, [])
>>> lcount([])
[]
>>> lcount(['a'])
[('a', 1)]
>>> lcount(['a', 'a', 'a', 'b', 'b'])
[('a', 3), ('b', 2)]
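Note that this only merges adjacent equal items, so on ungrouped input it counts runs rather than totals. The same logic written as an explicit loop may be easier to follow; this is a sketch, and lcount_loop is a hypothetical name.

```python
def lcount_loop(lst):
    """Count runs of adjacent equal items without hashing them."""
    result = []
    for item in lst:
        if result and result[-1][0] == item:
            # Same item as the previous run: bump its count
            result[-1] = (item, result[-1][1] + 1)
        else:
            # A new run starts here
            result.append((item, 1))
    return result


print(lcount_loop(['a', 'a', 'a', 'b', 'b']))  # [('a', 3), ('b', 2)]
print(lcount_loop(['a', 'b', 'a']))            # [('a', 1), ('b', 1), ('a', 1)]
```

Unlike the reduce version, the loop appends in place instead of building a new list at every step, so it avoids the quadratic copying.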
a. Convert any data structure to a pandas Series s:
Code:
for i in sorted(s.value_counts().unique()):
    print(i, (s.value_counts() == i).sum())
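With the help of pandas, you can do the following:
import pandas as pd
dict(pd.Series(my_list).value_counts())  # top-level pd.value_counts is deprecated in recent pandas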