Question

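This is my first Python program.

Requirement: read a file in which each line consists of {adId userId}. For each adId, print the number of unique userIds.

Here is my code, pieced together from reading the docs. Could you give me feedback on how I can write this in a more Pythonic way?

Code: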

import csv

adDict = {}
reader = csv.reader(open("some.csv"), delimiter=" ")
for row in reader:
    adId = row[0]
    userId = row[1]
    if ( adId in adDict ):
        adDict[adId].add(userId)
    else:
        adDict[adId] = set(userId)

for key, value in adDict.items():
    print (key, ",", len(value))

Thanks.

Answer 1

The line of code:

adDict[adId] = set(userId)

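is unlikely to do what you want: it treats the string userId as a sequence of letters, so if, for example, userId is aleax, you get a set with four items: 'a', 'l', 'e', and 'x'. A later .add(userId), when userId is aleax again, adds a fifth item, the string 'aleax', because .add() (unlike the set initializer, which takes an iterable as its argument) takes a single item as its argument.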

To make a set with a single item, use set([userId]) instead.
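A quick way to see the difference for yourself is the sketch below. The names letters and single are only for illustration.

```python
# set() iterates over its argument, so a string is broken into characters.
letters = set("aleax")      # four items: 'a', 'l', 'e', 'x'
single = set(["aleax"])     # one item: the whole string 'aleax'

# add() takes a single item and stores it as-is, so the set of letters
# now also contains the full string as a fifth item.
letters.add("aleax")

print(len(letters), len(single))
```

The same one-item set can also be written as the literal {"aleax"}.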

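This is a reasonably frequent bug, so I thought it best to explain it clearly. That said, as other answers suggest, defaultdict is clearly the right approach (avoid setdefault: it was never a good design, it does not perform well either, and it is rather murky).

I would also avoid the overkill of csv in favor of a simple loop that splits each line.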
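As a rough sketch of that csv-free loop: the function name count_unique_users and the sample lines are made up for illustration, and skipping malformed lines is my own assumption about how you would want to handle them.

```python
from collections import defaultdict

def count_unique_users(lines):
    """Map each ad ID to the number of distinct user IDs seen for it."""
    ads = defaultdict(set)
    for line in lines:
        parts = line.strip().split()   # strip the newline, split on whitespace
        if len(parts) != 2:
            continue                   # assumption: ignore blank or malformed lines
        ad_id, user_id = parts
        ads[ad_id].add(user_id)
    return {ad_id: len(users) for ad_id, users in ads.items()}

# Hypothetical sample data standing in for the lines of some.csv.
sample = ["ad1 u1\n", "ad1 u2\n", "ad2 u1\n", "ad1 u1\n", "\n"]
counts = count_unique_users(sample)
print(counts)
```

Since an open file is itself an iterable of lines, you could pass the file object straight in instead of a list.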

Answer 2

Congratulations, your code is very nice. There are a few little tricks you could use to make it shorter/simpler.

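The collections module provides something called a defaultdict. Instead of having to check whether adDict already has an adId key, you can set up a defaultdict, which acts like a regular dict except that it automatically gives you an empty set() when the key is missing. So you can change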

if ( adId in adDict ):
    adDict[adId].add(userId)
else:
    adDict[adId] = set(userId)

to simply

adDict[adId].add(userId)

Also, instead of

for row in reader:
    adId = row[0]
    userId = row[1]

you could shorten that to

for adId,userId in reader:

Edit: As Parker points out in the comments,

for key, value in adDict.iteritems():

is the most efficient way to iterate over a dict, if you are going to use both the key and value in the loop. In Python3, you can use

for key, value in adDict.items():

since items() there returns a lazy view rather than building a list.

#!/usr/bin/env python
import csv
from collections import defaultdict

adDict = defaultdict(set)
reader = csv.reader(open("some.csv"), delimiter=" ")
for adId,userId in reader:
    adDict[adId].add(userId)
for key,value in adDict.iteritems():  # Python 2; use adDict.items() on Python 3
    print (key, ",", len(value))
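If the automatic empty set feels like magic, here is a small sketch of what defaultdict(set) does on a missing key. The name ad_users and the IDs are invented for the example.

```python
from collections import defaultdict

ad_users = defaultdict(set)

# "ad1" is missing, so defaultdict calls set() and stores the result
# under that key before .add() runs.
ad_users["ad1"].add("u1")
ad_users["ad1"].add("u1")   # duplicate, the set ignores it
ad_users["ad1"].add("u2")

# Caveat: merely reading a missing key creates it too.
_ = ad_users["ad2"]

print(len(ad_users["ad1"]), "ad2" in ad_users)
```

That last caveat is worth remembering: use the in operator or .get() if you only want to check for a key without creating it.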

Answer 3

You could shorten the for loop to this:

for row in reader:
  adDict.setdefault(row[0], set()).add(row[1])
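To make it clearer what setdefault is doing there, here is a self-contained sketch with made-up rows in place of the csv reader.

```python
ad_users = {}

# Hypothetical rows, shaped like what csv.reader would yield.
rows = [["ad1", "u1"], ["ad1", "u2"], ["ad2", "u1"]]

for row in rows:
    # setdefault returns the existing value for the key, or stores and
    # returns the default when the key is missing. Note that set() is
    # constructed on every call, even when it ends up unused.
    ad_users.setdefault(row[0], set()).add(row[1])

print(ad_users)
```

It works with a plain dict and no imports, at the cost of building a throwaway empty set on each iteration.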

Answer 4

Instead of:

for row in reader:
    adId = row[0]
    userId = row[1]

use automatic sequence unpacking:

for (adId, userId) in reader:

In:

if ( adId in adDict ):

you don't need the parentheses.

Instead of:

if ( adId in adDict ):
    adDict[adId].add(userId)
else:
    adDict[adId] = set(userId)

use defaultdict:

from collections import defaultdict
adDict = defaultdict(set)

# ...

adDict[adId].add(userId)

Or, if your professor doesn't allow you to use other modules, use setdefault():

adDict.setdefault(adId, set()).add(userId)

When printing:

for key, value in adDict.items():
    print (key, ",", len(value))

string formatting may be easier to work with:

print "%s,%s" % (key, len(value))

Or, if you're using Python 3:

print ("{0},{1}".format (key, len(value)))
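For comparison, here are both formatting styles side by side in a sketch that runs on Python 3. The sample dict is invented, and sorting the keys is my own addition so the output order is predictable.

```python
# Hypothetical result of the counting loop.
ad_users = {"ad1": {"u1", "u2"}, "ad2": {"u1"}}

# Percent-style formatting.
percent_lines = ["%s,%d" % (key, len(value))
                 for key, value in sorted(ad_users.items())]

# str.format style.
format_lines = ["{0},{1}".format(key, len(value))
                for key, value in sorted(ad_users.items())]

for line in format_lines:
    print(line)
```

Both produce the same text, so which one you pick is mostly a matter of taste and Python version.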

Answer 5

Since you only have a space-delimited file, I'd do:

from __future__ import with_statement
from collections import defaultdict

ads = defaultdict(set)
with open("some.csv") as f:
    for ad, user in (line.split() for line in f):  # split() also drops the trailing newline
        ads[ad].add(user)

for ad in ads:
    print "%s, %s" % (ad, len(ads[ad]))

Answer 6

There are some great answers here.

One trick I particularly like is to make my code easier to reuse in the future, like so:

import csv

def parse_my_file(file_name):
    # some existing code goes here
    return adDict

if __name__ == "__main__":
    # this gets executed if this .py file is run directly, rather than imported
    adDict = parse_my_file("some.csv")
    for key, value in adDict.items():
        print (key, ",", len(value))

Now you can import your csv parser from another module and get programmatic access to adDict.
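With the placeholder filled in, the whole thing might look like the sketch below. The names parse_ad_file and demo are hypothetical, the temporary file only stands in for some.csv so the example runs anywhere, and skipping rows that don't have exactly two fields is my assumption.

```python
import csv
import os
import tempfile
from collections import defaultdict

def parse_ad_file(file_name):
    """Return a dict mapping each ad ID to the set of user IDs seen for it."""
    ad_users = defaultdict(set)
    with open(file_name, newline="") as f:
        for row in csv.reader(f, delimiter=" "):
            if len(row) != 2:
                continue  # assumption: skip blank or malformed rows
            ad_id, user_id = row
            ad_users[ad_id].add(user_id)
    return ad_users

def demo():
    """Write a small sample file, parse it, and clean up afterwards."""
    with tempfile.NamedTemporaryFile("w", suffix=".csv", delete=False) as tmp:
        tmp.write("ad1 u1\nad1 u2\nad2 u1\nad1 u1\n")
        path = tmp.name
    try:
        return parse_ad_file(path)
    finally:
        os.remove(path)

if __name__ == "__main__":
    # Runs only when the file is executed directly, not when imported.
    for ad_id, users in demo().items():
        print("{0},{1}".format(ad_id, len(users)))
```

Another module could then import parse_ad_file and work with the returned dict without triggering any printing.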

Answer 7

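The only changes I would make are extracting multiple elements from the reader at once and using string formatting for the print statement.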

import csv

adDict = {}
reader = csv.reader(open("some.csv"), delimiter=" ")
# Can extract multiple elements from a list in the iteration statement:
for adId, userId in reader: 
    if ( adId in adDict ):
        adDict[adId].add(userId)
    else:
        adDict[adId] = set([userId])  # wrap in a list so the ID is stored whole

for key, value in adDict.items():
    # I believe this gives you more control over how things are formatted:
    print ("%s, %d" % (key, len(value)))

Answer 8

Just a few bits and pieces:

To extract the row list into variables:

adId, userId = row

The if statement does not need parentheses:

if adId in adDict:

You could also use an exception to handle a missing key in the dict:

try:
    adDict[adId].add(userId)
except KeyError:
    adDict[adId] = set([userId])  # wrap in a list so the ID is stored whole
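Put into a full loop, the exception-based version could look something like this sketch. The rows are made up, and the new set is built from a one-element list so that the ID is stored as a whole string.

```python
ad_users = {}

# Hypothetical (adId, userId) pairs standing in for the csv rows.
rows = [("ad1", "u1"), ("ad1", "u2"), ("ad2", "u1"), ("ad1", "u1")]

for ad_id, user_id in rows:
    try:
        ad_users[ad_id].add(user_id)
    except KeyError:
        # First time this ad ID appears: start a one-item set.
        ad_users[ad_id] = set([user_id])

for ad_id, users in ad_users.items():
    print("%s, %d" % (ad_id, len(users)))
```

This style tends to pay off when most keys already exist, since the except branch only runs once per new key.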
