I am quite new to R, but I really like it and want to keep improving. Now, after searching for a while, I need to ask you for help.
This is the given case:
(1) I have sentences (sentence.1 and sentence.2; all words are already lower-case) and create the sorted frequency lists of their words:
sentence.1 <- "bob buys this car, although his old car is still fine." # saves the sentence into sentence.1
sentence.2 <- "a car can cost you very much per month."
sentence.1.list <- strsplit(sentence.1, "\\W+", perl=TRUE) # (I have the following commands thanks to Stefan Gries) we split the sentence at non-word characters; the backslash must be doubled inside an R string
sentence.2.list <- strsplit(sentence.2, "\\W+", perl=TRUE)
sentence.1.vector <- unlist(sentence.1.list) # then we create a vector of the list
sentence.2.vector <- unlist(sentence.2.list) # vectorizes the list
sentence.1.freq <- table(sentence.1.vector) # and finally we create the frequency lists
sentence.2.freq <- table(sentence.2.vector)
These are the results:
sentence.1.freq:
although bob      buys     car      fine     his      is       old      still    this
1        1        1        2        1        1        1        1        1        1
sentence.2.freq:
a     can   car   cost  month much  per   very  you
1     1     1     1     1     1     1     1     1
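Now, how could I combine these two frequency lists so that I end up with the following (first row for sentence.1, second row for sentence.2)?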
a        although bob      buys     can      car      cost     fine     his      is       month    much     old      per      still    this     very     you
NA       1        1        1        NA       2        NA       1        1        1        NA       NA       1        NA       1        1        NA       NA
1        NA       NA       NA       1        1        1        NA       NA       NA       1        1        NA       1        NA       NA       1        1
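For reference, here is a minimal sketch of one way such a combined table might be built. The helper name combine_freqs is made up for illustration (it is not an existing R function), and the sketch assumes every input is a one-dimensional table like the ones above. It collects the union of all words, sorts it, and fills one NA-padded row per sentence:

```r
# Hypothetical helper (not part of base R): combine word-frequency tables
# into one matrix with a row per table and a column per distinct word.
combine_freqs <- function(...) {
  freqs <- list(...)
  # union of all words across the tables, sorted alphabetically
  words <- sort(unique(unlist(lapply(freqs, names))))
  rows <- lapply(freqs, function(f) {
    out <- setNames(rep(NA_integer_, length(words)), words)
    out[names(f)] <- as.integer(f)  # fill in the counts this sentence has
    out
  })
  do.call(rbind, rows)  # row names come from the argument names
}

sentence.1 <- "bob buys this car, although his old car is still fine."
sentence.2 <- "a car can cost you very much per month."
sentence.1.freq <- table(unlist(strsplit(sentence.1, "\\W+", perl = TRUE)))
sentence.2.freq <- table(unlist(strsplit(sentence.2, "\\W+", perl = TRUE)))

combined <- combine_freqs(sentence.1 = sentence.1.freq,
                          sentence.2 = sentence.2.freq)
print(combined)
```

Words that a sentence does not contain stay NA, which matches the layout shown above.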
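So this "table" should be "flexible": if a new sentence containing, say, the word "and" is entered, the table should add a column labelled "and" between "a" and "although".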
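I thought of simply adding each new sentence as a new row, appending columns for all words that are not yet in the table (here, "and" would end up to the right of "you"), and then sorting the columns again. However, I have not managed this, because I cannot even get the new sentence's word frequencies sorted according to the existing labels. For example, if "car" appears again, its frequency should be written into the new sentence's row under the existing "car" column; but if a word such as "you" appears for the first time, its frequency should be written into the new sentence's row under a new column labelled "you".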
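The append-then-sort idea described here could look roughly like the sketch below. The function add_sentence and its arguments are hypothetical names chosen for illustration, and the sketch assumes the existing table is an integer matrix with words as column names. Unseen words get new NA-filled columns, the new sentence becomes a new row, and the columns are re-sorted at the end:

```r
# Hypothetical helper: add one sentence to an existing frequency matrix.
add_sentence <- function(freq.matrix, sentence, label) {
  new.freq <- table(unlist(strsplit(sentence, "\\W+", perl = TRUE)))
  # words that have no column yet
  new.words <- setdiff(names(new.freq), colnames(freq.matrix))
  # NA-filled columns for those words, one row per existing sentence
  padding <- matrix(NA_integer_, nrow = nrow(freq.matrix),
                    ncol = length(new.words),
                    dimnames = list(rownames(freq.matrix), new.words))
  freq.matrix <- cbind(freq.matrix, padding)
  # the new row: NA everywhere except the words the sentence contains
  new.row <- setNames(rep(NA_integer_, ncol(freq.matrix)),
                      colnames(freq.matrix))
  new.row[names(new.freq)] <- as.integer(new.freq)
  freq.matrix <- rbind(freq.matrix, new.row)
  rownames(freq.matrix)[nrow(freq.matrix)] <- label
  # re-sort the columns alphabetically
  freq.matrix[, sort(colnames(freq.matrix)), drop = FALSE]
}

# Small made-up starting table, just to keep the example short
freq.matrix <- rbind(sentence.1 = c(a = NA, car = 2L, you = NA),
                     sentence.2 = c(a = 1L, car = 1L, you = 1L))
updated <- add_sentence(freq.matrix, "bob and a car", "sentence.3")
print(updated)
```

Matching by column name rather than by position is what lets "car" land in its existing column while "and" and "bob" get new ones.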