Question

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.

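Questions asking for code must demonstrate a minimal understanding of the problem being solved. Include attempted solutions, why they didn't work, and the expected results. See also: Stack Overflow question checklist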

Closed 9 years ago.


I have a folder of .txt files that I want to store in a hash. I then want to compare each file against an array of specific words and count the number of times each of those words occurs.

Answer 1

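Note that I use \p{Alpha} because it technically defines what counts as a word here. You can tweak the pattern: add digits, make sure a word starts with an alpha character, or whatever else you may need.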

Also note that for files with one word per line, the regex is overkill and you should drop it. Just chomp each line and use it as the word.



use 5.010; # for say
use strict;
use warnings;
use Data::Dumper;    # needed for the Dump call at the end

my ( %hash );

sub load_words { 
    @hash{ @_ } = ( 0 ) x @_; return; 
}

sub count_words {
    $hash{$_}++ foreach grep { exists $hash{$_} } @_;
}


my $word_regex
    = qr{ (                # start a capture
            \p{Alpha}+     # any sequence of one or more alpha characters
            (?:            # begin grouping of
                ['-]       # allow hyphenated words and contractions
                \p{Alpha}+ # which must be followed by an alpha
            )*             # any number of times
            (?: (?<=s) ' )?  # case for plural possessives (ht: tchrist)
          )                # end capture
        }x;

# load @ARGV to do <> processing
@ARGV = qw( list of files I take words from );
while ( <> ) {
    load_words( m/$word_regex/g );
}
@ARGV = qw( list of files where I count words );
while ( <> ) { 
    count_words( m/$word_regex/g );
}

# take a look at the hash
say Data::Dumper->Dump( [ \%hash ], [ '*hash' ] );
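
For the one-word-per-line case mentioned above, the regex can be skipped entirely. The following is only a minimal sketch under that assumption: the helper name load_word_list and the sample words are made up for illustration, and an in-memory filehandle stands in for a real word-list file.

```perl
use strict;
use warnings;

# Hypothetical helper: read one word per line from an open filehandle
# and return a hash of word => 0, ready to be used as counters.
sub load_word_list {
    my ($fh) = @_;
    my %count;
    while ( my $line = <$fh> ) {
        chomp $line;                 # strip the trailing newline
        next unless length $line;    # skip empty lines
        $count{$line} = 0;
    }
    return %count;
}

# Sample data in an in-memory filehandle (an assumption for the demo).
my $list = "kill\nassault\ndrug\n";
open my $fh, '<', \$list or die "cannot open in-memory file: $!";
my %wanted = load_word_list($fh);
close $fh;

print join( ', ', sort keys %wanted ), "\n";
```

With a real file, the in-memory open would be replaced by a normal three-argument open on the file's path; the rest stays the same.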

Answer 2

I'm not going to write the code for you, but you could do something like the following:

Loop all the files (see glob())
Loop all the words in each file (maybe with a regular expression or split()?)
Check each word against a hash of wanted words. If it's there, increment a "counter" hash value like so: $hash{ $word }++. Alternatively, you could store all the words in a hash and then grab the ones you want afterwards.

Or ... there are many ways to do this.

If your files are huge, you will have to do it a different way.
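
The steps listed above could be sketched roughly as follows. This is one possible reading, not the answerer's code: the helper name count_wanted, the directory path, and the word list are placeholders, and the choices to split on non-word characters and to compare case-insensitively are assumptions.

```perl
use strict;
use warnings;

# Hypothetical helper: count how often each wanted word occurs in the
# given files. Words are compared case-insensitively (an assumption).
sub count_wanted {
    my ( $wanted, @files ) = @_;
    my %count;
    $count{ lc $_ } = 0 for @$wanted;    # start every counter at zero
    for my $file (@files) {
        open my $fh, '<', $file or die "Can't open $file: $!";
        while ( my $line = <$fh> ) {     # read line by line, not all at once
            for my $word ( split /\W+/, lc $line ) {
                $count{$word}++ if exists $count{$word};
            }
        }
        close $fh;
    }
    return \%count;
}

# Usage: the directory and the word list are placeholders.
my @files = glob('/home/directory/*.txt');
my $count = count_wanted( [qw(kill assault drug)], @files );
printf "%-10s %d\n", $_, $count->{$_} for sort keys %$count;
```

Splitting on \W+ breaks contractions such as "don't" into two pieces, so a word regex may be a better fit if that matters. Because each file is read line by line, memory use stays small, though this sketch does not settle what the right approach for truly huge files would be.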

Answer 3

So I did it using an array of the specific words I wanted to look for.

#!/usr/bin/perl
use strict;
use warnings;
my @words;    # not used yet

# Patterns to search for; each one expects a space before the word.
my @triggers = (" [kK]ill", " [Aa]ssault", " [rR]ap[ie]", " [dD]rug");
my %hash;     # not used yet

# Placeholder: counting is not implemented yet.
sub count_words {
    print "\n";
}

# Not used below.
my $word_regex
    = qr{ (                # start a capture
            \p{Alpha}+     # any sequence of one or more alpha characters
            (?:            # begin grouping of
                ['-]       # allow hyphenated words and contractions
                \p{Alpha}+ # which must be followed by an alpha
            )*             # any number of times
          )                # end capture
        }x;

my @files;
my $dirname = "/home/directory";
opendir(my $dh, $dirname) or die "can't opendir $dirname: $!";
while (defined(my $file = readdir($dh))) {
    next unless -f "$dirname/$file";    # skip ".", ".." and subdirectories
    push @files, "$dirname/$file";
}
closedir($dh);
my @interestingfiles;

foreach my $file (@files) {

    open(my $fh, '<', $file) or die "Can't open $file: $!";

    while (my $line = <$fh>) {
        foreach my $trigger (@triggers) {
            # A file is pushed once per matching line and trigger,
            # so it can appear in the output more than once.
            if ($line =~ /$trigger/) {
                push @interestingfiles, "$file\n";
            }
        }
    }
    close $fh;
}
print @interestingfiles;
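
The script above only lists files that contain a trigger. To get the counts the question asks for, something along these lines might work. It is a sketch, not part of the original answer: the helper name count_triggers and the sample text are invented, and the patterns use \b (word boundary) instead of a leading space, which is a swapped-in assumption so that words at the start of a line also match.

```perl
use strict;
use warnings;

# Hypothetical helper: count matches of each pattern across the lines
# of an open filehandle and return a hash of pattern => count.
sub count_triggers {
    my ( $fh, @patterns ) = @_;
    my %hits;
    $hits{$_} = 0 for @patterns;
    while ( my $line = <$fh> ) {
        for my $pattern (@patterns) {
            # A /g match in list context returns every match; assigning
            # that list to () and then to a scalar yields the count.
            my $n = () = $line =~ /$pattern/g;
            $hits{$pattern} += $n;
        }
    }
    return %hits;
}

# Demo on in-memory text (sample data is an assumption).
my $text = "They kill. Drug and drug use.\nNothing on this line.\n";
open my $fh, '<', \$text or die "cannot open in-memory file: $!";
my %hits = count_triggers( $fh, '\b[kK]ill', '\b[dD]rug' );
close $fh;

print "$_: $hits{$_}\n" for sort keys %hits;
```

Calling the helper once per file and storing the result under the file name would give a per-file breakdown.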
