Put .txt files into a hash and compare with an array of words using Perl [closed]

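Questions asking for code must demonstrate a minimal understanding of the problem being solved. Include attempted solutions, why they didn't work, and the expected results. See also: the question checklist.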

Closed 9 years ago.

I have a folder of .txt files which I want to store in a hash. I then want to compare the files against an array of specific words, counting the number of times each of those words occurs.

Answers

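Note that I use \p{Alpha} because that is what technically defines a word. You can tweak it: add digits, make sure an alpha character appears at the start, or whatever else you might need.

Also note that for files with one word per line, the regex is overkill and you should omit it. Just chomp each line and load it.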

use 5.010; # for say
use strict;
use warnings;
use Data::Dumper; # for the Dump call at the end

my ( %hash );

sub load_words { 
    @hash{ @_ } = ( 0 ) x @_; return; 
}

sub count_words {
    $hash{$_}++ foreach grep { exists $hash{$_} } @_;
}


my $word_regex
    = qr{ (                # start a capture
            \p{Alpha}+     # any sequence of one or more alpha characters
            (?:            # begin grouping of
                ['-]         # allow hyphenated words and contractions
                \p{Alpha}+   # which must be followed by an alpha
            )*             # any number of times
            (?: (?<=s) ' )?  # case for plural possessives (ht: tchrist)
          )                # end capture
        }x;

# load @ARGV to do <> processing
@ARGV = qw( list of files I take words from );
while ( <> ) {
    load_words( m/$word_regex/g );
}
@ARGV = qw( list of files where I count words );
while ( <> ) { 
    count_words( m/$word_regex/g );
}

# take a look at the hash
say Data::Dumper->Dump( [ \%hash ], [ '*hash' ] );
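
For anyone who wants to try the idea without real files, here is a minimal sketch of the same load-then-count approach. It assumes a word list with one word per line (so chomp is enough, as noted above) and uses in-memory filehandles; the sample strings and the %count name are made up for illustration, and matching is case-sensitive.

```perl
use strict;
use warnings;

my %count;

# Hypothetical sample data standing in for real files.
my $word_list = "kill\nassault\ndrug\n";
my $text      = "The drug raid led to an assault charge.\nNo drug was found.\n";

# One word per line: chomp is enough, no regex needed.
open( my $list_fh, '<', \$word_list ) or die "Cannot open word list: $!";
while ( my $word = <$list_fh> ) {
    chomp $word;
    next unless length $word;    # ignore blank lines
    $count{$word} = 0;
}
close $list_fh;

# Free text: extract words with the regex and count only the wanted ones.
open( my $text_fh, '<', \$text ) or die "Cannot open text: $!";
while ( my $line = <$text_fh> ) {
    for my $word ( $line =~ /(\p{Alpha}+(?:['-]\p{Alpha}+)*)/g ) {
        $count{$word}++ if exists $count{$word};
    }
}
close $text_fh;

print "$_: $count{$_}\n" for sort keys %count;
```

With the sample data this prints assault: 1, drug: 2 and kill: 0. Swapping the in-memory handles for real file names is the only change needed for actual files.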

I won't write the code for you, but you can do something like the following:

  1. Loop all the files (see glob())
  2. Loop all the words in each file (maybe with a regular expression or split()?)
  3. Check each word against a hash of wanted words. If it's there, increment a "counter" hash value like this: $hash{ $word }++. OR you could store all the words in a hash and then grab the ones you want afterwards.
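
The three steps above could be sketched roughly like this. It is only one possible reading, not a definitive implementation: count_wanted is a hypothetical helper name, the two temporary files and their contents are invented so the example runs on its own, and lowercasing each line is an assumption about how you want case handled.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Count how often each wanted word appears across a list of files.
sub count_wanted {
    my ( $wanted, @files ) = @_;
    my %count = map { $_ => 0 } @$wanted;
    for my $file (@files) {
        open( my $fh, '<', $file ) or die "Cannot open $file: $!";
        while ( my $line = <$fh> ) {
            # split on runs of non-letters; lc makes the lookup case-insensitive
            for my $word ( split /\P{Alpha}+/, lc $line ) {
                $count{$word}++ if exists $count{$word};
            }
        }
        close $fh;
    }
    return \%count;
}

# Demo with two throwaway files (hypothetical contents).
my $dir    = tempdir( CLEANUP => 1 );
my %sample = (
    'a.txt' => "Drug charges were filed.\nThe assault happened later.\n",
    'b.txt' => "A drug test and another drug test.\n",
);
for my $name ( keys %sample ) {
    open( my $out, '>', "$dir/$name" ) or die "Cannot write $dir/$name: $!";
    print {$out} $sample{$name};
    close $out;
}

# Step 1: glob() finds the files; steps 2 and 3 happen inside count_wanted.
my $count = count_wanted( [qw(drug assault kill)], glob("$dir/*.txt") );
print "$_: $count->{$_}\n" for sort keys %$count;
```

Because each file is read one line at a time, nothing larger than a single line is ever held in memory.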

OR ... there are many ways to do this.

If your files are huge, you will have to do it another way.

So, I did it with an array of the specific words I want to look for.

#!/usr/bin/perl
use strict;
use warnings;
my @words;

# patterns to look for; each one expects a space before the word
my @triggers = (" [kK]ill", " [Aa]ssault", " [rR]ap[ie]", " [dD]rug");
my %hash;

sub count_words {
    print "\n";
}

my $word_regex
    = qr{ (                # start a capture
            \p{Alpha}+     # any sequence of one or more alpha characters
            (?:            # begin grouping of
                ['-]         # allow hyphenated words and contractions
                \p{Alpha}+   # which must be followed by an alpha
            )*             # any number of times
          )                # end capture
        }x;

my @files;
my $dirname = "/home/directory";
opendir(my $dh, $dirname) or die "can't opendir $dirname: $!";
while (defined(my $file = readdir($dh))) {
    next unless -f "$dirname/$file";    # skip ".", ".." and subdirectories
    push @files, "$dirname/$file";
}
closedir($dh);
my @interestingfiles;

foreach my $file (@files) {

    open(my $fh, '<', $file) or die "Cannot open $file: $!";

    # read line by line and record the file name for every trigger that matches
    while (my $line = <$fh>) {
        foreach my $trigger (@triggers) {
            if ($line =~ /$trigger/) {
                push @interestingfiles, "$file\n";
            }
        }
    }
    close($fh);
}
print @interestingfiles;
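
The script above records which files matched, but the question also asked how many times each word occurs. A rough sketch of counting matches per trigger follows. The %trigger table, the count_triggers name and the sample sentence are all assumptions for illustration; \b and /i are swapped in for the leading space and the hand-written [kK]-style classes, so the matching rules differ slightly from the script above.

```perl
use strict;
use warnings;

# Hypothetical trigger patterns; \b anchors each one at the start of a word
# and /i makes the match case-insensitive.
my %trigger = (
    kill    => qr/\bkill/i,
    assault => qr/\bassault/i,
    drug    => qr/\bdrug/i,
);

# Return a hash of trigger name => number of matches in the given text.
sub count_triggers {
    my ($text) = @_;
    my %count;
    for my $name ( keys %trigger ) {
        # a list-context match with /g returns every match; assigning that
        # list to () in scalar context yields how many matches there were
        $count{$name} = () = $text =~ /$trigger{$name}/g;
    }
    return \%count;
}

my $sample = "Drug dealers were arrested. The drugs were seized after an assault.\n";
my $count  = count_triggers($sample);
print "$_: $count->{$_}\n" for sort keys %$count;
```

For the sample sentence this prints assault: 1, drug: 2 and kill: 0. Calling count_triggers on each file's contents (or on each line, summing as you go) gives per-file totals.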



