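I have a 300 GB file made up of records like the ones shown below. From those records, I only need the lines that start with ">miR".
I wrote a Perl program that prints the output I want, but when I run the same code on the large file (up to 300 GB of lines like those shown below), the process gets killed. How can I make this program work on a file that size? Is there another way to do this within the same code?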
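#!/usr/bin/perl -w
$len=@ARGV;
if($len == 0){
    print "Give file\n";
    exit;
}
$file=$ARGV[0];
open(FH,$file) || die "can't open file\n";
# reads every line of the file into @lines
@lines=<FH>;
close FH;
while ($line=<FH>){
    chomp $line;
    if ($line =~ /^>miR/){
        $_=$line;
        # strip the leading ">" and the comma after the target name
        s/>//g && s/,//g;
        print "$_\n";
        # eleven whitespace-separated fields; print fields 1, 2, 7 and 3
        if($_=~ /(\S+)\s(\S+)\s(\S+)\s(\S+)\s(\S+)\s(\S+)\s(\S+)\s(\S+)\s(\S+)\s(\S+)\s(\S+)/){
            print $1," ",$2," ",$7," ",$3,"\n";
        }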
..
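The kill is most likely caused by the line `@lines=<FH>;`, which loads the entire file into memory before the loop starts. It also leaves nothing for the `while` loop to read, since the handle is closed right after. Below is a minimal sketch of a streaming version that reads one line at a time, so memory use does not grow with file size. The sub names `extract_hit` and `process_file` are hypothetical, and the sketch assumes each ">miR" line has whitespace-separated fields as in the sample below. Unlike the original regex, it accepts any run of whitespace between fields.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: split one ">miR..." score line into its fields.
# Returns an empty list for any line that is not a ">miR" line.
sub extract_hit {
    my ($line) = @_;
    return unless $line =~ /^>miR/;
    $line =~ s/\s+\z//;    # drop the trailing newline and stray whitespace
    $line =~ tr/>,//d;     # same effect as s/>//g and s/,//g
    return split ' ', $line;
}

# Hypothetical helper: stream the file one line at a time instead of
# slurping it into an array.
sub process_file {
    my ($path, $out) = @_;
    open(my $fh, '<', $path) or die "can't open $path: $!\n";
    while (my $line = <$fh>) {
        my @f = extract_hit($line) or next;
        print {$out} "@f\n";
        # Same columns as the original print: fields 1, 2, 7 and 3.
        print {$out} join(' ', @f[0, 1, 6, 2]), "\n" if @f >= 11;
    }
    close($fh);
}

process_file($ARGV[0], \*STDOUT) if @ARGV;
```

Saved under an assumed name such as `filter_hits.pl`, it could be run as `perl filter_hits.pl bigfile.txt > hits.txt`. The only change that matters for memory is reading with `while (my $line = <$fh>)` and never assigning the handle to an array.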
Forward: Score: 124.000000 Q:2 to 18 R:1 to 20 Align Len (17) (64.71%) (82.35%)
Query: 3 gaauAUUCGUUAG-AAUGGUAa 5
|:: :|||| || ||||
Ref: 5 --ctTGGTTAATCATTCCCATt 3
Energy: -10.480000 kCal/Mol
Scores for this hit:
>miR844a AT2G33810, 124.00 -10.48 2 18 1 20 17 64.71% 82.35%
Forward: Score: 120.000000 Q:2 to 19 R:289 to 308 Align Len (17) (64.71%) (76.47%)
Query: 3 gaaUAUUCGUUAGAAUGGUAa 5
||::| || || ||||
Ref: 5 ttgATGGG-AAAATTTCCATt 3
Energy: -9.850000 kCal/Mol
Scores for this hit:
>miR844a AT2G33810, 120.00 -9.85 2 19 289 308 17 64.71% 76.47%
Forward: Score: 118.000000 Q:2 to 19 R:483 to 503 Align Len (17) (64.71%) (82.35%)
Query: 3 gaaUAUUCGUUAGAAUGGUAa 5
:||: |||| ||:|||
Ref: 5 gggGTAGAAAATCATATCATa 3