How do I use regex in grep to match multiple lines and only get the last matched set?

I have some statistics that look like this:

2023-01-01 01:00:00 TOTAL MEMORY ALLOCATION CONSUMPTION:
2023-01-01 01:00:00 COMPONENT | USAGE (%)
2023-01-01 01:00:00 class.zzz.aaa.bbb | 32
2023-01-01 01:00:00 class.fff.aaa.ggg | 20
2023-01-01 01:00:00 TOTAL: 52% out of 100% allocated memory consumed
2023-01-01 01:00:00 TOTAL MEMORY ALLOCATION CONSUMPTION:
2023-01-02 01:00:00 COMPONENT | USAGE (%)
2023-01-02 01:00:00 class.xxx.aaa.bbb | 42
2023-01-02 01:00:00 class.bbb.aaa.zzz | 10
2023-01-02 01:00:00 class.zzz.xxx | 21
2023-01-02 01:00:00 class.xxx.sss.ggg | 5
2023-01-02 01:00:00 TOTAL: 78% out of 100% allocated memory consumed
2023-01-01 01:00:00 TOTAL MEMORY ALLOCATION CONSUMPTION:
2023-01-03 01:00:00 COMPONENT | USAGE (%)
2023-01-03 01:00:00 class.xxx.yyy.zzz | 10
2023-01-03 01:00:00 class.xxx.zzz.aaa | 20
2023-01-03 01:00:00 class.zzz.aaa.bbb | 30
2023-01-03 01:00:00 TOTAL: 60% out of 100% allocated memory consumed

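I want to extract the last set of statistics (in the example above, the last 6 lines). As you can see, the number of lines in each section can vary, but the first and last lines stay the same. I was thinking of using: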

  • "TOTAL" as an anchor point to grab the first and the last line of the wanted block of text
  • (?s) mode to match all lines in between those two

I ended up with the regex `(?m)^.*?TOTAL(?s).*?(?m)TOTAL.*?$` and used it in grep with `-P` to get the expected output (I am not very familiar with the `-E` regex extension).

tac con.log | grep -Po "(?m)^.*?TOTAL(?s).*?(?m)TOTAL.*?$" -m1 | tac

This produced the correct output:

2023-01-01 01:00:00 TOTAL MEMORY ALLOCATION CONSUMPTION:
2023-01-03 01:00:00 COMPONENT | USAGE (%)
2023-01-03 01:00:00 class.xxx.yyy.zzz | 10
2023-01-03 01:00:00 class.xxx.zzz.aaa | 20
2023-01-03 01:00:00 class.zzz.aaa.bbb | 30
2023-01-03 01:00:00 TOTAL: 60% out of 100% allocated memory consumed

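However, that was in my test environment, which uses an old grep version (`2.5.3`). When I tested it on other machines running Rocky Linux 9 with grep `3.6`, I got no results at all. Since the regex also works on regex101.com, I think this may be a nuance of grep. Do these newer grep versions need something special for this regex to work, or is there another way to get this result? (In the end it will be used in a bash script.)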

Answers

With Perl, one way:

perl -0777 -wnE '$r = $1 while /(^[0-9\s:-]+TOTAL.+? TOTAL.+?$)/smxg; say $r' file

or:

perl -0777 -wnE 'say for /.*( ^[0-9\s:-]+ TOTAL.+? TOTAL.+?$ )/smxg' file

This does capture and assign all such records, or matches the whole file, until it gets to the last one, but one has to go over the file; the approach from the question makes three passes over the file. We can process backwards if performance is an issue, like here for example. See the performance effect here.

Altogether I'd recommend a short script instead.

Not sure why grep does what you show; I'd imagine that the above regex should work, even slightly simplified using grep's conventions.


In the question as originally posted by the OP there was a perl tag.

Or simply:

echo '
2023-01-01 01:00:00 TOTAL MEMORY ALLOCATION CONSUMPTION:
2023-01-01 01:00:00 COMPONENT | USAGE (%)
2023-01-01 01:00:00 class.zzz.aaa.bbb | 32
2023-01-01 01:00:00 class.fff.aaa.ggg | 20
2023-01-01 01:00:00 TOTAL: 52% out of 100% allocated memory consumed
2023-01-01 01:00:00 TOTAL MEMORY ALLOCATION CONSUMPTION:
2023-01-02 01:00:00 COMPONENT | USAGE (%)
2023-01-02 01:00:00 class.xxx.aaa.bbb | 42
2023-01-02 01:00:00 class.bbb.aaa.zzz | 10
2023-01-02 01:00:00 class.zzz.xxx | 21
2023-01-02 01:00:00 class.xxx.sss.ggg | 5
2023-01-02 01:00:00 TOTAL: 78% out of 100% allocated memory consumed
2023-01-01 01:00:00 TOTAL MEMORY ALLOCATION CONSUMPTION:
2023-01-03 01:00:00 COMPONENT | USAGE (%)
2023-01-03 01:00:00 class.xxx.yyy.zzz | 10
2023-01-03 01:00:00 class.xxx.zzz.aaa | 20
2023-01-03 01:00:00 class.zzz.aaa.bbb | 30
2023-01-03 01:00:00 TOTAL: 60% out of 100% allocated memory consumed' |

mawk 'BEGIN { RS = ORS = " consumed\n" } END { print }'

or even:

gawk 'BEGIN { RS = (ORS = FS = " consumed\n") "$" } $0 = $NF'

2023-01-01 01:00:00 TOTAL MEMORY ALLOCATION CONSUMPTION:
2023-01-03 01:00:00 COMPONENT | USAGE (%)
2023-01-03 01:00:00 class.xxx.yyy.zzz | 10
2023-01-03 01:00:00 class.xxx.zzz.aaa | 20
2023-01-03 01:00:00 class.zzz.aaa.bbb | 30
2023-01-03 01:00:00 TOTAL: 60% out of 100% allocated memory consumed

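With GNU grep:

grep -zPo '(?s).*\n\K.*TOTAL .*?TOTAL:.*?\n' con.log

This works with 3.7. It mostly works with 2.20 (which appends an extra newline). It is likely to be inefficient for large input files.

I suspect the reason your original regex fails in grep is that grep applies the regex to each input line in turn. A regex that tries to match multiple lines in one go will therefore always fail.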
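The per-line behaviour described above can be checked with a small sketch. It assumes a recent GNU grep built with PCRE support (`-P`); the two-line sample and the variable names are made up for illustration, and older versions such as the 2.5.3 mentioned in the question may behave differently.

```shell
# Hypothetical two-line sample; the pattern must see both lines at once to match.
sample=$(printf 'x TOTAL start\nx TOTAL: end\n')

# Default mode: grep tests each line on its own, so a pattern that has to
# span a newline cannot match, even with (?s).
if printf '%s\n' "$sample" | grep -Pq '(?s)TOTAL start.*TOTAL: end'; then
  per_line=match
else
  per_line=no-match
fi

# With -z the record separator is NUL, so the whole input is a single record
# and the same pattern can cross the newline.
if printf '%s\n' "$sample" | grep -zPq '(?s)TOTAL start.*TOTAL: end'; then
  whole_input=match
else
  whole_input=no-match
fi

echo "per line: $per_line, with -z: $whole_input"
```

The only difference between the two calls is `-z`, which is why the command above needs it.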


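To avoid reading the entire file:

tac con.log | awk 's+=($3~/^TOTAL:?$/); s>1{exit}' | tac

`s` starts as zero/false. It is incremented every time a start or end line is found. If it is non-zero, the line is printed (the rule has no action, so the default applies). When both the start and end lines have been matched (s==2), we quit.

Only the lines belonging to the record are output. Unrelated data between statistics records is allowed.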
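As a sketch of how the counter behaves when unrelated lines are present, the same pipeline can be wrapped in a function and fed a short sample. The function name `last_record` and the sample log are hypothetical, and `tac` is assumed to be available (GNU coreutils).

```shell
# Hypothetical helper: print the last statistics record read from stdin.
last_record() {
  tac | awk '
    # s counts boundary lines seen so far (reading bottom-up):
    # the "TOTAL:" footer makes it 1, the "TOTAL" header makes it 2.
    # A non-zero result of the assignment prints the line.
    s += ($3 ~ /^TOTAL:?$/)
    # Both boundaries seen: stop reading.
    s > 1 { exit }
  ' | tac
}

# Made-up sample with unrelated lines between and after the records.
sample='2023-01-01 01:00:00 TOTAL MEMORY ALLOCATION CONSUMPTION:
2023-01-01 01:00:00 class.a | 32
2023-01-01 01:00:00 TOTAL: 32% out of 100% allocated memory consumed
2023-01-01 02:00:00 unrelated message
2023-01-02 01:00:00 TOTAL MEMORY ALLOCATION CONSUMPTION:
2023-01-02 01:00:00 class.b | 42
2023-01-02 01:00:00 TOTAL: 42% out of 100% allocated memory consumed
2023-01-02 02:00:00 unrelated message'

printf '%s\n' "$sample" | last_record
```

The trailing unrelated line is skipped because `s` is still zero when it is read, and the earlier record is never reached because awk exits at the header.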

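If the file can end with a partial record (which should be ignored), then:

tac con.log | awk '
    !s && $3=="TOTAL:" { s=1 }
    s;
    s && $3=="TOTAL" { exit }
' | tac

If there is no unrelated data in the log (it is a complete list of statistics records), only the termination condition needs to be tested:

tac con.log | awk '1; $3=="TOTAL"{exit}' | tac

Assuming the number of output lines will not exceed some threshold (say, a thousand), there is also a straightforward tac and (GNU) grep solution, which works with or without a partial final record:

tac con.log |
grep -A1000 -m1 'TOTAL:' |
grep -B1000 -m1 'TOTAL ' |
tac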

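The key observation about your data is that you want the data from the last occurrence of `TOTAL MEMORY ALLOCATION CONSUMPTION`. You can use a greedy match to achieve that.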

use strict;
use warnings;


my $data = <<EOM;
2023-01-01 01:00:00 TOTAL MEMORY ALLOCATION CONSUMPTION:
2023-01-01 01:00:00 COMPONENT | USAGE (%)
2023-01-01 01:00:00 class.zzz.aaa.bbb | 32
2023-01-01 01:00:00 class.fff.aaa.ggg | 20
2023-01-01 01:00:00 TOTAL: 52% out of 100% allocated memory consumed
2023-01-01 01:00:00 TOTAL MEMORY ALLOCATION CONSUMPTION:
2023-01-02 01:00:00 COMPONENT | USAGE (%)
2023-01-02 01:00:00 class.xxx.aaa.bbb | 42
2023-01-02 01:00:00 class.bbb.aaa.zzz | 10
2023-01-02 01:00:00 class.zzz.xxx | 21
2023-01-02 01:00:00 class.xxx.sss.ggg | 5
2023-01-02 01:00:00 TOTAL: 78% out of 100% allocated memory consumed
2023-01-01 01:00:00 TOTAL MEMORY ALLOCATION CONSUMPTION:
2023-01-03 01:00:00 COMPONENT | USAGE (%)
2023-01-03 01:00:00 class.xxx.yyy.zzz | 10
2023-01-03 01:00:00 class.xxx.zzz.aaa | 20
2023-01-03 01:00:00 class.zzz.aaa.bbb | 30
2023-01-03 01:00:00 TOTAL: 60% out of 100% allocated memory consumed
EOM

$data =~ s/.+           # Do a greedy match
           (?=          # positive lookahead (zero-width, consumes nothing)
              ^         #     Start of a line
              .+?       #     non-greedy match
              TOTAL\sMEMORY\sALLOCATION\sCONSUMPTION # literal string
            )           # end of lookahead
            //smx; # s: . matches newline, m: ^ matches at line starts, x: allow comments

print $data;

Running it gives:

$ perl try.pl 
2023-01-01 01:00:00 TOTAL MEMORY ALLOCATION CONSUMPTION:
2023-01-03 01:00:00 COMPONENT | USAGE (%)
2023-01-03 01:00:00 class.xxx.yyy.zzz | 10
2023-01-03 01:00:00 class.xxx.zzz.aaa | 20
2023-01-03 01:00:00 class.zzz.aaa.bbb | 30
2023-01-03 01:00:00 TOTAL: 60% out of 100% allocated memory consumed

This can all be collapsed into a one-liner:

cat your_data | perl -0777 -pe 's/.+(?=^.+?TOTAL\sMEMORY ALLOCATION CONSUMPTION)//sm'

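Simply put, Awk:

awk '/TOTAL MEMORY/ { p=$0; next }
  p { p = p ORS $0 }
  /TOTAL:/ { result=p; p="" }
  END { print result }' file

This is a simple state machine in which we collect all the lines of the current section and, at the end, print the last section collected.

In more detail, Awk runs the script against each input line (or, more generally, each input record) in turn. When we see a header line, we start collecting into `p` and skip the rest of the script for that line. On subsequent lines, as long as `p` is non-empty, we append the line, separated by `ORS`, the output record separator (a newline by default). When we reach an input line that matches `TOTAL:`, we stop collecting and copy what is currently in `p` to `result`. Finally, the `END` block runs after the input stream has ended, and we print whatever we last saved in `result`.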
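One consequence of that logic is that a truncated final section never reaches `result`, because `result` is only updated on a `TOTAL:` line. The following is a minimal sketch of this, using the same Awk script wrapped in a hypothetical function `last_complete_block` and a made-up sample whose third block is cut off.

```shell
# Hypothetical helper wrapping the Awk state machine; reads the log from stdin.
last_complete_block() {
  awk '
    /TOTAL MEMORY/ { p = $0; next }         # header: start a new collection
    p              { p = p ORS $0 }         # inside a block: append the line
    /TOTAL:/       { result = p; p = "" }   # footer: save the finished block
    END            { print result }         # print the last finished block
  '
}

# Made-up sample: two complete blocks, then a third one without its footer.
sample='2023-01-01 01:00:00 TOTAL MEMORY ALLOCATION CONSUMPTION:
2023-01-01 01:00:00 class.a | 32
2023-01-01 01:00:00 TOTAL: 32% out of 100% allocated memory consumed
2023-01-02 01:00:00 TOTAL MEMORY ALLOCATION CONSUMPTION:
2023-01-02 01:00:00 class.b | 42
2023-01-02 01:00:00 TOTAL: 42% out of 100% allocated memory consumed
2023-01-03 01:00:00 TOTAL MEMORY ALLOCATION CONSUMPTION:
2023-01-03 01:00:00 class.c | 10'

printf '%s\n' "$sample" | last_complete_block
```

With this sample the second block is printed: the third block is still sitting in `p` when the input ends and is never copied to `result`.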

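Besides being portable all the way back to the original AT&T Unix, this is also easy to understand and modify; the regular expressions are trivial, and the overall logic is reasonably simple and obvious.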

Using any awk in any shell on every Unix box:

$ awk '/TOTAL /{rec=$0; next} {rec=rec ORS $0} END{print rec}' file
2023-01-01 01:00:00 TOTAL MEMORY ALLOCATION CONSUMPTION:
2023-01-03 01:00:00 COMPONENT | USAGE (%)
2023-01-03 01:00:00 class.xxx.yyy.zzz | 10
2023-01-03 01:00:00 class.xxx.zzz.aaa | 20
2023-01-03 01:00:00 class.zzz.aaa.bbb | 30
2023-01-03 01:00:00 TOTAL: 60% out of 100% allocated memory consumed

This also does the job:

$ grep TOTAL: file -B6 | tail -n6
2023-01-01 01:00:00 TOTAL MEMORY ALLOCATION CONSUMPTION:
2023-01-03 01:00:00 COMPONENT | USAGE (%)
2023-01-03 01:00:00 class.xxx.yyy.zzz | 10
2023-01-03 01:00:00 class.xxx.zzz.aaa | 20
2023-01-03 01:00:00 class.zzz.aaa.bbb | 30
2023-01-03 01:00:00 TOTAL: 60% out of 100% allocated memory consumed



