Question

I have a log with statistics in the following format:

2023-01-01 01:00:00 TOTAL MEMORY ALLOCATION CONSUMPTION:
2023-01-01 01:00:00 COMPONENT | USAGE (%)
2023-01-01 01:00:00 class.zzz.aaa.bbb | 32
2023-01-01 01:00:00 class.fff.aaa.ggg | 20
2023-01-01 01:00:00 TOTAL: 52% out of 100% allocated memory consumed
2023-01-01 01:00:00 TOTAL MEMORY ALLOCATION CONSUMPTION:
2023-01-02 01:00:00 COMPONENT | USAGE (%)
2023-01-02 01:00:00 class.xxx.aaa.bbb | 42
2023-01-02 01:00:00 class.bbb.aaa.zzz | 10
2023-01-02 01:00:00 class.zzz.xxx | 21
2023-01-02 01:00:00 class.xxx.sss.ggg | 5
2023-01-02 01:00:00 TOTAL: 78% out of 100% allocated memory consumed
2023-01-01 01:00:00 TOTAL MEMORY ALLOCATION CONSUMPTION:
2023-01-03 01:00:00 COMPONENT | USAGE (%)
2023-01-03 01:00:00 class.xxx.yyy.zzz | 10
2023-01-03 01:00:00 class.xxx.zzz.aaa | 20
2023-01-03 01:00:00 class.zzz.aaa.bbb | 30
2023-01-03 01:00:00 TOTAL: 60% out of 100% allocated memory consumed

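I want to extract the last set of statistics (in the example above, the last 6 lines). As you can see, the number of lines in each section can vary, but the first and last lines keep the same pattern. I was thinking of using: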

"TOTAL" as an anchor point to grab the first and the last line of the wanted block of text
(?s) mode to match all lines in between those two

I ended up with the regex `(?m)^.*?TOTAL(?s).*?(?m)TOTAL.*?$` and used it in grep with `-P` to get the expected output (it did not work with the `-E` regex flavor).

tac con.log | grep -Po "(?m)^.*?TOTAL(?s).*?(?m)TOTAL.*?$" -m1 | tac

This produced the correct output:

2023-01-01 01:00:00 TOTAL MEMORY ALLOCATION CONSUMPTION:
2023-01-03 01:00:00 COMPONENT | USAGE (%)
2023-01-03 01:00:00 class.xxx.yyy.zzz | 10
2023-01-03 01:00:00 class.xxx.zzz.aaa | 20
2023-01-03 01:00:00 class.zzz.aaa.bbb | 30
2023-01-03 01:00:00 TOTAL: 60% out of 100% allocated memory consumed

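However, that was in my test environment, which uses an old grep version (`2.5.3`). When I tested it on other machines running Rocky 9 with grep version `3.6`, I got no results at all. Given that the regex also works on regex101.com, I think this may be a grep nuance. Do these newer grep versions need something special for this regex to work, or is there another way to achieve this result (in the end it will be used in a bash script)?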

Answer 1

With Perl,^† one way

perl -0777 -wnE'$r = $1 while /(^[0-9\s:-]+TOTAL.+? TOTAL.+?$)/smxg; say $r' file

or

perl -0777 -wnE'say for /.*( ^[0-9\s:-]+ TOTAL.+? TOTAL.+?$ )/smxg' file

This does capture and assign all such records, or matches the whole file, until it gets to the last one, but one has to go over the file; the approach from the question makes three passes over the file. We can process backwards if performance is an issue, like here for example. See the performance effect here.

Altogether I'd recommend a short script instead.

Not sure why grep does what you show; I'd imagine that the above regex should work, even slightly simplified using grep's conventions.

^† In the question as originally posted by the OP there was a perl tag.

Answer 2

Or just awk:

echo '
2023-01-01 01:00:00 TOTAL MEMORY ALLOCATION CONSUMPTION:
2023-01-01 01:00:00 COMPONENT | USAGE (%)
2023-01-01 01:00:00 class.zzz.aaa.bbb | 32
2023-01-01 01:00:00 class.fff.aaa.ggg | 20
2023-01-01 01:00:00 TOTAL: 52% out of 100% allocated memory consumed
2023-01-01 01:00:00 TOTAL MEMORY ALLOCATION CONSUMPTION:
2023-01-02 01:00:00 COMPONENT | USAGE (%)
2023-01-02 01:00:00 class.xxx.aaa.bbb | 42
2023-01-02 01:00:00 class.bbb.aaa.zzz | 10
2023-01-02 01:00:00 class.zzz.xxx | 21
2023-01-02 01:00:00 class.xxx.sss.ggg | 5
2023-01-02 01:00:00 TOTAL: 78% out of 100% allocated memory consumed
2023-01-01 01:00:00 TOTAL MEMORY ALLOCATION CONSUMPTION:
2023-01-03 01:00:00 COMPONENT | USAGE (%)
2023-01-03 01:00:00 class.xxx.yyy.zzz | 10
2023-01-03 01:00:00 class.xxx.zzz.aaa | 20
2023-01-03 01:00:00 class.zzz.aaa.bbb | 30
2023-01-03 01:00:00 TOTAL: 60% out of 100% allocated memory consumed' |

mawk 'BEGIN { RS = ORS = " consumed\n" } END { print }'

or even:

gawk 'BEGIN { RS=(ORS=FS=" consumed\n")"$" } $0=$NF'

2023-01-01 01:00:00 TOTAL MEMORY ALLOCATION CONSUMPTION:
2023-01-03 01:00:00 COMPONENT | USAGE (%)
2023-01-03 01:00:00 class.xxx.yyy.zzz | 10
2023-01-03 01:00:00 class.xxx.zzz.aaa | 20
2023-01-03 01:00:00 class.zzz.aaa.bbb | 30
2023-01-03 01:00:00 TOTAL: 60% out of 100% allocated memory consumed

Answer 3

GNU grep:

grep -zPo '(?s).*\n\K.*TOTAL .*?TOTAL:.*?\n' con.log

This works with grep 3.7. It mostly works with 2.20 (which outputs an extra newline). It is likely to be inefficient with large input files.

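I suspect your original regex fails with grep because grep applies the regex to each input line individually. A regex that tries to match several lines at once will therefore always fail.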
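The per-line behaviour is easy to check with a small experiment. This is only a sketch using plain grep; the function name `count_matches` and the two-line sample input are made up for illustration:

```shell
#!/bin/sh
# count_matches PATTERN: count the stdin lines that match PATTERN.
# grep -c exits non-zero when the count is 0, hence the "|| true".
count_matches() {
    grep -c -- "$1" || true
}

# The pattern spans two lines, but grep tests each line on its own: prints 0.
printf 'first TOTAL\nsecond TOTAL:\n' | count_matches 'first.*second'

# Joining the lines into a single line lets the same pattern match: prints 1.
printf 'first TOTAL\nsecond TOTAL:\n' | tr '\n' ' ' | count_matches 'first.*second'
```

The `-z` option in the grep command above sidesteps the problem in a similar way: GNU grep then treats NUL rather than newline as the line terminator, so a log without NUL bytes is seen as one long "line".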

To avoid reading the entire file:

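tac con.log | awk 's+=($3~/^TOTAL:?$/); s>1{exit}' | tac

`s` starts as zero/false. It is incremented each time a start or end line is found. While it is non-zero, the line is printed (the default action). Once both the end line and the start line have matched (`s` is 2), we exit.

This outputs only the lines that belong to the record, and it tolerates unrelated data between statistics records.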
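The same counter logic can be spelled out with explicit actions instead of the pattern-only shorthand. This is a sketch, not a replacement for the one-liner: the function name `last_record` and the variable name `seen` are invented here, and it makes the same assumptions, namely that `TOTAL` or `TOTAL:` is the third whitespace-separated field and that `tac` is available.

```shell
#!/bin/sh
# last_record FILE: print the last statistics record found in FILE.
last_record() {
    tac -- "$1" | awk '
        # Count boundary lines: "TOTAL:" (end line) or "TOTAL" (start line).
        $3 ~ /^TOTAL:?$/ { seen++ }
        # Print every line from the first boundary onward.
        seen             { print }
        # Second boundary reached: the record is complete, stop reading.
        seen > 1         { exit }
    ' | tac
}

# Usage: last_record con.log
```

Because the input is reversed, the first boundary awk sees is the `TOTAL:` line of the last record and the second is its header, so awk stops without reading the older records.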

If the file can end with a partial record (which should be ignored), then:

tac con.log | awk '
    !s && $3=="TOTAL:" { s=1 }
    s;
    s && $3=="TOTAL" { exit }
' | tac

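If the log contains no unrelated data (that is, it is purely a list of complete statistics records), only the termination condition needs to be tested:

tac con.log | awk '1; $3=="TOTAL"{exit}' | tac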

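Assuming the number of output lines will not exceed some threshold (say 1000), there is also a straightforward tac and (GNU) grep solution, which works with or without a partial final record:

tac con.log |
grep -A1000 -m1 'TOTAL:' |
grep -B1000 -m1 'TOTAL ' |
tac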

Answer 4

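The key observation about your data is that you want everything from the last occurrence of the header line onward. You can use a greedy match to achieve this.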


use strict;
use warnings;


my $data = <<EOM;
2023-01-01 01:00:00 TOTAL MEMORY ALLOCATION CONSUMPTION:
2023-01-01 01:00:00 COMPONENT | USAGE (%)
2023-01-01 01:00:00 class.zzz.aaa.bbb | 32
2023-01-01 01:00:00 class.fff.aaa.ggg | 20
2023-01-01 01:00:00 TOTAL: 52% out of 100% allocated memory consumed
2023-01-01 01:00:00 TOTAL MEMORY ALLOCATION CONSUMPTION:
2023-01-02 01:00:00 COMPONENT | USAGE (%)
2023-01-02 01:00:00 class.xxx.aaa.bbb | 42
2023-01-02 01:00:00 class.bbb.aaa.zzz | 10
2023-01-02 01:00:00 class.zzz.xxx | 21
2023-01-02 01:00:00 class.xxx.sss.ggg | 5
2023-01-02 01:00:00 TOTAL: 78% out of 100% allocated memory consumed
2023-01-01 01:00:00 TOTAL MEMORY ALLOCATION CONSUMPTION:
2023-01-03 01:00:00 COMPONENT | USAGE (%)
2023-01-03 01:00:00 class.xxx.yyy.zzz | 10
2023-01-03 01:00:00 class.xxx.zzz.aaa | 20
2023-01-03 01:00:00 class.zzz.aaa.bbb | 30
2023-01-03 01:00:00 TOTAL: 60% out of 100% allocated memory consumed
EOM

$data =~ s/.+           # Do a greedy match
           (?=          # positive lookahead (consumes nothing)
              ^         #     Start of a line
              .+?       #     non-greedy match
              TOTAL\sMEMORY\sALLOCATION\sCONSUMPTION # literal string
            )           # end of lookahead
            //smx; # allow . to match newline & ^ to match start of line

print $data;

Running it gives:
$ perl try.pl 
2023-01-01 01:00:00 TOTAL MEMORY ALLOCATION CONSUMPTION:
2023-01-03 01:00:00 COMPONENT | USAGE (%)
2023-01-03 01:00:00 class.xxx.yyy.zzz | 10
2023-01-03 01:00:00 class.xxx.zzz.aaa | 20
2023-01-03 01:00:00 class.zzz.aaa.bbb | 30
2023-01-03 01:00:00 TOTAL: 60% out of 100% allocated memory consumed

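All of this can be collapsed into a one-liner. `-0777` slurps the whole input and `-p` prints the result after the substitution:
cat your_data | perl -0777 -pe 's/.+(?=^.+?TOTAL\sMEMORY ALLOCATION CONSUMPTION)//sm'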

Answer 5


Simply, in Awk:
awk '/TOTAL MEMORY/ { p=$0; next }
  p { p = p ORS $0 }
  /TOTAL:/ { result=p; p="" }
  END { print result }' file

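This is a simple state machine: we collect all the lines of the current section, and at the end we print the last section collected.

In more detail, Awk applies the script to each input line (or, more generally, each input record) in turn. When we see a header line, we start collecting into `p` and skip the rest of the script for that line. On subsequent lines, as long as `p` is non-empty, we append the line, separated by `ORS`, the output record separator (a newline by default). When we reach an input line that matches `TOTAL:`, we stop collecting and copy the currently collected `p` into `result`. Finally, the `END` block runs after the input stream has ended, and we print the last collected section from `result`.

Besides being portable all the way back to the original AT&T Unix, this is also easy to understand and modify; the regular expressions are trivial, and the overall logic is reasonably simple and obvious.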
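To see the state machine at work, here is a minimal sketch that wraps the same script in a shell function and feeds it a shortened, made-up log. The function name `last_block` and the sample lines are illustrative only, and the `END` block has an added guard so that nothing is printed when no complete block was found:

```shell
#!/bin/sh
# last_block: read a log on stdin and print its last complete block.
last_block() {
    awk '
        /TOTAL MEMORY/ { p = $0; next }        # header: start a new block
        p              { p = p ORS $0 }        # inside a block: append the line
        /TOTAL:/       { result = p; p = "" }  # footer: the block is complete
        END            { if (result != "") print result }
    '
}

# Demo: two complete blocks followed by a truncated third one.
printf '%s\n' \
    'TOTAL MEMORY A' 'x | 1' 'TOTAL: 1%' \
    'TOTAL MEMORY B' 'y | 2' 'TOTAL: 2%' \
    'TOTAL MEMORY C' 'z | 3' | last_block
```

The demo prints block B. The truncated block C never reaches a `TOTAL:` line, so it is never copied into `result`, which means a partial record at the end of the log is ignored.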

Answer 6

Using any awk in any shell on every Unix box:

$ awk '/TOTAL /{rec=$0; next} {rec=rec ORS $0} END{print rec}' file
2023-01-01 01:00:00 TOTAL MEMORY ALLOCATION CONSUMPTION:
2023-01-03 01:00:00 COMPONENT | USAGE (%)
2023-01-03 01:00:00 class.xxx.yyy.zzz | 10
2023-01-03 01:00:00 class.xxx.zzz.aaa | 20
2023-01-03 01:00:00 class.zzz.aaa.bbb | 30
2023-01-03 01:00:00 TOTAL: 60% out of 100% allocated memory consumed

Answer 7

The following does the job:

$ grep TOTAL: file -B6 | tail -n6
2023-01-01 01:00:00 TOTAL MEMORY ALLOCATION CONSUMPTION:
2023-01-03 01:00:00 COMPONENT | USAGE (%)
2023-01-03 01:00:00 class.xxx.yyy.zzz | 10
2023-01-03 01:00:00 class.xxx.zzz.aaa | 20
2023-01-03 01:00:00 class.zzz.aaa.bbb | 30
2023-01-03 01:00:00 TOTAL: 60% out of 100% allocated memory consumed
