Parsing log lines using awk

I have to parse some information out of the lines of a big log file. A line looks something like this:

abc.log:2012-03-03 11:12:12,457 ABC[123.RPH.-101] XYZ: Query=get_data @a=0,@b=1 Rows=10Time=100   

There are many lines like the one above in the log files. I need to extract the datetime (2012-03-03 11:12:12,457), the job details (123.RPH.-101), the query name without its parameters (get_data), the rows (10), and the time (100).

So the output should look like this:

2012-03-03 11:12:12,457|123|-101|get_data|10|100  

I have tried various splits using awk, but could not get it right.

Answers

This is admittedly horrible, but here is a sed version, since there are no answers yet.

sed -r -e 's/^[^0-9]*//' -e 's/ [^ ]*\[([^.]*)\.[^.]*\.([^]]*)\]/|\1|\2/' -e 's/ [^ ]* Query=/|/' -e 's/ [^ ]* Rows=/|/' -e 's/Time=/|/' my_logfile

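My solution in gawk: it uses the gawk extension to match(), which stores the capture groups in an array.

You did not specify the file format exactly, so you may have to adjust the regular expressions accordingly.

Script invocation: gawk -v OFS='|' -f script.awk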

{
    # Timestamp: plain match(); RSTART and RLENGTH locate the matched text.
    match($0, /[0-9]+-[0-9]+-[0-9]+ [0-9]+:[0-9]+:[0-9]+,[0-9]+/)
    date_time = substr($0, RSTART, RLENGTH)

    # gawk extension: the third argument receives the capture groups.
    match($0, /\[([0-9]+)\.RPH\.(-?[0-9]+)\]/, matches)
    job_detail_1 = matches[1]
    job_detail_2 = matches[2]

    match($0, /Query=(\w+)/, matches)
    query = matches[1]

    match($0, /Rows=([0-9]+)/, matches)
    rows = matches[1]

    match($0, /Time=([0-9]+)/, matches)
    time = matches[1]

    print date_time, job_detail_1, job_detail_2, query, rows, time
}

Here is another, simpler awk solution (it also works in mawk):

BEGIN { OFS = "|" }

{
    # $3 looks like ABC[123.RPH.-101]; keep the text between the brackets.
    i = match($3, /\[[^]]+\]/)
    job = substr($3, i + 1, RLENGTH - 2)
    split($5, X, "=")
    query = X[2]
    split($7, X, "=")
    rows = X[2]
    split($8, X, "=")
    time = X[2]

    print $1 " " $2, job, query, rows, time
}

This assumes that the Rows=10 and Time=100 strings are separated by a space, that is, that the missing space in the question is a typo.
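If the space really is missing, as in the sample line, a variant that looks each value up by its key rather than by field position may be more forgiving. The following is only a sketch in POSIX awk wrapped in a shell function; the names parse_log and value_after are made up for illustration, and the character classes are guesses based on the single sample line.

```shell
# Hypothetical helper: reads log lines on stdin, prints pipe-separated fields.
parse_log() {
    awk '
    # Return the text matching "re" that directly follows "key" in "s".
    function value_after(s, key, re,    rest) {
        if (!match(s, key)) return ""
        rest = substr(s, RSTART + RLENGTH)
        if (!match(rest, "^" re)) return ""
        return substr(rest, 1, RLENGTH)
    }
    BEGIN { OFS = "|" }
    {
        # Timestamp: date, space, time with milliseconds.
        if (!match($0, /[0-9]+-[0-9]+-[0-9]+ [0-9]+:[0-9]+:[0-9]+,[0-9]+/)) next
        stamp = substr($0, RSTART, RLENGTH)

        # Job details: text between the square brackets, split on dots.
        if (!match($0, /\[[^]]+\]/)) next
        n = split(substr($0, RSTART + 1, RLENGTH - 2), job, /[.]/)

        print stamp, job[1], job[n],
              value_after($0, "Query=", "[A-Za-z0-9_]+"),
              value_after($0, "Rows=", "[0-9]+"),
              value_after($0, "Time=", "[0-9]+")
    }'
}
```

Usage would be something like parse_log < my_logfile. Lines without a timestamp or a bracketed job are skipped silently here, which may or may not be what you want.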

TXR:

@(collect :vars ())
@file:@year-@mon-@day @hh:@mm:@ss,@ms @jobname[@job1.RPH.@job2] @queryname: Query=@query @params Rows=@{rows /[0-9]+/}Time=@time
@(output)
@year-@mon-@day @hh-@mm-@ss,@ms|@job1|@job2|@query|@rows|@time
@(end)
@(end)

Run:

$ txr data.txr data.log
2012-03-03 11-12-12,457|123|-101|get_data|10|100

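One approach here is to make the program assert that every line in the log must match the pattern. First, we disallow gaps in the collect. That is, the collect may not skip over non-matching material to find lines that match further down: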

@(collect :gap 0 :vars ())

Second, at the end of the script, we add:

@(eof)

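This specifies a match at the end of the file. If the @(collect) bails out early because of a non-matching line (due to the :gap 0 constraint), the @(eof) will fail, and so the script will terminate with a failed status.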

In this type of task, field-splitting regex hacks will backfire because they can blindly produce incorrect results for some subset of the input being processed. If the input contains a vast number of lines, there is no easy way to check for mistakes. It's best to have a very specific match that is likely to reject anything which doesn't resemble the examples on which the pattern is based.
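The same idea can be approximated in plain awk by checking every line against one anchored pattern before any extraction happens. What follows is a rough sketch, not a drop-in tool: the function name validate_log is invented, and the pattern is derived from the single sample line in the question (optional file prefix, optional space before Time=), so it will need tightening or loosening for real data.

```shell
# Hypothetical filter: passes conforming lines through on stdout, reports
# rejects on stderr, and returns non-zero if any line was rejected.
validate_log() {
    awk '
    BEGIN {
        # Pieces of one anchored pattern for the whole line (assumed format).
        ts   = "[0-9]+-[0-9]+-[0-9]+ [0-9]+:[0-9]+:[0-9]+,[0-9]+"
        job  = "[A-Za-z0-9_]+\\[[0-9]+\\.[A-Za-z]+\\.-?[0-9]+\\]"
        tail = "[A-Za-z0-9_]+: Query=[A-Za-z0-9_]+( [^ ]+)* Rows=[0-9]+ ?Time=[0-9]+ *"
        strict = "^([^ :]+:)?" ts " " job " " tail "$"
    }
    # Anything that does not match the whole pattern is reported, not guessed at.
    $0 !~ strict {
        printf "line %d rejected: %s\n", NR, $0 > "/dev/stderr"
        bad++
        next
    }
    { print }
    END { exit (bad > 0 ? 1 : 0) }
    '
}
```

A possible use is validate_log < my_logfile > clean.log, then running one of the extraction scripts on clean.log only if the exit status was zero.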

All you need is the right set of field separators:

awk -F '[][ =.]' -v OFS='|' '{print $1 " " $2, $4, $6, $10, $15, $17}'

I'm assuming that "abc.log:" is not actually part of the log line.
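To see where field numbers such as $4 or $17 come from, it can help to dump every field with its index. This is a small debugging sketch; show_fields is a made-up name, and it uses the same separator set, where each of ], [, space, = and . ends a field.

```shell
# Hypothetical debugging helper: prints index=value for every field of each
# input line, splitting on ], [, space, = and . as single-character separators.
show_fields() {
    awk -F '[][ =.]' '{ for (i = 1; i <= NF; i++) printf "%d=%s\n", i, $i }'
}
```

For example, printf '%s\n' '2012-03-03 11:12:12,457 ABC[123.RPH.-101] XYZ: Query=get_data @a=0,@b=1 Rows=10 Time=100' | show_fields lists the fields one per line. Field 7 comes out empty because ] is immediately followed by a space, and the numbering only lines up when there is a space between Rows=10 and Time=100.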




