Get a sequence of more than X spaces considered as an empty column while there are sometimes two spaces to separate two columns
  • Date: 2018-09-12 10:20:38
  • Tags:
  • linux
  • awk

To use the data from a publisher's software, I have functions that I can call via the CLI (Linux shell).

One of them returns data to me in the following format:

2601424 OPTDCBO3 EERP O 0254  5512240 TDCTAC01 B                 00 0000000 N  N
2602451 WHA      EERP O 0254  5512353 03ZEE003 B                 00 0000000 N  N
2602748 OPTDCBO4 EERP O 0254  5512380 TDCTAC01 B                 00 0000000 N  N
2603290 OPTDCBO3 EERP O 0254  5512440 TDCTAC01 B                 00 0000000 N  N
2604594 OPTDCBO3 EERP O 0254  5512560 TDCTAC01 B                 00 0000000 N  N
2605631 OP49LDB1 TRAN O 0254          EDRZZZ02 B     2605627-EDR 00 0000000 N  N
2605657 OP49LDB1 TRAN O 0254          EDRZZZ02 B     2605652-EDR 00 0000000 N  N
2605663 OP49LDB1 TRAN O 0254          EDRZZZ02 B     2605653-EDR 00 0000000 N  N
2606116 OPTDCBO3 EERP O 0254  5513080 TDCTAC01 B                 00 0000000 N  N
2716077 OPTDCBO3 EERP O 0255  5610080 TDCTAC01 B                 00 0000000 N  N
2716564 SOG01    TRAN O 0255 s2716564 TACSOG01 B     2716504-TAC 00 0000000 N  N
2718631 OPTDCBO3 EERP O 0255  5610160 TDCTAC01 B                 00 0000000 N  N
7158273 OPTDCBO4 EERP O 0251  5203300 TDCTAC01 B                 00 0000000 N  N
7158672 WHA      EERP O 0251  5203342 03ZEE001 B                 00 0000000 N  N
7158939 ZZZA4    LIST O 0251                   B     7158938-49W 00 0000000 N  N
7158978 OPTDCBO3 EERP O 0251  5203400 TDCTAC01 B                 00 0000000 N  N
7159853 OPTDCBO4 EERP O 0251  5203540 TDCTAC01 B                 00 0000000 N  N
2724704 SOU02    TRAN I 0255 s2724704 FTP_B    E     2724704-SOU 00 0000000 N  N
2724707 PRODS2I  EERP O 0255  6219255 S2IRCE03 E                 00 0000000 N  N
2724708 SOU01    TRAN I 0255 s2724708 FTP_B    E     2724708-SOU 00 0000001 N  N
2724709 SON01    TRAN O 0255 s2724709 SOUSON11 E     2724708-SOU 00 0000001 N  N

A space separates the first 5 columns, and one or two spaces separate the 5th and 6th columns. Sometimes columns 6 and 7 are empty.

The goal is to write some or all of this data to a text file with a delimiter between each field.

To retrieve columns 1 and 7, I used:

command | awk -F" " '{ print $1,$7 }'

But it gives:

2603290 TDCTAC01
2604594 TDCTAC01
2605631 B
2605657 B
2605663 B
2606116 TDCTAC01
2606214 TDCTAC01
7158672 03ZEE001
7158939 7158938-49W
7158978 TDCTAC01

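Awk treats a column filled with spaces as multiple adjacent separators rather than as an empty column, so it returns the data of the next non-empty column instead.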
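A quick way to see this effect is to count the fields awk finds on each line. This is only a minimal sketch; the helper name count_fields is hypothetical and not part of the original command. With default whitespace splitting, a row with all 13 columns filled reports 13 fields, and each blank column lowers the count by one because the run of spaces collapses into a single separator.

```shell
# Hypothetical helper: print how many whitespace-separated fields awk sees per line.
count_fields() {
    awk '{ print NF }'
}

# Possible usage (command stands for the CLI function from the question):
# command | count_fields | sort | uniq -c
```

If the counts differ from row to row, column numbers such as $7 do not refer to the same column on every line.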

How can I get a sequence of more than X spaces treated as an empty column, given that two spaces sometimes separate two columns?

Column N°6 can be composed of 1 to 6 numeric characters, so you cannot simply use a fixed width to delimit the column.

Answers

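Regarding "Column N°6 can be composed of 1 to 6 numeric characters, so you cannot simply use a fixed width to delimit the column": sure you can. Just fix the width of that column at 8 characters (or 9 including the separating space) and strip the leading/trailing white space with gsub(/^ +| +$/,"",$6).

Here is how to identify all of the fields using GNU awk for FIELDWIDTHS (with other awks you would need a while(substr()) loop):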

awk -v FIELDWIDTHS="7 9 5 2 5 9 9 2 16 3 8 2 3" '
{
    print "----"
    print $0
    for (i=1;i<=NF;i++) {
        gsub(/^ +| +$/,"",$i)
        print i, "<" $i ">"
    }
}
' file

For example:

$ cat file
2602451 WHA      EERP O 0254  5512353 03ZEE003 B                 00 0000000 N  N
2605657 OP49LDB1 TRAN O 0254          EDRZZZ02 B     2605652-EDR 00 0000000 N  N
2724704 SOU02    TRAN I 0255 s2724704 FTP_B    E     2724704-SOU 00 0000000 N  N

$ awk -v FIELDWIDTHS="7 9 5 2 5 9 9 2 16 3 8 2 3" '{ print "----"; print $0; for (i=1;i<=NF;i++) {gsub(/^\s+|\s+$/,"",$i); print i, "<" $i ">"} }' file
----
2602451 WHA      EERP O 0254  5512353 03ZEE003 B                 00 0000000 N  N
1 <2602451>
2 <WHA>
3 <EERP>
4 <O>
5 <0254>
6 <5512353>
7 <03ZEE003>
8 <B>
9 <>
10 <00>
11 <0000000>
12 <N>
13 <N>
----
2605657 OP49LDB1 TRAN O 0254          EDRZZZ02 B     2605652-EDR 00 0000000 N  N
1 <2605657>
2 <OP49LDB1>
3 <TRAN>
4 <O>
5 <0254>
6 <>
7 <EDRZZZ02>
8 <B>
9 <2605652-EDR>
10 <00>
11 <0000000>
12 <N>
13 <N>
----
2724704 SOU02    TRAN I 0255 s2724704 FTP_B    E     2724704-SOU 00 0000000 N  N
1 <2724704>
2 <SOU02>
3 <TRAN>
4 <I>
5 <0255>
6 <s2724704>
7 <FTP_B>
8 <E>
9 <2724704-SOU>
10 <00>
11 <0000000>
12 <N>
13 <N>
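For awks without FIELDWIDTHS, the substr() approach mentioned above might look roughly like the sketch below. It is not the answer's own code: the function name split_fixed and the ";" delimiter are assumptions chosen for illustration, and the widths are simply the ones from the FIELDWIDTHS string used in this answer.

```shell
# Hypothetical helper: split fixed-width records read from stdin into
# ';'-delimited fields. Widths are taken from the FIELDWIDTHS value above;
# the ';' output delimiter is an assumption, change OFS as needed.
split_fixed() {
    awk -v widths="7 9 5 2 5 9 9 2 16 3 8 2 3" -v OFS=";" '
    BEGIN { n = split(widths, w, " ") }       # w[1..n] holds the column widths
    {
        pos = 1
        out = ""
        for (i = 1; i <= n; i++) {
            field = substr($0, pos, w[i])      # cut the i-th fixed-width slice
            gsub(/^ +| +$/, "", field)         # strip the padding spaces
            out = out (i > 1 ? OFS : "") field # join fields with the delimiter
            pos += w[i]
        }
        print out
    }'
}

# Possible usage (command stands for the CLI function from the question):
# command | split_fixed > output.txt
# command | split_fixed | cut -d";" -f1,7
```

Because every row is cut at the same character positions, a blank column comes out as an empty field between two delimiters instead of shifting the later columns to the left.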



