how to tell sed "dot match new line"
  • Time: 2011-12-24 11:36:03
  • Tags: sed

I can't figure out how to tell sed to make the dot match a newline:

echo -e "one\ntwo\nthree" | sed 's/one.*two/one/m'

I expect to get:

one
three

Instead I get the original:

one
two
three

Best answer

sed is a line-based tool. I don't think there is such an option.
You can use h/H (hold) and g/G (get) to collect the lines first.

$ echo -e 'one\ntwo\nthree' | sed -n '1h;1!H;${g;s/one.*two/one/p}'
one
three
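
The one-liner above can be hard to read, so here is a minimal annotated sketch of the same hold-space technique. The function name join_and_replace is hypothetical and only wraps the sed call; printf is used in place of echo -e so the sketch does not depend on the shell's echo.

```shell
# join_and_replace is a hypothetical helper name, not part of sed.
join_and_replace() {
  sed -n '
# Line 1: overwrite the hold space with the first line.
1h
# Every other line: append a newline plus the line to the hold space.
1!H
# Last line: copy the hold space back into the pattern space,
# substitute across the embedded newlines, and print the result.
${
g
s/one.*two/one/p
}
'
}

printf 'one\ntwo\nthree\n' | join_and_replace
# prints:
# one
# three
```

Because the substitution only runs on the last line, the whole input is held in memory first, which is fine for small files.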

Alternatively, try Vim, where \_. matches any character including a newline:

:%s/one\_.*two/one/g
Other answers

If you use GNU sed, a plain . can match any character, including line breaks. The documentation says:

.
         Matches any character, including newline.

All you need is the -z option:

echo -e "one\ntwo\nthree" | sed -z 's/one.*two/one/'
# => one
#    three


However, one.*two might not be what you need, since * is always greedy in POSIX regex patterns: one.*two matches the leftmost one, then as many characters as possible, and then the rightmost two. If you need to match one, then as few characters as possible, and then the first two that follows, you will have to use perl:

perl -i -0 -pe 's/one.*?two//sg' file             # Non-Unicode version
perl -i -CSD -Mutf8 -0 -pe 's/one.*?two//sg' file # S&R in a UTF-8 file

The -0 option sets the input record separator to the NUL character, so a text file that contains no NUL bytes is read as a whole rather than line by line (-0777 is the explicit slurp mode). -i enables in-place file modification, the s flag makes . match any character including line breaks, and .*? matches any zero or more characters, as few as possible, because *? is non-greedy. The -CSD -Mutf8 part makes sure the input is decoded and the output re-encoded correctly.
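
To see the greedy behaviour described in this answer, here is a small sketch that assumes GNU sed (for -z). The function name greedy_demo and the five-line sample input are made up for illustration.

```shell
# greedy_demo is a hypothetical helper name. Requires GNU sed for -z.
greedy_demo() {
  sed -z 's/one.*two/one/'
}

# The input contains "two" twice. The greedy .* runs to the LAST "two",
# so "three" and the second "two" are swallowed as well.
printf 'one\ntwo\nthree\ntwo\nfour\n' | greedy_demo
# prints:
# one
# four
```

If only the text up to the first two should be replaced, this is the case where the non-greedy perl pattern from this answer is needed.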

You can use Python this way:

$ echo -e "one\ntwo\nthree" | python -c 'import re, sys; s=sys.stdin.read(); s=re.sub("(?s)one.*two", "one", s); print s,'
one
three
$

This reads all of Python's standard input (sys.stdin.read()), substitutes "one" for "one.*two" with the dot-matches-all setting enabled (the (?s) at the start of the regular expression), and then prints the modified string. The trailing comma after print keeps it from adding an extra newline. This is Python 2 syntax; in Python 3 the equivalent is print(s, end="").

This might work for you:

<<<$'one\ntwo\nthree' sed '/two/d'

or

<<<$'one\ntwo\nthree' sed '2d'

or

<<<$'one\ntwo\nthree' sed 'n;d'

or

<<<$'one\ntwo\nthree' sed 'N;N;s/two.//'

sed does match all characters (including \n) with a dot, but it has usually already stripped the trailing \n as part of its cycle, so the newline is no longer present in the pattern space to be matched.

Only certain commands (N, H and G) put newlines into the pattern/hold space.

  1. N appends a newline to the pattern space and then appends the next line.
  2. H does exactly the same except it acts on the hold space.
  3. G appends a newline to the pattern space and then appends whatever is in the hold space too.
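
A minimal sketch of the difference N makes, assuming GNU sed as in the rest of this thread. The function names without_n and with_n and the replacement text MATCH are hypothetical. $!N is used instead of a bare N so that the last line is not affected when there is no next line to append.

```shell
# Hypothetical helper names, for illustration only.

# Each line is processed on its own, so "one.two" never sees a newline.
without_n() { sed 's/one.two/MATCH/'; }

# $!N appends a newline plus the next line to the pattern space
# (except on the last line), so the dot can now match that newline.
with_n() { sed '$!N;s/one.two/MATCH/'; }

printf 'one\ntwo\nthree\n' | without_n
# prints the input unchanged:
# one
# two
# three

printf 'one\ntwo\nthree\n' | with_n
# prints:
# MATCH
# three
```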

The hold space is empty until you place something in it so:

sed G file

inserts an empty line after each line, and

sed 'G;G' file

inserts two empty lines after each line, and so on.

How about two sed calls?
(Get rid of the "two" first, then get rid of the blank line it leaves behind.)

$ echo -e 'one\ntwo\nthree' | sed 's/two//' | sed '/^$/d'
one
three

Actually, I prefer Perl over Python for one-liners:

$ echo -e 'one\ntwo\nthree' | perl -pe 's/two\n//'
one
three

The discussion below is based on GNU sed.

sed operates line by line, so there is no direct way to tell it that the dot should match a newline. There are tricks that achieve the same effect, though: use a loop (of sorts) to put all the text into the pattern space, and then do the substitution.

To put everything in the pattern space, use:

:a;N;$!ba;

To make the dot match a newline indirectly, use:

(\n|.)

So the result is:

root@u1804:~# echo -e "one\ntwo\nthree" | sed -r ':a;N;$!ba;s/one(\n|.)*two/one/'
one
three
root@u1804:~#

Note that in this case, (\n|.) matches a newline as well as any other character. See the example below:

root@u1804:~# echo -e "oneXXXXXX\nXXXXXXtwo\nthree" | sed -r ':a;N;$!ba;s/one(\n|.)*two/one/'
one
three
root@u1804:~#

https://unix.stackexchange.com/questions/182153/sed-read-whole-file-into-pattern-space-without-failing-on-single-line-input/182154#182154

Use H;1h;$!d;x; ... as a prefix to portably load the whole stream into the pattern space (inside double quotes, the $ has to be escaped as \$).

The :a;N;$!ba; idiom cannot handle input that consists of a single line.

The :a;$!{N;ba}; form can, but it is not portable.
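
A minimal sketch of the portable prefix applied to the substitution from the question. The function name slurp_replace is hypothetical. It is run on both a three-line and a one-line input to show that the single-line case is handled.

```shell
# slurp_replace is a hypothetical helper name.
slurp_replace() {
  # H     append each line to the hold space (newline first)
  # 1h    on line 1, overwrite instead, dropping that leading newline
  # $!d   on every line but the last, stop here and read the next line
  # x     on the last line, swap the collected text into the pattern space
  sed 'H;1h;$!d;x;s/one.*two/one/'
}

printf 'one\ntwo\nthree\n' | slurp_replace
# prints:
# one
# three

printf 'one two\n' | slurp_replace
# prints:
# one
```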




