Regular Expressions - Greedy but stop before a string match

I have some data that I'd like to convert into a table format.

Here is the input data:

1- This is the 1st line with a 
newline character
2- This is the 2nd line

Each line may contain multiple newline characters.

Output:

<td>1- This is the 1st line with a 
newline character</td>
<td>2- This is the 2nd line</td>

I've tried the following:

(\d{1,3}-)[^\d]*

But it only seems to match up to the digit 1 in "1st".

I'd like to be able to stop matching when I find another \d{1,3}- in my string. Any suggestions?

EDIT: I'm using EditPad Lite.
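To see why the attempt stops early, here is a minimal Python sketch. It assumes the attempted pattern was (\d{1,3}-)[^\d]*, i.e. a line number followed by a run of non-digit characters; the name SAMPLE is just an illustrative stand-in for the input above.

```python
import re

# The sample input from the question (hypothetical variable name).
SAMPLE = "1- This is the 1st line with a \nnewline character\n2- This is the 2nd line"

# Assumed reconstruction of the attempted pattern: a line number,
# then any run of characters that are not digits.
attempt = re.compile(r"(\d{1,3}-)[^\d]*")

first = attempt.match(SAMPLE)

# [^\d]* cannot step over ANY digit, so the match ends right before
# the "1" of "1st" instead of before the next "2-" marker.
print(repr(first.group(0)))  # '1- This is the '
```

The negated class excludes every digit, not just the digits that start a new line number, which is why the answers below reach for lookahead instead.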

Answers

This is for Vim, and uses a zero-width positive lookahead:

/^\d\{1,3}-\_.*[\r\n]\(\d\{1,3}-\)\@=

Step by step:

/^\d\{1,3}-               1 to 3 digits followed by -
\_.*                      any number of characters, including newlines/linefeeds
[\r\n]\(\d\{1,3}-\)\@=    followed by a newline/linefeed ONLY if it is followed
                          by 1 to 3 digits followed by - (the first condition)

And this is how it would look in PCRE/Ruby:

/(\d{1,3}-.*?[\r\n])(?=(?:\d{1,3}-)|\z)/m

Note that the string needs to end with a newline in order to match the last entry.
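For comparison, here is a rough Python adaptation of the same lazy-match-plus-lookahead idea. It is a sketch, not the exact pattern above: it moves the line break into the lookahead and accepts the end of the text (\Z) as an alternative, so a trailing newline is not required. SAMPLE and ENTRY are illustrative names.

```python
import re

SAMPLE = "1- This is the 1st line with a \nnewline character\n2- This is the 2nd line"

# re.DOTALL lets "." cross newlines (the job /m does in Ruby);
# re.MULTILINE lets "^" match at the start of every line.
# The lazy ".*?" grows one character at a time until the lookahead sees
# either a line break followed by the next "digits-dash" marker, or the
# end of the text.
ENTRY = re.compile(
    r"^\d{1,3}-.*?(?=\r?\n\d{1,3}-|\Z)",
    re.DOTALL | re.MULTILINE,
)

entries = ENTRY.findall(SAMPLE)
for entry in entries:
    print("<td>" + entry + "</td>")
```

Because the lookahead consumes nothing, the next entry's marker is still available as the start of the following match.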

SEARCH:   ^\d+-.*(?:[\r\n]++(?!\d+-).*)*

REPLACE:  <td>$0</td>

[\r\n]++ matches one or more carriage returns or linefeeds, so you don't have to worry about whether the file uses Unix (\n), DOS (\r\n), or older Mac (\r) line separators.

(?!\d+-) asserts that the first thing after the line separator is not another line number.

I used the possessive + in [\r\n]++ to make sure it matches the whole separator. Otherwise, if the separator is \r\n, [\r\n]+ could match the \r and (?!\d+-) could match the \n.

Tested in EditPad Pro, but it should work in Lite as well.
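The possessive-quantifier point can be checked with a small Python sketch. Two substitutions here are my own assumptions, not part of the answer: [^\r\n] replaces "." (Python's "." also matches \r), and (?=([\r\n]+))\1 emulates [\r\n]++ so the code also runs on Python versions older than 3.11, which lack "++". All names are illustrative.

```python
import re

# Emulated possessive version: the lookahead captures the whole separator,
# the backreference consumes it, and the engine cannot give part of it back.
POSSESSIVE = re.compile(
    r"^\d+-[^\r\n]*(?:(?=([\r\n]+))\1(?!\d+-)[^\r\n]*)*",
    re.MULTILINE,
)

# Plain greedy version: [\r\n]+ may backtrack and split a \r\n pair.
GREEDY = re.compile(
    r"^\d+-[^\r\n]*(?:[\r\n]+(?!\d+-)[^\r\n]*)*",
    re.MULTILINE,
)

DOS_SAMPLE = (
    "1- This is the 1st line with a \r\nnewline character\r\n"
    "2- This is the 2nd line"
)


def wrap_entries(text):
    """Wrap every numbered entry in <td>...</td>, like REPLACE: <td>$0</td>."""
    return POSSESSIVE.sub(lambda m: "<td>" + m.group(0) + "</td>", text)


possessive_first = POSSESSIVE.search(DOS_SAMPLE).group(0)
greedy_first = GREEDY.search(DOS_SAMPLE).group(0)

print(repr(possessive_first))  # ends with 'newline character'
print(repr(greedy_first))      # ends with 'newline character\r'
print(wrap_entries(DOS_SAMPLE))
```

With the greedy form, the first entry swallows a stray \r from the separator before "2-", which is exactly the case the possessive quantifier is there to prevent.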

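You didn't specify a language (there are many regular expression implementations), but in general, what you're looking for is called "positive lookahead". It lets you add a pattern that affects whether there is a match without becoming part of the match.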

Look up lookahead in the documentation for whichever language you're using.
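As a small illustration of what "affects the match but is not part of it" means, here is a Python sketch; the sample text and variable names are made up for the example.

```python
import re

text = "12- abc 34 56-"

# Without lookahead, the dash is consumed and becomes part of the match.
with_dash = re.findall(r"\d+-", text)

# With (?=-), the dash must be there, but it is not included in the match.
lookahead_only = re.findall(r"\d+(?=-)", text)

print(with_dash)       # ['12-', '56-']
print(lookahead_only)  # ['12', '56']
```

In both cases "34" is skipped because no dash follows it; the only difference is whether the dash ends up in the matched text.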

Edit: the following sample seems to work in Vim.

:%s#\v(^\d+-\_.{-})\ze(\n\d+-|%$)#<td>\1</td>

Explanation:

%      - for all lines
s#     - substitute the following (you can use any delimiter, and slash is most
         common, but as that would require us to escape slashes in the command,
         I chose to use the number sign)
\v     - very magic mode, lets us use fewer backslashes
(      - start group for back referencing
^      - start of line
\d+    - one or more digits (as many as possible)
-      - a literal dash!
\_.    - any character, including a newline
{-}    - zero or more of these (as few as possible)
)      - end group
\ze    - end match (anything beyond this point will not be included in the match)
(      - start a new group
\n     - newline (in any format - thanks Alan)
\d+    - one or more digits
-      - a dash
|      - or
%$     - end of file
)      - end group
#      - start substitute string
<td>\1</td> - a TD tag around the first matched group

(\d+-.+(\n|$)((?!^\d-).+(\n|$))?)

You could match only the separators and split on them. For example, in C# it could be done like this:

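using System.Text.RegularExpressions;   // Regex
using System.Windows.Forms;             // MessageBox

string s = "1- This is the 1st line with a \nnewline character\n2- This is the 2nd line";
// Drop the leading "1- ", split on every newline + "N- " separator, then wrap each piece in <td> tags.
// Note that the line numbers themselves are discarded by the split.
string ss = "<td>" + string.Join("</td>\n<td>", Regex.Split(s.Substring(3), "\n\\d{1,3}- ")) + "</td>";
MessageBox.Show(ss);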
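A variation on the same split idea, sketched in Python: if the line numbers should stay in the output (as in the question's expected result), the number can be moved into a lookahead so that only the line break is consumed by the split. This is my adaptation rather than a translation of the C# above, and the names are illustrative.

```python
import re

SAMPLE = "1- This is the 1st line with a \nnewline character\n2- This is the 2nd line"

# Split only on line breaks that are immediately followed by a line number.
# The lookahead checks for the number without consuming it, so each piece
# keeps its own "N-" prefix.
parts = re.split(r"\r?\n(?=\d{1,3}-)", SAMPLE)

table = "\n".join("<td>" + part + "</td>" for part in parts)
print(table)
```

The line break inside the first entry is left alone because it is not followed by a digits-dash marker.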

How about doing it in three steps?

(These are Perl regexes.)

Replace the first line:

$input =~ s/^(\d{1,3})/<td>$1/;

Replace the rest:

$input =~ s/\n(\d{1,3})/<\/td>\n<td>$1/gm;

Append the final closing tag:

$input .= "</td>";
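The same three steps could be sketched in Python roughly as follows. One deliberate difference, which is my assumption and not in the Perl above: the dash is included in the group, so a continuation line that happens to start with a digit is not mistaken for a new entry. The function name is hypothetical.

```python
import re


def three_step_wrap(text):
    """Wrap numbered entries in <td> tags using three simple steps."""
    # Step 1: open the first cell at the very start of the text.
    out = re.sub(r"\A(\d{1,3}-)", r"<td>\1", text)
    # Step 2: at every later line number, close the previous cell and
    # open a new one (the line break is normalised to "\n").
    out = re.sub(r"\r?\n(\d{1,3}-)", r"</td>\n<td>\1", out)
    # Step 3: close the last cell.
    return out + "</td>"


SAMPLE = "1- This is the 1st line with a \nnewline character\n2- This is the 2nd line"
print(three_step_wrap(SAMPLE))
```

No lookahead is needed here, at the cost of running two substitutions and a concatenation instead of one.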



