Explain this Regular Expression please

Regular expressions are a complete void for me. I'm dealing with one right now in TextMate that does what I want it to do... but I don't know WHY it does what I want it to do.

/[[:alpha:]]+|( )/(?1::$0)/g

This is used in a TextMate snippet and what it does is takes a Label and outputs it as an id name. So if I type "First Name" in the first spot, this outputs "FirstName". Previously it looked like this:

/[[:alpha:]]+|( )/(?1:_:/L$0)/g (it might have been \L instead)

This would turn "First Name" into "first_name". So I get that the underscore adds an underscore for a space, and that the /L lowercases everything... but I can't figure out what the rest of it does or why.

Someone care to explain it piece by piece?

EDIT

Here is the actual snippet in question:

<column header="$1"><xmod:field name="${2:${1/[[:alpha:]]+|( )/(?1::$0)/g}}"/></column>
Best answer

This regular expression (regex) format is basically:

 /matchthis/replacewiththis/settings

The "g" setting at the end means do a global replace: every match in the input is replaced, rather than only the first one.
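
As a rough illustration of what the "g" setting changes, here is a minimal Python sketch. It is not TextMate syntax: Python's re.sub replaces every match by default, and count=1 is used here to mimic a substitution without "g". The sample label is made up for the example.

```python
import re

label = "First Middle Last"  # hypothetical sample input

# Without "g": only the first match is replaced (count=1 in re.sub).
first_only = re.sub(r" ", "", label, count=1)

# With "g": every match is replaced (re.sub's default behaviour).
all_matches = re.sub(r" ", "", label)

print(first_only)   # FirstMiddle Last
print(all_matches)  # FirstMiddleLast
```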

Breaking it down further...

  [[:alpha:]]+|( )

That matches either a run of one or more alphabetic characters, or a single space. $0 always holds the whole match; the space alternative is also captured in group $1, so $1 is set only when the match was a space.

  (?1::$0)

As Roger says, the ? indicates that this part is a conditional. If group 1 matched (that is, the match was a space), the match is replaced with whatever sits between the first and second colons, in this case nothing. If group 1 did not match, the match is replaced with the contents of $0, so each run of letters is output unchanged.

This explains why the spaces are removed in the first example, and the spaces get replaced with underscores in your second example.

In the second expression the \L is used to lowercase the text.
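
For anyone who wants to try the same logic outside TextMate, here is a minimal Python sketch of both replacements. It is an emulation, not TextMate's own engine: Python's re module has no conditional replacement syntax and no [[:alpha:]] class, so a callback function stands in for the (?1:...) conditional and [A-Za-z] is an ASCII-only stand-in for [[:alpha:]]. The function names are made up for the example.

```python
import re

# ASCII stand-in for [[:alpha:]]+|( ); group 1 is set only for a space.
PATTERN = re.compile(r"[A-Za-z]+|( )")

def label_to_id(label: str) -> str:
    # Mimics (?1::$0): a matched space becomes nothing,
    # a matched run of letters is kept as-is ($0).
    def repl(m: re.Match) -> str:
        return "" if m.group(1) is not None else m.group(0)
    return PATTERN.sub(repl, label)

def label_to_snake(label: str) -> str:
    # Mimics (?1:_:\L$0): a matched space becomes "_",
    # a matched run of letters is lowercased.
    def repl(m: re.Match) -> str:
        return "_" if m.group(1) is not None else m.group(0).lower()
    return PATTERN.sub(repl, label)

print(label_to_id("First Name"))     # FirstName
print(label_to_snake("First Name"))  # first_name
```

Anything that is neither a letter nor a space (a digit, say) is never matched, so it passes through untouched in both versions.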

The extra question in the comment was how to run this expression outside of TextMate. Using vi as an example, I would break it into multiple steps:

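:1,$s/ //g
:1,$s/\u/\L&/g

The first part of each command tells vi to run a substitution starting on line 1 and ending at the end of the file (that's what $ means). In the second command, \u matches an uppercase letter and \L& replaces it with its lowercase form; both escapes are Vim extensions.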

The rest of the expression uses the same sorts of rules as explained above, although some of the notation in vi is a bit custom - see this reference webpage.

Other answers

I find RegexBuddy a good tool for dealing with regexes. I pasted your first regex into RegexBuddy and got the explanation shown in the bottom frame:

[Screenshot: RegexBuddy]

I use it for helping to understand existing regexes, building my own, testing regexes against strings, etc. I've become better at regexes because of it. FYI, I'm running it under Wine on Ubuntu.

It's searching for a run of one or more alphabetic characters, [[:alpha:]]+, or a single space, ( ).

/[[:alpha:]]+|( )/(?1::$0)/g

The (?1 is a conditional and is used to strip the match if group 1 (a single space) was matched, or replace the match with $0 if group 1 wasn't matched. As $0 is the entire match, it gets replaced with itself in that case. This regex is the same as:

/ //g

I.e. remove all spaces.
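
A quick way to convince yourself of that equivalence is to run both forms over a few inputs. The sketch below does this in Python under two assumptions: a callback emulates the conditional replacement (Python has no such syntax), and [A-Za-z] stands in for [[:alpha:]]. The function names and sample strings are made up for the example.

```python
import re

def strip_spaces_conditional(text: str) -> str:
    # Emulates /[[:alpha:]]+|( )/(?1::$0)/g with an ASCII letter class.
    return re.sub(
        r"[A-Za-z]+|( )",
        lambda m: "" if m.group(1) is not None else m.group(0),
        text,
    )

def strip_spaces_simple(text: str) -> str:
    # Emulates / //g
    return re.sub(r" ", "", text)

samples = ["First Name", "Address Line 2", "e-mail address", ""]
for s in samples:
    # Both forms should agree on every sample.
    assert strip_spaces_conditional(s) == strip_spaces_simple(s)

print(strip_spaces_conditional("Address Line 2"))  # AddressLine2
```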

/[[:alpha:]]+|( )/(?1:_:/L$0)/g

This regex is still using the same condition, except now if group 1 was matched, it's replaced with an underscore, and otherwise the full match ($0) is used, modified by \L. \L lowercases all text that comes after it, so \LABC would result in abc; think of it as a special control code.




