C# Regex.Split - Subpattern returns empty strings

Hey, first time poster on this awesome community.

I have a regular expression in my C# application to parse an assignment of a variable:

NewVar = 40

which is entered in a Textbox. I want my regular expression to return (using Regex.Split) the name of the variable and the value, pretty straightforward. This is the Regex I have so far:

var r = new Regex(@"^(\w+)=(\d+)$", RegexOptions.IgnorePatternWhitespace);
var mc = r.Split(command);

My goal was to do the trimming of whitespace in the Regex and not use the Trim() method on the returned values. Currently it works, but it returns an empty string at the beginning of the returned array and an empty string at the end.

Using the above input example, this is what is returned from Regex.Split:

mc[0] = ""
mc[1] = "NewVar"
mc[2] = "40"
mc[3] = ""

So my question is: why does it return an empty string at the beginning and the end?

Thanks.

Best answer

The reason Regex.Split is returning four values is that you have exactly one match, so Regex.Split is returning:

  • All the text before your match, which is ""
  • All () groups within your match, which are "NewVar" and "40"
  • All the text after your match, which is ""

Regex.Split's primary purpose is to extract the text between matches of the regex. For example, you could use Regex.Split with a pattern of "[,;]" to split text on either commas or semicolons. In .NET Framework 1.0 and 1.1, Regex.Split only returned the split values, in this case "" and "", but in .NET Framework 2.0 it was modified to also include values matched by () within the Regex, which is why you are seeing "NewVar" and "40" at all.
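To make the difference concrete, here is a small sketch of both behaviours. The class and method names (SplitDemo, SplitOnDelimiters, SplitWithGroups) are made up for illustration, and the second call uses the input "NewVar=40" without spaces so that the question's pattern actually matches.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper class, for illustration only.
public static class SplitDemo
{
    // No capture groups: only the text between the delimiters is returned.
    public static string[] SplitOnDelimiters(string input)
    {
        return Regex.Split(input, "[,;]");
    }

    // The question's pattern matches the whole input once, so the result is
    // the text before the match, each captured group, then the text after it.
    public static string[] SplitWithGroups(string input)
    {
        return Regex.Split(input, @"^(\w+)=(\d+)$");
    }
}
```

SplitDemo.SplitOnDelimiters("a,b;c") gives "a", "b", "c", while SplitDemo.SplitWithGroups("NewVar=40") gives "", "NewVar", "40", "".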

What you were looking for is Regex.Match, not Regex.Split. It will do exactly what you want:

var r = new Regex(@"^(\w+)=(\d+)$");
var match = r.Match(command);
// Groups[0] is the entire match; the () groups are numbered from 1.
var varName = match.Groups[1].Value;
var valueText = match.Groups[2].Value;

Note that RegexOptions.IgnorePatternWhitespace means you can include extra spaces in your pattern. It has nothing to do with the matched text. Since you have no extra whitespace in your pattern, it is unnecessary.
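If the goal is to have the regex itself tolerate whitespace in the input, one option is to put \s* into the pattern. The following is only a sketch under that assumption: the AssignmentParser class and its TryParse method are hypothetical names, and the pattern is a whitespace-tolerant variant of the one above, not the asker's original. It also checks match.Success, since Groups values are empty strings when nothing matched.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper class, for illustration only.
public static class AssignmentParser
{
    // \s* lets the input carry optional whitespace around the name, "=" and value.
    private static readonly Regex Assignment =
        new Regex(@"^\s*(\w+)\s*=\s*(\d+)\s*$");

    public static bool TryParse(string command, out string name, out string value)
    {
        Match match = Assignment.Match(command);
        if (!match.Success)
        {
            name = string.Empty;
            value = string.Empty;
            return false;
        }

        // Groups[0] is the whole match; the capture groups start at index 1.
        name = match.Groups[1].Value;
        value = match.Groups[2].Value;
        return true;
    }
}
```

With this, AssignmentParser.TryParse("NewVar = 40", out var name, out var value) returns true with name "NewVar" and value "40", and no Trim() call is needed.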

Other answers

From the docs, Regex.Split() uses the regular expression as the delimiter to split on. It does not split the captured groups out of the input string. Also, the IgnorePatternWhitespace option ignores unescaped whitespace in your pattern, not in the input.

Instead, try the following:

var r = new Regex(@"\s*=\s*");
var mc = r.Split(command);

Note that the whitespace is actually consumed as a part of the delimiter.
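One thing to watch with this approach: only the whitespace next to the "=" is consumed, so whitespace at the very start or end of the input stays in the results. A quick sketch (DelimiterSplit and SplitAssignment are hypothetical names used only for this example):

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper class, for illustration only.
public static class DelimiterSplit
{
    // Splits on "=" together with any whitespace directly around it.
    public static string[] SplitAssignment(string command)
    {
        return Regex.Split(command, @"\s*=\s*");
    }
}
```

DelimiterSplit.SplitAssignment("NewVar = 40") gives "NewVar" and "40", but for " NewVar = 40 " the pieces are " NewVar" and "40 ", so input with outer padding would still need trimming or a different pattern.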




