What are the things to watch out for with case insensitive regex replace?

I have written the following code to do a case-insensitive replace in C#:


I just wanted to check whether this is the right approach or whether there is a better one, and whether I'm overlooking something I should be aware of.

Note: Please don't give me hand-crafted code. I used a fast replace function from CodeProject, and that code crashes on the client side, where I have no way of knowing what input the user was using. So I prefer a simple but correct and reliable method.


Your code seems OK, but remember that when you do case-insensitive matching like that, the casing rules of the current locale (culture) are used. It is probably better to specify the culture you want, or to have the user select it. RegexOptions.CultureInvariant is usually a good general choice, because it behaves the same in any locale:

    RegexOptions.IgnoreCase | RegexOptions.CultureInvariant);
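Since the line above shows only the options argument, here is a minimal sketch of a complete call. The class and method names are hypothetical and chosen only for illustration; the `pattern` argument is treated as a regular expression, not as literal text.

```csharp
using System;
using System.Text.RegularExpressions;

public static class InvariantReplaceSketch
{
    // Hypothetical helper: replaces every match of 'pattern' in 'input',
    // ignoring case and using culture-invariant casing rules.
    // 'pattern' is interpreted as a regular expression.
    public static string ReplaceIgnoreCase(string input, string pattern, string replacement)
    {
        return Regex.Replace(
            input,
            pattern,
            replacement,
            RegexOptions.IgnoreCase | RegexOptions.CultureInvariant);
    }
}
```

For example, `InvariantReplaceSketch.ReplaceIgnoreCase("Hello WORLD", "world", "there")` returns `"Hello there"` regardless of the thread's current culture.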

To use another locale, you need to do a bit more hocus pocus:

    // remember current
    CultureInfo originalCulture = Thread.CurrentThread.CurrentCulture;

    // set user-selected culture here (in place of "en-US")
    Thread.CurrentThread.CurrentCulture = CultureInfo.CreateSpecificCulture("en-US");

    // do the regex

    // reset the original culture
    Thread.CurrentThread.CurrentCulture = originalCulture;
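The same steps can be wrapped in a small helper so that the original culture is restored even if the regex throws. This is only a sketch: the class name, method name, and `cultureName` parameter are hypothetical, and it assumes the requested culture is available on the machine.

```csharp
using System;
using System.Globalization;
using System.Text.RegularExpressions;
using System.Threading;

public static class CultureScopedReplaceSketch
{
    // Hypothetical helper: runs a case-insensitive replace under the
    // culture named by 'cultureName' (for example "en-US").
    public static string ReplaceIgnoreCase(
        string input, string pattern, string replacement, string cultureName)
    {
        // remember the current culture
        CultureInfo originalCulture = Thread.CurrentThread.CurrentCulture;
        try
        {
            // switch to the requested culture for the duration of the call
            Thread.CurrentThread.CurrentCulture =
                CultureInfo.CreateSpecificCulture(cultureName);

            return Regex.Replace(input, pattern, replacement, RegexOptions.IgnoreCase);
        }
        finally
        {
            // restore the original culture, even if an exception was thrown
            Thread.CurrentThread.CurrentCulture = originalCulture;
        }
    }
}
```

The `try`/`finally` is a design choice rather than a requirement: without it, an exception from an invalid pattern would leave the thread running under the temporary culture.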

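Note that you can switch case insensitivity on or off inside the pattern itself. It is not a toggle: (?i) always switches it on and (?-i) always switches it off, whatever the current state is. That means that: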

    // these three statements are equivalent and yield the same result: ""
    Regex.Replace("tExT", "[a-z]", "", RegexOptions.IgnoreCase);
    Regex.Replace("tExT", "(?i)[a-z]", "", RegexOptions.IgnoreCase);
    Regex.Replace("tExT", "(?i)[a-z]", "");

    // once IgnoreCase is used, this switches it off for the whole expression (result: "ET")...
    Regex.Replace("tExT", "(?-i)[a-z]", "", RegexOptions.IgnoreCase);

    // ...and this can switch it off for only a part of the expression
    // (here the group is the whole expression, so the result is again "ET"):
    Regex.Replace("tExT", "(?:(?-i)[a-z])", "", RegexOptions.IgnoreCase);

The last one is interesting: the case switch (?-i) applies only inside the non-capturing group (?:...). After the group's closing parenthesis it is no longer in effect. You can use this as often as you like in an expression. A switch used outside any group stays in effect until the next case-sensitivity switch, or until the end of the expression.
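To make the difference between a grouped and an ungrouped switch visible, here is a small sketch with a two-letter pattern. The class name, the method names, and the sample input are made up for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class InlineCaseSwitchSketch
{
    // Sample input containing all four casings of "ab".
    public const string Input = "aB Ab ab AB";

    // Switch inside a group: 'a' is matched case-sensitively,
    // while 'b' after the group still ignores case.
    public static string Scoped()
    {
        return Regex.Replace(Input, "(?:(?-i)a)b", "_", RegexOptions.IgnoreCase);
    }

    // Switch without a group: 'a' ignores case, and everything
    // after the switch ('b') is matched case-sensitively.
    public static string Unscoped()
    {
        return Regex.Replace(Input, "a(?-i)b", "_", RegexOptions.IgnoreCase);
    }
}
```

`Scoped()` returns `"_ Ab _ AB"` (only "aB" and "ab" are replaced), while `Unscoped()` returns `"aB _ _ AB"` (only "Ab" and "ab" are replaced).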

Update: I wrongly assumed that you can't do case-sensitivity switching. The text above has been edited with this in mind.


