Question

We do a lot of lexical processing with arbitrary strings which include arbitrary punctuation. I am divided as to whether to use magic characters/strings or symbolic constants.

The examples should be read as language-independent although most are Java.

There are clear examples where punctuation has a semantic role and should be identified as a constant:

File.separator not "/" or "\"; // a no-brainer as it is OS-dependent

and I write XML_PREFIX_SEPARATOR = ":";

However, let's say I need to replace all examples of "" with an empty string. I can write:

s = s.replaceAll("\"\"", "");

or

s = s.replaceAll(S_QUOT+S_QUOT, S_EMPTY);

(I have defined all common punctuation as S_FOO (string) and C_FOO (char))
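
For concreteness, here is a minimal sketch of what such constants and the two styles might look like side by side. The holder class name Punct and the method names are hypothetical, chosen only for illustration; only S_QUOT, S_EMPTY and the S_FOO/C_FOO naming come from the question.

```java
// Hypothetical holder class; the name Punct and the exact constant set are assumptions.
final class Punct {
    static final String S_QUOT = "\"";
    static final String S_APOS = "'";
    static final String S_EMPTY = "";
    static final char C_QUOT = '"';
    static final char C_APOS = '\'';

    private Punct() {}

    // Literal form: each double quote has to be escaped by hand.
    static String stripWithLiterals(String s) {
        return s.replaceAll("\"\"", "");
    }

    // Constant form: no escaping at the call site, at the cost of a longer expression.
    static String stripWithConstants(String s) {
        return s.replaceAll(S_QUOT + S_QUOT, S_EMPTY);
    }
}
```

Both methods do the same thing. Note that replaceAll treats its first argument as a regex; when the text to replace is purely literal, String.replace avoids regex interpretation altogether.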

In favour of magic strings/characters:

It's shorter
It's natural to read (sometimes)
The named constants may not be familiar (C_APOS vs '\'')

In favour of constants

It's harder to make typos (e.g. contrast "''" + '"' with S_APOS+S_APOS + C_QUOT)
It removes escaping problems: should a regex be "\s+" or "s+" or "\\s+"?
It's easy to search the code for punctuation

(There is a limit to this - I would not write regexes this way even though regex syntax is one of the most cognitively dysfunctional parts of all programming. I think we need a better syntax.)
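
To make the escaping point concrete, here is a small Java sketch. It assumes Java string literals, where the regex \s+ has to be written with a doubled backslash; the class and method names are hypothetical. Giving the compiled pattern a name is one possible middle ground between a raw magic string and spelling out every character as a constant.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; names are illustrative only.
final class EscapingDemo {
    // The regex \s+ needs a doubled backslash inside a Java string literal.
    static final Pattern WHITESPACE = Pattern.compile("\\s+");

    private EscapingDemo() {}

    // Collapses every run of whitespace into a single space.
    static String collapseWhitespace(String s) {
        return WHITESPACE.matcher(s).replaceAll(" ");
    }

    // Pattern.quote turns arbitrary punctuation into a literal pattern,
    // so "." is matched as a dot rather than as "any character".
    static String[] splitOnDot(String s) {
        return s.split(Pattern.quote("."));
    }
}
```

Pattern.quote is useful exactly when the punctuation is arbitrary, because the caller does not need to know which characters are regex metacharacters.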

Answer 1

If the definitions may change over time or between installations, I tend to put these things in a config file, and pick up the information at startup or on-demand (depending on the situation). Then provide a static class with read-only interface and clear names on the properties for exposing the information to the system.

Usage could look like this:

s = s.replaceAll(CharConfig.Quotation + CharConfig.Quotation, CharConfig.EmptyString);
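
A rough sketch of how such a class might look in Java, assuming the values come from a java.util.Properties object. The property keys ("quotation", "emptyString"), the defaults, and the accessor names are assumptions made for this example; Java has no C#-style properties, so the read-only interface is expressed as static getter methods here.

```java
import java.util.Properties;

// Sketch of a read-only configuration holder; keys and defaults are assumptions.
final class CharConfig {
    private static String quotation = "\"";
    private static String emptyString = "";

    private CharConfig() {}

    // Called once at startup (or on demand) with the loaded configuration.
    // Missing keys fall back to the current values.
    static void load(Properties props) {
        quotation = props.getProperty("quotation", quotation);
        emptyString = props.getProperty("emptyString", emptyString);
    }

    // Read-only accessors with clear names.
    static String quotation() {
        return quotation;
    }

    static String emptyString() {
        return emptyString;
    }
}
```

With this shape the call above would read s.replace(CharConfig.quotation() + CharConfig.quotation(), CharConfig.emptyString()). The sketch is not thread-safe; it assumes load is called before any other thread reads the values.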

Answer 2

For general string processing, I wouldn't use special symbols. A space is always going to be a space, and it's just more natural to read (and write!):

s.replace("String", " ");

than:

s.replace("String", S_SPACE);

I would take special care to use things like "\t" to represent tabs, for example, since they can't easily be distinguished from spaces in a string.

As for things like XML_PREFIX_SEPARATOR or FILE_SEPARATOR, you should probably never have to deal with constants like that, since you should use a library to do the work for you. For example, you shouldn't be hand-writing: dir + FILE_SEPARATOR + filename, but rather be calling: file_system_library.join(dir, filename) (or whatever equivalent you're using).

This way, you'll not only have an answer for things like the constants, you'll actually get much better handling of various edge cases which you probably aren't thinking about right now.
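
In Java, one equivalent of the file_system_library.join call above is java.nio.file.Paths.get. A minimal sketch, with a hypothetical wrapper class and method name:

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical wrapper; the point is that no separator constant appears anywhere.
final class PathJoinDemo {
    private PathJoinDemo() {}

    // Joins a directory and a file name using the platform's own rules.
    static Path join(String dir, String filename) {
        return Paths.get(dir, filename);
    }
}
```

A call such as PathJoinDemo.join("data", "report.txt") yields a Path whose string form uses whatever separator the current platform expects, so the calling code never mentions "/" or "\" at all.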
