Using magic strings or constants in processing punctuation?

We do a lot of lexical processing with arbitrary strings which include arbitrary punctuation. I am divided as to whether to use magic characters/strings or symbolic constants.

The examples should be read as language-independent although most are Java.

There are clear examples where punctuation has a semantic role and should be identified as a constant:

File.separator not "/" or "\\"; // a no-brainer as it is OS-dependent

and I write XML_PREFIX_SEPARATOR = ":";

However, let's say I need to replace all examples of "" with an empty string. I can write:

s = s.replaceAll("\"\"", "");

or

s = s.replaceAll(S_QUOT+S_QUOT, S_EMPTY);

(I have defined all common punctuation as S_FOO (string) and C_FOO (char))
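
To make the comparison concrete, here is a minimal sketch of what such a constants class might look like. The S_/C_ names follow the question; the class name Punct and the two helper methods are hypothetical and exist only for illustration.

```java
// Hypothetical constants holder; only the S_/C_ naming comes from the question.
final class Punct {
    private Punct() {}

    static final String S_QUOT = "\"";
    static final String S_APOS = "'";
    static final String S_EMPTY = "";
    static final char C_QUOT = '"';
    static final char C_APOS = '\'';

    // Remove every pair of adjacent double quotes, written with literals.
    static String stripWithLiterals(String s) {
        return s.replaceAll("\"\"", "");
    }

    // The same operation written with the named constants.
    static String stripWithConstants(String s) {
        return s.replaceAll(S_QUOT + S_QUOT, S_EMPTY);
    }
}
```

Both methods do the same thing; the only difference is how the punctuation is spelled in the source.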

In favour of magic strings/characters:

  1. It's shorter
  2. It's natural to read (sometimes)
  3. The named constants may not be familiar (C_APOS vs '\'')

In favour of constants

  1. It's harder to make typos (e.g. contrast "''" + '"' with S_APOS + S_APOS + C_QUOT)
  2. It removes escaping problems: should a regex be "\s+" or "s+" or "\\s+"?
  3. It's easy to search the code for punctuation

(There is a limit to this - I would not write regexes this way even though regex syntax is one of the most cognitively dysfunctional parts of all programming. I think we need a better syntax.)
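
On the escaping point, a short Java sketch may help. It assumes Java's rules, where a backslash in a regex has to be doubled inside a string literal, and it uses the standard Pattern.quote to treat a punctuation constant as literal text. The class name EscapingDemo and the constant S_DOT are hypothetical.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; S_DOT follows the question's S_FOO naming.
final class EscapingDemo {
    private EscapingDemo() {}

    static final String S_DOT = ".";

    // The regex \s+ is written "\\s+" in Java source: one backslash is
    // consumed by the string literal, the other reaches the regex engine.
    static final Pattern WHITESPACE = Pattern.compile("\\s+");

    // Collapse each run of whitespace into a single space.
    static String collapseWhitespace(String s) {
        return WHITESPACE.matcher(s).replaceAll(" ");
    }

    // Remove a punctuation string literally, whatever it means as a regex.
    static String removeLiteral(String s, String punctuation) {
        return s.replaceAll(Pattern.quote(punctuation), "");
    }
}
```

Note that a named constant does not by itself escape anything for the regex engine: "a.b".replaceAll(S_DOT, "") removes every character because "." matches any character, while removeLiteral("a.b", S_DOT) removes only the dot.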

Best answer

If the definitions may change over time or between installations, I tend to put these things in a config file and pick up the information at startup or on demand (depending on the situation). I then provide a static class with a read-only interface and clearly named properties to expose the information to the rest of the system.

Usage could look like this:

s = s.replaceAll(CharConfig.Quotation + CharConfig.Quotation, CharConfig.EmptyString);
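
A minimal Java sketch of such a class, under stated assumptions: the property keys, the inline configuration text (standing in for a real config file) and the accessor-method style are all hypothetical. Only the name CharConfig and the idea of a read-only static interface come from the answer.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Read-only, static access to punctuation definitions loaded once at startup.
final class CharConfig {
    // Assumption: in a real system this text would come from a config file.
    private static final String CONFIG_TEXT = "quotation=\"\nemptyString=\n";

    private static final Properties PROPS = load();

    private CharConfig() {}

    private static Properties load() {
        Properties p = new Properties();
        try {
            p.load(new StringReader(CONFIG_TEXT));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return p;
    }

    // Accessors fall back to a default if the key is missing.
    static String quotation() {
        return PROPS.getProperty("quotation", "\"");
    }

    static String emptyString() {
        return PROPS.getProperty("emptyString", "");
    }
}
```

With this sketch the call becomes s.replace(CharConfig.quotation() + CharConfig.quotation(), CharConfig.emptyString()), using accessor methods where the answer's example uses properties.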

Answers

For general string processing, I wouldn't use special symbols. A space is always going to be a space, and it's just more natural to read (and write!):

s.replace("String", " ");

Than:

s.replace("String", S_SPACE);

I would take special care to use things like "\t" to represent tabs, for example, since they can't easily be distinguished from spaces in a string.

As for things like XML_PREFIX_SEPARATOR or FILE_SEPARATOR, you should probably never have to deal with constants like that, since you should use a library to do the work for you. For example, you shouldn't be hand-writing: dir + FILE_SEPARATOR + filename, but rather be calling: file_system_library.join(dir, filename) (or whatever equivalent you're using).

This way, you'll not only have an answer for things like the constants, you'll actually get much better handling of various edge cases which you probably aren't thinking about right now.
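
In Java, one equivalent of the file_system_library.join placeholder is java.nio.file.Paths.get, which joins path elements using the platform's separator. A minimal sketch (the class name PathJoinDemo and the example path elements are hypothetical):

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical demo: let the standard library choose the separator.
final class PathJoinDemo {
    private PathJoinDemo() {}

    // Join a directory and a file name without naming any separator.
    static Path join(String dir, String filename) {
        return Paths.get(dir, filename);
    }

    public static void main(String[] args) {
        // Printed with "/" or "\" depending on the operating system.
        System.out.println(join("reports", "summary.txt"));
    }
}
```

A trailing separator on the directory, as in join("reports/", "summary.txt"), still yields a two-element path, which is the kind of edge case that hand-written concatenation tends to get wrong.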




