English 中文(简体)
Recursive woes - reducing an input string
原标题:

I m working on a portion of code that is essentially trying to reduce a list of strings down to a single string recursively.

I have an internal database built up of matching string arrays of varying length (say array lengths of 2-4).

An example input string array would be:

{"The", "dog", "ran", "away"}

And for further example, my database could be made up of string arrays in this manner:

(length 2) {{"The", "dog"},{"dog", "ran"}, {"ran", "away"}}
(length 3) {{"The", "dog", "ran"}.... and so on

So, what I am attempting to do is recursively reduce my input string array down to a single token. So ideally it would parse something like this:

1) {"The", "dog", "ran", "away"}

Say that (seq1) = {"The", "dog"} and (seq2) = {"ran", "away"}

2) {  (seq1), "ran", "away"}
3) {  (seq1), (seq2)}

In my sequence database I know that, for instance, seq3 = {(seq1), (seq2)}

4) {  (seq3)  }

So, when it is down to a single token, I m happy and the function would end.

Here is an outline of my current program logic:

public void Tokenize(Arraylist<T> string_array, int current_size)
{
  // retrieve all known sequences of length [current_size] (from global list array)
  loc_sequences_by_length = sequences_by_length[current_size-min_size]; // sequences of length 2 are stored in position 0 and so on

  // escape cases
  if (string_array.Count == 1)
  {
    // finished successfully
    return;
  }
  else if (string_array.Count < current_size)
  {
    // checking sequences of greater length than input string, bail
    return;
  }
  else
  {
    // split input string into chunks of size [current_size] and compare to local database 
    // of known sequences
    // (splitting code works fine)

    foreach (comparison)
    {
      if (match_found)
      {
        // update input string and recall function to find other matches
        string_array[found_array_position] = new_sequence;
        string_array.Removerange[found_array_position+1, new_sequence.Length-1];
        Tokenize(string_array, current_size)
      }
    }   
  }

  // ran through unsuccessfully, increment length and try again for new sequence group
  current_size++;
  if (current_size > MAX_SIZE)
    return;
  else
    Tokenize(string_array, current_size);
}

I thought it was straightforward enough, but have been getting some strange results. Generally it appears to work, but upon further review of my output data I m seeing some issues. Mainly, it appears to work up to a certain point...and at that point my curr_size counter resets to the minimum value.

So it is called with a size of 2, then 3, then 4, then resets to 2. My assumption was that it would run up to my predetermined max size, and then bail completely.

I tried to simplify my code as much as possible, so there are probably some simple syntax errors in transcribing. If there is any other detail that may help an eagle-eyed SO user, please let me know and I ll edit.

Thanks in advance

最佳回答

Not sure what all the issues are, but the first thing I d do is have your "catch-all" exit block right at the beginning of your method.

public void Tokenize(Arraylist<T> string_array, int current_size)
{
  if (current_size > MAX_SIZE)
    return;

  // Guts go here

  Tokenize(string_array, ++current_size);
}

A couple things:

  1. Your tokens are not clearly separated from your input string values. This makes it more difficult to handle, and to see what s going on.
  2. It looks like you re writing pseudo-code:
    • loc_sequences_by_length is not used
    • found_array_position is not defined
    • Arraylist should be ArrayList.
    • etc.

Overall I agree with James statement:

It would help if you cleaned up the code, to make it easier to read.

-Doug

问题回答

One bug is: string_array[found_array_position] = new_sequence;

I don t know where this is defined, and as far as I can tell if it was defined, it is never changed.

In your if statement, when if match_found ever set to true?

Also, it appears you have an extra close brace here, but you may want the last block of code to be outside of the function:

     }
    }   
  }

It would help if you cleaned up the code, to make it easier to read. Once we get past the syntactic errors it will be easier to see what is going on, I think.





相关问题
Anyone feel like passing it forward?

I m the only developer in my company, and am getting along well as an autodidact, but I know I m missing out on the education one gets from working with and having code reviewed by more senior devs. ...

NSArray s, Primitive types and Boxing Oh My!

I m pretty new to the Objective-C world and I have a long history with .net/C# so naturally I m inclined to use my C# wits. Now here s the question: I feel really inclined to create some type of ...

C# Marshal / Pinvoke CBitmap?

I cannot figure out how to marshal a C++ CBitmap to a C# Bitmap or Image class. My import looks like this: [DllImport(@"test.dll", CharSet = CharSet.Unicode)] public static extern IntPtr ...

How to Use Ghostscript DLL to convert PDF to PDF/A

How to user GhostScript DLL to convert PDF to PDF/A. I know I kind of have to call the exported function of gsdll32.dll whose name is gsapi_init_with_args, but how do i pass the right arguments? BTW, ...

Linqy no matchy

Maybe it s something I m doing wrong. I m just learning Linq because I m bored. And so far so good. I made a little program and it basically just outputs all matches (foreach) into a label control. ...

热门标签