How can I avoid CR and LF when searching a file for a phrase that straddles two lines?
  • Date: 2012-01-16 00:57:04
  • Tags:
  • c#
  • string

I'm trying to search an HTML file for a list of words or phrases and write the file back out with added HTML tags around those words/phrases. The rest of the file needs to remain as is. I don't know how to get around the situation where a phrase is broken across two lines. Can anyone help? I'm new to this, so please be explicit in your answer.

Here is a sample of the input file (the HTML tags are on a separate line):

<p>
The thousand injuries of Fortunato I had borne as I best could, but
when he ventured upon insult, I vowed revenge.  You, who so well know
the nature of my soul, will not suppose, however, that I gave utterance
to a threat.  <i>At length</i> I would be avenged; this was a point definitely

Here is the code so far:

    //get the table of words
    DataTable table = LibraryAccess.GetWords(titleID);

    using (StreamReader streamReader = File.OpenText(fileUploadPath))
    {
        //the using block closes the reader, so no explicit Close() is needed
        inputString = streamReader.ReadToEnd();
        textCopy.Append(inputString);
    }

    //ReadToEnd never returns null, so test for an empty string as well
    if (!string.IsNullOrEmpty(inputString))
    {
        inputString = inputString.ToUpper();

        foreach (DataRow r in table.Rows)
        {
            searchWord = (r["Word"].ToString()).ToUpper();
            wordLength = searchWord.Length;
            foundIndex = inputString.IndexOf(searchWord);

            //if (foundIndex >= 0)
            //{

                //Use the Stringbuilder to modify the output file, e.g. add Bold tags
                //around the word/expression
            //}

            foundIndex = -1;

        }
    }
    else
    {
        Response.Write("input string is empty");
    }
  }  

The phrase I'm searching for is "gave utterance to". In the source file there is a CRLF after "utterance", so IndexOf does not find it. I could easily replace the CRLF with a blank, but I need to put the line breaks back when I write out the modified version, and I don't know how to preserve them.

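Any ideas on how to do this efficiently? I have been spending a lot of time on this. I originally did it while reading the file line by line, which I would prefer for memory reasons, but I ran into the same problem there.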

Answers

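How about parsing the HTML into a document tree before trying to process it? Running it through HtmlAgilityPack should help with extracting the text from the file.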
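As a rough sketch of that idea, the snippet below loads the markup with HtmlAgilityPack, takes the text without tags, and collapses every run of whitespace (including CR/LF) to a single space before searching. It assumes the HtmlAgilityPack NuGet package is referenced; the class and method names (`PlainTextSearch`, `ExtractText`, `ContainsPhrase`) are made up for illustration. It only tells you whether a phrase occurs; it does not write the tagged file back out.

```csharp
using System;
using System.Text.RegularExpressions;
using HtmlAgilityPack;

// Hypothetical helper, not part of the original question.
public static class PlainTextSearch
{
    // Returns the document text with tags removed, entities decoded,
    // and every run of whitespace (spaces, tabs, CR, LF) collapsed to one space.
    public static string ExtractText(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);
        string text = HtmlEntity.DeEntitize(doc.DocumentNode.InnerText);
        return Regex.Replace(text, @"\s+", " ").Trim();
    }

    // Case-insensitive check for the phrase in the normalized text.
    public static bool ContainsPhrase(string html, string phrase)
    {
        return ExtractText(html)
            .IndexOf(phrase, StringComparison.OrdinalIgnoreCase) >= 0;
    }
}
```

Because the tags are gone from the extracted text, a phrase that is interrupted by markup (for example one that starts inside `<i>` and ends outside it) is found as well as one that is split by a line break.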

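In general, what I would do is convert your HTML to plain text and then run the search on that. It may be possible to do this with HtmlAgilityPack, but there is also a project that does it with an "ugly RegEx" search. I have not used it myself, so I don't know whether it handles line breaks inside the HTML, but it may be worth a shot.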
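If the file has to be written back with its line breaks intact, a different technique from the plain-text conversion is to leave the file alone and make the search itself tolerant of whitespace. The sketch below uses only `System.Text.RegularExpressions`: each space in the phrase becomes `\s+`, so it matches a space, a tab, or a CRLF, and the matched text is copied to the output unchanged inside the new tags. The names `PhraseTagger`, `BuildPattern`, and `WrapPhrase` are hypothetical, and `<b>` is just the example tag from the question.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of the original question.
public static class PhraseTagger
{
    // Builds a pattern in which every gap between words of the phrase
    // matches any run of whitespace in the file (spaces, tabs, CR, LF).
    public static string BuildPattern(string phrase)
    {
        // A null separator makes Split break on any whitespace.
        string[] words = phrase.Split((char[])null,
                                      StringSplitOptions.RemoveEmptyEntries);
        return string.Join(@"\s+", words.Select(w => Regex.Escape(w)));
    }

    // Wraps every occurrence of the phrase in <b> tags, ignoring case.
    // The matched text, including any line break inside it, is kept as is.
    public static string WrapPhrase(string html, string phrase)
    {
        string pattern = BuildPattern(phrase);
        if (pattern.Length == 0)
        {
            return html; // nothing to search for
        }
        return Regex.Replace(html, pattern,
                             m => "<b>" + m.Value + "</b>",
                             RegexOptions.IgnoreCase);
    }
}
```

For example, `PhraseTagger.WrapPhrase("that I gave utterance\r\nto a threat.", "gave utterance to")` returns `"that I <b>gave utterance\r\nto</b> a threat."`, so the CRLF survives. Since the match is case-insensitive, the `ToUpper` copy of the input is no longer needed. Note that this sketch also matches inside longer words and inside tag attributes, and it will not find a phrase that is interrupted by a tag.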




