Original title: Get plain text from HTML in .NET


public string GetPlainText(string htmlString)
{
    // any .NET built-in utility?
}




// requires: using System.Text.RegularExpressions;
string htmlString = @"<p>I'm HTML!</p>";
string plainText = Regex.Replace(htmlString, @"<(.|\n)*?>", "");


//using microsoft.mshtml
HTMLDocument htmldoc = new HTMLDocument();
IHTMLDocument2 htmldoc2 = (IHTMLDocument2)htmldoc;
htmldoc2.write(new object[] { "<p>Plateau <i>of<i> <b>Leng</b><hr /><b erp=\"arp\">2 sugars please</b> <xxx>what? &amp; who?" });

string txt = htmldoc2.body.outerText;

Plateau of Leng 2 sugars please what? & who?


If you need to parse HTML, I have had good experiences with a library called HTML Agility Pack.
It parses an HTML file and provides access to it through a DOM, similar to the XML classes.
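
As a rough illustration of that approach, here is a minimal sketch that assumes the HtmlAgilityPack NuGet package is referenced. The class name `HtmlAgilityPackExample` and the method name are made up for this example and are not part of the library.

```csharp
using HtmlAgilityPack;

public static class HtmlAgilityPackExample
{
    // Hypothetical helper: parse the markup into a DOM, then read its text.
    public static string GetPlainText(string htmlString)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(htmlString);

        // InnerText concatenates the text of all descendant nodes.
        // Entities such as &amp; are still encoded at this point.
        string text = doc.DocumentNode.InnerText;

        // Decode the entities into their plain characters.
        return HtmlEntity.DeEntitize(text);
    }
}
```

Because the parser tolerates malformed markup, this tends to cope better with unclosed or unknown tags than a regular expression does. Depending on the input, the text of script or style elements may also end up in the result, so those nodes may need to be removed first.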

Personally, I found that combining a regex with HttpUtility is the best and shortest solution.

Return HttpUtility.HtmlDecode(
                Regex.Replace(HtmlString, "<(.|\n)*?>", "")
                )
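
For reference, the same idea can be written as a complete C# method. This is only a sketch: it swaps in `WebUtility.HtmlDecode` from `System.Net` for `HttpUtility.HtmlDecode` so that no reference to `System.Web` is needed, and the class name `HtmlText` is an assumption made for the example.

```csharp
using System.Net;
using System.Text.RegularExpressions;

public static class HtmlText
{
    // Matches anything that looks like a tag, including tags that span lines.
    private static readonly Regex TagPattern = new Regex(@"<(.|\n)*?>");

    // Hypothetical helper: strip tags first, then decode entities such as &amp;.
    public static string GetPlainText(string htmlString)
    {
        if (string.IsNullOrEmpty(htmlString))
        {
            return string.Empty;
        }

        string withoutTags = TagPattern.Replace(htmlString, "");
        return WebUtility.HtmlDecode(withoutTags);
    }
}
```

Calling `HtmlText.GetPlainText("<p>what? &amp; who?</p>")` strips the paragraph tags and decodes the entity. Keep in mind that a regular expression is not an HTML parser: text that contains a literal `<` followed later by `>` is removed as if it were a tag.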


There is no built-in method for this in .NET. However, as @rudi_visser pointed out, it can be done with regular expressions.

