I have a DB with some text fields pasted from MS Word, and I'm having trouble stripping just the <div>, <font> and <span> tags while keeping their inner text.
I've tried using the HAP (Html Agility Pack), but I'm not getting anywhere.
Imports HtmlAgilityPack

Public Function StripHtml(ByVal html As String, ByVal allowHarmlessTags As Boolean) As String
    Dim htmlDoc As New HtmlDocument()
    htmlDoc.LoadHtml(html)
    ' Select every div, font and span element in the document
    Dim invalidNodes As HtmlNodeCollection = htmlDoc.DocumentNode.SelectNodes("//div|//font|//span")
    For Each node As HtmlNode In invalidNodes
        ' Remove the element from its parent
        node.ParentNode.RemoveChild(node, False)
    Next
    ' Serialize the modified document back to a string
    Return htmlDoc.DocumentNode.WriteTo()
End Function
This code simply selects the desired elements and removes them, but it does not keep their inner text.
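One direction that might work, as a minimal sketch rather than a confirmed fix: the second argument of HAP's RemoveChild is keepGrandChildren, so passing True should move the removed element's children up into its parent instead of discarding them. The module and function names below are hypothetical, and the Nothing check assumes SelectNodes returns Nothing when no element matches. Nested matches (for example a span inside a div) are worth testing against the HAP version in use.

```vbnet
Imports HtmlAgilityPack

Public Module HtmlCleaner
    ' Hypothetical helper: unwraps div/font/span elements but keeps their contents.
    Public Function UnwrapTags(ByVal html As String) As String
        Dim htmlDoc As New HtmlDocument()
        htmlDoc.LoadHtml(html)

        Dim nodes As HtmlNodeCollection =
            htmlDoc.DocumentNode.SelectNodes("//div|//font|//span")

        ' Assumption: SelectNodes yields Nothing when nothing matches.
        If nodes Is Nothing Then Return html

        For Each node As HtmlNode In nodes
            ' True = keepGrandChildren: the children are re-attached to the
            ' parent in place of the removed element.
            node.ParentNode.RemoveChild(node, True)
        Next

        Return htmlDoc.DocumentNode.WriteTo()
    End Function
End Module
```

It would be called as HtmlCleaner.UnwrapTags(someHtml); the allowHarmlessTags parameter from the original function is left out here because it is not used by the removal logic.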
Thanks in advance