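I'd like to know whether there is any C# class or third-party library that removes dangerous content such as script tags.
I know you can use regex, but I also know that people can write their script tags in ways that fool the regex into treating them as OK.
I have also heard that HTML Agility Pack is good, so I'd like to know whether any script-removal class has been written for it.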
Edit:
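I found this one. However, I'm not convinced the solution is complete, because the author has no tests to verify it. It would be nice if it were hosted somewhere that people who use the script every day could check whether anything gets through.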
Great example (almost), Thanks! A few ways to make it stronger that I saw, though:
1) Use case-insensitive search when looking for links with "javascript:", "vbscript:", "jscript:". For example, the original example would not remove the HTML:
<a href="JAVAscRipt:alert('hi')">click me</a>
2) Remove any style attributes that contain an expression rule. Internet Explorer evaluates the CSS expression rule as script. For example, the following would produce a message box:
<div style="width:expression(alert('hi'));">bad code</div>
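3) Remove potentially harmful tags as well (the code below strips script, link, iframe, frameset, frame, applet, object and embed elements).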
I honestly have no idea why "expression" has not been removed from IE; it's a major flaw in my opinion. (Try the div example in Internet Explorer and you'll see why, even in IE8.) I just wish there was an easier, standard way to clean up HTML input from a user.
Here's the code updated with these improvements. Let me know if you see anything wrong:
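The first two checks can be sketched with plain string comparisons, independent of any HTML parser. This is only an illustration: the class and method names (UrlChecks, HasScriptScheme, HasCssExpression) are made up for the example, and trimming leading whitespace before the scheme check is an extra assumption, not something the original example does.

```csharp
using System;

public static class UrlChecks
{
    // Hypothetical list of URL schemes that a browser may execute as script.
    private static readonly string[] ScriptSchemes = { "javascript:", "jscript:", "vbscript:" };

    // True when the attribute value starts with a script scheme,
    // ignoring case and leading whitespace (the trimming is an assumption).
    public static bool HasScriptScheme(string value)
    {
        if (string.IsNullOrEmpty(value)) return false;
        string trimmed = value.TrimStart();
        foreach (string scheme in ScriptSchemes)
        {
            if (trimmed.StartsWith(scheme, StringComparison.OrdinalIgnoreCase)) return true;
        }
        return false;
    }

    // True when a style attribute value mentions "expression" in any casing.
    public static bool HasCssExpression(string style)
    {
        return style != null &&
               style.IndexOf("expression", StringComparison.OrdinalIgnoreCase) >= 0;
    }
}
```

Using an ordinal, case-insensitive comparison avoids culture-specific casing surprises and catches mixed-case values like the JAVAscRipt link above.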
// Requires the HtmlAgilityPack library (using HtmlAgilityPack;)
public string ScrubHTML(string html)
{
    HtmlDocument doc = new HtmlDocument();
    doc.LoadHtml(html);

    //Remove potentially harmful elements
    HtmlNodeCollection nc = doc.DocumentNode.SelectNodes("//script|//link|//iframe|//frameset|//frame|//applet|//object|//embed");
    if (nc != null)
    {
        foreach (HtmlNode node in nc)
        {
            node.ParentNode.RemoveChild(node, false);
        }
    }

    //remove hrefs to java/j/vbscript URLs (translate() lower-cases the value so the match is case-insensitive)
    nc = doc.DocumentNode.SelectNodes("//a[starts-with(translate(@href, 'ABCDEFGHIJKLMNOPQRSTUVWXYZ', 'abcdefghijklmnopqrstuvwxyz'), 'javascript')]|//a[starts-with(translate(@href, 'ABCDEFGHIJKLMNOPQRSTUVWXYZ', 'abcdefghijklmnopqrstuvwxyz'), 'jscript')]|//a[starts-with(translate(@href, 'ABCDEFGHIJKLMNOPQRSTUVWXYZ', 'abcdefghijklmnopqrstuvwxyz'), 'vbscript')]");
    if (nc != null)
    {
        foreach (HtmlNode node in nc)
        {
            node.SetAttributeValue("href", "#");
        }
    }

    //remove img with refs to java/j/vbscript URLs
    nc = doc.DocumentNode.SelectNodes("//img[starts-with(translate(@src, 'ABCDEFGHIJKLMNOPQRSTUVWXYZ', 'abcdefghijklmnopqrstuvwxyz'), 'javascript')]|//img[starts-with(translate(@src, 'ABCDEFGHIJKLMNOPQRSTUVWXYZ', 'abcdefghijklmnopqrstuvwxyz'), 'jscript')]|//img[starts-with(translate(@src, 'ABCDEFGHIJKLMNOPQRSTUVWXYZ', 'abcdefghijklmnopqrstuvwxyz'), 'vbscript')]");
    if (nc != null)
    {
        foreach (HtmlNode node in nc)
        {
            node.SetAttributeValue("src", "#");
        }
    }

    //remove on<Event> handlers from all tags
    //(the HTML double-click attribute is ondblclick, not ondoubleclick)
    nc = doc.DocumentNode.SelectNodes("//*[@onclick or @onmouseover or @onfocus or @onblur or @onmouseout or @ondblclick or @onload or @onunload]");
    if (nc != null)
    {
        foreach (HtmlNode node in nc)
        {
            node.Attributes.Remove("onfocus");
            node.Attributes.Remove("onblur");
            node.Attributes.Remove("onclick");
            node.Attributes.Remove("onmouseover");
            node.Attributes.Remove("onmouseout");
            node.Attributes.Remove("ondblclick");
            node.Attributes.Remove("onload");
            node.Attributes.Remove("onunload");
        }
    }

    // remove any style attributes that contain the word expression (IE evaluates this as script)
    nc = doc.DocumentNode.SelectNodes("//*[contains(translate(@style, 'ABCDEFGHIJKLMNOPQRSTUVWXYZ', 'abcdefghijklmnopqrstuvwxyz'), 'expression')]");
    if (nc != null)
    {
        foreach (HtmlNode node in nc)
        {
            node.Attributes.Remove("style");
        }
    }

    return doc.DocumentNode.WriteTo();
}
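Since the question worries that nobody is testing this kind of scrubber, here is a rough sketch of a check harness. It takes any sanitizer as a delegate, so it does not depend on HtmlAgilityPack itself. The names SanitizerChecks and FindLeaks are hypothetical, the sample inputs are the ones discussed above plus a bare script tag, and the marker list is only a heuristic, not a proof that the output is safe.

```csharp
using System;
using System.Collections.Generic;

public static class SanitizerChecks
{
    // Sample attack strings taken from the examples in this answer.
    private static readonly string[] Samples =
    {
        "<a href=\"JAVAscRipt:alert('hi')\">click me</a>",
        "<div style=\"width:expression(alert('hi'));\">bad code</div>",
        "<script>alert('hi')</script>"
    };

    // Substrings that should not survive sanitizing (heuristic only).
    private static readonly string[] Markers = { "javascript:", "expression(", "<script" };

    // Runs every sample through the sanitizer and returns the inputs
    // whose output still contains at least one marker, ignoring case.
    public static List<string> FindLeaks(Func<string, string> sanitize)
    {
        var leaks = new List<string>();
        foreach (string sample in Samples)
        {
            string output = sanitize(sample) ?? string.Empty;
            foreach (string marker in Markers)
            {
                if (output.IndexOf(marker, StringComparison.OrdinalIgnoreCase) >= 0)
                {
                    leaks.Add(sample);
                    break;
                }
            }
        }
        return leaks;
    }
}
```

From the class that defines ScrubHTML, it could be called as SanitizerChecks.FindLeaks(html => ScrubHTML(html)). An empty result only means none of these particular samples leaked a marker, so the sample list should grow as new tricks turn up.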