How do I strip all html tags in javascript with exceptions?

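I've been beating my head against this regex for the longest time now and am hoping someone can help. Basically, I have a WYSIWYG field where a user can type formatted text. Of course, they will also copy and paste from Word, websites and so on. So I have a JS function that catches the input on paste. It strips all of the formatting from the text, but I'd like it to leave tags like p and br so the result is not just one big mess.

Any regex ninjas out there? Here is what I have so far, and it works. It just needs to allow some tags.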

o.node.innerHTML=o.node.innerHTML.replace(/(<([^>]+)>)/ig,"");
Best answer

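The browser already has a perfectly good parsed HTML tree in o.node. Serialising the document content to HTML (using innerHTML), trying to hack it about with a regex (which cannot parse HTML reliably), and then re-parsing the result back into document content by setting innerHTML... is really a bit perverse.

Instead, inspect the element and attribute nodes you already have inside o.node and remove the ones you don't want, for example: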

filterNodes(o.node, {p: [], br: [], a: ['href']});

defined as:

// Remove elements and attributes that do not meet a whitelist lookup of lowercase element
// name to list of lowercase attribute names.
//
function filterNodes(element, allow) {
    // Recurse into child elements
    //
    Array.fromList(element.childNodes).forEach(function(child) {
        if (child.nodeType===1) {
            filterNodes(child, allow);

            var tag= child.tagName.toLowerCase();
            if (tag in allow) {

                // Remove unwanted attributes
                //
                Array.fromList(child.attributes).forEach(function(attr) {
                    if (allow[tag].indexOf(attr.name.toLowerCase())===-1)
                       child.removeAttributeNode(attr);
                });

            } else {

                // Replace unwanted elements with their contents
                //
                while (child.firstChild)
                    element.insertBefore(child.firstChild, child);
                element.removeChild(child);
            }
        }
    });
}

// ECMAScript Fifth Edition (and JavaScript 1.6) array methods used by `filterNodes`.
// Because not all browsers have these natively yet, bodge in support if missing.
//
if (!('indexOf' in Array.prototype)) {
    Array.prototype.indexOf= function(find, ix /*opt*/) {
        for (var i= ix || 0, n= this.length; i<n; i++)
            if (i in this && this[i]===find)
                return i;
        return -1;
    };
}
if (!('forEach' in Array.prototype)) {
    Array.prototype.forEach= function(action, that /*opt*/) {
        for (var i= 0, n= this.length; i<n; i++)
            if (i in this)
                action.call(that, this[i], i, this);
    };
}

// Utility function used by filterNodes. This is really just `Array.prototype.slice()`
// except that the ECMAScript standard doesn't guarantee we're allowed to call that on
// a host object like a DOM NodeList, boo.
//
Array.fromList= function(list) {
    var array= new Array(list.length);
    for (var i= 0, n= list.length; i<n; i++)
        array[i]= list[i];
    return array;
};
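
For current browsers the shims above are no longer needed, since Array.from, forEach and includes are built in. The following is a minimal sketch of the same whitelist approach under that assumption; filterNodesModern is a hypothetical name, not part of the answer. It also swaps the `in` test for hasOwnProperty so that inherited keys such as "constructor" are not treated as allowed tags, which is a small departure from the original code.

```javascript
// Sketch of the same whitelist filter for current browsers.
// `filterNodesModern` is a hypothetical name, not part of the answer above.
function filterNodesModern(element, allow) {
    // Copy the live NodeList first, because the loop mutates the tree.
    Array.from(element.childNodes).forEach(function (child) {
        if (child.nodeType !== 1) return;   // 1 === Node.ELEMENT_NODE
        filterNodesModern(child, allow);    // clean descendants first

        var tag = child.tagName.toLowerCase();
        if (Object.prototype.hasOwnProperty.call(allow, tag)) {
            // Allowed element: drop every attribute not on its list.
            Array.from(child.attributes).forEach(function (attr) {
                if (!allow[tag].includes(attr.name.toLowerCase()))
                    child.removeAttribute(attr.name);
            });
        } else {
            // Disallowed element: hoist its children, then remove it.
            while (child.firstChild)
                element.insertBefore(child.firstChild, child);
            element.removeChild(child);
        }
    });
}

// Browser-only usage; skipped where there is no DOM (for example in Node).
if (typeof document !== 'undefined') {
    var box = document.createElement('div');
    box.innerHTML = '<p style="color:red">Hi <b>there</b><br>' +
                    '<a href="#" onclick="x()">link</a></p>';
    filterNodesModern(box, {p: [], br: [], a: ['href']});
    console.log(box.innerHTML);
}
```

As in the original, children are cleaned before their parent is examined, so the contents of a disallowed element are already filtered by the time they are hoisted into its place.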
Other answers

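First, I'm not sure a regex is the right tool for this. A user might enter invalid HTML (forgetting a >, or putting a > inside an attribute), and the regex would then fail. That said, I don't know whether a parser would be much better or more robust.

Second, your regex has a few unnecessary parentheses.

Third, you can use a lookahead to exclude certain tags: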

o.node.innerHTML=o.node.innerHTML.replace(/<(?!\s*\/?(br|p)\b)[^>]+>/ig,"");

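Explanation:

< matches the opening angle bracket.

(?!\s*\/?(br|p)\b) asserts that it is not possible to match zero or more whitespace characters, zero or one /, and either br or p, followed directly by a word boundary. The word boundary is important because otherwise the lookahead would also be triggered by tags such as <pre>, which merely start with p.

[^>]+ matches one or more characters that are not a closing angle bracket.

> matches the closing angle bracket.

Note that you may run into trouble if a closing angle bracket appears somewhere inside a tag, for example in an attribute value.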

This will match (and strip)

<pre> <a href="dot.com"> </a> </pre>

etc., and leave

< /br>

etc. alone.
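
The behaviour described in this answer can be checked on plain strings, without a DOM. The sketch below wraps the same lookahead pattern in a hypothetical helper, stripTagsExcept, that builds the regex from a list of tag names; it assumes those names are plain alphanumerics, so no regex escaping is applied. The second call illustrates the caveat about a > inside an attribute value.

```javascript
// Hypothetical helper (not from the answer): build the lookahead regex
// from a list of tag names that should survive.
function stripTagsExcept(html, allowedTags) {
    // Assumes tag names are plain alphanumerics, so no escaping is needed.
    var pattern = new RegExp(
        '<(?!\\s*\\/?(' + allowedTags.join('|') + ')\\b)[^>]+>', 'ig');
    return html.replace(pattern, '');
}

var input = '<pre>a</pre><p>b<br />c</p><a href="dot.com">d</a>';
var output = stripTagsExcept(input, ['br', 'p']);
console.log(output);   // a<p>b<br />c</p>d

// The caveat: a ">" inside an attribute value ends the match early,
// so the tail of the tag is left behind as text.
var broken = stripTagsExcept('<a title="x > y">link</a>', ['br', 'p']);
console.log(broken);   //  y">link
```

Note that this only removes the tags themselves: attributes on the allowed p and br tags are kept untouched, unlike the node-filtering approach in the best answer.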




