English 中文(简体)
移除MSWord产生的超文本标签
原标题:Java class to remove HTML tags generated by MSWord

Some HTML forms are filled by users using copy&paste from MSWord, in FCK editors, or others. This generates nasty tags annoying other tools to work fine. Is there a way the server can clean the incoming parameters, so nasty HTML tags would be removed?

当然,只要用户能够书写任何东西,定期表达是不有用的。

我指的是专门从事这项工作的 Java类。

举例来说,所有这些可能都可以用真空扼制来取代。

<p><!--[if gte mso 9]><xml>
<w:WordDocument>
<w:View>Normal</w:View>
<w:Zoom>0</w:Zoom>
<w:HyphenationZone>21</w:HyphenationZone>
<w:PunctuationKerning />
<w:ValidateAgainstSchemas />
<w:SaveIfXMLInvalid>false</w:SaveIfXMLInvalid>
<w:IgnoreMixedContent>false</w:IgnoreMixedContent>
<w:AlwaysShowPlaceholderText>false</w:AlwaysShowPlaceholderText>
<w:Compatibility>
<w:BreakWrappedTables />
<w:SnapToGridInCell />
<w:WrapTextWithPunct />
<w:UseAsianBreakRules />
<w:DontGrowAutofit />
</w:Compatibility>
<w:BrowserLevel>MicrosoftInternetExplorer4</w:BrowserLevel>
</w:WordDocument>
</xml><![endif]--><!--[if gte mso 9]><xml>
<w:LatentStyles DefLockedState="false" LatentStyleCount="156">
</w:LatentStyles>
</xml><![endif]--><!--[if gte mso 10]>
<style>
/* Style Definitions */
table.MsoNormalTable
{mso-style-name:"Tabla normal";
mso-tstyle-rowband-size:0;
mso-tstyle-colband-size:0;
mso-style-noshow:yes;
mso-style-parent:"";
mso-padding-alt:0cm 5.4pt 0cm 5.4pt;
mso-para-margin:0cm;
mso-para-margin-bottom:.0001pt;
mso-pagination:widow-orphan;
font-size:10.0pt;
font-family:"Times New Roman";
mso-ansi-language:#0400;
mso-fareast-language:#0400;
mso-bidi-language:#0400;}
</style>
<![endif]--></p>
最佳回答
问题回答

FCKEditor has a "paste from word" button that works very well. Could you ask your users to use this functionality?

You could try JTidy. It s a Java port of HTMLtidy, which can do the type of cleanup you re looking for. Caveat emptor: I haven t used JTidy and I have no idea how well it works.

最新版本的CKEditor支持从字句中 past出的汽车探测,这意味着他们不得不使用ton子,即使 but子在那里。 它将发现从字面上重复,并提议加以清理或改成直截面。

cx4j生产干净的超文本,专门用于通过CKEditor进行往返。





相关问题
Spring Properties File

Hi have this j2ee web application developed using spring framework. I have a problem with rendering mnessages in nihongo characters from the properties file. I tried converting the file to ascii using ...

Logging a global ID in multiple components

I have a system which contains multiple applications connected together using JMS and Spring Integration. Messages get sent along a chain of applications. [App A] -> [App B] -> [App C] We set a ...

Java Library Size

If I m given two Java Libraries in Jar format, 1 having no bells and whistles, and the other having lots of them that will mostly go unused.... my question is: How will the larger, mostly unused ...

How to get the Array Class for a given Class in Java?

I have a Class variable that holds a certain type and I need to get a variable that holds the corresponding array class. The best I could come up with is this: Class arrayOfFooClass = java.lang....

SQLite , Derby vs file system

I m working on a Java desktop application that reads and writes from/to different files. I think a better solution would be to replace the file system by a SQLite database. How hard is it to migrate ...

热门标签