How to truncate an HTML string without leaving it malformed?
  • Time: 2010-02-24 08:31:19
  • Tags: c#

I have to display the first N characters (say 50 or 100) of an HTML string, and the result must still be well-formed HTML. If I apply a simple substring, I get a malformed HTML string, e.g.:

Sample string: <html><body><a href="http://foo.com">foo</a></body></html>

Truncated string: <html><body><a href="http://foo.com">foo<

This gives me malformed HTML :(

Any ideas on how to achieve this?

Answers

Parse the HTML into a DOM tree. Start at the deepest (innermost) element, then:

  • Remove the content of the innermost node, or the node itself if it has no content.
  • Check the string length.

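Lather, rinse, repeat until the string is short enough.

If your desired length is small enough, this may truncate your string all the way down to an empty string.

For extra fun, you could also try stripping attributes from the nodes as you go.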
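The pruning loop described in this answer can be sketched roughly as follows. This is only an illustration, written in Python for brevity rather than C#, and it rests on several assumptions: the input is well-formed enough to be parsed as XML, the limit applies to the length of the serialized markup, and ties between equally deep nodes go to the last one in document order. The names `truncate_html`, `deepest_leaf`, and `serialize` are hypothetical helpers, not part of any library.

```python
import xml.etree.ElementTree as ET


def serialize(root):
    # method="html" keeps an emptied element as <a></a> instead of <a />.
    return ET.tostring(root, encoding="unicode", method="html")


def deepest_leaf(root):
    """Return (node, parent) for the deepest element.

    Ties go to the last such element in document order, so pruning
    eats the document from the end (an assumption of this sketch).
    """
    best = [root, None, 0]

    def visit(node, parent, depth):
        if depth >= best[2]:
            best[:] = [node, parent, depth]
        for child in node:
            visit(child, node, depth + 1)

    visit(root, None, 0)
    return best[0], best[1]


def truncate_html(html, limit):
    """Prune innermost nodes until the serialized markup fits in `limit`."""
    root = ET.fromstring(html)  # assumes well-formed, XML-compatible input
    while len(serialize(root)) > limit:
        node, parent = deepest_leaf(root)
        if node.text:
            node.text = None      # first drop the node's content
        elif parent is None:
            return ""             # even the bare root element is too long
        else:
            parent.remove(node)   # no content left: drop the node (and any text trailing it)
    return serialize(root)


sample = '<html><body><a href="http://foo.com">foo</a></body></html>'
print(truncate_html(sample, 55))
```

Because whole nodes are removed at a time, the result can end up well below the limit, and a small limit yields an empty string, as the answer warns. Real-world HTML that is not XML-compatible would need a tolerant HTML parser in place of `ET.fromstring`.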

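I've seen some forum systems that simply append </b></u></i></s> after every post. You could handle it in a similar way.

Of course, that's just an ugly hack, but it is definitely the simplest way. A better approach would be to actually build a tree and then... kick out nodes until the requirement is met.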
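A slightly less blunt variant of the forum trick is to cut the string, discard any tag that was sliced in half, and append closers only for the tags that are actually still open. The sketch below is an illustration of that variant, not the answerer's code. It is in Python rather than C#, `OpenTagTracker` and `truncate_and_close` are hypothetical names, and `VOID_TAGS` is a deliberately partial list of elements that never take a closing tag.

```python
from html.parser import HTMLParser

# Partial list of void elements (assumption: extend as needed).
VOID_TAGS = {"br", "hr", "img", "input", "meta", "link"}


class OpenTagTracker(HTMLParser):
    """Records which tags are still open after the fed markup."""

    def __init__(self):
        super().__init__()
        self.open_tags = []

    def handle_starttag(self, tag, attrs):
        if tag not in VOID_TAGS:
            self.open_tags.append(tag)

    def handle_endtag(self, tag):
        if tag in self.open_tags:
            # Pop up to and including the matching start tag.
            while self.open_tags.pop() != tag:
                pass


def truncate_and_close(html, limit):
    cut = html[:limit]
    # Drop a tag that was cut in half, e.g. a trailing "<" or "<a hre".
    last_open = cut.rfind("<")
    if last_open > cut.rfind(">"):
        cut = cut[:last_open]
    tracker = OpenTagTracker()
    tracker.feed(cut)
    tracker.close()
    closers = "".join(f"</{tag}>" for tag in reversed(tracker.open_tags))
    return cut + closers


sample = '<html><body><a href="http://foo.com">foo</a></body></html>'
print(truncate_and_close(sample, 41))
```

Note that the appended closers make the result longer than `limit`, so the limit here bounds only the part kept from the original string. The sketch also does not guard against cutting through an entity such as `&amp;`.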




