Smart detection of html content changes

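I am looking for an algorithm or library (preferably in C#) that can intelligently detect changes in the content of an HTML page.

For example, if the page is techcrunch.com, it should be flagged only when a new article appears or the page changes significantly. It should ignore HTML comments, JavaScript, and minor updates such as comment counts.

Can anyone point me in the right direction?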

Answers

You can use JavaScript to count the number of elements on the page, or in a specific part of the page. There are thousands of ways to run JS to detect changes.
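The element-count idea can also be sketched from C#, for the case where the page is downloaded by a program instead of being inspected in a browser. This is only a rough sketch: the `ElementCounter` class and its method names are hypothetical, it assumes new articles show up as additional elements with a known tag name such as `article` (an assumption about the page's markup), and it uses regular expressions, which are fragile on real-world HTML. A proper HTML parser would be sturdier.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of any library.
public static class ElementCounter
{
    // Removes HTML comments and <script>/<style> blocks so that
    // nothing inside them contributes to the count.
    public static string StripNoise(string html)
    {
        string noComments = Regex.Replace(
            html, @"<!--.*?-->", "", RegexOptions.Singleline);
        return Regex.Replace(
            noComments,
            @"<(script|style)\b[^>]*>.*?</\1\s*>",
            "",
            RegexOptions.Singleline | RegexOptions.IgnoreCase);
    }

    // Counts opening tags with the given name, e.g. "article".
    public static int CountElements(string html, string tagName)
    {
        string pattern = "<" + Regex.Escape(tagName) + @"(\s[^>]*)?/?>";
        return Regex.Matches(
            StripNoise(html), pattern, RegexOptions.IgnoreCase).Count;
    }
}
```

Storing the count from the previous fetch and comparing it with `ElementCounter.CountElements(html, "article")` on the next one gives a coarse "new article" signal that is unaffected by comments, scripts, or changes to text such as comment counts.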

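I assume you are requesting the page from your C# program.

In fact, there are hundreds of ways to do this.

I'll give you one:

First, the simplest and most naive algorithm...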

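// Needs: using System; using System.Threading;
DateTime lastSeen = DateTime.MinValue;

while (true) {
    // checkModifyDate() is assumed to return the page's last-modified date.
    DateTime modified = checkModifyDate();

    if (modified > lastSeen) {
        lastSeen = modified;
        // do anything you want...
    }

    // check again in 10 minutes
    Thread.Sleep(TimeSpan.FromMinutes(10));
}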

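The checkModifyDate() function checks only the HTTP response headers (for example Last-Modified) for changes, so the page body does not have to be downloaded.
What you do once a change is detected is up to you.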
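A header-only check along these lines might look as follows. It is a minimal sketch, not a definitive implementation: `ModifyDateChecker`, `CheckModifyDateAsync`, and `IsNewer` are hypothetical names, and it assumes the server actually sends a meaningful `Last-Modified` header. Many dynamically generated pages omit it or set it to the time of the request, in which case this check says nothing about the content.

```csharp
using System;
using System.Net.Http;
using System.Threading.Tasks;

// Hypothetical helper, not part of any library.
public static class ModifyDateChecker
{
    private static readonly HttpClient Client = new HttpClient();

    // Sends a HEAD request so only headers are transferred, and returns
    // the Last-Modified value, or null when the server does not send one.
    public static async Task<DateTimeOffset?> CheckModifyDateAsync(string url)
    {
        using var request = new HttpRequestMessage(HttpMethod.Head, url);
        using HttpResponseMessage response = await Client.SendAsync(request);
        response.EnsureSuccessStatusCode();
        return response.Content.Headers.LastModified;
    }

    // Pure comparison, kept separate so it can be checked without a network.
    public static bool IsNewer(DateTimeOffset? current, DateTimeOffset? lastSeen)
    {
        if (current == null) return false;   // no header: cannot tell
        return lastSeen == null || current > lastSeen;
    }
}
```

The caller keeps the last value it saw and treats the page as changed only when `IsNewer` returns true. The very first observation counts as new because there is nothing to compare it with.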

You can put this in a timer object or a thread that runs every xxx minutes, and set it up to perform the task for you automatically.

Hope this helps.
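The timer variant could be sketched with `System.Threading.Timer`. `PollingScheduler` and `Start` are hypothetical names, and the check itself is passed in as a delegate so the sketch stays independent of how changes are detected.

```csharp
using System;
using System.Threading;

// Hypothetical helper, not part of any library.
public static class PollingScheduler
{
    // Runs checkOnce immediately and then once per interval on a
    // thread-pool thread. Dispose the returned timer to stop polling.
    public static Timer Start(Action checkOnce, TimeSpan interval)
    {
        return new Timer(_ =>
        {
            try
            {
                checkOnce();
            }
            catch (Exception ex)
            {
                // An unhandled exception in a timer callback would
                // take down the whole process, so log it and carry on.
                Console.Error.WriteLine("Check failed: " + ex.Message);
            }
        }, null, TimeSpan.Zero, interval);
    }
}

// Usage sketch:
//   using Timer timer = PollingScheduler.Start(
//       () => Console.WriteLine("checking..."), TimeSpan.FromMinutes(10));
//   Console.ReadLine();   // keep the process alive while the timer runs
```

Compared with a `while (true)` loop that sleeps, the timer does not tie up a dedicated thread. One caveat: if a check takes longer than the interval, callbacks can overlap, so the check should either be quick or guard against running twice at once.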




