Regex issue and c#

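I'm having a problem with a string manipulation method I have written. The purpose of the method is to look for link tags within a long string and rewrite their hrefs.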

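To give some context, I have parsed a large number of HTML documents from a CD and collated the results into XML documents for a website in a separate project (itself part of a larger piece of work I am writing). The HTML files contain instructional text with links to other files on the CD, and I need that text to work on the website.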

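The method below seems to work fine when there is only one link tag, but with two the output goes badly wrong. Strangely, the Visual Studio debugger shows that the links themselves are matched correctly. The linkTag regex below matches only the link tags, but when each link is replaced with the corrected href, link fragments end up inserted at various points in the instructions.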

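The reason for the alphaDir regex is that I will eventually extend this method to correct links that start differently. We are talking about thousands of HTML files, but this format is by far the most common.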

I'm at a bit of a loss with this one. I'm a regex beginner and wrote all of the regexes below myself, so any thoughts on any of them would be great too.

Typical Input string

Hold 1st <strong><a href="../f/fist_hand.html">FIST</a></strong> hand, back outward
  &amp; fingers forward, and put 2nd <strong><a href="../f/fist_hand.html">FIST</a></strong> hand, back forward
  &amp; fingers inward, with lower knuckle of its 4th finger on
  lower knuckle of 1st thumb; then slide 2nd hand forwards one
  hand s length.

The Method

static string instructions(string instructions)
    {
        Regex Spaces = new Regex(@"\s+|\n|\r");
        Regex linkTag = new Regex(@"<a(.*?)>(.*?)</a>");
        Regex linkTagHtml = new Regex(@"<a(.*?)>|</a>");
        Regex hrefAttr = new Regex("href=\"(.)*?\"");
        Regex alphaDir = new Regex(@"/([a-z])?/");

        string signName = string.Empty;
        char alphaChar;
        string replacementLinkTag = string.Empty;
        string replacementHref = string.Empty;

        instructions = Spaces.Replace(instructions, " ");

        MatchCollection matches = linkTag.Matches(instructions);

        foreach (Match link in matches)
        {
            Match alphaDirMatch = alphaDir.Match(link.Value.ToString());
            if (alphaDirMatch.Success)
            {
                Match hrefAttrMatch = hrefAttr.Match(link.Value.ToString());
                if (hrefAttrMatch.Success)
                {
                    signName = linkTagHtml.Replace(link.Value.ToString(), string.Empty).ToLower().Trim();
                    signName = signName.Replace(" ", "_");
                    alphaChar = signName[0];

                    replacementHref = "href=\"/pages/displayc.aspx?c=dictionary&alpha=" + alphaChar.ToString() + "&sign=" + signName + "\"";
                    replacementLinkTag = hrefAttr.Replace(link.Value.ToString(), replacementHref);

                    instructions = instructions.Remove(link.Index, link.Length);
                    instructions = instructions.Insert(link.Index, replacementLinkTag);
                }
            }
        }            

        return instructions;
    }

Current output string

Hold 1st <strong><a href="/pages/displayc.aspx?c=dictionary&alpha=f&sign=fist">FIST</a></strong> hand, back outward &amp; finge<a href="/pages/displayc.aspx?c=dictionary&alpha=f&sign=fist">FIST</a>f="../f/fist_hand.html">FIST</a></strong> hand, back forward &amp; fingers inward, with lower knuckle of its 4th finger on lower knuckle of 1st thumb; then slide 2nd hand forwards one hand s length.

Desired output string

Hold 1st <strong><a href="/pages/displayc.aspx?c=dictionary&alpha=f&sign=fist">FIST</a></strong> hand, back outward &amp; fingers forward, and put 2nd <strong><a href="/pages/displayc.aspx?c=dictionary&alpha=f&sign=fist">FIST</a></strong> hand, back forward &amp; fingers inward, with lower knuckle of its 4th finger on lower knuckle of 1st thumb; then slide 2nd hand forwards one hand s length.

The solution - Thanks for the suggestion Oded!

I used HtmlAgilityPack to load the instructions as HTML, collected the link tags in an HtmlNodeCollection, looped through each one, took its href value and rewrote it.

For anyone interested, the final code looks like this:

static string instructions(string instructions)
    {
        char alphaChar;
        Regex Spaces = new Regex(@"\s+|\n|\r");
        Regex alphaDir = new Regex(@"/([a-z])?/");
        string signName = string.Empty;
        string replacementHref = string.Empty;

        instructions = Spaces.Replace(instructions, " ");

        HtmlDocument instr = new HtmlDocument();
        instr.LoadHtml(instructions);

        HtmlNodeCollection links = instr.DocumentNode.SelectNodes("//a");

        if (links != null)
        {
            foreach (HtmlNode link in links)
            {
                string href = link.GetAttributeValue("href", string.Empty);

                if (!string.IsNullOrWhiteSpace(href))
                {
                    Match alphaDirMatch = alphaDir.Match(href);

                    if (alphaDirMatch.Success)
                    {
                        signName = Regex.Replace(href, "(.)*?/([a-z])?/|(.html)?", string.Empty);
                        signName = signName.Replace(" ", "_");
                        alphaChar = signName[0];

                        replacementHref = "/pages/displayc.aspx?c=dictionary&alpha=" + alphaChar.ToString() + "&sign=" + signName;
                        link.SetAttributeValue("href", replacementHref);
                    }
                }
            }
        }

        instructions = instr.DocumentNode.InnerHtml.ToString();

        return instructions;
    }

Answers

Besides Oded's answer, you could also do a simple XSL transform. IMO, regex is not the way to go here.
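
On why the regex version in the question mangles the second link: linkTag.Matches records every match's Index against the string as it was before the loop, but the loop then changes the length of instructions with Remove and Insert. Once the first link has been rewritten, the second match's Index points at the wrong place, which is where the stray fragment in the current output comes from. If you do want to stay with regex, one option is to let Regex.Replace with a MatchEvaluator do the splicing, since it builds the result from the untouched input. What follows is only a sketch under that assumption: the class name LinkRewriter and the method name RewriteLinks are made up for illustration, and the patterns are the ones from the question.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of the original post.
static class LinkRewriter
{
    // Patterns taken from the question. Group 2 of LinkTag is the link text.
    static readonly Regex LinkTag = new Regex(@"<a(.*?)>(.*?)</a>");
    static readonly Regex HrefAttr = new Regex("href=\"(.)*?\"");
    static readonly Regex AlphaDir = new Regex(@"/([a-z])?/");

    public static string RewriteLinks(string instructions)
    {
        // Regex.Replace assembles the output from the original input,
        // so no match position is invalidated by an earlier replacement.
        return LinkTag.Replace(instructions, link =>
        {
            // Leave links in any other format untouched.
            if (!AlphaDir.IsMatch(link.Value) || !HrefAttr.IsMatch(link.Value))
                return link.Value;

            // Build the sign name from the link text, e.g. "FIST" -> "fist".
            string signName = link.Groups[2].Value.Trim()
                .ToLowerInvariant().Replace(" ", "_");
            if (signName.Length == 0)
                return link.Value;

            string newHref = "href=\"/pages/displayc.aspx?c=dictionary&alpha="
                + signName[0] + "&sign=" + signName + "\"";

            // An evaluator is used here so that any '$' in the new href
            // is not interpreted as a substitution pattern.
            return HrefAttr.Replace(link.Value, _ => newHref);
        });
    }
}
```

Processing the matches from last to first would also avoid the problem while keeping the Remove/Insert approach, because each edit then only touches text after the positions still to be visited.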




