Ignore matches from HTML tag definitions [duplicate]

I'm replacing certain pieces of text (smileys) using the code I found here: https://stackoverflow.com/questions/8829848

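$items = array(
  ':)'  => 'smile',
  ':('  => 'sad',
  '=))' => 'laugh',
  ':p'  => 'tongue',
);

// Escape each smiley so it is treated literally inside the pattern.
foreach ($items as $key => $class)
  $regex[] = preg_quote($key, '#');

// Match a smiley only when it is not directly adjacent to a word character.
$regex = '#(?<!\w)(' . implode('|', $regex) . ')(?!\w)#';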

$string = preg_replace_callback($regex, function ($matches) use ($items) {

  if (isset($items[$matches[0]]))
    return '<span class="' . $items[$matches[0]] . '">' . $matches[0] . '</span>';

  return $matches[0];

}, $string);

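How can I ignore matches inside HTML tag definitions (for example, matches within tag attributes)?

For example:

$string = 'Hello :) <a title="Hello :)"> Bye :( </a>';

The second :) (the one inside the title attribute) should be ignored.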

Best answer

Here is a DOMDocument-based implementation that does the smiley replacement for your HTML string:

$string = 'Hello :) <a title="Hello :)"> Bye :( </a>';

$items = array(
  ':)'  => 'smile',
  ':('  => 'sad',
  '=))' => 'laugh',
  ':p'  => 'tongue',
);

foreach ($items as $key => $class) $regex[] = preg_quote($key, '#');

// Match a smiley only when it is not directly adjacent to a word character.
$regex = '#(?<!\w)(' . implode('|', $regex) . ')(?!\w)#';

$doc = new DOMDocument();
$doc->loadHTML($string);

$xp = new DOMXPath($doc);

// Select text nodes only; attribute values are never visited.
$text_nodes = $xp->query('//text()');

foreach ($text_nodes as $text_node)
{
  $parent  = $text_node->parentNode;
  $context = $text_node->nextSibling;
  $text    = $text_node->nodeValue;
  $matches = array();
  $offset  = 0;

  $parent->removeChild($text_node);

  while ( preg_match($regex, $text, $matches, PREG_OFFSET_CAPTURE, $offset) > 0 )
  {
    $match  = $matches[0];
    $smiley = $match[0];
    $pos    = $match[1];
    $prefix = substr($text, $offset, $pos - $offset);
    $offset = $pos + strlen($smiley);

    $span = $doc->createElement('span', $smiley);
    $span->setAttribute('class', $items[$smiley]);

    $parent->insertBefore( $doc->createTextNode($prefix), $context );
    $parent->insertBefore( $span, $context );
  }

  $suffix = substr($text, $offset);
  $parent->insertBefore( $doc->createTextNode($suffix), $context );
}

$body = $doc->getElementsByTagName('body');
$html = $doc->saveHTML($body->item(0));

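Wrap it in a function and you're good to go. It may be more lines of code than a regex, but it isn't an ugly, bug-ridden maintenance nightmare (as any regex-based solution would be).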
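As a rough sketch of what "wrap it in a function" could look like: the function name wrap_smileys, its signature, and the choice to return only the children of <body> are assumptions made for illustration, not part of the answer. It also assumes non-empty input that parses into a <body>, and that a smiley must not touch a word character on either side. DOMDocument::loadHTML treats input as ISO-8859-1 unless a charset is declared, so non-ASCII text may need extra handling.

```php
<?php
// Hypothetical helper (name and signature are assumptions, not from the answer).
// Wraps smileys found in text nodes in <span> elements; attributes are untouched.
function wrap_smileys($html, array $items)
{
    $quoted = array();
    foreach ($items as $smiley => $class) {
        $quoted[] = preg_quote($smiley, '#');
    }
    // Assumption: a smiley must not be adjacent to a word character.
    $regex = '#(?<!\w)(' . implode('|', $quoted) . ')(?!\w)#';

    $doc = new DOMDocument();
    $doc->loadHTML($html);
    $xp = new DOMXPath($doc);

    foreach ($xp->query('//text()') as $text_node) {
        $parent  = $text_node->parentNode;
        $context = $text_node->nextSibling; // null means "append at the end"
        $text    = $text_node->nodeValue;
        $offset  = 0;

        $parent->removeChild($text_node);

        while (preg_match($regex, $text, $m, PREG_OFFSET_CAPTURE, $offset) > 0) {
            list($smiley, $pos) = $m[0];
            $prefix = substr($text, $offset, $pos - $offset);
            $offset = $pos + strlen($smiley);

            $span = $doc->createElement('span', $smiley);
            $span->setAttribute('class', $items[$smiley]);

            // Re-insert the text before the smiley, then the wrapped smiley.
            $parent->insertBefore($doc->createTextNode($prefix), $context);
            $parent->insertBefore($span, $context);
        }
        // Whatever follows the last smiley goes back in as plain text.
        $parent->insertBefore($doc->createTextNode(substr($text, $offset)), $context);
    }

    // Serialize the children of <body> so the <body> tag itself is not returned.
    $out = '';
    foreach ($doc->getElementsByTagName('body')->item(0)->childNodes as $child) {
        $out .= $doc->saveHTML($child);
    }
    return $out;
}

// Usage: the :) inside the title attribute is left alone.
$items = array(':)' => 'smile', ':(' => 'sad', '=))' => 'laugh', ':p' => 'tongue');
echo wrap_smileys('Hello :) <a title="Hello :)"> Bye :( </a>', $items);
```

Note that the HTML parser may add wrapper elements around bare text, so the returned markup is not guaranteed to be byte-for-byte the input plus spans.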

Answers

Filter your input string first. Strip any smileys that appear inside HTML tags:

$regex = '#<[^>]+(' . implode('|', $regex) . ')[^>]+>#';

Then run your code from above.
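The answer does not say how the filtering step is applied, so the following is only one possible reading of it: match each complete tag and delete any smiley found inside that tag. The helper name strip_smileys_in_tags is hypothetical. This changes attribute values (the title loses its smiley), and the simple tag pattern can misfire on a stray < in the text or a > inside an attribute value.

```php
<?php
// Hypothetical helper (an interpretation of the answer, not its actual code).
// Deletes smileys that occur inside tag definitions such as attribute values.
function strip_smileys_in_tags($html, array $items)
{
    $smileys = array_keys($items);

    // Match each complete tag, then remove smileys only within that match.
    return preg_replace_callback('#<[^>]+>#', function ($tag) use ($smileys) {
        return str_replace($smileys, '', $tag[0]);
    }, $html);
}

// Usage: text outside the tags is untouched and can then go through
// the smiley replacement from the question.
$items = array(':)' => 'smile', ':(' => 'sad', '=))' => 'laugh', ':p' => 'tongue');
echo strip_smileys_in_tags('Hello :) <a title="Hello :)"> Bye :( </a>', $items);
```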




