Removing nested tags with simpleHTML

I'm trying to use simple_html_dom to remove all the spans from a snippet of HTML, and I'm using the following:

$body = "<span class='outer' style='background:red'>x<span class='mid' style='background:purple'>y<span class='inner' style='background:orange'>z</span></span></span>";
$HTML = new simple_html_dom;
$HTML->load($body);   
$spans = $HTML->find('span');
foreach($spans as $span_tag) {
    echo "working on ". $span_tag->class . " ... ";
    echo "setting " . $span_tag->outertext . " equal to " . $span_tag->innertext . "<br/>\n";
    $span_tag->outertext = (string)$span_tag->innertext;
}
$text =  $HTML->save();
$HTML->clear();
unset($HTML);
echo "<br/>The Cleaned TEXT is: $text<br/>";

And here's the result in my browser:

http://www.pixeloution.com/RAC/clean.gif

So why am I only ending up with the outermost span removed?

Edit

Actually, if there's an easier way to do this, I'm game. The goal is to remove the tags but keep everything inside them, including other tags; otherwise I'd just use $obj->plaintext.

Edit #2

Okay ... apparently I got it working, although I'd still like to understand the problem, in case anyone has run into this before. Knowing it was only removing the outermost span, I did this:

function cleanSpansRecursive(&$body) {

    $HTML = new simple_html_dom;
    $HTML->load($body); 
    $spans = $HTML->find('span');
    foreach($spans as $span_tag) {
        $span_tag->outertext = (string)$span_tag->innertext;
    }

    $body =  (string)$HTML;
    if($HTML->find('span')) {
        $HTML->clear();
        unset($HTML);
        cleanSpansRecursive($body);
    } else {
        $HTML->clear();
        unset($HTML);
    }  
}

And it seems to work.

Best answer

I don't have simple_html_dom installed on my machine or dev server, so I can't test this, but from the looks of it, setting $span_tag->outertext will create new span objects inside the outer span, so the old references will no longer exist in $HTML. Going from innermost to outermost should fix it, since the references would be kept intact.

EDIT: In your second edit, you are finding the newly-created spans every time you do a replacement, which is why it works.
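
As for the "easier way" asked about in the question, here is a minimal sketch that sidesteps the stale-reference issue by using PHP's bundled DOM extension instead of simple_html_dom. This is a swapped-in technique, not the library from the question. The function name unwrapSpans is hypothetical, and the sketch assumes the dom extension is enabled and the input is a plain ASCII/Latin-1 fragment. Because DOMDocument moves live nodes rather than re-serializing strings, inner spans stay valid after their parent is unwrapped, so a single pass should be enough.

```php
<?php
// Hypothetical helper: remove every <span> but keep its children (text and other tags).
// Assumes the bundled "dom" extension is available.
function unwrapSpans(string $html): string
{
    $doc = new DOMDocument();

    // Suppress libxml warnings about imperfect markup in the fragment.
    libxml_use_internal_errors(true);
    // Wrap the fragment so it always has a known <body> to read back from.
    $doc->loadHTML('<html><body>' . $html . '</body></html>');
    libxml_clear_errors();

    // getElementsByTagName() returns a live list; copy it before modifying the tree.
    $spans = iterator_to_array($doc->getElementsByTagName('span'), false);

    foreach ($spans as $span) {
        // Move each child node out in front of the span, preserving order.
        while ($span->firstChild !== null) {
            $span->parentNode->insertBefore($span->firstChild, $span);
        }
        // The span is now empty; drop it.
        $span->parentNode->removeChild($span);
    }

    // Serialize only the children of <body>, not the wrapper itself.
    $body = $doc->getElementsByTagName('body')->item(0);
    $out = '';
    foreach ($body->childNodes as $child) {
        $out .= $doc->saveHTML($child);
    }
    return $out;
}
```

Calling unwrapSpans($body) on the nested-span snippet from the question should leave only the text content, and any non-span tags inside the spans are kept as they are.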
