English 中文(简体)
Converting an eregi_replace to a preg_replace
原标题:

I am trying to parse some HTML snippets and want to clean them up for various reasons (XSS et al).

I am currently trying to remove all of the attributes on any tag, except for the href on a anchor. I am doing this using a sequence of eregi_replace calls, but I am sure there is a smarter way of doing this using preg_replace and just a couple of lines of code, but I have not been able to get it to work. Can anyone help?

Current code:

$data_item = eregi_replace("<p[^>]*>","<p>", $data_item);
$data_item = eregi_replace("<h2[^>]*>","<h2>", $data_item);
$data_item = eregi_replace("<h3[^>]*>","<h3>", $data_item);
$data_item = eregi_replace("<h4[^>]*>","<h4>", $data_item);
$data_item = eregi_replace("<h5[^>]*>","<h5>", $data_item);
$data_item = eregi_replace("<h6[^>]*>","<h6>", $data_item);
$data_item = eregi_replace("<ul[^>]*>","<ul>", $data_item);
$data_item = eregi_replace("<ol[^>]*>","<ol>", $data_item);
$data_item = eregi_replace("<li[^>]*>","<li>", $data_item);

$data_item = preg_replace("/<a([^>]*)( href=S+)([^>]*)>/i",  <a$2 rel="nofollow"> , $data_item);

(I only need to parse a subset of HTML tags as prior to this I strip out any undesireables).

最佳回答

Why not use a general regex that will match any tag, and then preg_replace_callback() to allow you to determine what a given tag should be replaced with? That way you can have a simple function that checks to see if the matched tag was an a tag, and if so, not replace the href, but otherwise replace everything.

Alternatively, you could do something like this:

$data_item = preg_replace("/<(p|h2|h3|h4|h5|h6|ul|ol)[^>]*>/i","<$1>", $dataitem);

Where the () group in the regex captures the type of tag matched, the | is the "or" operator to match any of the indicated tags, and the $1 in the replacement text is used to substitute in what was matched by the first (and only) capture group from the pattern.

问题回答

暂无回答




相关问题
Brute-force/DoS prevention in PHP [closed]

I am trying to write a script to prevent brute-force login attempts in a website I m building. The logic goes something like this: User sends login information. Check if username and password is ...

please can anyone check this while loop and if condition

<?php $con=mysql_connect("localhost","mts","mts"); if(!con) { die( unable to connect . mysql_error()); } mysql_select_db("mts",$con); /* date_default_timezone_set ("Asia/Calcutta"); $date = ...

定值美元

如何确认来自正确来源的数字。

Generating a drop down list of timezones with PHP

Most sites need some way to show the dates on the site in the users preferred timezone. Below are two lists that I found and then one method using the built in PHP DateTime class in PHP 5. I need ...

Text as watermarking in PHP

I want to create text as a watermark for an image. the water mark should have the following properties front: Impact color: white opacity: 31% Font style: regular, bold Bevel and Emboss size: 30 ...

How does php cast boolean variables?

How does php cast boolean variables? I was trying to save a boolean value to an array: $result["Users"]["is_login"] = true; but when I use debug the is_login value is blank. and when I do ...

热门标签