PHP HTML parsing: How can the charset value of a webpage be obtained with Simple HTML DOM Parser?
Original title:

PHP: How can the charset value of a webpage (utf-8, windows-255, etc.) be obtained with Simple HTML DOM Parser?

Note: it has to be done with Simple HTML DOM Parser (http://simplehtmldom.sourceforge.net).

Example 1, webpage charset input:

<meta content="text/html; charset=utf-8" http-equiv="Content-Type">

Result: utf-8

Example 2, webpage charset input:

<meta content="text/html; charset=windows-255" http-equiv="Content-Type">

Result: windows-255

Edit:

I tried this, but it does not work:

$html = file_get_html('http://www.google.com/');
$el = $html->find('meta[content]', 0);
echo $el->charset;

What should I change? (I know that $el->charset does not work.)

Thanks

Best answer

You'll have to match the string using a regular expression (I hope you have PCRE...).

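// Find the first <meta http-equiv="Content-Type"> element.
$el = $html->find('meta[http-equiv=Content-Type]', 0);
$fullvalue = $el->content;
// Capture everything that follows "charset=".
preg_match('/charset=(.+)/', $fullvalue, $matches);
echo $matches[1];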

Not very robust, but should work.
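The pattern above captures everything after charset=, so trailing whitespace, quotes, or further parameters would end up in the result. A minimal sketch of a stricter match follows, assuming the content attribute value has already been read from the meta element. The helper name charset_from_content_type is hypothetical and not part of Simple HTML DOM Parser.

```php
<?php
// Hypothetical helper (not part of Simple HTML DOM Parser): extracts the
// charset parameter from a Content-Type style string, or returns null.
function charset_from_content_type(string $content): ?string
{
    // Allow optional whitespace and quotes around the value, and stop at the
    // first character that cannot belong to a charset name.
    if (preg_match('/charset\s*=\s*["\']?([A-Za-z0-9._:-]+)/i', $content, $matches)) {
        return strtolower($matches[1]);
    }
    return null; // no charset parameter found
}

echo charset_from_content_type('text/html; charset=utf-8'), "\n";             // utf-8
echo charset_from_content_type('text/html; charset = "Windows-1252" '), "\n"; // windows-1252
```

With Simple HTML DOM Parser, the $fullvalue variable from the answer above could be passed to this helper in place of the preg_match call.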

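Other answers
// $data is assumed to hold the raw HTML of the page.
$dd = new DOMDocument;
$dd->loadHTML($data);
foreach ($dd->getElementsByTagName("meta") as $meta) {
    if (strtolower($meta->getAttribute("http-equiv")) == "content-type") {
        $v = $meta->getAttribute("content");
        // Keep the matches in their own variable so the loop variable is not overwritten.
        if (preg_match("#.+?/.+?;\s?charset\s?=\s?(.+)#i", $v, $matches))
            echo $matches[1];
    }
}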

Note that the DOM extension implicitly converts all the data to UTF-8.
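As a hedged extension of the DOM approach, the sketch below also checks the shorter <meta charset="..."> form that many pages use, and silences the parser warnings that invalid markup usually triggers. The function name detect_meta_charset and the sample markup are assumptions made for illustration.

```php
<?php
// Hypothetical helper: returns the charset declared in a page's <meta> tags,
// or null if none is declared. $html is assumed to hold the raw page markup.
function detect_meta_charset(string $html): ?string
{
    $doc = new DOMDocument();
    // Real-world markup is often invalid; collect parser warnings quietly.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    foreach ($doc->getElementsByTagName('meta') as $meta) {
        // Short form: <meta charset="utf-8">
        if ($meta->hasAttribute('charset')) {
            return strtolower(trim($meta->getAttribute('charset')));
        }
        // Long form: <meta http-equiv="Content-Type" content="...; charset=...">
        if (strtolower($meta->getAttribute('http-equiv')) === 'content-type'
            && preg_match('/charset\s*=\s*["\']?([A-Za-z0-9._:-]+)/i', $meta->getAttribute('content'), $matches)) {
            return strtolower($matches[1]);
        }
    }
    return null; // no charset declaration found
}

// Sample markup, invented for this example.
$page = '<html><head><meta http-equiv="Content-Type" content="text/html; charset=windows-1252"></head><body></body></html>';
echo detect_meta_charset($page), "\n";
```

The attribute value is read as declared in the markup, so the declared charset name can still be recovered even though the DOM extension converts the text itself to UTF-8.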

Thanks to MvanGeest for the answer. I just fixed it a bit and it works perfectly.

$html = file_get_html('http://www.google.com/');
$el = $html->find('meta[content]', 0);
$fullvalue = $el->content;
preg_match('/charset=(.+)/', $fullvalue, $matches);
echo substr($matches[0], strlen("charset="));



