PHP Curl UTF-8 Charset

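I have a PHP script that calls another web page and writes out all of that page's HTML. Everything works fine except for a charset problem. My PHP file is encoded as UTF-8, and all my other PHP files work correctly (which means the server is not the problem). What is missing from this code? All the Spanish letters look weird. PS: when I type the original versions of those weird characters directly into the PHP file, they display correctly.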

header("Content-Type: text/html; charset=utf-8");
function file_get_contents_curl($url)
{
    $ch=curl_init();
    curl_setopt($ch,CURLOPT_HEADER,0);
    curl_setopt($ch,CURLOPT_RETURNTRANSFER,1);
    curl_setopt($ch,CURLOPT_URL,$url);
    curl_setopt($ch,CURLOPT_FOLLOWLOCATION,1);
    $data=curl_exec($ch);
    curl_close($ch);
    return $data;
}
$html=file_get_contents_curl($_GET["u"]);
$doc=new DOMDocument();
@$doc->loadHTML($html);
Best answer

Simple: the string you get back from cURL is UTF-8 encoded, so you just need to decode it.

Description

string utf8_decode ( string $data )

This function decodes data, assumed to be UTF-8 encoded, to ISO-8859-1.
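As a rough illustration of what this conversion does, the sketch below turns a UTF-8 string into ISO-8859-1. Note that utf8_decode() is deprecated as of PHP 8.2, so the sketch uses mb_convert_encoding() for the same conversion. The helper name to_latin1() and the sample string are made up for illustration, and the mbstring extension is assumed to be available.

```php
<?php
// Hypothetical helper: convert a UTF-8 string to ISO-8859-1,
// the same conversion that utf8_decode() performs.
function to_latin1(string $utf8): string
{
    return mb_convert_encoding($utf8, 'ISO-8859-1', 'UTF-8');
}

// "mañana": the ñ (U+00F1) takes two bytes in UTF-8 and one in ISO-8859-1.
$utf8   = "ma\u{00F1}ana";
$latin1 = to_latin1($utf8);

echo strlen($utf8), "\n";    // byte length before the conversion
echo strlen($latin1), "\n";  // byte length after the conversion
echo bin2hex($latin1), "\n"; // raw bytes of the converted string
```

Decoding to ISO-8859-1 only makes sense if the page that finally displays the text is itself served as ISO-8859-1.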

Other answers

You can use this header:

   header('Content-type: text/html; charset=UTF-8');

and then decode the string:

 $page = utf8_decode(curl_exec($ch));

It worked for me.

$output = curl_exec($ch);
$result = iconv("Windows-1251", "UTF-8", $output);
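The iconv() call above only helps when the source really is Windows-1251. Here is a minimal sketch of the same conversion, with a hard-coded byte string standing in for the curl_exec() result; the sample bytes are an illustrative assumption.

```php
<?php
// Stand-in for a cURL response body: the word "Привет" encoded as Windows-1251.
$output = "\xCF\xF0\xE8\xE2\xE5\xF2";

// Convert from the known source charset to UTF-8.
$result = iconv('Windows-1251', 'UTF-8', $output);

if ($result === false) {
    // iconv() returns false when the conversion fails.
    $result = '';
}

echo $result, "\n";
```

Each Cyrillic letter is one byte in Windows-1251 and two bytes in UTF-8, so the converted string is twice as long in bytes.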
function page_title($val){
    // simple_html_dom provides str_get_html(); include_once avoids redeclaring it on repeated calls
    include_once(dirname(__FILE__) . '/simple_html_dom.php');
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $val);
    curl_setopt($ch, CURLOPT_USERAGENT, 'Mozilla/5.0 (Windows NT 6.1; WOW64; rv:25.0) Gecko/20100101 Firefox/25.0');
    curl_setopt($ch, CURLOPT_ENCODING, "gzip");
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    curl_setopt($ch, CURLOPT_HEADER, 0);
    $return = curl_exec($ch);
    $encot = false;
    $c = null;
    $charset = curl_getinfo($ch, CURLINFO_CONTENT_TYPE);

    curl_close($ch);
    $html = str_get_html($return);

    // Prefer the charset announced in the HTTP Content-Type header
    if (strpos($charset, 'charset=') !== false) {
        $c = str_replace("text/html; charset=", "", $charset);
        $encot = true;
    }
    else {
        // Otherwise fall back to the <meta http-equiv="Content-Type"> tag
        $lookat = $html->find('meta[http-equiv=Content-Type]', 0);
        $chrst = $lookat ? $lookat->content : '';
        preg_match('/charset=(.+)/', $chrst, $found);
        $p = isset($found[1]) ? trim($found[1]) : '';
        if (!empty($p))
        {
            $c = $p;
            $encot = true;
        }
    }
    $title = $html->find('title')[0]->innertext;
    // Convert the title to UTF-8 unless the page already uses it
    if ($encot == true && $c != 'utf-8' && $c != 'UTF-8') $title = mb_convert_encoding($title, 'UTF-8', $c);

    return $title;
}
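
The header parsing in page_title() expects the Content-Type value to be exactly "text/html; charset=...". A slightly more tolerant sketch uses a regular expression instead. Here charset_from_content_type() is a hypothetical helper name, not part of cURL or simple_html_dom.

```php
<?php
// Hypothetical helper: pull the charset out of a Content-Type header value,
// such as the string returned by curl_getinfo($ch, CURLINFO_CONTENT_TYPE).
function charset_from_content_type(?string $contentType): ?string
{
    if ($contentType === null) {
        return null; // the server sent no usable Content-Type header
    }
    // Match charset=VALUE case-insensitively, with optional double quotes.
    if (preg_match('/charset\s*=\s*"?([^";\s]+)/i', $contentType, $m)) {
        return strtoupper($m[1]);
    }
    return null; // header present, but no charset parameter
}

var_dump(charset_from_content_type('text/html; charset=iso-8859-1'));
var_dump(charset_from_content_type('application/xhtml+xml;charset="utf-8"'));
var_dump(charset_from_content_type('text/html'));
```

Normalizing the name to upper case means a later comparison against 'UTF-8' does not have to list every spelling.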

I was fetching a Windows-1252 encoded file via cURL, and mb_detect_encoding(curl_exec($ch)) returned UTF-8. I tried utf8_encode(curl_exec($ch)).
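
Since mb_detect_encoding() only guesses, one possible alternative is to check whether the bytes are valid UTF-8 and convert only when they are not. The sketch below assumes the fallback charset is Windows-1252, as in this case. The helper name ensure_utf8() is hypothetical, and the mbstring extension is required.

```php
<?php
// Hypothetical helper: return $body unchanged if it is already valid UTF-8,
// otherwise treat it as $fallbackCharset and convert it to UTF-8.
function ensure_utf8(string $body, string $fallbackCharset = 'Windows-1252'): string
{
    if (mb_check_encoding($body, 'UTF-8')) {
        return $body;
    }
    return mb_convert_encoding($body, 'UTF-8', $fallbackCharset);
}

$fromLegacy = ensure_utf8("Espa\xF1a");      // ñ as the single Windows-1252 byte 0xF1
$fromUtf8   = ensure_utf8("Espa\u{00F1}a");  // already UTF-8, returned as-is

echo bin2hex($fromLegacy), "\n";
echo bin2hex($fromUtf8), "\n";
```

Some byte sequences are valid in both encodings, so this check is a heuristic and not a guarantee.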

First method (internal function)

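The best way I have tried is to use urlencode() (http://php.net/urlencode). Keep in mind not to apply it to the whole URL; use it only on the parts that need it. For example, if a request has two text fields, one containing Persian text and the other English text, you only need to encode the Persian text, not the English text.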
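
A minimal sketch of encoding only the part that needs it. The endpoint http://example.com/search and the parameter names fa and en are placeholders, not taken from the original answer.

```php
<?php
$base    = 'http://example.com/search';          // hypothetical endpoint
$persian = "\u{0633}\u{0644}\u{0627}\u{0645}";   // the Persian word "سلام" as UTF-8
$english = 'hello';

// Encode only the Persian value; the scheme, host, path and the plain
// ASCII value are left untouched.
$url = $base . '?fa=' . urlencode($persian) . '&en=' . $english;

echo $url, "\n";
```

When there are many parameters, http_build_query() percent-encodes every value in one call and may be more convenient.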

Second Method (using cURL function)

However, there are better ways if the range of characters that have to be encoded is more limited. One of them is to use CURLOPT_ENCODING, by passing it to curl_setopt():

curl_setopt($ch, CURLOPT_ENCODING, "");
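
For context, CURLOPT_ENCODING sets the Accept-Encoding request header and makes cURL decompress the response. An empty string asks for every compression cURL supports. It governs compression such as gzip, not character sets, so a charset conversion may still be needed afterwards. Below is a sketch of a handle configured this way, assuming the curl extension is loaded. The helper name make_curl_handle() is hypothetical, and nothing is sent until curl_exec() is called.

```php
<?php
// Hypothetical helper: build a cURL handle that returns the body as a string,
// follows redirects, and accepts any compression cURL supports.
function make_curl_handle(string $url)
{
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
    curl_setopt($ch, CURLOPT_ENCODING, '');
    return $ch;
}

$ch = make_curl_handle('http://example.com/'); // placeholder URL
// $body = curl_exec($ch);  // uncomment to actually perform the request
```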



