Decoding hexadecimals from URLs

Many sites, such as Google and Wikipedia, encode non-English characters in URLs as hex bytes with a leading % sign. I looked for a tool that I could pipe URLs into and that, whenever it meets such a %AA sequence, translates it back to UTF-8 so that I can read it.

Since I couldn't find one, I wrote it myself, and I'd like to share it; perhaps you'll find it useful:

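#include <stdio.h>

int main(void)
{
        int c;          /* int rather than char, so EOF is distinguishable */
        unsigned int i;

        while ((c = getchar()) != EOF)
        {
                if (c != '%') putchar(c);
                else
                {
                        /* %2X reads at most two hex digits, so "%D7A" is not taken as 0xD7A */
                        if (scanf("%2X", &i) == 1) putchar(i);
                        else putchar('%');
                }
        }
        return 0;
}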

Usage example:

# echo "http://he.wikipedia.org/wiki/%D7%A2%D7%9E%D7%95%D7%93_%D7%A8%D7%90%D7%A9%D7%99" | ./dumpHex

Result:

http://he.wikipedia.org/wiki/עמוד_ראשי
Best answer

In VC++:

string dec = URLDecoder::decode(url);

PHP:

$d = urldecode($u);

Java:

String dec = URLDecoder.decode(url,"UTF-8");

......
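
For plain C, which has no URL decoder in its standard library, the same decoding can be sketched by hand for a string already in memory. This is only a minimal sketch: the names percent_decode and hex_value are made up for illustration, and the choice to copy a malformed escape such as "%zz" through unchanged is an assumption, not something any standard requires.

```c
#include <stddef.h>

/* Hypothetical helper: value of one hex digit, or -1 if c is not a hex digit. */
static int hex_value(int c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

/*
 * Hypothetical function: copy src to dst, turning each "%XY" (exactly two
 * hex digits) into the byte 0xXY. Anything else, including a '%' that is
 * not followed by two hex digits, is copied through unchanged.
 * dst_size is the capacity of dst including the terminating NUL.
 * Returns the number of bytes written (without the NUL), or (size_t)-1
 * if dst is too small.
 */
size_t percent_decode(const char *src, char *dst, size_t dst_size)
{
    size_t n = 0;

    while (*src != '\0') {
        int hi, lo;
        unsigned char out;

        /* Short-circuiting keeps us from reading past the end of src. */
        if (src[0] == '%'
            && (hi = hex_value((unsigned char)src[1])) >= 0
            && (lo = hex_value((unsigned char)src[2])) >= 0) {
            out = (unsigned char)(hi * 16 + lo);
            src += 3;
        } else {
            out = (unsigned char)*src++;
        }

        /* Need room for this byte plus the terminating NUL. */
        if (n + 1 >= dst_size) return (size_t)-1;
        dst[n++] = (char)out;
    }

    if (dst_size == 0) return (size_t)-1;
    dst[n] = '\0';
    return n;
}
```

A call looks like percent_decode(url, buf, sizeof buf). Unlike PHP's urldecode and Java's URLDecoder.decode, this sketch leaves a '+' as it is instead of turning it into a space, and a decoded %00 ends up as a NUL byte inside the output, so the returned length is more reliable than strlen.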
