With a utf8-encoded Perl script, can it open a filename encoded as GB2312?

I'm not talking about reading file content in UTF-8 or some other encoding. This is about file names. Usually I save my Perl scripts in the system default encoding, "GB2312" in my case, and I have no problems opening files. But for processing purposes, I now have some Perl script files saved in UTF-8. The problem is that these scripts cannot open files whose names consist of characters encoded in "GB2312", and I don't like the idea of having to rename my files.

Does anyone happen to have any experience in dealing with this kind of situation? Thanks as always for any guidance.

Edit

Here's the minimized code to demonstrate my problem:

# I'm running ActivePerl 5.10.1 on Windows XP (Simplified Chinese version)
# The file system is NTFS

#!perl -w
use autodie;

my $file = "./测试.txt"; #the file name consists of two Chinese characters
open my $in, '<', $file;

while (<$in>){
print;
}

This test script runs fine if saved in "ANSI" encoding (I assume ANSI encoding is the same as GB2312, which is used to display Chinese characters). But it won't work if saved as "UTF-8", and the error message is as follows:

Can't open './娴嬭瘯.txt' for reading: 'No such file or directory'

In this error message, "娴嬭瘯" is meaningless junk.

Update

I first tried encoding the file name as GB2312, but it does not seem to work :( Here's what I tried:

#!perl -w
use autodie;
use Encode;

my $file = "./测试.txt";
encode("gb2312", decode("utf-8", $file));
open my $in, '<', $file;

while (<$in>){
print;
}

My current thinking is this: the file name in my OS is 测试.txt, but it is encoded as GB2312. In the Perl script the file name looks the same to human eyes, still 测试.txt. But to Perl the two are different because they have different internal representations. What I don't understand is why the problem persists when I have already converted the file name in Perl to GB2312, as shown in the code above.
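
The difference in representation described above can be made visible by dumping the bytes. This is only a minimal sketch: the helper name hex_bytes is made up for illustration, and the two characters are written as code points (U+6D4B, U+8BD5) so the result does not depend on how the script file itself is saved.

```perl
use strict;
use warnings;
use Encode qw(encode);

# 测试 written as code points, independent of the script's own encoding.
my $name = "\x{6D4B}\x{8BD5}";

# Hypothetical helper: render a byte string as space-separated hex.
sub hex_bytes {
    my ($bytes) = @_;
    return join ' ', map { sprintf '%02X', ord } split //, $bytes;
}

my $as_utf8   = encode('UTF-8',  $name);   # what a UTF-8 script file contains
my $as_gb2312 = encode('gb2312', $name);   # what the GB2312 file name contains

print 'UTF-8:  ', hex_bytes($as_utf8),   "\n";   # E6 B5 8B E8 AF 95
print 'GB2312: ', hex_bytes($as_gb2312), "\n";   # B2 E2 CA D4
```

The same two characters are six bytes in UTF-8 and four bytes in GB2312. When the six UTF-8 bytes are read as double-byte Chinese text, they pair up into three characters, which is consistent with the three junk characters in the error message.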

Update

I made it, finally made it :)

@brian's suggestion is right. I made a mistake in the code above: I didn't assign the encoded file name back to $file.

Here's the solution:

#!perl -w
use autodie;
use Encode;

my $file = "./测试.txt";
$file = encode("gb2312", decode("utf-8", $file));
open my $in, '<', $file;

while (<$in>){
print;
}

Best answer

If you

 use utf8;

in your Perl script, that merely tells perl that the source is in UTF-8. It doesn't affect how perl deals with the outside world. Are you turning on any other Perl Unicode features?

Are you having problems with every filename, or just some of them? Can you give us some examples, or a small demonstration script? I don't have a filesystem that encodes names as GB2312, but have you tried encoding your filenames as GB2312 before you call open?

If you want specific strings encoded with a specific encoding, you can use the Encode module. Try that with your filenames that you give to open.
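
As a rough sketch of that suggestion when use utf8 is in effect: the string literal is then already a character string, so it only needs to be encoded before it reaches open. The helper name fs_name and the default of cp936 (the Windows code page for Simplified Chinese, a superset of GB2312) are assumptions for illustration, not something taken from the question.

```perl
use strict;
use warnings;
use utf8;                      # this source file is saved as UTF-8
use Encode qw(encode);

# Hypothetical helper: convert a character string into the bytes the
# file system is assumed to expect. The cp936 default is an assumption;
# pass another encoding name if the system uses a different code page.
sub fs_name {
    my ($name, $codepage) = @_;
    $codepage = 'cp936' unless defined $codepage;
    return encode($codepage, $name);
}

my $file = fs_name('./测试.txt');

if (open my $in, '<', $file) {
    print while <$in>;         # echo the file line by line
    close $in;
}
else {
    warn "could not open the file: $!\n";
}
```

Without use utf8, the literal is a string of raw UTF-8 bytes, and a decode step is needed before the encode.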
