SAX Parser encoding issue in German Language

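I am working on an application that is in German. I receive the data as XML, parse it with a SAX parser, and display it in a TextView. Everything works fine except that special characters come out wrong after parsing.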

This is the XML I get from a URL. It is UTF-8 encoded, and all characters in the XML file itself are correct.

<?xml version="1.0" encoding="utf-8"?>
<posts>
    <page id="001">
        <title><![CDATA[Sie kaufen bei uns ausschließlich Holzkunst- und Volkskunst-Produkte ]]></title>
        <detial><![CDATA[Durch enge Beziehungen mit unseren Lieferanten können wir attraktive rückläufig 
        Preise und schnelle Lieferungen gewährleisten. Caroline Féry and Laura Herbst Universität Potsdam Mein 
        Flugzeug hatte zwölf Stunden VERSPÄTUNG </p>]]></detial>
    </page>     
</posts>

I am using a SAX parser to parse this XML (and display the parsed data in a TextView):

public class GermanParseActivity extends Activity {
    /** Called when the activity is first created. */

    static final String URL = "http://www.xyz.com/id=1";

    ItemList itemList;

    @Override
    public void onCreate(Bundle savedInstanceState) {
        super.onCreate(savedInstanceState);
        setContentView(R.layout.main);

        XMLParser parser = new XMLParser();
        String XML = parser.getXmlFromUrl(URL);

        System.out.println("This XML is ========>"+XML);

        try
        {
            SAXParserFactory spf = SAXParserFactory.newInstance();
            SAXParser sp = spf.newSAXParser();
            XMLReader xr = sp.getXMLReader();

            /** Create handler to handle XML Tags ( extends DefaultHandler ) */
            MyXMLHandler myXMLHandler = new MyXMLHandler();
            xr.setContentHandler(myXMLHandler);

            ByteArrayInputStream is = new ByteArrayInputStream(XML.getBytes());
            xr.parse(new InputSource(is));
        }
        catch(Exception e)
        {

        }

        itemList = MyXMLHandler.itemList;

        ArrayList<String> listItem = itemList.getTitle();

        ListView lview = (ListView) findViewById(R.id.listview1);
        myAdapter adapter = new myAdapter(this, listItem);
        lview.setAdapter(adapter);
    }


}

But after parsing, I get strange characters that are not in the XML file and only appear once the XML has been parsed.

Characters like these:

Before parsing -> After parsing

können -> kÃ¶nnen

rückläufig -> rÃ¼cklÃ¤ufig

gewährleisten -> gewÃ¤hrleisten

Can anyone suggest the correct way to solve this problem?

Best answer

You need to re-encode your input. The problem is that the text is UTF-8 but is being interpreted as ISO-8859-1. This seems to be a bug in SAX.

String output=new String(input.getBytes("8859_1"), "utf-8");

This line turns the misread string back into its original bytes using ISO-8859-1 and then decodes those bytes as UTF-8.
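The round trip can be checked in isolation. The following is a minimal sketch, not code from the question: the class `MojibakeRepair` and its two helper methods are hypothetical names, and it assumes the text was misread as exactly ISO-8859-1, which maps every byte to a character so that no information is lost. Non-ASCII letters are written as Unicode escapes so the result does not depend on the source file encoding.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, for illustration only.
final class MojibakeRepair {

    /** Reproduces the symptom: UTF-8 bytes decoded as ISO-8859-1. */
    static String misdecode(String original) {
        byte[] utf8 = original.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.ISO_8859_1);
    }

    /** Applies the fix: recover the raw bytes, then decode them as UTF-8. */
    static String repair(String garbled) {
        byte[] raw = garbled.getBytes(StandardCharsets.ISO_8859_1);
        return new String(raw, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String original = "k\u00f6nnen";            // "können"
        String garbled = misdecode(original);       // 'ö' becomes the two characters U+00C3 U+00B6
        String repaired = repair(garbled);

        System.out.println("original length: " + original.length());
        System.out.println("garbled length:  " + garbled.length());
        System.out.println("repaired equals original: " + repaired.equals(original));
    }
}
```

This repairs a string that has already been misread. If the code that downloads the XML can be changed, decoding the response as UTF-8 in the first place avoids the round trip altogether.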

Other answers

I got my answer from here. They suggest that the header should be:

<?xml version="1.0" encoding="ISO-8859-1"?>

instead of

<?xml version="1.0" encoding="utf-8"?>

Hope that is the answer. Edit: I just saw that you don't have control over the XML, so this will not help; rekire's answer is then an option.
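The encoding in the XML declaration only matters when the parser is given raw bytes; once the document has been turned into a String, the decoding has already happened. The question does not show `getXmlFromUrl`, so where the misreading occurs is an assumption here. As a rough sketch under that assumption, the following hands UTF-8 bytes straight to the SAX parser through an `InputSource`. The class `Utf8SaxSketch` and the method `parseTitles` are hypothetical names, not part of the original code.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical sketch, for illustration only.
final class Utf8SaxSketch {

    /** Parses raw XML bytes and returns the text of all <title> elements, concatenated. */
    static String parseTitles(byte[] xmlBytes) {
        final StringBuilder titles = new StringBuilder();
        DefaultHandler handler = new DefaultHandler() {
            private boolean inTitle;

            @Override
            public void startElement(String uri, String localName, String qName, Attributes atts) {
                if ("title".equals(qName)) inTitle = true;
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                if ("title".equals(qName)) inTitle = false;
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                if (inTitle) titles.append(ch, start, length);
            }
        };
        try {
            SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
            InputSource source = new InputSource(new ByteArrayInputStream(xmlBytes));
            source.setEncoding("UTF-8"); // state the byte encoding explicitly
            parser.parse(source, handler);
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("XML parsing failed", e);
        }
        return titles.toString();
    }

    public static void main(String[] args) {
        String xml = "<?xml version=\"1.0\" encoding=\"utf-8\"?>"
                + "<posts><page id=\"001\"><title><![CDATA[k\u00f6nnen]]></title></page></posts>";
        String title = parseTitles(xml.getBytes(StandardCharsets.UTF_8));
        System.out.println("title parsed correctly: " + title.equals("k\u00f6nnen"));
    }
}
```

In the Activity from the question, the equivalent change would be to pass the HTTP response stream (or its bytes) to the parser rather than a String that has already been decoded.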




