How to stop .NET XML serialisation inserting illegal characters

Anything below 0x20 (except 0x09, 0x0a and 0x0d, i.e. tab, line feed and carriage return) cannot be included in an XML document.
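For reference, the rule above can be written down as a small check. This is only a sketch of the range described here (it ignores the other characters XML 1.0 forbids, such as 0xFFFE and 0xFFFF), and the names XmlCharRule, IsForbiddenControlChar and IndexOfForbidden are made up for illustration:

```csharp
using System;

public static class XmlCharRule
{
    // True when c is a control character below 0x20 other than tab, LF or CR.
    public static bool IsForbiddenControlChar(char c)
    {
        return c < 0x20 && c != 0x09 && c != 0x0A && c != 0x0D;
    }

    // Returns the index of the first forbidden control character, or -1 if none.
    public static int IndexOfForbidden(string text)
    {
        for (int i = 0; i < text.Length; i++)
        {
            if (IsForbiddenControlChar(text[i])) return i;
        }
        return -1;
    }
}

class Program
{
    public static void Main(string[] args)
    {
        Console.WriteLine(XmlCharRule.IndexOfForbidden("hello \u0012 world"));  // 6
        Console.WriteLine(XmlCharRule.IndexOfForbidden("tab\tand\r\nnewline")); // -1
    }
}
```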


The SOAP formatter happily encodes the 0x12 character (ASCII 18, Device Control 2) as &#x12;, but the response fails on the client with "hexadecimal value 0x12, is an invalid character".

<rant>What I find quite frustrating is that these are two sides of the same coin: both client and service are .NET apps. Why will the SOAP formatter write bad XML if nothing can read it?</rant>

I'd like to either:

  1. Get the Xml Serialiser to handle these odd characters correctly or
  2. Have the request fail in the Web Service
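Option 2 could look something like the sketch below, assuming .NET Framework 4 or later, where XmlConvert.VerifyXmlChars is available. The ResponseGuard class and EnsureXmlSafe method are hypothetical names; the idea is just to reject the text on the service side before it is ever serialised:

```csharp
using System;
using System.Xml;

public static class ResponseGuard
{
    // Hypothetical helper: rejects text that contains invalid XML characters.
    public static string EnsureXmlSafe(string text)
    {
        // VerifyXmlChars returns its argument unchanged, or throws
        // XmlException when it meets an invalid XML character.
        return XmlConvert.VerifyXmlChars(text);
    }
}

class Program
{
    public static void Main(string[] args)
    {
        try
        {
            ResponseGuard.EnsureXmlSafe("hello \u0012 world");
        }
        catch (XmlException ex)
        {
            Console.WriteLine("Rejected before serialisation: " + ex.Message);
        }
    }
}
```

This only moves the failure to the service; it does not make the data transmittable.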

Everything I've found on this boils down to either (a) "sanitise your inputs" or (b) "change your document structure".

a) isn't a runner, as some of this data is 20+ years old.
b) isn't much of an option either as, other than our own front end, we have clients that code against the Web Service directly.

Is there something obvious I'm missing? Or is it simply a case of coding around the ASCII control codes?

Addendum

This is actually a problem with the XmlSerializer. The following code will serialise an invalid character to the stream, but will not deserialise it:

using System;
using System.IO;
using System.Text;
using System.Xml.Serialization;

public class MyData
{
    public string Text { get; set; }
}

class Program
{
    public static void Main(string[] args)
    {
        // Build a string containing the control character 0x12 (DC2).
        var myData = new MyData
        {
            Text = "hello "
                + ASCIIEncoding.ASCII.GetString(new byte[] { 0x12 })
                + " world"
        };

        var serializer = new XmlSerializer(typeof(MyData));

        var xmlWriter = new StringWriter();

        // Serialising succeeds even though the text contains 0x12.
        serializer.Serialize(xmlWriter, myData);

        var xmlReader = new StringReader(xmlWriter.ToString());

        var newData = (MyData)serializer.Deserialize(xmlReader); // Exception
        // hexadecimal value 0x12, is an invalid character.
    }
}


I can get it to choke when writing the XML by explicitly creating an XmlWriter and passing that to Serialize (I'll post that shortly as my own answer), but that still means I have to sanitise my data before sending it.
As these characters are significant I can't just strip them; I need to encode them before transmission and decode them when read, and I'm really quite surprised that there doesn't appear to be an existing framework method to do this.
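To make the encode/decode idea concrete, here is a minimal sketch of a reversible escape. It is not a framework API: ControlCharCodec and its escape format (a backslash becomes two backslashes, a forbidden control character becomes \uXXXX) are assumptions made purely for illustration, and both ends would have to agree on the scheme.

```csharp
using System;
using System.Globalization;
using System.Text;

// Hypothetical codec: "\" becomes "\\" and each forbidden control
// character becomes "\uXXXX" (four upper-case hex digits).
public static class ControlCharCodec
{
    static bool IsForbidden(char c)
    {
        return c < 0x20 && c != 0x09 && c != 0x0A && c != 0x0D;
    }

    public static string Encode(string text)
    {
        var sb = new StringBuilder(text.Length);
        foreach (char c in text)
        {
            if (c == '\\') sb.Append("\\\\");
            else if (IsForbidden(c))
                sb.Append("\\u").Append(((int)c).ToString("X4", CultureInfo.InvariantCulture));
            else sb.Append(c);
        }
        return sb.ToString();
    }

    public static string Decode(string text)
    {
        var sb = new StringBuilder(text.Length);
        for (int i = 0; i < text.Length; i++)
        {
            if (text[i] != '\\')
            {
                sb.Append(text[i]);
            }
            else if (i + 1 < text.Length && text[i + 1] == '\\')
            {
                sb.Append('\\');   // escaped backslash
                i += 1;
            }
            else if (i + 5 < text.Length && text[i + 1] == 'u')
            {
                string hex = text.Substring(i + 2, 4);
                sb.Append((char)int.Parse(hex, NumberStyles.HexNumber, CultureInfo.InvariantCulture));
                i += 5;            // skip "uXXXX"
            }
            else
            {
                throw new FormatException("Malformed escape at index " + i);
            }
        }
        return sb.ToString();
    }
}

class Program
{
    public static void Main(string[] args)
    {
        string original = "hello \u0012 world";
        string encoded = ControlCharCodec.Encode(original);
        Console.WriteLine(ControlCharCodec.Decode(encoded) == original); // True
    }
}
```

The encoded text contains only characters that are legal in XML, so it can go through the XmlSerializer unchanged; the drawback is that clients coding directly against the service would need to apply the same Decode step.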


Second addendum: solution

Using the DataContractSerializer (as used for WCF services) instead of the XmlSerializer works a treat:

using System;
using System.IO;
using System.Runtime.Serialization;
using System.Text;

public class MyData
{
    public string Text { get; set; }
}

class Program
{
    public static void Main(string[] args)
    {
        var myData = new MyData
        {
            Text = "hello "
                + ASCIIEncoding.ASCII.GetString(new byte[] { 0x12 })
                + " world"
        };

        var serializer = new DataContractSerializer(typeof(MyData));

        var mem = new MemoryStream();

        serializer.WriteObject(mem, myData);

        // Rewind the stream and read the object back.
        mem.Seek(0, SeekOrigin.Begin);
        MyData myData2 = (MyData)serializer.ReadObject(mem);

        Console.WriteLine("myData2 {0}", myData2.Text);
    }
}

Workaround

I can get it to choke while writing the XML by using an XmlWriter, which is arguably better than the client choking on it, e.g.


using System;
using System.IO;
using System.Text;
using System.Xml;

public class MyData
{
    public string Text { get; set; }
}

class Program
{
    public static void Main(string[] args)
    {
        var myData = new MyData
        {
            Text = "hello "
                + ASCIIEncoding.ASCII.GetString(new byte[] { 0x12 })
                + " world"
        };
        var serializer = new System.Xml.Serialization.XmlSerializer(typeof(MyData));

        var sw = new StringWriter();
        // Default settings: CheckCharacters is true.
        XmlWriterSettings settings = new XmlWriterSettings();

        using (var writer = XmlWriter.Create(sw, settings))
        {
            serializer.Serialize(writer, myData); // Exception
            // hexadecimal value 0x12, is an invalid character
        }
        var xmlReader = new StringReader(sw.ToString());

        var newUser = (MyData)serializer.Deserialize(xmlReader);

        Console.WriteLine("User Name = {0}", newUser.Text);
    }
}
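As a variation on the same XmlWriter idea, it may be possible to go the other way and relax the check on both ends. The sketch below assumes you control both the writer and the reader (so it would not help third-party clients), and relies on setting CheckCharacters to false in both XmlWriterSettings and XmlReaderSettings; LenientXml is a hypothetical helper name. Note that the XML produced is not strictly well-formed XML 1.0.

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.Serialization;

public class MyData
{
    public string Text { get; set; }
}

// Hypothetical helper: serialises and deserialises with character checking off.
public static class LenientXml
{
    public static string Serialize(MyData data)
    {
        var serializer = new XmlSerializer(typeof(MyData));
        var settings = new XmlWriterSettings { CheckCharacters = false };
        var sw = new StringWriter();
        using (var writer = XmlWriter.Create(sw, settings))
        {
            serializer.Serialize(writer, data);
        }
        return sw.ToString();
    }

    public static MyData Deserialize(string xml)
    {
        var serializer = new XmlSerializer(typeof(MyData));
        // Turns off checking of character entity references on read.
        var settings = new XmlReaderSettings { CheckCharacters = false };
        using (var reader = XmlReader.Create(new StringReader(xml), settings))
        {
            return (MyData)serializer.Deserialize(reader);
        }
    }
}

class Program
{
    public static void Main(string[] args)
    {
        var original = new MyData { Text = "hello \u0012 world" };
        string xml = LenientXml.Serialize(original);
        MyData copy = LenientXml.Deserialize(xml);
        Console.WriteLine(copy.Text == original.Text);
    }
}
```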



public List<MyData> MyWebServiceMethod()
{
    var mydata = GetMyData();
    return Helper.ScrubObjectOfSpecialCharacters<List<MyData>>(mydata);
}

public static class Helper
{
    // Requires System.IO, System.Text and System.Xml.Serialization.
    public static T ScrubObjectOfSpecialCharacters<T>(T obj)
    {
        var serializer = new XmlSerializer(obj.GetType());

        using (StringWriter writer = new StringWriter())
        {
            serializer.Serialize(writer, obj);

            string content = writer.ToString();

            content = FixSpecialCharacters(content);

            using (StringReader reader = new StringReader(content))
            {
                obj = (T)serializer.Deserialize(reader);
            }
        }
        return obj;
    }

    public static string FixSpecialCharacters(string input)
    {
        if (string.IsNullOrEmpty(input)) return input;

        StringBuilder output = new StringBuilder();
        for (int i = 0; i < input.Length; i++)
        {
            int charCode = (int)input[i];
            switch (charCode)
            {
                case 8211:
                case 8212:
                    // replaces short and long hyphen
                    output.Append('-');
                    break;
                default:
                    // keeps printable ASCII and tab; drops everything else
                    if ((31 < charCode && charCode < 127) || charCode == 9)
                    {
                        output.Append(input[i]);
                    }
                    break;
            }
        }
        return output.ToString();
    }
}

