Code Garage – Escaping and UnEscaping XML Strings

If you work much with XML, you’ll eventually need to take raw string data and either escape it (replace characters that are illegal in XML with legal XML markup) or unescape it (replace any XML markup in the string with the characters it represents).

For instance, a “<” (less than) or “&” (ampersand) is illegal in the content of an XML node and must be represented by &lt; or &amp; respectively. A “>” (greater than) is conventionally escaped as well, as &gt;.

There are any number of approaches out there to handle this. Most involve simple brute-force string search-and-replace operations. They work, but I thought there must be a more elegant (and already thought out) solution to this problem.
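For reference, the brute-force version usually looks something like the sketch below. The module and function names (BruteForceXmlEscape, EscapeByReplace, UnescapeByReplace) are made up for illustration, and it only covers the three characters that matter in node content. Note how much depends on the order of the replacements.

```vbnet
Imports System

Module BruteForceXmlEscape
    ' Hypothetical helper: escape node content by plain search and replace.
    Function EscapeByReplace(ByVal s As String) As String
        If String.IsNullOrEmpty(s) Then Return ""
        ' "&" must be replaced first; otherwise the "&" introduced by
        ' "&lt;" and "&gt;" would itself be escaped a second time.
        Return s.Replace("&", "&amp;").Replace("<", "&lt;").Replace(">", "&gt;")
    End Function

    ' Hypothetical helper: the reverse operation. Here "&amp;" must go last,
    ' so that "&amp;lt;" decodes to the literal text "&lt;" and not to "<".
    Function UnescapeByReplace(ByVal s As String) As String
        If String.IsNullOrEmpty(s) Then Return ""
        Return s.Replace("&lt;", "<").Replace("&gt;", ">").Replace("&amp;", "&")
    End Function

    Sub Main()
        Dim raw As String = "a < b && b > c"
        Dim escaped As String = EscapeByReplace(raw)
        Console.WriteLine(escaped)                           ' a &lt; b &amp;&amp; b &gt; c
        Console.WriteLine(UnescapeByReplace(escaped) = raw)  ' True
    End Sub
End Module
```

This sketch does nothing about numeric character references such as &#38; or about quotes, which is exactly the kind of detail that makes the hand-rolled approach easy to get wrong.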

However, after quite a bit of searching, I gave up and wrote my own, making use of the XML functions already defined in the .NET Framework.

To encode a string into legal XML node content…

    Public Function EncodeXML(ByVal s As String) As String
        ' Nothing or an empty string needs no escaping
        If Len(s) = 0 Then Return ""
        Dim encodedString = New System.Text.StringBuilder()
        Dim writersettings = New System.Xml.XmlWriterSettings
        ' Fragment lets the writer emit bare text with no root element
        writersettings.ConformanceLevel = System.Xml.ConformanceLevel.Fragment
        Using writer = System.Xml.XmlWriter.Create(encodedString, writersettings)
            ' WriteString escapes &, < and > as it writes
            writer.WriteString(s)
        End Using
        Return encodedString.ToString
    End Function

To reverse the process and decode a string…

    Public Function DecodeXML(ByVal s As String) As String
        If Len(s) = 0 Then Return ""
        Dim decodedString As String = ""
        Dim readersettings = New System.Xml.XmlReaderSettings
        ' Fragment lets the reader accept bare text with no root element
        readersettings.ConformanceLevel = System.Xml.ConformanceLevel.Fragment
        Using sr = New System.IO.StringReader(s)
            Using reader = System.Xml.XmlReader.Create(sr, readersettings)
                ' Skip to the first content node, then read its text with entities resolved
                reader.MoveToContent()
                decodedString = reader.ReadString
            End Using
        End Using
        Return decodedString
    End Function
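To try the two functions together, the following sketch wraps the same writer and reader approach in a small console module and round-trips a sample string. The module name XmlEscapeDemo and the sample text are made up for illustration, and the functions are restated in compact form only so the file compiles on its own.

```vbnet
Imports System
Imports System.IO
Imports System.Text
Imports System.Xml

Module XmlEscapeDemo
    ' Same idea as EncodeXML: let XmlWriter do the escaping.
    Function EncodeXML(ByVal s As String) As String
        If String.IsNullOrEmpty(s) Then Return ""
        Dim sb As New StringBuilder()
        Dim settings As New XmlWriterSettings()
        settings.ConformanceLevel = ConformanceLevel.Fragment
        Using writer As XmlWriter = XmlWriter.Create(sb, settings)
            writer.WriteString(s)
        End Using
        Return sb.ToString()
    End Function

    ' Same idea as DecodeXML: let XmlReader resolve the entities.
    Function DecodeXML(ByVal s As String) As String
        If String.IsNullOrEmpty(s) Then Return ""
        Dim settings As New XmlReaderSettings()
        settings.ConformanceLevel = ConformanceLevel.Fragment
        Using reader As XmlReader = XmlReader.Create(New StringReader(s), settings)
            reader.MoveToContent()
            Return reader.ReadString()
        End Using
    End Function

    Sub Main()
        Dim raw As String = "Tom & Jerry <3 > 2"   ' sample text, purely illustrative
        Dim encoded As String = EncodeXML(raw)
        Console.WriteLine(encoded)                 ' Tom &amp; Jerry &lt;3 &gt; 2
        ' Prints whether decoding the encoded text gives back the original.
        Console.WriteLine(DecodeXML(encoded) = raw)
    End Sub
End Module
```

Because WriteString targets node content, quotes are left alone, so the encoded result is not meant to be dropped straight into an attribute value.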

No, I haven’t run any exhaustive performance measurements on these. I’ve generally not used them in any kind of ultra-high-volume situation, but they’re clean, simple and leverage existing code that already performs the necessary functions.
