NeuroAgent

Fix Double-Escaped Backslashes in XML-to-JSON Conversion in SQL Server

Learn how to fix double-escaped backslashes in XML to JSON conversion using SQL Server's STRING_ESCAPE function. Complete guide with code examples and best practices.

How to correctly escape trailing backslashes in XML to JSON conversion using SQL Server?

I have created a recursive SQL Server scalar-valued function that converts XML data to a JSON string. The function works well for most cases, including nested elements and handling of arrays (using a json:Array attribute).

sql
CREATE OR ALTER FUNCTION dbo.XmlToJson(@XmlData xml)  
RETURNS nvarchar(max)
WITH RETURNS NULL ON NULL INPUT
AS  
BEGIN  
    DECLARE @m nvarchar(max);

    WITH XMLNAMESPACES (N'http://james.newtonking.com/projects/json' AS json)
    SELECT @m = '{' + STRING_AGG(
  '"' + STRING_ESCAPE(name, 'json') + '":' + value,
  ','
) + '}'
    FROM 
        (SELECT
             v.name,
             CONCAT(CASE WHEN COUNT(*) > 1 OR MAX(isArray) = 1 THEN '[' END,
                    STRING_AGG(ISNULL('"' + REPLACE(STRING_ESCAPE(x.a.value('text()[1]', 'nvarchar(max)'), 'json'), '\', '\\') + '"', dbo.XmlToJson(x.a.query('./*'))), ','),
                    CASE WHEN COUNT(*) > 1 OR MAX(isArray) = 1 THEN ']' END
                   ) AS value
         FROM @XmlData.nodes('./*') x(a)
         CROSS APPLY 
             (SELECT
                  x.a.value('local-name(.)', 'nvarchar(4000)') AS name,
                  x.a.value('xs:int(xs:boolean(@json:Array))', 'int') AS isArray) v
         GROUP BY
             v.name) grouped;

    SET @m = ISNULL(@m, 'null');
    SET @m = REPLACE(@m, '\/', '/');

    RETURN @m;
END;

However, I’m facing an issue with escaping backslashes in text content. Specifically, when an XML element’s text content ends with a backslash, my current logic results in an extra backslash escape in the final JSON output.

The desired output for a path like C:\Books\Book1\Book1.pdf\ should be "C:\\Books\\Book1\\Book1.pdf\\". My current output is producing "C:\\\\Books\\\\Book1\\\\Book1.pdf\\\\".

For this input:

sql
DECLARE @xml xml = N'<root>
    <Book>Book1</Book>
    <TransactionId  xmlns:json="http://james.newtonking.com/projects/json" json:Array="true">abc123</TransactionId>
    <Publisher>Amazon</Publisher>
    <Edition  xmlns:json="http://james.newtonking.com/projects/json" json:Array="true">
        <Name>Ed1</Name>
        <Color>Red</Color>
        <Price>100</Price>
        <file>C:\Books\Book1\Book1.pdf\</file>
    </Edition>
    <PublisherId>1</PublisherId>
    <UserId>1234</UserId>
    <Release />
</root>
';

I get this output:

json
{"Book":"Book1","Edition":[{"Color":"Red","file":"C:\\Books\\Book1\\Book1.pdf\\","Name":"Ed1","Price":"100"}],"Publisher":"Amazon","PublisherId":"1","Release":null,"TransactionId":["abc123"],"UserId":"1234"}

The issue seems to stem from a conflict between STRING_ESCAPE and a manual REPLACE I’m using to handle general backslashes within the string, and how this interacts when the character is at the very end of the text.

I’ve created a DB fiddle for reference: https://dbfiddle.uk/rUlklVK8

However, the fiddle does not reproduce the issue I see on my own SQL Server.

Details:
Microsoft SQL Server 2019 (RTM-CU22-GDR) (KB5029378) - 15.0.4326.1 (X64)
Copyright © 2019 Microsoft Corporation
Developer Edition (64-bit) on Windows Server 2019 Standard 10.0 (Build 17763: ) (Hypervisor)

NeuroAgent

The issue occurs because you’re applying REPLACE after STRING_ESCAPE, causing double-escaping of backslashes. STRING_ESCAPE already properly escapes backslashes for JSON, so the additional REPLACE is unnecessary and problematic.

Here’s how to fix your function:

sql
CREATE OR ALTER FUNCTION dbo.XmlToJson(@XmlData xml)  
RETURNS nvarchar(max)
WITH RETURNS NULL ON NULL INPUT
AS  
BEGIN  
    DECLARE @m nvarchar(max);

    WITH XMLNAMESPACES (N'http://james.newtonking.com/projects/json' AS json)
    SELECT @m = '{' + STRING_AGG(
  '"' + STRING_ESCAPE(name, 'json') + '":' + value,
  ','
) + '}'
    FROM 
        (SELECT
             v.name,
             CONCAT(CASE WHEN COUNT(*) > 1 OR MAX(isArray) = 1 THEN '[' END,
                    STRING_AGG(ISNULL('"' + STRING_ESCAPE(x.a.value('text()[1]', 'nvarchar(max)'), 'json') + '"', dbo.XmlToJson(x.a.query('./*'))), ','),
                    CASE WHEN COUNT(*) > 1 OR MAX(isArray) = 1 THEN ']' END
                   ) AS value
         FROM @XmlData.nodes('./*') x(a)
         CROSS APPLY 
             (SELECT
                  x.a.value('local-name(.)', 'nvarchar(4000)') AS name,
                  x.a.value('xs:int(xs:boolean(@json:Array))', 'int') AS isArray) v
         GROUP BY
             v.name) grouped;

    SET @m = ISNULL(@m, 'null');
    -- Only replace forward slashes, leave backslashes as escaped by STRING_ESCAPE
    SET @m = REPLACE(@m, '\/', '/');

    RETURN @m;
END;

The key changes:

  1. Removed backslash replacement: Eliminated REPLACE(STRING_ESCAPE(...), '\', '\\') since STRING_ESCAPE already handles backslash escaping correctly
  2. Kept forward slash replacement: Maintained REPLACE(@m, '\/', '/') because STRING_ESCAPE always emits / as \/, which is valid JSON but optional and harder to read

Understanding the Issue

The problem stems from double-escaping of backslashes in your XML to JSON conversion. When STRING_ESCAPE processes a string containing backslashes, it properly escapes them as required by JSON standards. However, your subsequent REPLACE function was treating these already-escaped backslashes as raw backslashes and escaping them again.

For example:

  • Original content: C:\Books\Book1\Book1.pdf\
  • After STRING_ESCAPE: C:\\Books\\Book1\\Book1.pdf\\
  • After your REPLACE: C:\\\\Books\\\\Book1\\\\Book1.pdf\\\\ (double-escaped)

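The result is still syntactically valid JSON, but it decodes to a path in which every backslash is doubled, so it no longer represents the original value or matches the expected output.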
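The double escaping can be reproduced in isolation, outside the function. A minimal sketch, assuming only the sample path from the question (the variable name @raw is illustrative):

```sql
-- Sample path from the question; @raw is a hypothetical variable name.
DECLARE @raw nvarchar(max) = N'C:\Books\Book1\Book1.pdf\';

SELECT
    STRING_ESCAPE(@raw, 'json')                      AS escaped_once,
    REPLACE(STRING_ESCAPE(@raw, 'json'), '\', '\\')  AS escaped_twice;
-- escaped_once : C:\\Books\\Book1\\Book1.pdf\\
-- escaped_twice: C:\\\\Books\\\\Book1\\\\Book1.pdf\\\\
```

The second column doubles every backslash that STRING_ESCAPE already emitted, which is the behavior described in the question.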

STRING_ESCAPE Function Behavior

According to Microsoft documentation, the STRING_ESCAPE function “escapes special characters in texts and returns text with escaped characters.” Specifically for JSON:

  • Backslashes (\) are escaped as \\
  • Double quotes (") are escaped as \"
  • Forward slashes (/) are escaped as \/ (though this is optional in JSON)
  • Control characters are escaped as \b, \f, \n, \r, \t, or \u00XX

The SQL Authority article listed in the sources likewise describes STRING_ESCAPE as handling the escaping needed for JSON strings.
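The rules above can be observed on a single value. This is a small sketch; @sample is a made-up string chosen to contain a backslash, double quotes, a forward slash, and a tab:

```sql
-- @sample is a hypothetical test value, not taken from the question.
DECLARE @sample nvarchar(100) = N'a\b "q" c/d' + NCHAR(9);

SELECT STRING_ESCAPE(@sample, 'json') AS escaped;
-- escaped: a\\b \"q\" c\/d\t
```

Note that STRING_ESCAPE returns only the escaped text; the surrounding double quotes of the JSON string still have to be added by the caller, as the function in the question does.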

Solution Implementation

The corrected function removes the unnecessary backslash replacement while keeping the forward slash handling:

sql
-- Before (problematic):
REPLACE(STRING_ESCAPE(x.a.value('text()[1]', 'nvarchar(max)'), 'json'), '\', '\\')

-- After (correct):
STRING_ESCAPE(x.a.value('text()[1]', 'nvarchar(max)'), 'json')

This ensures that:

  1. STRING_ESCAPE performs all necessary JSON escaping once
  2. Only forward slashes are manually handled if needed
  3. Backslashes remain properly escaped as \\ in the final JSON
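To see why unescaping forward slashes afterwards does not disturb the escaped backslashes, here is a hedged sketch. The URL-like value in @raw is invented for illustration and deliberately contains a backslash directly in front of a slash:

```sql
-- Hypothetical value containing both plain slashes and a backslash followed by a slash.
DECLARE @raw     nvarchar(max) = N'http://example.com/a\/b';
DECLARE @escaped nvarchar(max) = STRING_ESCAPE(@raw, 'json');    -- http:\/\/example.com\/a\\\/b
DECLARE @relaxed nvarchar(max) = REPLACE(@escaped, '\/', '/');   -- http://example.com/a\\/b

SELECT
    @escaped AS escaped,
    @relaxed AS relaxed,
    -- Decode the relaxed form again to confirm it still represents the original text.
    JSON_VALUE(N'{"v":"' + @relaxed + N'"}', '$.v') AS decoded;  -- http://example.com/a\/b
```

Because STRING_ESCAPE emits backslashes in pairs and never leaves a bare forward slash, the only place the sequence \/ can match is an escaped slash, so the REPLACE removes exactly one backslash per slash.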

Testing the Fix

With your test data:

sql
DECLARE @xml xml = N'<root>
    <Book>Book1</Book>
    <TransactionId  xmlns:json="http://james.newtonking.com/projects/json" json:Array="true">abc123</TransactionId>
    <Publisher>Amazon</Publisher>
    <Edition  xmlns:json="http://james.newtonking.com/projects/json" json:Array="true">
        <Name>Ed1</Name>
        <Color>Red</Color>
        <Price>100</Price>
        <file>C:\Books\Book1\Book1.pdf\</file>
    </Edition>
    <PublisherId>1</PublisherId>
    <UserId>1234</UserId>
    <Release />
</root>';

The corrected function will now produce:

json
{"Book":"Book1","Edition":[{"Color":"Red","file":"C:\\Books\\Book1\\Book1.pdf\\","Name":"Ed1","Price":"100"}],"Publisher":"Amazon","PublisherId":"1","Release":null,"TransactionId":["abc123"],"UserId":"1234"}

Notice the file path now shows "C:\\Books\\Book1\\Book1.pdf\\" with proper single escaping instead of the double-escaped version.
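One way to check the result mechanically is to validate and decode it. This sketch assumes the corrected dbo.XmlToJson from above is installed and that it is called on the children of the root element; the original post does not show the call, so the @xml.query('/root/*') argument is an assumption, and the XML is trimmed to the one element of interest:

```sql
-- Reduced test document: only the element with the trailing backslash.
DECLARE @xml xml = N'<root><file>C:\Books\Book1\Book1.pdf\</file></root>';

-- Assumed invocation: pass the child elements of <root> to the function.
DECLARE @json nvarchar(max) = dbo.XmlToJson(@xml.query('/root/*'));

SELECT
    @json                        AS json_text,     -- should be {"file":"C:\\Books\\Book1\\Book1.pdf\\"}
    ISJSON(@json)                AS is_valid,      -- 1 when the text parses as JSON
    JSON_VALUE(@json, '$.file')  AS decoded_file;  -- should be C:\Books\Book1\Book1.pdf\
```

If decoded_file matches the original element text exactly, the backslashes were escaped once and only once.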

Alternative Approaches

If you need more control over the escaping process, consider these alternatives:

1. Custom Escaping Function

sql
CREATE FUNCTION dbo.CustomJsonEscape(@String nvarchar(max))
RETURNS nvarchar(max)
AS
BEGIN
    -- Apply STRING_ESCAPE first
    DECLARE @Escaped nvarchar(max) = STRING_ESCAPE(@String, 'json');
    
    -- Handle any special cases not covered by STRING_ESCAPE
    -- For example, if you need to preserve certain escape sequences
    
    RETURN @Escaped;
END

2. Post-Processing with JSON_MODIFY

JSON_MODIFY (available since SQL Server 2016) escapes string values itself, so pass it the raw text rather than the output of STRING_ESCAPE; escaping the value first would double the backslashes again:

sql
SELECT JSON_MODIFY(
    N'{}',
    '$.file',
    N'C:\Books\Book1\Book1.pdf\'
) AS JsonResult;
-- {"file":"C:\\Books\\Book1\\Book1.pdf\\"}

Best Practices for XML to JSON Conversion

When converting XML to JSON in SQL Server:

  1. Use STRING_ESCAPE once: Apply it directly to your text values without additional escaping
  2. Handle special cases carefully: Only manually replace characters that STRING_ESCAPE doesn’t handle appropriately
  3. Test edge cases: Always test with trailing backslashes, quotes, and special characters
  4. Validate JSON output: Use ISJSON() to ensure your results are valid JSON
  5. Consider performance: For large datasets, test different approaches to find the most efficient method
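Points 3 and 4 can be combined into a small round-trip check. The following is a sketch only; the test strings and the aliases t, j, raw, and json_text are made up for illustration:

```sql
-- Each hypothetical test value is escaped once, wrapped in a JSON object,
-- validated with ISJSON, and decoded again with JSON_VALUE.
SELECT
    t.raw,
    j.json_text,
    ISJSON(j.json_text) AS is_valid,
    CASE WHEN JSON_VALUE(j.json_text, '$.v') = t.raw THEN 1 ELSE 0 END AS round_trips
FROM (VALUES
        (N'C:\dir\'),                        -- trailing backslash
        (N'say "hi"'),                       -- embedded double quotes
        (N'a/b'),                            -- forward slash
        (N'tab' + NCHAR(9) + N'end')         -- control character
     ) AS t(raw)
CROSS APPLY (SELECT N'{"v":"' + STRING_ESCAPE(t.raw, 'json') + N'"}' AS json_text) AS j;
```

Every row should report is_valid = 1 and round_trips = 1. Keep in mind that the = comparison follows the database collation and ignores trailing spaces, so a stricter check may be needed for values where case or trailing whitespace matters.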

The key insight is that STRING_ESCAPE is designed to handle all necessary JSON escaping, so additional manual escaping is typically unnecessary and can cause issues like the double-escaping problem you encountered.

Sources

  1. Microsoft Learn - STRING_ESCAPE (Transact-SQL)
  2. SQL Authority - SQL SERVER - STRING_ESCAPE() for JSON
  3. Database Administrators Stack Exchange - Properly escape characters in a string for JSON
  4. Stack Overflow - STRING_ESCAPE json forward slash