Reason behind com.ctc.wstx.exc.WstxUnexpectedCharException: Unexpected character '>' (code 62) in content after '<' (malformed start element?) in XSLT conversion WSO2 ESB 4.9.0
[Article moved from ajanthane.blogspot.com]
This article explains the reason behind the above exception, which occurs while doing an XSLT transformation on content that contains &lt; with disable-output-escaping="yes".
Consider a proxy service such as the one below:
<?xml version="1.0" encoding="UTF-8"?>
<proxy xmlns="http://ws.apache.org/ns/synapse"
name="CheckXSLTEscapeCharacters"
transports="https,http"
statistics="disable"
trace="disable"
startOnLoad="true">
<target>
<inSequence>
<log level="custom">
<property name="STATUS"
value="--------------------CheckXSLTEscapeCharacters Invoked--------------------"/>
</log>
<log level="full"/>
<log level="custom">
<property name="STATUS"
value="--------------------CheckXSLTEscapeCharacters After Log full Invoked--------------------"/>
</log>
<xslt key="conf:TransformationXSLT.xslt"/>
<log level="custom">
<property name="STATUS"
value="--------------------CheckXSLTEscapeCharacters Invoked After XSLT Transformation--------------------"/>
</log>
<respond/>
</inSequence>
</target>
<description/>
</proxy>
The XSLT is as below:
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">
<xsl:output method="xml" indent="yes" cdata-section-elements="Description"/>
<xsl:preserve-space elements="*" />
<xsl:template match="Description">
<Description>
<xsl:value-of select="." disable-output-escaping="yes" />
</Description>
</xsl:template>
<xsl:template match="@*|node()">
<xsl:copy>
<xsl:apply-templates select="@*|node()"/>
</xsl:copy>
</xsl:template>
</xsl:stylesheet>
Send a request to the proxy as below:
<soapenv:Envelope xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/" xmlns:echo="http://echo.services.core.carbon.wso2.org">
<soapenv:Header/>
<soapenv:Body>
<Description>Test for escaping &lt; &gt; characters</Description>
</soapenv:Body>
</soapenv:Envelope>
We will then get an error like the one below:
[2017-04-21 20:21:15,469] DEBUG - wire >> "POST /services/CheckXSLTEscapeCharacters HTTP/1.1[\r][\n]"
[2017-04-21 20:21:15,470] DEBUG - wire >> "Accept-Encoding: gzip,deflate[\r][\n]"
[2017-04-21 20:21:15,470] DEBUG - wire >> "Content-Type: text/xml;charset=UTF-8[\r][\n]"
[2017-04-21 20:21:15,470] DEBUG - wire >> "SOAPAction: "urn:echoString"[\r][\n]"
[2017-04-21 20:21:15,470] DEBUG - wire >> "Content-Length: 281[\r][\n]"
[2017-04-21 20:21:15,470] DEBUG - wire >> "Host: ajanthan-ThinkPad-T440p:8280[\r][\n]"
[2017-04-21 20:21:15,470] DEBUG - wire >> "Connection: Keep-Alive[\r][\n]"
[2017-04-21 20:21:15,471] DEBUG - wire >> "User-Agent: Apache-HttpClient/4.1.1 (java 1.5)[\r][\n]"
[2017-04-21 20:21:15,471] DEBUG - wire >> "[\r][\n]"
[2017-04-21 20:21:15,471] DEBUG - wire >> "<soapenv:Envelope xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/" xmlns:echo="http://echo.services.core.carbon.wso2.org">[\n]"
[2017-04-21 20:21:15,471] DEBUG - wire >> " <soapenv:Header/>[\n]"
[2017-04-21 20:21:15,471] DEBUG - wire >> " <soapenv:Body>[\n]"
[2017-04-21 20:21:15,471] DEBUG - wire >> " <Description>Test for escaping &lt; &gt; characters</Description>[\n]"
[2017-04-21 20:21:15,471] DEBUG - wire >> " </soapenv:Body>[\n]"
[2017-04-21 20:21:15,471] DEBUG - wire >> "</soapenv:Envelope>"
[2017-04-21 20:21:15,499] INFO - LogMediator STATUS = --------------------CheckXSLTEscapeCharacters Invoked--------------------
[2017-04-21 20:21:15,500] INFO - LogMediator To: /services/CheckXSLTEscapeCharacters, WSAction: urn:echoString, SOAPAction: urn:echoString, MessageID: urn:uuid:d8ed88d8-a144-4741-8d48-a2eac641d898, Direction: request, Envelope: <?xml version='1.0' encoding='utf-8'?><soapenv:Envelope xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/" xmlns:echo="http://echo.services.core.carbon.wso2.org"><soapenv:Body>
<Description>Test for escaping &lt; > characters</Description>
</soapenv:Body></soapenv:Envelope>
[2017-04-21 20:21:15,500] INFO - LogMediator STATUS = --------------------CheckXSLTEscapeCharacters After Log full Invoked--------------------
[2017-04-21 20:21:17,876] ERROR - XSLTMediator Unable to perform XSLT transformation using : Value {name ='null', keyValue ='conf:TransformationXSLT.xslt'} against source XPath : s11:Body/child::*[position()=1] | s12:Body/child::*[position()=1] reason : com.ctc.wstx.exc.WstxUnexpectedCharException: Unexpected character ' ' (code 32) in content after '<' (malformed start element?).
at [row,col {unknown-source}]: [2,33]
org.apache.axiom.om.OMException: com.ctc.wstx.exc.WstxUnexpectedCharException: Unexpected character ' ' (code 32) in content after '<' (malformed start element?).
at [row,col {unknown-source}]: [2,33]
at org.apache.axiom.om.impl.builder.StAXOMBuilder.next(StAXOMBuilder.java:296)
at org.apache.axiom.om.impl.llom.OMSerializableImpl.build(OMSerializableImpl.java:78)
at org.apache.axiom.om.impl.llom.OMElementImpl.build(OMElementImpl.java:722)
at org.apache.axiom.om.impl.llom.OMElementImpl.detach(OMElementImpl.java:700)
at org.apache.axiom.om.impl.llom.OMNodeImpl.setParent(OMNodeImpl.java:105)
at org.apache.axiom.om.impl.llom.OMNodeImpl.insertSiblingAfter(OMNodeImpl.java:203)
at org.apache.synapse.mediators.transform.XSLTMediator.performXSLT(XSLTMediator.java:360)
at org.apache.synapse.mediators.transform.XSLTMediator.mediate(XSLTMediator.java:196)
at org.apache.synapse.mediators.AbstractListMediator.mediate(AbstractListMediator.java:81)
at org.apache.synapse.mediators.AbstractListMediator.mediate(AbstractListMediator.java:48)
at org.apache.synapse.mediators.base.SequenceMediator.mediate(SequenceMediator.java:149)
at org.apache.synapse.core.axis2.ProxyServiceMessageReceiver.receive(ProxyServiceMessageReceiver.java:185)
at org.apache.axis2.engine.AxisEngine.receive(AxisEngine.java:180)
at org.apache.synapse.transport.passthru.ServerWorker.processEntityEnclosingRequest(ServerWorker.java:395)
at org.apache.synapse.transport.passthru.ServerWorker.run(ServerWorker.java:142)
at org.apache.axis2.transport.base.threads.NativeWorkerPool$1.run(NativeWorkerPool.java:172)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Caused by: com.ctc.wstx.exc.WstxUnexpectedCharException: Unexpected character ' ' (code 32) in content after '<' (malformed start element?).
at [row,col {unknown-source}]: [2,33]
at com.ctc.wstx.sr.StreamScanner.throwUnexpectedChar(StreamScanner.java:639)
at com.ctc.wstx.sr.BasicStreamReader.nextFromTree(BasicStreamReader.java:2843)
at com.ctc.wstx.sr.BasicStreamReader.next(BasicStreamReader.java:1072)
at org.apache.axiom.om.impl.builder.StAXOMBuilder.parserNext(StAXOMBuilder.java:681)
at org.apache.axiom.om.impl.builder.StAXOMBuilder.next(StAXOMBuilder.java:214)
at org.apache.axiom.om.impl.llom.OMElementImpl.buildNext(OMElementImpl.java:653)
at org.apache.axiom.om.impl.llom.OMNodeImpl.getNextOMSibling(OMNodeImpl.java:122)
at org.apache.axiom.om.impl.traverse.OMChildrenIterator.getNextNode(OMChildrenIterator.java:36)
at org.apache.axiom.om.impl.traverse.OMAbstractIterator.hasNext(OMAbstractIterator.java:58)
at org.apache.axiom.om.impl.util.OMSerializerUtil.serializeChildren(OMSerializerUtil.java:554)
at org.apache.axiom.om.impl.llom.OMElementImpl.internalSerialize(OMElementImpl.java:875)
at org.apache.axiom.om.impl.llom.OMSerializableImpl.serialize(OMSerializableImpl.java:125)
at org.apache.axiom.om.impl.llom.OMSerializableImpl.serialize(OMSerializableImpl.java:113)
at org.apache.axiom.om.impl.llom.OMElementImpl.toString(OMElementImpl.java:988)
at org.apache.synapse.mediators.transform.XSLTMediator.performXSLT(XSLTMediator.java:310)
... 12 more
[2017-04-21 20:21:17,879] INFO - LogMediator To: /services/CheckXSLTEscapeCharacters, WSAction: urn:echoString, SOAPAction: urn:echoString, MessageID: urn:uuid:d8ed88d8-a144-4741-8d48-a2eac641d898, Direction: request, MESSAGE = Executing default 'fault' sequence, ERROR_CODE = 0, ERROR_MESSAGE = Unable to perform XSLT transformation using : Value {name ='null', keyValue ='conf:TransformationXSLT.xslt'} against source XPath : s11:Body/child::*[position()=1] | s12:Body/child::*[position()=1] reason : com.ctc.wstx.exc.WstxUnexpectedCharException: Unexpected character ' ' (code 32) in content after '<' (malformed start element?)., Envelope: <?xml version='1.0' encoding='utf-8'?><soapenv:Envelope xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/" xmlns:echo="http://echo.services.core.carbon.wso2.org"><soapenv:Body>
<Description>Test for escaping &lt; > characters</Description>
</soapenv:Body></soapenv:Envelope>
[2017-04-21 20:21:17,879] DEBUG - wire << "HTTP/1.1 202 Accepted[\r][\n]"
[2017-04-21 20:21:17,880] DEBUG - wire << "Date: Fri, 21 Apr 2017 14:51:17 GMT[\r][\n]"
[2017-04-21 20:21:17,880] DEBUG - wire << "Transfer-Encoding: chunked[\r][\n]"
[2017-04-21 20:21:17,880] DEBUG - wire << "Connection: Keep-Alive[\r][\n]"
[2017-04-21 20:21:17,881] DEBUG - wire << "[\r][\n]"
[2017-04-21 20:21:17,881] DEBUG - wire << "0[\r][\n]"
[2017-04-21 20:21:17,881] DEBUG - wire << "[\r][\n]"
Below is the detailed explanation for this behavior:
With disable-output-escaping="no", the output is escaped. If we send &lt; &gt;, the response carries them as &lt; and >: the &gt; is converted to a literal >, but the &lt; stays escaped. No exception is thrown in this case.
But when we set disable-output-escaping="yes" and send &lt; &gt;, the XML produced by the XSLT transformation has the &lt; converted to a literal < character (to provide the unescaped character). A lone literal < is not allowed in XML content [1]. In the XSLT mediator, when the transformed output is built back into the message, StAXOMBuilder reads this single < as the start of an element, finds a space instead of an element name, and throws the error. That is why we get the exception.
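The difference between the two settings can be reproduced outside the ESB. The sketch below is a minimal illustration, not the mediator's code: it assumes the JDK's built-in XSLT 1.0 processor and StAX parser in place of the ESB's XSLT engine and Woodstox, uses a cut-down version of the stylesheet, and the class and method names are made up for this example. The exception type and message will therefore differ from the WstxUnexpectedCharException shown in the log.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical demo class; not part of WSO2 ESB or Synapse.
final class DisableOutputEscapingDemo {

    // Cut-down stylesheet: rewrites Description, emitting its text with the
    // given disable-output-escaping value ("yes" or "no").
    static String stylesheet(String doe) {
        return "<xsl:stylesheet xmlns:xsl='http://www.w3.org/1999/XSL/Transform' version='1.0'>"
                + "<xsl:output method='xml' omit-xml-declaration='yes'/>"
                + "<xsl:template match='Description'>"
                + "<Description>"
                + "<xsl:value-of select='.' disable-output-escaping='" + doe + "'/>"
                + "</Description>"
                + "</xsl:template>"
                + "</xsl:stylesheet>";
    }

    // Runs the transformation and returns the serialized result as a string.
    static String transform(String inputXml, String doe) {
        try {
            Transformer t = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(stylesheet(doe))));
            StringWriter out = new StringWriter();
            t.transform(new StreamSource(new StringReader(inputXml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException(e);
        }
    }

    // Re-parses the serialized output with StAX, the step that fails in the mediator.
    static boolean isWellFormed(String xml) {
        try {
            XMLStreamReader r = XMLInputFactory.newInstance()
                    .createXMLStreamReader(new StringReader(xml));
            while (r.hasNext()) {
                r.next();
            }
            return true;
        } catch (XMLStreamException e) {
            return false; // e.g. a lone '<' followed by a space
        }
    }

    public static void main(String[] args) {
        String lone = "<Description>Test for escaping &lt; &gt; characters</Description>";
        String pair = "<Description>Test for escaping &lt;test&gt;ajan&lt;/test&gt; characters</Description>";
        for (String input : new String[] {lone, pair}) {
            for (String doe : new String[] {"no", "yes"}) {
                String out = transform(input, doe);
                System.out.println("doe=" + doe + " wellFormed=" + isWellFormed(out) + " -> " + out);
            }
        }
    }
}
```

Only the combination of a lone &lt; and disable-output-escaping="yes" should report the output as not well-formed; the element-like text becomes a real child element and parses cleanly.") : "doe=yes should emit a real element";
</test>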
The segment below, taken from [1], states that a literal < must not appear on its own in content.
[1] https://www.w3.org/TR/REC-xml/#syntax
The ampersand character (&) and the left angle bracket (<) MUST NOT appear in their literal form, except when used as markup delimiters, or within a comment, a processing instruction, or a CDATA section. If they are needed elsewhere, they MUST be escaped using either numeric character references or the strings "&amp;" and "&lt;" respectively. The right angle bracket (>) may be represented using the string "&gt;", and MUST, for compatibility, be escaped using either "&gt;" or a character reference when it appears in the string "]]>" in content, when that string is not marking the end of a CDATA section.
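The rule quoted above can be checked with any conforming parser. The following is a small sketch that assumes the JDK's bundled StAX parser rather than Woodstox; the class name and the sample strings are made up for illustration.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper; it only reports whether a document parses to the end.
final class LiteralMarkupCharCheck {

    static boolean parses(String xml) {
        try {
            XMLStreamReader r = XMLInputFactory.newInstance()
                    .createXMLStreamReader(new StringReader(xml));
            while (r.hasNext()) {
                r.next();
            }
            return true;
        } catch (XMLStreamException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String[] samples = {
            "<d>a < b</d>",                  // literal '<' in content: rejected
            "<d>a & b</d>",                  // literal '&' in content: rejected
            "<d>a &lt; b &amp; c</d>",       // escaped forms: accepted
            "<d>a > b</d>",                  // literal '>' is allowed here
            "<d><![CDATA[a < b & c]]></d>"   // literals inside CDATA: accepted
        };
        for (String s : samples) {
            System.out.println(parses(s) + "  " + s);
        }
    }
}
```

The asymmetry between < and > explains why only the unescaped < triggers the exception, while the unescaped > passes through.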
Also, if we send a payload like below:
<soapenv:Envelope xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/" xmlns:echo="http://echo.services.core.carbon.wso2.org">
<soapenv:Header/>
<soapenv:Body>
<Description>Test for escaping &lt;test&gt;ajan&lt;/test&gt; characters</Description>
</soapenv:Body>
</soapenv:Envelope>
In this case, even with disable-output-escaping="yes", the transformation succeeds, because the unescaped text is well-formed XML. The output will be as below:
[2017-04-21 19:20:05,158] DEBUG - wire >> "POST /services/CheckXSLTEscapeCharacters HTTP/1.1[\r][\n]"
[2017-04-21 19:20:05,158] DEBUG - wire >> "Accept-Encoding: gzip,deflate[\r][\n]"
[2017-04-21 19:20:05,158] DEBUG - wire >> "Content-Type: text/xml;charset=UTF-8[\r][\n]"
[2017-04-21 19:20:05,158] DEBUG - wire >> "SOAPAction: "urn:echoString"[\r][\n]"
[2017-04-21 19:20:05,159] DEBUG - wire >> "Content-Length: 301[\r][\n]"
[2017-04-21 19:20:05,159] DEBUG - wire >> "Host: ajanthan-ThinkPad-T440p:8280[\r][\n]"
[2017-04-21 19:20:05,159] DEBUG - wire >> "Connection: Keep-Alive[\r][\n]"
[2017-04-21 19:20:05,159] DEBUG - wire >> "User-Agent: Apache-HttpClient/4.1.1 (java 1.5)[\r][\n]"
[2017-04-21 19:20:05,159] DEBUG - wire >> "[\r][\n]"
[2017-04-21 19:20:05,159] DEBUG - wire >> "<soapenv:Envelope xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/" xmlns:echo="http://echo.services.core.carbon.wso2.org">[\n]"
[2017-04-21 19:20:05,160] DEBUG - wire >> " <soapenv:Header/>[\n]"
[2017-04-21 19:20:05,160] DEBUG - wire >> " <soapenv:Body>[\n]"
[2017-04-21 19:20:05,160] DEBUG - wire >> " <Description>Test for escaping &lt;test&gt;ajan&lt;/test&gt; characters</Description>[\n]"
[2017-04-21 19:20:05,160] DEBUG - wire >> " </soapenv:Body>[\n]"
[2017-04-21 19:20:05,160] DEBUG - wire >> "</soapenv:Envelope>"
[2017-04-21 19:20:05,162] INFO - LogMediator STATUS = --------------------CheckXSLTEscapeCharacters Invoked--------------------
[2017-04-21 19:20:05,163] INFO - LogMediator To: /services/CheckXSLTEscapeCharacters, WSAction: urn:echoString, SOAPAction: urn:echoString, MessageID: urn:uuid:dc961c46-e107-4c44-bb02-38b954aab201, Direction: request, Envelope: <?xml version='1.0' encoding='utf-8'?><soapenv:Envelope xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/" xmlns:echo="http://echo.services.core.carbon.wso2.org"><soapenv:Body>
<Description>Test for escaping &lt;test>ajan&lt;/test> characters</Description>
</soapenv:Body></soapenv:Envelope>
[2017-04-21 19:20:05,164] INFO - LogMediator STATUS = --------------------CheckXSLTEscapeCharacters After Log full Invoked--------------------
[2017-04-21 19:20:07,440] INFO - LogMediator STATUS = --------------------CheckXSLTEscapeCharacters Invoked After XSLT Transformation--------------------
[2017-04-21 19:20:07,442] DEBUG - wire << "HTTP/1.1 200 OK[\r][\n]"
[2017-04-21 19:20:07,444] DEBUG - wire << "Host: ajanthan-ThinkPad-T440p:8280[\r][\n]"
[2017-04-21 19:20:07,444] DEBUG - wire << "SOAPAction: "urn:echoString"[\r][\n]"
[2017-04-21 19:20:07,444] DEBUG - wire << "Accept-Encoding: gzip,deflate[\r][\n]"
[2017-04-21 19:20:07,444] DEBUG - wire << "Content-Type: text/xml;charset=UTF-8; charset=UTF-8[\r][\n]"
[2017-04-21 19:20:07,444] DEBUG - wire << "Date: Fri, 21 Apr 2017 13:50:07 GMT[\r][\n]"
[2017-04-21 19:20:07,445] DEBUG - wire << "Transfer-Encoding: chunked[\r][\n]"
[2017-04-21 19:20:07,445] DEBUG - wire << "Connection: Keep-Alive[\r][\n]"
[2017-04-21 19:20:07,445] DEBUG - wire << "[\r][\n]"
[2017-04-21 19:20:07,445] DEBUG - wire << "147[\r][\n]"
[2017-04-21 19:20:07,446] DEBUG - wire << "<?xml version='1.0' encoding='UTF-8'?><soapenv:Envelope xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/" xmlns:echo="http://echo.services.core.carbon.wso2.org">[\n]"
[2017-04-21 19:20:07,446] DEBUG - wire << " <soapenv:Header/>[\n]"
[2017-04-21 19:20:07,447] DEBUG - wire << " <soapenv:Body>[\n]"
[2017-04-21 19:20:07,447] DEBUG - wire << " <Description>Test for escaping <test>ajan</test> characters</Description>[\n]"
[2017-04-21 19:20:07,448] DEBUG - wire << " </soapenv:Body>[\n]"
[2017-04-21 19:20:07,448] DEBUG - wire << "</soapenv:Envelope>[\r][\n]"
[2017-04-21 19:20:07,448] DEBUG - wire << "0[\r][\n]"
[2017-04-21 19:20:07,448] DEBUG - wire << "[\r][\n]"
This means that a lone &lt; combined with disable-output-escaping="yes" causes the problem, because a single literal < cannot appear in XML content.