Validating messages in XI using XML Schema

Introduction

XML Schema is a World Wide Web Consortium standard for describing the structure and content of XML documents. A document is said to be valid according to an XML Schema if it adheres to the rules specified in that schema. XML Schema has been a core component of SAP Exchange Infrastructure since version 3.0, but as of SP16 it's still not possible to have the Integration Server validate messages against an XML Schema. In other words, even if an interface is based on a message type, which is ultimately defined by a schema, we cannot know whether messages on the interface are valid according to that schema. This blog entry discusses one approach to performing validation in XI. Full source code is not provided; implementing the suggested solution is left as an exercise for the reader.

What are the reasons for wanting to validate messages passing through XI, anyway? For one, mapping programs necessarily make certain assumptions about the structure of the message being processed, so they can fail if the message is invalid. Failing an invalid message early helps pinpoint the exact cause of the problem and gives the developer better debugging information. Another prime reason for performing validation is to ensure the data quality of messages that go out to external partners, suppliers and customers.

In XI "userland" (i.e. outside the Integration Engine core), XML Schema validation can be implemented in a Java mapping program or in an adapter module. Code executing in either has access to the contents of the message being processed and to a validating XML parser. ABAP mapping programs are not an option, as the ABAP XML parser currently does not support XML Schema. The solution discussed in this blog entry is a Java mapping program, but the techniques described can be applied equally well to an adapter module.

A brief introduction to Java mapping programs

Unlike adapter modules, which are Enterprise JavaBeans, Java mapping programs are garden-variety J2SE classes. They do not require deployment in the Web Application Server; instead, they must be packaged in ZIP or JAR archives and uploaded to the Integration Repository under Imported Archives. Every mapping program class must implement the StreamTransformation interface provided by SAP, which is located in the aii_map_api.jar file. The interface contains only two methods: execute and setParameter.

The setParameter method is called by the mapping runtime, which passes a java.util.Map object containing runtime information such as the message ID, sender and receiver of the message being processed. The keys used for looking up values in this Map object are constants in the StreamTransformationConstants class.

The execute method is where the interesting stuff happens. The method receives a java.io.InputStream object and a java.io.OutputStream object. The former provides access to the payload of the message being processed. The latter is used for writing the transformed message, which will either be passed on to the next mapping program or to the next XI pipeline step.
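
To make the stream handling a little more concrete, here is a minimal sketch of the simplest possible transformation: one that copies the payload to the output unchanged. The class and method names (PayloadBuffer, readFully, passThrough) are made up for illustration, and the SAP interface is deliberately left out so that the snippet compiles with the JDK alone. In a real mapping program the call to passThrough would sit inside the execute method.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;

/** Hypothetical helper; not part of the SAP mapping API. */
final class PayloadBuffer {

    /** Reads the whole stream into memory so the payload can be used more than once. */
    static byte[] readFully(InputStream in) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            byte[] chunk = new byte[4096];
            int n;
            while ((n = in.read(chunk)) != -1) {
                buffer.write(chunk, 0, n);
            }
            return buffer.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Writes the payload to the output exactly as it was received. */
    static void passThrough(InputStream in, OutputStream out) {
        byte[] payload = readFully(in);
        try {
            out.write(payload);
            out.flush();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Buffering the payload in a byte array costs memory, but it is convenient whenever the message has to be read more than once, because an InputStream can generally be consumed only a single time.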

Finally, a word of caution: Since mapping programs are nothing more than ordinary Java classes, the developer has unrestricted access to the entire J2SE class library. However, code that reads or writes files, opens network connections, accesses databases etc. should never find its way into your mapping class; your code should only perform transformations on the message being processed.

Validating XML documents in Java

Performing schema validation in Java code is fairly straightforward. The first step is to obtain an object of class javax.xml.parsers.SAXParser, which the following piece of code takes care of for us (the appropriate import statements are not shown):

	
SAXParserFactory spf = SAXParserFactory.newInstance();
spf.setNamespaceAware(true);
spf.setValidating(true);
SAXParser sp = spf.newSAXParser();



The next step is to inform the parser that we want to validate using XML Schema and provide it with an InputStream to the schema we want to use:




sp.setProperty("http://java.sun.com/xml/jaxp/properties/schemaLanguage",
        "http://www.w3.org/2001/XMLSchema");
sp.setProperty("http://java.sun.com/xml/jaxp/properties/schemaSource",
        mySchemaInputStream);



Now we're almost ready to start parsing an actual document, which we do by calling the method parse(InputStream is, DefaultHandler dh) on the SAXParser object. When recoverable errors occur during parsing (validation errors are considered recoverable), the parser calls the error method on the DefaultHandler object. However, the implementation of the error method in the DefaultHandler class does nothing. When validation errors occur, we want to throw an exception, so instead of using the DefaultHandler class directly, we subclass it as follows:




final class ParseErrorHandler extends DefaultHandler {
    public void error(SAXParseException e) throws SAXException {
        throw e;
    }
}



If myDocument is an InputStream to an XML document, we're now able to parse and validate the document by calling the parse method like so:




sp.parse(myDocument, new ParseErrorHandler());
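
For readers who want to try the steps above outside XI, here is a minimal sketch that combines them into one class, with the import statements included. The names SchemaValidator, validate and isValid are hypothetical, and isValid works on in-memory strings purely so that the example can be run without any files; it is not meant as a finished implementation.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

/** Hypothetical helper that bundles the validation steps into one call. */
final class SchemaValidator {

    private static final String SCHEMA_LANGUAGE =
            "http://java.sun.com/xml/jaxp/properties/schemaLanguage";
    private static final String SCHEMA_SOURCE =
            "http://java.sun.com/xml/jaxp/properties/schemaSource";
    private static final String W3C_XML_SCHEMA =
            "http://www.w3.org/2001/XMLSchema";

    /** Rethrows validation errors, which the parser otherwise treats as recoverable. */
    private static final class ParseErrorHandler extends DefaultHandler {
        @Override
        public void error(SAXParseException e) throws SAXException {
            throw e;
        }
    }

    /** Parses the document against the schema; throws a SAXException if it is invalid. */
    static void validate(InputStream schema, InputStream document)
            throws ParserConfigurationException, SAXException, IOException {
        SAXParserFactory spf = SAXParserFactory.newInstance();
        spf.setNamespaceAware(true);
        spf.setValidating(true);
        SAXParser sp = spf.newSAXParser();
        // The schema language must be set before the schema source.
        sp.setProperty(SCHEMA_LANGUAGE, W3C_XML_SCHEMA);
        sp.setProperty(SCHEMA_SOURCE, schema);
        sp.parse(document, new ParseErrorHandler());
    }

    /** Demonstration wrapper: false if the parser reports a validation or well-formedness error. */
    static boolean isValid(String schemaXml, String documentXml) {
        try {
            validate(new ByteArrayInputStream(schemaXml.getBytes(StandardCharsets.UTF_8)),
                     new ByteArrayInputStream(documentXml.getBytes(StandardCharsets.UTF_8)));
            return true;
        } catch (SAXParseException e) {
            return false;
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("Validation could not be performed", e);
        }
    }
}
```

With a schema that declares a single element quantity of type xs:int, isValid should accept a document such as `<quantity>3</quantity>` and reject `<quantity>three</quantity>`.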



The proposed solution


The solution I'm proposing is implemented in a Java mapping program. When the class's execute method is called by the runtime, the mapping program validates the message being processed. If the message is valid, it is passed on unchanged to the next mapping program or pipeline step. If the message is invalid, an exception is thrown (specifically a StreamTransformationException with the original SAXException as its cause) and processing of the message stops.

This approach solves one problem but creates two new ones: how does the mapping program know which schema to validate a given message against, and how does it access that particular schema? At the moment there's no published API for accessing an interface's underlying schema definition, so for now we have to provide the schemas ourselves. My suggested solution is to package the schemas in a JAR archive with the mapping program class and add the XML document SchemaRepository.xml:




<?xml version="1.0" encoding="UTF-8"?>

<availableSchemas>
<schema name="schema1.xsd">
<interface name="Interface1">
<namespace>http://your.domain/some/namespace</namespace>
</interface>
</schema>
<schema name="schema2.xsd">
<interface name="Interface2">
<namespace>http://your.domain/some/other/namespace</namespace>
</interface>
<interface name="Interface3">
<namespace>http://your.domain/yet/another/namespace</namespace>
</interface>
</schema>
</availableSchemas>



When the mapping program processes a message, it uses the interface and namespace of the message to look up a corresponding schema name in this XML document. If no schema is available, no validation is performed. If a schema is found, we need an InputStream to the schema file in order to perform the validation. The following code obtains this stream (schema is a variable of type InputStream, o is a reference to any object loaded from the JAR archive and schemaName is a String containing the name of the schema we want to load from the JAR):




schema = o.getClass().getClassLoader().getResourceAsStream(schemaName);
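
The lookup in SchemaRepository.xml itself could be done in several ways. The following is one possible sketch using the DOM parser that ships with the JDK; the class name SchemaRepository and the method findSchemaName are my own inventions for illustration, and the code assumes the document layout shown earlier (schema elements containing interface elements, each with a single namespace child).

```java
import java.io.InputStream;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

/** Hypothetical lookup helper for the SchemaRepository.xml layout. */
final class SchemaRepository {

    /**
     * Returns the name of the first schema that lists the given interface
     * and namespace, or null if no schema is registered for that pair.
     */
    static String findSchemaName(InputStream repository, String interfaceName, String namespace) {
        try {
            DocumentBuilder db = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = db.parse(repository);
            NodeList schemas = doc.getElementsByTagName("schema");
            for (int i = 0; i < schemas.getLength(); i++) {
                Element schema = (Element) schemas.item(i);
                NodeList interfaces = schema.getElementsByTagName("interface");
                for (int j = 0; j < interfaces.getLength(); j++) {
                    Element iface = (Element) interfaces.item(j);
                    NodeList ns = iface.getElementsByTagName("namespace");
                    if (ns.getLength() == 0) {
                        continue; // entry without a namespace child: ignore it
                    }
                    String registeredNamespace = ns.item(0).getTextContent().trim();
                    if (interfaceName.equals(iface.getAttribute("name"))
                            && namespace.equals(registeredNamespace)) {
                        return schema.getAttribute("name");
                    }
                }
            }
            return null; // nothing registered: the caller skips validation
        } catch (Exception e) {
            throw new IllegalStateException("Cannot read schema repository", e);
        }
    }
}
```

In the mapping program, the repository stream would presumably be obtained the same way as the schema stream, by calling getResourceAsStream with the name SchemaRepository.xml, and a null result would mean that the message is passed on without validation.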



Putting it all together


When you have a working Java mapping class, you need to create a JAR file containing the mapping program class file, the schemas and the SchemaRepository.xml document. This JAR file must then be uploaded to the Integration Repository under Imported Archives. At this point, the Java mapping program is available to Interface Mappings in every namespace below the Software Component Version under which the archive was uploaded.



The final step is to add the Java mapping program to each Interface Mapping that requires validation. To validate inbound messages (relative to the Integration Server), add the program as the first mapping step. To validate outbound messages (again, relative to the Integration Server), add it as the last mapping step.



Have fun!
