Origo Services Ltd

Experimental XForms XML Instance Validator




Disclaimer

Origo Services Limited believes it has employed personnel using reasonable skill and care in the creation of this document. However, the document is provided to the reader 'as is' without any warranty (express or implied) as to accuracy or completeness and Origo Services Limited cannot be held liable for any errors or omissions in this document, nor for any losses, damages or expenses arising consequent to the use of this document by the reader.

What Does The Experimental XForms Instance Validator Do?

Simply put, the Validator is an XForm that validates an XML instance against the definition of a vertical industry standard XML vocabulary. The form is not intended for conventional data input, but as a reporting/debugging tool for use by anyone wishing to understand and develop to the standard XML vocabulary.

Below is a description of how you can try out the Validator, and a more detailed account of why the Validator was built and how it works.

How to play with the Form

Assuming that you have the Novell XForms plug-in for Internet Explorer 6 installed and working, you can load the form from here. Follow the instructions below to try the form out.

Below is a list of changes you can make to the XML instance populated by the form in order to trigger error conditions.

A future version of the form may well allow the user to delete items (other than repeating ones), but it will have to use XForms 1.1 to allow the form user to put them back again if they change their mind.

The Problem

The development of the Experimental XForms Instance Validator brings together a number of complementary ideas and technologies in an attempt to solve what amounts to a documentation problem.

Origo develops XML-based information exchange standards for the UK Life and Pensions Industry. It produces sets of W3C XML Schemas (WXS) to define document structures and element and attribute values. It also produces documentation to describe the meaning and correct usage of the schemas to help organisations exchange business-valid XML documents.

One document, the Message Implementation Guideline (MIG, an EDI term), acts as the key information source for standards implementers. It gives developers a clear description of what to implement and helps business analysts understand how sets of requirements have been translated into XML structures.

A typical MIG consists of:

In recent months Origo has been working to improve the quality of MIGs. For example, the descriptions of XML structures are now generated directly from annotated WXS. In consequence, if a schema is right then its MIG is too. Conversely, any errors in a MIG must be corrected by fixing the underlying schema.

Dependency Rules

One area where there is still considerable room for improvement is in the definition of dependency rules. Rules express relationships between XML elements and attributes that are either impossible or inconvenient to implement in WXS. Rules are recorded as prose. For example:

The following XML:

<product type="Text" sub_type="Text" product_code="Text"/>

has a relationship with the following child element:

    <investment_strategy>
           <main_commission commission_entitlement_id="Text"/>
    </investment_strategy>

which was written as follows:

main_commission is required if product/@sub_type not = "Traditional With Profit". Must not appear if product/@sub_type = "Traditional With Profit".

This rule is fairly typical in that the relationship between two XML nodes is determined by the value of one or other of the nodes. It is expressed in a mixture of prose and something approaching pseudo-code. This rule is unambiguous, but many others are not. For example, some rules state only when an element is required, but do not state whether the element is optional or must be absent in other circumstances. As the rules are mostly prose, they cannot be checked quickly for consistency and completeness. Clearly dependency rules offer scope for further improvement to MIGs.

Schematron

Fortunately there exists an XML schema language designed to express just the kinds of relationships that appear in Origo MIGs. Schematron was developed by Rick Jelliffe and has latterly been adopted by ISO. A Schematron schema is at heart a very simple, but powerful thing; it uses one set of XPath expressions to locate XML nodes that are of interest and another set to test their validity.

To take our example, we might define the context we want to validate as:

product[@sub_type != 'Traditional With Profit']

while the simple test, which returns either true or false, is just:

main_commission

The full Schematron expression, with the test written as a path relative to the context node, is:

<rule context="product[@sub_type != 'Traditional With Profit']">
  <assert test="investment_strategy/investment_contribution/main_commission">
   main_commission is required if product/@sub_type not = "Traditional With Profit".
  </assert>
</rule>

If the XPath content of @test on the assert element evaluates to true then the node identified by @context is valid.

Note that Schematron provides a very convenient mechanism for associating our original text with its XPath definition. A Schematron processor will present the text of the assert to a user if the expression in the test evaluates to false.

To complete the dependency note we can use another rule:

<rule context="product[@sub_type = 'Traditional With Profit']">
   <report test="investment_strategy/investment_contribution/main_commission">
     main_commission must not appear if product/@sub_type = "Traditional With Profit".
   </report>
</rule>

This time we have used the report element, rather than assert. With the report element the user is presented with the text value when @test evaluates to true.
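The difference between assert and report can be illustrated with a small hand-rolled check. This is a minimal sketch in Python using only the standard library, not a Schematron processor; the function names, the sample instances and the sub_type value "Unit Linked" are assumptions made for the illustration.

```python
import xml.etree.ElementTree as ET

# Path used as the @test of both rules, relative to the product context node.
MAIN_COMMISSION = "investment_strategy/investment_contribution/main_commission"
TWP = "Traditional With Profit"

REQUIRED = 'main_commission is required if product/@sub_type not = "Traditional With Profit".'
FORBIDDEN = 'main_commission must not appear if product/@sub_type = "Traditional With Profit".'


def check_product(product):
    """Return the messages the two rules would produce for one product element."""
    sub_type = product.get("sub_type")
    if sub_type is None:
        # In XPath a comparison against a missing attribute is false for
        # both = and !=, so neither rule context matches.
        return []
    present = product.find(MAIN_COMMISSION) is not None
    messages = []
    if sub_type != TWP and not present:
        # assert: the text is shown when the test evaluates to false.
        messages.append(REQUIRED)
    if sub_type == TWP and present:
        # report: the text is shown when the test evaluates to true.
        messages.append(FORBIDDEN)
    return messages


def sample_product(sub_type, with_commission):
    """Build a hypothetical sample instance (not taken from a real Origo schema)."""
    commission = '<main_commission commission_entitlement_id="c1"/>' if with_commission else ""
    return ET.fromstring(
        f'<product sub_type="{sub_type}"><investment_strategy><investment_contribution>'
        f"{commission}</investment_contribution></investment_strategy></product>"
    )


for sub_type, with_commission in [("Unit Linked", False), (TWP, True), (TWP, False)]:
    print(sub_type, with_commission, check_product(sample_product(sub_type, with_commission)))
```

The two branches mirror the two rules: the first fires on the absence of main_commission, the second on its presence, and a product that satisfies its rule yields no messages.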

Schematron in Use

Schematron is fairly simple to get to grips with for anyone familiar with XPath (XSLT users for example). Origo generated the rules for around 90 dependency notes in approximately 3 man days, and that included time for learning the basics of Schematron, and understanding the dependency rules in the chosen MIG.

The act of producing the Schematron schema proved useful simply as an exercise in validating the dependency rules and the existing WXS schema.

XForms

Having built our Schematron schema we could have just shipped it as part of the documentation set for the particular standard. There are enough readily available tools with Schematron support to make the schema useful. However, we had been thinking for some time about producing XForms representations of Origo XML documents as part of our documentation sets. An XForm combines an XML instance with its associated WXS schema and a user interface to produce an electronic form with built-in validation. The idea is to use XForms as interactive documentation that not only explains how a standard should be used, but also validates attempts to do so. The form could also be used to generate and populate sample documents.

Clearly such a form could only be considered complete if it implemented Origo dependency rules. Fortunately XForms has a mechanism very similar to Schematron for defining XPath based constraints, so the hope was that the Schematron schema would form the basis of XForms constraints. This would have the benefit of combining the WXS defined grammar with the Schematron defined constraints in one executable package with a built in user interface.

The IBM XML Form Generator

To generate such forms by hand is likely to be very time-consuming. However, IBM has released an Eclipse plug-in that automatically generates an XForm from a sample XML instance and WXS Schema. Furthermore the plug-in allows you to pass its own output stream (the generated form) to another plug-in to allow further work to be done. We built another Eclipse plug-in that allows us to chain arbitrary XSLT transforms onto the output of the IBM plug-in in order to insert the dependency rules and any other enhancements we might require.

Using Eclipse as the UI to the generation tools is a considerable convenience. It makes them readily accessible to end users without requiring too much effort. Furthermore, Novell is working on an Eclipse plug-in XForms processor, so ultimately Eclipse will be the only tool a user needs to generate and view forms. For the time being Origo is targeting the Novell plug-in for Microsoft Internet Explorer, though the forms should work with other XForms processors.

However, it should be borne in mind that the IBM tool (and Origo's own) are experimental. What you see is very much what you get, and what you get is functional, but not much else. The chief virtue of the IBM tool (beyond the fact that it is free to use) is that it produces a working form of predictable structure that is a good input to an XSLT transform.

How the Form Works

The Eclipse-generated form contains all of the elements and attributes from the source XML instance, validated against the appropriate WXS schema. Enumerations are represented correctly and repeating structures repeat as they should. Origo's XSLT attaches a user-specified Schematron schema to the form. It adds a new piece of user interface that is bound to the validation conditions contained in the Schematron schema. A set of XForms binds derived from the Schematron schema is used to ensure that only the text for rules that have failed is presented to the user.

The Schematron rule examined above appears thus when used by the XForm:

<rule context="product[@sub_type = 'Traditional With Profit']" id="investment_strategy2">
  <report id="main_commission" flag="false" test="investment_strategy/investment_contribution/main_commission">
   main_commission must not appear if product/@sub_type = "Traditional With Profit".
  </report>
</rule>

Things to note:

The flag attribute takes a boolean value and is set to true when an assert or report condition fails. Only asserts and reports with a @flag set to true are presented to the form user.

The schematron rule appears in the XForm model like this:

<xforms:bind 
   nodeset="instance('instAssertions')/sch:rule[@id = 'investment_strategy2']/sch:assert[@id = 'main_commission']/@flag"
   calculate="if(instance('instance_model_m_content')/application/product[@sub_type != 'Traditional With Profit Bond']
   [not(investment_strategy/investment_contribution/main_commission)], 'true', 'false')"/>

Things to note are:

So the XForms bind toggles the value of the flag attribute depending on whether the XPath condition originally defined in the Schematron schema returns true or false.
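The shape of the bind suggests a mechanical mapping from a Schematron rule to an XForms calculate expression: the rule context is appended to the instance path, and the test is added as a predicate, negated for an assert and unchanged for a report. The sketch below shows that mapping in Python. Origo's actual transform is written in XSLT and is not reproduced here; the function names and the way the pieces are passed in are assumptions made for the illustration.

```python
def flag_nodeset(rule_id, kind, item_id):
    """XPath selecting the @flag attribute of one assert or report."""
    if kind not in ("assert", "report"):
        raise ValueError(f"unknown Schematron element: {kind}")
    return (
        f"instance('instAssertions')/sch:rule[@id = '{rule_id}']"
        f"/sch:{kind}[@id = '{item_id}']/@flag"
    )


def calculate_expression(instance_path, context, test, kind):
    """XPath that yields 'true' when the message should be shown to the user."""
    if kind == "assert":
        # An assert fails when its test is false for a matching context node.
        failing = f"[not({test})]"
    elif kind == "report":
        # A report fires when its test is true for a matching context node.
        failing = f"[{test}]"
    else:
        raise ValueError(f"unknown Schematron element: {kind}")
    return f"if({instance_path}/{context}{failing}, 'true', 'false')"


test_path = "investment_strategy/investment_contribution/main_commission"
print(flag_nodeset("investment_strategy2", "report", "main_commission"))
print(
    calculate_expression(
        "instance('instance_model_m_content')/application",
        "product[@sub_type = 'Traditional With Profit']",
        test_path,
        "report",
    )
)
```

Keeping the context and the test as separate predicates means the Schematron XPath can be copied across unchanged, which is what makes the rules a convenient input to an automated transform.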

The corresponding bit of user interface looks like this:

<xforms:group
 ref="sch:rule[@id = 'investment_strategy2']/*[@id = 'main_commission'][@flag = 'true']">
   <p>
    <xforms:output value="."/>
   </p>
</xforms:group>

So the XForms group is bound dynamically to any element in the Schematron schema that matches our combination of rule and report with a flag attribute that has a value of "true". The binding ensures that only relevant validation information is presented to the user.
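The effect of the group binding can be imitated outside a form by selecting every assert or report whose flag is "true". The following is a rough Python equivalent using the standard library. The sample assertions instance, the rule id "investment_strategy1", the function name and the use of the ISO Schematron namespace are assumptions; the real form may declare the sch prefix differently.

```python
import xml.etree.ElementTree as ET

# Assumption: the ISO Schematron namespace is bound to the sch prefix.
NS = {"sch": "http://purl.oclc.org/dsdl/schematron"}

# Hypothetical assertions instance after the binds have set the flags.
ASSERTIONS = """
<sch:schema xmlns:sch="http://purl.oclc.org/dsdl/schematron">
  <sch:pattern>
    <sch:rule context="product[@sub_type != 'Traditional With Profit']" id="investment_strategy1">
      <sch:assert id="main_commission" flag="false"
          test="investment_strategy/investment_contribution/main_commission">
        main_commission is required if product/@sub_type not = "Traditional With Profit".
      </sch:assert>
    </sch:rule>
    <sch:rule context="product[@sub_type = 'Traditional With Profit']" id="investment_strategy2">
      <sch:report id="main_commission" flag="true"
          test="investment_strategy/investment_contribution/main_commission">
        main_commission must not appear if product/@sub_type = "Traditional With Profit".
      </sch:report>
    </sch:rule>
  </sch:pattern>
</sch:schema>
"""


def visible_messages(schema_root):
    """Return the text of every assert or report whose flag is 'true'."""
    flagged = schema_root.findall(".//sch:rule/*[@flag='true']", NS)
    # Collapse the whitespace introduced by indentation in the schema.
    return [" ".join((element.text or "").split()) for element in flagged]


root = ET.fromstring(ASSERTIONS)
for message in visible_messages(root):
    print(message)
```

As in the form, the wildcard step means the same selection covers both assert and report elements, so the user interface does not need to know which kind of check produced a message.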

So this is what the form actually looks like when the dependency rule fails:

Internet Explorer rendering the XForm. On the left is the validation report.

The validation error at the bottom of the form reports that main_commission is required, and the form on the right is styled to indicate that the attribute commission_entitlement_id (a child of main_commission) is missing.

It is worth noting that rudimentary navigation has been added to the generated form, splitting the form into three sections. Simple navigation from a validation error report to the correct part of the form is provided. At the moment this only takes you to the correct form section. Both of these features will be enhanced in the future.

Limitations

alert

The current version of the form is not terribly sophisticated, though it is just about usable. However one of the main problems with the form is that it makes no use of XForms' own mechanism for providing users with validation error information (the alert element). alert appears as a child of any form control, the idea being that human-readable information about why a value is invalid is presented in the context of the form control used to remedy the problem.

This form presents a complete report of all failed dependency rules, and so alert was not used. However, future versions of the form will make use of alert to provide information inline too. The text of an alert can be determined dynamically by binding it to an XPath expression, so it can reflect the particular cause of invalidity. However, there can be a number of errors to report simultaneously for the same node, and it is not yet clear how more than one validation error at a time can be reported using alert.

The other problem is that the "home grown" reporting of failed dependency rules suddenly looks a lot better than the reporting of WXS validation errors. An XForm can be styled to indicate invalidity, but alert messages cannot be made sensitive to the exact cause(s) of WXS invalidity. This is especially so if an instance is structurally invalid. The answer might be to generate duplicate definitions of choice, sequence, etc as Schematron rules and leave WXS validation to worry about simple types.

Conclusions

The combination of Schematron and XForms to provide interactive documentation of XML dependency rules looks promising. This method becomes viable to implement and deploy thanks to the combination of Eclipse, the IBM XML Form Generator, some XSLT and free-to-use XForms processors such as those released by Novell and x-port.net, though no doubt a commercial vendor could come up with something more sophisticated. Origo intends to make incremental improvements to the embryonic functionality.

The experience of building this form has demonstrated the power of combining grammar and rule based schema languages in one executable package. On reflection it would save a lot of work if XForms was able to understand Schematron in the way that it understands WXS. Most of the plumbing is already there as XForms already has support for XPath. However, XForms lacks support for XSLT XPath functions, so any Schematron schema making use of these is not translatable into XForms. Similarly Schematron has support for variables, which XForms lacks. Even so, XForms covers a large enough subset for it to be worth building a formal mapping between the two to see exactly what mismatches exist.

An alternative approach would be to dump Schematron altogether and just use constraints defined directly in an XForms model. However, Schematron tends to be easier to author and to understand, and is more powerful. For example, unlike XForms, Schematron does not stop an author from defining as many separate constraints for a given context as they like. Clearly XForms could be made more Schematron-like, but it would be quicker, and more in keeping with the principle already established by the XForms Working Group of re-using existing technologies where possible, to make XForms just enough like Schematron that it is able to support the language natively.

Should you require any assistance please use the Contact Us option on the top menu or phone Origo on +44131 451 5181.