Validating well-formed XML files
PDI offers several options for validating XML documents, including checking that a document is well-formed. The structure of an XML document is formed by tags that begin with the character < and end with the character >. In an XML document, you can find start-tags: <exampletag>, end-tags: </exampletag>, or empty-element tags: <exampletag/>, and these tags can be nested. An XML document is called well-formed when it follows these rules:
It must contain at least one element
It must contain a unique root element, that is, a single opening and closing tag pair enclosing the whole document
Tags are case-sensitive, so a start-tag and its end-tag must match exactly
All of the tags must be nested properly, without overlapping
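The rules above can be illustrated outside PDI with a short Python sketch. This is not how PDI performs the check; it only shows, under the assumption that a standard XML parser rejects any input that is not well-formed, what each rule means in practice. The function name is_well_formed and the sample strings are hypothetical, chosen for illustration.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text: str) -> bool:
    """Return True if xml_text parses as well-formed XML, False otherwise."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError:
        # The parser raises ParseError for any well-formedness violation.
        return False
    return True


# Hypothetical samples, one per rule listed above.
samples = {
    "single root with nested tag": "<museums><museum/></museums>",
    "no element at all": "",
    "two root elements": "<museum/><museum/>",
    "case mismatch in tags": "<museum></Museum>",
    "overlapping tags": "<museum><name></museum></name>",
}

for label, text in samples.items():
    print(f"{label}: {is_well_formed(text)}")
```

Only the first sample passes; each of the others breaks exactly one of the rules. Note that well-formedness says nothing about whether the document matches a particular schema or DTD.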
In this recipe, you will learn to validate whether a document is well-formed, which is the simplest kind of XML validation. Assume that you want to extract data from several XML documents containing museum information, but only want to process those files that...