Parsing an XML document with the SAX API
SAX parsing is based on a push model: a SAX parser generates events, and a document handler receives notification of them. SAX parsing is generally faster than DOM parsing because it does not build an in-memory tree, but its scope is limited to generating parsing events; it has no provision for navigating nodes or retrieving them with XPath. In this section, we shall parse the example XML document with a SAX parser and output the events the parser generates. The SAX parsing application will be developed in JDeveloper as the Java application SAXParserApp.java. First, import the oracle.xml.jaxp package:
import oracle.xml.jaxp.*;
A SAX parsing application typically extends the DefaultHandler class, which provides event-notification methods for the parse events. Because the DefaultHandler class also implements the ErrorHandler interface, a DefaultHandler object may be used for error handling as well.
Creating the factory
Create a JXSAXParserFactory object with the static method newInstance(). The newInstance() method returns a SAXParserFactory object, which may be cast to JXSAXParserFactory because the JXSAXParserFactory class extends the SAXParserFactory class. Why cast, you might ask? We are using Oracle XDK 11g's implementation classes for various standard interfaces and abstract classes.
JXSAXParserFactory factory = (JXSAXParserFactory) JXSAXParserFactory.newInstance();
The factory object is used to obtain a SAX parser that may be used to parse an XML document.
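The cast to JXSAXParserFactory only succeeds when the Oracle XDK implementation is the one JAXP resolves at run time. As a rough sketch that assumes only the JDK's built-in JAXP implementation (no Oracle XDK on the classpath), the same step can be written against the standard abstract class. The names FactorySketch and newNamespaceAwareFactory are illustrative and not part of SAXParserApp.java.

```java
import javax.xml.parsers.SAXParserFactory;

public class FactorySketch {

    // Returns whichever SAXParserFactory implementation JAXP locates at run time.
    // Assumption: namespace awareness is switched on explicitly, because the
    // standard factory default is false.
    public static SAXParserFactory newNamespaceAwareFactory() {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true);
        return factory;
    }

    public static void main(String[] args) {
        SAXParserFactory factory = newNamespaceAwareFactory();
        // Printing the concrete class shows which provider is in use.
        System.out.println("Factory implementation: " + factory.getClass().getName());
        System.out.println("Namespace aware: " + factory.isNamespaceAware());
    }
}
```

Printing the implementation class is a quick way to check which provider is active before relying on a cast to a vendor-specific type.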
Parsing the XML document
Create a SAXParser object from the factory object with the newSAXParser() method. As the JXSAXParser class extends the SAXParser class, the returned SAXParser object may be cast to JXSAXParser:
JXSAXParser saxParser = (JXSAXParser) factory.newSAXParser();
Create an InputStream for the XML document to parse, and parse the XML document with one of the parse() methods in the SAXParser class. The parse() methods take an XML document in the form of an InputSource, InputStream, URI, or File, and an event handler such as a DefaultHandler.
InputStream input = new FileInputStream(new File("catalog.xml"));
saxParser.parse(input, this);
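The parsing step can also be tried without a catalog.xml file on disk. The following is a minimal sketch, assuming the JDK's standard JAXP parser rather than Oracle XDK; the class ParseSketch and the method countElements are hypothetical names introduced here. It feeds an in-memory document to parse(InputStream, DefaultHandler) and uses try-with-resources so that the stream is closed after parsing.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public class ParseSketch extends DefaultHandler {

    private int elementCount;

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        elementCount++; // one startElement event arrives per start tag
    }

    // Parses the given XML text and returns the number of elements seen.
    public static int countElements(String xml) {
        ParseSketch handler = new ParseSketch();
        byte[] bytes = xml.getBytes(StandardCharsets.UTF_8);
        try (InputStream input = new ByteArrayInputStream(bytes)) {
            SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
            parser.parse(input, handler);
        } catch (ParserConfigurationException | SAXException | IOException e) {
            // Wrapped only to keep the sketch short; real code would handle each case.
            throw new IllegalStateException("Parsing failed", e);
        }
        return handler.elementCount;
    }

    public static void main(String[] args) {
        String xml = "<catalog><journal><article/></journal></catalog>";
        System.out.println("Elements: " + countElements(xml));
    }
}
```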
The DefaultHandler class provides the parsing event notification methods and error handling methods. Event notification and error handling methods may be overridden in the SAX parsing application for application-specific events and error handling. Some of the event notification methods in the DefaultHandler class are listed in the following table:
Method Name | Description |
---|---|
startDocument() | Receive notification of the start of the document. |
endDocument() | Receive notification of the end of the document. |
startElement(String uri, String localName, String qName, Attributes atts) | Receive notification of the start of an element. The uri parameter specifies the namespace URI. |
endElement(String uri, String localName, String qName) | Receive notification of the end of an element. |
characters(char[] ch, int start, int length) | Receive notification of the character data (text). |
ignorableWhitespace(char[] ch, int start, int length) | Receive notification of the ignorable whitespace in an element. |
notationDecl(String name, String publicId, String systemId) | Receive notification of a notation declaration. |
processingInstruction(String target, String data) | Receive notification of a processing instruction. |
unparsedEntityDecl(String name, String publicId, String systemId, String notationName) | Receive notification of an unparsed entity declaration. Unparsed entities refer to non-XML data that a parser does not have to parse. |
skippedEntity(String name) | Receive notification of a skipped entity. Skipped entity notifications may be received when using a non-validating parser, which is not required to parse an external DTD and thus not required to resolve all entity references. Entity references that are not resolved are skipped entities. |
startPrefixMapping(String prefix, String uri) | Receive notification of the start of a namespace mapping. An example of a namespace mapping is xmlns:journal="http://xdk.com/catalog/journal". |
endPrefixMapping(String prefix) | Receive notification of the end of a namespace mapping. |
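To see a few of the less common notifications in action, the sketch below records processing instruction and namespace mapping events alongside element events. It assumes the JDK's standard JAXP parser with namespace awareness enabled; EventSketch and eventsFor are illustrative names, and the processing instruction target render is made up for the example.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public class EventSketch extends DefaultHandler {

    private final List<String> events = new ArrayList<>();

    @Override
    public void processingInstruction(String target, String data) {
        events.add("processingInstruction " + target + " " + data);
    }

    @Override
    public void startPrefixMapping(String prefix, String uri) {
        events.add("startPrefixMapping " + prefix + "=" + uri);
    }

    @Override
    public void endPrefixMapping(String prefix) {
        events.add("endPrefixMapping " + prefix);
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        events.add("startElement " + qName);
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        events.add("endElement " + qName);
    }

    // Parses the XML text and returns the recorded events in arrival order.
    public static List<String> eventsFor(String xml) {
        EventSketch handler = new EventSketch();
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true); // prefix mapping events need namespace processing
            factory.newSAXParser().parse(new InputSource(new StringReader(xml)), handler);
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("Parsing failed", e);
        }
        return handler.events;
    }

    public static void main(String[] args) {
        String xml = "<?render fast?>"
                + "<journal:journal xmlns:journal=\"http://xdk.com/catalog/journal\"/>";
        eventsFor(xml).forEach(System.out::println);
    }
}
```

The start of a prefix mapping is reported before the startElement event of the element that declares it, and the end of the mapping after the matching endElement event.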
In the SAXParserApp.java application, some of the notification methods are overridden to output the event type, element name, element attributes, and element text. For example, the attributes represented by the Attributes object in the startElement(java.lang.String uri, java.lang.String localName, java.lang.String qName, Attributes atts) method may be iterated to output the attribute name, namespace URI, and attribute value:
for (int i = 0; i < atts.getLength(); i++) {
    System.out.println("Attribute QName:" + atts.getQName(i));
    System.out.println("Attribute Local Name:" + atts.getLocalName(i));
    System.out.println("Attribute Namespace URI:" + atts.getURI(i));
    System.out.println("Attribute Value:" + atts.getValue(i));
}
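The same index-based iteration can collect attributes instead of printing them. This is a small sketch under the assumption that the JDK's standard JAXP parser is used and that the document has a single element; AttributeSketch and attributesOf are hypothetical names, not part of SAXParserApp.java.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public class AttributeSketch extends DefaultHandler {

    private final Map<String, String> attributes = new LinkedHashMap<>();

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        // Attributes is index-based; getLength() gives the number of attributes.
        for (int i = 0; i < atts.getLength(); i++) {
            attributes.put(atts.getQName(i), atts.getValue(i));
        }
    }

    // Returns the attributes seen while parsing, keyed by qualified name.
    public static Map<String, String> attributesOf(String xml) {
        AttributeSketch handler = new AttributeSketch();
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("Parsing failed", e);
        }
        return handler.attributes;
    }

    public static void main(String[] args) {
        String xml = "<journal title=\"Oracle Magazine\" publisher=\"Oracle Publishing\"/>";
        attributesOf(xml).forEach((name, value) -> System.out.println(name + " = " + value));
    }
}
```

When only one attribute is needed, atts.getValue(qName) looks it up directly and returns null if the attribute is absent.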
The error handler methods in DefaultHandler may also be overridden. In the SAXParserApp.java application, the error handler methods are overridden to output the error message. We shall demonstrate error handling in a SAX parsing application with an example. Error handler methods in the DefaultHandler class are listed in the following table:
Method Name | Description |
---|---|
error(SAXParseException e) | Receives notification of a recoverable error |
fatalError(SAXParseException e) | Receives notification of a non-recoverable error |
warning(SAXParseException e) | Receives notification of a warning |
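A validity violation is a typical recoverable error: the parser reports it through error() and carries on parsing. The sketch below illustrates this with a validating parser and an inline DTD. It assumes the JDK's standard JAXP implementation, and the DTD, ValidationErrorSketch, and validationErrors are all made up for the illustration.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

public class ValidationErrorSketch extends DefaultHandler {

    private final List<String> errors = new ArrayList<>();

    @Override
    public void error(SAXParseException e) {
        // Recoverable: record the message and let the parser continue.
        errors.add(e.getMessage());
    }

    // Parses with DTD validation enabled and returns the recoverable errors reported.
    public static List<String> validationErrors(String xml) {
        ValidationErrorSketch handler = new ValidationErrorSketch();
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setValidating(true); // validate against the DTD in the document
            factory.newSAXParser().parse(new InputSource(new StringReader(xml)), handler);
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("Parsing failed", e);
        }
        return handler.errors;
    }

    public static void main(String[] args) {
        String dtd = "<!DOCTYPE catalog ["
                + "<!ELEMENT catalog (journal)>"
                + "<!ELEMENT journal EMPTY>"
                + "]>";
        // The DTD requires one journal child, so an empty catalog is invalid.
        validationErrors(dtd + "<catalog/>").forEach(System.out::println);
    }
}
```

A document that is not well-formed is different: it is reported through fatalError(), after which the parser is not expected to continue.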
Running the Java application
The SAXParserApp.java application is listed as follows, with notes about the different sections of the Java class:
1. Add the package and import statements.
    package xmlparser;
    import oracle.xml.jaxp.*;
    import org.xml.sax.Attributes;
    import org.xml.sax.SAXException;
    import org.xml.sax.SAXParseException;
    import org.xml.sax.helpers.DefaultHandler;
    import java.io.*;
    import javax.xml.parsers.ParserConfigurationException;
2. Next, we add the Java class SAXParserApp.
    public class SAXParserApp extends DefaultHandler {
3. Next, we add a method parseXMLDocument to parse the XML document, catalog.xml.
    public void parseXMLDocument() {
        try {
            JXSAXParserFactory factory = (JXSAXParserFactory) JXSAXParserFactory.newInstance();
            JXSAXParser saxParser = (JXSAXParser) factory.newSAXParser();
            InputStream input = new FileInputStream(new File("catalog.xml"));
            saxParser.parse(input, this);
        } catch (ParserConfigurationException e) {
            System.err.println(e.getMessage());
        } catch (FileNotFoundException e) {
            System.err.println(e.getMessage());
        } catch (IOException e) {
            System.err.println(e.getMessage());
        } catch (SAXException e) {
            System.err.println(e.getMessage());
        }
    }
4. The startDocument event notification method notifies about the start of an XML document.
    public void startDocument() throws SAXException {
        System.out.println("SAX Event : Start Document");
    }
5. The endDocument event notification method notifies about the end of an XML document.
    public void endDocument() throws SAXException {
        System.out.println("SAX Event : End Document");
    }
6. The startElement event notification method notifies about the start of an element.
    public void startElement(java.lang.String namespaceURI, java.lang.String localName, java.lang.String qName, Attributes atts) throws SAXException {
        System.out.println("SAX Event : Start Element");
        System.out.println("Element QName:" + qName);
        System.out.println("Element Local Name:" + localName);
        System.out.println("Element Namespace URI:" + namespaceURI);
        for (int i = 0; i < atts.getLength(); i++) {
            System.out.println("Attribute QName:" + atts.getQName(i));
            System.out.println("Attribute Local Name:" + atts.getLocalName(i));
            System.out.println("Attribute Namespace URI:" + atts.getURI(i));
            System.out.println("Attribute Value:" + atts.getValue(i));
        }
    }
7. The event notification method endElement notifies about the end of an element.
    public void endElement(java.lang.String namespaceURI, java.lang.String localName, java.lang.String qName) throws SAXException {
        System.out.println("SAX Event : End Element");
        System.out.println("Element QName:" + qName);
    }
8. The event notification method characters notifies about character text.
    public void characters(char[] ch, int start, int length) throws SAXException {
        System.out.println("SAX Event : Text");
        String text = new String(ch, start, length).trim();
        if (text.length() > 0) {
            System.out.println("Text:" + text);
        }
    }
9. Here, the error handling method error handles recoverable errors.
    public void error(SAXParseException e) throws SAXException {
        System.err.println("Error:" + e.getMessage());
    }
10. The error handling method fatalError handles non-recoverable errors.
    public void fatalError(SAXParseException e) throws SAXException {
        System.err.println("Fatal Error:" + e.getMessage());
    }
11. The error handling method warning handles parser warnings.
    public void warning(SAXParseException e) throws SAXException {
        System.out.println("Warning:" + e.getMessage());
    }
12. Finally, add the main method. In the main method, create an instance of the SAXParserApp class and invoke the parseXMLDocument method.
    public static void main(String[] argv) {
        SAXParserApp saxParser = new SAXParserApp();
        saxParser.parseXMLDocument();
    }
    }
13. To run the SAXParserApp.java application in JDeveloper, right-click on the SAXParserApp.java node in Application Navigator, and select Run.
The output from the SAX parsing application lists the SAX events, elements, and attributes in the parsed XML document.
The complete output from the SAX parsing application is listed as follows:
SAX Event : Start Document
SAX Event : Start Element
Element QName:catalog
Element Local Name:catalog
Element Namespace URI:
SAX Event : Text
SAX Event : Start Element
Element QName:journal:journal
Element Local Name:journal
Element Namespace URI:http://xdk.com/catalog/journal
Attribute QName:journal:title
Attribute Local Name:title
Attribute Namespace URI:http://xdk.com/catalog/journal
Attribute Value:Oracle Magazine
Attribute QName:journal:publisher
Attribute Local Name:publisher
Attribute Namespace URI:http://xdk.com/catalog/journal
Attribute Value:Oracle Publishing
Attribute QName:journal:edition
Attribute Local Name:edition
Attribute Namespace URI:http://xdk.com/catalog/journal
Attribute Value:March-April 2008
Attribute QName:xmlns:journal
Attribute Local Name:journal
Attribute Namespace URI:http://www.w3.org/2000/xmlns/
Attribute Value:http://xdk.com/catalog/journal
SAX Event : Text
SAX Event : Start Element
Element QName:journal:article
Element Local Name:article
Element Namespace URI:http://xdk.com/catalog/journal
Attribute QName:journal:section
Attribute Local Name:section
Attribute Namespace URI:http://xdk.com/catalog/journal
Attribute Value:Oracle Developer
SAX Event : Text
SAX Event : Start Element
Element QName:journal:title
Element Local Name:title
Element Namespace URI:http://xdk.com/catalog/journal
SAX Event : Text
Text:Declarative Data Filtering
SAX Event : End Element
Element QName:journal:title
SAX Event : Text
SAX Event : Start Element
Element QName:journal:author
Element Local Name:author
Element Namespace URI:http://xdk.com/catalog/journal
SAX Event : Text
Text:Steve Muench
SAX Event : End Element
Element QName:journal:author
SAX Event : Text
SAX Event : End Element
Element QName:journal:article
SAX Event : Text
SAX Event : End Element
Element QName:journal:journal
SAX Event : Text
SAX Event : Start Element
Element QName:journal
Element Local Name:journal
Element Namespace URI:
Attribute QName:title
Attribute Local Name:title
Attribute Namespace URI:
Attribute Value:Oracle Magazine
Attribute QName:publisher
Attribute Local Name:publisher
Attribute Namespace URI:
Attribute Value:Oracle Publishing
Attribute QName:edition
Attribute Local Name:edition
Attribute Namespace URI:
Attribute Value:September-October 2008
SAX Event : Text
SAX Event : Start Element
Element QName:article
Element Local Name:article
Element Namespace URI:
Attribute QName:section
Attribute Local Name:section
Attribute Namespace URI:
Attribute Value:FEATURES
SAX Event : Text
SAX Event : Start Element
Element QName:title
Element Local Name:title
Element Namespace URI:
SAX Event : Text
Text:Share 2.0
SAX Event : End Element
Element QName:title
SAX Event : Text
SAX Event : Start Element
Element QName:author
Element Local Name:author
Element Namespace URI:
SAX Event : Text
Text:Alan Joch
SAX Event : End Element
Element QName:author
SAX Event : Text
SAX Event : End Element
Element QName:article
SAX Event : Text
SAX Event : End Element
Element QName:journal
SAX Event : Text
SAX Event : End Element
Element QName:catalog
SAX Event : End Document
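Many of the SAX Event : Text lines in the output have no Text: line after them, because whitespace between elements is also delivered through characters(). A SAX parser is also permitted to deliver one run of text in several characters() calls. One common way to handle both, sketched here under the assumption of the JDK's standard JAXP parser, is to buffer text and emit it at endElement; TextSketch and textsOf are illustrative names, not part of SAXParserApp.java.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public class TextSketch extends DefaultHandler {

    private final StringBuilder buffer = new StringBuilder();
    private final List<String> texts = new ArrayList<>();

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        buffer.setLength(0); // discard whitespace seen before this start tag
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        buffer.append(ch, start, length); // may be called several times per text run
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        String text = buffer.toString().trim();
        if (!text.isEmpty()) {
            texts.add(qName + "=" + text);
        }
        buffer.setLength(0);
    }

    // Returns "element=text" entries for elements that directly contain text.
    public static List<String> textsOf(String xml) {
        TextSketch handler = new TextSketch();
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("Parsing failed", e);
        }
        return handler.texts;
    }

    public static void main(String[] args) {
        String xml = "<article>\n"
                + "  <title>Declarative Data Filtering</title>\n"
                + "  <author>Steve Muench</author>\n"
                + "</article>";
        textsOf(xml).forEach(System.out::println);
    }
}
```

This simple buffer suits leaf elements like title and author; mixed content, where text and child elements are interleaved, would need a stack of buffers instead.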
To demonstrate error handling, introduce an error into the example XML document. For example, remove a </journal> end tag, as we did earlier. Run the SAXParserApp.java application in JDeveloper. An error message is output:
Fatal Error:<Line 15, Column 10>: XML-20121: (Fatal Error) End tag does not match start tag 'journal'.
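The line and column in a message like this are also available programmatically from the SAXParseException. The following sketch assumes the JDK's standard JAXP parser, so the message wording and error code differ from the Oracle XDK output above. LocationSketch and fatalErrorLine are hypothetical names.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

public class LocationSketch {

    // Returns the line number of the first fatal error, or -1 if the document parses cleanly.
    public static int fatalErrorLine(String xml) {
        try {
            // The default DefaultHandler.fatalError() rethrows the exception it is given.
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), new DefaultHandler());
            return -1;
        } catch (SAXParseException e) {
            System.err.println("Fatal error at line " + e.getLineNumber()
                    + ", column " + e.getColumnNumber() + ": " + e.getMessage());
            return e.getLineNumber();
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("Parsing failed", e);
        }
    }

    public static void main(String[] args) {
        // The journal element is never closed, so the end tag on line 3 does not match.
        String xml = "<catalog>\n<journal>\n</catalog>";
        System.out.println("First fatal error on line: " + fatalErrorLine(xml));
    }
}
```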