11.7 Reading XML documents
The XML markup language is widely used to represent the state of objects in a serialized form. For details, see http://www.w3.org/TR/REC-xml/. Python includes a number of libraries for parsing XML documents.
XML is called a markup language because the content of interest is marked with tags that define the structure of the data. Each element is written with a start <tag> and an end </tag>. The overall file text includes both the content and the XML markup.
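As a minimal sketch of how tags give structure to content, the following uses the standard library's xml.etree.ElementTree module to parse a small document. The document text and the names team, name, position, and leg are hypothetical, invented only for illustration; they are not the recipe's example data.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document: the tag and attribute names are
# assumptions made for this illustration only.
SAMPLE = '<team><name>Example</name><position leg="1">3</position></team>'

# Parse the text into a tree of Element objects.
root = ET.fromstring(SAMPLE)

# Each start/end tag pair becomes an Element; the text between the
# tags is available through the element's text attribute.
name_text = root.find("name").text

# Attributes written inside a start tag are exposed as a mapping.
position = root.find("position")
leg = position.attrib["leg"]

print(root.tag, name_text, leg, position.text)
```

The parser separates the markup from the content: the tags become Element objects, while the text and attribute values are returned as ordinary Python strings.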
Because the markup is intermingled with the text, there are some additional syntax rules that must be used to distinguish markup from text. A document must use &lt; instead of <, &gt; instead of >, and &amp; instead of & in text. Additionally, &quot; is used to embed a " character in an attribute value. For the most part, XML parsers will handle this transformation when consuming XML.
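The escaping rules can be seen in a short round trip. This is a sketch that uses xml.sax.saxutils.escape() from the standard library to produce the escaped text and xml.etree.ElementTree to parse it back; the note element, its label attribute, and the sample strings are hypothetical.

```python
from xml.sax.saxutils import escape
import xml.etree.ElementTree as ET

# Extra mapping so that double quotes are escaped for attribute values.
# By default, escape() handles only &, <, and >.
QUOTE = {'"': "&quot;"}

# Hypothetical raw values containing characters that collide with markup.
raw_text = "R&D: 3 < 4 and 5 > 4"
raw_attr = 'the "fast" boat'

text_xml = escape(raw_text)         # for element content
attr_xml = escape(raw_attr, QUOTE)  # for a double-quoted attribute value

# Build a document by hand; <note> and label are illustrative names.
document = f'<note label="{attr_xml}">{text_xml}</note>'

# The parser reverses the escaping when it consumes the document.
note = ET.fromstring(document)

print(document)
print(note.text)
print(note.get("label"))
```

Parsing restores the original strings, which is why application code rarely needs to deal with the entity references directly when reading XML.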
The example document, then, will have items as follows...