Reading XML documents
The XML markup language is widely used to represent the state of objects in a serialized form. For details, see http://www.w3.org/TR/REC-xml/. Python includes a number of libraries for parsing XML documents.
XML is called a markup language because the content of interest is marked with tags, and also written with a start <tag>
and an end </tag>
to clarify the structure of the data. The overall file text includes both the content and the XML markup.
Because the markup is intermingled with the text, there are some additional syntax rules that must be used to distinguish markup from text. In order to include the <
character in our data, we must use XML character entity references. We must use <
to include <
in our text. Similarly, >
must be used instead of >
, &
, which is used instead of &
. Additionally, "
is also used to embed a "
character in an attribute value delimited by "
characters....