Reading XML documents
The XML markup language is widely used to organize data. For details, see http://www.w3.org/TR/REC-xml/. Python includes a number of libraries for parsing XML documents.
XML is called a markup language because the content of interest is marked with <tag> and </tag> constructs that define the structure of the data. The overall file includes the content plus the XML markup text.
Because the markup is intermingled with our text, there are some additional syntax rules that must be used. In order to include the < character in our data, we'll use XML character entity references to avoid confusion. We use &lt; to be able to include < in our text. Similarly, &gt; is used instead of >, &amp; is used instead of &, and &quot; is also used to embed a " in an attribute value.
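The entity references described above can be produced and reversed with the standard library's xml.sax.saxutils module. The following is a minimal sketch; the sample string and the variable names are illustrative assumptions, not part of the original text.

```python
from xml.sax.saxutils import escape, unescape
import xml.etree.ElementTree as ET

# Hypothetical sample text containing all four special characters.
raw = 'say "hi" & <go>'

# escape() always handles &, <, and >; the extra mapping adds the
# double-quote replacement needed inside attribute values.
escaped = escape(raw, {'"': '&quot;'})

# unescape() reverses the process; the extra mapping restores quotes.
restored = unescape(escaped, {'&quot;': '"'})

# An XML parser decodes entity references on its own when reading text.
element = ET.fromstring("<name>A &lt; B &amp; C</name>")
decoded = element.text
```

Here escaped holds the text with every special character replaced by its entity reference, and restored is identical to raw. When a document is read with a parser, the decoding step happens automatically, so application code generally sees the plain characters.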
A document, then, will have items as follows:
<team><name>Team SCA</name><position>...</position></team>
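As a rough illustration of how a document like this can be read, the following sketch parses the one-line example with xml.etree.ElementTree, one of the XML libraries in the standard library. The variable names are assumptions made for this example, and the elided position content is kept as the literal text "..." rather than filled in.

```python
import xml.etree.ElementTree as ET

# The sample document from the text, with the elided content left as-is.
document = "<team><name>Team SCA</name><position>...</position></team>"

# fromstring() parses the text and returns the root element, <team>.
team = ET.fromstring(document)

# find() locates the first child element with the given tag.
team_name = team.find("name").text

# Iterating over an element yields its direct children in document order.
child_tags = [child.tag for child in team]
```

The nesting of the tags becomes a tree of element objects: team is the root, and child_tags lists the tags found directly inside it.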
Most XML processing allows...