Examples of Python XML Processing
Python's standard library includes a lightweight implementation of the Document Object Model (DOM) for storing and manipulating XML. The parser reads XML code and builds a Document object to model it, which contains a tree of XML nodes. You can then step through the document to find different elements and attributes. You must import the module "xml.dom.minidom" in order to access Python's DOM XML processing library.
-
Accessing and Reading an XML File
-
Python's XML library can automatically parse XML text files or text strings which represent an XML document. Here is an example of how to parse an XML text file and store the result as a Python Document object:
import xml.dom.minidom
xmlFile = open("sample.xml")
xmlDocument = xml.dom.minidom.parse(xmlFile)
If you want to parse a string of XML, you need the "parseString" function.
xmlString = "<parentnode> <childnode /> <childnode /> <differentchildnode> A different node </differentchildnode> </parentnode>"
xmlDocument = xml.dom.minidom.parseString(xmlString)
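The two entry points can be compared side by side. The following is a minimal sketch, not a definitive recipe: it assumes an in-memory byte buffer is an acceptable stand-in for a file on disk, and the variable names are made up for illustration.

```python
import io
import xml.dom.minidom

xml_string = (
    "<parentnode><childnode /><childnode />"
    "<differentchildnode>A different node</differentchildnode></parentnode>"
)

# parseString takes the XML text directly.
doc_from_string = xml.dom.minidom.parseString(xml_string)

# parse takes a filename or an open file-like object.
# BytesIO is used here only so the sketch runs without a real file.
fake_file = io.BytesIO(xml_string.encode("utf-8"))
doc_from_file = xml.dom.minidom.parse(fake_file)

# Both calls return a Document whose root element is <parentnode>.
root_from_string = doc_from_string.documentElement.tagName
root_from_file = doc_from_file.documentElement.tagName
print(root_from_string, root_from_file)
```

Either way the result is the same kind of Document object, so the rest of your code does not need to know where the XML came from.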
Accessing Element Nodes and Child Nodes
-
XML elements are represented as nodes. To access an element in the XML document, you can search for it by tag name with the method "getElementsByTagName," which you call on the document (or on any element to search only beneath it), e.g.:
searchResults = xmlDocument.getElementsByTagName("ExampleNode")
This line of code returns a NodeList. To get a single element node, you can simply call:
myNode = searchResults[0]
Nodes can contain child nodes. You can get a list of child nodes given a reference to a parent node, e.g.:
myChildNodes = myNode.childNodes
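One detail worth seeing in practice is that "childNodes" contains every kind of child node, including the text nodes created by whitespace between tags. The sketch below reuses the sample XML string from earlier and filters the list down to element nodes; the variable names are illustrative only.

```python
import xml.dom.minidom

xml_string = (
    "<parentnode> <childnode /> <childnode /> "
    "<differentchildnode> A different node </differentchildnode> </parentnode>"
)
document = xml.dom.minidom.parseString(xml_string)

# Search the whole document by tag name; the result is a NodeList.
matches = document.getElementsByTagName("childnode")
first_match = matches[0]

# childNodes mixes element nodes with whitespace-only text nodes,
# so filter by node type to keep only the elements.
parent = document.documentElement
all_children = parent.childNodes
element_children = [
    node for node in all_children
    if node.nodeType == xml.dom.minidom.Node.ELEMENT_NODE
]
names = [node.tagName for node in element_children]
print(len(matches), names)
```

Filtering on "nodeType" is one common way to skip the whitespace nodes; stripping whitespace from the XML before parsing is another.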
-
Data and Attributes
-
To access the text contained within an element's opening and closing tags, you must read the "data" field of the element's child text node, not of the element node itself. For instance, if a node named "myNode" represented the XML text '<ExampleNode attr1="1" attr2="2"> sampletext </ExampleNode>' in a document, then "myNode.firstChild.data" would return the text including its surrounding spaces, and "myNode.firstChild.data.strip()" would return just the word "sampletext" on its own. To access the attributes of that node, you would need to first access the NamedNodeMap of the attributes as follows:
attrList = myNode.attributes
Then from that list, you extract the names and values of the attributes:
myList = []
for i in range(attrList.length):
    attr = attrList.item(i)
    myList.append(attr.name + '= ' + attr.value)
print("; ".join(myList))
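Putting the text and attribute lookups together, here is a short sketch built on the same ExampleNode sample. It reads the text from the element's child text node and uses "getAttribute" and the NamedNodeMap's "items" method as a shorter alternative to indexing with "item(i)"; the variable names are assumptions made for this example.

```python
import xml.dom.minidom

document = xml.dom.minidom.parseString(
    '<ExampleNode attr1="1" attr2="2"> sampletext </ExampleNode>'
)
node = document.documentElement

# The text lives in a child text node, not on the element itself.
text = node.firstChild.data.strip()

# Look up one attribute by name.
attr1 = node.getAttribute("attr1")

# items() yields (name, value) pairs; sort them for a stable order.
pairs = ["%s=%s" % (name, value) for name, value in node.attributes.items()]
summary = "; ".join(sorted(pairs))
print(text)
print(summary)
```

If the element might contain several text nodes or nested elements, reading only "firstChild" would miss some of the text, so this shortcut suits simple elements like the one shown.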
Writing XML Code
-
You can write a Node object to any writable object, such as a file opened for writing, using the method "writexml." For example, to write out the document parsed earlier:
destinationFile = open("samplewrite.xml", "w")
xmlDocument.writexml(destinationFile)
destinationFile.close()
You can also get the XML document as a string using the method "toxml" or "toprettyxml," e.g.:
print(xmlDocument.toxml())
or
print(xmlDocument.toprettyxml())
The method "toprettyxml" adds line breaks and indentation (one tab per nesting level by default) to make the XML more readable to humans.
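To see how the three output methods relate, the sketch below writes a small document to an in-memory text buffer instead of a real file (an assumption made so the example leaves nothing on disk) and compares the results. The variable names are illustrative.

```python
import io
import xml.dom.minidom

document = xml.dom.minidom.parseString("<parentnode><childnode /></parentnode>")

# writexml accepts any object with a write() method; StringIO stands in for a file.
buffer = io.StringIO()
document.writexml(buffer)
written = buffer.getvalue()

# toxml returns the whole document as one compact string.
compact = document.toxml()

# toprettyxml adds newlines and indentation; indent overrides the default tab.
pretty = document.toprettyxml(indent="  ")

print(compact)
print(pretty)
```

With its default arguments, "writexml" produces the same compact text as "toxml," so the choice between them mostly comes down to whether you want a string in memory or output sent straight to a file.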
-