The most common things I find myself doing when working with XML files are the following:
- Read the content of an XML tag
- Read an attribute value
- Add a node
- Delete a node
Let's see how to do this in Python using minidom. For the purpose of this post, let's assume that the file is named "file.xml" and has the following content.
<?xml version="1.0" encoding="UTF-8" ?>
<menu>
  <food id="1">
    <name>Pesto Chicken Sandwich</name>
    <price>$7.50</price>
  </food>
  <food id="2">
    <name>Chipotle Chicken Pizza</name>
    <price>$12.00</price>
  </food>
  <food id="3">
    <name>Burrito</name>
    <price>$6.20</price>
  </food>
</menu>
Read the content of an XML tag
from xml.dom import minidom
doc = minidom.parse('file.xml')
nodes = doc.getElementsByTagName('name')
for node in nodes:
    print(node.firstChild.nodeValue)
Output:
Pesto Chicken Sandwich
Chipotle Chicken Pizza
Burrito
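One thing to watch for: firstChild is None when a tag is empty, so the loop above would fail on something like <name></name>. Here is a minimal sketch of a safer reader. The get_text helper and the fourth, empty-named food item are my own additions for illustration, and the sample XML is inlined so the snippet runs without file.xml.

```python
from xml.dom import minidom

# Inline copy of the sample menu, plus a hypothetical item with an empty <name>.
SAMPLE = """<menu>
  <food id="1"><name>Pesto Chicken Sandwich</name><price>$7.50</price></food>
  <food id="2"><name>Chipotle Chicken Pizza</name><price>$12.00</price></food>
  <food id="3"><name>Burrito</name><price>$6.20</price></food>
  <food id="4"><name></name><price>$0.00</price></food>
</menu>"""


def get_text(element):
    """Join all text children of an element; return '' if there are none."""
    return ''.join(
        child.data
        for child in element.childNodes
        if child.nodeType in (child.TEXT_NODE, child.CDATA_SECTION_NODE)
    )


doc = minidom.parseString(SAMPLE)
names = [get_text(node) for node in doc.getElementsByTagName('name')]
print(names)
```

The empty tag simply yields an empty string instead of raising an AttributeError.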
Read an attribute value
from xml.dom import minidom
doc = minidom.parse('file.xml')
nodes = doc.getElementsByTagName('food')
for node in nodes:
    # Only print the id when the attribute is actually present
    if node.hasAttribute('id'):
        print(node.getAttribute('id'))
Output:
1
2
3
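Attributes become more useful when combined with tag content. As a rough sketch, the snippet below builds a dictionary from each food's id to its price. The name prices_by_id is just something I picked for this example, and the menu is inlined again so it runs on its own.

```python
from xml.dom import minidom

# Inline copy of the sample menu so the sketch runs without file.xml.
SAMPLE = """<menu>
  <food id="1"><name>Pesto Chicken Sandwich</name><price>$7.50</price></food>
  <food id="2"><name>Chipotle Chicken Pizza</name><price>$12.00</price></food>
  <food id="3"><name>Burrito</name><price>$6.20</price></food>
</menu>"""

doc = minidom.parseString(SAMPLE)

prices_by_id = {}
for food in doc.getElementsByTagName('food'):
    # getAttribute returns '' for a missing attribute, so check for it first.
    if not food.hasAttribute('id'):
        continue
    # getElementsByTagName also works on an element, searching only below it.
    price = food.getElementsByTagName('price')[0]
    prices_by_id[food.getAttribute('id')] = price.firstChild.nodeValue

print(prices_by_id)
```

Note that attribute values always come back as strings, so the keys here are '1', '2' and '3', not integers.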
Add a node
Let's add a <rating> node with a default value of 5 to each food item to indicate its popularity.
from xml.dom import minidom
doc = minidom.parse('file.xml')
nodes = doc.getElementsByTagName('food')
for node in nodes:
    rating = doc.createElement('rating')
    rating.setAttribute('value','5')
    text = doc.createTextNode('Average')
    rating.appendChild(text)
    node.appendChild(rating)
# Write the modified document; the with block closes the file for us
with open('newfile.xml', 'w') as ofile:
    doc.writexml(ofile)
Output: The resulting XML looks like this (whitespace tidied for readability):
<?xml version="1.0" ?>
<menu>
  <food id="1">
    <name>Pesto Chicken Sandwich</name>
    <price>$7.50</price>
    <rating value="5">Average</rating>
  </food>
  <food id="2">
    <name>Chipotle Chicken Pizza</name>
    <price>$12.00</price>
    <rating value="5">Average</rating>
  </food>
  <food id="3">
    <name>Burrito</name>
    <price>$6.20</price>
    <rating value="5">Average</rating>
  </food>
</menu>
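The create, set attribute, add text, append sequence repeats for every node you add, so it can be wrapped in a small function. This is only a sketch: add_text_child is a hypothetical helper of mine, not part of minidom, and the document is parsed from an inline string so nothing is written to disk.

```python
from xml.dom import minidom

# Inline copy of the sample menu so the sketch runs without file.xml.
SAMPLE = """<menu>
  <food id="1"><name>Pesto Chicken Sandwich</name><price>$7.50</price></food>
  <food id="2"><name>Chipotle Chicken Pizza</name><price>$12.00</price></food>
  <food id="3"><name>Burrito</name><price>$6.20</price></food>
</menu>"""


def add_text_child(doc, parent, tag, text, attrs=None):
    """Append <tag ...attrs>text</tag> to parent and return the new element."""
    child = doc.createElement(tag)
    for key, value in (attrs or {}).items():
        child.setAttribute(key, value)
    child.appendChild(doc.createTextNode(text))
    parent.appendChild(child)
    return child


doc = minidom.parseString(SAMPLE)
for food in doc.getElementsByTagName('food'):
    add_text_child(doc, food, 'rating', 'Average', {'value': '5'})

ratings = doc.getElementsByTagName('rating')
print(len(ratings))
print(ratings[0].toxml())
```

Calling toxml() on a single element is a handy way to check what was built before writing the whole document out.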
Delete a node
Let's now delete the <rating> tag from each food item.
from xml.dom import minidom
# Parse the file that contains the <rating> nodes (file.xml has none)
doc = minidom.parse('newfile.xml')
nodes = doc.getElementsByTagName('rating')
for node in nodes:
    parent = node.parentNode
    parent.removeChild(node)
# Write the modified document; the with block closes the file for us
with open('newfile.xml', 'w') as ofile:
    doc.writexml(ofile)
             
The result is an XML file similar to the one we started with.
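Deletion does not have to be all or nothing. As a small sketch, the snippet below removes the <rating> from only one food item, picked by its id. The target id '2' and the variable names are arbitrary choices for the example, and the rated menu is inlined so it runs without any files.

```python
from xml.dom import minidom

# Inline copy of the menu after the ratings were added.
SAMPLE = """<menu>
  <food id="1"><name>Pesto Chicken Sandwich</name><rating value="5">Average</rating></food>
  <food id="2"><name>Chipotle Chicken Pizza</name><rating value="5">Average</rating></food>
  <food id="3"><name>Burrito</name><rating value="5">Average</rating></food>
</menu>"""

doc = minidom.parseString(SAMPLE)

removed = 0
for food in doc.getElementsByTagName('food'):
    # Skip every food item except the one we want to change.
    if food.getAttribute('id') != '2':
        continue
    for rating in food.getElementsByTagName('rating'):
        rating.parentNode.removeChild(rating)
        removed += 1

# Collect the ids of the food items that still carry a rating.
remaining_ids = [
    rating.parentNode.getAttribute('id')
    for rating in doc.getElementsByTagName('rating')
]
print(removed, remaining_ids)
```

The lists returned by getElementsByTagName are built when the call is made, which is why removing nodes inside the loop is safe here.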