Python XML Parse with xml attributes

I have many rows in a file that contain XML, and I'm trying to write a Python script that goes through those rows and counts how many instances of a particular node attribute show up. For instance, my tree looks like:

<foo>
   <bar>
      <type name="controller">A</type>
      <type name="channel">12</type>
   </bar>
</foo>

I want to get the text of the element that has name="controller". In the XML above, I need to get "A" and not "controller".

I used xml.etree.ElementTree, but it gives me the value of the name attribute, which is "controller".


Assuming your file is named input.xml, you can use the following code:

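import xml.etree.ElementTree as ET

tree = ET.parse('input.xml')

# <bar> elements that are direct children of the root <foo>
for bar in tree.findall('bar'):
    for elem in bar.findall('type'):
        # .get() returns None instead of raising KeyError when 'name' is missing
        if elem.get('name') == 'controller':
            print(elem.text)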

With xml.etree.ElementTree, use the text attribute of an Element to get the text inside the element.

Example:

import xml.etree.ElementTree as ET
x = ET.fromstring('<a>This is the text</a>')
print(x.text)
# prints: This is the text

ElementTree supports a limited subset of XPath (a language for selecting nodes in an XML document). We can use it to find all of the desired nodes, and then read each node's text attribute to get its content.

import xml.etree.ElementTree as ET

tree = ET.parse("filename.xml")

for x in tree.findall(".//type[@name='controller']"):
    print(x.text)

This will loop over all type elements whose name attribute is controller. In XPath, .// means all descendants of the current node, and the name type restricts the match to elements whose tag is type. The bracketed part is a predicate, which keeps only the nodes that satisfy a condition; @name refers to the name attribute. So the whole expression selects all type nodes (no matter how deep) with a name attribute equal to controller.

In this example, I have just printed the text in the node. You can do whatever you want in the body of that loop.
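Since the question is really about counting, here is a minimal sketch of one thing the loop body could do: tally how often each name value appears. It assumes the XML is already available as a string; the SAMPLE text (with a second bar element added) and the helper count_name_attributes are made up for illustration and are not part of the original post.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical sample data: the XML from the question plus a second <bar>.
SAMPLE = """<foo>
   <bar>
      <type name="controller">A</type>
      <type name="channel">12</type>
   </bar>
   <bar>
      <type name="controller">B</type>
   </bar>
</foo>"""


def count_name_attributes(xml_text):
    """Count how many <type> elements carry each value of the name attribute."""
    root = ET.fromstring(xml_text)
    return Counter(
        elem.get("name")
        for elem in root.iter("type")      # every <type>, at any depth
        if elem.get("name") is not None    # skip elements without the attribute
    )


counts = count_name_attributes(SAMPLE)

# Text of every <type name="controller"> element, in document order.
controller_texts = [
    elem.text
    for elem in ET.fromstring(SAMPLE).findall(".//type[@name='controller']")
]

print(counts["controller"], controller_texts)
```

If each row of your file holds a separate XML fragment, you could call the helper once per row and add the resulting Counter objects together.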

If you want all nodes with that attribute and not just the type nodes, replace the argument to the findall function with

.//*[@name='controller']

The * matches ANY element node.
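To see the difference between the two expressions, here is a small sketch run against an in-memory string. The kind element is a hypothetical addition to the sample, included only so that the wildcard has something extra to match.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: <kind> is not in the original XML; it is added here
# so the two expressions return different results.
SAMPLE = """<foo>
   <bar>
      <type name="controller">A</type>
      <kind name="controller">B</kind>
      <type name="channel">12</type>
   </bar>
</foo>"""

root = ET.fromstring(SAMPLE)

# Only <type> elements whose name attribute is "controller".
only_type = [elem.text for elem in root.findall(".//type[@name='controller']")]

# Any element, whatever its tag, whose name attribute is "controller".
any_tag = [(elem.tag, elem.text) for elem in root.findall(".//*[@name='controller']")]

print(only_type)
print(any_tag)
```

The first list contains only the text of the type element, while the second also picks up the kind element because * does not filter on the tag.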
