The Handy Python cElementTree

Source: Internet  Editor: 程序博客网  Time: 2024/06/11 17:37

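ElementTree is Python's XML parsing module, and cElementTree is the C implementation of ElementTree. The Python 2.5 standard library already includes both (as xml.etree.ElementTree and xml.etree.cElementTree).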

Below are the benchmark figures from the cElementTree website:

Here are some benchmark figures, using a number of popular XML toolkits to parse a 3405k document-style XML file, from disk to memory:

library                         time      space    notes
xml.dom.minidom (Python 2.1)    6.3 s     80000k   (1)
gnosis.objectify                2.0 s     22000k   (5)
xml.dom.minidom (Python 2.4)    1.4 s     53000k   (1)
ElementTree 1.2                 1.6 s     14500k
ElementTree 1.2.4/1.3           1.1 s     14500k
cDomlette (C extension)         0.540 s   20500k   (1)
PyRXPU (C extension)            0.175 s   10850k   (2)
lxml.etree (C extension)        (4)       (4)      (3)
libxml2 (C extension)           0.098 s   16000k   (3)
readlines (read as utf-8)       0.093 s   8850k
cElementTree (C extension)      0.047 s   4900k
readlines (read as ascii)       0.032 s   5050k

library                   time      throughput
xml.sax (Python 2.1)      0.330 s   10300 k/s
xml.sax (Python 2.4)      0.292 s   11700 k/s
xml.parsers.expat         0.184 s   18500 k/s
cElementTree XMLParser    0.124 s   27500 k/s
sgmlop                    0.092 s   37000 k/s
cElementTree iterparse    0.071 s   48000 k/s

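An ElementTree is a tree made up of element nodes. Text content is exposed through an element's text or tail attribute, for example ele.text. This is much simpler and more convenient than the DOM approach, which treats both elements and text as nodes. An element supports some dictionary and list operations: attributes are accessed dictionary-style, and child nodes list-style. To search, use the find or findall functions.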
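To make the text/tail distinction concrete, here is a minimal sketch. The sample markup is made up for illustration, and the import assumes a modern Python, where xml.etree.ElementTree is the module to use; on the setup this post describes, `import cElementTree as ET` would take its place.

```python
import xml.etree.ElementTree as ET  # assumption: modern Python; the post itself uses cElementTree

# Hypothetical sample markup, invented for this illustration.
doc = ET.fromstring('<p>Hello <b>bold</b> world<i>x</i></p>')

b = doc.find('b')            # first matching child element, or None if absent
items = doc.findall('i')     # list of all matching child elements

print(repr(doc.text))        # text before the first child of <p>
print(repr(b.text))          # text inside <b>
print(repr(b.tail))          # text after </b>, up to the next tag
print(len(items))
```

Note that the text following a closing tag belongs to the tail of that element rather than to the parent, which is why no separate text nodes are needed.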

 

Operation                    Result
elem[n]                      Returns n'th child element.
elem[m:n]                    Returns list of m'th through n'th child elements.
len(elem)                    Returns number of child elements.
list(elem)                   Returns list of child elements.
elem.append(elem2)           Adds elem2 as a child.
elem.insert(index, elem2)    Inserts elem2 at the specified location.
del elem[n]                  Deletes n'th child element.
elem.keys()                  Returns list of attribute names.
elem.get(name)               Returns value of attribute name.
elem.set(name, value)        Sets new value for attribute name.
elem.attrib                  Retrieves the dictionary containing attributes.
del elem.attrib[name]        Deletes attribute name.

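The operations in the table can be tried on a small tree built in memory. This is only a sketch: the element and attribute names are hypothetical, and the import assumes a modern Python rather than the standalone cElementTree package.

```python
import xml.etree.ElementTree as ET  # assumption: modern Python; the post itself uses cElementTree

# Hypothetical element and attribute names, chosen for illustration.
root = ET.Element('root', {'name': 'demo'})
a = ET.Element('a')
b = ET.Element('b')
c = ET.Element('c')

# List-style operations on child elements
root.append(a)
root.append(c)
root.insert(1, b)             # children are now a, b, c
first = root[0]               # the element a
middle = root[1:3]            # a list holding b and c
count = len(root)
del root[0]                   # children are now b, c

# Dictionary-style operations on attributes
root.set('user', 'someone')
names = sorted(root.keys())
value = root.get('name')
del root.attrib['name']

print([child.tag for child in root], root.attrib)
```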
It really is a great tool, and very convenient to use. Here are a few lines of code to try it out:
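# Code for Python 2.4 with the standalone cElementTree package installed.
# On Python 2.5+ use "import xml.etree.cElementTree as ET" instead;
# on Python 3.3+ use "import xml.etree.ElementTree as ET".
import cElementTree as ET

# Parse the file
tree = ET.parse('test.xml')

# Get the root node
root = tree.getroot()

# Find the first tagformat element
tag = root.find('tagformat')
# Iterate over all of its opt child elements
for ele in tag.findall('opt'):
    print(ele.text)

# Read an attribute
print(root.get('name'))
# Modify an attribute, or create it if it does not exist
root.set('user', 'liujunzhi')

# Save with UTF-8 encoding (binary mode, since write() emits encoded bytes)
f = open('output.xml', 'wb')
tree.write(f, encoding='utf-8')
f.close()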
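The snippet above needs a test.xml file that the post does not show. As a rough sketch, the same steps can be run entirely in memory. The sample document below is an assumption, reconstructed only from the tag and attribute names the code uses (tagformat, opt, name); the root tag and the opt values are invented. The import again assumes a modern Python.

```python
import io
import xml.etree.ElementTree as ET  # assumption: modern Python; the post itself uses cElementTree

# Hypothetical stand-in for test.xml; only tagformat, opt and name come from the post.
SAMPLE = b"""<config name="demo">
  <tagformat>
    <opt>artist</opt>
    <opt>title</opt>
  </tagformat>
</config>"""

tree = ET.parse(io.BytesIO(SAMPLE))   # parse() accepts a file name or a file object
root = tree.getroot()

# Collect the text of every opt element under the first tagformat element
opts = [ele.text for ele in root.find('tagformat').findall('opt')]
root.set('user', 'liujunzhi')

# Serialize as UTF-8 into a byte buffer instead of output.xml
out = io.BytesIO()
tree.write(out, encoding='utf-8')
result = out.getvalue()

print(opts)
print(result.decode('utf-8'))
```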

