Parsing XML in Python: trouble writing the loop
I am parsing the ICD-10 XML and want to get all the terms into a CSV. I think I have it working, but is there a more efficient way to loop through all the nested tags and get them?
<section id="A00-A09">
    <desc>Intestinal infectious diseases (A00-A09)</desc>
    <diag>
        <name>A02</name>
        <desc>Other salmonella infections</desc>
        <includes>
            <note>infection or foodborne intoxication due to any Salmonella species other than S. typhi and S.paratyphi</note>
        </includes>
        <diag>
            <name>A02.0</name>
            <desc>Salmonella enteritis</desc>
            <inclusionTerm>
                <note>Salmonellosis</note>
            </inclusionTerm>
        </diag>
        <diag>
            <name>A02.1</name>
            <desc>Salmonella sepsis</desc>
        </diag>
        <diag>
            <name>A02.2</name>
            <desc>Localized salmonella infections</desc>
            <diag>
                <name>A02.20</name>
                <desc>Localized salmonella infection, unspecified</desc>
            </diag>
            <diag>
                <name>A02.21</name>
                <desc>Salmonella meningitis</desc>
            </diag>
            <diag>
                <name>A02.22</name>
                <desc>Salmonella pneumonia</desc>
            </diag>
            <diag>
                <name>A02.23</name>
                <desc>Salmonella arthritis</desc>
            </diag>
            <diag>
                <name>A02.24</name>
                <desc>Salmonella osteomyelitis</desc>
            </diag>
            <diag>
                <name>A02.25</name>
                <desc>Salmonella pyelonephritis</desc>
                <inclusionTerm>
                    <note>Salmonella tubulo-interstitial nephropathy</note>
                </inclusionTerm>
            </diag>
            <diag>
                <name>A02.29</name>
                <desc>Salmonella with other localized infection</desc>
            </diag>
        </diag>
        <diag>
            <name>A02.8</name>
            <desc>Other specified salmonella infections</desc>
        </diag>
        <diag>
            <name>A02.9</name>
            <desc>Salmonella infection, unspecified</desc>
        </diag>
    </diag>
</section>
This is the code I am using.
import xml.etree.ElementTree as ET
import csv

tree = ET.parse('/Users/mike/PycharmProjects/xml_sqlite_icd/Tabular.xml')
root = tree.getroot()

# Python 3: open the CSV in text mode with newline='' for the csv module;
# element .text is already str, so no .encode('utf8') is needed
with open("diagnostics.csv", "w", newline="", encoding="utf-8") as csvOut:
    csvwriter = csv.writer(csvOut, delimiter='|')
    for diag in root.iter('section'):  # Loop through every section
        sdesc = diag.find('desc').text
        for node in diag.iter('diag'):  # iter() yields every <diag> at any depth
            t = node.find('name').text
            desc = node.find('desc').text
            for diag1 in node.iter('diag'):  # node itself plus every <diag> below it
                temp_3 = []
                temp_4 = []
                temp3 = []
                temp4 = []
                name = diag1.find('name').text  # Extract the diag code
                desc1 = diag1.find('desc').text  # Extract the description
                for node1 in diag1.iter('inclusionTerm'):
                    temp_3.append(node1.find('note').text)
                for node1 in diag1.iter('includes'):
                    temp_4.append(node1.find('note').text)
                    # <includes> holds only <note> children in the sample above,
                    # so no <diag> is found here and temp3/temp4 stay empty
                    for diag2 in node1.iter('diag'):
                        name = diag2.find('name').text  # Extract the diag code
                        desc1 = diag2.find('desc').text  # Extract the description
                        for node2 in diag2.iter('inclusionTerm'):
                            temp3.append(node2.find('note').text)
                        for node2 in diag2.iter('includes'):
                            temp4.append(node2.find('note').text)
                # One row per (node, diag1) pair; the lists are written as their repr
                csvwriter.writerow(
                    [diag.attrib["id"], sdesc, t, desc, name, desc1,
                     temp_3, temp_4, temp3, temp4])
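
One possible way to avoid writing a loop per nesting level is a recursive walk that visits each <diag> exactly once. The following is only a minimal sketch under a few assumptions: it targets Python 3, it runs against an abbreviated copy of the sample above rather than the real Tabular.xml, and the names SAMPLE, notes, walk_diag and section_rows are hypothetical helpers invented for illustration. It uses findall() for direct children, since iter() also returns every descendant and would repeat nested codes.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Abbreviated copy of the sample above, embedded so the sketch runs on its own.
SAMPLE = """<section id="A00-A09">
  <desc>Intestinal infectious diseases (A00-A09)</desc>
  <diag>
    <name>A02</name>
    <desc>Other salmonella infections</desc>
    <includes><note>infection due to any Salmonella species</note></includes>
    <diag>
      <name>A02.0</name>
      <desc>Salmonella enteritis</desc>
      <inclusionTerm><note>Salmonellosis</note></inclusionTerm>
    </diag>
    <diag>
      <name>A02.2</name>
      <desc>Localized salmonella infections</desc>
      <diag>
        <name>A02.20</name>
        <desc>Localized salmonella infection, unspecified</desc>
      </diag>
    </diag>
  </diag>
</section>"""


def notes(diag, tag):
    """Collect <note> texts under direct <tag> children of this <diag> only."""
    return [n.text.strip() for n in diag.findall(tag + "/note") if n.text]


def walk_diag(diag, parents=()):
    """Yield one row per <diag>, then recurse into its direct <diag> children."""
    code = diag.findtext("name")
    yield {
        "code": code,
        "desc": diag.findtext("desc"),
        "parents": ">".join(parents),
        "inclusionTerm": "; ".join(notes(diag, "inclusionTerm")),
        "includes": "; ".join(notes(diag, "includes")),
    }
    for child in diag.findall("diag"):  # direct children only, unlike iter()
        yield from walk_diag(child, parents + (code,))


def section_rows(root):
    """Yield rows for every <section> in the tree (root itself included)."""
    for section in root.iter("section"):
        for top in section.findall("diag"):
            for row in walk_diag(top):
                yield {
                    "section": section.get("id"),
                    "section_desc": section.findtext("desc"),
                    **row,
                }


rows = list(section_rows(ET.fromstring(SAMPLE)))

# In-memory buffer for the demo; a real run could use
# open("diagnostics.csv", "w", newline="", encoding="utf-8") instead.
out = io.StringIO()
writer = csv.DictWriter(out, fieldnames=list(rows[0]), delimiter="|")
writer.writeheader()
writer.writerows(rows)
print(out.getvalue())
```

With this shape each code produces one row regardless of how deep it is nested, and the parents column records the chain of enclosing codes. Whether one row per code is the layout actually wanted in the CSV is an assumption; the original loops write one row per pair of nested codes.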