Encoding problem when parsing XML
I'm trying to parse XML with minidom (Python). I have a URL that serves my XML data.
But I get this error:
DjangoUnicodeDecodeError
It looks like an encoding problem during parsing!
This is my code:
import urllib2
from xml.dom.minidom import parseString

url = "http://sirh6.eolia-software.com/wscvsearch.asp"
datasource = urllib2.urlopen(url)
data = datasource.read()  # read() returns the raw response body as a byte string
dom = parseString(data)
handleElements(dom)
My XML:
<?xml version="1.0" encoding="ISO-8859-1" ?>
...
(see the source code of the url above)
I tried:
1-
url = "http://sirh6.eolia-software.com/wscvsearch.asp"
datasource = urllib2.urlopen(url)
data = datasource.read().decode('iso8859-1')
dom = parseString(data)
I got:
UnicodeEncodeError
'ascii' codec can't encode character u'\x92' in position 345: ordinal not in range(128)
The string that could not be encoded/decoded was: nce darchi
Traceback:
File "/usr/local/lib/python2.7/dist-packages/django/core/handlers/base.py" in get_response
100. response = callback(request, *callback_args, **callback_kwargs)
File "/usr/local/lib/python2.7/dist-packages/annoying/decorators.py" in wrapper
74. output = function(request, *args, **kwargs)
File "/home/astrocybernaute/projects/emploi/views.py" in parse_xml
880. dom = parseString(data)
File "/usr/lib/python2.7/xml/dom/minidom.py" in parseString
1924. return expatbuilder.parseString(string)
File "/usr/lib/python2.7/xml/dom/expatbuilder.py" in parseString
940. return builder.parseString(string)
File "/usr/lib/python2.7/xml/dom/expatbuilder.py" in parseString
223. parser.Parse(string, True)
Exception Type: UnicodeEncodeError at /emploi/xml/
Exception Value: 'ascii' codec can't encode character u'\x92' in position 345: ordinal not in range(128)
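The traceback suggests that passing a unicode object to parseString makes Python 2 encode it to ASCII before expat sees it, which would explain the failure on u'\x92'. Byte 0x92 is a control character in ISO-8859-1 but a curly apostrophe in Windows-1252, so the feed may really be Windows-1252 despite its declaration. Below is a minimal sketch of one possible workaround under that assumption; the function name parse_cp1252_xml, the XML_DECL pattern, and the sample document are hypothetical and not taken from the real feed.

```python
# -*- coding: utf-8 -*-
import re
from xml.dom.minidom import parseString

# Matches a leading XML declaration so its encoding can be rewritten.
XML_DECL = re.compile(r'^\s*<\?xml[^>]*\?>')


def parse_cp1252_xml(raw_bytes):
    """Parse XML bytes declared as ISO-8859-1 but assumed to be Windows-1252."""
    # Assumption: the data is really Windows-1252, where 0x92 is U+2019.
    # Bytes that are undefined in cp1252 are replaced rather than raising.
    text = raw_bytes.decode("cp1252", "replace")
    # Make the declaration agree with the bytes handed to the parser.
    text = XML_DECL.sub('<?xml version="1.0" encoding="UTF-8"?>', text, count=1)
    # Give the parser an encoded byte string, so no implicit ASCII
    # encode of a unicode object can happen (the Python 2 failure above).
    return parseString(text.encode("utf-8"))


# Hypothetical sample standing in for the real response body.
sample = (b'<?xml version="1.0" encoding="ISO-8859-1" ?>'
          b'<cv><titre>exp\xe9rience d\x92architecte</titre></cv>')
dom = parse_cp1252_xml(sample)
title = dom.getElementsByTagName("titre")[0].firstChild.data
```

If the feed is in fact plain ISO-8859-1, passing the undecoded bytes from read() straight to parseString should also work, since expat honours the declared encoding; the re-encoding step only matters when characters such as 0x92 need to come out as real punctuation.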
Thanks