pyorient parsing problems
I am currently trying to populate an OrientDB graph database using pyorient. Overall everything works well. However, I have run into a parsing problem with one of my commands. In Python, if I run the following code:
>>> ab = 'UPDATE Patent SET primary_id = 676, original_abstract = set(original_abstract, "<p num=\\"0000\\">The present invention relates to compounds of the general formula (I) wherein\\n\\nR<sup>1</sup> is the group (A) or (B) or (C) or (D); R<sup>2</sup> is a non aromatic\\n\\nheterocycle, or is OR\' or N(R\\")<sub>2</sub>; R\' is lower alkyl,\\n\\nlower alkyl substituted by halogen or -(CH<sub>2</sub>)<sub>n</sub>-cycloalkyl;\\n\\nR\\" is lower alkyl; R<sup>3</sup> is NO<sub>2</sub>, CN or SO<sub>2</sub>R\';\\n\\nR<sup>4 </sup>is hydrogen, hydroxy, halogen, NO<sub>2</sub>, lower alkyl, lower\\n\\nalkyl, substituted by halogen, lower alkoxy, SO<sub>2</sub>R\' or C(O)OR\\";\\n\\nR<sup>5</sup>/R<sup>6</sup>/R<sup>7</sup> are hydrogen, halogen, lower alkyl\\n\\nor lower alkyl, substituted by halogen; X<sup>1</sup>/X<sup>1\\u00bf</sup>\\n\\nare CH or N, with the proviso that X<sup>1</sup>/X<sup>1\\u00bf</sup> are not simultaneously\\n\\nCH; X<sup>2</sup> is O, S, NH or N(lower alkyl); n is 0, l or 2; and to pharmaceutically\\n\\nactive acid addition salts and to their use in the treatment of neurological and\\n\\nneuropsychiatric disorders.</p>") UPSERT WHERE primary_id = 676'
>>> client.batch(ab)
I get the following error:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/Users/shaungupta/anaconda/lib/python2.7/site-packages/pyorient/orient.py", line 402, in batch
.prepare(( QUERY_SCRIPT, ) + args).send().fetch_response()
File "/Users/shaungupta/anaconda/lib/python2.7/site-packages/pyorient/messages/commands.py", line 145, in fetch_response
super( CommandMessage, self ).fetch_response()
File "/Users/shaungupta/anaconda/lib/python2.7/site-packages/pyorient/messages/base.py", line 256, in fetch_response
self._decode_all()
File "/Users/shaungupta/anaconda/lib/python2.7/site-packages/pyorient/messages/base.py", line 240, in _decode_all
self._decode_header()
File "/Users/shaungupta/anaconda/lib/python2.7/site-packages/pyorient/messages/base.py", line 192, in _decode_header
[ exception_message.decode( 'utf8' ) ]
pyorient.exceptions.PyOrientCommandException: com.orientechnologies.orient.core.sql.parser.TokenMgrError - Lexical error at line 1, column 311. Encountered: <EOF> after : "\"<p num=\\\"0000\\\">The present invention relates to compounds of the general formula (I) wherein\\n\\nR<sup>1</sup> is the group (A) or (B) or (C) or (D); R<sup>2</sup> is a non aromatic\\n\\nheterocycle, or is OR\' or N(R\\\")<sub>2</sub>"
If I run it with client.command instead, I also get an error:
>>> client.command(ab)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/Users/shaungupta/anaconda/lib/python2.7/site-packages/pyorient/orient.py", line 398, in command
.prepare(( QUERY_CMD, ) + args).send().fetch_response()
File "/Users/shaungupta/anaconda/lib/python2.7/site-packages/pyorient/messages/commands.py", line 145, in fetch_response
super( CommandMessage, self ).fetch_response()
File "/Users/shaungupta/anaconda/lib/python2.7/site-packages/pyorient/messages/base.py", line 256, in fetch_response
self._decode_all()
File "/Users/shaungupta/anaconda/lib/python2.7/site-packages/pyorient/messages/base.py", line 240, in _decode_all
self._decode_header()
File "/Users/shaungupta/anaconda/lib/python2.7/site-packages/pyorient/messages/base.py", line 192, in _decode_header
[ exception_message.decode( 'utf8' ) ]
pyorient.exceptions.PyOrientCommandException: com.orientechnologies.orient.core.sql.OCommandSQLParsingExceptioncom.orientechnologies.orient.core.exception.OSerializationException - Error on parsing command at position #0: Error on reading parameters in: set(original_abstract, "<p num="0000">The present invention relates to compounds of the general formula (I) wherein
R<sup>1</sup> is the group (A) or (B) or (C) or (D); R<sup>2</sup> is a non aromatic
heterocycle, or is OR' or N(R")<sub>2</sub>; R' is lower alkyl,
lower alkyl substituted by halogen or -(CH<sub>2</sub>)<sub>n</sub>-cycloalkyl;
R" is lower alkyl; R<sup>3</sup> is NO<sub>2</sub>, CN or SO<sub>2</sub>R';
R<sup>4 </sup>is hydrogen, hydroxy, halogen, NO<sub>2</sub>, lower alkyl, lower
alkyl, substituted by halogen, lower alkoxy, SO<sub>2</sub>R' or C(O)OR";
R<sup>5</sup>/R<sup>6</sup>/R<sup>7</sup> are hydrogen, halogen, lower alkyl
or lower alkyl, substituted by halogen; X<sup>1</sup>/X<sup>1¿</sup>
are CH or N, with the proviso that X<sup>1</sup>/X<sup>1¿</sup> are not simultaneously
CH; X<sup>2</sup> is O, S, NH or N(lower alkyl); n is 0, l or 2; and to pharmaceutically
active acid addition salts and to their use in the treatment of neurological and
neuropsychiatric disordersFound invalid ) character at position 229 of text original_abstract, "<p num="0000">The present invention relates to compounds of the general formula (I) wherein
R<sup>1</sup> is the group (A) or (B) or (C) or (D); R<sup>2</sup> is a non aromatic
heterocycle, or is OR' or N(R")<sub>2</sub>. Ensure it is opened and closed correctly.
However, I found that if I do not add the data to original_abstract as a set, it works when using client.command:
>>> aba = 'UPDATE Patent SET primary_id = 676, original_abstract = "<p num=\\"0000\\">The present invention relates to compounds of the general formula (I) wherein\\n\\nR<sup>1</sup> is the group (A) or (B) or (C) or (D); R<sup>2</sup> is a non aromatic\\n\\nheterocycle, or is OR\' or N(R\\")<sub>2</sub>; R\' is lower alkyl,\\n\\nlower alkyl substituted by halogen or -(CH<sub>2</sub>)<sub>n</sub>-cycloalkyl;\\n\\nR\\" is lower alkyl; R<sup>3</sup> is NO<sub>2</sub>, CN or SO<sub>2</sub>R\';\\n\\nR<sup>4 </sup>is hydrogen, hydroxy, halogen, NO<sub>2</sub>, lower alkyl, lower\\n\\nalkyl, substituted by halogen, lower alkoxy, SO<sub>2</sub>R\' or C(O)OR\\";\\n\\nR<sup>5</sup>/R<sup>6</sup>/R<sup>7</sup> are hydrogen, halogen, lower alkyl\\n\\nor lower alkyl, substituted by halogen; X<sup>1</sup>/X<sup>1\\u00bf</sup>\\n\\nare CH or N, with the proviso that X<sup>1</sup>/X<sup>1\\u00bf</sup> are not simultaneously\\n\\nCH; X<sup>2</sup> is O, S, NH or N(lower alkyl); n is 0, l or 2; and to pharmaceutically\\n\\nactive acid addition salts and to their use in the treatment of neurological and\\n\\nneuropsychiatric disorders.</p>" UPSERT WHERE primary_id = 676'
>>> client.command(aba)
['1']
However, client.batch still cannot process it correctly:
>>> client.batch(aba)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/Users/shaungupta/anaconda/lib/python2.7/site-packages/pyorient/orient.py", line 402, in batch
.prepare(( QUERY_SCRIPT, ) + args).send().fetch_response()
File "/Users/shaungupta/anaconda/lib/python2.7/site-packages/pyorient/messages/commands.py", line 145, in fetch_response
super( CommandMessage, self ).fetch_response()
File "/Users/shaungupta/anaconda/lib/python2.7/site-packages/pyorient/messages/base.py", line 256, in fetch_response
self._decode_all()
File "/Users/shaungupta/anaconda/lib/python2.7/site-packages/pyorient/messages/base.py", line 240, in _decode_all
self._decode_header()
File "/Users/shaungupta/anaconda/lib/python2.7/site-packages/pyorient/messages/base.py", line 192, in _decode_header
[ exception_message.decode( 'utf8' ) ]
pyorient.exceptions.PyOrientCommandException: com.orientechnologies.orient.core.exception.OSerializationException - Found invalid ) character at position 274 of text UPDATE Patent SET primary_id = 676, original_abstract = "<p num=\"0000\">The present invention relates to compounds of the general formula (I) wherein\n\nR<sup>1</sup> is the group (A) or (B) or (C) or (D); R<sup>2</sup> is a non aromatic\n\nheterocycle, or is OR' or N(R\")<sub>2</sub>; R' is lower alkyl,\n\nlower alkyl substituted by halogen or -(CH<sub>2</sub>)<sub>n</sub>-cycloalkyl;\n\nR\" is lower alkyl; R<sup>3</sup> is NO<sub>2</sub>, CN or SO<sub>2</sub>R';\n\nR<sup>4 </sup>is hydrogen, hydroxy, halogen, NO<sub>2</sub>, lower alkyl, lower\n\nalkyl, substituted by halogen, lower alkoxy, SO<sub>2</sub>R' or C(O)OR\";\n\nR<sup>5</sup>/R<sup>6</sup>/R<sup>7</sup> are hydrogen, halogen, lower alkyl\n\nor lower alkyl, substituted by halogen; X<sup>1</sup>/X<sup>1\u00bf</sup>\n\nare CH or N, with the proviso that X<sup>1</sup>/X<sup>1\u00bf</sup> are not simultaneously\n\nCH; X<sup>2</sup> is O, S, NH or N(lower alkyl); n is 0, l or 2; and to pharmaceutically\n\nactive acid addition salts and to their use in the treatment of neurological and\n\nneuropsychiatric disorders.</p>" UPSERT WHERE primary_id = 676. Ensure it is opened and closed correctly.
Running the string ab directly in the OrientDB console raises the same errors, whereas running the string aba in the OrientDB console works fine, so I cannot understand why aba fails with pyorient's batch command. (I was also able to run aba successfully in the OrientDB console as part of a begin/commit transaction.)
Does anyone understand why this command works through client.command but not through client.batch? I need this command to run as part of a batch of commands, so I need to find a fix. Ideally, I would like the command in the string ab, where I add the original abstract as a set, to work, since I need to keep track of any new information for the corresponding nodes.
From what I can see, this is a limitation of the command executor's parser, but please tell me if I am doing something wrong here...
Thanks!!
1 answer
The first error indicates that your command contains an end-of-file (EOF) character, which may be invisible. This usually happens when the text was copied from somewhere else.
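One way to check for invisible characters is to scan the command string before sending it. The sketch below is only an illustration and assumes Python 3 (the tracebacks above come from Python 2.7, where the string would first need to be a unicode object). The helper names find_invisible_chars and context are hypothetical and not part of pyorient. Note also that in a lexer error, <EOF> commonly means the input ended while a token such as a quoted string was still open, so if the scan reports nothing, the cause probably lies in the quoting or escaping rather than in a hidden character.

```python
import unicodedata


def find_invisible_chars(command):
    """Return (index, code point, name) for each control or format character.

    Hypothetical helper for illustration; not part of pyorient.
    """
    hits = []
    for index, ch in enumerate(command):
        # Cc = control characters (newline, tab, 0x1A, ...),
        # Cf = format characters (zero-width space, BOM, ...).
        if unicodedata.category(ch) in ("Cc", "Cf"):
            name = unicodedata.name(ch, "<unnamed>")
            hits.append((index, "U+%04X" % ord(ch), name))
    return hits


def context(command, position, width=20):
    """Return the text around a position, using repr() so escapes are visible."""
    start = max(0, position - width)
    return repr(command[start:position + width])


if __name__ == "__main__":
    # Made-up sample containing a SUBSTITUTE (0x1A) and a zero-width space.
    sample = 'UPDATE Patent SET x = "a\x1ab\u200b"'
    for hit in find_invisible_chars(sample):
        print(hit, context(sample, hit[0], width=5))
```

The context helper can also be pointed at the position quoted in the server's error message (for example 274 in the batch error above) to see which characters the parser was looking at. Whether the server counts positions the same way as Python string indices is an assumption that would need checking.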