Scrapyd, Celery and Django running with Supervisor - GenericHTTPChannelProtocol error
I am using a project called Django Dynamic Scraper to build a basic web scraper on top of Django. Everything works fine in development, but when setting up my DigitalOcean VPS I am running into problems.
I am using Supervisor to keep three things running:
- Scrapyd on 0.0.0.0:6800
- Celery Beat (the task scheduler)
- The Celery worker
Whenever Celery hands a job to Scrapyd to scrape, I get the following error in the Scrapyd log:
2017-08-29T08:49:06+0000 [twisted.python.log#info] "127.0.0.1" - - [29/Aug/2017:08:49:05 +0000] "POST /schedule.json HTTP/1.1" 200 3464 "-" "-"
2017-08-29T08:49:07+0000 [_GenericHTTPChannelProtocol,5,127.0.0.1] Unhandled Error
Traceback (most recent call last):
File "/home/dean/website/venv/local/lib/python2.7/site-packages/twisted/web/http.py", line 2059, in allContentReceived
req.requestReceived(command, path, version)
File "/home/dean/website/venv/local/lib/python2.7/site-packages/twisted/web/http.py", line 869, in requestReceived
self.process()
File "/home/dean/website/venv/local/lib/python2.7/site-packages/twisted/web/server.py", line 184, in process
self.render(resrc)
File "/home/dean/website/venv/local/lib/python2.7/site-packages/twisted/web/server.py", line 235, in render
body = resrc.render(self)
--- <exception caught here> ---
File "/home/dean/website/venv/local/lib/python2.7/site-packages/scrapyd/webservice.py", line 21, in render
return JsonResource.render(self, txrequest).encode('utf-8')
File "/home/dean/website/venv/local/lib/python2.7/site-packages/scrapyd/utils.py", line 20, in render
r = resource.Resource.render(self, txrequest)
File "/home/dean/website/venv/local/lib/python2.7/site-packages/twisted/web/resource.py", line 250, in render
return m(request)
File "/home/dean/website/venv/local/lib/python2.7/site-packages/scrapyd/webservice.py", line 49, in render_POST
spiders = get_spider_list(project, version=version)
File "/home/dean/website/venv/local/lib/python2.7/site-packages/scrapyd/utils.py", line 137, in get_spider_list
raise RuntimeError(msg.encode('unicode_escape') if six.PY2 else msg)
exceptions.RuntimeError: Traceback (most recent call last):\n File "/usr/lib/python2.7/runpy.py", line 174, in _run_module_as_main\n "__main__", fname, loader, pkg_name)\n File "/usr/lib/python2.7/runpy.py", line 72, in _run_code\n exec code in run_globals\n File "/home/dean/website/venv/lib/python2.7/site-packages/scrapyd/runner.py", line 40, in <module>\n main()\n File "/home/dean/website/venv/lib/python2.7/site-packages/scrapyd/runner.py", line 37, in main\n execute()\n File "/home/dean/website/venv/local/lib/python2.7/site-packages/scrapy/cmdline.py", line 148, in execute\n cmd.crawler_process = CrawlerProcess(settings)\n File "/home/dean/website/venv/local/lib/python2.7/site-packages/scrapy/crawler.py", line 243, in __init__\n super(CrawlerProcess, self).__init__(settings)\n File "/home/dean/website/venv/local/lib/python2.7/site-packages/scrapy/crawler.py", line 134, in __init__\n self.spider_loader = _get_spider_loader(settings)\n File "/home/dean/website/venv/local/lib/python2.7/site-packages/scrapy/crawler.py", line 330, in _get_spider_loader\n return loader_cls.from_settings(settings.frozencopy())\n File "/home/dean/website/venv/local/lib/python2.7/site-packages/scrapy/spiderloader.py", line 61, in from_settings\n return cls(settings)\n File "/home/dean/website/venv/local/lib/python2.7/site-packages/scrapy/spiderloader.py", line 25, in __init__\n self._load_all_spiders()\n File "/home/dean/website/venv/local/lib/python2.7/site-packages/scrapy/spiderloader.py", line 47, in _load_all_spiders\n for module in walk_modules(name):\n File "/home/dean/website/venv/local/lib/python2.7/site-packages/scrapy/utils/misc.py", line 71, in walk_modules\n submod = import_module(fullpath)\n File "/usr/lib/python2.7/importlib/__init__.py", line 37, in import_module\n __import__(name)\n File "/home/dean/website/venv/local/lib/python2.7/site-packages/dynamic_scraper/spiders/checker_test.py", line 9, in <module>\n from dynamic_scraper.spiders.django_base_spider import DjangoBaseSpider\n File "/home/dean/website/venv/local/lib/python2.7/site-packages/dynamic_scraper/spiders/django_base_spider.py", line 13, in <module>\n django.setup()\n File "/home/dean/website/venv/local/lib/python2.7/site-packages/django/__init__.py", line 22, in setup\n configure_logging(settings.LOGGING_CONFIG, settings.LOGGING)\n File "/home/dean/website/venv/local/lib/python2.7/site-packages/django/conf/__init__.py", line 56, in __getattr__\n self._setup(name)\n File "/home/dean/website/venv/local/lib/python2.7/site-packages/django/conf/__init__.py", line 41, in _setup\n self._wrapped = Settings(settings_module)\n File "/home/dean/website/venv/local/lib/python2.7/site-packages/django/conf/__init__.py", line 110, in __init__\n mod = importlib.import_module(self.SETTINGS_MODULE)\n File "/usr/lib/python2.7/importlib/__init__.py", line 37, in import_module\n __import__(name)\nImportError: No module named IG_Tracker.settings\n
Judging by the last line of the stack trace, there is a problem importing my Django project settings from within my Scrapy project settings. My Scrapy project lives inside one of my Django apps, as recommended by Django Dynamic Scraper.
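For context, the scheduling side in Celery boils down to a POST against Scrapyd's schedule.json endpoint, which is the request visible in the log above. A simplified sketch of such a task (the project and spider names and the use of requests here are placeholders, not my exact code):

import requests
from celery import shared_task

SCRAPYD_URL = 'http://127.0.0.1:6800/schedule.json'  # Scrapyd listening on the port from the setup above

@shared_task
def schedule_scrape():
    # Ask Scrapyd to queue a crawl; it answers with JSON containing a job id or an error
    response = requests.post(SCRAPYD_URL, data={
        'project': 'ig_scraper',     # placeholder project name
        'spider': 'article_spider',  # placeholder spider name
    })
    response.raise_for_status()
    return response.json()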
Here is my Scrapy settings file, where it tries to import the Django settings (and does so successfully in development):
import os
import sys

# Add the Django project to the path and tell Django which settings module to use
sys.path.append('../../../IG_Tracker/')
os.environ['DJANGO_SETTINGS_MODULE'] = 'IG_Tracker.settings'
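If I understand relative sys.path entries correctly, '../../../IG_Tracker/' is resolved against the current working directory of whatever process imports these settings, not against the location of the settings file itself, so under Supervisor it may end up pointing somewhere different than when I run things from the project directory. A hypothetical check like this would show where Python actually looks:

import os

# Both of these depend on the process's working directory, not on this file's location
print(os.getcwd())
print(os.path.abspath('../../../IG_Tracker/'))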
My Scrapyd Supervisor configuration:
[program:scrapyd]
directory=/home/dean/website/instagram/ig_scraper
command=/home/dean/website/venv/bin/scrapyd -n
environment=MY_SETTINGS=/home/dean/website/IG_Tracker/settings.py
user=dean
autostart=true
autorestart=true
redirect_stderr=true
numprocs=1
stdout_logfile=/home/dean/website/scrapyd.log
stderr_logfile=/home/dean/website/scrapyd.log
startsecs=10
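For what it's worth, the traceback above appears to come from the child process Scrapyd spawns (via scrapyd.runner) to load the Scrapy project, and as far as I can tell that child inherits the working directory and environment configured here. A quick way to check whether the Django settings import even works from that directory would be something along these lines (paths copied from the config and settings above; this is only a manual sanity check, not a confirmed diagnosis):

from __future__ import print_function
import importlib
import os
import sys

os.chdir('/home/dean/website/instagram/ig_scraper')  # same directory= as the scrapyd program above
sys.path.append('../../../IG_Tracker/')              # same relative entry as in the Scrapy settings
try:
    importlib.import_module('IG_Tracker.settings')
    print('import ok')
except ImportError as exc:
    print('import failed:', exc)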