ScrapydWeb: a web UI for Scrapyd cluster management
Introduction
Installation
pip install scrapydweb==1.0.0rc1
[root@server opt]# scrapydweb -h
>>> Main pid: 25327
>>> scrapydweb version: 1.0.0rc1
>>> Run 'scrapydweb -h' to get help
>>> Loading default settings from /usr/local/lib/python2.7/site-packages/scrapydweb/default_settings.py
!!! Overriding custom settings from /opt/scrapydweb_settings_v4.py
usage: scrapydweb [-h] [--bind BIND] [-p PORT] [-ss SCRAPYD_SERVER]
                  [--disable_auth] [--disable_cache] [--delete_cache]
                  [--disable_email] [--debug] [--verbose]

ScrapydWeb -- Full-featured web UI for Scrapyd cluster management, Scrapy log
analysis & visualization

optional arguments:
  -h, --help            show this help message and exit
  --bind BIND           current: 0.0.0.0, note that setting 0.0.0.0 or IP-OF-
                        CURRENT-HOST makes ScrapydWeb server visible
                        externally, otherwise, set 127.0.0.1 to disable that
  -p PORT, --port PORT  current: 5000, the port which ScrapydWeb would run on
  -ss SCRAPYD_SERVER, --scrapyd_server SCRAPYD_SERVER
                        current: ['server', ('username', 'password',
                        'localhost', '6801', 'group')], type '-ss 127.0.0.1
                        -ss username:password@192.168.123.123:6801#group' to
                        set up any number of Scrapyd servers to control.
  --disable_auth        current: False, append '--disable_auth' to disable
                        basic auth for web UI
  --disable_cache       current: False, append '--disable_cache' to disable
                        caching HTML for Log and Stats page in the background
                        periodically
  --delete_cache        current: False, append '--delete_cache' to delete
                        cached HTML files of Log and Stats page at startup
  --disable_email       current: True, append '--disable_email' to disable
                        email notice
  --debug               current: False, append '--debug' to enable debug mode
                        and the debugger would be available in the browser
  --verbose             current: False, append '--verbose' to set logging
                        level to DEBUG for getting more information about how
                        ScrapydWeb works
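As the help text shows, the -ss option accepts servers in the form username:password@host:port#group, which ScrapydWeb displays as a (username, password, host, port, group) tuple. The following is a minimal illustrative sketch of how such a string could be split into that tuple; it is not ScrapydWeb's actual parsing code, and the default port of 6800 is an assumption (Scrapyd's usual default).

```python
# Hypothetical parser for the '-ss' server string format
# 'username:password@host:port#group' -- an illustration only,
# not ScrapydWeb's internal implementation.

def parse_scrapyd_server(s, default_port='6800'):
    """Split a '-ss' argument into (username, password, host, port, group)."""
    auth, group = '', ''
    if '#' in s:                      # optional '#group' suffix
        s, group = s.rsplit('#', 1)
    if '@' in s:                      # optional 'username:password@' prefix
        auth, s = s.rsplit('@', 1)
    username, _, password = auth.partition(':')
    host, _, port = s.partition(':')
    return (username, password, host, port or default_port, group)

print(parse_scrapyd_server('username:password@192.168.123.123:6801#group'))
# ('username', 'password', '192.168.123.123', '6801', 'group')
print(parse_scrapyd_server('127.0.0.1'))
# ('', '', '127.0.0.1', '6800', '')
```

Every part except the host is optional, which matches the two example forms given in the help text ('-ss 127.0.0.1' and the fully qualified form).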
Running scrapydweb -h also generates a settings file in the current directory, e.g. /opt/scrapydweb_settings_v4.py.
Run scrapydweb to start the server.
Settings file: /opt/scrapydweb_settings_v4.py
SCRAPYDWEB_BIND = '0.0.0.0'
SCRAPYDWEB_PORT = 4998
DISABLE_AUTH = False
USERNAME = 'root'
PASSWORD = '123456'
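The options above can also live alongside the Scrapyd servers to control, so the servers do not have to be passed with -ss on every start. The fragment below is a sketch of such a settings file; the SCRAPYD_SERVERS name and its string format are assumptions based on the -ss help text, so verify them against the generated default_settings.py.

```python
# /opt/scrapydweb_settings_v4.py -- a minimal sketch, not a complete file.

SCRAPYDWEB_BIND = '0.0.0.0'   # listen on all interfaces (visible externally)
SCRAPYDWEB_PORT = 4998

# Basic auth for the web UI -- keep enabled when binding to 0.0.0.0
DISABLE_AUTH = False
USERNAME = 'root'
PASSWORD = '123456'

# Scrapyd servers to control (assumed setting name; entries use the same
# 'username:password@host:port#group' format as the -ss option)
SCRAPYD_SERVERS = [
    '127.0.0.1:6800',
]
```

With DISABLE_AUTH = False, the USERNAME/PASSWORD pair protects the web UI with HTTP basic auth, which matters once SCRAPYDWEB_BIND is 0.0.0.0.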
Install Scrapy
pip install Scrapy