MongoDB only created two collections, Fans and Follows, and the crawl keeps showing 302 without scraping any data. The login reports success and the cookie is obtained successfully (the account name is redacted in the logs). Could anyone help? Many thanks!
Login output:
2017-11-07 10:45:58 [Sina_spider1.cookies] WARNING: Get Cookie Success!( Account:我是马赛克 )
2017-11-07 10:45:58 [urllib3.connectionpool] DEBUG: Starting new HTTPS connection (1): login.sina.com.cn
2017-11-07 10:45:58 [urllib3.connectionpool] DEBUG: https://login.sina.com.cn:443 "POST /sso/login.php?client=ssologin.js(v1.4.18) HTTP/1.1" 200 None
2017-11-07 10:45:58 [Sina_spider1.cookies] WARNING: Get Cookie Success!( Account:我是马赛克 )
Output while crawling:
2017-11-07 10:46:00 [scrapy.downloadermiddlewares.redirect] DEBUG: Redirecting (302) to <GET https://weibo.cn/5235640836/follow> from <GET http://weibo.cn/5235640836/follow>
2017-11-07 10:46:40 [scrapy.downloadermiddlewares.redirect] DEBUG: Redirecting (302) to <GET https://weibo.cn/5235640836/fans> from <GET http://weibo.cn/5235640836/fans>
2017-11-07 10:46:59 [scrapy.extensions.logstats] INFO: Crawled 0 pages (at 0 pages/min), scraped 0 items (at 0 items/min)
You just need to change every http in the spider to https.
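The log above shows each http://weibo.cn request being redirected (302) to the same path over https, so one way to apply this suggestion is to build the request URLs with the https scheme from the start. The following is only a minimal sketch in Python 3, not the project's actual code: the helper name force_https and the start_ids list are hypothetical, and the URL pattern is copied from the log.

```python
from urllib.parse import urlsplit, urlunsplit


def force_https(url):
    """Return url with an http scheme upgraded to https; other URLs are unchanged."""
    parts = urlsplit(url)
    if parts.scheme == "http":
        parts = parts._replace(scheme="https")
    return urlunsplit(parts)


# Hypothetical example: the user ID comes from the log in the question.
start_ids = ["5235640836"]
start_urls = [force_https("http://weibo.cn/%s/follow" % uid) for uid in start_ids]
```

In practice, editing the URL strings in the spider by hand has the same effect; a helper like this is only useful if the URLs are assembled in several places.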
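1. I have changed every http in the spider to https.
2. I modified the getcookie function: browser = webdriver.PhantomJS(executable_path=r"D:\java\Python27\我是路径马赛克\phantomjs.exe")
3. The cookie is obtained normally: Get Cookies Finish!( Num:1)
4. Environment: Windows 7 64-bit, 4 GB RAM
However, after the cookie is obtained, a system error dialog pops up saying "python.exe has stopped working". The details are:
Problem signature:
Problem Event Name: BEX
Application Name: python.exe
Application Version: 0.0.0.0
Application Timestamp: 4c303241
Fault Module Name: MSVCR90.dll
Fault Module Version: 9.0.30729.6161
Fault Module Timestamp: 4dace5b9
Exception Offset: 00066d03
Exception Code: c0000417
Exception Data: 00000000
OS Version: 6.1.7601.2.1.0.256.1
Locale ID: 2052
Additional Information 1: abf7
Additional Information 2: abf7f34af3b04ddccc0d33fe401c1c02
Additional Information 3: 79a5
Additional Information 4: 79a5afb460eb4649151b9562e857bf2f
The program itself does not report any error. How should I deal with this? Any advice is appreciated.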
I am not using browser = webdriver.PhantomJS; I use the Firefox driver instead.