Links that read-art cannot crawl #1
Comments
Hi! I'm using your module in my web crawler, called Web page Content Extractor (wce), and I recently discovered that read-art returns "Error: 400 Bad Request" for these URLs, whereas node-readability handles them without any problem. Could you please check them?
Hi @mxr576, thanks a lot, there is a bug of setting
Thanks for the fast reaction! I also suspected that this was a req-fast issue. I can confirm that content extraction now works fine on these links with read-art.
@Tjatse, for the URL http://mp.weixin.qq.com/s?__biz=MjYyMzc1Mjk4MA==&mid=400815255&idx=1&sn=d91b630394b8ba70209406bbf44b41e8&scene=0#wechat_redirect (an article made up of pictures), the result is:
https://medium.com/google-developers/drawing-a-rounded-corner-background-on-text-5a610a95af5