This looks like an interesting way to speed things up. I have not looked at it closely yet; I hope it looks at more than the URL ;)
This is a Scrapy spider middleware to ignore requests to pages containing items seen in previous crawls of the same spider, thus producing a "delta crawl" containing only new items.
This also speeds up the crawl by reducing the number of requests that need to be crawled and processed (typically, item requests are the most CPU-intensive).
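To make the idea concrete, here is a rough, dependency-free sketch of a delta filter. It is not the plugin's actual code: `request_key`, `DeltaFilter`, and the `("item", ...)` / `("request", ...)` tuples are hypothetical names invented for illustration, and hashing the method, URL, and body together is only an assumption about what "more than the URL" could mean.

```python
import hashlib


def request_key(method, url, body=b""):
    """Hypothetical key: hash the method, URL and body, not the URL alone."""
    digest = hashlib.sha1()
    for part in (method.upper().encode(), url.encode(), body):
        digest.update(part)
        digest.update(b"\x00")  # separator so adjacent parts cannot run together
    return digest.hexdigest()


class DeltaFilter:
    """Toy stand-in for a delta-crawl spider middleware (illustration only)."""

    def __init__(self, seen=None):
        # Keys of pages that produced items in earlier crawls.
        self.seen = set(seen or ())

    def process_output(self, response_key, results):
        """Filter what a callback produced for the page identified by response_key."""
        for kind, payload in results:
            if kind == "item":
                # This page yielded an item, so remember it for later crawls.
                self.seen.add(response_key)
                yield kind, payload
            elif payload not in self.seen:
                # A request (payload is its key) to a page with no recorded item.
                yield kind, payload


page_a = request_key("GET", "https://example.com/item/a")
page_b = request_key("GET", "https://example.com/item/b")

# First crawl: page A yields an item, so its key is recorded.
first = DeltaFilter()
list(first.process_output(page_a, [("item", {"name": "a"})]))

# Second crawl: a listing page links to A and B; only B is still requested.
second = DeltaFilter(seen=first.seen)
kept = list(second.process_output("listing", [("request", page_a), ("request", page_b)]))
```

In this sketch a page is skipped only if it actually produced an item in an earlier run, so listing pages are still re-crawled and can surface new links. A real middleware would also need to persist the seen keys between runs, which is left out here.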
https://github.com/scrapy-plugins/scrapy-deltafetch