Ease off on crawling #36

ojongerius · 2017-10-30T01:19:26Z

We crawl once a day. For bigger sites this can mean visiting all the job URLs, most of which we might have seen already. Some solutions:

Fetch only new jobs (newer than 24 or 28 hours).
Save job URLs (we can search, or add functionality to search on URL to the REST API), and check when each one was last seen.
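A minimal sketch of the second option, assuming we keep a mapping from job URL to the time it was last seen. The function name `urls_to_visit`, the `last_seen` dict, and the `max_age` parameter are hypothetical and not part of the existing crawler or REST API:

```python
from datetime import datetime, timedelta, timezone


def urls_to_visit(candidate_urls, last_seen, now, max_age=timedelta(hours=24)):
    """Return the URLs that were never seen, or were last seen more than max_age ago.

    last_seen is a hypothetical dict mapping URL -> datetime of the last visit;
    in practice it could be backed by a URL search on the REST API.
    """
    due = []
    for url in candidate_urls:
        seen_at = last_seen.get(url)
        if seen_at is None or now - seen_at > max_age:
            due.append(url)
    return due


# Example usage with made-up URLs
now = datetime(2017, 10, 30, tzinfo=timezone.utc)
last_seen = {
    "https://example.com/jobs/1": now - timedelta(hours=1),
    "https://example.com/jobs/2": now - timedelta(hours=30),
}
candidates = [
    "https://example.com/jobs/1",
    "https://example.com/jobs/2",
    "https://example.com/jobs/3",
]
due = urls_to_visit(candidates, last_seen, now)
```

With these inputs, only the second URL (seen 30 hours ago) and the third (never seen) would be fetched again.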

Merge PR with a solution
Re-enable scheduled jobs

ojongerius · 2017-10-31T01:09:27Z

Until we can update objects (#26), I decided not to revisit URLs that are already associated with an existing job. This covers most of our use cases: we save a lot of unnecessary traffic, and I don't expect objects to change often.
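Roughly, the idea looks like the sketch below. It is an illustration only, not the code from the linked PR; `filter_new_job_urls` and both of its arguments are hypothetical names, and how the set of known job URLs is obtained (for example through the REST API) is left open:

```python
def filter_new_job_urls(listing_urls, known_job_urls):
    """Return the URLs from a listing that are not yet tied to an existing job.

    known_job_urls is assumed to be an iterable of URLs already stored
    with a job; order of listing_urls is preserved.
    """
    known = set(known_job_urls)
    fresh = []
    for url in listing_urls:
        if url not in known:
            known.add(url)  # also drops duplicates within the same listing
            fresh.append(url)
    return fresh


# Example usage with made-up URLs
listing = [
    "https://example.com/jobs/1",
    "https://example.com/jobs/2",
    "https://example.com/jobs/1",
    "https://example.com/jobs/3",
]
existing = ["https://example.com/jobs/2"]
to_crawl = filter_new_job_urls(listing, existing)
```

The trade-off is the one mentioned above: a job whose page changes after it was first created will not be picked up until objects can be updated.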

ojongerius added the enhancement and help wanted labels Oct 30, 2017

ojongerius closed this as completed Oct 31, 2017

ojongerius mentioned this issue Oct 31, 2017

Stop visiting job urls that have already been created #40

Merged
