Weedmaps scraper module for Nest (a web-scraping framework for Node.js)

Weedmaps Scraper Module for Nest

Scrapes every strain listed on Weedmaps so you can organize the results by price and location.

Requirements

  • MongoDB up and running
  • Node.js (with npm)

Installation

git clone https://github.com/dsalehipour/nest-weedmaps.git
cd nest-weedmaps
npm install

Also, make sure MongoDB is up and running. See Install MongoDB.

Usage

  1. Scrape Weedmaps by running node index.js

What's happening?

After you run index.js, the workers (scraper bots) go to the strains directory, scrape the 40 strains in the grid, store the scraped items in the database, and queue a scraping job for each strain's page, identified by its href. They then move to the next page of the strains directory and repeat.

Meanwhile, other workers pick up the queued jobs, scrape the individual strain pages, and update the matching item in the database, looked up by its href.
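The two-phase flow described above can be illustrated with a minimal, self-contained sketch. It does not use Nest's API or MongoDB: the `db` map, the `queue` array, the `directory` and `detailPages` fixtures, and both function names are hypothetical stand-ins chosen for illustration. The sketch also runs the two phases one after the other, whereas the real workers run concurrently.

```javascript
// Hypothetical in-memory stand-ins (assumptions, not Nest's API):
// the real module stores items in MongoDB and uses Nest's own job queue.
const db = new Map(); // href -> item document
const queue = [];     // pending strain-page jobs

// Made-up strains directory with two pages (the real grid holds 40 per page).
const directory = [
  [
    { href: '/strains/a', name: 'A' },
    { href: '/strains/b', name: 'B' },
  ],
  [{ href: '/strains/c', name: 'C' }],
];

// Made-up strain pages, keyed by href.
const detailPages = {
  '/strains/a': { description: 'details for A' },
  '/strains/b': { description: 'details for B' },
  '/strains/c': { description: 'details for C' },
};

// Directory worker: store each grid item, queue a job for its page,
// then continue with the next directory page.
function scrapeDirectory() {
  for (const page of directory) {
    for (const item of page) {
      db.set(item.href, { ...item });
      queue.push({ href: item.href });
    }
  }
}

// Strain-page worker: take a job, "scrape" the page, and merge the
// result into the stored item that has the same href.
function runDetailWorker() {
  while (queue.length > 0) {
    const job = queue.shift();
    const existing = db.get(job.href);
    db.set(job.href, { ...existing, ...detailPages[job.href] });
  }
}

scrapeDirectory();
runDetailWorker();
console.log([...db.values()]);
```

Using the href as the key in both phases is what lets the second pass update an existing item instead of creating a duplicate.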

Try inspecting the scraped data in the mongo shell:

mongo nest
> db.items.count()
> db.items.find().pretty()

Have fun.
