jdanprad0/INE5454-Topicos-Especiais-em-Gerencia-de-Dados


INE5454-Topicos-Especiais-em-Gerencia-de-Dados

This application is a crawler that extracts data from the Wikipedia pages for São Paulo, Rio de Janeiro, and Minas Gerais.

Running the spider produces the files below, which contain information about the cities of each of these states:

minas_gerais.csv and minas_gerais.json, rio_de_janeiro.csv and rio_de_janeiro.json, sao_paulo.csv and sao_paulo.json
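The generated files can be read back with the Python standard library. The sketch below is only an illustration: `load_cities` is a hypothetical helper that is not part of this repository, and it assumes that each JSON file holds a single array of objects and that each CSV file starts with a header row. The field names inside the records are not documented here, so the example does not rely on any of them.

```python
import csv
import json
from pathlib import Path


def load_cities(path):
    """Load city records from a .json or .csv file into a list of dicts.

    Hypothetical helper for illustration; not part of the repository.
    """
    path = Path(path)
    if path.suffix == ".json":
        # Assumption: the file contains one JSON array of objects.
        with path.open(encoding="utf-8") as f:
            return json.load(f)
    if path.suffix == ".csv":
        # Assumption: the first row is a header. All values are read as strings.
        with path.open(encoding="utf-8", newline="") as f:
            return list(csv.DictReader(f))
    raise ValueError(f"unsupported file type: {path.suffix}")


if __name__ == "__main__":
    # Report how many records each state file holds, if the file exists.
    for name in ("minas_gerais", "rio_de_janeiro", "sao_paulo"):
        json_path = Path(f"{name}.json")
        if json_path.exists():
            cities = load_cities(json_path)
            print(f"{name}: {len(cities)} records")
```

Values loaded from the CSV files are always strings, so numeric fields would need to be converted explicitly, while the JSON files keep whatever types the spider wrote.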

To run the application, first create and activate a virtual environment:

sudo apt-get install python3-venv
python3 -m venv webscraping
source webscraping/bin/activate

Install the dependencies with the following command:

make install

Then, to run the spider, type the following command:

make run

or

scrapy crawl cities
