simpleCrawler - A Simple Web Crawler

A simple, multi-threaded web crawler written in Python.

Full disclosure: Credit goes to Bucky Roberts for developing the Python tutorials on his website and introducing the idea of the web crawler/web spider.

This project served as an opportunity to familiarize myself with Python and expand upon an existing project.

TODO

  • Make the crawler obey each site's robots.txt file (done!)
  • Update domain.py to accept more link formats (e.g. example.com instead of only https://www.example.com/) (done!)
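
For anyone curious how the two TODO items above might look in code, here is a minimal sketch using only the standard library. It is not necessarily how this project does it: the function names (`normalize_url`, `get_domain_name`, `build_robot_parser`, `is_allowed`), the `simpleCrawler` user agent string, and the choice to default bare domains to `https` are all assumptions made for illustration.

```python
from urllib.parse import urlparse
from urllib.robotparser import RobotFileParser


def normalize_url(link, default_scheme="https"):
    """Return a full URL, adding a scheme when the input is a bare domain.

    Assumption: links without '://' are treated as bare domains or paths
    on a domain, e.g. 'example.com' or 'example.com/about'.
    """
    link = link.strip()
    if "://" not in link:
        link = f"{default_scheme}://{link}"
    return link


def get_domain_name(link):
    """Return the host part of a link, without a leading 'www.'."""
    host = urlparse(normalize_url(link)).hostname or ""
    return host[4:] if host.startswith("www.") else host


def build_robot_parser(robots_txt):
    """Build a parser from robots.txt text that was already downloaded."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def is_allowed(parser, url, user_agent="simpleCrawler"):
    """Check whether the given user agent may fetch the URL."""
    return parser.can_fetch(user_agent, normalize_url(url))
```

A crawler could call `is_allowed` before queueing each link and skip anything the site disallows. For example, with the rules `User-agent: *` and `Disallow: /private/`, a link such as `example.com/index.html` would be allowed while `https://www.example.com/private/page` would be skipped. To load the rules from a live site instead of a string, `RobotFileParser` also offers `set_url()` followed by `read()`.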