istiakshihab/bn-newspaper-crawler

Bangla Newspaper and News Crawler

Bangla Newspaper Crawlers written using Scrapy.

Description

Newspapers covered so far:

- Amader Shomoy
- Bangladesh Today
- Bangla Tribune
- BDNews24
- Daily Nayadiganta
- Bangla Dailystar
- Ittefaq
- Janakantha
- Kalerkantho
- TBS Bangla
- Prothom Alo

Getting Started

Dependencies

  • Scrapy
  • Newspaper3k (modified)

Installing

Optional: create and activate a dedicated conda environment:

conda create -n scrapy-env python=3.9
conda activate scrapy-env

Install the dependencies:

pip install -r requirements.txt

Executing program

python main.py
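
The contents of main.py are not shown here, so the following is only a hedged sketch of how a script could run every crawler in a Scrapy project in one process. It uses Scrapy's `CrawlerProcess` and `get_project_settings`; the names `select_spiders` and `run_spiders` are hypothetical helpers introduced for illustration and are not taken from this repository.

```python
"""Hypothetical runner sketch; not the repository's actual main.py."""


def select_spiders(available, wanted=None):
    """Return the spider names to run, in the order the project lists them."""
    if not wanted:
        return list(available)
    unknown = sorted(set(wanted) - set(available))
    if unknown:
        raise ValueError(f"unknown spiders: {', '.join(unknown)}")
    return [name for name in available if name in wanted]


def run_spiders(wanted=None):
    """Schedule the selected spiders and block until all of them finish."""
    # Imported lazily so select_spiders can be used without Scrapy installed.
    from scrapy.crawler import CrawlerProcess
    from scrapy.utils.project import get_project_settings

    # get_project_settings() reads the project's settings module, so this
    # must be called from inside the Scrapy project directory.
    process = CrawlerProcess(get_project_settings())
    for name in select_spiders(process.spider_loader.list(), wanted):
        process.crawl(name)
    process.start()
```

Calling `run_spiders()` with no arguments would schedule every spider the project defines; passing a list of names restricts the run to those spiders and raises `ValueError` for names the project does not know.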

To run an individual crawler:

scrapy crawl <crawler-name> -o output.[csv|json|jsonl]
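
Once a crawl has finished, the output file can be loaded back into Python for inspection. The sketch below assumes the file was written by Scrapy's standard feed exports, where `.jsonl` holds one JSON object per line, `.json` holds a single JSON array, and `.csv` has a header row. The function name `load_items` is hypothetical, and no particular item fields are assumed.

```python
import csv
import json
from pathlib import Path


def load_items(path):
    """Load scraped items from a .csv, .json, or .jsonl feed export."""
    path = Path(path)
    suffix = path.suffix.lower()
    if suffix == ".jsonl":
        # One JSON object per line; blank lines are skipped.
        with path.open(encoding="utf-8") as fh:
            return [json.loads(line) for line in fh if line.strip()]
    if suffix == ".json":
        # The whole file is a single JSON array of items.
        with path.open(encoding="utf-8") as fh:
            return json.load(fh)
    if suffix == ".csv":
        # The header row supplies the field names; all values are strings.
        with path.open(encoding="utf-8", newline="") as fh:
            return list(csv.DictReader(fh))
    raise ValueError(f"unsupported feed format: {suffix}")
```

For example, `len(load_items("output.jsonl"))` would give the number of items a crawl produced. The `.jsonl` format is usually the most convenient for large crawls because each line can be parsed on its own.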

Authors

Istiak Shihab ([email protected])

Version History

  • 0.1
    • Initial Release

License

This project is licensed under the GPLv2.0 License. See the LICENSE.md file for details.

Acknowledgments

About

Bangla newspaper crawler. Crawls over 10 newspapers, with more to be added.
