This project aims to provide an endpoint on top of newspaper3k (https://newspaper.readthedocs.io/en/latest/) that lets users summarize news articles. It also has a home page that shows summarized news from major news sites.
- Implement based on the UML design
- Scrape engine that creates article entries in the database daily from popular news sites
- Web page that lists the latest article summaries from popular news sites
- Scheduled jobs, run periodically with Celery, to scrape news sites
- Reddit bot
- Twitter bot
- Caching in Redis and the database
- Content-based summary API endpoint (accepts article text rather than a URL)
- Browser plugins (Chrome, Safari) that talk to this endpoint
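
The caching and content-based summary items above could fit together roughly as follows. This is only a minimal sketch, not the project's actual implementation: `naive_summary`, `summarize_content`, and the `summary:` key prefix are hypothetical names, the summarizer is a stand-in that keeps the first few sentences (it is not newspaper3k's algorithm), and a plain dict stands in for Redis or the database.

```python
import hashlib
import re
from typing import Callable, MutableMapping


def naive_summary(text: str, max_sentences: int = 2) -> str:
    """Stand-in summarizer (assumption): keep the first few sentences."""
    # Split on whitespace that follows sentence-ending punctuation.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    return " ".join(sentences[:max_sentences])


def summarize_content(
    text: str,
    cache: MutableMapping[str, str],
    summarizer: Callable[[str], str] = naive_summary,
) -> str:
    """Return the summary for `text`, computing it only on a cache miss."""
    # Key on a hash of the content itself, since there is no URL to key on.
    key = "summary:" + hashlib.sha256(text.encode("utf-8")).hexdigest()
    if key in cache:
        return cache[key]
    summary = summarizer(text)
    cache[key] = summary
    return summary


if __name__ == "__main__":
    cache: dict = {}  # hypothetical in-memory stand-in for Redis
    article = "Markets rose today. Analysts were surprised. More at ten."
    print(summarize_content(article, cache))
    print(summarize_content(article, cache))  # second call is served from the cache
```

Keying the cache on a hash of the text means identical content submitted by different clients (for example, the browser plugins) is summarized only once. A real summarizer can be swapped in through the `summarizer` argument.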