🔍 A simple, fast tool that searches scholarly search engines for keywords and retrieves relevant academic information, including titles, authors, and abstracts. It then ranks each result with a predefined scoring function, which users can tune to their needs.
- 🌐 Keyword search on Google Scholar.
- 📑 Extraction of titles, authors, and abstracts.
- 🛡️ Implement proxies to prevent blocking.
- 💬 Implement an API for easier handling.
- 💬 Develop a custom ChatGPT interface for the scraper.
- 📄 Implement a scoring function to rank the papers.
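To make the ranking idea concrete, here is a minimal sketch of what a tunable scoring function could look like. It is an illustration only, not the project's actual implementation: the `Paper` class, the `score_paper` and `rank_papers` functions, and the default weights are all hypothetical. It simply counts keyword occurrences and weights title hits more heavily than abstract hits.

```python
from dataclasses import dataclass


@dataclass
class Paper:
    # Hypothetical record shape; the real scraper may store results differently.
    title: str
    authors: list[str]
    abstract: str


def score_paper(paper: Paper, keyword: str,
                title_weight: float = 2.0,
                abstract_weight: float = 1.0) -> float:
    """Score a paper by case-insensitive keyword occurrences.

    The weights are assumptions chosen for illustration; adjust them to
    change how much a title match counts relative to an abstract match.
    """
    needle = keyword.strip().lower()
    if not needle:
        # An empty keyword matches nothing rather than everything.
        return 0.0
    title_hits = paper.title.lower().count(needle)
    abstract_hits = paper.abstract.lower().count(needle)
    return title_weight * title_hits + abstract_weight * abstract_hits


def rank_papers(papers: list[Paper], keyword: str) -> list[Paper]:
    """Return a new list sorted from highest to lowest score."""
    return sorted(papers, key=lambda p: score_paper(p, keyword), reverse=True)


if __name__ == "__main__":
    # Placeholder records, made up purely for this example.
    demo = [
        Paper("Graph theory basics", ["B. Author"], "No relation to the query."),
        Paper("Machine learning for chemistry", ["A. Author"],
              "We apply machine learning to reaction prediction."),
    ]
    for paper in rank_papers(demo, "machine learning"):
        print(score_paper(paper, "machine learning"), paper.title)
```

Passing different `title_weight` and `abstract_weight` values is one simple way a user could tune the ranking without touching the rest of the code.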
- Python 3.x
- BeautifulSoup
# Clone the repository and install its dependencies
git clone https://github.com/amirbabaei97/paper_scrapper
cd paper_scrapper
pip install -r requirements.txt
🚀 How to use the tool:
# Example: search for a keyword
python scraper.py --keyword "machine learning"
Output format: results are returned as structured JSON.
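Because the output is JSON, it can be consumed from another Python script. The sketch below assumes the output is a JSON array of objects with `title`, `authors`, and `abstract` keys; that schema is a guess based on the fields listed above, and the `load_results` and `summarize` helpers are hypothetical, so check the actual output before relying on it.

```python
import json

# Made-up sample standing in for the scraper's output; the real schema may differ.
sample_output = """
[
  {"title": "Example Paper A",
   "authors": ["A. Author", "B. Author"],
   "abstract": "Placeholder abstract."},
  {"title": "Example Paper B",
   "authors": ["C. Author"],
   "abstract": "Another placeholder abstract."}
]
"""


def load_results(raw: str) -> list[dict]:
    """Parse the JSON text and check that it is a list of records."""
    results = json.loads(raw)
    if not isinstance(results, list):
        raise ValueError("expected a JSON array of paper records")
    return results


def summarize(record: dict) -> str:
    """Build a one-line summary, tolerating missing keys."""
    title = record.get("title", "<untitled>")
    authors = ", ".join(record.get("authors", []))
    return f"{title} ({authors})"


if __name__ == "__main__":
    for record in load_results(sample_output):
        print(summarize(record))
```

Using `dict.get` with defaults keeps the script from failing if a record lacks one of the assumed fields.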
🚧 Future enhancements:
- Integration with additional scientific paper search engines:
- Google Scholar
- arXiv
- Semantic Scholar
- OpenReview
- Science.gov
- core.ac.uk
- ScienceDirect
- PubMed
- Scopus
🤝 We welcome contributions!
- Please fork the repository and open a PR that explains your changes.
- For major changes, please open an issue first to discuss what you would like to change.
📄 This project is licensed under the GNU General Public License.
- Hat tip to ChatGPT for its help during development.
- Thank you to arXiv for use of its open access interoperability.
- Thank you to Semantic Scholar for providing a free API key for this project.