This Python web scraping project is developed for learning and exploration purposes, with no malicious intentions. It aims to provide a practical application of web scraping techniques using Selenium and BeautifulSoup, focusing on extracting data from the Google Summer of Code (GSoC) portal.
Click here to view sample data
- org_name
- org_description
- technology
- topics
- official_link
- gsoc_link
- Organization Name
- Official Link
- GSoC Link
- Project Details Link
- Technology Used
- Project Topics
- Project Details Description
- Programming Language: Python
- Core Tools: Selenium, BeautifulSoup
- Database / Visualization: Google Sheets and JSON
- Scrape Data: The script scrapes data from the GSoC portal for a given year.
- Data Properties: It can generate data for all organizations along with their projects for a specific year, including organization names, descriptions, project details, and relevant links.
- Data Output: It provides output in both JSON and CSV formats, facilitating data analysis and visualization.
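As a rough illustration of the JSON and CSV output step, the sketch below serializes records that use the field names listed earlier (`org_name`, `org_description`, `technology`, `topics`, `official_link`, `gsoc_link`). It is not the project's actual code: the helper names `to_json` and `to_csv`, the choice to store `technology` and `topics` as lists, the `"; "` separator, and the example record are all assumptions made for illustration. The scraping step itself (Selenium and BeautifulSoup) is left out.

```python
import csv
import io
import json

# Field names taken from the sample-data list in this README.
FIELDS = [
    "org_name",
    "org_description",
    "technology",
    "topics",
    "official_link",
    "gsoc_link",
]


def to_json(records):
    """Serialize records as a pretty-printed JSON array (hypothetical helper)."""
    return json.dumps(records, indent=2, ensure_ascii=False)


def to_csv(records):
    """Serialize records as CSV text (hypothetical helper).

    List values are flattened with "; " so each record fits on one row.
    """
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    for record in records:
        row = {}
        for field in FIELDS:
            value = record.get(field, "")
            row[field] = "; ".join(value) if isinstance(value, list) else value
        writer.writerow(row)
    return buffer.getvalue()


# Placeholder record for illustration only; this is not real scraped data.
example_records = [
    {
        "org_name": "Example Org",
        "org_description": "A placeholder organization.",
        "technology": ["python", "django"],
        "topics": ["web", "education"],
        "official_link": "https://example.org",
        "gsoc_link": "https://example.org/gsoc",
    }
]

if __name__ == "__main__":
    print(to_json(example_records))
    print(to_csv(example_records))
```

Building the CSV in memory keeps the helpers easy to test; in practice the returned strings could be written to files or pasted into a spreadsheet.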
This project is intended solely for learning and exploration, not for any malicious activity. Use the scraping tool responsibly and in compliance with website terms of service and applicable laws.