This repository contains two Jupyter notebooks and two .csv files.
One notebook contains the code to retrieve news articles from the Ministry of Education website for the period 2019-2022, together with the code used to process the data and extract topics.
The other notebook contains the code to retrieve automatically generated subtitles from a collection of news videos on a YouTube channel, together with the code used to process the data and extract topics.
One .csv file contains the original subtitles extracted from YouTube, along with the data resulting from each preprocessing step.
The other .csv file contains the original news articles extracted from the Ministry of Education website (1480 articles available; last accessed 03/06/2023), along with the data resulting from each preprocessing step.
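
To take a quick look at either .csv file before opening the notebooks, something like the following sketch may help. It uses only the Python standard library and makes several assumptions that are not confirmed by this repository: the files are comma-delimited, UTF-8 encoded, and have a header row. The function name `preview_csv` is hypothetical, and the file path is left as a parameter because the file names are not given here. Very long text fields may require raising `csv.field_size_limit`.

```python
import csv
from pathlib import Path


def preview_csv(path, n_rows=3):
    """Return the column names and the first n_rows rows of a CSV file.

    Assumes a comma-delimited, UTF-8 file with a header row (not verified
    against the actual files in this repository).
    """
    with Path(path).open(newline="", encoding="utf-8") as handle:
        reader = csv.DictReader(handle)
        rows = []
        for row in reader:
            if len(rows) >= n_rows:
                break
            rows.append(row)
        # fieldnames is None when the file is empty, so fall back to [].
        return reader.fieldnames or [], rows
```

Calling `preview_csv` with the path to one of the two files returns the list of column names and a few rows as dictionaries, which should be enough to see which columns hold the original text and which hold the output of each preprocessing step.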