Automatic Web Scraping of Spanish Public TV Films

This repository automates the collection of data on films shown on Spanish public TV, known as TDT. It uses rvest, the R package for web scraping, together with GitHub Actions.

The data comes from the following website.

The website is updated every day and provides the film title (both the original and the Spanish version), the genre, a brief synopsis, the TV channel, and the broadcast day and time.

The .github/workflows folder contains the .yaml file that tells GitHub Actions to scrape the data automatically by running an R script (TDT_film_scraper.R).
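
A scheduled workflow of this kind could look roughly like the sketch below. It is not the actual workflow file from this repository: the workflow name, the cron schedule, the way rvest is installed, and the assumption that the script writes its output into the data folder are all illustrative.

```yaml
# Hypothetical sketch of a scheduled scraping workflow (not the repository's real file).
name: scrape-tdt-films

on:
  schedule:
    # Cron fields: minute, hour, day of month, month, day of week (times are UTC).
    # Assumed schedule: once a day at 06:00 UTC.
    - cron: "0 6 * * *"
  workflow_dispatch: # also allows a manual run from the Actions tab

jobs:
  scrape:
    runs-on: ubuntu-latest
    permissions:
      contents: write # needed so the job can push the updated data
    steps:
      - uses: actions/checkout@v4

      - uses: r-lib/actions/setup-r@v2
        with:
          use-public-rspm: true

      # Assumption: rvest is the only package the script needs.
      - uses: r-lib/actions/setup-r-dependencies@v2
        with:
          packages: any::rvest

      - name: Run the scraper
        run: Rscript TDT_film_scraper.R

      # Assumption: the script saves its results under data/.
      - name: Commit and push new data
        run: |
          git config user.name "github-actions"
          git config user.email "github-actions@users.noreply.github.com"
          git add data
          git diff --cached --quiet || git commit -m "Update film data"
          git push
```

The commit step only creates a commit when the scraper has actually changed something under data, so a day with no new films leaves the history untouched.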

A quick report

If you want to see a report based on this data, just click the link.

Some useful resources

Automate Web Scraping with GitHub Actions: video tutorial.
Link to the repository used in the video tutorial: repository.
Cron expressions needed to schedule the autoscrape: web.

GuilleDiaz7/Automatic-Web-Scraping-of-Spanish-TDT-Films
