tpe-mrt-traffic-etl-serverless

A serverless ETL pipeline deployed with the Serverless Framework

The process consists of 3 functions:

  • mrt_traffic_file_list: checks the existing object files in the assigned S3 bucket and returns a list of them
  • mrt_traffic: gets the official dataset table, which contains the year-month and file URL of each dataset, compares that table with the list received from mrt_traffic_file_list, and downloads the datasets that are not yet in the bucket
  • email_notification: sends a notification email containing the number of files downloaded and the year-month of each downloaded file
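The comparison step shared by the first two functions can be sketched as follows. This is a minimal illustration, not the repository's actual code: the function names, the `mrt_traffic_{year_month}.csv` key scheme, and the shape of the dataset table (a mapping from year-month to URL) are all assumptions. The S3 client is passed in as a parameter and is expected to behave like a boto3 S3 client, whose `list_objects_v2` paginator is the only AWS call used.

```python
from typing import Dict, Iterable, List


def list_existing_keys(s3_client, bucket: str, prefix: str = "") -> List[str]:
    """Return every object key under `prefix` in `bucket`.

    `s3_client` is assumed to be a boto3 S3 client (or a stand-in with the
    same `get_paginator("list_objects_v2")` interface).
    """
    paginator = s3_client.get_paginator("list_objects_v2")
    keys: List[str] = []
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        # Pages for an empty bucket or prefix carry no "Contents" entry.
        for obj in page.get("Contents", []):
            keys.append(obj["Key"])
    return keys


def key_for(year_month: str) -> str:
    """Hypothetical naming scheme for stored files, e.g. '202301'."""
    return f"mrt_traffic_{year_month}.csv"


def files_to_download(
    dataset_table: Dict[str, str], existing_keys: Iterable[str]
) -> Dict[str, str]:
    """Keep only the (year-month -> URL) entries whose file is not in S3 yet."""
    existing = set(existing_keys)
    return {
        year_month: url
        for year_month, url in sorted(dataset_table.items())
        if key_for(year_month) not in existing
    }
```

The size of the dictionary returned by `files_to_download` and its keys are the two values the notification email reports: the number of files downloaded and their year-months.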

Objects in S3:

[Image: taipei_metro]

These functions can be further structured into a data pipeline using AWS Step Functions:

[Image: taipei_metro]
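One way to chain the three functions is a linear state machine written in Amazon States Language. The sketch below builds such a definition as a Python dictionary. The ordering of the steps follows the function descriptions in this README, while the helper name `build_state_machine` and the Lambda ARNs are placeholders, not values taken from this project.

```python
import json
from typing import Dict, List, Tuple


def build_state_machine(steps: List[Tuple[str, str]]) -> Dict:
    """Build a linear Amazon States Language definition.

    `steps` is an ordered list of (state name, Lambda ARN) pairs. Each state
    is a Task that hands its output to the next one; the last state ends
    the execution.
    """
    if not steps:
        raise ValueError("at least one step is required")

    states: Dict[str, Dict] = {}
    for index, (name, arn) in enumerate(steps):
        state: Dict = {"Type": "Task", "Resource": arn}
        if index + 1 < len(steps):
            state["Next"] = steps[index + 1][0]
        else:
            state["End"] = True
        states[name] = state

    return {
        "Comment": "Taipei MRT traffic ETL (illustrative sketch)",
        "StartAt": steps[0][0],
        "States": states,
    }


# Placeholder ARNs: REGION and ACCOUNT_ID must be replaced with real values.
ARN_PREFIX = "arn:aws:lambda:REGION:ACCOUNT_ID:function:"

definition = build_state_machine(
    [
        ("mrt_traffic_file_list", ARN_PREFIX + "mrt_traffic_file_list"),
        ("mrt_traffic", ARN_PREFIX + "mrt_traffic"),
        ("email_notification", ARN_PREFIX + "email_notification"),
    ]
)

if __name__ == "__main__":
    print(json.dumps(definition, indent=2))
```

Running the script prints the JSON definition, which could then be supplied as the state machine definition when the pipeline is deployed.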

Analyze the data in Amazon Athena:

[Image: taipei_metro]