This project uses the Traffy Fondue API to obtain real-time information on problems reported through the Traffy Fondue website. Our multilabel classification model then predicts the categories of the images attached to each reported problem. Finally, the results are sent through an API to a Power BI dashboard for visualization.
Model deployment source code: https://github.com/AkiraSitdhi/TofuFondue-API
Visualization (PowerBI): https://shorturl.at/eIJR4
Model REST API: https://tofu-api-nj2eo5v2pq-as.a.run.app
Made as the final project for 2110446 Data Science and Data Engineering, semester 2/2022.
- Web scraping: scraped additional images to enlarge the training dataset
- Implemented GET requests to the Traffy Fondue API
- Airflow: used to build the project pipeline, which consists of 4 tasks:
  - Use an API GET request to fetch the reported problems in real time (daily)
  - Call the REST API of our deployed model to predict on the images obtained in the previous task
  - Send the prediction results and problem information to a Power BI streaming dataset for visualization
  - Clear all metadata in the Airflow XComs database
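
The four pipeline tasks can be sketched in plain Python as follows. This is a minimal illustration, not the project's actual DAG: the function name `run_daily_pipeline`, the injected callables, and the `photo_url` and `predicted` field names are all hypothetical, and a dictionary stands in for Airflow XComs. In Airflow, each step would typically be its own task (for example a `PythonOperator`) exchanging data through XComs.

```python
from typing import Callable, Dict, List

Report = Dict[str, str]


def run_daily_pipeline(
    fetch_reports: Callable[[], List[Report]],
    predict_labels: Callable[[str], List[str]],
    push_rows: Callable[[List[Report]], None],
) -> List[Report]:
    """Run the four tasks in order; `xcom` is a stand-in for Airflow XComs."""
    xcom: Dict[str, list] = {}

    # Task 1: fetch the reported problems (a GET request in the real pipeline).
    xcom["reports"] = fetch_reports()

    # Task 2: ask the deployed model for labels, once per report image.
    # "photo_url" is a hypothetical field name, not taken from the real API.
    xcom["labels"] = [predict_labels(r["photo_url"]) for r in xcom["reports"]]

    # Task 3: merge reports with predictions and push them to the dashboard.
    rows = [
        {**report, "predicted": ", ".join(labels)}
        for report, labels in zip(xcom["reports"], xcom["labels"])
    ]
    push_rows(rows)

    # Task 4: clear the intermediate data shared between tasks.
    xcom.clear()
    return rows


if __name__ == "__main__":
    sent: List[Report] = []
    run_daily_pipeline(
        fetch_reports=lambda: [{"id": "1", "photo_url": "https://example.invalid/a.jpg"}],
        predict_labels=lambda url: ["road", "flooding"],
        push_rows=sent.extend,
    )
    print(sent)
```

Passing the fetch, predict, and push steps in as callables keeps the sketch runnable without network access and makes each step easy to replace with a real HTTP call.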
- Multilabel image classification:
  - Train: trained a neural network model on 9376 images sourced from Traffy Fondue and from the scraped images of the Data Engineering part. The images are classified into 10 categories: sanitary, sewer, stray, canal, light, flooding, electric, traffic, road, and sidewalk.
  - Predict: predicts in real time on the images obtained from the Data Engineering part
- MLflow: used to save parameters and artifacts, and to monitor loss and macro F1 score during model training
- ONNX: used to optimize the model and reduce its size for deployment
- Google Cloud services: used to deploy the model so that it can be called through a REST API
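
To make the multilabel setup concrete, the sketch below shows how per-category model outputs can be turned into a set of labels, and how macro F1 is computed over the 10 categories. It is an illustration under assumptions: the label order, the 0.5 threshold, and the function names are hypothetical and not taken from the project's code.

```python
import math
from typing import List, Sequence, Set

# The 10 categories listed above; this ordering is an assumption.
LABELS = ["sanitary", "sewer", "stray", "canal", "light",
          "flooding", "electric", "traffic", "road", "sidewalk"]


def sigmoid(x: float) -> float:
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def decode_labels(logits: Sequence[float], threshold: float = 0.5) -> List[str]:
    """Keep every label whose independent probability reaches the threshold."""
    return [label for label, z in zip(LABELS, logits) if sigmoid(z) >= threshold]


def macro_f1(y_true: Sequence[Set[str]], y_pred: Sequence[Set[str]]) -> float:
    """Unweighted mean of per-label F1; a label with no support scores 0."""
    scores = []
    for label in LABELS:
        tp = sum(1 for t, p in zip(y_true, y_pred) if label in t and label in p)
        fp = sum(1 for t, p in zip(y_true, y_pred) if label not in t and label in p)
        fn = sum(1 for t, p in zip(y_true, y_pred) if label in t and label not in p)
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


if __name__ == "__main__":
    print(decode_labels([-2, 3, -1, -4, -5, 1.5, -3, -2, 0.2, -1]))
```

Unlike single-label classification with softmax, each category is scored independently, so one image can receive several labels or none.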
- Power BI streaming dataset
- Power BI Dashboard
- Geospatial visualization
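
Sending rows to a Power BI streaming dataset can be sketched as below. This assumes a streaming dataset created with the API option, whose push URL accepts a POST with a JSON array of row objects. The URL, the column names, and the helper `build_push_request` are placeholders for illustration, not the project's real values.

```python
import json
import urllib.request
from typing import Dict, List


def build_push_request(push_url: str, rows: List[Dict]) -> urllib.request.Request:
    """Build (but do not send) a POST request carrying the rows as a JSON array."""
    body = json.dumps(rows).encode("utf-8")
    return urllib.request.Request(
        push_url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    # Hypothetical columns; they must match the schema defined in Power BI.
    rows = [{"ticket_id": "1", "predicted": "road, flooding",
             "latitude": 13.7563, "longitude": 100.5018}]
    req = build_push_request("https://example.invalid/push-url", rows)
    print(req.get_method(), req.full_url)
    # To actually send, with a real push URL:
    # urllib.request.urlopen(req, timeout=10)
```

Including latitude and longitude columns in each row is what would allow a map visual to plot the reports geospatially.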
- Open a terminal in /airflow-local and run:
docker-compose up airflow-init
- Launch Airflow:
docker-compose up
Wait for the scheduler and webserver to become healthy, then go to localhost:8080 and log in with:
username: airflow
password: airflow