This capstone project course will give you a taste of what data scientists go through in real life when working with real datasets. You will assume the role of a Data Scientist working for a startup intending to compete with SpaceX, and in the process follow the Data Science methodology involving data collection, data wrangling, exploratory data analysis, data visualization, model development, model evaluation, and reporting your results to stakeholders.
🎉 Labs completed by Debdatta Sarkar
This repo is shared for learning purposes, not for cheating! If you find the same slides or files copied and pasted as-is into other assignments, report it as plagiarism.
- Introduction
- Week 1
- Week 2
- Week 3
- Week 4
- Week 5
- Out of the Box Thinking
- Technology Stack
- Tips & Tricks
- Extra Study Materials
The commercial space age is here, and companies are making space travel affordable for everyone. Virgin Galactic is providing suborbital spaceflights. Rocket Lab is a small satellite provider. Blue Origin manufactures sub-orbital and orbital reusable rockets. Perhaps the most successful is SpaceX. SpaceX’s accomplishments include sending spacecraft to the International Space Station; Starlink, a satellite internet constellation providing satellite Internet access; and sending manned missions to space. One reason SpaceX can do this is that its rocket launches are relatively inexpensive. SpaceX advertises Falcon 9 rocket launches on its website at a cost of 62 million dollars, while other providers charge upwards of 165 million dollars each. Much of the savings comes from SpaceX being able to reuse the first stage.
In this capstone, I take the role of a data scientist working for Space Y, a new rocket company founded by billionaire industrialist Allon Musk that would like to compete with SpaceX. My job is to determine the price of each launch by gathering information about SpaceX and creating dashboards for my team. I also determine whether SpaceX will reuse the first stage. Instead of using rocket science to determine whether the first stage will land successfully, I train a machine learning model on public information to predict whether SpaceX will reuse the first stage.
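To make the link between the reuse prediction and the launch price concrete, here is a minimal sketch of a probability-weighted cost estimate. The two prices are the figures quoted in the introduction; the weighting rule and the function name `expected_launch_cost` are my own assumptions for illustration, not something taken from the course labs.

```python
# Figures quoted in the introduction, in millions of US dollars.
REUSED_FIRST_STAGE_COST = 62.0     # advertised Falcon 9 launch price
EXPENDED_FIRST_STAGE_COST = 165.0  # lower bound quoted for other providers


def expected_launch_cost(p_reuse: float) -> float:
    """Probability-weighted launch cost in millions of US dollars.

    Assumption (hypothetical pricing rule): a launch costs the reusable
    price if the first stage is reused and the expendable price otherwise.
    """
    if not 0.0 <= p_reuse <= 1.0:
        raise ValueError("p_reuse must be between 0 and 1")
    return (p_reuse * REUSED_FIRST_STAGE_COST
            + (1.0 - p_reuse) * EXPENDED_FIRST_STAGE_COST)


if __name__ == "__main__":
    # p_reuse would come from a trained classifier; fixed values are used here.
    for p in (0.0, 0.5, 1.0):
        print(f"P(reuse)={p:.1f} -> {expected_launch_cost(p):.1f} M USD")
```

In practice, `p_reuse` would be the predicted probability from whichever classifier performs best in the model evaluation step.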
- Advanced Folium Function: MeasureControl Plugin
- Advanced Folium Function: Custom Pane with Labels
- Hosting Plotly on a PythonAnywhere Server with Flask
- Machine Learning - Decision Tree Construction
- EDA Visualization with Interactive Plotly (Instead of Seaborn)
- IBM Cognos Visualization with Player
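The decision tree construction item in the list above can be illustrated with the core step of building a tree: choosing the threshold that minimizes weighted Gini impurity. This is a plain-Python sketch under my own assumptions, not the code from the lab; the helper names `gini` and `best_threshold` and the toy payload data are hypothetical.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0.0 means pure)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_threshold(values, labels):
    """Return (threshold, weighted_gini) for the best 'value <= threshold' split.

    Returns (None, impurity_of_all_labels) if no split improves on the parent.
    """
    best = (None, gini(labels))  # baseline: no split at all
    distinct = sorted(set(values))
    # Candidate thresholds are midpoints between consecutive distinct values.
    for lo, hi in zip(distinct, distinct[1:]):
        threshold = (lo + hi) / 2
        left = [y for x, y in zip(values, labels) if x <= threshold]
        right = [y for x, y in zip(values, labels) if x > threshold]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if score < best[1]:
            best = (threshold, score)
    return best


if __name__ == "__main__":
    # Hypothetical toy data: payload mass in kg and landing outcome (1 = landed).
    payload_kg = [500, 1200, 2500, 3100, 4000, 5300]
    landed = [0, 0, 0, 1, 1, 1]
    print(best_threshold(payload_kg, landed))
```

A full decision tree repeats this search over every feature and then recurses into the left and right partitions until the nodes are pure or a depth limit is reached.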
- Python - Programming Language
- IBM Watson Studio - IBM’s software platform for data science
- IBM Db2 - A family of data management products, including database servers, developed by IBM
- Jupyter Notebooks - Open-source web application that allows data scientists to create and share documents that integrate live code, equations, computational output, visualizations, and other multimedia resources.
- Anaconda - Local environment for practice
- pythonanywhere.com - Host your Python App via Flask
- plotly.com - Plotly stewards Python's leading data viz and UI libraries. With Dash Open Source, Dash apps run on your local laptop or server
- IBM Cognos Dashboard - IBM Cognos Analytics provides dashboards and stories to communicate your insights and analysis. You can assemble a view that contains visualizations such as a graph, chart, plot, table, map, or any other visual representation of data.
- GitHub - Repository for storing all files
Don't jump into the labs without studying first. Take time to work through the previous labs step by step, and practice locally on your own laptop by installing Anaconda. This also limits your usage of the IBM Cloud trial version.