Submission:
- Please submit your project via GitHub and send a private message on Slack to both Dan and Ivan with a link to it.
"A problem well-stated is half-solved" -- Charles Kettering
Welcome to Data Science! In this first project you will create a framework to scope out data science projects. This framework will provide you with a guide to develop a well-articulated problem statement and analysis plan that will be robust and reproducible.
Objective: Create a structured Jupyter Notebook using Markdown.
Requirements:
- Identify the variables of the dataset, including the response and predictors.
- Create a data dictionary with classification of available variables.
- Write a high-quality problem statement.
- State the risks and assumptions of your data.
- Outline exploratory data analysis methods.
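To make the first two requirements concrete, here is a minimal sketch in plain Python of how a data dictionary might be assembled. Everything in it is an assumption for illustration: the sample rows, the column names (`outcome`, `score`, `group`), and the helpers `classify` and `build_data_dictionary` are hypothetical and do not come from the project dataset. The classification rule is also deliberately simple, so treat its labels as a starting point to review by hand.

```python
# Hypothetical sample rows; replace these with rows from the project dataset.
sample_rows = [
    {"outcome": 1, "score": 3.6, "group": "a"},
    {"outcome": 0, "score": 2.9, "group": "b"},
    {"outcome": 1, "score": None, "group": "a"},
]


def classify(values):
    """Roughly label a column from its non-missing values."""
    present = [v for v in values if v is not None]
    if not present:
        return "unknown"
    if set(present) <= {0, 1}:
        return "binary"
    if all(isinstance(v, (int, float)) for v in present):
        return "continuous"
    return "categorical"


def build_data_dictionary(rows, response):
    """Return one entry per variable: name, role, rough type, missing count."""
    dictionary = []
    for name in rows[0].keys():
        values = [row.get(name) for row in rows]
        dictionary.append({
            "variable": name,
            # Assumption: a single response; every other column is a predictor.
            "role": "response" if name == response else "predictor",
            "type": classify(values),
            "missing": sum(v is None for v in values),
        })
    return dictionary


data_dictionary = build_data_dictionary(sample_rows, response="outcome")
for entry in data_dictionary:
    print(entry)
```

The missing-value count is included because it feeds directly into the risks and assumptions you state about your data.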
The dataset is available here.
For this project we will be using a Jupyter Notebook. Jupyter Notebooks are a handy way to communicate your research with your team and share your analysis. Using Markdown syntax will allow you to create more visually appealing notebooks.
- Open the starter notebook in Anaconda.
Check out the sample notebook, which includes a data dictionary and responses to questions. Wondering how to format your notebook the same way? Simply double-click on any section to view the Markdown.
- Get used to the Jupyter Notebook layout. Play around with keyboard shortcuts.
- Try out basic Markdown for commonly used formats; look up the syntax for headers, bold, italic, and tables.
- Read the documentation for Jupyter Notebooks. A tutorial is often available, but not always, so learning to read documentation is crucial to your success as a data scientist!
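As one way to practice the formats mentioned above (headers, bold, italic, and tables), the sketch below builds a small piece of Markdown text in Python. The helper `to_markdown_table` and the example rows are hypothetical, written only for illustration; the pipe-table syntax it emits is the common Markdown table form that Jupyter renders.

```python
def to_markdown_table(rows):
    """Render a list of dicts (all sharing the same keys) as a Markdown pipe table."""
    headers = list(rows[0].keys())
    lines = [
        "| " + " | ".join(headers) + " |",
        "| " + " | ".join("---" for _ in headers) + " |",
    ]
    for row in rows:
        lines.append("| " + " | ".join(str(row[h]) for h in headers) + " |")
    return "\n".join(lines)


# Hypothetical data dictionary entries, for illustration only.
example_rows = [
    {"variable": "outcome", "role": "response", "type": "binary"},
    {"variable": "score", "role": "predictor", "type": "continuous"},
]

notebook_text = "\n".join([
    "## Data dictionary",  # a level-2 header
    "",
    "The **response** is in bold and the *predictors* are in italics.",
    "",
    to_markdown_table(example_rows),
])
print(notebook_text)
```

You can paste the printed text into a Markdown cell to see how it renders. Inside a code cell, `display(Markdown(notebook_text))` with `from IPython.display import Markdown, display` should render it directly.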
The rubric is available here.