Commit

Built site for gh-pages
srvanderplas committed Jul 21, 2024
1 parent e58120f commit a7dfc1a
Showing 7 changed files with 36 additions and 36 deletions.
2 changes: 1 addition & 1 deletion .nojekyll
@@ -1 +1 @@
d3ddec6d
34dad753
4 changes: 2 additions & 2 deletions exams/2023-midterm.html
@@ -289,8 +289,8 @@ <h2 class="anchored" data-anchor-id="tasks">Tasks</h2>
<span id="cb1-13"><a href="#cb1-13" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb1-14"><a href="#cb1-14" aria-hidden="true" tabindex="-1"></a>df_means</span>
<span id="cb1-15"><a href="#cb1-15" aria-hidden="true" tabindex="-1"></a><span class="do">## y mean</span></span>
<span id="cb1-16"><a href="#cb1-16" aria-hidden="true" tabindex="-1"></a><span class="do">## 1 Group 1 9.971812</span></span>
<span id="cb1-17"><a href="#cb1-17" aria-hidden="true" tabindex="-1"></a><span class="do">## 2 Group 2 20.049941</span></span>
<span id="cb1-16"><a href="#cb1-16" aria-hidden="true" tabindex="-1"></a><span class="do">## 1 Group 1 9.941761</span></span>
<span id="cb1-17"><a href="#cb1-17" aria-hidden="true" tabindex="-1"></a><span class="do">## 2 Group 2 19.934356</span></span>
<span id="cb1-18"><a href="#cb1-18" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb1-19"><a href="#cb1-19" aria-hidden="true" tabindex="-1"></a><span class="co"># Demonstration of na.rm</span></span>
<span id="cb1-20"><a href="#cb1-20" aria-hidden="true" tabindex="-1"></a><span class="fu">mean</span>(<span class="fu">c</span>(<span class="cn">NA</span>, <span class="dv">1</span>, <span class="dv">2</span>, <span class="dv">3</span>), <span class="at">na.rm =</span> T) <span class="co"># Remove NAs</span></span>
6 changes: 3 additions & 3 deletions exams/2024-midterm.html
@@ -289,9 +289,9 @@ <h2 class="anchored" data-anchor-id="tasks">Tasks</h2>
<span id="cb1-12"><a href="#cb1-12" aria-hidden="true" tabindex="-1"></a>}</span>
<span id="cb1-13"><a href="#cb1-13" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb1-14"><a href="#cb1-14" aria-hidden="true" tabindex="-1"></a>df_means</span>
<span id="cb1-15"><a href="#cb1-15" aria-hidden="true" tabindex="-1"></a><span class="do">## y mean</span></span>
<span id="cb1-16"><a href="#cb1-16" aria-hidden="true" tabindex="-1"></a><span class="do">## 1 Group 1 9.848624</span></span>
<span id="cb1-17"><a href="#cb1-17" aria-hidden="true" tabindex="-1"></a><span class="do">## 2 Group 2 20.204907</span></span>
<span id="cb1-15"><a href="#cb1-15" aria-hidden="true" tabindex="-1"></a><span class="do">## y mean</span></span>
<span id="cb1-16"><a href="#cb1-16" aria-hidden="true" tabindex="-1"></a><span class="do">## 1 Group 1 9.91790</span></span>
<span id="cb1-17"><a href="#cb1-17" aria-hidden="true" tabindex="-1"></a><span class="do">## 2 Group 2 20.17033</span></span>
<span id="cb1-18"><a href="#cb1-18" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb1-19"><a href="#cb1-19" aria-hidden="true" tabindex="-1"></a><span class="co"># Demonstration of na.rm</span></span>
<span id="cb1-20"><a href="#cb1-20" aria-hidden="true" tabindex="-1"></a><span class="fu">mean</span>(<span class="fu">c</span>(<span class="cn">NA</span>, <span class="dv">1</span>, <span class="dv">2</span>, <span class="dv">3</span>), <span class="at">na.rm =</span> T) <span class="co"># Remove NAs</span></span>
Binary file modified figs/calendar.pdf
Binary file not shown.
56 changes: 28 additions & 28 deletions index.html

Large diffs are not rendered by default.

4 changes: 2 additions & 2 deletions search.json
@@ -102,7 +102,7 @@
"href": "exams/2023-midterm.html#tasks",
"title": "Practice Midterm",
"section": "Tasks",
"text": "Tasks\n\nRead in the data and create a data frame that you will work with for this exam.\nCreate a new column variable, decade, in your data frame.\n\n\nYou will need to take the response year and truncate it to the decade, so that 1972 becomes 1970 and 1989 becomes 1980. You can use a series of logical statements if you want, but it may be more effective to find a numerical function or combination of functions that will perform the operation you want.\nfloor() and math.floor() in R and python respectively are good places to start.\nCreate a scatterplot (use geom_point) of your happy year vs decade to show that your approach succeeded.\n\n\nCreate a new data set by iterating through each year to find the proportion of people who are very happy. Use a for loop. Using your new data frame, plot the proportion of very happy people over time.\nNote: You may have to pass an argument to the mean function to tell it to exclude missing values from the calculation, such as na.rm or skipna. Or, you can remove the NAs from happy using a function like na.omit or dropna, but be careful to only drop rows with an NA in variables we care about, like happy or year.\n\nThe code below provides an example of how to create a summary dataset and handle NAs in R and python. 
You may modify this code to help you answer part 3.\n\n# Create sample data\ndf &lt;- data.frame(x = c(rnorm(100, 10), rnorm(100, 20)),\n y = rep(c(\"Group 1\", \"Group 2\"), each = 100))\n\ndf_means &lt;- data.frame(y = NULL, mean = NULL)\n\n# For each y group, what is the mean of x?\nfor (i in unique(df$y)) {\n sub_df &lt;- subset(df, y == i)\n df_means &lt;- rbind(df_means, \n data.frame(y = i, mean = mean(sub_df$x, na.rm = T)))\n}\n\ndf_means\n## y mean\n## 1 Group 1 9.971812\n## 2 Group 2 20.049941\n\n# Demonstration of na.rm\nmean(c(NA, 1, 2, 3), na.rm = T) # Remove NAs\n## [1] 2\nmean(c(NA, 1, 2, 3), na.rm = F) # Don't remove NAs\n## [1] NA\n\n\nimport pandas as pd\nimport numpy as np\n\n# Create a new data frame\ndf = pd.DataFrame({\n 'y': np.repeat(['Group1', 'Group2'], (100, 100)), \n 'x': np.concatenate((np.random.normal(loc = 10, size = 100), np.random.normal(loc = 12, size = 100)), axis = None)\n })\n\n# Create an empty dataframe\ndf_means = pd.DataFrame(columns = ['y', 'mean'])\n\n# For each age, how many values?\nfor i in np.unique(df.y):\n # Create the subset\n df_sub = df.loc[df.y == i]\n # Drop NAs from the data frame\n # This step isn't necessary because mean() uses skipna = T by default\n # df_sub = df_sub.dropna(subset = ['x', 'y']) \n # Add a new row to the end of df_means\n df_means.loc[len(df_means.index)] = [i, df_sub.x.mean()]\n\n\n# Demonstrating skipna parameter of mean\npd.DataFrame({'y':[1, 2, 3, np.nan]}).y.mean(skipna = True)\n## 2.0\npd.DataFrame({'y':[1, 2, 3, np.nan]}).y.mean(skipna = False)\n## nan"
"text": "Tasks\n\nRead in the data and create a data frame that you will work with for this exam.\nCreate a new column variable, decade, in your data frame.\n\n\nYou will need to take the response year and truncate it to the decade, so that 1972 becomes 1970 and 1989 becomes 1980. You can use a series of logical statements if you want, but it may be more effective to find a numerical function or combination of functions that will perform the operation you want.\nfloor() and math.floor() in R and python respectively are good places to start.\nCreate a scatterplot (use geom_point) of your happy year vs decade to show that your approach succeeded.\n\n\nCreate a new data set by iterating through each year to find the proportion of people who are very happy. Use a for loop. Using your new data frame, plot the proportion of very happy people over time.\nNote: You may have to pass an argument to the mean function to tell it to exclude missing values from the calculation, such as na.rm or skipna. Or, you can remove the NAs from happy using a function like na.omit or dropna, but be careful to only drop rows with an NA in variables we care about, like happy or year.\n\nThe code below provides an example of how to create a summary dataset and handle NAs in R and python. 
You may modify this code to help you answer part 3.\n\n# Create sample data\ndf &lt;- data.frame(x = c(rnorm(100, 10), rnorm(100, 20)),\n y = rep(c(\"Group 1\", \"Group 2\"), each = 100))\n\ndf_means &lt;- data.frame(y = NULL, mean = NULL)\n\n# For each y group, what is the mean of x?\nfor (i in unique(df$y)) {\n sub_df &lt;- subset(df, y == i)\n df_means &lt;- rbind(df_means, \n data.frame(y = i, mean = mean(sub_df$x, na.rm = T)))\n}\n\ndf_means\n## y mean\n## 1 Group 1 9.941761\n## 2 Group 2 19.934356\n\n# Demonstration of na.rm\nmean(c(NA, 1, 2, 3), na.rm = T) # Remove NAs\n## [1] 2\nmean(c(NA, 1, 2, 3), na.rm = F) # Don't remove NAs\n## [1] NA\n\n\nimport pandas as pd\nimport numpy as np\n\n# Create a new data frame\ndf = pd.DataFrame({\n 'y': np.repeat(['Group1', 'Group2'], (100, 100)), \n 'x': np.concatenate((np.random.normal(loc = 10, size = 100), np.random.normal(loc = 12, size = 100)), axis = None)\n })\n\n# Create an empty dataframe\ndf_means = pd.DataFrame(columns = ['y', 'mean'])\n\n# For each age, how many values?\nfor i in np.unique(df.y):\n # Create the subset\n df_sub = df.loc[df.y == i]\n # Drop NAs from the data frame\n # This step isn't necessary because mean() uses skipna = T by default\n # df_sub = df_sub.dropna(subset = ['x', 'y']) \n # Add a new row to the end of df_means\n df_means.loc[len(df_means.index)] = [i, df_sub.x.mean()]\n\n\n# Demonstrating skipna parameter of mean\npd.DataFrame({'y':[1, 2, 3, np.nan]}).y.mean(skipna = True)\n## 2.0\npd.DataFrame({'y':[1, 2, 3, np.nan]}).y.mean(skipna = False)\n## nan"
},
{
"objectID": "exams/2023-midterm.html#solutions",
@@ -739,7 +739,7 @@
"href": "exams/2024-midterm.html#tasks",
"title": "2024 Midterm",
"section": "Tasks",
"text": "Tasks\n\nRead in the data and create a data frame that you will work with for this exam.\nCreate a new column variable, decade, in your data frame.\n\n\nYou will need to take the response year and truncate it to the decade, so that 1972 becomes 1970 and 1989 becomes 1980. You can use a series of logical statements if you want, but it may be more effective to find a numerical function or combination of functions that will perform the operation you want.\nfloor() and math.floor() in R and python respectively are good places to start.\nCreate a scatterplot (use geom_point) of your happy year vs decade to show that your approach succeeded.\n\n\nCreate a new data set by iterating through each year to find the proportion of people who are very happy. Use a for loop. Using your new data frame, plot the proportion of very happy people over time.\nNote: You may have to pass an argument to the mean function to tell it to exclude missing values from the calculation, such as na.rm or skipna. Or, you can remove the NAs from happy using a function like na.omit or dropna, but be careful to only drop rows with an NA in variables we care about, like happy or year.\n\nThe code below provides an example of how to create a summary dataset and handle NAs in R and python. 
You may modify this code to help you answer part 3.\n\n# Create sample data\ndf &lt;- data.frame(x = c(rnorm(100, 10), rnorm(100, 20)),\n y = rep(c(\"Group 1\", \"Group 2\"), each = 100))\n\ndf_means &lt;- data.frame(y = NULL, mean = NULL)\n\n# For each y group, what is the mean of x?\nfor (i in unique(df$y)) {\n sub_df &lt;- subset(df, y == i)\n df_means &lt;- rbind(df_means, \n data.frame(y = i, mean = mean(sub_df$x, na.rm = T)))\n}\n\ndf_means\n## y mean\n## 1 Group 1 9.848624\n## 2 Group 2 20.204907\n\n# Demonstration of na.rm\nmean(c(NA, 1, 2, 3), na.rm = T) # Remove NAs\n## [1] 2\nmean(c(NA, 1, 2, 3), na.rm = F) # Don't remove NAs\n## [1] NA\n\n\nimport pandas as pd\nimport numpy as np\n\n# Create a new data frame\ndf = pd.DataFrame({\n 'y': np.repeat(['Group1', 'Group2'], (100, 100)), \n 'x': np.concatenate((np.random.normal(loc = 10, size = 100), np.random.normal(loc = 12, size = 100)), axis = None)\n })\n\n# Create an empty dataframe\ndf_means = pd.DataFrame(columns = ['y', 'mean'])\n\n# For each age, how many values?\nfor i in np.unique(df.y):\n # Create the subset\n df_sub = df.loc[df.y == i]\n # Drop NAs from the data frame\n # This step isn't necessary because mean() uses skipna = T by default\n # df_sub = df_sub.dropna(subset = ['x', 'y']) \n # Add a new row to the end of df_means\n df_means.loc[len(df_means.index)] = [i, df_sub.x.mean()]\n\n\n# Demonstrating skipna parameter of mean\npd.DataFrame({'y':[1, 2, 3, np.nan]}).y.mean(skipna = True)\n## 2.0\npd.DataFrame({'y':[1, 2, 3, np.nan]}).y.mean(skipna = False)\n## nan"
"text": "Tasks\n\nRead in the data and create a data frame that you will work with for this exam.\nCreate a new column variable, decade, in your data frame.\n\n\nYou will need to take the response year and truncate it to the decade, so that 1972 becomes 1970 and 1989 becomes 1980. You can use a series of logical statements if you want, but it may be more effective to find a numerical function or combination of functions that will perform the operation you want.\nfloor() and math.floor() in R and python respectively are good places to start.\nCreate a scatterplot (use geom_point) of your happy year vs decade to show that your approach succeeded.\n\n\nCreate a new data set by iterating through each year to find the proportion of people who are very happy. Use a for loop. Using your new data frame, plot the proportion of very happy people over time.\nNote: You may have to pass an argument to the mean function to tell it to exclude missing values from the calculation, such as na.rm or skipna. Or, you can remove the NAs from happy using a function like na.omit or dropna, but be careful to only drop rows with an NA in variables we care about, like happy or year.\n\nThe code below provides an example of how to create a summary dataset and handle NAs in R and python. 
You may modify this code to help you answer part 3.\n\n# Create sample data\ndf &lt;- data.frame(x = c(rnorm(100, 10), rnorm(100, 20)),\n y = rep(c(\"Group 1\", \"Group 2\"), each = 100))\n\ndf_means &lt;- data.frame(y = NULL, mean = NULL)\n\n# For each y group, what is the mean of x?\nfor (i in unique(df$y)) {\n sub_df &lt;- subset(df, y == i)\n df_means &lt;- rbind(df_means, \n data.frame(y = i, mean = mean(sub_df$x, na.rm = T)))\n}\n\ndf_means\n## y mean\n## 1 Group 1 9.91790\n## 2 Group 2 20.17033\n\n# Demonstration of na.rm\nmean(c(NA, 1, 2, 3), na.rm = T) # Remove NAs\n## [1] 2\nmean(c(NA, 1, 2, 3), na.rm = F) # Don't remove NAs\n## [1] NA\n\n\nimport pandas as pd\nimport numpy as np\n\n# Create a new data frame\ndf = pd.DataFrame({\n 'y': np.repeat(['Group1', 'Group2'], (100, 100)), \n 'x': np.concatenate((np.random.normal(loc = 10, size = 100), np.random.normal(loc = 12, size = 100)), axis = None)\n })\n\n# Create an empty dataframe\ndf_means = pd.DataFrame(columns = ['y', 'mean'])\n\n# For each age, how many values?\nfor i in np.unique(df.y):\n # Create the subset\n df_sub = df.loc[df.y == i]\n # Drop NAs from the data frame\n # This step isn't necessary because mean() uses skipna = T by default\n # df_sub = df_sub.dropna(subset = ['x', 'y']) \n # Add a new row to the end of df_means\n df_means.loc[len(df_means.index)] = [i, df_sub.x.mean()]\n\n\n# Demonstrating skipna parameter of mean\npd.DataFrame({'y':[1, 2, 3, np.nan]}).y.mean(skipna = True)\n## 2.0\npd.DataFrame({'y':[1, 2, 3, np.nan]}).y.mean(skipna = False)\n## nan"
},
{
"objectID": "exams/2024-midterm.html#solutions",
Binary file modified syllabus.pdf
Binary file not shown.
