Project 2 updated
smruthig committed Jan 16, 2025
1 parent 1280b17 commit 10717e6
Showing 2 changed files with 51 additions and 43 deletions.
92 changes: 50 additions & 42 deletions projects/project2.md
@@ -38,61 +38,43 @@ Your write-up should contain the following information:

- The specific question each visualization aims to answer.
- A description of your design rationale and important considerations for each visualization.
- A clear mention of which of the two visualizations is deceptive and which one is earnest.

## Recommended Data Sources

To help get you started, this assignment, we’ve provided three possible datasets for you to use, although you're welcome to select any dataset you prefer. **You must use the same dataset for both visualizations**, but you may transform the data differently, use additional data variables, or choose to address a different question for each design. These datasets are intentionally chosen to cover politically charged topics for the simple reason that these are typically the types of data where deceptive visualizations may proliferate.
To help get you started with this assignment, we’ve provided three possible datasets for you to use, although you're welcome to select any dataset you prefer. **You must use the same dataset for both visualizations**, but you may transform the data differently, use additional data variables, or choose to address a different question for each design.

### Data on Energy by Our World in Data, 1900-2022
All datasets contain time series data relevant to health, but from different cohorts (and species), different time scales, and different modalities. Time series analyses in themselves are not critical to exploring the data, but they allow for forecasting and windowing of classifier data alongside unsupervised and statistical approaches.

[Our World in Data][link], a non-profit that gathers and analyzes data about global issues, has published data about energy usage for countries (e.g. coal consumption, hydropower consumption, etc.) around the world since 1900. You can download the data [here][link2].
### BIG IDEAs Lab Glycemic Variability and Wearable Device Data v1.0.0

[link]: https://ourworldindata.org/
[link2]: https://github.com/owid/energy-data?tab=readme-ov-file#data-on-energy-by-our-world-in-data
Prof. Jessilynn Dunn at Duke released glucose measurements and wrist-worn multimodal wearable sensor data from high-normoglycemic participants. Recordings span 2 weeks per participant and can be used to identify spikes and/or hypoglycemic events.

### Education Data
Address: [BIG IDEAs Lab Glycemic Variability and Wearable Device Data][link1]

Every year, the federal government releases large amounts of data on US schools, districts, and colleges. However, this information is scattered across multiple datasets. Urban Institute’s Education Data Explorer tries to fix this issue by putting together data from various sources such as the National Center for Education Statistics’ Common Core of Data (CCD), the Civil Rights Data Collection (CRDC), the US Department of Education’s EDFacts, and IPUMS’ National Historical Geographic Information System (NHGIS) and makes it available as an API. You can download the data by making an API call using the code available on the [website][link3] or alternatively clicking on the downloads button on the website.
The data were generated by the Empatica 4 wearable device paired with a DexCom 6 continuous glucose monitor. 16 people aged 35-65 are each represented by 8-10 days of continuous, multimodal wearable data, paired with meal logs and medical histories about metabolic and cardiac conditions. Data is downloadable as .csv files.

[link3]: https://educationdata.urban.org/documentation/schools.html#overview
[link1]: https://physionet.org/content/big-ideas-glycemic-wearable/1.1.2/

### Internet Usage Data
### PhysioNet

UNdata brings international statistical databases within easy reach of users through a single-entry point. It is maintained by the Development Data Section of the Development Data and Outreach Branch within the Statistics Division of the Department of Economic and Social Affairs (UN DESA) of the UN Secretariat. You can find the internet usage data [here][link4]. Feel free to take a look at some of the other datasets made available by UNdata [here][link5].
PhysioNet, an online repository of physiological data sets from many sources, has an Open Datasets section. Each project listed there comes with an abstract and a description.

This data has the following columns:
Address: [PhysioNet Databases][link2]

- `Region/country Code:` code representing the country or region.
- `Region or Country Name:` Field containing the country name.
- `Year:` Field containing the year at which the data was collected.
- `Value:` Field denoting the Percentage of individuals using the internet.
- `Source:` Field denoting the source of the data.
This is the largest of the data sources on this list, provided for those seeking more open exploration. The database comprises dozens of data sets. Open-access data sets can be reached via the “Open databases” link at the top left of the landing page.

[link4]: https://github.com/dsc-courses/dsc106-wi24/raw/gh-pages/resources/data/Internet_data.csv
[link5]: https://data.un.org/
[link2]: https://physionet.org/about/database/

Here are some other possible sources to consider. You are also free to use data from a source different from those included here. If you have any questions on whether a dataset is appropriate, please ask the course staff ASAP!
### Mouse data

- [City of San Diego open data][link6]
- [U.S. Government Open Datasets][link7]
- [U.S. Census Bureau][link8] - Census Datasets
- [IPUMS.org][link9] - Integrated Census & Survey Data from around the World
- [Federal Elections Commission][link10] - Campaign Finance & Expenditures
- [Federal Aviation Administration][link11] - FAA Data & Research
- NOAA Daily Weather - NOAA Daily Global Historical Climatology Network Data
- [yelp.com/dataset][link12] - Yelp Open Dataset
- [fivethirtyeight.com][link13] - Data and Code behind the Stories and Interactives
- [Buzzfeed News][link14] - Open-source data from BuzzFeed's newsroom
This dataset covers 2 weeks of minute-level activity and core body temperature in male and female mice. Light follows a controlled 12 h on : 12 h off square wave so that daily rhythms are aligned. Every 4 days, females display “estrus”, which is associated with ovulation and a longer/hotter active period.

[link6]: https://data.sandiego.gov/
[link7]: data.gov
[link8]: https://www.census.gov/data.html
[link9]: https://www.ipums.org/
[link10]: https://www.fec.gov/data/
[link11]: https://www.faa.gov/data_research/
[link12]: https://www.yelp.com/dataset
[link13]: https://github.com/fivethirtyeight/data/
[link14]: https://github.com/BuzzFeedNews
Address: [Mouse Data.xlsx][link3]

Columns are unique IDs (so, e.g., F1 is the same in all tabs). Rows are minutes, in order, across 14 days (there are 1440 minutes in a day). Lights turn on and off every 12 h (mice are nocturnal, so they are most active when it is dark). Lights off is at t=0, and the lights then switch every 720 minutes. Estrus (the day of ovulation) for all females starts on day 2 and repeats every 4 days. Activity and body temperature are recorded for each individual in each minute. Data is available as a .xlsx file.

[link3]: https://docs.google.com/spreadsheets/d/1RGpsjzFyJ6nUMBNFDVdkv-Uey-ftyYo8/edit?gid=1851630377#gid=1851630377
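
The indexing rules described for the mouse data can be sketched as a small helper. This is only an illustration under stated assumptions: the function name `annotate_minute` is hypothetical, rows are taken to be 0-based minute indices starting at lights off, days are numbered from 1, and each estrus episode is treated as lasting exactly one day.

```python
MINUTES_PER_DAY = 1440        # stated in the dataset description
LIGHT_PHASE_MINUTES = 720     # lights switch every 720 minutes


def annotate_minute(t: int) -> dict:
    """Derive day number, light state, and estrus flag for minute index t.

    Assumptions (not confirmed by the dataset itself): t is 0-based,
    t=0 is the start of a lights-off phase, days are 1-based, and the
    estrus flag is only meaningful for female columns.
    """
    day = t // MINUTES_PER_DAY + 1
    # Phase 0 is dark, phase 1 is light, phase 2 is dark again, and so on.
    lights_on = (t // LIGHT_PHASE_MINUTES) % 2 == 1
    # Estrus starts on day 2 and repeats every 4 days: days 2, 6, 10, 14.
    estrus_day = day >= 2 and (day - 2) % 4 == 0
    return {"day": day, "lights_on": lights_on, "estrus_day": estrus_day}


# Days flagged as estrus across the 14-day recording.
estrus_days = sorted(
    {annotate_minute(t)["day"]
     for t in range(14 * MINUTES_PER_DAY)
     if annotate_minute(t)["estrus_day"]}
)
```

Labels like these can be joined onto the activity and temperature tabs before plotting, so that light phase and estrus days can be used for shading or faceting.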

## Grading

@@ -105,12 +87,38 @@ The assignment score is out of a maximum of 10 points. We will determine scores

We will reward entries that go above and beyond the assignment requirements to produce effective (and deceptive) graphics. Examples may include outstanding visual design, effective annotations and other narrative devices, exceptional creativity, or deceptive designs that require the write-up in order to properly identify the misleading design components.

### Rubric

The assignment is out of 10 points possible – 4 points for each visualization, and 2 points for the writeup. Submissions that squarely meet the project requirements (Satisfactory column) will get 8/10 points. Note that there are a total of 3 possible bonus points available on this assignment.

| Component | Excellent | Satisfactory | Poor |
| ------------------------------------------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ |
| Marks, Encodings, and Visual Design (per visualization) | Visual design persuasively argues the visualization’s stance, and facilitates effortless reading even when used deceptively. Any deceptive visual design choices are very subtle; even seasoned readers can only identify them on close study. (+2 points) | Visual design is largely persuasive, but some issues hinder comprehension. Any deceptive visual design choices cannot be detected at first glance, but are identifiable on a second look. (+1.5 points) | Visual design is distracting or makes the visualization unnecessarily or unintentionally difficult to read. Any deceptive design can be immediately identified. (+1 point) |
| Titles, Labels, and Annotations (per visualization) | Titles, labels, and annotations persuasively describe, contextualize or frame the depicted data. Any slants that may be considered deceptive are imperceptible to the reader. (+2 points) | Necessary titles and labels are present, but annotations could be better used to persuasively narrate the visualization’s stance. Any deceptively slanted content is more easily detectable by readers. (+1.5 points) | Several titles or labels are missing, or do not provide human-understandable information. Annotations are rarely used. Strong, charged, or colorful language makes it easy to detect deceptive content. (+1 point) |
| Data Transformations (per visualization) | More advanced transformations (e.g., groupings, binnings, calculated fields, etc.) extend or manipulate the dataset in interesting and/or unexpected ways. (+1 bonus point) | The raw dataset was mostly used directly, with perhaps some simple transforms (e.g., sorting, filtering) to facilitate communicating the visualization’s message. (0 points) | |
| Writeup | | Well-crafted write-up provides reasoned justification for all design choices with a thoughtful reflection on their ethical implications. (+2 points) | Most design decisions are described, but rationale or ethical reflections could be explained at a greater level of detail. (+1 point) |
| Creativity and Originality | The submission exceeds the assignment requirements, with original insights or particularly engaging visualizations. (+1 bonus point) | The submission meets the assignment requirements. (+0 points) | |
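
As a worked check of the rubric arithmetic, the point values in the table can be added up as follows. This is a minimal sketch: the function name `project_score` is hypothetical, and because the rubric does not say how bonus points combine with the 10-point maximum, the bonus is returned separately instead of being added in.

```python
def project_score(viz_scores, writeup, transform_bonus=0.0, creativity_bonus=0.0):
    """Return (base, bonus) points for one submission.

    viz_scores: one (design, labels) pair per visualization, using the
    point values from the rubric table.
    """
    base = sum(design + labels for design, labels in viz_scores) + writeup
    bonus = transform_bonus + creativity_bonus
    return base, bonus


# Every component in the Satisfactory column: 2 * (1.5 + 1.5) + 2 = 8
satisfactory = project_score([(1.5, 1.5), (1.5, 1.5)], writeup=2)

# Every component in the Excellent column, with all three bonus points:
# base 2 * (2 + 2) + 2 = 10, bonus 1 + 1 (transformations) + 1 (creativity) = 3
excellent = project_score([(2, 2), (2, 2)], writeup=2,
                          transform_bonus=2, creativity_bonus=1)
```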

## Submission Details

This is an individual assignment. **You may not work in groups**. Your completed assignment is due on **Fri 2/2, by 11:59 pm**.
This is an individual assignment. **You may not work in groups**. Your
completed assignment is due on **Tue 1/28, by 11:59pm**.

You must submit your assignment using Gradescope. Please upload a PDF containing the following:

<ul>
<li>A single image of your earnest visualization</li>
<li>A single image of your deceptive visualization</li>
<li>On a <b>separate</b> page, a writeup conforming to the aforementioned rules.</li>
</ul>

You must submit your assignment using Gradescope. Please upload two image files (PNG, JPG) of your visualization design using the correct file extension, such as **“a2_earnest.png”** and **“a2_deceptive.png”** for PNG image files or **“a2_earnest.jpg”** or **“a2_deceptive.jpg”** for JPEG image files. Please do not include your name or PID in the filename, and be sure your image is sized for a reasonable viewing experience! Viewers should not have to zoom or scroll in order to effectively view your submission.
Here are a few important things to keep in mind:

In addition, submit your write-up to Gradescope as a plain text file, named exactly as **“readme.txt”**, with content that follows the instructions above. Do not include your name or PID in the filename!
<ul>
<li>Do <b>not</b> label the images as "earnest" or "deceptive". <b>Remember, the visualization itself should not give away which design is earnest and which is deceptive</b>. Failure to comply may result in point deductions as it hinders the peer review process!</li>
<li>Ensure that the writeup is on a separate page, as the write-up contains information about which image is deceptive. Failure to have the write-up on a separate page may result in point deductions.</li>
<li>Do not forget to clearly mention which visualization is deceptive <b>in the write-up</b>.</li>
<li>Be sure your image is sized for a reasonable viewing experience! Viewers should not have to zoom in order to effectively view your submission.</li>
</ul>

Please use the correct file names for your submissions; typos that require manual correction by the course staff may result in point deductions. Do not worry about resubmissions, feel free to resubmit as needed prior to the deadline (if you are using late days to do a resubmission, please notify the course staff). **Remember, the visualization itself should not give away which design is earnest and which is deceptive;** the file names will be randomized by the course staff prior to peer review.
Do not worry about resubmissions; feel free to resubmit as needed prior to the deadline (if you are using late days to do a resubmission, please notify the course staff).
2 changes: 1 addition & 1 deletion projects/project2peer.md
@@ -36,4 +36,4 @@ To share critique, we will use the ["I like / I wish / What if?"][link] format.

This is an individual assignment. You may not work in groups.

Your peer reviews are due Fri 02/09, 11:59pm. The links to the submissions that you are required to evaluate will be emailed to you. You must submit peer reviews for two Project 2 submissions. To submit your review, you must use gradescope. Please carefully respond to each of the questions raised.
Your peer reviews are due **Tue 02/04, 11:59pm**. The submissions that you are required to evaluate will be emailed to you. You must submit peer reviews for two Project 2 submissions. To submit your review, you must use Gradescope. Please carefully respond to each of the questions raised.
