Add annotation.reference information to the CSV exports #1628

mkdir-washington-edu · 2024-12-20T15:31:21Z

The problem
CSVs are much more approachable than JSON files for the average user, and instructors using annotation exports for various kinds of analysis want to see the relationship between annotations.

Example ticket: https://app.hubspot.com/contacts/6291320/record/0-5/18074066011

The solution
In the "export" option in the client, include the "reference" information so people using the exports for analysis can easily relate or reconstruct the annotation threads.

Example:
Current CSV export:
Created at Author Page URL Group Type Quote Comment Tags
2024-12-20 10:09 mdiroberts https://example.com/ abc internal testing? Reply reply
2024-10-30 14:02 mdiroberts https://example.com/ abc internal testing? Annotation documents anno question

Proposed CSV export:
Created at Author Page URL Group Type ID Reference Quote Comment Tags
2024-12-20 10:09 mdiroberts https://example.com/ abc internal testing? Reply "X72iLr7kEe-8vIsqCNlnHw" "F5YqwJbpEe-kJWcL3BHQxQ" reply
2024-10-30 14:02 mdiroberts https://example.com/ abc internal testing? Annotation "F5YqwJbpEe-kJWcL3BHQxQ" NULL documents anno question
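
To illustrate how an export with these columns could be used, here is a minimal sketch that groups rows into threads. It assumes the hypothetical column names `ID` and `Reference` from the proposal above, that an empty value or `NULL` marks a top-level annotation, and that when several ancestors are listed the first one is the thread root. None of this is confirmed behaviour of the exporter.

```python
import csv
import io
from collections import defaultdict

# Sample rows modelled on the proposed export above (other columns omitted).
SAMPLE = (
    "ID,Reference,Type\n"
    "X72iLr7kEe-8vIsqCNlnHw,F5YqwJbpEe-kJWcL3BHQxQ,Reply\n"
    "F5YqwJbpEe-kJWcL3BHQxQ,NULL,Annotation\n"
)


def group_by_thread(csv_text):
    """Map each thread root ID to the IDs of the rows in that thread."""
    threads = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        ref = row["Reference"].strip()
        # Assumption: empty or NULL means "no ancestors" (top-level annotation).
        ancestors = [] if ref in ("", "NULL") else ref.split(",")
        # Assumption: the first listed ancestor is the thread root.
        root = ancestors[0] if ancestors else row["ID"]
        threads[root].append(row["ID"])
    return dict(threads)


threads = group_by_thread(SAMPLE)
```

With the sample rows, both the reply and the original annotation end up under the same root ID, which is the relationship instructors need for analysis.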

robertknight · 2024-12-20T15:39:41Z

The references field in the API is an array containing every ancestor of the annotation in the thread. Some ancestors may have been deleted, so you need the full list to be sure of being able to associate a reply with its top-level annotation. In JSON this is straightforward to encode as an array. In CSV we'd need to choose an encoding. The simplest solution is a comma-separated list, making sure that the field is properly escaped when exported.
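
As a rough sketch of the comma-separated encoding (not the client's actual export code), the snippet below uses Python's standard `csv` module, which quotes any field that contains a comma. The column names and the `encode_references` / `decode_references` helpers are hypothetical and only illustrate the round trip.

```python
import csv
import io


def encode_references(references):
    """Join ancestor IDs into a single field; no ancestors gives an empty field."""
    return ",".join(references)


def decode_references(field):
    """Inverse of encode_references for a field read back from the CSV."""
    return field.split(",") if field else []


def write_rows(annotations):
    """Write annotations to CSV text with one References column."""
    buf = io.StringIO()
    # QUOTE_MINIMAL (the default) quotes fields containing commas or quotes.
    writer = csv.writer(buf, quoting=csv.QUOTE_MINIMAL)
    writer.writerow(["ID", "References", "Comment"])
    for anno in annotations:
        writer.writerow(
            [anno["id"], encode_references(anno.get("references", [])), anno["text"]]
        )
    return buf.getvalue()


csv_text = write_rows(
    [
        {"id": "c", "references": ["a", "b"], "text": "second-level reply"},
        {"id": "a", "text": "top-level annotation"},
    ]
)
rows = list(csv.reader(io.StringIO(csv_text)))
```

Because the joined list is wrapped in quotes, a conforming CSV reader sees it as one field and the remaining columns stay aligned.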

mkdir-washington-edu · 2024-12-20T16:21:14Z

For encoding: currently the list of tags on an annotation is handled correctly by Google Sheets when importing the CSV, though I've seen issues with Excel decoding it properly. Excel keeps treating each tag in the list as a new column value, steadily displacing all of the data for subsequent rows.

Some additional context from an instructor (to help with prioritization):

JSON files are not practical. I need the text of annotations and replies in a text format to use in a text-based program for qualitative research. It is important to know which reply attaches to what “original” annotation for the purpose of data analysis, since I will need to treat replied-to annotations differently from original annotations. Also, for an in-depth content analysis, I need to know which reply matches what annotation. The replies are almost useless unless I know what they are replying to.

robertknight · 2024-12-20T17:06:44Z

I'd forgotten we'd already had to solve encoding lists for handling the tags field. We should treat references in the same way. The request makes sense and is likely quite straightforward to implement.

JSON files are not practical. I need the text of annotation and replies in a text format to use in a text-based program for qualitative research.

For what it's worth, an interim solution may be to use AI to help with this:

1. Go to ChatGPT
2. Start a new chat and attach an exported JSON file
3. Enter a prompt like: "Convert the records in this JSON file to CSV. Include only these fields: ID, username, text, tags, references."

This worked for me for a small file of 10-20 annotations. Not sure if it will work with a much larger one.
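
For larger files, a short script may be a more predictable interim route than a chat prompt. The sketch below is only an illustration: it assumes the exported JSON has a top-level `annotations` list whose items carry `id`, `user`, `text`, `tags` and `references` keys, which should be checked against a real export before relying on it. The function name `json_export_to_csv` is hypothetical.

```python
import csv
import io
import json

# Assumed field names; verify them against an actual exported file.
FIELDS = ["id", "user", "text", "tags", "references"]


def json_export_to_csv(json_text):
    """Convert exported annotation JSON to CSV text with a fixed set of columns."""
    data = json.loads(json_text)
    buf = io.StringIO()
    writer = csv.writer(buf)
    writer.writerow(FIELDS)
    # Assumption: annotations live under a top-level "annotations" key.
    for anno in data.get("annotations", []):
        writer.writerow(
            [
                anno.get("id", ""),
                anno.get("user", ""),
                anno.get("text", ""),
                # Lists are joined with commas; the csv module quotes them.
                ",".join(anno.get("tags", [])),
                ",".join(anno.get("references", [])),
            ]
        )
    return buf.getvalue()


example = json.dumps(
    {
        "annotations": [
            {"id": "a", "user": "mdiroberts", "text": "anno", "tags": ["question"]},
            {"id": "b", "user": "mdiroberts", "text": "reply", "references": ["a"]},
        ]
    }
)
converted = list(csv.reader(io.StringIO(json_export_to_csv(example))))
```

To run it on a real file, read the file's contents into a string and write the returned text to a `.csv` file.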

mkdir-washington-edu added the feature request label Dec 20, 2024
