- Download all oral history audio, images, and metadata as JSON and CSV files directly from oralhistory.nypl.org
python get_metadata_and_assets.py -out "path/to/output/dir/"
This script creates the following in the output directory:
  - `neighborhoods.json` and `neighborhoods.csv`
  - `interviews.json` and `interviews.csv`
  - Individual `.json` files for each interview, which contain more metadata and annotations
  - Audio and images, written to the `./audio` and `./images` folders
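Once the script has finished, the output can be inspected with the Python standard library. The sketch below is only an illustration: the helper names `load_json`, `load_csv_rows`, and `summarize_csvs` are hypothetical and not part of this repository, and because the column names in the CSV files are not documented here, the code reads them from each file's header row rather than assuming any.

```python
import csv
import json
from pathlib import Path


def load_json(path):
    """Parse one JSON file (for example, an individual interview file)."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def load_csv_rows(path):
    """Read a CSV file into a list of dicts keyed by its header row."""
    with open(path, encoding="utf-8", newline="") as f:
        return list(csv.DictReader(f))


def summarize_csvs(output_dir):
    """Count rows and list columns for each CSV that exists in output_dir.

    No schema is assumed: column names come from the files themselves.
    """
    out = Path(output_dir)
    summary = {}
    for name in ("neighborhoods", "interviews"):
        csv_path = out / f"{name}.csv"
        if not csv_path.exists():
            continue  # skip files the script has not produced
        rows = load_csv_rows(csv_path)
        summary[name] = {
            "rows": len(rows),
            "columns": list(rows[0].keys()) if rows else [],
        }
    return summary


if __name__ == "__main__":
    # Same placeholder path as in the command above.
    print(summarize_csvs("path/to/output/dir/"))
```

Printing the summary first is a quick way to see which fields are available before writing code that depends on them.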
- Download all oral history transcripts as JSON, plain text, and WebVTT files directly from transcribe.oralhistory.nypl.org
python get_transcripts.py -out "path/to/output/dir/"
This script creates the following in the output directory:
  - A manifest file, `transcripts.json`, with links to each interview's transcripts
  - Individual folders for each interview, each containing three transcript formats (`.json`, `.txt`, `.vtt`). The `.json` files contain all of the edits, while the `.txt` and `.vtt` files contain the "best guess" transcription for each line
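The `.vtt` transcripts can be read without extra dependencies. The following is a minimal sketch that assumes the files follow the standard WebVTT cue layout (a `start --> end` timing line followed by one or more text lines); the names `parse_vtt` and `to_seconds` are hypothetical helpers, not part of this repository, and the sketch ignores cue settings and styling.

```python
import re

# Matches a WebVTT timing line such as "00:00:01.000 --> 00:00:04.500".
# The hours component is optional in WebVTT.
TIMING = re.compile(
    r"^\s*((?:\d{2,}:)?\d{2}:\d{2}\.\d{3})\s+-->\s+((?:\d{2,}:)?\d{2}:\d{2}\.\d{3})"
)


def to_seconds(timestamp):
    """Convert "HH:MM:SS.mmm" or "MM:SS.mmm" to seconds as a float."""
    parts = timestamp.split(":")
    seconds = float(parts[-1])
    minutes = int(parts[-2])
    hours = int(parts[-3]) if len(parts) == 3 else 0
    return hours * 3600 + minutes * 60 + seconds


def parse_vtt(text):
    """Return a list of cues as dicts with "start", "end", and "text" keys."""
    cues = []
    current = None
    for line in text.splitlines():
        match = TIMING.match(line)
        if match:
            # A timing line opens a new cue.
            current = {
                "start": to_seconds(match.group(1)),
                "end": to_seconds(match.group(2)),
                "text": "",
            }
            cues.append(current)
        elif not line.strip():
            # A blank line closes the current cue.
            current = None
        elif current is not None:
            # Text lines inside a cue are joined with single spaces.
            current["text"] = (current["text"] + " " + line.strip()).strip()
        # Anything else (header, cue identifiers, notes) is ignored.
    return cues
```

A file would be parsed with something like `parse_vtt(open(path, encoding="utf-8").read())`, where `path` points at one of the downloaded `.vtt` files.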