We scraped https://www.msnbc.com/transcripts to collect all available transcripts from 2010--2021. Transcript counts by year:
year n_transcripts
2010 43
2011 115
2012 205
2013 175
2014 217
2015 986
2016 907
2017 1185
2018 1468
2019 1475
2020 1286
2021 1476
2022 131
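
As a quick sanity check on the counts above, here is a minimal Python sketch that parses the table and totals it. It is not part of the original scraping code; the names `TABLE`, `counts`, `total`, and `total_2010_2021` are illustrative assumptions. The table includes a 2022 row, so the sketch reports the total both with and without it.

```python
# Per-year counts copied verbatim from the table above.
TABLE = """\
year n_transcripts
2010 43
2011 115
2012 205
2013 175
2014 217
2015 986
2016 907
2017 1185
2018 1468
2019 1475
2020 1286
2021 1476
2022 131
"""

# Skip the header row and map each year to its transcript count.
counts = {}
for line in TABLE.splitlines()[1:]:
    year, n = line.split()
    counts[int(year)] = int(n)

# Total across every row in the table, including 2022.
total = sum(counts.values())

# Total restricted to the stated 2010--2021 range.
total_2010_2021 = sum(n for year, n in counts.items() if 2010 <= year <= 2021)

print(f"all rows: {total}")               # all rows: 9669
print(f"2010--2021: {total_2010_2021}")   # 2010--2021: 9538
```
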
The final data posted on the Harvard Dataverse also include about 16k transcripts spanning 2003--2014 that were scraped earlier. The data are posted at: https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi%3A10.7910%2FDVN%2FUPJDE1