Emma Tarcson
Summary: Can it be determined whether a passage of text comes from a Republican or a Democratic source? What rhetorical features do Democrats and Republicans commonly use (word choice, speech length, sentence length, type-token ratio (TTR), etc.)?
The data will be speech transcripts from RNC and DNC speakers. Because RNC and DNC speeches are known for their heavy use of persuasion tactics, and come from a system with two clearly opposing sides, focusing on transcripts from both sides will be useful for this project.
Because the data will come from existing corpora, clean-up will mainly consist of extracting word tokens and sentence lists. I will also use pandas to organize the information in the study. For size, I will look for more than three speeches each from RNC and DNC speakers. As of right now, I am not set on any particular corpus, but kaggle.com has a couple of options that I could use (https://www.kaggle.com/christianlillelund/2020-democratic-convention-speeches).
The data will also include debate speeches.
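As a rough illustration of the clean-up and measurements described above, here is a minimal sketch that splits a transcript into sentences and word tokens and computes TTR, mean sentence length, and mean word length. It uses only the standard library; the function names (`sentences`, `word_tokens`, `speech_metrics`) and the sample text are made up for this example, and the regex-based splitting is a simplifying assumption rather than a full tokenizer.

```python
import re

def sentences(text):
    """Split text into sentences on ., ! or ? followed by whitespace (rough heuristic)."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]

def word_tokens(text):
    """Return lowercase word tokens, keeping internal apostrophes (e.g. "don't")."""
    return re.findall(r"[a-z]+(?:'[a-z]+)*", text.lower())

def speech_metrics(text):
    """Compute a few simple style measures for one transcript."""
    sents = sentences(text)
    tokens = word_tokens(text)
    n = len(tokens)
    return {
        "n_tokens": n,
        "n_sentences": len(sents),
        # Type-token ratio: distinct words divided by total words.
        "ttr": len(set(tokens)) / n if n else 0.0,
        "mean_sentence_length": n / len(sents) if sents else 0.0,
        "mean_word_length": sum(map(len, tokens)) / n if n else 0.0,
    }

# Invented sample text, not taken from any real speech.
sample = "We will win. We will not stop! Will you join us?"
metrics = speech_metrics(sample)
print(metrics)
```

One dictionary per speech could then be collected into a list and passed to `pandas.DataFrame` to organize the study. Note that TTR tends to drop as a text gets longer, so comparing speeches of very different lengths may require computing it over equal-sized samples.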
The goal of this project is to analyze how Democrats and Republicans use particular language mechanics to persuade an audience toward a desired result, and how different or similar the two parties are in doing so. From there, I want to see whether it can be determined if a line of text is Republican or Democratic.
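One possible way to test whether a line of text can be assigned to a party is a bag-of-words naive Bayes classifier. The sketch below is an assumption on my part, not a method already chosen for this project: the `NaiveBayes` class and `tokenize` helper are written for this example, and the training lines are meaningless placeholder words with placeholder labels "A" and "B" standing in for real DNC and RNC transcripts.

```python
import math
import re
from collections import Counter, defaultdict

def tokenize(text):
    """Lowercase word tokens (simple regex, an assumption for this sketch)."""
    return re.findall(r"[a-z']+", text.lower())

class NaiveBayes:
    """Multinomial naive Bayes with add-one (Laplace) smoothing."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # label -> word frequencies
        self.doc_counts = Counter()              # label -> number of training texts
        self.vocab = set()

    def fit(self, texts, labels):
        for text, label in zip(texts, labels):
            tokens = tokenize(text)
            self.word_counts[label].update(tokens)
            self.doc_counts[label] += 1
            self.vocab.update(tokens)
        return self

    def predict(self, text):
        tokens = tokenize(text)
        total_docs = sum(self.doc_counts.values())
        vocab_size = len(self.vocab)
        best_label, best_score = None, -math.inf
        for label in self.doc_counts:
            # Log prior plus the sum of smoothed log likelihoods.
            score = math.log(self.doc_counts[label] / total_docs)
            total_words = sum(self.word_counts[label].values())
            for tok in tokens:
                count = self.word_counts[label][tok]
                score += math.log((count + 1) / (total_words + vocab_size))
            if score > best_score:
                best_label, best_score = label, score
        return best_label

# Placeholder training data: nonsense words, hypothetical labels.
train_texts = ["alpha beta alpha gamma", "alpha alpha delta",
               "omega sigma omega", "sigma omega kappa"]
train_labels = ["A", "A", "B", "B"]

model = NaiveBayes().fit(train_texts, train_labels)
print(model.predict("alpha gamma"))
print(model.predict("omega kappa"))
```

With real transcripts, some speeches would need to be held out from training so that accuracy is measured on text the classifier has not seen.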
To present the data clearly, I plan to use graphs. For example, a graph could show how often each kind of persuasion tactic appears. This project will also be apolitical and will not favor either side (any evaluative hypotheses about the data reflect only my personal views).
What builds a persuasive argument? What common factors make a rhetorical artifact effective and successful? With this project, I want to identify some of the main ways speech writers and authors frame their arguments. For example, I want to find out whether and how authors use persuasive devices such as word choice, speech length, sentence length, TTR, etc. to their advantage, and then whether the effectiveness of those devices can be inferred from that information. It would also be interesting to see whether the way people formulate their arguments has changed over time, provided I use both recent and older speeches. I would also like to see how many of these tactics are used in each speech overall.
If I choose not to go forward with the RNC speeches, I also found a dataset on kaggle.com of debate speeches from the United Nations from 1970 to 2016: https://www.kaggle.com/unitednations/un-general-debates
I definitely want to get an early start on exploring whichever dataset I choose, so that I can form a better understanding of where I want to go with my project.
The goal of this project is to analyze how speech writers use particular language mechanics to persuade an audience toward a desired result. For the analysis, it will be important to keep the context and audience of each speech in mind; otherwise the data will be less meaningful. My current hypothesis is that most speeches will tend to use short sentences and short words to keep the audience engaged. For the rhetorical factors, I might consider some kind of tagging for "positive" and "negative" words, etc.
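The "positive" and "negative" tagging idea could start from a simple word-list lookup. The sketch below is only an illustration: the two word sets are tiny, hand-picked placeholders rather than an established sentiment lexicon, and `tag_words` and `tag_counts` are names invented for this example.

```python
import re
from collections import Counter

# Hypothetical mini-lexicons; a real study would use a much larger, documented list.
POSITIVE = {"hope", "strong", "freedom", "proud"}
NEGATIVE = {"fear", "crisis", "failure", "weak"}

def tag_words(text):
    """Return (token, tag) pairs, where tag is 'positive', 'negative', or 'neutral'."""
    tagged = []
    for tok in re.findall(r"[a-z']+", text.lower()):
        if tok in POSITIVE:
            tag = "positive"
        elif tok in NEGATIVE:
            tag = "negative"
        else:
            tag = "neutral"
        tagged.append((tok, tag))
    return tagged

def tag_counts(text):
    """Count how many tokens fall under each tag."""
    return Counter(tag for _, tag in tag_words(text))

# Invented sample sentence, not from any real speech.
sample = "We are proud and strong, and we do not fear failure."
counts = tag_counts(sample)
print(counts)
```

A lookup like this ignores context: in the sample, "do not fear" is still counted as negative, so negation and similar cases would need separate handling.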
In terms of measuring "effectiveness", I don't think there is a truly reliable way to determine from a speech alone whether it actually succeeded in its purpose. However, through this data analysis, I hope to get a good idea of how speakers tend to use rhetorical devices to try to increase effectiveness. If I wanted to try to determine the actual result of the speeches, I could run my own experiment and ask subjects which speech they felt was more persuasive. Or maybe there is a pre-existing study that I could search for.
To present the data clearly, I plan to use graphs. For example, a graph could show the effectiveness of the persuasion tactics. This project will also be apolitical and will not favor either side (any evaluative hypotheses about the data reflect only my personal views). That being said, the presentation will include a comparison and contrast between the RNC and DNC speeches.