Data-Science-for-Linguists-2023/For-Reddit-Grammaticality-Analysis

For-Reddit-Grammaticality-Analysis

Camryn Simons

[email protected]

Due: April 30, 2023

Project Overview

This repository analyzes the grammaticality of posts on Reddit. Specifically, it seeks to answer one main question: Is there a way to categorize common grammatical errors?

That question breaks down into several subquestions:

Which grammatical errors are most prevalent on Reddit?

Are some grammatical errors more common in certain subreddits than in others?

Does the grammaticality of a post have an effect on its interactions?

Data Overview

All of the data used in this project can be found here. I collected it myself using PRAW, the Python Reddit API Wrapper.
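For readers unfamiliar with PRAW, the sketch below shows roughly what collecting posts with it can look like. It is not the code from the notebooks: the choice of fields, the use of the `hot` listing, the `PostRecord` class, and the helper names `make_client`, `to_record`, `collect_posts`, and `write_csv` are all assumptions made for illustration, and the credentials are placeholders.

```python
import csv
from dataclasses import asdict, dataclass, fields


@dataclass
class PostRecord:
    """One Reddit post, reduced to fields a grammaticality study might need (assumed)."""
    subreddit: str
    post_id: str
    title: str
    body: str
    score: int
    num_comments: int


def make_client(client_id, client_secret, user_agent):
    """Build a PRAW client. The import is local so the rest of the sketch runs without PRAW."""
    import praw  # third-party: pip install praw

    return praw.Reddit(
        client_id=client_id,
        client_secret=client_secret,
        user_agent=user_agent,
    )


def to_record(submission):
    """Copy the fields of interest from a PRAW Submission (or any object with the same attributes)."""
    return PostRecord(
        subreddit=str(submission.subreddit),
        post_id=submission.id,
        title=submission.title,
        body=submission.selftext,
        score=submission.score,
        num_comments=submission.num_comments,
    )


def collect_posts(reddit, subreddit_name, limit=100):
    """Fetch up to `limit` posts from a subreddit's hot listing (the listing choice is an assumption)."""
    return [to_record(s) for s in reddit.subreddit(subreddit_name).hot(limit=limit)]


def write_csv(records, handle):
    """Write records to an open text handle as CSV with a header row."""
    writer = csv.DictWriter(handle, fieldnames=[f.name for f in fields(PostRecord)])
    writer.writeheader()
    for record in records:
        writer.writerow(asdict(record))
```

A typical call would be `collect_posts(make_client("YOUR_CLIENT_ID", "YOUR_CLIENT_SECRET", "your-user-agent"), "linguistics", limit=50)`, followed by `write_csv` on a file opened with `newline=""`. Keeping `score` and `num_comments` alongside the text is what would let post interactions be compared against grammaticality later.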

Directory

  • final_report: This file contains my final report regarding this project.
  • Data Samples: This folder contains samples of the data, collected from Reddit, that I work with in my notebooks.
  • Notebooks: This folder contains several notebooks for data collection, organization, and analysis.
  • Git Ignore: This file contains information regarding which files my repository should ignore.
  • License: This file contains information regarding the license I have chosen for my repository.
  • README: This is where you are now! This file contains information regarding my repository.
  • Progress Report: This file contains information regarding the progress I have made on my project at various points in the semester.
  • Project Plan: This file contains information regarding the original plan for this project.

Guestbook

My guestbook for this repository can be accessed here.

About

This is Camryn's term project for Data Science for Linguists.
