Title of your project proposal
==============================
Video game reviews: text classification and predicted ratings
Background and Motivation
Discuss your motivations and reasons for choosing this project, especially any
background or research interests that may have influenced your decision.
================================================================================
Since their inception, video games have been something of a fringe cultural
phenomenon. Lacking the aesthetic appeal of film or literature and the
institutional legitimacy of more traditional games like chess, video games
long occupied an awkward place in the world of leisure activities.
But in the past decade, with the development of new family-oriented gaming
consoles like the Wii, the creation of artistically rendered and presented
games like Heavy Rain and The Last of Us, and the organization of
pre-professional and competitive gaming societies, video games have been slowly
evolving from visual toys used exclusively by male teenagers into a new form
of media with which people of widely different demographic backgrounds can
engage.
The potential for mediated reality that video games offer further ensures
that their relevance will continue to grow. As this relevance grows, and as the
demand for a wider assortment of video games beyond the traditional trifecta of
action/adventure, role-playing and shooter games grows in tandem, there will be
a greater need for video game criticism akin to that which exists for film and
television today.
Data scientists have a role in categorizing and understanding this criticism.
By applying textual analysis techniques, we can determine whether a review for
a video game is positive or negative and thus whether a prospective buyer
should purchase the game. Moreover, not all online video game
reviews/discussions are associated with a numerical score. Developing a
text-based review classification scheme represents a step towards being able to
determine what any arbitrary online text related to a video game suggests about
the quality of that video game.
Project Objectives
What are the scientific and inferential goals for this project? What would you
like to learn and accomplish? List the benefits.
================================================================================
The broad goal of this project is to explore the possibility of classifying the
sentiment (colloquially constrained to be either "good" or "bad") of video
game reviews by algorithmically analyzing their summary text. In more detail,
we want to compare the performance of various classifiers as applied to this
task and understand why certain classifiers might be more accurate than others.
We also would like to determine the extent to which video game review summary
text can be used to predict the corresponding numerical rating of a review.
As a first benefit, the project extends the domain of applicability of textual
analysis from movie reviews (e.g. last year's "Bayesian Tomatoes" assignment)
to video game reviews. As a secondary benefit, our methods could perhaps be
adapted to more general contexts where the community opinion for a certain game
could be determined by scraping together online text related to the game. This
"opinion" could then perhaps be used to pre-emptively gauge the potential
demand for a game which has yet to be widely adopted or even released.
Must-Have Features
These are features or calculations without which you would consider your
project to be a failure.
================================================================================
The first must-have feature of our project is the data collection, in
particular gathering thousands of video game review text summaries and their
corresponding numerical ratings. We have already successfully performed this
data collection by using the Giant Bomb API and by scraping the GameSpot
website (see below for further details).
Second, we must train and run a Naive Bayes classifier on the review summary corpus.
Third, we must attempt a k-nearest neighbors regression to predict numerical
review ratings based on review summary text.
Optional Features
Those features or calculations which you consider would be nice to have, but
not critical.
================================================================================
As part of our exploratory analysis we might want to investigate how certain
metadata variables affect video game reviews. Time permitting, it would be
interesting to incorporate such metadata variables into our classifiers via
priors. Some examples of relevant questions are: Is there a time trend
associated with video game review scores? Do certain video game reviewers
consistently give better or worse ratings than the average? Do the same
companies consistently produce highly rated games?
Other optional features are associated with additional classification options.
For instance, Naive Bayes features are single words. But, time permitting, it
would be interesting to investigate using n-grams of words, e.g. pairs of words
as features. Also, it would be interesting to compare Naive Bayes
classification to random forest classification, still using single-word
features.
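As a rough illustration of the n-gram idea, the sketch below extracts word
pairs from a summary. The function name ngrams and the tokenizing regular
expression are our own assumptions for illustration. Pairs such as
("not", "good") keep negation together in a way single words cannot.

```python
import re


def ngrams(text, n=2):
    """Return the word n-grams of text as a list of tuples.

    Tokenization (lowercase, letters and apostrophes only) is an
    assumption made for this sketch.
    """
    words = re.findall(r"[a-z']+", text.lower())
    # A text shorter than n words yields an empty list.
    return [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]


print(ngrams("Not very good", n=2))
```

With n=1 the same function returns single-word features, so the baseline and
the n-gram variant could share one code path.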
Another possible avenue we may not have time to explore is comparing formally
published reviews to informal user reviews found online.
What Data?
From where and how are you collecting your data?
================================================
We have already gathered data from two sources. Our first source is the Giant
Bomb video gaming website, which conveniently offers APIs for published
reviews, gaming company information and user reviews. Giant Bomb review scores
are integers, 1 (worst) through 5 (best). The Giant Bomb API is helpful in that
we can access review summaries, full review text and plenty of metadata for
each game without scraping. The downside is that only 646 published reviews are
available via the API, which is likely insufficient given that we anticipate
requiring a training set of several thousand review summaries. On the other
hand, GameSpot has 13,702 published reviews, but no API. We have therefore
written a web scraper to gather all of the review summaries and corresponding
scores for every GameSpot review published from 1996 to the present. Although
we now have these GameSpot review summaries and scores in hand, the lack of a
GameSpot API makes gathering further metadata difficult and tedious. GameSpot
review scores are more finely grained, ranging from 0 (worst) to 10 (best),
including some non-integer scores.
The crucial data for our text classification are review summaries for published
reviews (typically ~20 words) and their corresponding review scores. For each
source of reviews, we have already gathered the review summary text and scores.
We will need to make a cut on the review scores from each source to split
reviews into labels of "good" and "bad".
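The cut itself could look like the sketch below. The threshold values are
hypothetical placeholders, since we have not yet fixed them, and the sample
summaries are made up for illustration. Each source gets its own cut because
the two score scales differ.

```python
# Hypothetical cut points; the actual values are still to be chosen.
GIANT_BOMB_CUT = 4    # integer scores 1-5: 4 and 5 would count as "good"
GAMESPOT_CUT = 7.0    # scores 0-10: 7.0 and above would count as "good"


def label_review(score, cut):
    """Return "good" if score is at or above the cut, otherwise "bad"."""
    return "good" if score >= cut else "bad"


def label_reviews(reviews, cut):
    """Turn (summary, score) pairs into (summary, label) pairs."""
    return [(summary, label_review(score, cut)) for summary, score in reviews]


# Made-up sample data, not taken from either site.
sample = [("A tense, gorgeous thriller.", 9.0), ("Buggy and dull.", 3.5)]
print(label_reviews(sample, GAMESPOT_CUT))
```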
Design Overview
List the statistical and computational methods you plan to use.
===============================================================
Our final analysis will center around applying and comparing various text
sentiment classification techniques. For our baseline classification model, we
will apply Naive Bayes classification to review summaries to predict whether a
review is "good" (favorable) or "bad" (unfavorable), in a manner analogous to
that employed by last year's "Bayesian Tomatoes" assignment, including
cross-validation. Individual words will be used as features. We will also extend our
baseline Naive Bayes classifier in several ways. One example would be to use
the "stemming" technique from natural language processing to merge features
which are the same in sentiment such that e.g. "awesomeness" and "awesome"
would both map to a single feature. Second, we could incorporate priors on e.g.
reviewer or game manufacturer during Naive Bayes classification. We also hope
to attempt random forest classification for the sake of comparison. Time
permitting, we may explore using n-grams as features rather than single words.
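A minimal sketch of the baseline classifier follows, assuming a multinomial
Naive Bayes model with add-one smoothing and a simple regular-expression
tokenizer. The class and function names are ours and the training sentences in
the usage note are made up. Our final analysis may well rely on a library
implementation instead.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase a summary and split it into single-word features."""
    return re.findall(r"[a-z']+", text.lower())


class NaiveBayes:
    """Multinomial Naive Bayes with add-alpha (Laplace) smoothing."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha

    def fit(self, texts, labels):
        self.classes = sorted(set(labels))
        self.doc_counts = Counter(labels)           # reviews per class
        self.word_counts = {c: Counter() for c in self.classes}
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocab = set()
        for counts in self.word_counts.values():
            self.vocab.update(counts)
        return self

    def log_scores(self, text):
        """Return the unnormalized log posterior of each class."""
        n_docs = sum(self.doc_counts.values())
        scores = {}
        for c in self.classes:
            total = sum(self.word_counts[c].values())
            denom = total + self.alpha * len(self.vocab)
            # Class prior estimated from the training label frequencies.
            score = math.log(self.doc_counts[c] / n_docs)
            for word in tokenize(text):
                if word in self.vocab:  # skip words never seen in training
                    count = self.word_counts[c][word]
                    score += math.log((count + self.alpha) / denom)
            scores[c] = score
        return scores

    def predict(self, text):
        scores = self.log_scores(text)
        return max(scores, key=scores.get)
```

For example, after fitting on the made-up summaries "great fun and great art"
and "fun and polished" (good) and "boring and buggy" and "buggy mess" (bad),
the model labels "great fun" as good. Stemming would slot into tokenize, and
metadata-based priors would replace the label-frequency prior in log_scores.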
In terms of predicting the scores of ratings, we will attempt to use k-nearest
neighbors regression as a baseline, and attempt other regression methods as
time permits.
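The k-nearest neighbors baseline could be sketched as below, assuming
bag-of-words vectors compared by cosine similarity and an unweighted mean over
the k closest training summaries. These choices, and all names in the code,
are assumptions for illustration rather than settled design decisions.

```python
import math
import re
from collections import Counter


def bag_of_words(text):
    """Count the single words in a summary."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


def cosine_similarity(a, b):
    """Cosine similarity of two word-count vectors (0.0 if either is empty)."""
    dot = sum(count * b[word] for word, count in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def knn_predict(train, query, k=3):
    """Predict a score as the mean score of the k most similar summaries.

    train is a non-empty list of (summary_text, score) pairs.
    """
    query_vec = bag_of_words(query)
    ranked = sorted(
        train,
        key=lambda pair: cosine_similarity(bag_of_words(pair[0]), query_vec),
        reverse=True,
    )
    neighbors = ranked[:k]
    return sum(score for _, score in neighbors) / len(neighbors)
```

This brute-force version recomputes every similarity per query, which is fine
for a sketch but would be slow on several thousand summaries.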
Verification
How will you verify your project's results? In other words, how do you know
that your project does well?
================================================================================
There are two main components of our verification process. First, for each
classifier in our final analysis, we will need to split our data into train and
test sets, and then make predictions for the unseen test data to verify
that our model generalizes properly to additional data and is not overfit.
A second verification step will be to compare our classification accuracy
to the "Bayesian Tomatoes" movie review classification accuracy of ~77%. This
will tell us whether we did "well" or did poorly on our video game review
predictions relative to the previously studied case of movie reviews from
Rotten Tomatoes.
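Both verification steps can be sketched in a few lines. The split fraction,
the seed and the function names below are assumptions for illustration; the
0.77 figure is the "Bayesian Tomatoes" accuracy quoted above.

```python
import random

BASELINE_ACCURACY = 0.77  # "Bayesian Tomatoes" movie review accuracy (approx.)


def train_test_split(items, test_fraction=0.3, seed=0):
    """Shuffle items reproducibly and return (train, test) lists."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def accuracy(predict, test_items):
    """Fraction of (text, label) pairs for which predict(text) == label."""
    correct = sum(1 for text, label in test_items if predict(text) == label)
    return correct / len(test_items)


def beats_baseline(test_accuracy, baseline=BASELINE_ACCURACY):
    """True if held-out accuracy is at least the movie review baseline."""
    return test_accuracy >= baseline
```

Here predict is any function mapping a review summary to a label, so the same
helpers would apply to each classifier we compare.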
Visualization & Presentation
How will you visualize and communicate your results?
====================================================
Visualizations will be important at various stages in our analysis. First,
we will include exploratory visualizations near the beginning of our process
book to answer questions like: What is the distribution of review scores? What
is the time trend of the average review score over the years? What is the
average review score grouped by company, console, or reviewer? Such plots
would include basic histograms, line plots and bar charts.
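The grouped averages behind such plots could be computed as in the sketch
below. The record layout (dictionaries with "score", "year" and similar keys)
is a hypothetical one chosen for illustration, not the format of our stored
data.

```python
from collections import defaultdict


def mean_score_by(records, key):
    """Average the "score" field of records, grouped by records[key]."""
    groups = defaultdict(list)
    for record in records:
        groups[record[key]].append(record["score"])
    return {group: sum(scores) / len(scores) for group, scores in groups.items()}
```

Calling mean_score_by(records, "year") would give the points of the time trend
line plot, and the same call with a company, console or reviewer key would
give the bar chart heights.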
Second, we will want to use visualizations to display our model calibration
findings, similar to the "Bayesian Tomatoes" part 3.4 plots.
Lastly, we will want to create one or more "take-home" visualizations that
illustrate the results of our final classification analysis. One such
visualization we intend to make would be JavaScript word clouds of the most
strongly positive and negative words, wherein mousing over each word displays a
tooltip showing P(good|word).
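One simple way to obtain the tooltip values is sketched below. It assumes that
P(good|word) is estimated as the smoothed fraction of reviews containing the
word that are labeled "good". This is only one possible estimator, and the
function name and smoothing constant are our own choices.

```python
import re
from collections import Counter


def p_good_given_word(reviews, alpha=1.0):
    """Estimate P(good | word) from (summary, label) pairs.

    Each word is counted once per review. Add-alpha smoothing over the
    two labels pulls the estimate for rare words toward 0.5.
    """
    good = Counter()
    total = Counter()
    for summary, label in reviews:
        for word in set(re.findall(r"[a-z']+", summary.lower())):
            total[word] += 1
            if label == "good":
                good[word] += 1
    return {w: (good[w] + alpha) / (total[w] + 2 * alpha) for w in total}
```

Sorting the resulting dictionary by value would give the most strongly
positive and negative words to feed into the word clouds.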
Schedule / timeline
Make sure that you plan your work so that you can avoid a big rush right before
the final project deadline, and delegate different modules and responsibilities
among your team members. Write this in terms of weekly deadlines.
=================================================================
There are roughly 4 weeks until the video presentation and website are due.
Having scraped GameSpot reviews and queried the Giant Bomb API, we have already
gathered our main data sets for analysis. We have also answered some basic
exploratory analysis questions, such as the distribution of Giant Bomb and
GameSpot scores. From now until the project presentation deadline, we hope to
meet the following other objectives.
Week 1
- Nov 16 - 23, 2014
- Investigate all other exploratory analysis questions.
- Precisely define main question for analysis next week.
Week 2
- Nov 23 - 30, 2014
- Complete major portions of classification analysis.
Week 3
- Nov 30 - Dec 7, 2014
- Complete first draft of process book; this includes the "narrative" of the
project in addition to the visualizations and main code resulting
from last week's analysis.
- Make software preparations for website and video construction.
- Have website template prepared.
- Acquire hardware and programs which are needed to make video.
Week 4
- Dec 7 - 10, 2014
- Edit process book and prepare final draft for submission.
- Outline sections of website.
- Prepare script for video presentation.
- Dec 10 - 12, 2014
- Complete website.
- Complete video presentation.