The objective of this project is to extract article text from URLs and perform text analysis on it to compute a set of variables.

pushkarsaini18/Data-Mining

Data-Mining

Data Extraction

For each article, extract the article text and save it to a text file named after its URL_ID. The program must extract only the article title and the article body. It must not extract the website header, footer, or anything else outside the article.
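The extraction step could be sketched as follows. This is a minimal illustration and not the repository's actual code: it assumes the title sits in the first `<h1>` and the body in `<p>` tags outside `<header>`, `<footer>`, `<nav>`, and `<aside>`, which will not hold for every site. The names `ArticleParser`, `extract_article`, `save_article`, and the `articles` output directory are hypothetical.

```python
from html.parser import HTMLParser
from pathlib import Path
from urllib.request import urlopen


class ArticleParser(HTMLParser):
    """Collect the first <h1> as the title and <p> text outside page chrome."""

    # Assumption: anything inside these containers is not article content.
    SKIP = {"header", "footer", "nav", "aside", "script", "style"}

    def __init__(self):
        super().__init__()
        self.title = ""
        self.paragraphs = []
        self._skip_depth = 0  # > 0 while inside a skipped container
        self._capture = None  # "h1" or "p" while inside one of those tags
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1
        elif self._skip_depth == 0 and self._capture is None and tag in ("h1", "p"):
            self._capture = tag
            self._buffer = []

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth > 0:
            self._skip_depth -= 1
        elif tag == self._capture:
            # Collapse runs of whitespace left over from the markup.
            text = " ".join("".join(self._buffer).split())
            if tag == "h1" and not self.title:
                self.title = text
            elif tag == "p" and text:
                self.paragraphs.append(text)
            self._capture = None

    def handle_data(self, data):
        if self._capture and self._skip_depth == 0:
            self._buffer.append(data)


def extract_article(html: str) -> str:
    """Return the title followed by the body paragraphs, one per line."""
    parser = ArticleParser()
    parser.feed(html)
    parser.close()
    return "\n".join([parser.title, *parser.paragraphs]).strip()


def fetch_html(url: str) -> str:
    """Download a page; undecodable bytes are replaced rather than raising."""
    with urlopen(url) as response:
        return response.read().decode("utf-8", errors="replace")


def save_article(url_id: str, html: str, out_dir: str = "articles") -> Path:
    """Write the extracted text to <out_dir>/<url_id>.txt and return the path."""
    path = Path(out_dir) / f"{url_id}.txt"
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(extract_article(html), encoding="utf-8")
    return path
```

A call such as `save_article(url_id, fetch_html(url))` would then produce one text file per URL_ID. Real pages often need site-specific selectors, so the skip list above is only a starting point.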

Data Analysis

For each extracted article text, perform textual analysis and compute the variables listed below. Save the output in CSV format.

Variable

POSITIVE SCORE

NEGATIVE SCORE

POLARITY SCORE

SUBJECTIVITY SCORE

AVG SENTENCE LENGTH

PERCENTAGE OF COMPLEX WORDS

FOG INDEX

AVG NUMBER OF WORDS PER SENTENCE

COMPLEX WORD COUNT

WORD COUNT

SYLLABLE PER WORD

PERSONAL PRONOUNS

AVG WORD LENGTH
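The variables above could be computed along these lines. The README does not define the formulas, so everything here is an assumption: the word lists are tiny placeholders, a complex word is taken to be one with more than two syllables, syllables are estimated by counting vowel groups, polarity is (positive − negative) / (positive + negative), subjectivity is (positive + negative) / word count, and the fog index uses the standard Gunning form 0.4 × (average sentence length + percentage of complex words). The function names and the `URL_ID` column are hypothetical.

```python
import csv
import re

# Placeholder lexicons; the source does not say which word lists are used.
POSITIVE_WORDS = {"good", "great", "excellent", "positive"}
NEGATIVE_WORDS = {"bad", "poor", "terrible", "negative"}
PRONOUN_RE = re.compile(r"\b(?:I|we|my|ours|us)\b", re.IGNORECASE)
WORD_RE = re.compile(r"[A-Za-z]+(?:'[A-Za-z]+)?")
EPS = 1e-6  # keeps the score ratios defined when a denominator is zero


def count_syllables(word):
    """Rough estimate: vowel groups, discounting a trailing 'es' or 'ed'."""
    word = word.lower()
    count = len(re.findall(r"[aeiou]+", word))
    if word.endswith(("es", "ed")) and count > 1:
        count -= 1
    return max(count, 1)


def analyze(text):
    """Return a dict with one entry per variable in the list above."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = WORD_RE.findall(text)
    n_words = len(words) or 1  # avoid division by zero on empty text
    n_sentences = len(sentences) or 1
    lowered = [w.lower() for w in words]
    positive = sum(w in POSITIVE_WORDS for w in lowered)
    negative = sum(w in NEGATIVE_WORDS for w in lowered)
    syllables = [count_syllables(w) for w in words]
    complex_count = sum(s > 2 for s in syllables)
    avg_sentence_length = len(words) / n_sentences
    pct_complex = 100 * complex_count / n_words
    # Assumption: the all-caps token "US" is a country name, not a pronoun.
    pronouns = sum(1 for m in PRONOUN_RE.findall(text) if m != "US")
    return {
        "POSITIVE SCORE": positive,
        "NEGATIVE SCORE": negative,
        "POLARITY SCORE": (positive - negative) / (positive + negative + EPS),
        "SUBJECTIVITY SCORE": (positive + negative) / (len(words) + EPS),
        "AVG SENTENCE LENGTH": avg_sentence_length,
        "PERCENTAGE OF COMPLEX WORDS": pct_complex,
        "FOG INDEX": 0.4 * (avg_sentence_length + pct_complex),
        "AVG NUMBER OF WORDS PER SENTENCE": avg_sentence_length,
        "COMPLEX WORD COUNT": complex_count,
        "WORD COUNT": len(words),
        "SYLLABLE PER WORD": sum(syllables) / n_words,
        "PERSONAL PRONOUNS": pronouns,
        "AVG WORD LENGTH": sum(len(w) for w in words) / n_words,
    }


def write_csv(articles, path):
    """Write one row per (url_id, text) pair; does nothing for an empty input."""
    records = [{"URL_ID": url_id, **analyze(text)} for url_id, text in articles]
    if not records:
        return
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=list(records[0]))
        writer.writeheader()
        writer.writerows(records)
```

In this sketch AVG SENTENCE LENGTH and AVG NUMBER OF WORDS PER SENTENCE come out identical because both are measured in words per sentence, and WORD COUNT includes every token. A project that removes stop words before counting would get different values.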
