Suppose we were a large book publisher and a writer came to us with a manuscript: how could we know whether the book would be successful? And if we were the authors, could we ever know whether the book would win the audience's sympathy, or even reach the cinema? To answer these questions, we set a goal for our research: to see whether we can build a model that predicts, from a list of book features, whether a book is successful enough to also win awards.
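The goal above can be framed as a binary classification problem: given numeric book features, predict whether the book wins an award. A minimal sketch using scikit-learn's RandomForestClassifier on entirely synthetic data (the feature names and the "awarded" rule below are illustrative assumptions, not the notebook's real dataset):

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(42)
n = 500

# Synthetic book features (hypothetical columns, for illustration only):
# average rating, number of ratings, page count, edition count
X = np.column_stack([
    rng.uniform(1.0, 5.0, n),       # avg_rating
    rng.integers(10, 100_000, n),   # ratings_count
    rng.integers(50, 1_200, n),     # num_pages
    rng.integers(1, 50, n),         # editions_count
])

# Toy label: pretend highly rated, widely rated books get awarded
y = ((X[:, 0] > 3.8) & (X[:, 1] > 20_000)).astype(int)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(X_train, y_train)
print(f"test accuracy: {clf.score(X_test, y_test):.2f}")
```

The real experiment (sections below) scrapes actual book data, cleans it, and compares a single decision tree against a tuned random forest.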
- Introduction
- Imports
- Data acquisition
3.1 Scraping challenges
3.2 Scraping clean data
3.3 Authentication process
3.4 Authentication class
3.5 Scraping Process
3.6 Book Spider Class
3.7 Scraping route creation
3.8 Genre spider
- Scraping and threading
4.1 First crawl
4.2 Concatenating data
4.3 Total data scraped
- Data cleaning
5.1 Corrupted data cleaning
5.2 Replace missing data - original title
5.3 None values - discussion and strategy
- Pre-outliers-cleaning EDA
6.1 Genre distribution
6.2 Mean rating by genre
6.3 Language distribution
6.4 Edition count to rating
6.5 Rating to award
6.6 Pages count to books count
- Dealing with outliers
7.1 Outliers detection
7.2 Outliers cleaning
7.3 Outliers cleaning results
- EDA after outliers cleaning
8.1 Thoughts on the results
8.2 Aggregation metrics
8.3 Original title correlation with awards
8.4 Awards count per genre
8.5 Awards percentage by genre
- Machine learning preparation
- Machine learning - Decision tree
10.1 Single decision tree
10.2 First prediction
10.3 New dimension - The ace up the sleeve
10.4 Depth optimization
- Machine learning - Random forest
11.1 Overfitting?
11.2 Model improvement
11.3 Adjusting features
11.4 Grid search over many forests
11.5 F-score accuracy addition
11.6 Random state tests
- Conclusion and credits
For the full implementation, visit the hosted notebook:
https://chapost1.github.io/books-success-prediction-experiment/