ExploratoryDataAnalysis

Illustration on merging of multiple datasets for implementing Machine Learning algorithms

1. a. A very common scenario while working with datasets to build any model, using Machine Learning Algorithms, is the merging of different datasets of train samples.

1. b. This step arrives even before Exploratory Data Analysis. So, here is an illustration regarding the same.

2.a. train1.csv: Employee Number,Employee Name,Employee Department

are the columns in first csv file of train data

2.b. train2.csv: Employee Job Level,Employee Salary

are the columns in second csv file of train data

2.c. train3.csv: Employee Id ,Employee Job Level

are the columns in third csv file of train data

3.a. We will have to merge these 3 train sets in order to begin with Exploratory Data Analysis.

3.b. Merging will be based on Employee Number(train1.csv) and Employee Id(train3.csv); post-renaming both these columns as they are the unique identifiers.

3.c. The next merge will be between the resultant of (train1.csv and train3.csv) AND train2.csv on Employee Job Level column.

4. Tools and Technologies used: Jupyter Notebook, Python3, pandas (python library)

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

ExploratoryDataAnalysis

Illustration on merging of multiple datasets for implementing Machine Learning algorithms

1. a. A very common scenario while working with datasets to build any model, using Machine Learning Algorithms, is the merging of different datasets of train samples.

1. b. This step arrives even before Exploratory Data Analysis. So, here is an illustration regarding the same.

2.a. train1.csv: Employee Number,Employee Name,Employee Department

2.b. train2.csv: Employee Job Level,Employee Salary

2.c. train3.csv: Employee Id ,Employee Job Level

3.a. We will have to merge these 3 train sets in order to begin with Exploratory Data Analysis.

3.b. Merging will be based on Employee Number(train1.csv) and Employee Id(train3.csv); post-renaming both these columns as they are the unique identifiers.

3.c. The next merge will be between the resultant of (train1.csv and train3.csv) AND train2.csv on Employee Job Level column.

4. Tools and Technologies used: Jupyter Notebook, Python3, pandas (python library)

About

Releases

Packages

Languages

Name		Name	Last commit message	Last commit date
Latest commit History 3 Commits
README.md		README.md
Reading data from multiple Datasets- EDA and Data pre-processing.ipynb		Reading data from multiple Datasets- EDA and Data pre-processing.ipynb
train1.csv		train1.csv
train2.csv		train2.csv
train3.csv		train3.csv

Arkaprabha-B/ExploratoryDataAnalysis_Merge

Folders and files

Latest commit

History

Repository files navigation

ExploratoryDataAnalysis

Illustration on merging of multiple datasets for implementing Machine Learning algorithms

1. a. A very common scenario while working with datasets to build any model, using Machine Learning Algorithms, is the merging of different datasets of train samples.

1. b. This step arrives even before Exploratory Data Analysis. So, here is an illustration regarding the same.

2.a. train1.csv: Employee Number,Employee Name,Employee Department

2.b. train2.csv: Employee Job Level,Employee Salary

2.c. train3.csv: Employee Id ,Employee Job Level

3.a. We will have to merge these 3 train sets in order to begin with Exploratory Data Analysis.

3.b. Merging will be based on Employee Number(train1.csv) and Employee Id(train3.csv); post-renaming both these columns as they are the unique identifiers.

3.c. The next merge will be between the resultant of (train1.csv and train3.csv) AND train2.csv on Employee Job Level column.

4. Tools and Technologies used: Jupyter Notebook, Python3, pandas (python library)

About

Topics

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages