For almost a decade, Stack Overflow has conducted an annual developer survey, asking about everything from technologies and behaviors to questions that help it improve its community for developers around the world. In the surveys from 2015 to 2020, regardless of each year's main theme, job satisfaction level has always been among the questions, alongside personal questions such as gender, level of education, country of origin, continent, company size and so on. These data sets are robust, since every year around 65,000 to 100,000 developers from all around the globe take part.
No additional libraries should be needed to run the code here beyond the Anaconda distribution of Python. The code should run without issues on Python versions 3.*.
This project is part of the 'UDACITY Data Science Nanodegree' program. I was interested in using Stack Overflow survey results from 2015 to 2020 to better understand:
- Is there a relationship between size of the firm and employee satisfaction? (Company size Vs satisfaction)
- Which countries have the most satisfied developers, and which have the least? (Country Vs satisfaction)
- How satisfied are developers based on their continent? (Continent Vs satisfaction)
- Does the amount of salary affect job satisfaction? (Salary Vs satisfaction)
- What role does age range play in satisfaction? Can we say that the level of satisfaction increases or decreases as developers grow older? (Age Vs Satisfaction)
- Does gender matter in the level of job satisfaction in this field? (Gender Vs Satisfaction)
- What is the overall trend of job satisfaction based on the survey year? (Year Vs Satisfaction)
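Most of these questions come down to comparing an average satisfaction score across the categories of one column. The snippet below is a minimal sketch of that idea using pandas (which ships with Anaconda). The column names `Company_Size` and `Job_Satisfaction`, the 1 to 5 scale, the helper `mean_satisfaction_by`, and the toy rows are all assumptions made for illustration; they are not taken from the notebooks or from the survey data.

```python
import pandas as pd

# Toy rows standing in for prepared survey data (NOT real survey results).
# Column names and the 1-5 satisfaction scale are assumptions.
df = pd.DataFrame({
    "Company_Size": ["Small", "Small", "Large", "Large", "Large"],
    "Job_Satisfaction": [4, 2, 5, 3, 4],
})


def mean_satisfaction_by(frame, column):
    """Return the mean satisfaction score per category of `column`,
    sorted from most to least satisfied (hypothetical helper)."""
    return (
        frame.groupby(column)["Job_Satisfaction"]
        .mean()
        .sort_values(ascending=False)
    )


result = mean_satisfaction_by(df, "Company_Size")
print(result)
```

The same helper could be pointed at a country, continent, gender, or survey-year column, assuming those columns exist in the prepared data.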
There are 3 notebooks available to showcase the work related to the above questions. The first notebook, Data_Preparion, removes dissimilarities between the selected data across all the survey results, for example by bringing selected columns such as 'Salary_Range' or 'Age_Range' to the same terms and conditions. The second notebook, Data_Visualization, gives insight into the relation and influence of different parameters on job satisfaction by normalizing and plotting the results. In the Prediction_Model notebook, different machine learning models are applied to the results.
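To give a rough idea of what the preparation and normalization steps involve, here is a small sketch in plain Python. It maps differently worded age labels onto one shared 'Age_Range' label, then turns raw counts into shares within each group so that groups of different sizes can be compared. The label variants in `AGE_RANGE_MAP`, the satisfaction levels, and both helper functions are illustrative assumptions, not the code used in the notebooks.

```python
from collections import Counter, defaultdict

# Hypothetical label variants mapped to one shared 'Age_Range' label.
AGE_RANGE_MAP = {
    "25-29": "25-34",
    "30-34": "25-34",
    "25 - 34 years old": "25-34",
    "35-39": "35-44",
    "40-44": "35-44",
    "35 - 44 years old": "35-44",
}


def harmonize_age(label):
    """Return the shared age-range label, or None if the label is
    missing or not recognized."""
    if label is None:
        return None
    return AGE_RANGE_MAP.get(label.strip())


def normalized_shares(rows):
    """Given (group, satisfaction_level) pairs, return
    {group: {level: share}} where shares within a group sum to 1.
    Pairs with a missing group or level are skipped."""
    counts = defaultdict(Counter)
    for group, level in rows:
        if group is not None and level is not None:
            counts[group][level] += 1
    return {
        group: {level: n / sum(c.values()) for level, n in c.items()}
        for group, c in counts.items()
    }


# Toy answers (NOT real survey data) with mixed label styles.
raw = [
    ("25-29", "satisfied"),
    ("30-34", "dissatisfied"),
    ("25 - 34 years old", "satisfied"),
    ("35-39", "satisfied"),
    ("unknown", "satisfied"),  # unrecognized label, dropped below
]
rows = [(harmonize_age(age), level) for age, level in raw]
shares = normalized_shares(rows)
print(shares)
```

Working with shares rather than raw counts matters here because the number of respondents differs a lot between groups and between survey years.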
The main findings of the code can be found in the post available here and in the PDF JS_results.pdf. These results show that across the IT industry worldwide, the average level of job satisfaction is very similar regardless of gender, education degree, salary, company size, and so on.
Credit goes to Stack Overflow for the data. You can find the licensing and datasets for all the survey results on the Stack Overflow Annual Developer Survey website here. Otherwise, feel free to use the code and datasets here!