GitHub Pages: https://priauwindu.github.io/California-School-Dataset-Exploration/
The California School Dataset Exploration repository analyzes the Caschool dataset from the R package Ecdat, which contains information on California public schools from 1998-1999. The dataset has 420 observations and includes variables on student test scores, student-teacher ratios, expenditures per student, district income, and other school characteristics.
The dataset is commonly used in educational research to investigate the factors influencing student achievement and evaluate the effectiveness of policies and interventions. It also serves as a valuable teaching tool for statistics and econometrics courses, providing real-world data to illustrate statistical and modeling techniques.
The primary objective of this repository is to explore, using multiple linear regression, how student test scores in California public schools relate to several school and district factors.
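As a rough sketch of what such a model looks like, the regression can be written as below. This assumes the model uses exactly the five predictors named in the findings and that the response is the average test score, called `testscr` in Ecdat; the specification actually fitted in the repository may differ, for example through transformations or additional terms.

$$
\text{testscr}_i = \beta_0 + \beta_1\,\text{expnstu}_i + \beta_2\,\text{avginc}_i + \beta_3\,\text{mealpct}_i + \beta_4\,\text{str}_i + \beta_5\,\text{elpct}_i + \varepsilon_i
$$

Here $i$ indexes the 420 observations, each $\beta_j$ is the expected change in test score for a one-unit increase in that predictor with the others held fixed, and $\varepsilon_i$ is the error term.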
The analysis of the "caschool" dataset using multiple linear regression revealed significant associations between student test scores in California public schools and various factors. The findings can be summarized as follows:
1. Factors positively associated with higher test scores:
- Higher expenditures per student (expnstu)
- Higher average income in the school district (avginc)
2. Factors negatively associated with student test scores:
- Higher percentage of students qualifying for free or reduced-price meals (mealpct)
- Higher student-teacher ratio (str)
- Higher percentage of English learners (elpct) in the school
These results suggest that investing more resources in education, addressing income-related issues, combating poverty, reducing class sizes, and addressing language barriers for English learners could potentially improve student achievement in California public schools.
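The direction of each association can be checked by refitting a model of this kind. The following is a minimal sketch in R, not necessarily the code used in this repository: it assumes the Ecdat package is installed, that the response is `testscr`, and that the five variables listed above enter the model untransformed.

```r
# Assumes the Ecdat package is installed: install.packages("Ecdat")
library(Ecdat)

# Load the California test score data
data(Caschool)

# Assumed specification: average test score on the five predictors
# named in the findings (the repository's actual model may differ)
fit <- lm(testscr ~ expnstu + avginc + mealpct + str + elpct,
          data = Caschool)

# Coefficient estimates, standard errors, and p-values
summary(fit)

# Sign of each slope: +1 means associated with higher scores, -1 with lower
sign(coef(fit)[-1])
```

Reading the signs alongside the p-values in `summary(fit)` shows which associations are both in the stated direction and statistically distinguishable from zero.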
It is essential to acknowledge the limitations and exercise caution when interpreting the results of the multiple linear regression analysis. The limitations include:
- Limited variables available in the dataset
- Restricted range of data values within the dataset, so the results may not hold outside the observed values
- Potential confounding factors not considered in the analysis
Additionally, it is crucial to note that correlations do not necessarily imply causation. Other factors, such as student motivation and teacher quality, may also influence student achievement. Therefore, these results should be interpreted with caution, and further research is necessary to comprehensively understand the complex factors impacting student achievement in California public schools.
If you find this repository useful, you are welcome to reuse the code in your own projects. If you do, please cite this repository to acknowledge the source.
Thank you for your interest in the California School Dataset Exploration repository!
Author: Putranegara Riauwindu