
Exploring the relationship between student test scores in California public schools and several factors using multiple linear regression.


PRiauwindu/California-School-Dataset-Exploration


California School Dataset Exploration

GitHub Pages: https://priauwindu.github.io/California-School-Dataset-Exploration/

Introduction

The California School Dataset Exploration repository analyzes the "caschool" dataset from the R Ecdat package, which contains information on California public schools in 1998. The dataset comprises 420 observations, each a school district, with variables covering student test scores, student-teacher ratios, expenditures per student, average district income, and various other school characteristics.

The dataset is commonly used in educational research to investigate the factors influencing student achievement and evaluate the effectiveness of policies and interventions. It also serves as a valuable teaching tool for statistics and econometrics courses, providing real-world data to illustrate statistical and modeling techniques.

Objective

The primary objective of this repository is to explore the relationship between student test scores in California public schools and several factors using the Multiple Linear Regression method.
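For reference, a multiple linear regression of this kind can be written in the general form below. This is only a sketch under assumptions: the response name `testscr` and the choice of exactly these five predictors are inferred from the variable names listed in the result summary, and the specification actually fitted in the analysis may differ.

```latex
\text{testscr}_i = \beta_0
  + \beta_1\,\text{expnstu}_i
  + \beta_2\,\text{avginc}_i
  + \beta_3\,\text{mealpct}_i
  + \beta_4\,\text{str}_i
  + \beta_5\,\text{elpct}_i
  + \varepsilon_i
```

Here $i$ indexes observations, each $\beta_j$ is the expected change in the test score for a one-unit increase in the corresponding predictor with the others held fixed, and $\varepsilon_i$ is the error term.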

Analysis Result Summary

The analysis of the "caschool" dataset using multiple linear regression revealed significant associations between student test scores in California public schools and various factors. The findings can be summarized as follows:

1. Factors positively associated with higher test scores:

  • Higher expenditures per student (expnstu)
  • Higher average income in the school district (avginc)

2. Factors negatively associated with student test scores:

  • Higher percentage of students qualifying for free or reduced-price meals (mealpct)
  • Higher student-teacher ratio (str)
  • Higher percentage of English learners (elpct)

These results suggest that investing more resources in education, addressing income disparities and poverty, reducing class sizes, and addressing language barriers for English learners could potentially improve student achievement in California public schools.
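To illustrate how the signs of regression coefficients are read, here is a minimal sketch of ordinary least squares fitted through the normal equations, using only the Python standard library. It is not the code used in this repository (the dataset comes from R's Ecdat package). The function names `solve_linear_system` and `fit_ols` are hypothetical, and the toy rows are made-up numbers generated from a known formula, not values from the dataset.

```python
# Minimal ordinary least squares (OLS) via the normal equations.
# Standard library only. All data below are made-up toy numbers,
# NOT values from the California school dataset.


def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augment the matrix with b so row operations apply to both sides.
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        # Pick the row with the largest entry in this column as the pivot.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system (collinear predictors?)")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_ols(predictors, response):
    """Return [intercept, slope_1, ..., slope_k] minimizing squared error."""
    # Prepend a column of ones so the first coefficient is the intercept.
    design = [[1.0] + list(row) for row in predictors]
    k = len(design[0])
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(design, response)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Hypothetical toy rows: (student-teacher ratio, average income).
toy_predictors = [
    (18.0, 10.0), (20.0, 15.0), (22.0, 12.0),
    (19.0, 20.0), (25.0, 8.0), (17.0, 25.0),
]
# Toy scores generated exactly from: 700 - 2 * str + 1.5 * avginc.
toy_scores = [700.0 - 2.0 * s + 1.5 * inc for s, inc in toy_predictors]

intercept, coef_str, coef_avginc = fit_ols(toy_predictors, toy_scores)

# A negative coefficient means higher values go with lower scores,
# holding the other predictor fixed; a positive one means the opposite.
print("str:", "negative" if coef_str < 0 else "positive")
print("avginc:", "negative" if coef_avginc < 0 else "positive")
```

Because the toy scores are an exact linear function of the predictors, the fit recovers the generating coefficients up to floating-point error. With real data the estimates come with uncertainty, which is why standard errors and p-values matter before calling an association significant.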

Limitations and Cautions

It is essential to acknowledge the limitations and exercise caution when interpreting the results of the multiple linear regression analysis. The limitations include:

  • Limited variables available in the dataset
  • Restricted range of data values within the dataset, so the results may not hold outside the observed range
  • Potential confounding factors not considered in the analysis

Additionally, it is crucial to note that correlations do not necessarily imply causation. Other factors, such as student motivation and teacher quality, may also influence student achievement. Therefore, these results should be interpreted with caution, and further research is necessary to comprehensively understand the complex factors impacting student achievement in California public schools.

Usage and Citation

If you find this repository useful, you are welcome to reuse the code in your own projects or work. Please cite this repository in your work to acknowledge the source.

Thank you for your interest in the California School Dataset Exploration repository!

Author: Putranegara Riauwindu
