diff --git a/_assets/data/si_may.csv b/_assets/data/si_may.csv new file mode 100644 index 0000000..6a68681 --- /dev/null +++ b/_assets/data/si_may.csv @@ -0,0 +1,7 @@ +State,Biden,Trump,N +AZ,43,49,626 +GA,41,50,604 +MI,47,46,616 +NV,38,51,614 +PA,45,48,1023 +WI,46,47,614 \ No newline at end of file diff --git a/my_util b/my_util new file mode 100644 index 0000000..8e290c4 --- /dev/null +++ b/my_util @@ -0,0 +1,117 @@ +function find_True_positive(Pr_Positive_True::Float64,Pr_Positive_False::Float64, Pr_True::Float64) + Pr_Positive = Pr_Positive_True * Pr_True + Pr_Positive_False * ( 1 - Pr_True ) + posterior = Pr_Positive_True * Pr_True / Pr_Positive + return(posterior) +end +find_True_positive(0.95,0.01,0.001) +# 0.08683729433272395 + +using Distributions +using StatsBase + +using Distributions +using StatsBase +using StatsPlots + +function find_posterior(successes::Int, trials::Int, grid_points::Int, num_samples::Int) + # Create a grid of probabilities + p_grid = range(0, stop=1, length=grid_points) + + # Uniform prior + prob_p = fill(1.0, grid_points) + + # Likelihood of observing 'successes' successes in 'trials' trials + prob_data = [pdf(Binomial(trials, p), successes) for p in p_grid] + + # Posterior distribution + posterior = prob_data .* prob_p + + # Normalize the posterior + posterior ./= sum(posterior) + + # Sample from the posterior distribution + samples = sample(p_grid, Weights(posterior), num_samples, replace=true) + + # Create a density plot of the samples + density(samples, xlabel="Probability", ylabel="Density", title="Posterior Density") + + # Calculate the sum of posterior probabilities where p < 0.5 + prob_less_than_half = sum(posterior[p_grid .< 0.5]) + + # Calculate the proportion of samples between 0.5 and 0.75 + proportion = sum((samples .> 0.5) .& (samples .< 0.75)) / length(samples) + + return p_grid, posterior, samples, prob_less_than_half, proportion +end + +p_grid, posterior, samples, prob_less_than_half, proportion = find_posterior(6, 9, 1000, 10000) +println("Probability that p < 0.5: ", prob_less_than_half) +println("Proportion of samples between 0.5 and 0.75: ", proportion) + + +quantile(samples,[0.1,0.9]) + +""" +Yes, there are several additional metrics you can consider when assessing the election outcome based on your Bayesian model: + +1. Posterior mean or median: The mean or median of the posterior distribution can provide a point estimate of the proportion of voters supporting each candidate. This can give you a single "best guess" of the election outcome based on your model. + +2. Probability of winning: You can calculate the probability that a candidate will win the election by computing the proportion of posterior samples where their vote share exceeds 50%. This gives you a direct measure of the likelihood of each candidate winning based on your model. + +3. Posterior probability of a tie: You can also compute the probability of a tie by calculating the proportion of posterior samples where the vote shares are exactly equal (or within a small margin to account for rounding). + +4. Marginal likelihood or Bayes factor: If you want to compare different models (e.g., with different priors or likelihood functions), you can compute the marginal likelihood or Bayes factor. These metrics provide a way to assess the relative fit of different models to the data while accounting for model complexity. + +5. Sensitivity analysis: You can explore how sensitive your results are to the choice of prior or other model assumptions. This can help you assess the robustness of your conclusions and identify which assumptions are most critical. + +6. Predictive accuracy: If you have historical data from previous elections, you can assess the predictive accuracy of your model by comparing its predictions to the actual outcomes. This can help you calibrate your model and understand how well it performs in practice. + +These additional metrics can provide a more comprehensive understanding of the election outcome and the uncertainty associated with your predictions. They can also help you communicate your results to a wider audience and make more informed decisions based on your analysis. + +Certainly! In Julia, you can use the DynamicLinearModels.jl package for Kalman filtering and state-space modeling. This package provides a flexible and efficient framework for working with dynamic linear models, including the Kalman filter. + +To install the DynamicLinearModels.jl package, you can use the following command in the Julia REPL: + +```julia +using Pkg +Pkg.add("DynamicLinearModels") +``` + +Once the package is installed, you can use it to implement Kalman filtering for your poll aggregation. Here's a basic example of how you can use DynamicLinearModels.jl to create a Kalman filter for tracking the true state of the race over time: + +```julia +using DynamicLinearModels + +# Define the state-space model +model = LocalLevel(1.0) + +# Initialize the Kalman filter +kf = kalman_filter(model, 1.0) + +# Iterate over the polls +for poll in polls + # Extract the poll date, sample size, and candidate support + date = poll.date + sample_size = poll.sample_size + candidate_support = poll.candidate_support + + # Update the Kalman filter with the new poll + update_kalman_filter!(kf, date, candidate_support, sample_size) +end + +# Get the filtered estimates of the true state +filtered_state = get_filtered_state(kf) +``` + +In this example, we define a `LocalLevel` model, which assumes that the true state of the race follows a random walk over time. We then initialize a Kalman filter with an initial estimate of the state. + +For each poll, we extract the relevant information (date, sample size, and candidate support) and update the Kalman filter using the `update_kalman_filter!` function. The function takes the Kalman filter object, the poll date, the candidate support (as a proportion), and the sample size. + +After processing all the polls, we can retrieve the filtered estimates of the true state using the `get_filtered_state` function. + +DynamicLinearModels.jl provides a wide range of options for customizing the state-space model, handling missing data, and estimating model parameters. You can refer to the package documentation for more details and examples: https://github.com/LAMPSPUC/DynamicLinearModels.jl + +Keep in mind that Kalman filtering assumes that the polls are noisy observations of the true state, and it tries to estimate the true state by balancing the information from the polls with the assumed dynamics of the state. The performance of the Kalman filter will depend on the quality of the polls, the appropriateness of the state-space model, and the choice of model parameters. + +As with any model, it's important to validate the Kalman filter's assumptions and performance using historical data or out-of-sample predictions before relying on it for real-time poll aggregation. +""" \ No newline at end of file diff --git a/recount.md b/recount.md index 0fb39e6..ca2be01 100644 --- a/recount.md +++ b/recount.md @@ -12,7 +12,7 @@ On request when margin between first and second place candidates is less than or ## North Carolina -On request when margin between first and second place candidates is less than or equal to 0.5% of the total vote for all candidates for the office or 10,000 votes, whichever is less. **27,625** votes, based on 2020 election. [Citation](https://ncleg.gov/EnactedLegislation/Statutes/PDF/ByArticle/Chapter_163/Article_15A.pdf) +On request when margin between first and second place candidates is less than or equal to 0.5% of the total vote for all candidates for the office or 10,000 votes, whichever is less. [Citation](https://ncleg.gov/EnactedLegislation/Statutes/PDF/ByArticle/Chapter_163/Article_15A.pdf) ## Michigan