In this article we analyse the men's 4x200m freestyle event at the Olympics. For this we use the data from the finals of all Summer Olympics from Athens 2004 up to Tokyo 2020.
The distribution of the data looks roughly normal, although it appears somewhat skewed. Keep in mind that the data comprise only 40 entries. We will assume that the data are normally distributed, with the sample mean and sample standard deviation as parameters.
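The fitting step can be sketched as follows, using only the Python standard library. The helper names `fit_normal` and `sample_skewness` are hypothetical, and the numbers in `example_times` are placeholders for illustration, not the Olympic results.

```python
from statistics import NormalDist, fmean

def fit_normal(times):
    """Fit a normal distribution from the sample mean and sample standard deviation."""
    return NormalDist.from_samples(times)

def sample_skewness(times):
    """Moment-based skewness m3 / m2^1.5; zero for perfectly symmetric data."""
    n = len(times)
    mean = fmean(times)
    m2 = sum((t - mean) ** 2 for t in times) / n
    m3 = sum((t - mean) ** 3 for t in times) / n
    return m3 / m2 ** 1.5

# Placeholder values for illustration only; these are NOT the Olympic results.
example_times = [420.0, 424.5, 426.0, 427.5, 428.0, 430.5, 433.0, 436.5]

fitted = fit_normal(example_times)
print(f"mean = {fitted.mean:.2f} s, sd = {fitted.stdev:.2f} s")
print(f"skewness = {sample_skewness(example_times):.3f}")
```

A skewness clearly different from zero would support the visual impression of a skewed sample, although with 40 entries the estimate is noisy.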
We can also look at the cumulative distribution function of the data.
This plot also shows some deviations from the fitted distribution, but the agreement is reasonable, especially for the fast times. We can perform the Anderson-Darling test to check whether the data are consistent with the fitted distribution. The p-value of a one-sample Anderson-Darling test is 0.80, so we cannot reject the null hypothesis that the finishing times are normally distributed with mean 427.15 s and standard deviation 5.27 s.
Given the distribution of the finishing times, we can determine the distribution of the first, second and third fastest times in a sample. The final consists of 8 teams, so we need to consider a sample size of 8. The expected values of the medal times are:
Place | Expected Time (s) |
---|---|
1st | 419.65 |
2nd | 422.66 |
3rd | 424.66 |
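These values are the expectations of the first three order statistics of 8 independent draws from the fitted normal distribution. The sketch below is one way to reproduce them, not necessarily the method used here: it integrates the order-statistic density numerically with a midpoint rule. The function names and the integration range of mean ± 10 standard deviations are assumptions, and the rounded parameters 427.15 s and 5.27 s are taken from the text.

```python
from math import comb
from statistics import NormalDist

# Fitted parameters as reported in the text (rounded).
fitted = NormalDist(mu=427.15, sigma=5.27)
N_FINALISTS = 8

def order_stat_pdf(x, k, n=N_FINALISTS, dist=fitted):
    """Density of the k-th fastest of n independent draws (k=1 is the winner)."""
    F = dist.cdf(x)
    # n! / ((k-1)! (n-k)!) equals k * C(n, k)
    return k * comb(n, k) * F ** (k - 1) * (1 - F) ** (n - k) * dist.pdf(x)

def expected_order_stat(k, n=N_FINALISTS, dist=fitted, steps=20000):
    """Approximate E[X_(k)] with a midpoint rule over mean +/- 10 sd."""
    lo = dist.mean - 10 * dist.stdev
    hi = dist.mean + 10 * dist.stdev
    h = (hi - lo) / steps
    total = 0.0
    for i in range(steps):
        x = lo + (i + 0.5) * h
        total += x * order_stat_pdf(x, k, n, dist)
    return total * h

for place in (1, 2, 3):
    print(place, round(expected_order_stat(place), 2))
```

Because the inputs are the rounded parameters, the results may differ from the table in the last digit.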
The probability that a given time wins at least a gold, silver or bronze medal can also be calculated.
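One possible reading of this calculation, stated here as an assumption: a team swims a fixed time and the other 7 finalists are independent draws from the fitted distribution. The team then wins at least a given medal if fewer rivals are faster than the medal's rank. The function name `medal_probability` is hypothetical.

```python
from math import comb
from statistics import NormalDist

# Fitted parameters as reported in the text (rounded).
fitted = NormalDist(mu=427.15, sigma=5.27)

def medal_probability(time, medal_rank, n_rivals=7, dist=fitted):
    """P(at most medal_rank - 1 of the rivals are faster than `time`).

    medal_rank: 1 = gold, 2 = at least silver, 3 = at least bronze.
    Assumes the rivals' times are independent draws from `dist`.
    """
    p_faster = dist.cdf(time)  # chance that a single rival beats `time`
    return sum(
        comb(n_rivals, j) * p_faster ** j * (1 - p_faster) ** (n_rivals - j)
        for j in range(medal_rank)
    )

for rank, name in ((1, "gold"), (2, "at least silver"), (3, "at least bronze")):
    print(name, round(medal_probability(422.0, rank), 3))
```

The time 422.0 s in the loop is only an example input.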
The current records are
Record | Time (s) |
---|---|
OR | 418.56 |
WR | 418.55 |
We can also look at the probability distributions of the time of the new record. This also allows for the calculation of the expected value of the new records and the probability that we'll see a new record.
Record | Probability | Expected Value (s) |
---|---|---|
OR | 34.48% | 416.16 |
WR | 34.37% | 416.15 |
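The figures in this table can be approximated under the assumption that a new record is set when the winning time of a single final with 8 finalists beats the current record. The probability is then 1 − (1 − F(r))^8 for record r and fitted CDF F, and the expected new record is the mean of the winning time conditional on it being below r. The function names below are hypothetical, and the rounded parameters from the text are used, so small deviations from the table are expected.

```python
from statistics import NormalDist

# Fitted parameters as reported in the text (rounded).
fitted = NormalDist(mu=427.15, sigma=5.27)
N_FINALISTS = 8

def record_probability(record, n=N_FINALISTS, dist=fitted):
    """P(the fastest of n independent times is below `record`)."""
    return 1.0 - (1.0 - dist.cdf(record)) ** n

def expected_new_record(record, n=N_FINALISTS, dist=fitted, steps=20000):
    """E[winning time | winning time < record], via a midpoint rule."""
    lo = dist.mean - 10 * dist.stdev  # lower cut-off, negligible mass below
    h = (record - lo) / steps
    total = 0.0
    for i in range(steps):
        x = lo + (i + 0.5) * h
        # density of the minimum of n draws
        pdf_min = n * (1.0 - dist.cdf(x)) ** (n - 1) * dist.pdf(x)
        total += x * pdf_min
    return total * h / record_probability(record, n, dist)

for name, record in (("OR", 418.56), ("WR", 418.55)):
    print(name, f"{record_probability(record):.2%}",
          round(expected_new_record(record), 2))
```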