diff --git a/Seminar_materials/Seminar01/Seminar 1 (Introduction).ipynb b/Seminar_materials/Seminar01/Seminar 1 (Introduction).ipynb deleted file mode 100644 index 40597fa..0000000 --- a/Seminar_materials/Seminar01/Seminar 1 (Introduction).ipynb +++ /dev/null @@ -1,632 +0,0 @@ -{ - "cells": [ - { - "cell_type": "markdown", - "id": "ed2cd65f", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "# Seminar 1" - ] - }, - { - "cell_type": "markdown", - "id": "632282a7", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "## Let's get to know each other" - ] - }, - { - "cell_type": "markdown", - "id": "5756efe3", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "## Counting" - ] - }, - { - "cell_type": "markdown", - "id": "19acb677", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "### Problem 1\n", - "How many Russian-style car plates are possible in one region?" - ] - }, - { - "cell_type": "markdown", - "id": "6320db58", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "### Solution\n", - "\n", - "A Russian car plate consists of three letters and three digits. Any digits are permitted, but the only permitted letters are the ones that have English lookalikes. How many letters are there?" - ] - }, - { - "cell_type": "markdown", - "id": "c262061b", - "metadata": { - "slideshow": { - "slide_type": "fragment" - } - }, - "source": [ - "A, B, C, E, H, K, M, O, P, T, X, Y - total 12 letters." - ] - }, - { - "cell_type": "markdown", - "id": "85e5d53b", - "metadata": { - "slideshow": { - "slide_type": "fragment" - } - }, - "source": [ - "We choose the digits and the letters. 
Using sampling with replacement (why?):\n", - "- We choose three of ten digits: $10^3$\n", - "- We choose three of twelve letters: $12^3$\n", - "\n", - "Since the choices of the digits and the letters are independent, the total number of plates is therefore $10^3 \\cdot 12^3 = 1728000$." - ] - }, - { - "cell_type": "markdown", - "id": "8913f59a", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "### Problem 2\n", - "How many 7-digit phone numbers are possible, assuming that the first digit can’t be a 0 or a 1?" - ] - }, - { - "cell_type": "markdown", - "id": "0d8962e6", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "### Solution\n", - "\n", - "We independently choose each digit. Using sampling with replacement (why?):\n", - "- We choose the first digit from the reduced set of 8 digits: $8$\n", - "- We choose the remaining 6 digits: $10^6$\n", - "\n", - "The total number of phone numbers is therefore $8 \\cdot 10^6$." - ] - }, - { - "cell_type": "markdown", - "id": "182aa1e5", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "### Problem 3\n", - "How many paths are there from the point (0,0) to the point (110,111) in the plane such that each step either consists of going one unit up or one unit to the right?" - ] - }, - { - "cell_type": "markdown", - "id": "cf05e84b", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "### Solution\n", - "\n", - "We will encode a path as a sequence of letters $U$ (for up step) and $R$ (for right step), like $URURURU\\ldots UURUR$.\n", - "\n", - "The sequence must consist of 110 $R$s and 111 $U$s (why?)" - ] - }, - { - "cell_type": "markdown", - "id": "c5bbb80d", - "metadata": { - "slideshow": { - "slide_type": "fragment" - } - }, - "source": [ - "We will use the factorial rule: the number of shuffles of this sequence is $(110+111)! = 221!$. Is it correct?" 
- ] - }, - { - "cell_type": "markdown", - "id": "09d469d4", - "metadata": { - "slideshow": { - "slide_type": "fragment" - } - }, - "source": [ - "It is not correct, because we do not care about individual permutations of $R$s and $U$s, but we counted these permutations as different. We need to adjust for overcounting.\n", - "\n", - "We need to get rid of permutations that we counted multiple times. In order to do that, we divide by the number of such permutations, and this gives the correct answer:\n", - "\n", - "$$\\frac{221!}{110!111!}$$" - ] - }, - { - "cell_type": "markdown", - "id": "e15dda8a", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "### Binomial coefficient\n", - "A binomial coefficient counts the number of subsets of a certain size for a set, such as the number of ways to choose a committee of size $k$ from a set of $n$ people. Sets and subsets are by definition unordered, e.g., $\\{3, 1, 4\\} = \\{4, 1, 3\\}$, so we are counting the number of ways to choose $k$ objects out of $n$, without replacement and without distinguishing between the different orders in which they could be chosen.\n", - "\n", - "For any nonnegative integers $k$ and $n$, the binomial coefficient $\\begin{pmatrix}n\\\\k\\end{pmatrix}$, read as \"$n$ choose $k$\", is the number of subsets of size $k$ for a set of size $n$. For $ k \\leqslant n$,\n", - "\n", - "$$\n", - "\\begin{pmatrix}n\\\\k\\end{pmatrix}=\\frac{n!}{k!(n-k)!}\n", - "$$" - ] - }, - { - "cell_type": "markdown", - "id": "43ad8ad1", - "metadata": { - "slideshow": { - "slide_type": "fragment" - } - }, - "source": [ - "Note that to fully describe the sequence we actually only need to specify where the $R$s are located. This falls under the binomial coefficient definition. So there are $\\begin{pmatrix}110+111\\\\110\\end{pmatrix}$ possible paths." 
- ] - }, - { - "cell_type": "markdown", - "id": "8e3b937b", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "### Problem 4\n", - "How many ways are there to split a dozen people into 3 teams, where each team has 4 people?" - ] - }, - { - "cell_type": "markdown", - "id": "1d70cd45", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "### Solution\n", - "\n", - "Let's pick the first team, then pick the second, and declare the remaining people the third team.\n", - "\n", - "This gives us $\\begin{pmatrix}12\\\\4\\end{pmatrix}\\cdot\\begin{pmatrix}8\\\\4\\end{pmatrix}$ possibilities. Is it correct?" - ] - }, - { - "cell_type": "markdown", - "id": "7c0b75a6", - "metadata": { - "slideshow": { - "slide_type": "fragment" - } - }, - "source": [ - "It is not correct, because we overcounted due to the fact that we do not actually care which team is the first, second or third. So we need to divide the expression by $3!$. The final answer is:\n", - "\n", - "$$\n", - "\\frac{1}{3!} \\cdot \\begin{pmatrix}12\\\\4\\end{pmatrix}\\cdot\\begin{pmatrix}8\\\\4\\end{pmatrix} = \\frac{1}{3!} \\cdot \\frac{12!}{4!8!} \\cdot \\frac{8!}{4!4!} = \\frac{12!}{4! 4! 4! 3!}\n", - "$$" - ] - }, - { - "cell_type": "markdown", - "id": "8cd47068", - "metadata": { - "slideshow": { - "slide_type": "fragment" - } - }, - "source": [ - "If we cared which team is which, we would obtain $\\frac{12!}{4! 4! 4!}$, which is called a multinomial coefficient. The only difference from the binomial coefficient is that we choose more than one subset from the same set." - ] - }, - { - "cell_type": "markdown", - "id": "3f3c4caa", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "### Problem 5\n", - "A certain casino uses 10 standard decks of cards mixed together into one big deck, which we will call a superdeck. Thus, the superdeck has 52 · 10 = 520 cards, with 10 copies of each card. 
How many different 10-card hands can be dealt from the superdeck? The order of the cards does not matter, nor does it matter which of the original 10 decks the cards came from. Express your answer as a binomial coefficient." - ] - }, - { - "cell_type": "markdown", - "id": "1172a6f5", - "metadata": { - "slideshow": { - "slide_type": "fragment" - } - }, - "source": [ - "### Solution\n", - "\n", - "Since we have 10 copies of each card and a hand has only 10 cards, there are in fact no limitations on the hand, and sampling from the superdeck without replacement is equivalent to sampling from a single deck with replacement. So we just use the formula for sampling with replacement where the order does not matter:\n", - "$$\n", - "\\begin{pmatrix}52+10-1\\\\10\\end{pmatrix}\n", - "$$" - ] - }, - { - "cell_type": "markdown", - "id": "bfae1a78", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "## Naive definition" - ] - }, - { - "cell_type": "markdown", - "id": "fc86e8db", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "### Problem 7\n", - "A city with 6 districts has 6 robberies in a particular week. Assume the robberies are located randomly, with all possibilities for which robbery occurred where equally likely. What is the probability that some district had more than 1 robbery?" - ] - }, - { - "cell_type": "markdown", - "id": "cf105e4e", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "### Solution\n", - "\n", - "We will compute the probability of the complement.\n", - "\n", - "- All cases: There are $6^6$ possible configurations for which robbery occurred where.\n", - "- Favorable cases: There are $6!$ configurations where each district had exactly 1 of the 6.\n", - "\n", - "So the probability of the complement of the desired event is $6!/6^6$.\n", - "\n", - "Finally, the probability of some district having more than 1 robbery is $1 - 6!/6^6$." 
- ] - }, - { - "cell_type": "code", - "execution_count": 1, - "id": "5dc1948a", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "outputs": [ - { - "data": { - "text/plain": [ - "0.9845679012345679" - ] - }, - "execution_count": 1, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "from scipy.special import factorial\n", - "1 - factorial(6) / (6 ** 6)" - ] - }, - { - "cell_type": "markdown", - "id": "beacaf6b", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "### Problem 8\n", - "Each of $n$ balls is independently placed into one of $n$ boxes, with all boxes equally likely.\n", - "What is the probability that exactly one box is empty?" - ] - }, - { - "cell_type": "markdown", - "id": "171b4cea", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "### Solution\n", - "\n", - "Reformulate: one box empty means one box has two balls.\n", - "\n", - "- All cases: $n^n$ (why?)\n", - "- Favorable cases:\n", - " - Choose empty box: $\\begin{pmatrix}n\\\\1\\end{pmatrix}$\n", - " - Choose box with two balls: $\\begin{pmatrix}n-1\\\\1\\end{pmatrix}$\n", - " - Choose two balls: $\\begin{pmatrix}n\\\\2\\end{pmatrix}$\n", - " - Permutations of the rest balls: $(n-2)!$\n", - " \n", - "Overall:\n", - "$$\n", - "\\frac{\\begin{pmatrix}n\\\\1\\end{pmatrix}\\begin{pmatrix}n-1\\\\1\\end{pmatrix}\\begin{pmatrix}n\\\\2\\end{pmatrix}(n-2)!}{n^n} = \\frac{\\begin{pmatrix}n\\\\2\\end{pmatrix} n!}{n^n}\n", - "$$" - ] - }, - { - "cell_type": "markdown", - "id": "ff01ba29", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "## Non-naive definition" - ] - }, - { - "cell_type": "markdown", - "id": "245a4685", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "### Definition\n", - "\n", - "A probability space consists of a sample space $S$ and a probability function $P$ which takes an event $A \\subseteq S$ as input and 
returns $P(A)$, a real number between $0$ and $1$, as output. The function $P$ must satisfy the following axioms:\n", - "- $P(\\varnothing) = 0, P(S) = 1$\n", - "- If $A_1, A_2, \\ldots$ are disjoint ($A_i \\cap A_j = \\varnothing, i \\neq j$) events, then\n", - " $$\n", - " P\\left(\\bigcup\\limits_{j=1}^\\infty A_j\\right) = \\sum\\limits_{j=1}^\\infty P(A_j)\n", - " $$" - ] - }, - { - "cell_type": "markdown", - "id": "0eb7c712", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "### Properties\n", - "\n", - "1. $P(A^c) = 1 − P(A)$\n", - "2. If $A \\subseteq B$, then $P(A) \\leqslant P(B)$\n", - "3. $P (A \\cup B) = P (A) + P (B) − P (A \\cap B)$" - ] - }, - { - "cell_type": "markdown", - "id": "36e2e51e", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "### Inclusion-exclusion formula\n", - "\n", - "$$P (A \\cup B) = P (A) + P (B) − P (A \\cap B)$$" - ] - }, - { - "cell_type": "markdown", - "id": "7c5cd08c", - "metadata": { - "slideshow": { - "slide_type": "fragment" - } - }, - "source": [ - "$$P\\left(\\bigcup\\limits_{i=1}^n A_i\\right) = \\sum_i P(A_i) − \\sum_{i < j} P(A_i \\cap A_j) + \\sum_{i < j < k}P(A_i \\cap A_j \\cap A_k)−\\ldots+(−1)^{n+1} P(A_1 \\cap\\ldots \\cap A_n)$$" - ] - }, - { - "cell_type": "code", - "execution_count": 4, - "id": "237d7658", - "metadata": { - "slideshow": { - "slide_type": "skip" - } - }, - "outputs": [], - "source": [ - "from matplotlib_venn import venn2, venn3" - ] - }, - { - "cell_type": "code", - "execution_count": 3, - "id": "9a01b5cf", - "metadata": { - "slideshow": { - "slide_type": "skip" - } - }, - "outputs": [ - { - "data": { - "text/plain": [ - "" - ] - }, - "execution_count": 3, - "metadata": {}, - "output_type": "execute_result" - }, - { - "data": { - "image/png": "\n", - "text/plain": [ - "
" - ] - }, - "metadata": {}, - "output_type": "display_data" - } - ], - "source": [ - "venn2(({'A', 'B', 'C'}, {'A', 'D', 'E'}))" - ] - }, - { - "cell_type": "code", - "execution_count": 6, - "id": "4c3960ed", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "outputs": [ - { - "data": { - "text/plain": [ - "" - ] - }, - "execution_count": 6, - "metadata": {}, - "output_type": "execute_result" - }, - { - "data": { - "image/png": "\n", - "text/plain": [ - "
" - ] - }, - "metadata": {}, - "output_type": "display_data" - } - ], - "source": [ - "venn3(({'A', 'B', 'C'}, {'A', 'D', 'E'}, {'A', \"F\", \"G\"}))" - ] - }, - { - "cell_type": "markdown", - "id": "1dada35e", - "metadata": { - "slideshow": { - "slide_type": "fragment" - } - }, - "source": [ - "$$(3+3+3)-(1+1+1)+1=9-3+1=7$$" - ] - }, - { - "cell_type": "markdown", - "id": "5da2e9e5", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "### Problem 9\n", - "A fair dice is rolled $n$ times. What is the probability that at least 1 of the 6 values never appears?" - ] - }, - { - "cell_type": "markdown", - "id": "8ad48654", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "### Solution\n", - "\n", - "$A_i$ - the event that $i$-th value does not appear. Then, $\\bigcup\\limits_{i=1}^6 A_i$ is the event that at least one values does not appear.\n", - "\n" - ] - } - ], - "metadata": { - "celltoolbar": "Slideshow", - "kernelspec": { - "display_name": "Python 3 (ipykernel)", - "language": "python", - "name": "python3" - }, - "language_info": { - "codemirror_mode": { - "name": "ipython", - "version": 3 - }, - "file_extension": ".py", - "mimetype": "text/x-python", - "name": "python", - "nbconvert_exporter": "python", - "pygments_lexer": "ipython3", - "version": "3.8.12" - } - }, - "nbformat": 4, - "nbformat_minor": 5 -} diff --git a/Seminar_materials/Seminar01/Seminar 1 (Introduction).pdf b/Seminar_materials/Seminar01/Seminar 1 (Introduction).pdf deleted file mode 100644 index 57918d5..0000000 Binary files a/Seminar_materials/Seminar01/Seminar 1 (Introduction).pdf and /dev/null differ diff --git a/Seminar_materials/Seminar02/Seminar 2 (Definition of probability).ipynb b/Seminar_materials/Seminar02/Seminar 2 (Definition of probability).ipynb deleted file mode 100644 index be01908..0000000 --- a/Seminar_materials/Seminar02/Seminar 2 (Definition of probability).ipynb +++ /dev/null @@ -1,442 +0,0 @@ -{ - "cells": [ - 
{ - "cell_type": "markdown", - "id": "1cac9fc4", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "# Seminar 2" - ] - }, - { - "cell_type": "markdown", - "id": "3b8f1f70", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "## Recap of counting and naive probability\n", - "\n", - "Sampling $k$ objects from $n$ choices:\n", - "\n", - "|With replacement|Order matters|Formula|Example|\n", - "|:-:|:-:|:-:|:-:|\n", - "|Yes|Yes|\\begin{eqnarray}n^k\\end{eqnarray}|Car plates|\n", - "|Yes|No|\\begin{eqnarray}\\begin{pmatrix}n+k-1\\\\k\\end{pmatrix}\\end{eqnarray}|\"Stars and bars\"|\n", - "|No|Yes|\\begin{eqnarray}\\lfloor n \\rfloor_k\\end{eqnarray}|Birthday paradox complement numerator|\n", - "|No|No|\\begin{eqnarray}\\begin{pmatrix}n\\\\k\\end{pmatrix}\\end{eqnarray}|Bose-Einstein|\n", - "\n", - "Arranging $k$ objects into $n$ boxes:\n", - "\n", - "|With replacement|Objects distinguishable|Formula|\n", - "|:-:|:-:|:-:|\n", - "|Yes|Yes|\\begin{eqnarray}n^k\\end{eqnarray}|\n", - "|Yes|No|\\begin{eqnarray}\\begin{pmatrix}n+k-1\\\\k\\end{pmatrix}\\end{eqnarray}|\n", - "|No|Yes|\\begin{eqnarray}\\lfloor n \\rfloor_k\\end{eqnarray}|\n", - "|No|No|\\begin{eqnarray}\\begin{pmatrix}n\\\\k\\end{pmatrix}\\end{eqnarray}|" - ] - }, - { - "cell_type": "markdown", - "id": "d8d195a0", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "## Problem 1\n", - "\n", - "There are 15 chocolate bars and 10 children. In how many ways can the chocolate bars be distributed to the children, in each of the following scenarios?\n", - "- The chocolate bars are fungible (interchangeable).\n", - "- The chocolate bars are fungible, and each child must receive at least one.\n", - "- The chocolate bars are not fungible (it matters which particular bar goes where).\n", - "- The chocolate bars are not fungible, and each child must receive at least one. Hint: The strategy suggested in (b) does not apply. 
Instead, consider randomly giving the chocolate bars to the children, and apply inclusion-exclusion." - ] - }, - { - "cell_type": "markdown", - "id": "d9a7231d", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "## Solution 1.1\n", - "\n", - "The chocolate bars are fungible (interchangeable). Since the children are interchangeable as well, we will be using \"stars and bars\":\n", - "\n", - "$$|\\underbrace{oo}_{\\text{child }1}|\\underbrace{o}_{\\text{child }2}|\\underbrace{o}_{\\text{child }3}|\\ldots|\\underbrace{o}_{\\text{child }10}|$$\n", - "\n", - "- We have $10-1=9$ bars (separators between children), because left- and right-most bars are fixed\n", - "- We have $15$ stars (chocolates)\n", - "- Total $9+15=24$ possible object positions\n", - "\n", - "Therefore, we have $\\begin{pmatrix}24\\\\9\\end{pmatrix}$ combinations." - ] - }, - { - "cell_type": "markdown", - "id": "68d5be3e", - "metadata": { - "slideshow": { - "slide_type": "subslide" - } - }, - "source": [ - "There is a different way to arrive at this answer: for each of 15 chocolate bars we are making a decision from 10 children with replacement. The formula from the lecture gives $\\begin{pmatrix}10+15-1\\\\15\\end{pmatrix}$, which is the same number." - ] - }, - { - "cell_type": "markdown", - "id": "bf200579", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "## Solution 1.2\n", - "\n", - "\n", - "The chocolate bars are fungible (interchangeable). Since the children are interchangeable as well, we will be using \"stars and bars\". 
Let's first lay out all the chocolate bars in a line:\n", - "\n", - "$$oooo\\ldots o$$\n", - "\n", - "Next, we need to put the boundaries into their possible positions, but now without replacement (so that boundaries do not coincide, which would leave a child without a chocolate bar)\n", - "\n", - "$$|\\underbrace{oo}_{\\text{child }1}|\\underbrace{o}_{\\text{child }2}|\\underbrace{o}_{\\text{child }3}|\\ldots|\\underbrace{o}_{\\text{child }10}|$$\n", - "\n", - "- We have $10-1=9$ bars (separators between children), because left- and right-most bars are fixed\n", - "- We have $15-1=14$ gaps between adjacent stars (chocolates), i.e. possible separator positions\n", - "\n", - "Sampling without replacement, we obtain $\\begin{pmatrix}14\\\\9\\end{pmatrix}$." - ] - }, - { - "cell_type": "markdown", - "id": "e6f120f7", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "## Solution 1.3\n", - "\n", - "The chocolate bars are not fungible (it matters which particular bar goes where), but the children are still interchangeable. Can't use \"stars and bars\", though.\n", - "\n", - "For each of 15 chocolate bars we will be selecting one of 10 children who gets it, with replacement of children. The formula from the lecture gives us: $10^{15}$." - ] - }, - { - "cell_type": "markdown", - "id": "fe4464dd", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "## Solution 1.4\n", - "\n", - "The chocolate bars are not fungible, and each child must receive at least one. The children are interchangeable. Can't use \"stars and bars\". Instead, let's apply inclusion-exclusion. From the previous subproblem, the number of all possible combinations is $10^{15} = \\begin{pmatrix}10\\\\0\\end{pmatrix} 10^{15}$.\n", - "\n", - "Next, let's count the cases in which a given child has no chocolate bar (other children may be left without one as well). Denote by $A_{i}$ the event that child $i$ does not get a chocolate bar. 
For a fixed child $i$ there are $N(A_i) = 9^{15}$ such combinations, and summing over the 10 children gives $\\sum_i N(A_i) = \\begin{pmatrix}10\\\\1\\end{pmatrix}9^{15}$.\n", - "\n", - "Next, let's count the cases in which two given children have no chocolate bar: $N(A_i \\cap A_j) = 8^{15}$, so $\\sum_{i < j} N(A_i \\cap A_j) = \\begin{pmatrix}10\\\\2\\end{pmatrix}8^{15}$.\n", - "\n", - "See the pattern? Now we need to apply the inclusion-exclusion formula. The final number of combinations is:\n", - "$$\n", - "\\sum_{k=0}^{10} (-1)^k \\begin{pmatrix}10\\\\k\\end{pmatrix} (10-k)^{15}\n", - "$$" - ] - }, - { - "cell_type": "markdown", - "id": "5a3ae9da", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "## Problem 2\n", - "\n", - "What is the number of all subsets of a set with $N$ elements?" - ] - }, - { - "cell_type": "markdown", - "id": "7606e3a5", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "## Solution 2\n", - "\n", - "Denote our set $A = \\{a_1, a_2, \\ldots, a_N\\}$. Now let's create a subset $B \\subset A$. For every element $a_i$, let's choose whether we include it in the subset ($1$) or not ($0$). How many combinations of zeros and ones are there then?" - ] - }, - { - "cell_type": "markdown", - "id": "6db37b9a", - "metadata": { - "slideshow": { - "slide_type": "fragment" - } - }, - "source": [ - "Using ordered sampling with replacement, we obtain $2^N$ combinations." - ] - }, - { - "cell_type": "markdown", - "id": "63428a7f", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "## Problem 3\n", - "\n", - "There are 100 passengers lined up to board an airplane with 100 seats (with each seat assigned to one of the passengers). The first passenger in line crazily decides to sit in a randomly chosen seat (with all seats equally likely). Each subsequent passenger takes their assigned seat if available, and otherwise sits in a random available seat. What is the probability that the last passenger in line gets to sit in their assigned seat?" 
- ] - }, - { - "cell_type": "markdown", - "id": "1554fccc", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "## Solution 3\n", - "\n", - "Denote the $i$-th passenger's true seat as $i$, regardless of its position in the plane.\n", - "\n", - "Next, notice that if any passenger $j$ sits in seat $1$, it means that his own seat $j$ is taken. Such a case therefore removes the source of the permutation. After that, all the passengers that enter the plane will be able to sit in their true seats. It is important that this always happens and can happen with any passenger $j$.\n", - "\n", - "Generally, the last $100$-th passenger may observe two cases:\n", - "- The permutation was removed, then he has the option to sit in his true $100$-th seat\n", - "- The permutation was not removed, then he is the one to remove the permutation and take seat $1$.\n", - "\n", - "We can now reduce the problem to just two seats: $1$-st and $100$-th. One of the passengers sitting in these seats is the last $100$-th passenger, the other is any other passenger $j$." - ] - }, - { - "cell_type": "markdown", - "id": "0113aef1", - "metadata": { - "slideshow": { - "slide_type": "subslide" - } - }, - "source": [ - "Since $j<100$, it means that both $1$-st and $100$-th seats were empty when he or she boarded the plane! And the probabilities of sitting in either of them are equal.\n", - "\n", - "If $j$ sat in $1$ then the last passenger ended up sitting in $100$ and the resulting configuration of passengers sitting in the 100 seats is the same as if $j$ had sat in $100$ except for the fact that the passengers in $1$ and $100$ are swapped. 
Therefore these two configurations occur with the same probability and exactly one of them has the last passenger in her seat $100$.\n", - "\n", - "This implies that all the final configurations of passengers can be paired such that the two configurations in any pair occur with the same probability and exactly one has the last passenger in her seat.\n", - "\n", - "This implies that the probability that the last passenger is in her seat is $0.5$." - ] - }, - { - "cell_type": "markdown", - "id": "b5342290", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "## Axiomatic definition\n", - "\n", - "A probability space is the following tuple: $(\\Omega, \\cal{F}, \\mathbb{P})$." - ] - }, - { - "cell_type": "markdown", - "id": "714fcb86", - "metadata": { - "slideshow": { - "slide_type": "subslide" - } - }, - "source": [ - "- **Sample space** $\\Omega = \\{\\omega\\}$ is an arbitrary set. It is a space of **elementary outcomes** (basic mutually exclusive events)." - ] - }, - { - "cell_type": "markdown", - "id": "994580c0", - "metadata": { - "slideshow": { - "slide_type": "subslide" - } - }, - "source": [ - "- Basic and non-basic events are associated with **sets** of outcomes, which belong to **set of events** - a family $\\cal{F} \\subset 2^\\Omega$, such that\n", - " 1. $\\Omega \\in \\cal{F}$\n", - " 2. If $A \\in \\cal{F}$, then $\\overline{A} \\in \\cal{F}$ (closed under complement operation)\n", - " 3. If $A, B \\in \\cal{F}$, then $A \\cup B \\in \\cal{F}$ (closed under union operation)\n", - " 4. If $A_1, A_2, \\ldots \\in \\cal{F}$, then $\\bigcup_{k=1}^\\infty A_k \\in \\cal{F}$ (closed under countable union operation)\n", - " \n", - "A set $\\cal{F}$ that satisfies conditions (1, 2, 3) is called an **algebra of sets**. A set $\\cal{F}$ that satisfies conditions (1, 2, 4) is called a **$\\sigma$-algebra of sets**. 
If $\\Omega$ is finite, any algebra is a $\\sigma$-algebra.\n", - "\n", - "Properties of a $\\sigma$-algebra:\n", - "- $\\varnothing \\in \\cal{F}$\n", - "- If $A_1, A_2, \\ldots \\in \\cal{F}$, then $\\bigcap_{k=1}^\\infty A_k \\in \\cal{F}$\n", - "\n", - "The pair $(\\Omega, \\cal{F})$ is called a **measurable space**." - ] - }, - { - "cell_type": "markdown", - "id": "209206a1", - "metadata": { - "slideshow": { - "slide_type": "subslide" - } - }, - "source": [ - "- The function $\\mathbb{P}: \\cal{F} \\to \\mathbb{R}_+$ is called a **probability measure**, if\n", - " 1. $\\mathbb{P}(\\Omega) = 1$\n", - " 2. If $A_1, A_2, \\ldots \\in \\cal{F}$ and $A_i \\cap A_j = \\varnothing$ for $i\\neq j$, then $\\mathbb{P}\\left(\\bigcup_{k=1}^\\infty A_k \\right) = \\sum_{k=1}^\\infty \\mathbb{P}(A_k)$ ($\\sigma$-additivity)\n", - " \n", - "Properties of a probability measure:\n", - "- $\\mathbb{P}(\\overline{A}) = 1 - \\mathbb{P}(A)$\n", - "- If $B \\subset A$, then $\\mathbb{P}(B) \\leqslant \\mathbb{P}(A)$\n", - "- If $A_1 \\subset A_2 \\subset \\ldots$, then $\\mathbb{P}\\left(\\bigcup_{k=1}^\\infty A_k\\right) = \\lim_{k\\to\\infty} \\mathbb{P}(A_k)$\n", - "- If $A_1 \\supset A_2 \\supset \\ldots$, then $\\mathbb{P}\\left(\\bigcap_{k=1}^\\infty A_k\\right) = \\lim_{k\\to\\infty} \\mathbb{P}(A_k)$" - ] - }, - { - "cell_type": "markdown", - "id": "764dbfee", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "## Problem 4\n", - "\n", - "Consider the set $S$ of all (how many?) subsets of the set $M = \\{1, \\ldots, N\\}$. We choose two sets $A, B \\in S$ randomly and independently. Find the probability that $A \\cap B = \\varnothing$." 
- ] - }, - { - "cell_type": "markdown", - "id": "eb8e0e1b", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "## Solution 4\n", - "\n", - "Take any element $e \\in M$ from the original set.\n", - "\n", - "- $\\mathbb{P}(e \\in A) = p_1 = \\tfrac12$, by construction of $A$\n", - "- $\\mathbb{P}(e \\in B) = p_2 = \\tfrac12$, by construction of $B$\n", - "- $\\mathbb{P}(e \\in A \\cap B) = p_{12} = p_1 \\cdot p_2 = \\tfrac14$\n", - "- $\\mathbb{P}(e \\notin A \\cap B) = 1 - \\mathbb{P}(e \\in A \\cap B) = 1 - p_{12} = \\tfrac34$\n", - "\n", - "Membership is decided independently for each $e \\in M$, so we multiply over all $N$ elements to obtain:\n", - "$$\n", - "\\mathbb{P}(A \\cap B = \\varnothing) = \\left( \\frac34 \\right)^N\n", - "$$" - ] - }, - { - "cell_type": "markdown", - "id": "97ac6cc4", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "## Problem 5\n", - "\n", - "Let $B_1, B_2, \\ldots \\in \\cal{F}$ be some events. Prove that\n", - "$$\n", - "\\mathbb{P}\\left(\\bigcup_{k=1}^\\infty B_k \\right) \\leqslant \\sum_{k=1}^\\infty \\mathbb{P}(B_k)\n", - "$$" - ] - }, - { - "cell_type": "markdown", - "id": "86ea074b", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "## Solution 5\n", - "\n", - "- $\\sigma$-additivity axiom of probability: If $A_1, A_2, \\ldots \\in \\cal{F}$ and $A_i \\cap A_j = \\varnothing$ for $i\\neq j$, then $\\mathbb{P}\\left(\\bigcup_{k=1}^\\infty A_k \\right) = \\sum_{k=1}^\\infty \\mathbb{P}(A_k)$\n", - "- We need to prove $\\mathbb{P}\\left(\\bigcup_{k=1}^\\infty B_k \\right) \\leqslant \\sum_{k=1}^\\infty \\mathbb{P}(B_k)$\n", - "\n", - "What is lacking?" - ] - }, - { - "cell_type": "markdown", - "id": "40428e99", - "metadata": { - "slideshow": { - "slide_type": "subslide" - } - }, - "source": [ - "We need to correct our events $B_k$ to be disjoint. Let's introduce sets $C_k = B_k \\setminus \\bigcup_{i=1}^{k-1} B_i$. 
They are disjoint by construction.\n", - "\n", - "Since $C_k \\subset B_k$ and every point of the union belongs to the first $B_k$ that contains it, we have\n", - "- $\\bigcup_{k=1}^\\infty B_k = \\bigcup_{k=1}^\\infty C_k$\n", - "- $\\mathbb{P}(C_k) \\leqslant \\mathbb{P}(B_k)$\n", - "\n", - "Therefore,\n", - "$$\n", - "\\mathbb{P}\\left(\\bigcup_{k=1}^\\infty B_k \\right) = \\mathbb{P}\\left(\\bigcup_{k=1}^\\infty C_k \\right)= \\sum_{k=1}^\\infty \\mathbb{P}(C_k) \\leqslant \\sum_{k=1}^\\infty \\mathbb{P}(B_k)\n", - "$$" - ] - } - ], - "metadata": { - "celltoolbar": "Slideshow", - "kernelspec": { - "display_name": "Python 3 (ipykernel)", - "language": "python", - "name": "python3" - }, - "language_info": { - "codemirror_mode": { - "name": "ipython", - "version": 3 - }, - "file_extension": ".py", - "mimetype": "text/x-python", - "name": "python", - "nbconvert_exporter": "python", - "pygments_lexer": "ipython3", - "version": "3.8.12" - } - }, - "nbformat": 4, - "nbformat_minor": 5 -} diff --git a/Seminar_materials/Seminar02/Seminar 2 (Definition of probability).pdf b/Seminar_materials/Seminar02/Seminar 2 (Definition of probability).pdf deleted file mode 100644 index 9336153..0000000 Binary files a/Seminar_materials/Seminar02/Seminar 2 (Definition of probability).pdf and /dev/null differ diff --git a/Seminar_materials/Seminar03/Seminar 3 (Conditional probability).pdf b/Seminar_materials/Seminar03/Seminar 3 (Conditional probability).pdf deleted file mode 100644 index 073c38b..0000000 Binary files a/Seminar_materials/Seminar03/Seminar 3 (Conditional probability).pdf and /dev/null differ diff --git a/Seminar_materials/Seminar04-05/.ipynb_checkpoints/Seminar 4 (Random variables)-checkpoint.ipynb b/Seminar_materials/Seminar04-05/.ipynb_checkpoints/Seminar 4 (Random variables)-checkpoint.ipynb deleted file mode 100644 index 061a25d..0000000 --- a/Seminar_materials/Seminar04-05/.ipynb_checkpoints/Seminar 4 (Random variables)-checkpoint.ipynb +++ /dev/null @@ -1,757 +0,0 @@ -{ - "cells": [ - { - "cell_type": "markdown", - "id": "a8f7b639", - 
"metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "# Seminar 4" - ] - }, - { - "cell_type": "markdown", - "id": "7bb7a2e9", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "## Recap of axiomatic definition of probability\n", - "\n", - "A probability space is the following tuple: $(\\Omega, \\cal{F}, \\mathbb{P})$." - ] - }, - { - "cell_type": "markdown", - "id": "concrete-petroleum", - "metadata": { - "slideshow": { - "slide_type": "fragment" - } - }, - "source": [ - "- **Sample space** $\\Omega$\n", - "- **Set of events** $\\cal{F}$\n", - "- **Probability measure** $\\mathbb{P}$" - ] - }, - { - "cell_type": "markdown", - "id": "collaborative-madison", - "metadata": { - "slideshow": { - "slide_type": "subslide" - } - }, - "source": [ - "Set of events is $\\cal{F} \\subset 2^\\Omega$ ($\\sigma$-algebra), such that\n", - "1. $\\Omega \\in \\cal{F}$\n", - "2. If $A \\in \\cal{F}$, then $\\overline{A} \\in \\cal{F}$ (closed under complement operation)\n", - "3. If $A_1, A_2, \\ldots \\in \\cal{F}$, then $\\bigcup_{k=1}^\\infty A_k \\in \\cal{F}$ (closed under countable union operation)\n", - "\n", - "The pair $(\\Omega, \\cal{F})$ is called a measurable space. Set $A$ is called measurable if $A \\in \\mathcal{F}$.\n", - "\n", - "Probability measure is $\\mathbb{P}: \\cal{F} \\to \\mathbb{R}_+$, such that\n", - "1. $\\mathbb{P}(\\Omega) = 1$\n", - "2. 
If $A_1, A_2, \\ldots \\in \\cal{F}$ and $A_i \\cap A_j = \\varnothing$ for $i\\neq j$, then $\\mathbb{P}\\left(\\bigcup_{k=1}^\\infty A_k \\right) = \\sum_{k=1}^\\infty \\mathbb{P}(A_k)$ ($\\sigma$-additivity)" - ] - }, - { - "cell_type": "markdown", - "id": "reduced-february", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "## Example 1\n", - "\n", - "Let\n", - "- $\\Omega = (0, 1]$\n", - "- $\\mathcal{F} = 2^{(0, 1]}$\n", - "- $\\mathbb{P}(A) = \\tfrac{k}{n}$, where $k$ is the number of points of the form $\\tfrac{i}{n}, i \\in \\{1, \\ldots, n\\}$, that lie in $A$\n", - "\n", - "We can check that all the necessary conditions are satisfied:" - ] - }, - { - "cell_type": "markdown", - "id": "convinced-musical", - "metadata": { - "slideshow": { - "slide_type": "fragment" - } - }, - "source": [ - "1. $\\Omega = (0, 1] \\in 2^{(0, 1]} = \\mathcal{F}$\n", - "2. If $A \\subset (0, 1]$, i.e. $A \\in \\mathcal{F}$, then $\\overline{A} \\subset (0, 1]$, so $\\overline{A} \\in \\mathcal{F}$ as well\n", - "3. If $A_1, A_2, \\ldots \\in \\mathcal{F}$, then all their elements are in $(0, 1]$, and thus the union $\\bigcup_{k=1}^\\infty A_k \\subset (0, 1]$ also belongs to $\\mathcal{F}$\n", - "4. $\\mathbb{P}(\\Omega) = \\mathbb{P}((0,1]) = \\tfrac{n}{n} = 1$\n", - "5. 
If $A_1$ and $A_2 \\in \\cal{F}$ and $A_1 \\cap A_2 = \\varnothing$, then $\\tfrac{k_1 + k_2}{n} = \\mathbb{P}\\left(A_1 \\cup A_2 \\right) = \\mathbb{P}(A_1) + \\mathbb{P}(A_2) = \\tfrac{k_1}{n} + \\tfrac{k_2}{n}$" - ] - }, - { - "cell_type": "markdown", - "id": "congressional-mercury", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "## Example 2\n", - "\n", - "Let\n", - "- $\\Omega = (0, 1]$\n", - "- $\\mathcal{F}$ is the set of all half-intervals $(a, b]$ in $(0, 1]$\n", - "- $\\mathbb{P}((a, b]) = b - a$ (length of half-interval)" - ] - }, - { - "cell_type": "markdown", - "id": "current-height", - "metadata": { - "slideshow": { - "slide_type": "fragment" - } - }, - "source": [ - "In this case, $\\mathcal{F}$ is not a $\\sigma$-algebra, because union of half-intervals is not necessarily a half-interval." - ] - }, - { - "cell_type": "markdown", - "id": "expected-dynamics", - "metadata": { - "slideshow": { - "slide_type": "subslide" - } - }, - "source": [ - "We need to add something else to $\\mathcal{F}$. We can add all finite unions of half-intervals, then $\\mathcal{F}$ will be an algebra, but still not $\\sigma$-algebra. We can **not** build such $\\mathcal{F}$ by hand, but it exists according to the following theorem.\n", - "\n", - "**Theorem 1:** Let $\\mathcal{A}$ be some set of subsets of set $A$, then exists a minimal $\\sigma$-algebra $\\sigma(\\mathcal{A})$, which contains $\\mathcal{A}$. It means that this $\\sigma(\\mathcal{A})$ will be a part of any larger $\\sigma$-algebra, that contains $\\mathcal{A}$.\n", - "\n", - "**Theorem 2 (Caratheodory Theorem):** Let probability measure $\\mathbb{P}$ be defined on algebra $\\mathcal{A}$ and $\\sigma$-additive on it. 
Then $\\mathbb{P}$ can be extended to $\\sigma(\\mathcal{A})$ uniquely.\n", - "\n", - "This gives us the following result: The measure $\\mathbb{P}$, defined on half-intervals in $(0, 1]$ as $\\mathbb{P}((a, b]) = b - a$, can be uniquely extended to the minimal $\\sigma$-algebra containing such half-intervals. Then it will be a probability measure." - ] - }, - { - "cell_type": "markdown", - "id": "alpine-culture", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "## Lebesgue measure\n", - "\n", - "What we just defined is called the **probabilistic Lebesgue measure** $\\lambda((a, b]) = \\mathbb{P}((a, b]) = b - a$. Next, we can define a non-probabilistic measure.\n", - "\n", - "A mapping $\\mu: \\mathcal{F} \\to [0, +\\infty]$ is called a **measure** if it is $\\sigma$-additive (we simply drop the $\\mathbb{P}(\\Omega) = 1$ property).\n", - "\n", - "So we defined $\\lambda$ on $(0, 1]$. We can naturally extend it to $(n, n+1]$: the measure of a set $A \\subset (n, n+1]$ will be equal to the measure of the set $B \\subset (0, 1]$ obtained by shifting $A$ by $n$.\n", - "\n", - "Finally, for a general measurable set $A \\subset \\mathbb{R}$, define\n", - "$$\n", - "\\lambda(A) = \\sum\\limits_{n \\in \\mathbb{Z}} \\lambda\\left( A \\cap (n, n+1] \\right)\n", - "$$\n", - "\n", - "This is the **Lebesgue measure on the real line**, what we usually call length." 
- ] - }, - { - "cell_type": "markdown", - "id": "trying-reconstruction", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "## Borel $\\sigma$-algebra\n", - "\n", - "The minimal $\\sigma$-algebra $\\mathcal{B}(A)$ that contains all open subsets of $A$ is called the **Borel $\\sigma$-algebra**.\n", - "\n", - "**Lemma:** The Borel $\\sigma$-algebra of subsets of $(0, 1]$ coincides with the minimal $\\sigma$-algebra containing all half-intervals.\n", - "\n", - "Since we extended Lebesgue measure to the whole real line, we can find the measure of every set in $\\mathcal{B}(\\mathbb{R})$." - ] - }, - { - "cell_type": "markdown", - "id": "decreased-database", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "## Measurable mapping\n", - "\n", - "Consider\n", - "- Two measurable spaces $(X, \\mathcal{F}_X)$ and $(Y, \\mathcal{F}_Y)$\n", - "- A mapping $T: X \\to Y$\n", - "- A measurable set $A \\in \\mathcal{F}_Y$\n", - "\n", - "**Full pre-image** of $A$ under $T$ is then\n", - "$$\n", - "T^{-1}(A) = \\{ x \\in X | T(x) \\in A \\}\n", - "$$\n", - "\n", - "Full pre-image of $A$ under $T$ can also be measurable: $T^{-1}(A) \\in \\mathcal{F}_X$, but not necessarily. If for any measurable set $A \\in \\mathcal{F}_Y$ its full pre-image under $T$ is measurable, we say that $T$ is a **measurable mapping**." - ] - }, - { - "cell_type": "markdown", - "id": "present-example", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "## Random variables\n", - "\n", - "Consider probability space $(\\Omega, \\mathcal{F}, \\mathbb{P})$. 
A **random variable** is a measurable function $X: \\Omega \\to \\mathbb{R}$ from $(\\Omega, \\mathcal{F})$ to $(\\mathbb{R}, \\mathcal{B}(\\mathbb{R}))$.\n", - "\n", - "It means that the pre-image of any set $A$ in $\\mathcal{B}(\\mathbb{R})$ belongs to $\\mathcal{F}$:\n", - "$$\n", - "\\forall A \\in \\mathcal{B}(\\mathbb{R}) \\Longrightarrow X^{-1}(A) \\in \\mathcal{F}\n", - "$$\n", - "\n", - "Think of event $B \\in \\mathcal{B}(\\mathbb{R})$ and its pre-image $A \\in \\mathcal{F}$. Naturally, the probability that random variable $X$ lies in set $B$ is the same as probability of event $A$:\n", - "$$\n", - "\\mathbb{P}(X \\in B) = \\mathbb{P}(X^{-1}(B)) = \\mathbb{P}(A)\n", - "$$" - ] - }, - { - "cell_type": "markdown", - "id": "appropriate-bridal", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "## Example 3\n", - "\n", - "Let\n", - "- $\\Omega = [0, 1]$\n", - "- $\\mathcal{F} = \\mathcal{B}([0, 1])$\n", - "- $X_1(\\omega) = \\omega$" - ] - }, - { - "cell_type": "markdown", - "id": "abandoned-concert", - "metadata": { - "slideshow": { - "slide_type": "fragment" - } - }, - "source": [ - "In order to check if $X_1$ will be a random variable, we need to verify that $X_1$ is a measurable function from $([0, 1], \\mathcal{B}([0, 1]))$ to $(\\mathbb{R}, \\mathcal{B}(\\mathbb{R}))$." 
- ] - }, - { - "cell_type": "markdown", - "id": "curious-throat", - "metadata": { - "slideshow": { - "slide_type": "fragment" - } - }, - "source": [ - "It will be such measurable function, because the pre-image of any set $[a, b] \\in \\mathcal{B}(\\mathbb{R})$ lies in $\\mathcal{B}([0, 1])$:\n", - "$$\n", - "X_1^{-1}([a, b]) = [a, b] \\cap [0, 1]\n", - "$$" - ] - }, - { - "cell_type": "markdown", - "id": "stretch-motion", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "## Example 4\n", - "\n", - "Let\n", - "- $\\Omega = [0, 1]$\n", - "- $\\mathcal{F} = \\{\\Omega, \\varnothing\\}$\n", - "- $X_1(\\omega) = \\omega$\n", - "\n", - "In order to check if $X_1$ will be a random variable, we need to verify that $X_1$ is a measurable function from $([0, 1], \\mathcal{F})$ to $(\\mathbb{R}, \\mathcal{B}(\\mathbb{R}))$." - ] - }, - { - "cell_type": "markdown", - "id": "practical-camcorder", - "metadata": { - "slideshow": { - "slide_type": "fragment" - } - }, - "source": [ - "It will **not** be such measurable function, because the pre-image of e.g. $[0, 1/2] \\in \\mathcal{B}(\\mathbb{R})$ does not lie in $\\mathcal{F}$:\n", - "$$\n", - "X_1^{-1}([0, 1/2]) = [0, 1/2] \\not\\in \\{[0, 1], \\varnothing\\}\n", - "$$" - ] - }, - { - "cell_type": "markdown", - "id": "stretch-vehicle", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "## Independence of random variables\n", - "\n", - "A **minimal $\\sigma$-algebra generated by random variable $X$** is the minimal $\\sigma$-algebra containing pre-images of all borel sets:\n", - "$$\n", - "\\sigma(X) = \\sigma\\{X^{-1}(B), B \\in \\mathcal{B}(\\mathbb{R})\\} = \\{X^{-1}(B), B \\in \\mathcal{B}(\\mathbb{R})\\}\n", - "$$\n", - "\n", - "Random variables $X$ and $Y$ are called independent if their minimal generated $\\sigma$-algebras are independent. 
This means that any events $A \\in \\sigma(X)$ and $B \\in \\sigma(Y)$ should be independent:\n", - "$$\n", - "\\mathbb{P}(A \\cap B) = \\mathbb{P}(A) \\mathbb{P}(B)\n", - "$$\n", - "\n", - "Let's say $A$ is the pre-image under $X$ of some Borel set $B_1$ and $B$ is the pre-image under $Y$ of a Borel set $B_2$. Then, $\\mathbb{P}(A) = \\mathbb{P}(X^{-1}(B_1)) = \\mathbb{P}(X \\in B_1)$ and similarly $\\mathbb{P}(B) = \\mathbb{P}(Y \\in B_2)$, so finally,\n", - "$$\n", - "\\mathbb{P}(X \\in B_1, Y \\in B_2) = \\mathbb{P}(X \\in B_1) \\mathbb{P}(Y \\in B_2)\n", - "$$" - ] - }, - { - "cell_type": "markdown", - "id": "built-newton", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "## Distribution of a random variable\n", - "\n", - "Consider probability space $(\\Omega, \\mathcal{F}, \\mathbb{P})$ and random variable $X: \\Omega \\to \\mathbb{R}$. We will call the image $\\mu$ of measure $\\mathbb{P}$ through the mapping $X$ **the distribution** (or distribution law) of $X$:\n", - "$$\n", - "\\mu(A) = \\mathbb{P}(X^{-1}(A))\n", - "$$\n", - "\n", - "We will write $X \\sim \\mu$." - ] - }, - { - "cell_type": "markdown", - "id": "elementary-hospital", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "## Three types of distributions\n", - "\n", - "**Lebesgue theorem:** Let $\\nu$ be Lebesgue measure on $\\mathbb{R}$ and $\\mu$ be any probabilistic measure, then $\\mu = \\mu_d + \\mu_s + \\mu_{ac}$, where\n", - "- $\\mu_d$ is **discrete measure**, i.e. it is concentrated on a countable set of points.\n", - "- $\\mu_s$ is **singular measure**, i.e. there exists a measurable set $S$ such that $\\nu(S) = 0$ and $\\mu_s(\\overline{S}) = 0$ and $\\forall x \\in \\mathbb{R} \\ \\mu_s(\\{x\\}) = 0$.\n", - "- $\\mu_{ac}$ is **absolutely continuous measure**, i.e. 
from $\\nu(A) = 0$ follows $\\mu_{ac}(A) = 0$ for any measurable set $A$.\n", - " By the **Radon-Nikodym theorem**, it is equivalent to the existence of a non-negative measurable function $f: \\mathbb{R} \\to \\mathbb{R}$ called **probability density function**, such that $\\mu_{ac}(A) = \\int_A f(x) dx$.\n", - " \n", - "Because the distributions are defined through the measure, any probability distribution may be viewed as a mixture of three base types: discrete, singular and absolutely continuous.\n", - "\n", - "Normally though, the distributions fall into just one category. Also, you almost never encounter singular distributions in practice." - ] - }, - { - "cell_type": "markdown", - "id": "fabulous-hamilton", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "## Example 5\n", - "\n", - "Consider event $A \\in \\mathcal{F}$ and a random variable $X = \\mathbb{I}\\text{nd}_A$, an indicator:\n", - "$$\n", - "\\mathbb{I}\\text{nd}_A(x) = \\begin{cases}\n", - "1, x \\in A, \\\\\n", - "0, \\text{else}\n", - "\\end{cases}\n", - "$$\n", - "\n", - "$$\n", - "\\mathbb{P}(X = 1) = \\mathbb{P}(A) = p\n", - "$$\n", - "\n", - "$$\n", - "\\mathbb{P}(X = 0) = 1 - \\mathbb{P}(A) = 1 - p\n", - "$$\n", - "\n", - "We say that $X$ follows **Bernoulli distribution** with parameter $p$ and write $X \\sim Be(p)$.\n", - "\n", - "We will call the function $x \\mapsto \\mathbb{P}(X = x)$ a **probability mass function** (PMF)." - ] - }, - { - "cell_type": "markdown", - "id": "ignored-connection", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "## Bernoulli trial scheme\n", - "\n", - "Previously we have worked with independent events that were happening in one probability space. But sometimes we want to have multiple trials, where for every trial the probability space is known, but we are interested in the probability space covering all the trials at once. 
We can achieve it via direct product of probability spaces.\n", - "\n", - "If all probability spaces are the same and equal to:\n", - "- $\\Omega = \\{0, 1\\}$\n", - "- $\\mathcal{F} = \\{\\varnothing, \\{0\\}, \\{1\\}, \\Omega\\}$\n", - "- $\\mathbb{P}(\\{1\\}) = p$ and $\\mathbb{P}(\\{0\\}) = 1 - p$\n", - "\n", - "Then we call such an experiment a **Bernoulli trial scheme**, and its probability space is:\n", - "- $\\Omega = \\{(i_1, \\ldots, i_n), i_j \\in \\{0, 1\\}\\}$\n", - "- $\\mathcal{F} = 2^\\Omega$\n", - "- $\\mathbb{P}(i_1, \\ldots, i_n) = p^{\\sum_j i_j} (1 - p)^{n - \\sum_j i_j}$, where $\\sum_j i_j$ is the number of $j$ such that $i_j = 1$" - ] - }, - { - "cell_type": "markdown", - "id": "statutory-league", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "## Example 6\n", - "\n", - "Consider $X_1, \\ldots, X_n \\sim Be(p)$ independent random variables. Then $Y = \\sum_{k=1}^n X_k$ follows **Binomial distribution** with parameters $n$ and $p$, $Y \\sim Bi(n, p)$. $\\mathbb{P}(Y = k) = ?$" - ] - }, - { - "cell_type": "markdown", - "id": "adaptive-clinton", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "## Solution 6\n", - "\n", - "$$\n", - "\\mathbb{P}(Y = k) = \\begin{pmatrix}n\\\\k\\end{pmatrix} p^k (1-p)^{n-k}\n", - "$$" - ] - }, - { - "cell_type": "markdown", - "id": "joint-donor", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "## Example 7\n", - "\n", - "Consider $X$ and $Y$ independent $\\mathbb{Z}$-valued random variables. 
$\\mathbb{P}(X + Y = k) = ?$" - ] - }, - { - "cell_type": "markdown", - "id": "alive-chancellor", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "## Solution 7\n", - "\n", - "$$\n", - "\\mathbb{P}(X + Y = k) = \\sum_{m} \\mathbb{P}(X = m) \\mathbb{P}(Y = k - m)\n", - "$$" - ] - }, - { - "cell_type": "markdown", - "id": "intense-college", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "## Example 8\n", - "\n", - "Let $X \\sim Bi(n, p)$ and $Y \\sim Bi(m, p)$ be independent. What is the distribution of $Z = X + Y$?" - ] - }, - { - "cell_type": "markdown", - "id": "temporal-member", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "## Solution 8\n", - "\n", - "$$\n", - "\\mathbb{P}(X + Y = k) = \\sum_j \\begin{pmatrix}n\\\\j\\end{pmatrix} p^j (1-p)^{n-j} \\begin{pmatrix}m\\\\k-j\\end{pmatrix} p^{k-j} (1-p)^{m-k+j} = p^{k} (1-p)^{n+m-k} \\sum_j \\begin{pmatrix}n\\\\j\\end{pmatrix} \\begin{pmatrix}m\\\\k-j\\end{pmatrix} = \\begin{pmatrix}n+m\\\\k\\end{pmatrix} p^{k} (1-p)^{n+m-k}\n", - "$$\n", - "\n", - "$$\n", - "Z \\sim Bi(n+m, p)\n", - "$$" - ] - }, - { - "cell_type": "markdown", - "id": "increasing-delaware", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "## Cumulative distribution function\n", - "\n", - "Note that the discrete distribution of a random variable $X$ is uniquely defined by its PMF $\\mathbb{P}(X = x_i)$. In general, we define the distribution using the cumulative distribution function (CDF):\n", - "$$\n", - "F_X(x) = \\mathbb{P}(X < x)\n", - "$$\n", - "\n", - "It has the following properties:\n", - "- $F_X$ is non-decreasing\n", - "- $\\lim\\limits_{x\\to-\\infty}F_X(x) = 0$\n", - "- $\\lim\\limits_{x\\to+\\infty}F_X(x) = 1$\n", - "- $F_X$ is left-continuous\n", - "\n", - "Interestingly enough, the converse is also true. 
Any function that conforms to the properties above defines some probability distribution on $\\mathbb{R}$ and this correspondence is one-to-one." - ] - }, - { - "cell_type": "markdown", - "id": "printable-haiti", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "## Probability density function\n", - "\n", - "- If $X$ has a discrete distribution, then $F_X$ has at most a countable number of jumps of size $p_i = \\mathbb{P}(X = x_i)$ at the points $x_i$, and at each $x = x_i$ it is left-continuous\n", - "- If $X$ has absolutely continuous distribution, then $F_X$ is differentiable a.e. and can be recovered from its derivative:\n", - " $$\n", - " F_X(x) = \\int\\limits_{-\\infty}^x f_X(t) dt\n", - " $$\n", - " \n", - " where $f_X$ is the probability density function and $f_X(x) = F'_X(x)$ a.e." - ] - }, - { - "cell_type": "markdown", - "id": "vocational-chinese", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "## Example 9\n", - "\n", - "We say that random variable $X$ is distributed uniformly on $[a, b]$ and write $X \\sim U([a, b])$ if\n", - "$$\n", - "f_X(x) = \\begin{cases}\n", - "\\frac{1}{b-a}, a \\leqslant x \\leqslant b, \\\\\n", - "0, \\text{else}\n", - "\\end{cases}\n", - "$$" - ] - }, - { - "cell_type": "markdown", - "id": "adjusted-acrylic", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "## Example 10\n", - "\n", - "Consider $X$ and $Y$ independent random variables with PDFs $f_X$ and $f_Y$ respectively. Then, their sum $Z = X + Y$ has absolutely continuous distribution with density\n", - "$$\n", - "f_Z(z) = \\int f_X(x) f_Y(z-x) dx\n", - "$$" - ] - }, - { - "cell_type": "markdown", - "id": "local-minnesota", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "## Example 11\n", - "\n", - "Let $X, Y \\sim U([0, 1])$ be independent and $Z = X + Y$. Find $f_Z(z)$." 
- ] - }, - { - "cell_type": "markdown", - "id": "wireless-commerce", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "## Solution 11\n", - "\n", - "$$\n", - "f_Z(z) = \\int\\limits_{0}^1 f_X(x) f_Y(z-x) dx = \\int\\limits_{0}^1 f_Y(z-x) dx = \\begin{cases}\n", - "z, & 0 \\leqslant z \\leqslant 1, \\\\\n", - "2 - z, & 1 \\leqslant z \\leqslant 2, \\\\\n", - "0, & \\text{else}\n", - "\\end{cases}\n", - "$$" - ] - }, - { - "cell_type": "markdown", - "id": "auburn-speaking", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "## Functions of random variables\n", - "\n", - "Random variables transform like functions, i.e. if $Y = \\varphi(X)$, then $Y(\\omega) = \\varphi(X(\\omega))$." - ] - }, - { - "cell_type": "markdown", - "id": "identical-amazon", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "## Example 12\n", - "\n", - "Let $X$ be a random variable with CDF $F_X$ and PDF $f_X$. Find CDF and PDF of $Y = a X + b$." 
- ] - }, - { - "cell_type": "markdown", - "id": "detailed-bible", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "## Solution 12\n", - "\n", - "If $a > 0$:\n", - "$$\n", - "F_Y(y) = \\mathbb{P}(Y < y) = \\mathbb{P}(a X + b < y) = \\mathbb{P}\\left(X < \\frac{y - b}{a}\\right) = F_X\\left(\\frac{y - b}{a}\\right)\n", - "$$\n", - "\n", - "$$\n", - "f_Y(y) = F'_Y(y) = \\frac{1}{|a|}f_X\\left(\\frac{y - b}{a}\\right)\n", - "$$\n", - "\n", - "In general, for a smooth $\\varphi$, the density will be:\n", - "$$\n", - "f_Y(y) = \\sum\\limits_{\\varphi(x) = y} \\frac{f_X(x)}{|\\varphi'(x)|}\n", - "$$" - ] - }, - { - "cell_type": "markdown", - "id": "alpine-motor", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "## Example 13\n", - "\n", - "Let $X$ be a **normally distributed** random variable with parameters $m$ and $\\sigma^2$:\n", - "$$\n", - "f_X(x) = \\frac{1}{\\sqrt{2 \\pi \\sigma^2}} \\exp \\left( - \\frac{(x - m)^2}{2\\sigma^2} \\right)\n", - "$$\n", - "\n", - "Find the PDF of $Y = X^2$." 
- ] - }, - { - "cell_type": "markdown", - "id": "narrow-seven", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "## Solution 13" - ] - } - ], - "metadata": { - "celltoolbar": "Slideshow", - "kernelspec": { - "display_name": "Python 3 (ipykernel)", - "language": "python", - "name": "python3" - }, - "language_info": { - "codemirror_mode": { - "name": "ipython", - "version": 3 - }, - "file_extension": ".py", - "mimetype": "text/x-python", - "name": "python", - "nbconvert_exporter": "python", - "pygments_lexer": "ipython3", - "version": "3.9.7" - } - }, - "nbformat": 4, - "nbformat_minor": 5 -} diff --git a/Seminar_materials/Seminar04-05/Seminar 4-5 (Random variables).ipynb b/Seminar_materials/Seminar04-05/Seminar 4-5 (Random variables).ipynb deleted file mode 100644 index 061a25d..0000000 --- a/Seminar_materials/Seminar04-05/Seminar 4-5 (Random variables).ipynb +++ /dev/null @@ -1,757 +0,0 @@ -{ - "cells": [ - { - "cell_type": "markdown", - "id": "a8f7b639", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "# Seminar 4" - ] - }, - { - "cell_type": "markdown", - "id": "7bb7a2e9", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "## Recap of axiomatic definition of probability\n", - "\n", - "A probability space is the following tuple: $(\\Omega, \\cal{F}, \\mathbb{P})$." - ] - }, - { - "cell_type": "markdown", - "id": "concrete-petroleum", - "metadata": { - "slideshow": { - "slide_type": "fragment" - } - }, - "source": [ - "- **Sample space** $\\Omega$\n", - "- **Set of events** $\\cal{F}$\n", - "- **Probability measure** $\\mathbb{P}$" - ] - }, - { - "cell_type": "markdown", - "id": "collaborative-madison", - "metadata": { - "slideshow": { - "slide_type": "subslide" - } - }, - "source": [ - "Set of events is $\\cal{F} \\subset 2^\\Omega$ ($\\sigma$-algebra), such that\n", - "1. $\\Omega \\in \\cal{F}$\n", - "2. 
If $A \\in \\cal{F}$, then $\\overline{A} \\in \\cal{F}$ (closed under complement operation)\n", - "3. If $A_1, A_2, \\ldots \\in \\cal{F}$, then $\\bigcup_{k=1}^\\infty A_k \\in \\cal{F}$ (closed under countable union operation)\n", - "\n", - "The pair $(\\Omega, \\cal{F})$ is called a measurable space. Set $A$ is called measurable if $A \\in \\mathcal{F}$.\n", - "\n", - "Probability measure is $\\mathbb{P}: \\cal{F} \\to \\mathbb{R}_+$, such that\n", - "1. $\\mathbb{P}(\\Omega) = 1$\n", - "2. If $A_1, A_2, \\ldots \\in \\cal{F}$ and $A_i \\cap A_j = \\varnothing$ for $i\\neq j$, then $\\mathbb{P}\\left(\\bigcup_{k=1}^\\infty A_k \\right) = \\sum_{k=1}^\\infty \\mathbb{P}(A_k)$ ($\\sigma$-additivity)" - ] - }, - { - "cell_type": "markdown", - "id": "reduced-february", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "## Example 1\n", - "\n", - "Let\n", - "- $\\Omega = (0, 1]$\n", - "- $\\mathcal{F} = 2^{(0, 1]}$\n", - "- $\\mathbb{P}(A) = \\tfrac{k}{n}$, where $k$ is the number of points of the form $\\tfrac{i}{n}, i \\in \\{1, \\ldots, n\\}$, that lie in $A$\n", - "\n", - "We can check that all the necessary conditions are satisfied:" - ] - }, - { - "cell_type": "markdown", - "id": "convinced-musical", - "metadata": { - "slideshow": { - "slide_type": "fragment" - } - }, - "source": [ - "1. $\\Omega = (0, 1] \\in 2^{(0, 1]} = \\mathcal{F}$\n", - "2. If $A \\subset (0, 1]$, i.e. $A \\in \\mathcal{F}$, then $\\overline{A} \\subset (0, 1]$, so $\\overline{A} \\in \\mathcal{F}$ as well\n", - "3. If $A_1, A_2, \\ldots \\in \\mathcal{F}$, then all their elements are in $(0, 1]$, and thus the union $\\bigcup_{k=1}^\\infty A_k \\subset (0, 1]$ also belongs to $\\mathcal{F}$\n", - "4. $\\mathbb{P}(\\Omega) = \\mathbb{P}((0,1]) = \\tfrac{n}{n} = 1$\n", - "5. 
If $A_1$ and $A_2 \\in \\cal{F}$ and $A_1 \\cap A_2 = \\varnothing$, then $\\tfrac{k_1 + k_2}{n} = \\mathbb{P}\\left(A_1 \\cup A_2 \\right) = \\mathbb{P}(A_1) + \\mathbb{P}(A_2) = \\tfrac{k_1}{n} + \\tfrac{k_2}{n}$" - ] - }, - { - "cell_type": "markdown", - "id": "congressional-mercury", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "## Example 2\n", - "\n", - "Let\n", - "- $\\Omega = (0, 1]$\n", - "- $\\mathcal{F}$ is the set of all half-intervals $(a, b]$ in $(0, 1]$\n", - "- $\\mathbb{P}((a, b]) = b - a$ (length of half-interval)" - ] - }, - { - "cell_type": "markdown", - "id": "current-height", - "metadata": { - "slideshow": { - "slide_type": "fragment" - } - }, - "source": [ - "In this case, $\\mathcal{F}$ is not a $\\sigma$-algebra, because union of half-intervals is not necessarily a half-interval." - ] - }, - { - "cell_type": "markdown", - "id": "expected-dynamics", - "metadata": { - "slideshow": { - "slide_type": "subslide" - } - }, - "source": [ - "We need to add something else to $\\mathcal{F}$. We can add all finite unions of half-intervals, then $\\mathcal{F}$ will be an algebra, but still not $\\sigma$-algebra. We can **not** build such $\\mathcal{F}$ by hand, but it exists according to the following theorem.\n", - "\n", - "**Theorem 1:** Let $\\mathcal{A}$ be some set of subsets of set $A$, then exists a minimal $\\sigma$-algebra $\\sigma(\\mathcal{A})$, which contains $\\mathcal{A}$. It means that this $\\sigma(\\mathcal{A})$ will be a part of any larger $\\sigma$-algebra, that contains $\\mathcal{A}$.\n", - "\n", - "**Theorem 2 (Caratheodory Theorem):** Let probability measure $\\mathbb{P}$ be defined on algebra $\\mathcal{A}$ and $\\sigma$-additive on it. 
Then $\\mathbb{P}$ can be extended to $\\sigma(\\mathcal{A})$ uniquely.\n", - "\n", - "This gives us the following result: The measure $\\mathbb{P}$, defined on half-intervals in $(0, 1]$ as $\\mathbb{P}((a, b]) = b - a$, can be uniquely extended to the minimal $\\sigma$-algebra containing such half-intervals. Then it will be a probability measure." - ] - }, - { - "cell_type": "markdown", - "id": "alpine-culture", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "## Lebesgue measure\n", - "\n", - "What we just defined is called the **probabilistic Lebesgue measure** $\\lambda((a, b]) = \\mathbb{P}((a, b]) = b - a$. Next, we can define a non-probabilistic measure.\n", - "\n", - "A mapping $\\mu: \\mathcal{F} \\to [0, +\\infty]$ is called a **measure** if it is $\\sigma$-additive (we simply drop the $\\mathbb{P}(\\Omega) = 1$ property).\n", - "\n", - "So we defined $\\lambda$ on $(0, 1]$. We can naturally extend it to $(n, n+1]$: the measure of a set $A \\subset (n, n+1]$ will be equal to the measure of the set $B \\subset (0, 1]$ obtained by shifting $A$ by $n$.\n", - "\n", - "Finally, for a general measurable set $A \\subset \\mathbb{R}$, define\n", - "$$\n", - "\\lambda(A) = \\sum\\limits_{n \\in \\mathbb{Z}} \\lambda\\left( A \\cap (n, n+1] \\right)\n", - "$$\n", - "\n", - "This is the **Lebesgue measure on the real line**, what we usually call length." 
- ] - }, - { - "cell_type": "markdown", - "id": "trying-reconstruction", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "## Borel $\\sigma$-algebra\n", - "\n", - "The minimal $\\sigma$-algebra $\\mathcal{B}(A)$ that contains all open subsets of $A$ is called the **Borel $\\sigma$-algebra**.\n", - "\n", - "**Lemma:** The Borel $\\sigma$-algebra of subsets of $(0, 1]$ coincides with the minimal $\\sigma$-algebra containing all half-intervals.\n", - "\n", - "Since we extended Lebesgue measure to the whole real line, we can find the measure of every set in $\\mathcal{B}(\\mathbb{R})$." - ] - }, - { - "cell_type": "markdown", - "id": "decreased-database", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "## Measurable mapping\n", - "\n", - "Consider\n", - "- Two measurable spaces $(X, \\mathcal{F}_X)$ and $(Y, \\mathcal{F}_Y)$\n", - "- A mapping $T: X \\to Y$\n", - "- A measurable set $A \\in \\mathcal{F}_Y$\n", - "\n", - "**Full pre-image** of $A$ under $T$ is then\n", - "$$\n", - "T^{-1}(A) = \\{ x \\in X | T(x) \\in A \\}\n", - "$$\n", - "\n", - "Full pre-image of $A$ under $T$ can also be measurable: $T^{-1}(A) \\in \\mathcal{F}_X$, but not necessarily. If for any measurable set $A \\in \\mathcal{F}_Y$ its full pre-image under $T$ is measurable, we say that $T$ is a **measurable mapping**." - ] - }, - { - "cell_type": "markdown", - "id": "present-example", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "## Random variables\n", - "\n", - "Consider probability space $(\\Omega, \\mathcal{F}, \\mathbb{P})$. 
A **random variable** is a measurable function $X: \\Omega \\to \\mathbb{R}$ from $(\\Omega, \\mathcal{F})$ to $(\\mathbb{R}, \\mathcal{B}(\\mathbb{R}))$.\n", - "\n", - "It means that the pre-image of any set $A$ in $\\mathcal{B}(\\mathbb{R})$ belongs to $\\mathcal{F}$:\n", - "$$\n", - "\\forall A \\in \\mathcal{B}(\\mathbb{R}) \\Longrightarrow X^{-1}(A) \\in \\mathcal{F}\n", - "$$\n", - "\n", - "Think of event $B \\in \\mathcal{B}(\\mathbb{R})$ and its pre-image $A \\in \\mathcal{F}$. Naturally, the probability that random variable $X$ lies in set $B$ is the same as probability of event $A$:\n", - "$$\n", - "\\mathbb{P}(X \\in B) = \\mathbb{P}(X^{-1}(B)) = \\mathbb{P}(A)\n", - "$$" - ] - }, - { - "cell_type": "markdown", - "id": "appropriate-bridal", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "## Example 3\n", - "\n", - "Let\n", - "- $\\Omega = [0, 1]$\n", - "- $\\mathcal{F} = \\mathcal{B}([0, 1])$\n", - "- $X_1(\\omega) = \\omega$" - ] - }, - { - "cell_type": "markdown", - "id": "abandoned-concert", - "metadata": { - "slideshow": { - "slide_type": "fragment" - } - }, - "source": [ - "In order to check if $X_1$ will be a random variable, we need to verify that $X_1$ is a measurable function from $([0, 1], \\mathcal{B}([0, 1]))$ to $(\\mathbb{R}, \\mathcal{B}(\\mathbb{R}))$." 
- ] - }, - { - "cell_type": "markdown", - "id": "curious-throat", - "metadata": { - "slideshow": { - "slide_type": "fragment" - } - }, - "source": [ - "It will be such measurable function, because the pre-image of any set $[a, b] \\in \\mathcal{B}(\\mathbb{R})$ lies in $\\mathcal{B}([0, 1])$:\n", - "$$\n", - "X_1^{-1}([a, b]) = [a, b] \\cap [0, 1]\n", - "$$" - ] - }, - { - "cell_type": "markdown", - "id": "stretch-motion", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "## Example 4\n", - "\n", - "Let\n", - "- $\\Omega = [0, 1]$\n", - "- $\\mathcal{F} = \\{\\Omega, \\varnothing\\}$\n", - "- $X_1(\\omega) = \\omega$\n", - "\n", - "In order to check if $X_1$ will be a random variable, we need to verify that $X_1$ is a measurable function from $([0, 1], \\mathcal{F})$ to $(\\mathbb{R}, \\mathcal{B}(\\mathbb{R}))$." - ] - }, - { - "cell_type": "markdown", - "id": "practical-camcorder", - "metadata": { - "slideshow": { - "slide_type": "fragment" - } - }, - "source": [ - "It will **not** be such measurable function, because the pre-image of e.g. $[0, 1/2] \\in \\mathcal{B}(\\mathbb{R})$ does not lie in $\\mathcal{F}$:\n", - "$$\n", - "X_1^{-1}([0, 1/2]) = [0, 1/2] \\not\\in \\{[0, 1], \\varnothing\\}\n", - "$$" - ] - }, - { - "cell_type": "markdown", - "id": "stretch-vehicle", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "## Independence of random variables\n", - "\n", - "A **minimal $\\sigma$-algebra generated by random variable $X$** is the minimal $\\sigma$-algebra containing pre-images of all borel sets:\n", - "$$\n", - "\\sigma(X) = \\sigma\\{X^{-1}(B), B \\in \\mathcal{B}(\\mathbb{R})\\} = \\{X^{-1}(B), B \\in \\mathcal{B}(\\mathbb{R})\\}\n", - "$$\n", - "\n", - "Random variables $X$ and $Y$ are called independent if their minimal generated $\\sigma$-algebras are independent. 
This means that any events $A \\in \\sigma(X)$ and $B \\in \\sigma(Y)$ should be independent:\n", - "$$\n", - "\\mathbb{P}(A \\cap B) = \\mathbb{P}(A) \\mathbb{P}(B)\n", - "$$\n", - "\n", - "Let's say $A$ is the pre-image under $X$ of some Borel set $B_1$ and $B$ is the pre-image under $Y$ of some Borel set $B_2$. Then, $\\mathbb{P}(A) = \\mathbb{P}(X^{-1}(B_1)) = \\mathbb{P}(X \\in B_1)$ and similarly $\\mathbb{P}(B) = \\mathbb{P}(Y \\in B_2)$, so finally,\n", - "$$\n", - "\\mathbb{P}(X \\in B_1, Y \\in B_2) = \\mathbb{P}(X \\in B_1) \\mathbb{P}(Y \\in B_2)\n", - "$$" - ] - }, - { - "cell_type": "markdown", - "id": "built-newton", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "## Distribution of a random variable\n", - "\n", - "Consider probability space $(\\Omega, \\mathcal{F}, \\mathbb{P})$ and random variable $X: \\Omega \\to \\mathbb{R}$. We will call the image $\\mu$ of measure $\\mathbb{P}$ through the mapping $X$ **the distribution** (or distribution law) of $X$:\n", - "$$\n", - "\\mu(A) = \\mathbb{P}(X^{-1}(A))\n", - "$$\n", - "\n", - "We will write $X \\sim \\mu$." - ] - }, - { - "cell_type": "markdown", - "id": "elementary-hospital", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "## Three types of distributions\n", - "\n", - "**Lebesgue theorem:** Let $\\nu$ be the Lebesgue measure on $\\mathbb{R}$ and $\\mu$ be any probability measure, then $\\mu = \\mu_d + \\mu_s + \\mu_{ac}$, where\n", - "- $\\mu_d$ is a **discrete measure**, i.e. it is concentrated on a countable set of points.\n", - "- $\\mu_s$ is a **singular measure**, i.e. there exists a measurable set $S$ such that $\\nu(S) = 0$ and $\\mu_s(\\overline{S}) = 0$, and $\\mu_s(\\{x\\}) = 0$ for all $x \\in \\mathbb{R}$.\n", - "- $\\mu_{ac}$ is an **absolutely continuous measure**, i.e.
from $\\nu(A) = 0$ follows $\\mu_{ac}(A) = 0$ for any measurable set $A$.\n", - " By the **Radon-Nikodym theorem**, it is equivalent to the existence of a non-negative measurable function $f: \\mathbb{R} \\to \\mathbb{R}$ called **probability density function**, such that $\\mu_{ac}(A) = \\int_A f(x) dx$.\n", - " \n", - "Because the distributions are defined through the measure, any probability distribution may be viewed as a mixture of three base types: discrete, singular and absolutely continuous.\n", - "\n", - "Normally though, the distributions fall into just one category. Also, singular distributions are rarely encountered in practice." - ] - }, - { - "cell_type": "markdown", - "id": "fabulous-hamilton", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "## Example 5\n", - "\n", - "Consider event $A \\in \\mathcal{F}$ and a random variable $X = \\mathbb{I}\\text{nd}_A$, an indicator:\n", - "$$\n", - "\\mathbb{I}\\text{nd}_A(\\omega) = \\begin{cases}\n", - "1, \\omega \\in A, \\\\\n", - "0, \\text{else}\n", - "\\end{cases}\n", - "$$\n", - "\n", - "$$\n", - "\\mathbb{P}(X = 1) = \\mathbb{P}(A) = p\n", - "$$\n", - "\n", - "$$\n", - "\\mathbb{P}(X = 0) = 1 - \\mathbb{P}(A) = 1 - p\n", - "$$\n", - "\n", - "We say that $X$ follows **Bernoulli distribution** with parameter $p$ and write $X \\sim Be(p)$.\n", - "\n", - "We will call $\\mathbb{P}_X(x) = \\mathbb{P}(X = x)$ a **probability mass function** (PMF)." - ] - }, - { - "cell_type": "markdown", - "id": "ignored-connection", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "## Bernoulli trial scheme\n", - "\n", - "Previously we have worked with independent events that were happening in one probability space. But sometimes we want to have multiple trials, where for every trial the probability space is known, but we are interested in the probability space covering all the trials at once.
We can achieve it via the direct product of probability spaces.\n", - "\n", - "If all probability spaces are the same and equal to:\n", - "- $\\Omega = \\{0, 1\\}$\n", - "- $\\mathcal{F} = \\{\\varnothing, \\{0\\}, \\{1\\}, \\Omega\\}$\n", - "- $\\mathbb{P}(\\{1\\}) = p$ and $\\mathbb{P}(\\{0\\}) = 1 - p$\n", - "\n", - "Then we call such experiment a **Bernoulli trial scheme**, and the probability space of it is:\n", - "- $\\Omega = \\{(i_1, \\ldots, i_n), i_j \\in \\{0, 1\\}\\}$\n", - "- $\\mathcal{F} = 2^\\Omega$\n", - "- $\\mathbb{P}(i_1, \\ldots, i_n) = p^{\\sum_j i_j} (1 - p)^{n - \\sum_j i_j}$, i.e. $p$ is raised to the number of ones and $1 - p$ to the number of zeros" - ] - }, - { - "cell_type": "markdown", - "id": "statutory-league", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "## Example 6\n", - "\n", - "Consider $X_1, \\ldots, X_n \\sim Be(p)$ independent random variables. Then $Y = \\sum_{k=1}^n X_k$ follows **Binomial distribution** with parameters $n$ and $p$, $Y \\sim Bi(n, p)$. $\\mathbb{P}(Y = k) = ?$" - ] - }, - { - "cell_type": "markdown", - "id": "adaptive-clinton", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "## Solution 6\n", - "\n", - "$$\n", - "\\mathbb{P}(Y = k) = \\begin{pmatrix}n\\\\k\\end{pmatrix} p^k (1-p)^{n-k}\n", - "$$" - ] - }, - { - "cell_type": "markdown", - "id": "joint-donor", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "## Example 7\n", - "\n", - "Consider $X$ and $Y$ independent $\\mathbb{Z}$-valued random variables.
$\\mathbb{P}(X + Y = k) = ?$" - ] - }, - { - "cell_type": "markdown", - "id": "alive-chancellor", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "## Solution 7\n", - "\n", - "$$\n", - "\\mathbb{P}(X + Y = k) = \\sum_{m} \\mathbb{P}(X = m) \\mathbb{P}(Y = k - m)\n", - "$$" - ] - }, - { - "cell_type": "markdown", - "id": "intense-college", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "## Example 8\n", - "\n", - "Let $X \\sim Bi(n, p)$ and $Y \\sim Bi(m, p)$ be independent. What is the distribution of $Z = X + Y$?" - ] - }, - { - "cell_type": "markdown", - "id": "temporal-member", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "## Solution 8\n", - "\n", - "$$\n", - "\\mathbb{P}(X + Y = k) = \\sum_j \\begin{pmatrix}n\\\\j\\end{pmatrix} p^j (1-p)^{n-j} \\begin{pmatrix}m\\\\k-j\\end{pmatrix} p^{k-j} (1-p)^{m-k+j} = p^{k} (1-p)^{n+m-k} \\sum_j \\begin{pmatrix}n\\\\j\\end{pmatrix} \\begin{pmatrix}m\\\\k-j\\end{pmatrix} = \\begin{pmatrix}n+m\\\\k\\end{pmatrix} p^{k} (1-p)^{n+m-k}\n", - "$$\n", - "\n", - "$$\n", - "Z \\sim Bi(n+m, p)\n", - "$$" - ] - }, - { - "cell_type": "markdown", - "id": "increasing-delaware", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "## Cumulative distribution function\n", - "\n", - "Note that the distribution of a discrete random variable $X$ is uniquely defined by its PMF $\\mathbb{P}(X = x_i)$. In general, we define the distribution using the cumulative distribution function (CDF):\n", - "$$\n", - "F_X(x) = \\mathbb{P}(X < x)\n", - "$$\n", - "\n", - "It has the following properties:\n", - "- $F_X$ is non-decreasing\n", - "- $\\lim\\limits_{x\\to-\\infty}F_X(x) = 0$\n", - "- $\\lim\\limits_{x\\to+\\infty}F_X(x) = 1$\n", - "- $F_X$ is left-continuous\n", - "\n", - "Interestingly enough, the converse is also true.
Any function that conforms to the properties above defines some probability distribution on $\\mathbb{R}$ and this relation is unique." - ] - }, - { - "cell_type": "markdown", - "id": "printable-haiti", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "## Probability density function\n", - "\n", - "- If $X$ has a discrete distribution, then $F_X$ has at most countably many jumps, of sizes $p_i = \\mathbb{P}(X = x_i)$ at the points $x_i$, and it is continuous at every other point\n", - "- If $X$ has absolutely continuous distribution, then $F_X$ is differentiable a.e. and can be recovered from its derivative:\n", - " $$\n", - " F_X(x) = \\int\\limits_{-\\infty}^x f_X(t) dt\n", - " $$\n", - " \n", - " where $f_X(t)$ is the probability density function and $f_X(t) = F'_X(t)$ a.e." - ] - }, - { - "cell_type": "markdown", - "id": "vocational-chinese", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "## Example 9\n", - "\n", - "We say that random variable $X$ is distributed uniformly on $[a, b]$ and write $X \\sim U([a, b])$ if\n", - "$$\n", - "f_X(x) = \\begin{cases}\n", - "\\frac{1}{b-a}, a \\leqslant x \\leqslant b, \\\\\n", - "0, \\text{else}\n", - "\\end{cases}\n", - "$$" - ] - }, - { - "cell_type": "markdown", - "id": "adjusted-acrylic", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "## Example 10\n", - "\n", - "Consider $X$ and $Y$ independent random variables with PDFs $f_X$ and $f_Y$ respectively. Then, their sum $Z = X + Y$ has absolutely continuous distribution with density\n", - "$$\n", - "f_Z(z) = \\int f_X(x) f_Y(z-x) dx\n", - "$$" - ] - }, - { - "cell_type": "markdown", - "id": "local-minnesota", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "## Example 11\n", - "\n", - "Let $X, Y \\sim U([0, 1])$ be independent and $Z = X + Y$. Find $f_Z(z)$."
- ] - }, - { - "cell_type": "markdown", - "id": "wireless-commerce", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "## Solution 11\n", - "\n", - "$$\n", - "f_Z(z) = \\int\\limits_{0}^1 f_X(x) f_Y(z-x) dx = \\int\\limits_{0}^1 f_Y(z-x) dx = \\begin{cases}\n", - "z, & 0 \\leqslant z \\leqslant 1, \\\\\n", - "2 - z, & 1 \\leqslant z \\leqslant 2, \\\\\n", - "0, & \\text{else}\n", - "\\end{cases}\n", - "$$" - ] - }, - { - "cell_type": "markdown", - "id": "auburn-speaking", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "## Functions of random variables\n", - "\n", - "Random variables transform like functions, i.e. if $Y = \\varphi(X)$, then $Y(\\omega) = \\varphi(X(\\omega))$." - ] - }, - { - "cell_type": "markdown", - "id": "identical-amazon", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "## Example 12\n", - "\n", - "Let $X$ be a random variable with CDF $F_X$ and PDF $f_X$. Find CDF and PDF of $Y = a X + b$." 
- ] - }, - { - "cell_type": "markdown", - "id": "detailed-bible", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "## Solution 12\n", - "\n", - "If $a > 0$:\n", - "$$\n", - "F_Y(y) = \\mathbb{P}(Y < y) = \\mathbb{P}(a X + b < y) = \\mathbb{P}\\left(X < \\frac{y - b}{a}\\right) = F_X\\left(\\frac{y - b}{a}\\right)\n", - "$$\n", - "\n", - "$$\n", - "f_Y(y) = F'_Y(y) = \\frac{1}{|a|}f_X\\left(\\frac{y - b}{a}\\right)\n", - "$$\n", - "\n", - "In general for a smooth $\\varphi$, the density will be:\n", - "$$\n", - "f_Y(y) = \\sum\\limits_{\\varphi(x) = y} \\frac{f_X(x)}{|\\varphi'(x)|}\n", - "$$" - ] - }, - { - "cell_type": "markdown", - "id": "alpine-motor", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "## Example 13\n", - "\n", - "Let $X$ be a **normally distributed** random variable with parameters $m$ and $\\sigma^2$:\n", - "$$\n", - "f_X(x) = \\frac{1}{\\sqrt{2 \\pi \\sigma^2}} \\exp \\left( - \\frac{(x - m)^2}{2\\sigma^2} \\right)\n", - "$$\n", - "\n", - "Find PDF of $Y = X^2$."
- ] - }, - { - "cell_type": "markdown", - "id": "narrow-seven", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "## Solution 13" - ] - } - ], - "metadata": { - "celltoolbar": "Slideshow", - "kernelspec": { - "display_name": "Python 3 (ipykernel)", - "language": "python", - "name": "python3" - }, - "language_info": { - "codemirror_mode": { - "name": "ipython", - "version": 3 - }, - "file_extension": ".py", - "mimetype": "text/x-python", - "name": "python", - "nbconvert_exporter": "python", - "pygments_lexer": "ipython3", - "version": "3.9.7" - } - }, - "nbformat": 4, - "nbformat_minor": 5 -} diff --git a/Seminar_materials/Seminar04-05/Seminar 4-5 (Random variables).pdf b/Seminar_materials/Seminar04-05/Seminar 4-5 (Random variables).pdf deleted file mode 100644 index 3875e57..0000000 Binary files a/Seminar_materials/Seminar04-05/Seminar 4-5 (Random variables).pdf and /dev/null differ diff --git a/Seminar_materials/Seminar06/Seminar 6 (Expectation and Variance).ipynb b/Seminar_materials/Seminar06/Seminar 6 (Expectation and Variance).ipynb deleted file mode 100644 index 547c0fb..0000000 --- a/Seminar_materials/Seminar06/Seminar 6 (Expectation and Variance).ipynb +++ /dev/null @@ -1,622 +0,0 @@ -{ - "cells": [ - { - "cell_type": "markdown", - "id": "a8f7b639", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "# Seminar 5" - ] - }, - { - "cell_type": "markdown", - "id": "present-example", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "## Recap of random variables\n", - "\n", - "Consider probability space $(\\Omega, \\mathcal{F}, \\mathbb{P})$. A **random variable** is a measurable function $X: \\Omega \\to \\mathbb{R}$ from $(\\Omega, \\mathcal{F})$ to $(\\mathbb{R}, \\mathcal{B}(\\mathbb{R}))$."
- ] - }, - { - "cell_type": "markdown", - "id": "athletic-durham", - "metadata": { - "slideshow": { - "slide_type": "fragment" - } - }, - "source": [ - "Measurable function $X: \\Omega \\to \\mathbb{R}$ from $(\\Omega, \\mathcal{F})$ to $(\\mathbb{R}, \\mathcal{B}(\\mathbb{R}))$. It means that the pre-image of any set $A$ in $\\mathcal{B}(\\mathbb{R})$ belongs to $\\mathcal{F}$:\n", - "$$\n", - "\\forall A \\in \\mathcal{B}(\\mathbb{R}) \\Longrightarrow X^{-1}(A) \\in \\mathcal{F}\n", - "$$" - ] - }, - { - "cell_type": "markdown", - "id": "congressional-malaysia", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "## Recap of distributions\n", - "\n", - "We will call the image $\\mu$ of measure $\\mathbb{P}$ through the mapping $X$ **the distribution** of r.v. $X$ and write $X \\sim \\mu$:\n", - "$$\n", - "\\mu(A) = \\mathbb{P}(X^{-1}(A))\n", - "$$" - ] - }, - { - "cell_type": "markdown", - "id": "successful-associate", - "metadata": { - "slideshow": { - "slide_type": "fragment" - } - }, - "source": [ - "Any probabilistic measure (hence any probability distribution) can be decomposed into sum of three types of measures:\n", - "- Discrete\n", - "- Singular\n", - "- Absolutely continuous\n", - "\n", - "Normally, the distributions fall into just one category and you never encounter singular distributions." 
- ] - }, - { - "cell_type": "markdown", - "id": "behind-startup", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "## Functions describing distributions\n", - "\n", - "- For any distribution we have **cumulative distribution function** (CDF) $F_X(x) = \\mathbb{P}(X < x)$\n", - "- For discrete distributions we have **probability mass function** (PMF) $\\mathbb{P}_X(x) = \\mathbb{P}(X = x)$\n", - "- For continuous distributions we have **probability density function** (PDF) $f_X(x) = F'_X(x)$" - ] - }, - { - "cell_type": "markdown", - "id": "auburn-speaking", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "## Functions of random variables\n", - "\n", - "Random variables transform like functions, i.e. if $Y = \\varphi(X)$, then $Y(\\omega) = \\varphi(X(\\omega))$.\n", - "\n", - "For a smooth $\\varphi$, the density will be:\n", - "$$\n", - "f_Y(y) = \\sum\\limits_{\\varphi(x) = y} \\frac{f_X(x)}{|\\varphi'(x)|}\n", - "$$" - ] - }, - { - "cell_type": "markdown", - "id": "laden-divorce", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "## Mathematical expectation\n", - "\n", - "Mathematical expectation generalizes the concept of mean. Consider probability space $(\\Omega, \\mathcal{F}, \\mathbb{P})$ and random variable $X: \\Omega \\to \\mathbb{R}$. Then expected value of $X$ is\n", - "$$\n", - "\\mathbb{E}\\left[X\\right] = \\int_\\Omega X(\\omega) d\\mathbb{P}(\\omega) = \\int_{\\mathbb{R}} x d\\mu(x)\n", - "$$\n", - "\n", - "- If $X$ is discrete, then\n", - " $$\n", - " \\mathbb{E}\\left[X\\right] = \\sum_k x_k \\mathbb{P}(X = x_k)\n", - " $$\n", - "- If $X$ is continuous, then\n", - " $$\n", - " \\mathbb{E}\\left[X\\right] = \\int_{-\\infty}^{+\\infty} x f_X(x) dx\n", - " $$\n", - " \n", - "It may be the case that $\\mathbb{E}\\left[X\\right] = \\pm \\infty$ or even does not exist." 
- ] - }, - { - "cell_type": "markdown", - "id": "rational-operator", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "## Example 1\n", - "\n", - "We roll a die and r.v. $X$ is the score of a roll. What is $\\mathbb{E}\\left[X\\right]$?" - ] - }, - { - "cell_type": "markdown", - "id": "returning-webcam", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "## Solution 1\n", - "\n", - "$$\n", - "\\mathbb{E}\\left[X\\right] = \\sum_{k=1}^6 k \\cdot \\mathbb{P}(X = k) = \\frac16 \\sum_{k=1}^6 k = \\frac72\n", - "$$" - ] - }, - { - "cell_type": "markdown", - "id": "basic-feature", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "## Example 2\n", - "\n", - "We flip a non-symmetric coin and $X$ is the r.v. for heads, $X \\sim Be(p)$. What is $\\mathbb{E}\\left[X\\right]$?" - ] - }, - { - "cell_type": "markdown", - "id": "running-lemon", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "## Solution 2\n", - "\n", - "$$\n", - "\\mathbb{E}\\left[X\\right] = 0 \\cdot \\mathbb{P}(X = 0) + 1 \\cdot \\mathbb{P}(X = 1) = p\n", - "$$" - ] - }, - { - "cell_type": "markdown", - "id": "interesting-topic", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "## Example 3\n", - "\n", - "Consider discrete r.v. $X$ with distribution $\\mathbb{P}(X = 2^n) = 2^{-n}$. What is $\\mathbb{E}\\left[X\\right]$?" - ] - }, - { - "cell_type": "markdown", - "id": "working-syria", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "## Solution 3\n", - "\n", - "$$\n", - "\\mathbb{E}\\left[X\\right] = \\sum_{n} 2^n 2^{-n} = \\infty\n", - "$$" - ] - }, - { - "cell_type": "markdown", - "id": "dressed-bookmark", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "## Example 4\n", - "\n", - "Consider discrete r.v. 
$X$ with distribution $\\mathbb{P}(X = 2^n) = \\mathbb{P}(X = - 2^n) = 2^{-n-1}$. What is $\\mathbb{E}\\left[X\\right]$?" - ] - }, - { - "cell_type": "markdown", - "id": "prospective-partnership", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "## Solution 4\n", - "\n", - "Expectation of r.v. $X$ exists if and only if $\\mathbb{E}\\left[|X|\\right] < \\infty$. Here $\\mathbb{E}\\left[|X|\\right] = \\sum_{n} 2^n \\cdot 2 \\cdot 2^{-n-1} = \\infty$, so $\\mathbb{E}\\left[X\\right]$ does not exist." - ] - }, - { - "cell_type": "markdown", - "id": "incredible-revolution", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "## Example 5\n", - "\n", - "Consider $X$ with **Poisson distribution** $X \\sim Pois(\\lambda)$:\n", - "$$\n", - "\\mathbb{P}(X = k) = \\frac{\\lambda^k}{k!} e^{-\\lambda}\n", - "$$\n", - "\n", - "What is $\\mathbb{E}\\left[X\\right]$?" - ] - }, - { - "cell_type": "markdown", - "id": "criminal-lighter", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "## Solution 5\n", - "\n", - "$$\n", - "\\begin{aligned}\n", - "\\mathbb{E}\\left[X\\right] & = \\sum_{k=0}^\\infty k \\frac{\\lambda^k}{k!} e^{-\\lambda} = e^{-\\lambda} \\sum_{k=1}^\\infty k \\frac{\\lambda^k}{k!} = e^{-\\lambda} \\sum_{k=1}^\\infty \\frac{\\lambda^k}{(k - 1)!} = \\\\\n", - "& = e^{-\\lambda} \\sum_{k=0}^\\infty \\frac{\\lambda^{k+1}}{k!} = \\lambda e^{-\\lambda} \\sum_{k=0}^\\infty \\frac{\\lambda^{k}}{k!} = \\lambda e^{-\\lambda} e^\\lambda = \\lambda\n", - "\\end{aligned}\n", - "$$" - ] - }, - { - "cell_type": "markdown", - "id": "missing-jurisdiction", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "## Properties of expectation\n", - "\n", - "Consider r.v.s $X$ and $Y$ with finite expectations. Then,\n", - "1. For any constants $a$ and $b$ it holds $\\mathbb{E}\\left[aX + b\\right] = a \\mathbb{E}\\left[X\\right] + b$\n", - "2. $\\mathbb{E}\\left[X + Y\\right] = \\mathbb{E}\\left[X\\right] + \\mathbb{E}\\left[Y\\right]$\n", - "3.
If $X \\leqslant Y$ a.s., then $\\mathbb{E}\\left[X\\right] \\leqslant \\mathbb{E}\\left[Y\\right]$\n", - "4. If $X \\perp Y$, then $\\mathbb{E}\\left[XY\\right] = \\mathbb{E}\\left[X\\right] \\mathbb{E}\\left[Y\\right]$" - ] - }, - { - "cell_type": "markdown", - "id": "blond-gibraltar", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "## Example 6\n", - "\n", - "Consider $X$ with binomial distribution $X \\sim Bi(n, p)$. What is $\\mathbb{E}\\left[X\\right]$?" - ] - }, - { - "cell_type": "markdown", - "id": "worst-mention", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "## Solution 6\n", - "\n", - "- We know that $X = \\sum_{k=1}^n X_k$, where $X_k \\sim Be(p)$\n", - "- We know that $\\mathbb{E}\\left[X_k\\right] = p$\n", - "- Then, $\\mathbb{E}\\left[X\\right] = \\sum_{k=1}^n \\mathbb{E}\\left[X_k\\right] = np$" - ] - }, - { - "cell_type": "markdown", - "id": "occupational-campbell", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "## Expectation of a function of a random variable\n", - "\n", - "Consider $Y = \\varphi(X)$, then its expectation is\n", - "$$\n", - "\\mathbb{E}\\left[Y\\right] = \\int_\\Omega \\varphi(X(\\omega)) d \\mathbb{P}(\\omega) = \\int_{-\\infty}^\\infty \\varphi(x) d\\mu(x)\n", - "$$\n", - "\n", - "If additionally $X$ has density $f_X$ and the following integral is finite\n", - "$$\n", - "\\int_{-\\infty}^\\infty |\\varphi(x)| d F_X(x) < \\infty\n", - "$$\n", - "\n", - "then\n", - "$$\n", - "\\mathbb{E}\\left[Y\\right] = \\int_{-\\infty}^\\infty \\varphi(x) f_X(x) d x\n", - "$$" - ] - }, - { - "cell_type": "markdown", - "id": "caroline-scoop", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "## Variance\n", - "\n", - "We call **variance** the following quantity of a r.v.
$X$ with finite expectation:\n", - "$$\n", - "\\mathbb{V}\\text{ar}(X) = \\mathbb{E}\\left[\\left(X - \\mathbb{E}[X]\\right)^2\\right]\n", - "$$" - ] - }, - { - "cell_type": "markdown", - "id": "governing-ethiopia", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "## Example 7\n", - "\n", - "We flip a non-symmetric coin and $X$ is the r.v. for heads, $X \\sim Be(p)$. What is $\\mathbb{V}\\text{ar}\\left(X\\right)$?" - ] - }, - { - "cell_type": "markdown", - "id": "oriented-nirvana", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "## Solution 7\n", - "\n", - "1. We know the formula\n", - " $$\n", - " \\mathbb{V}\\text{ar}\\left(X\\right) = \\mathbb{E}\\left[\\left(X - \\mathbb{E}\\left[X\\right]\\right)^2\\right]\n", - " $$\n", - "2. We know $\\mathbb{E}\\left[X\\right]$\n", - " $$\n", - " \\mathbb{V}\\text{ar}\\left(X\\right) = \\mathbb{E}\\left[\\left(X - p\\right)^2\\right] = \\mathbb{E}\\left[X^2 - 2 p X + p^2\\right]\n", - " $$\n", - "3. We know that expectation is linear\n", - " $$\n", - " \\mathbb{V}\\text{ar}\\left(X\\right) = \\mathbb{E}\\left[X^2\\right] - 2 p \\mathbb{E}\\left[X\\right] + p^2 = \\mathbb{E}\\left[X^2\\right] - p^2\n", - " $$\n", - "4. For $Y = X^2$ we can compute\n", - " $$\n", - " \\mathbb{E}\\left[Y\\right] = 0 \\cdot \\mathbb{P}(Y = 0) + 1 \\cdot \\mathbb{P}(Y = 1) = \\mathbb{P}(Y = 1) = \\mathbb{P}(X^2 = 1) = \\mathbb{P}(X = 1) = p\n", - " $$\n", - "5. Finally,\n", - " $$\n", - " \\mathbb{V}\\text{ar}\\left(X\\right) = p - p^2 = p (1 - p)\n", - " $$" - ] - }, - { - "cell_type": "markdown", - "id": "israeli-instruction", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "## Properties of variance\n", - "\n", - "1. $\\mathbb{V}\\text{ar}\\left(X\\right) \\geqslant 0$ and $\\mathbb{V}\\text{ar}\\left(X\\right) = 0$ if and only if $X = const$ a.s.\n", - "2. 
It holds\n", - " $$\n", - " \\mathbb{V}\\text{ar}\\left(X\\right) = \\mathbb{E}\\left[X^2\\right] - \\left(\\mathbb{E}\\left[X\\right]\\right)^2\n", - " $$\n", - "3. It holds\n", - " $$\n", - " \\mathbb{V}\\text{ar}\\left(aX + b\\right) = a^2 \\mathbb{V}\\text{ar}\\left(X\\right)\n", - " $$\n", - "4. If $X \\perp Y$, it holds\n", - "$$\n", - "\\mathbb{V}\\text{ar}\\left(X + Y\\right) = \\mathbb{V}\\text{ar}\\left(X\\right) + \\mathbb{V}\\text{ar}\\left(Y\\right)\n", - "$$" - ] - }, - { - "cell_type": "markdown", - "id": "casual-happiness", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "## Example 8\n", - "\n", - "Consider $X$ with binomial distribution $X \\sim Bi(n, p)$. What is $\\mathbb{V}\\text{ar}\\left(X\\right)$?" - ] - }, - { - "cell_type": "markdown", - "id": "caroline-execution", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "## Solution 8\n", - "\n", - "- We know that $X = \\sum_{k=1}^n X_k$, where $X_k \\sim Be(p)$ are independent\n", - "- We know that $\\mathbb{V}\\text{ar}\\left(X_k\\right) = p(1-p)$\n", - "- Then, $\\mathbb{V}\\text{ar}\\left(X\\right) = \\mathbb{V}\\text{ar}\\left(\\sum_{k=1}^n X_k\\right) = \\sum_{k=1}^n \\mathbb{V}\\text{ar}\\left(X_k\\right) = np(1-p)$" - ] - }, - { - "cell_type": "markdown", - "id": "demanding-immigration", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "## Moments of distribution\n", - "\n", - "$\\mathbb{E}\\left[X^k\\right]$ is called $k$-th moment of r.v. $X$.\n", - "\n", - "We say that $k$-th moment is finite if $\\mathbb{E}\\left[|X|^k\\right] < \\infty$.\n", - "\n", - "If $k$-th moment is finite, then all moments of order $m < k$ are finite as well." - ] - }, - { - "cell_type": "markdown", - "id": "unlimited-narrative", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "## Jensen's inequality\n", - "\n", - "Consider r.v.
$X$ with $\\mathbb{E}[|X|] < \\infty$ and convex function $g(\\cdot)$, then\n", - "$$\n", - "\\mathbb{E}\\left[g(X)\\right] \\geqslant g\\left(\\mathbb{E}\\left[X\\right]\\right)\n", - "$$" - ] - }, - { - "cell_type": "markdown", - "id": "entitled-athletics", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "## Example 9\n", - "\n", - "Prove Jensen's inequality for special case of $g(x) = x^2$" - ] - }, - { - "cell_type": "markdown", - "id": "finite-attraction", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "## Cauchy-Schwarz inequality\n", - "\n", - "Consider r.v.s $X$ and $Y$ with $\\mathbb{E}\\left[X^2\\right] < \\infty$ and $\\mathbb{E}\\left[Y^2\\right] < \\infty$, then\n", - "$$\n", - "|\\mathbb{E}\\left[XY\\right]| \\leqslant \\sqrt{\\mathbb{E}\\left[X^2\\right]\\mathbb{E}\\left[Y^2\\right]}\n", - "$$" - ] - }, - { - "cell_type": "markdown", - "id": "active-diamond", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "## Covariance\n", - "\n", - "**Covariance** of two random variables $X$ and $Y$ is defined as\n", - "$$\n", - "\\operatorname{cov}\\left(X, Y\\right) = \\mathbb{E}\\left[\\left(X - \\mathbb{E}\\left[X\\right]\\right)\\left(Y - \\mathbb{E}\\left[Y\\right]\\right)\\right] = \\mathbb{E}\\left[XY\\right] - \\mathbb{E}\\left[X\\right]\\mathbb{E}\\left[Y\\right]\n", - "$$\n", - "\n", - "From Cauchy-Schwarz inequality applied to the centered variables,\n", - "$$\n", - "\\left|\\operatorname{cov}\\left(X, Y\\right)\\right| \\leqslant \\sqrt{\\mathbb{V}\\text{ar}\\left(X\\right)\\mathbb{V}\\text{ar}\\left(Y\\right)}\n", - "$$" - ] - }, - { - "cell_type": "markdown", - "id": "prostate-promise", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "## Correlation\n", - "\n", - "If $X \\perp Y$, $\\operatorname{cov}\\left(X, Y\\right) = 0$. The converse is not true. Regardless, covariance is often used to measure the dependency between random variables.
It is not handy to use, so instead a **correlation coefficient** is proposed:\n", - "$$\n", - "r_{XY} = \\frac{\\operatorname{cov}\\left(X, Y\\right)}{\\sqrt{\\mathbb{V}\\text{ar}\\left(X\\right)\\mathbb{V}\\text{ar}\\left(Y\\right)}}\n", - "$$\n", - "\n", - "Note that $-1 \\leqslant r_{XY} \\leqslant 1$." - ] - } - ], - "metadata": { - "celltoolbar": "Slideshow", - "kernelspec": { - "display_name": "Python 3", - "language": "python", - "name": "python3" - }, - "language_info": { - "codemirror_mode": { - "name": "ipython", - "version": 3 - }, - "file_extension": ".py", - "mimetype": "text/x-python", - "name": "python", - "nbconvert_exporter": "python", - "pygments_lexer": "ipython3", - "version": "3.8.2" - } - }, - "nbformat": 4, - "nbformat_minor": 5 -} diff --git a/Seminar_materials/Seminar06/Seminar 6 (Expectation and Variance).pdf b/Seminar_materials/Seminar06/Seminar 6 (Expectation and Variance).pdf deleted file mode 100644 index 8cb9e27..0000000 Binary files a/Seminar_materials/Seminar06/Seminar 6 (Expectation and Variance).pdf and /dev/null differ diff --git a/Seminar_materials/Seminar09/Seminar 9 (Random vector).ipynb b/Seminar_materials/Seminar09/Seminar 9 (Random vector).ipynb deleted file mode 100644 index 935448d..0000000 --- a/Seminar_materials/Seminar09/Seminar 9 (Random vector).ipynb +++ /dev/null @@ -1,503 +0,0 @@ -{ - "cells": [ - { - "cell_type": "markdown", - "id": "69835e6e", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "# Moment-generating function" - ] - }, - { - "cell_type": "markdown", - "id": "9a915803", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "## Moment-generating function: definition\n", - "\n", - "**Moment-generating function** of r.v. $X$ is\n", - "$$\n", - "M_X(t) = \\mathbb{E}\\left[e^{tX}\\right]\n", - "$$\n", - "\n", - "It does not always exist. 
If it exists and is finite:\n", - "- It uniquely defines the distribution of $X$\n", - "- $M_X(t) > 0, \\forall t$ and $M_X(0) = 1$\n", - "- $M_{aX+b}(t) = e^{bt} M_X(at)$\n", - "- For all $k$ the $k$-th moment of $X$ is finite and equals $\\mathbb{E}[X^k] = M^{(k)}_X(0)$, the $k$-th derivative of $M_X$ at zero\n", - "\n", - "The purpose of MGF is to replace computation of expectation with differentiation." - ] - }, - { - "cell_type": "markdown", - "id": "d769cc68", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "## Example 1: Bernoulli MGF\n", - "\n", - "Consider $X \\sim Be(p)$. What is $M_X(t)$? Find expectation and variance using MGF." - ] - }, - { - "cell_type": "markdown", - "id": "4533f0b3", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "## Solution 1\n", - "\n", - "MGF:\n", - "$$\n", - "M_X(t) = \\mathbb{E}\\left[e^{tX}\\right] = e^{t \\cdot 0} \\cdot \\mathbb{P}(X = 0) + e^{t \\cdot 1} \\cdot \\mathbb{P}(X = 1) = q + pe^t, \\quad q = 1 - p\n", - "$$\n", - "\n", - "First and second derivatives are $pe^t$, so\n", - "$$\n", - "\\mathbb{E}X = M'_X(0) = pe^0 = p = M''_X(0) = \\mathbb{E}\\left[X^2\\right]\n", - "$$\n", - "$$\n", - "\\mathbb{V}\\text{ar}(X) = M''_X(0) - \\left(M'_X(0)\\right)^2 = p - p^2 = p(1-p)\n", - "$$" - ] - }, - { - "cell_type": "markdown", - "id": "2be679cc", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "## Example 2: Poisson MGF\n", - "\n", - "Consider $X \\sim Pois(\\lambda)$. What is $M_X(t)$? Find expectation and variance using MGF."
- ] - }, - { - "cell_type": "markdown", - "id": "d5f9da4b", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "## Solution 2\n", - "\n", - "MGF:\n", - "$$\n", - "M_X(t) = \\mathbb{E}\\left[e^{tX}\\right] = \\sum\\limits_{k=0}^\\infty e^{tk} \\frac{\\lambda^k}{k!} e^{-\\lambda} = e^{-\\lambda} \\sum\\limits_{k=0}^\\infty \\frac{1}{k!} \\left( \\lambda e^{t}\\right)^k = \\exp \\left( \\lambda \\left( e^t - 1 \\right) \\right)\n", - "$$\n", - "\n", - "First derivative:\n", - "$$\n", - "M'_X(t) = \\lambda e^t \\exp \\left( \\lambda \\left( e^t - 1 \\right) \\right)\n", - "$$\n", - "\n", - "Expectation $M'_X(0) = \\lambda$. Second derivative:\n", - "$$\n", - "M''_X(t) = \\lambda e^t \\exp \\left( \\lambda \\left( e^t - 1 \\right) \\right) + \\left( \\lambda e^t \\right)^2 \\exp \\left( \\lambda \\left( e^t - 1 \\right) \\right)\n", - "$$\n", - "\n", - "Second moment $M''_X(0) = \\lambda + \\lambda^2$. Variance $\\mathbb{V}\\text{ar}(X) = \\lambda + \\lambda^2 - \\lambda^2 = \\lambda$." - ] - }, - { - "cell_type": "markdown", - "id": "bc530156", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "## Example 3: Gaussian MGF\n", - "\n", - "Consider $X \\sim \\mathcal{N}(\\mu, \\sigma^2)$. What is $M_X(t)$? Find expectation and variance using MGF."
- ] - }, - { - "cell_type": "markdown", - "id": "78ce7a2c", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "## Solution 3\n", - "\n", - "First let's find for $Y \\sim \\mathcal{N}(0, 1)$, then apply properties.\n", - "$$\n", - "\\begin{aligned}\n", - "M_Y(t) & = \\mathbb{E}\\left[e^{tY}\\right] = \\frac{1}{\\sqrt{2\\pi}} \\int\\limits_{-\\infty}^\\infty e^{tx} e^{-x^2 / 2} dx = \\frac{1}{\\sqrt{2\\pi}} \\int\\limits_{-\\infty}^\\infty \\exp\\left( -\\frac{x^2 - 2tx}{2}\\right) dx = \\\\\n", - "& = \\frac{1}{\\sqrt{2\\pi}} \\int\\limits_{-\\infty}^\\infty \\exp\\left( -\\frac{(x - t)^2 - t^2}{2}\\right) dx = \\\\\n", - "& = \\exp\\left( \\frac{t^2}{2} \\right) \\frac{1}{\\sqrt{2\\pi}} \\int\\limits_{-\\infty}^\\infty \\exp\\left( -\\frac{(x - t)^2}{2}\\right) dx = \\exp\\left( \\frac{t^2}{2} \\right)\n", - "\\end{aligned}\n", - "$$" - ] - }, - { - "cell_type": "markdown", - "id": "09571b25", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "## Solution 3 (continued)\n", - "\n", - "From properties, $M_X(t) = e^{\\mu t} M_Y(\\sigma t) = \\exp \\left( \\mu t + \\frac{t^2 \\sigma^2}{2} \\right)$. First derivative:\n", - "$$\n", - "M'_X(t) = \\left( \\mu + t \\sigma^2 \\right) \\exp \\left( \\mu t + \\frac{t^2 \\sigma^2}{2} \\right)\n", - "$$\n", - "\n", - "Second derivative:\n", - "$$\n", - "M''_X(t) = \\sigma^2 \\exp \\left( \\mu t + \\frac{t^2 \\sigma^2}{2} \\right) + \\left( \\mu + t \\sigma^2 \\right)^2 \\exp \\left( \\mu t + \\frac{t^2 \\sigma^2}{2} \\right)\n", - "$$\n", - "\n", - "Expectation: $M'_X(0) = \\mu$, variance $M''_X(0) - \\left( M'_X(0) \\right)^2 = \\sigma^2$." 
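, - "\n", - "\n", - "As a quick check of the last step (a sketch that only plugs $t = 0$ into the two derivatives above):\n", - "$$\n", - "M'_X(0) = \\mu, \\quad M''_X(0) = \\sigma^2 + \\mu^2, \\quad \\mathbb{V}\\text{ar}(X) = \\sigma^2 + \\mu^2 - \\mu^2 = \\sigma^2\n", - "$$"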
- ] - }, - { - "cell_type": "markdown", - "id": "fdf80980", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "# Random vector" - ] - }, - { - "cell_type": "markdown", - "id": "5464a7fa", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "## Random vector: definition\n", - "Consider a probability space $(\\Omega, \\mathcal{F}, \\mathbb{P})$. Then, a **random vector** is a measurable function\n", - "$$\n", - "\\mathbf{X}: \\Omega \\to \\mathbb{R}^n,\n", - "$$\n", - "where $\\mathbf{X} = (X_1, \\ldots, X_n)^\\top$. Every component $X_i$ of the vector is a random variable. The converse is also true: for any r.v.s $X_1, \\ldots, X_n$ the vector $(X_1, \\ldots, X_n)^\\top$ is a random vector." - ] - }, - { - "cell_type": "markdown", - "id": "8febda80", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "## Random vector: distribution\n", - "\n", - "The distribution of a random vector $\\mathbf{X} = (X_1, \\ldots, X_n)^\\top$ can be described via the **multivariate (joint) cumulative distribution function**:\n", - "$$\n", - "F_{\\mathbf{X}}(\\mathbf{x}) = \\mathbb{P}(X_1 < x_1, X_2 < x_2, \\ldots, X_n < x_n)\n", - "$$\n", - "\n", - "Properties of the multivariate CDF:\n", - "- $\\lim_{x_i \\to -\\infty} F_{\\mathbf{X}}(\\mathbf{x}) = 0$ for any single $i$, but $\\lim_{x_1, \\ldots, x_n \\to \\infty} F_{\\mathbf{X}}(\\mathbf{x}) = 1$ only when all components tend to infinity\n", - "- $\\lim_{x_i \\to \\infty} F_{\\mathbf{X}}(\\mathbf{x}) = $ the joint CDF of all components except $X_i$\n", - "- $F_{\\mathbf{X}}(\\mathbf{x})$ is non-decreasing and left-continuous in every component\n", - "- Supermodularity: $F_{\\mathbf{X}}(x_1, \\ldots, x_i, \\ldots, x_n) - F_{\\mathbf{X}}(x_1, \\ldots, x_i - \\varepsilon, \\ldots, x_n) \\geqslant 0$ for any $\\varepsilon > 0$" - ] - }, - { - "cell_type": "markdown", - "id": "3bfe0fe2", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "## Random vector: distribution\n", - "\n", - "If $X$ has continuous distribution, 
then there exists a **multivariate (joint) probability density function**, i.e. a non-negative function $f_{\\mathbf{X}}(\\cdot)$ such that\n", - "$$\n", - " \\mathbb{P}(\\mathbf{X} \\in B) = \\int_B f_{\\mathbf{X}}(x_1, \\ldots, x_n) dx_1 \\ldots dx_n\n", - "$$\n", - "\n", - "The PDF can also be found from the CDF:\n", - "$$\n", - " f_{\\mathbf{X}}(\\mathbf{x}) = \\frac{\\partial^n F_{\\mathbf{X}}(\\mathbf{x})}{\\partial x_1 \\ldots \\partial x_n}\n", - "$$" - ] - }, - { - "cell_type": "markdown", - "id": "5bb2e786", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "## Random vector: independence\n", - "\n", - "If all r.v.s $X_i$ are independent, then\n", - "$$\n", - "\\begin{cases}\n", - "F_{\\mathbf{X}}(\\mathbf{x}) & = \\prod\\limits_{i=1}^n F_{X_i}(x_i), \\\\\n", - "f_{\\mathbf{X}}(\\mathbf{x}) & = \\prod\\limits_{i=1}^n f_{X_i}(x_i)\n", - "\\end{cases}\n", - "$$" - ] - }, - { - "cell_type": "markdown", - "id": "49f0c419", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "## Random vector: moments\n", - "\n", - "The **mathematical expectation** of a random vector is the vector of mathematical expectations of its components:\n", - "$$\n", - "\\mathbb{E}\\left[\\mathbf{X}\\right] = (\\mathbb{E}X_1, \\ldots, \\mathbb{E}X_n)^\\top\n", - "$$\n", - "\n", - "Second moments of a random vector are described by the **covariance matrix** $\\mathbb{V}\\text{ar}(\\mathbf{X}) = \\Sigma$, where\n", - "$$\n", - "\\Sigma_{ij} = \\operatorname{cov}(X_i, X_j)\n", - "$$" - ] - }, - { - "cell_type": "markdown", - "id": "fc4059d8", - "metadata": { - "slideshow": { - "slide_type": "fragment" - } - }, - "source": [ - "$$\n", - "\\Sigma_{ij} = \\operatorname{cov}(X_i, X_j) = \\mathbb{E} \\left[ (X_i - \\mathbb{E} X_i) (X_j - \\mathbb{E} X_j) \\right]\n", - "$$\n", - "\n", - "In particular, the diagonal elements are variances: $\\Sigma_{ii} = \\mathbb{V}\\text{ar}(X_i)$."
- ] - }, - { - "cell_type": "markdown", - "id": "4c74f30b", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "## Random vector: covariance matrix\n", - "\n", - "In matrix notation, the covariance matrix is $\\mathbb{V}\\text{ar}(\\mathbf{X}) = \\mathbb{E}\\left[(\\mathbf{X} - \\mathbb{E}[\\mathbf{X}]) (\\mathbf{X} - \\mathbb{E}[\\mathbf{X}])^\\top\\right]$.\n", - "\n", - "Properties of the covariance matrix:\n", - "- Symmetry: $\\Sigma^\\top = \\Sigma$\n", - "- Positive semi-definiteness: $a^\\top \\Sigma a \\geqslant 0, \\forall a$" - ] - }, - { - "cell_type": "markdown", - "id": "b6cd49a9", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "## Random vector: marginal and conditional distributions\n", - "\n", - "**Marginal distribution** is the distribution of a subset of the components of a random vector. For example, consider a random vector $\\mathbf{X} \\in \\mathbb{R}^n$ and let's view it as two vectors, $\\mathbf{Y} \\in \\mathbb{R}^k$ and $\\mathbf{Z} \\in \\mathbb{R}^{n-k}$, stacked: $\\mathbf{X} = (\\mathbf{Y}^\\top, \\mathbf{Z}^\\top)^\\top$. 
The marginal density of $\\mathbf{Z}$ is then:\n", - "$$\n", - "f_{\\mathbf{Z}}(\\mathbf{z}) = \\int_{\\mathbb{R}^k} f_{\\mathbf{X}}(\\mathbf{y}, \\mathbf{z}) d \\mathbf{y}\n", - "$$\n", - "\n", - "In words, we take the distribution of $\\mathbf{X}$ and **integrate out** everything not related to $\\mathbf{Z}$.\n", - "\n", - "We may also define the **conditional distribution**:\n", - "$$\n", - "f_{\\mathbf{Y}|\\mathbf{Z}=\\mathbf{z}}(\\mathbf{y}) = \\frac{f_{\\mathbf{X}}(\\mathbf{y}, \\mathbf{z})}{f_{\\mathbf{Z}}(\\mathbf{z})}\n", - "$$" - ] - }, - { - "cell_type": "markdown", - "id": "cc6d7afe", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "## Example 4: joint, marginal and conditional distributions for discrete case\n", - "\n", - "Let $X$ be the indicator of the sampled individual being a current smoker, and let $Y$ be the indicator of his developing lung cancer at some point in his life. Suppose the joint PMF is as follows:\n", - "\n", - "||$Y=1$|$Y=0$|\n", - "|--|--|--|\n", - "|$X=1$|$\\frac{5}{100}$|$\\frac{20}{100}$|\n", - "|$X=0$|$\\frac{3}{100}$|$\\frac{72}{100}$|\n", - "\n", - "Find the marginal and conditional distributions."
- ] - }, - { - "cell_type": "markdown", - "id": "aaf1ccef", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "## Solution 4\n", - "\n", - "||$Y=1$|$Y=0$|\n", - "|--|--|--|\n", - "|$X=1$|$\\frac{5}{100}$|$\\frac{20}{100}$|\n", - "|$X=0$|$\\frac{3}{100}$|$\\frac{72}{100}$|\n", - "\n", - "$$\n", - "\\mathbb{P}(X = x) = \\sum_y \\mathbb{P}(X = x, Y = y)\n", - "$$" - ] - }, - { - "cell_type": "markdown", - "id": "6eb91023", - "metadata": { - "slideshow": { - "slide_type": "fragment" - } - }, - "source": [ - "||$Y=1$|$Y=0$|$\\text{Sum}$|\n", - "|--|--|--|--|\n", - "|$X=1$|$\\frac{5}{100}$|$\\frac{20}{100}$|$\\frac{25}{100}$|\n", - "|$X=0$|$\\frac{3}{100}$|$\\frac{72}{100}$|$\\frac{75}{100}$|\n", - "|$\\text{Sum}$|$\\frac{8}{100}$|$\\frac{92}{100}$|$\\frac{100}{100}$|" - ] - }, - { - "cell_type": "markdown", - "id": "1bf7694e", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "## Solution 4\n", - "\n", - "||$Y=1$|$Y=0$|\n", - "|--|--|--|\n", - "|$X=1$|$\\frac{5}{100}$|$\\frac{20}{100}$|\n", - "|$X=0$|$\\frac{3}{100}$|$\\frac{72}{100}$|\n", - "\n", - "$$\n", - "\\mathbb{P}(Y = y | X = x) = \\frac{\\mathbb{P}(X = x, Y = y)}{\\mathbb{P}(X = x)}\n", - "$$\n", - "\n", - "Example: if the person is a smoker ($X = 1$), then $\\mathbb{P}(Y = 1 | X = 1) = \\frac{\\mathbb{P}(X = 1, Y = 1)}{\\mathbb{P}(X = 1)} = \\frac{5/100}{25/100} = 0.2$." - ] - }, - { - "cell_type": "markdown", - "id": "e044a510", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "## Example 5 (unit disc)\n", - "\n", - "Consider a random point on unit disc with random coordinates $(X, Y)$. What is the joint, marginal and conditional PDF for the coordinates?" 
- ] - }, - { - "cell_type": "markdown", - "id": "5f87daba", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "## Solution 5\n", - "\n", - "The joint is\n", - "$$\n", - "f_{X, Y}(x, y) = \\begin{cases}\n", - "\\frac1\\pi, \\text{ if } x^2 + y^2 \\leqslant 1, \\\\\n", - "0, \\text{ else}\n", - "\\end{cases}\n", - "$$\n", - "\n", - "The marginal for $X$ is, for $-1 \\leqslant x \\leqslant 1$:\n", - "$$\n", - "f_X(x) = \\int\\limits_{-\\infty}^\\infty f_{X, Y}(x, y) dy = \\int\\limits_{-\\sqrt{1 - x^2}}^{\\sqrt{1-x^2}} \\frac{1}{\\pi} dy = \\frac{2}{\\pi} \\sqrt{1 - x^2}\n", - "$$\n", - "\n", - "The conditional for $Y$ is, for $|y| \\leqslant \\sqrt{1 - x^2}$:\n", - "$$\n", - "f_{Y|X=x}(y) = \\frac{f_{X, Y}(x, y)}{f_{X}(x)} = \\frac{\\frac{1}{\\pi}}{\\frac{2}{\\pi}\\sqrt{1 - x^2}} = \\frac{1}{2\\sqrt{1 - x^2}}\n", - "$$" - ] - } - ], - "metadata": { - "celltoolbar": "Слайд-шоу", - "kernelspec": { - "display_name": "Python 3 (ipykernel)", - "language": "python", - "name": "python3" - }, - "language_info": { - "codemirror_mode": { - "name": "ipython", - "version": 3 - }, - "file_extension": ".py", - "mimetype": "text/x-python", - "name": "python", - "nbconvert_exporter": "python", - "pygments_lexer": "ipython3", - "version": "3.9.7" - } - }, - "nbformat": 4, - "nbformat_minor": 5 -} diff --git a/Seminar_materials/Seminar09/Seminar 9 (Random vector).pdf b/Seminar_materials/Seminar09/Seminar 9 (Random vector).pdf deleted file mode 100644 index aac0a04..0000000 Binary files a/Seminar_materials/Seminar09/Seminar 9 (Random vector).pdf and /dev/null differ diff --git a/Seminar_materials/Seminar10/Seminar 10 (Transformations).ipynb b/Seminar_materials/Seminar10/Seminar 10 (Transformations).ipynb deleted file mode 100644 index 75620ee..0000000 --- a/Seminar_materials/Seminar10/Seminar 10 (Transformations).ipynb +++ /dev/null @@ -1,606 +0,0 @@ -{ - "cells": [ - { - "cell_type": "markdown", - "id": "6f9a4edb", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "# Transformations" - ] - }, - 
{ - "cell_type": "markdown", - "id": "b64f0689", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "## Examples of transformations\n", - "\n", - "- Linear transformations of random variables and vectors $Y = aX + b$\n", - "- Non-linear invertible transformations of random variables $Y = g(X)$\n", - "- Sums $Y = X_1 + X_2$" - ] - }, - { - "cell_type": "markdown", - "id": "c0aba036", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "## Transformations previously\n", - "\n", - "We have a technique for computing the expectation of transformed random variable. What is its name?" - ] - }, - { - "cell_type": "markdown", - "id": "b02823f4", - "metadata": { - "slideshow": { - "slide_type": "fragment" - } - }, - "source": [ - "LOTUS:\n", - "$$\n", - "\\mathbb{E}[g(X)] = \\sum_{x} g(x) \\mathbb{P}(X = x)\n", - "$$\n", - "It works with continuous r.v.s:\n", - "$$\n", - "\\mathbb{E}[g(X)] = \\int g(x) f_X(x) dx\n", - "$$\n", - "And it works in multiple dimensions:\n", - "$$\n", - "\\mathbb{E}[g(X, Y)] = \\sum_{x} \\sum_{y} g(x, y) \\mathbb{P}(X = x, Y = y)\n", - "$$" - ] - }, - { - "cell_type": "markdown", - "id": "5b8ec38b", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "## Transformations now\n", - "\n", - "Now we want not just the expected value, but the whole distribution (CDF, PMF, PDF). The approach depends on whether the distribution is discrete or continuous.\n", - "\n", - "## Discrete case\n", - "\n", - "The formula:\n", - "$$\n", - "\\mathbb{P}(g(X) = y) = \\sum\\limits_{x \\text{ such that } g(x) = y} \\mathbb{P}(X = x)\n", - "$$\n", - "\n", - "If $g(\\cdot)$ is one-to-one, it simplifies to:\n", - "$$\n", - "\\mathbb{P}(g(X) = y) = \\mathbb{P}(X = g^{-1}(y))\n", - "$$" - ] - }, - { - "cell_type": "markdown", - "id": "300d0052", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "## Example 1\n", - "\n", - "Let $X \\sim Bin(n, p)$. 
Find the PMF of $Y = \\exp(X)$." - ] - }, - { - "cell_type": "markdown", - "id": "37d0ef6a", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "## Solution 1\n", - "\n", - "So $g(x) = \\exp(x)$, it's one-to-one and the inverse is $g^{-1}(y) = \\log y$.\n", - "$$\n", - "\\mathbb{P}(Y = y) = \\mathbb{P}(X = g^{-1}(y)) = \\mathbb{P}(X = \\log y), \\quad y \\in \\{e^0, e^1, \\ldots, e^n\\}\n", - "$$" - ] - }, - { - "cell_type": "markdown", - "id": "f1a5036d", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "## Continuous case\n", - "\n", - "In the continuous case, when additionally $g(\\cdot)$ is one-to-one, continuous and strictly increasing, we have the following relation for CDF:\n", - "$$\n", - "F_{g(X)}(y) = \\mathbb{P}(g(X) \\leqslant y) = \\mathbb{P}(X \\leqslant g^{-1}(y)) = F_X(g^{-1}(y))\n", - "$$\n", - "\n", - "To get the PDF, we need to differentiate this relation at every point with the chain rule:\n", - "$$\n", - "f_{g(X)}(g(x)) d \\left( g(x) \\right) = f_X(x) dx\n", - "$$\n", - "\n", - "To account for the case when $g(\\cdot)$ is strictly decreasing, we add the modulus\n", - "$$\n", - "f_{g(X)}(g(x)) = f_X(x) \\left| \\frac{d g(x)}{dx} \\right|^{-1}\n", - "$$\n", - "\n", - "This is called the 1D change-of-variables formula. Similarly to the discrete case, we can extend it to a $g(\\cdot)$ that is not one-to-one by summing over all $x$ such that $g(x) = y$." - ] - }, - { - "cell_type": "markdown", - "id": "1e348fcf", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "## Example 2\n", - "\n", - "Let $X \\sim Exp(1)$. Find the PDF of $Y = \\exp(- X)$." - ] - }, - { - "cell_type": "markdown", - "id": "37ae1360", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "## Solution 2\n", - "\n", - "So $g(x) = \\exp(- x)$, it is one-to-one, and $g^{-1}(y) = - \\log y$. 
Let's find\n", - "$$\n", - "\\frac{dg(x)}{dx} = - \\exp(- x)\n", - "$$\n", - "\n", - "So, for $0 < y < 1$,\n", - "$$\n", - "\\begin{aligned}\n", - "f_Y(y) & = f_X(x) \\left| \\frac{d g(x)}{dx} \\right|^{-1} = f_X(- \\log y) \\exp (x) = \\\\\n", - "& = f_X(- \\log y) \\exp (- \\log y) = \\frac1y f_X(- \\log y) = \\frac1y \\cdot y = 1\n", - "\\end{aligned}\n", - "$$" - ] - }, - { - "cell_type": "markdown", - "id": "5b8d9aaa", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "## Example 3\n", - "\n", - "Let $X \\sim \\mathcal{N}(0, 1)$. Find the PDF of $Y = X^2$. This distribution is called the chi-square distribution (with one degree of freedom)." - ] - }, - { - "cell_type": "markdown", - "id": "22365b9b", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "## Solution 3\n", - "\n", - "So, $g(x) = x^2$. It is not one-to-one, so we need the sum. For $y > 0$:\n", - "$$\n", - "\\begin{aligned}\n", - "f_Y(y) & = \\sum_{x \\in \\{-\\sqrt{y}, \\sqrt{y}\\}} f_X(x) \\left| \\frac{d g(x)}{dx} \\right|^{-1} = \\sum_{x \\in \\{-\\sqrt{y}, \\sqrt{y}\\}} f_X(x) \\frac{1}{2 |x|} = \\\\& = f_X(\\sqrt{y}) \\frac{1}{2 \\sqrt{y}} + f_X(- \\sqrt{y}) \\frac{1}{2 \\sqrt{y}} = \\\\\n", - "& = 2 f_X(\\sqrt{y}) \\frac{1}{2 \\sqrt{y}} = \\frac{1}{\\sqrt{2 \\pi y}} e^{-y/2}\n", - "\\end{aligned}\n", - "$$" - ] - }, - { - "cell_type": "markdown", - "id": "2efaa28e", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "## Example 4\n", - "\n", - "Let $X$ be a random variable with PDF $f_X$. Find the PDF of $Y = a X + b$, where $a \\neq 0$." - ] - }, - { - "cell_type": "markdown", - "id": "b2a1da9b", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "## Solution 4\n", - "\n", - "So, $g(x) = ax+b$, it is one-to-one, and the inverse is $g^{-1}(y) = \\frac1a (y - b)$. 
We need to calculate\n", - "$$\n", - "\\frac{dg(x)}{dx} = \\frac{d(ax + b)}{dx} = a\n", - "$$\n", - "\n", - "$$\n", - "f_Y(y) = f_X(x) \\left| \\frac{d g(x)}{dx} \\right|^{-1} = f_X\\left(\\frac{y - b}{a}\\right) \\frac{1}{|a|}\n", - "$$" - ] - }, - { - "cell_type": "markdown", - "id": "1d29d8da", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "## Multivariate transformations\n", - "\n", - "Consider an $n$-dimensional random vector $X \\in \\mathbb{R}^n$ with continuous distribution with PDF $f_X$. Let $g: A_0 \\to B_0$ be an invertible (one-to-one) function from an open subset $A_0$ containing the support of $X$ to an open subset $B_0$ containing the range of $g(\\cdot)$. Denote $Y = g(X)$ and $y = g(x)$. Suppose that all partial derivatives $\\frac{\\partial y_i}{\\partial x_j}$ exist and are continuous. Then, we can form the Jacobian matrix:\n", - "$$\n", - "\\frac{\\partial y}{\\partial x} = \\left(\\begin{array}{cccc}\n", - "\\frac{\\partial y_1}{\\partial x_1} & \\frac{\\partial y_1}{\\partial x_2} & \\ldots & \\frac{\\partial y_1}{\\partial x_n} \\\\\n", - "\\vdots & & \\ddots & \\vdots \\\\\n", - "\\frac{\\partial y_n}{\\partial x_1} & \\frac{\\partial y_n}{\\partial x_2} & \\ldots & \\frac{\\partial y_n}{\\partial x_n}\n", - "\\end{array}\\right)\n", - "$$" - ] - }, - { - "cell_type": "markdown", - "id": "febb4184", - "metadata": { - "slideshow": { - "slide_type": "subslide" - } - }, - "source": [ - "Assume that this matrix is non-degenerate. Then, the PDF of $Y$ is:\n", - "$$\n", - "f_Y(y) = f_X(g^{-1}(y)) \\left|\\det \\frac{\\partial y}{\\partial x}\\right|^{-1}\n", - "$$\n", - "\n", - "If the function is not one-to-one, we add a sum over all $x$ such that $g(x) = y$."
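, - "\n", - "\n", - "Written out, the non one-to-one case can be sketched as follows (assuming, for illustration, that each $y$ has at most countably many preimages and that the Jacobian is non-degenerate at each of them):\n", - "$$\n", - "f_Y(y) = \\sum\\limits_{x \\text{ such that } g(x) = y} f_X(x) \\left|\\det \\frac{\\partial y}{\\partial x}(x)\\right|^{-1}\n", - "$$"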
- ] - }, - { - "cell_type": "markdown", - "id": "aa1ba0b8", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "## Example 5\n", - "\n", - "Consider an $n$-dimensional random vector $X \\in \\mathbb{R}^n$ with continuous distribution, a non-degenerate (hence square) matrix $A \\in \\mathbb{R}^{n \\times n}$ and a vector $b \\in \\mathbb{R}^n$, then define the random vector $Y = AX + b$. What are the expected value, the covariance matrix and the PDF of $Y$?" - ] - }, - { - "cell_type": "markdown", - "id": "efca0cf2", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "## Solution 5\n", - "\n", - "Its expected value will be $\\mathbb{E}[Y] = A \\mathbb{E} [X] + b$ and its covariance matrix will be:\n", - "$$\n", - "\\Sigma_Y = \\mathbb{E}\\left[(AX - A \\mathbb{E}[X]) (AX - A \\mathbb{E}[X])^\\top\\right] = A \\mathbb{E}\\left[(X - \\mathbb{E}[X]) (X - \\mathbb{E}[X])^\\top\\right] A^\\top = A \\Sigma_X A^\\top\n", - "$$\n", - "\n", - "So,\n", - "- $g(x) = Ax + b$\n", - "- $\\frac{\\partial y}{\\partial x} = A$\n", - "- $g^{-1}(y) = A^{-1} (y - b)$\n", - "\n", - "Therefore,\n", - "$$\n", - "f_Y(y) = f_X(g^{-1}(y)) \\left|\\det \\frac{\\partial y}{\\partial x}\\right|^{-1} = f_X(A^{-1}(y - b)) \\frac{1}{|\\det A|} \n", - "$$" - ] - }, - { - "cell_type": "markdown", - "id": "8f5e55a6", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "## Example 6\n", - "\n", - "Consider $n$ independent standard normal r.v.s $X_1, \\ldots, X_n \\sim \\mathcal{N}(0, 1)$ and vector $X = (X_1, \\ldots, X_n)^\\top$. We know that\n", - "- $\\mathbb{E}[X] = 0$\n", - "- $\\Sigma_X = I$\n", - "- $f_X(x) = (2 \\pi)^{- n/2} \\exp(- \\frac12 x^\\top x)$\n", - "\n", - "Consider a non-degenerate matrix $A \\in \\mathbb{R}^{n\\times n}$ and $Y = AX + m$. Find its expectation, covariance matrix and distribution."
- ] - }, - { - "cell_type": "markdown", - "id": "24b90512", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "## Solution 6\n", - "\n", - "From the previous example, $\\mathbb{E}[Y] = m$ and $\\Sigma_Y = AA^\\top$, so $|\\det A| = \\sqrt{\\det \\Sigma_Y}$.\n", - "\n", - "$$\n", - "\\begin{aligned}\n", - "f_Y(y) & = f_X(A^{-1}(y - m)) \\frac{1}{|\\det A|} = \\\\\n", - "& = \\frac{1}{(2 \\pi)^{n/2} |\\det A|} \\exp \\left( - \\frac{(y - m)^\\top A^{-\\top} A^{-1} (y - m)}{2} \\right) = \\\\\n", - "& = \\frac{1}{(2 \\pi)^{n/2} \\sqrt{\\det \\Sigma_Y}} \\exp \\left( - \\frac{(y - m)^\\top \\Sigma_Y^{-1} (y - m)}{2} \\right)\n", - "\\end{aligned}\n", - "$$\n", - "\n", - "What we just did is obtain a new normal random vector with controllable parameters from a standard normal random vector. This is a very useful property of normal random vectors, and it also demonstrates the power of linear transforms." - ] - }, - { - "cell_type": "markdown", - "id": "5c3cb2d1", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "## Example 7\n", - "\n", - "Let random vector $(X, Y)$ have PDF $f_{X, Y}(x, y)$. Find the density of $Z = X + Y$." - ] - }, - { - "cell_type": "markdown", - "id": "a99217cb", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "## Solution 7\n", - "\n", - "In order to find this density, consider the transform\n", - "$$\n", - "\\begin{pmatrix}Z\\\\Y\\end{pmatrix} = A \\begin{pmatrix}X\\\\Y\\end{pmatrix}\n", - "$$\n", - "\n", - "So, matrix $A$ is...?"
- ] - }, - { - "cell_type": "markdown", - "id": "800ad1a5", - "metadata": { - "slideshow": { - "slide_type": "fragment" - } - }, - "source": [ - "$$\n", - "A = \\begin{pmatrix}1&1\\\\0&1\\end{pmatrix}\n", - "$$" - ] - }, - { - "cell_type": "markdown", - "id": "3d8a4781", - "metadata": { - "slideshow": { - "slide_type": "fragment" - } - }, - "source": [ - "Then, since $\\det A = 1$ and $A^{-1}(z \\; y)^\\top = (z - y \\; y)^\\top$,\n", - "$$\n", - "f_{Z,Y}(z,y) = f_{X,Y}\\left(A^{-1}(z \\; y)^\\top\\right) \\left|\\det A\\right|^{-1} = f_{X,Y}(z-y,y)\n", - "$$" - ] - }, - { - "cell_type": "markdown", - "id": "20094347", - "metadata": { - "slideshow": { - "slide_type": "subslide" - } - }, - "source": [ - "Finally,\n", - "$$\n", - "f_Z(z) = \\int f_{Z,Y}(z,y) dy = \\int f_{X,Y}(z-y,y) dy\n", - "$$\n", - "\n", - "If $X \\perp Y$,\n", - "$$\n", - "f_Z(z) = \\int f_{X}(z-y) f_Y(y) dy\n", - "$$\n", - "\n", - "This is the convolution rule that we studied in Seminar 4." - ] - }, - { - "cell_type": "markdown", - "id": "af673e63", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "## Convolution rule\n", - "\n", - "Consider independent r.v.s $X$ and $Y$. Then, their sum $Z = X + Y$ is distributed as follows:\n", - "- If they are discrete,\n", - " $$\n", - " \\mathbb{P}(Z = n) = \\sum_{k=-\\infty}^\\infty \\mathbb{P}(X = k) \\mathbb{P}(Y = n-k)\n", - " $$\n", - "- If they are continuous,\n", - " $$\n", - " f_Z(z) = \\int_{-\\infty}^\\infty f_X(z-y) f_Y(y) dy\n", - " $$" - ] - }, - { - "cell_type": "markdown", - "id": "1ba91eff", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "## Example 8\n", - "\n", - "Let $X \\sim Bi(n, p)$ and $Y \\sim Bi(m, p)$ be independent. What is the distribution of $Z = X + Y$?"
- ] - }, - { - "cell_type": "markdown", - "id": "ef82a7ba", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "## Solution 8\n", - "\n", - "$$\n", - "\\begin{aligned}\n", - "\\mathbb{P}(X + Y = k) & = \\sum_j \\begin{pmatrix}n\\\\j\\end{pmatrix} p^j (1-p)^{n-j} \\begin{pmatrix}m\\\\k-j\\end{pmatrix} p^{k-j} (1-p)^{m-k+j} = \\\\\n", - "& = p^{k} (1-p)^{n+m-k} \\sum_j \\begin{pmatrix}n\\\\j\\end{pmatrix} \\begin{pmatrix}m\\\\k-j\\end{pmatrix} = \\\\\n", - "& =\\begin{pmatrix}n+m\\\\k\\end{pmatrix} p^{k} (1-p)^{n+m-k}\n", - "\\end{aligned}\n", - "$$\n", - "\n", - "$$\n", - "Z \\sim Bi(n+m, p)\n", - "$$" - ] - }, - { - "cell_type": "markdown", - "id": "7997e355", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "## Example 9\n", - "\n", - "Let $X, Y \\sim Exp(1)$ be independent. Find the PDF of $Z = \\frac{X}{X+Y}$." - ] - }, - { - "cell_type": "markdown", - "id": "d85532bf", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "## Solution 9\n", - "\n", - "This one will be a non-linear transform. Consider the transform $(X, Y) \\to (Z, U)$, where $U = X + Y$. The inverse transform is $X = UZ, Y = U - UZ$. The Jacobian of the inverse transform is\n", - "$$\n", - "\\frac{\\partial (x, y)}{\\partial (z, u)} = \\begin{pmatrix}u&z\\\\-u&1-z\\end{pmatrix}\n", - "$$\n", - "\n", - "Its determinant is $u(1 - z) + uz = u$, so the joint density is\n", - "$$\n", - " f_{Z, U}(z, u) = f_{X, Y}(g^{-1}(z, u)) \\left|\\det \\frac{\\partial (x, y)}{\\partial (z, u)}\\right| = f_{X, Y}(uz, u-uz) u = e^{-uz} e^{-(u - uz)} u = ue^{-u}, \\quad u > 0, 0 < z < 1\n", - "$$\n", - "\n", - "The joint density does not depend on $z$ and $\\int_0^\\infty u e^{-u} du = 1$, so the marginal density of $Z$ is Uniform with support $[0, 1]$." - ] - }, - { - "cell_type": "markdown", - "id": "05ea168e", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "# Homework problems\n", - "\n", - "1. Compute the moment-generating function of $Geom(p)$. 
Use it to find expectation and variance.\n", - "2. Let $X$ and $Y$ be i.i.d. $Geom(p)$, and $N = X + Y$. Find the joint PMF of $X, Y, N$. Find the joint PMF of $X$ and $N$. Find the conditional PMF of $X$ given $N = n$\n", - "3. Let $U \\sim U[0, \\tfrac{\\pi}{2}]$. Find the PDF of $\\sin(U)$.\n", - "4. Let $X$ and $Y$ be i.i.d. $Exp(\\lambda)$, and $T = \\log(X/Y)$. Find the CDF and PDF of $T$." - ] - } - ], - "metadata": { - "celltoolbar": "Слайд-шоу", - "kernelspec": { - "display_name": "Python 3 (ipykernel)", - "language": "python", - "name": "python3" - }, - "language_info": { - "codemirror_mode": { - "name": "ipython", - "version": 3 - }, - "file_extension": ".py", - "mimetype": "text/x-python", - "name": "python", - "nbconvert_exporter": "python", - "pygments_lexer": "ipython3", - "version": "3.9.7" - } - }, - "nbformat": 4, - "nbformat_minor": 5 -} diff --git a/Seminar_materials/Seminar10/Seminar 10 (Transformations).pdf b/Seminar_materials/Seminar10/Seminar 10 (Transformations).pdf deleted file mode 100644 index 36836f8..0000000 Binary files a/Seminar_materials/Seminar10/Seminar 10 (Transformations).pdf and /dev/null differ diff --git a/Seminar_materials/Seminar11/Seminar 11 (Beta and Gamma).ipynb b/Seminar_materials/Seminar11/Seminar 11 (Beta and Gamma).ipynb deleted file mode 100644 index 96f3402..0000000 --- a/Seminar_materials/Seminar11/Seminar 11 (Beta and Gamma).ipynb +++ /dev/null @@ -1,582 +0,0 @@ -{ - "cells": [ - { - "cell_type": "markdown", - "id": "6f9a4edb", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "# Homework problems" - ] - }, - { - "cell_type": "markdown", - "id": "25de506c", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "## Problem 1\n", - "\n", - "Compute the moment-generating function of $Geom(p)$. Use it to find expectation and variance." 
- ] - }, - { - "cell_type": "markdown", - "id": "3e1e710d", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "## Solution 1\n", - "\n", - "MGF (with $q = 1 - p$, for $e^t q < 1$):\n", - "$$\n", - "M_X(t) = \\mathbb{E}\\left[e^{tX}\\right] = \\sum\\limits_{k=0}^\\infty e^{tk} q^k p = p \\sum\\limits_{k=0}^\\infty\\left( e^{t} q \\right)^k = \\frac{p}{1 - e^t q}\n", - "$$\n", - "\n", - "First derivative:\n", - "$$\n", - "M'_X(t) = \\frac{p q e^t}{(1 - e^t q)^2}\n", - "$$" - ] - }, - { - "cell_type": "markdown", - "id": "03ef75ae", - "metadata": { - "slideshow": { - "slide_type": "subslide" - } - }, - "source": [ - "Expectation $M'_X(0) = \\frac{q}{p}$. Second derivative:\n", - "$$\n", - "M''_X(t) = p q e^t \\frac{1 + q e^t}{(1 - e^t q)^3}\n", - "$$\n", - "\n", - "Second moment $M''_X(0) = \\frac{q(1+q)}{p^2}$. Variance $\\mathbb{V}\\text{ar}(X) = \\frac{q^2+q}{p^2} - \\frac{q^2}{p^2} = \\frac{q}{p^2}$." - ] - }, - { - "cell_type": "markdown", - "id": "436606ff", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "## Problem 2\n", - "\n", - "Let $X$ and $Y$ be i.i.d. $Geom(p)$, and $N = X + Y$. Find the joint PMF of $X, Y, N$. Find the joint PMF of $X$ and $N$. Find the conditional PMF of $X$ given $N = n$."
- ] - }, - { - "cell_type": "markdown", - "id": "8bd33e84", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "## Solution 2\n", - "\n", - "$$\n", - "\\begin{aligned}\n", - "\\mathbb{P}(X = x, Y = y, N = n) & = \\mathbb{P}(X = x, Y = y) \\mathbb{I}\\text{nd}(n = x + y) = \\\\\n", - "& = \\mathbb{P}(X = x)\\mathbb{P}(Y = y) \\mathbb{I}\\text{nd}(n = x + y) = \\\\\n", - "& = q^{x+y} p^2 \\mathbb{I}\\text{nd}(n = x + y) = \\\\\n", - "& = q^{n} p^2 \\mathbb{I}\\text{nd}(n = x + y)\n", - "\\end{aligned}\n", - "$$" - ] - }, - { - "cell_type": "markdown", - "id": "56574f7f", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "## Problem 3\n", - "\n", - "Let $U \\sim U[0, \\tfrac{\\pi}{2}]$. Find the PDF of $\\sin(U)$." - ] - }, - { - "cell_type": "markdown", - "id": "7f7899d5", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "## Solution 3\n", - "\n", - "Denote $X = U$ and $Y = \\sin(U)$. So $g(x) = \\sin(x)$, it is not one-to-one in general, but one-to-one on $[0, \\frac{\\pi}{2}]$, and $g^{-1}(y) = \\arcsin y$. Let's find\n", - "$$\n", - "\\frac{dg(x)}{dx} = \\cos x\n", - "$$" - ] - }, - { - "cell_type": "markdown", - "id": "44ee3149", - "metadata": { - "slideshow": { - "slide_type": "subslide" - } - }, - "source": [ - "So, for $0 < y < 1$,\n", - "$$\n", - "\\begin{aligned}\n", - "f_Y(y) & = f_X(x) \\left| \\frac{d g(x)}{dx} \\right|^{-1} = f_X(\\arcsin y) \\frac{1}{\\cos \\arcsin y} = \\\\\n", - "& = f_X(\\arcsin y) \\frac{1}{\\sqrt{1 - y^2}} = \\frac{2}{\\pi \\sqrt{1 - y^2}}\n", - "\\end{aligned}\n", - "$$" - ] - }, - { - "cell_type": "markdown", - "id": "3620fde4", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "## Problem 4\n", - "\n", - "Let $X$ and $Y$ be i.i.d. $Exp(\\lambda)$, and $T = \\log(X/Y)$. Find the CDF and PDF of $T$."
- ] - }, - { - "cell_type": "markdown", - "id": "3594ab4b", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "## Solution 4\n", - "\n", - "Consider the transform $(X, Y) \\to (Z, Y)$, where $Z = X/Y$. The inverse transform is $X = YZ, Y = Y$. The Jacobian of the inverse transform is\n", - "$$\n", - "\\frac{\\partial (x, y)}{\\partial (z, y)} = \\begin{pmatrix}y&z\\\\0&1\\end{pmatrix}\n", - "$$\n", - "\n", - "The joint density is\n", - "$$\n", - "f_{Z, Y}(z, y) = f_{X, Y}(g^{-1}(z, y)) \\left|\\det \\frac{\\partial (x, y)}{\\partial (z, y)}\\right| = f_{X, Y}(yz, y) y = y \\lambda e^{- \\lambda yz} \\lambda e^{- \\lambda y} = y \\lambda^2 e^{- \\lambda y (1 + z)}, \\quad y > 0, z > 0\n", - "$$" - ] - }, - { - "cell_type": "markdown", - "id": "a14e0b95", - "metadata": { - "slideshow": { - "slide_type": "subslide" - } - }, - "source": [ - "Let's find the marginal distribution of $Z$ by integrating out $y$ with the substitution $u = \\lambda y (1 + z)$:\n", - "$$\n", - "f_Z(z) = \\frac{\\lambda^2}{\\lambda^2 (1 + z)^2} \\int_0^\\infty \\lambda y (1 + z) e^{- \\lambda y (1 + z)} d ( \\lambda y ( 1 + z ) ) = \\frac{1}{(1 + z)^2} \\int_0^\\infty u e^{-u} du = \\frac{1}{(1 + z)^2}\n", - "$$\n", - "\n", - "Let's find the log-transform $T$. So $g(x) = \\log(x)$, it is one-to-one, and $g^{-1}(y) = \\exp y$. 
Let's find\n", - "$$\n", - "\\frac{dg(x)}{dx} = \\frac{1}{x}\n", - "$$" - ] - }, - { - "cell_type": "markdown", - "id": "d23b2188", - "metadata": { - "slideshow": { - "slide_type": "subslide" - } - }, - "source": [ - "So,\n", - "$$\n", - "\\begin{aligned}\n", - "f_T(t) & = f_Z(z) \\left| \\frac{d g(z)}{dz} \\right|^{-1} = f_Z(\\exp t) \\exp t = \\\\\n", - "& = \\frac{\\exp t}{(1 + \\exp t)^2}\n", - "\\end{aligned}\n", - "$$" - ] - }, - { - "cell_type": "markdown", - "id": "35cd90a0", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "# Beta and Gamma" - ] - }, - { - "cell_type": "markdown", - "id": "9790546c", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "## Beta distribution\n", - "\n", - "The Beta distribution is a continuous distribution on the interval $(0, 1)$. It is a generalization of the $U[0,1]$ distribution, allowing the PDF to be non-constant. Let $X \\sim Beta(a, b)$ with $a > 0, b > 0$, then\n", - "$$\n", - "f_X(x) = \\frac{1}{\\beta(a, b)} x^{a-1} (1 - x)^{b-1}, 0 < x < 1\n", - "$$\n", - "\n", - "$\\beta(a, b)$ is a normalizing constant. We will discuss what it is exactly later." - ] - }, - { - "cell_type": "markdown", - "id": "ffe599e2", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "## Beta distribution\n", - "\n", - "Why do we need this extension? For example, for Bayesian inference.\n", - "\n", - "Bayesian inference updates our beliefs about the parameters $\\theta$ of the distribution of $X$ after observing some data $X$, starting from preliminary beliefs about the values of these parameters expressed **as a distribution**. We will cover it in the statistics course in great detail. 
Bayesian inference relies on Bayes' theorem:\n", - "$$\n", - "\\mathbb{P}(\\theta|X) = \\frac{\\mathbb{P}(X|\\theta)\\mathbb{P}(\\theta)}{\\int \\mathbb{P}(X|\\theta) \\mathbb{P}(\\theta) d\\theta}\n", - "$$\n", - "\n", - "Here, $\\mathbb{P}(X|\\theta)$ is the likelihood of the data (probability to observe it with current parameters), $\\mathbb{P}(\\theta)$ is the prior distribution on parameters (our preliminary beliefs), and $\\mathbb{P}(\\theta|X)$ is the posterior distribution on parameters (updated beliefs).\n", - "\n", - "To compute the integral in the denominator in closed form, it helps to have a prior that is **conjugate** to the likelihood. If your data is counts, the likelihood will be Binomial. If you do not have any beliefs, you may assume a Uniform prior. However, the Uniform family is not conjugate to the Binomial: after the update the posterior is no longer Uniform. The Beta family, which contains the Uniform as $Beta(1, 1)$, is in fact conjugate to the Binomial." - ] - }, - { - "cell_type": "markdown", - "id": "39815f27", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "## Beta distribution\n", - "\n", - "Here, $\\mathbb{P}(X|\\theta)$ is the likelihood of the data (probability to observe it with current parameters), $\\mathbb{P}(\\theta)$ is the prior distribution on parameters (our preliminary beliefs), and $\\mathbb{P}(\\theta|X)$ is the posterior distribution on parameters (updated beliefs).\n", - "\n", - "To compute the integral in the denominator in closed form, it helps to have a prior that is **conjugate** to the likelihood. If your data is counts, the likelihood will be Binomial. If you do not have any beliefs, you may assume a Uniform prior. The Beta distribution is in fact conjugate to the Binomial."
- ] - }, - { - "cell_type": "code", - "execution_count": 9, - "id": "d901b44d", - "metadata": { - "slideshow": { - "slide_type": "skip" - } - }, - "outputs": [], - "source": [ - "import numpy as np\n", - "import scipy.stats as sts\n", - "import matplotlib.pyplot as plt\n", - "%matplotlib inline" - ] - }, - { - "cell_type": "code", - "execution_count": 19, - "id": "4c0faa72", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "outputs": [ - { - "data": { - "image/png": "\n", - "text/plain": [ - "
" - ] - }, - "metadata": { - "needs_background": "light" - }, - "output_type": "display_data" - } - ], - "source": [ - "xx = np.linspace(0,1,100)\n", - "fig, axes = plt.subplots(2,2,figsize=(5,5), sharex=True, sharey=True)\n", - "for (a, b), ax in zip([(0.5, 0.5), (1, 1), (2, 8), (5, 5)], axes.flatten()):\n", - " ax.plot(xx, sts.beta(a, b).pdf(xx))\n", - " ax.set_title(f\"Beta({a}, {b})\")\n", - "fig.tight_layout()" - ] - }, - { - "cell_type": "markdown", - "id": "44756d94", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "## Beta distribution\n", - "\n", - "By definition,\n", - "$$\n", - "\\beta(a, b) = \\int_0^1 x^{a-1} (1 - x)^{b-1} dx\n", - "$$\n", - "\n", - "We can compute it directly, but there is a useful relation (read the story proof in the textbook) called Bayes' billiards: for integers $0 \\leqslant k \\leqslant n$,\n", - "$$\n", - "\\int_0^1 \\begin{pmatrix}n\\\\k\\end{pmatrix} x^{k} (1 - x)^{n-k} dx = \\frac{1}{n+1}\n", - "$$\n", - "\n", - "We can compute $\\beta(a, b)$ for integer $a, b$ using this relation with $k = a - 1$ and $n - k = b - 1$." - ] - }, - { - "cell_type": "markdown", - "id": "86084e0f", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "## Gamma distribution\n", - "\n", - "The Gamma distribution is a continuous distribution on the positive real line. It is a generalization of the Exponential distribution. In order to write down the PDF, we will need the **gamma function**:\n", - "$$\n", - "\\Gamma(a) = \\int_0^\\infty x^{a-1} e^{-x} dx, a > 0\n", - "$$\n", - "\n", - "Properties:\n", - "1. $\\Gamma(a+1) = a\\Gamma(a)$\n", - "2. $\\Gamma(n) = (n-1)!, n \\in \\mathbb{N}_{++}$" - ] - }, - { - "cell_type": "markdown", - "id": "d9a6e00f", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "## Gamma distribution\n", - "\n", - "Let $X \\sim Gamma(a, 1)$ with $a > 0$, then\n", - "$$\n", - "f_X(x) = \\frac{1}{\\Gamma(a)} x^{a-1} e^{-x}, x > 0\n", - "$$\n", - "\n", - "The distribution of $Y = \\frac{1}{\\lambda} X \\sim Gamma(a, \\lambda)$. 
By the transform formula,\n", - "$$\n", - "f_Y(y) = \\frac{1}{\\Gamma(a)} \\lambda^a y^{a-1} e^{-\\lambda y}, y > 0\n", - "$$\n", - "\n", - "Note that for $a = 1$ we have $Gamma(1, \\lambda) \\equiv Exp(\\lambda)$." - ] - }, - { - "cell_type": "code", - "execution_count": 18, - "id": "fcdcec6d", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "outputs": [ - { - "data": { - "image/png": "\n", - "text/plain": [ - "
" - ] - }, - "metadata": { - "needs_background": "light" - }, - "output_type": "display_data" - } - ], - "source": [ - "xx = np.linspace(0,20,100)\n", - "fig, axes = plt.subplots(2,2,figsize=(5,5), sharex=True, sharey=True)\n", - "for (a, b), ax in zip([(3, 1), (3, 0.5), (10, 1), (5, 0.5)], axes.flatten()):\n", - " ax.plot(xx, sts.gamma(a).pdf(xx * b) * b)\n", - " ax.set_title(f\"Gamma({a}, {b})\")\n", - "fig.tight_layout()" - ] - }, - { - "cell_type": "markdown", - "id": "a1924a25", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "## Gamma distribution\n", - "\n", - "Let's find the mean, variance, and other moments of the Gamma distribution. Let's start with $X \\sim Gamma(a, 1)$. We will do it without taking a single integral.\n", - "\n", - "$$\n", - "\\begin{aligned}\n", - "\\mathbb{E}[X] & = \\int_0^\\infty x \\frac{1}{\\Gamma(a)} x^{a-1} e^{-x} dx = \\frac{1}{\\Gamma(a)} \\int_0^\\infty x^{a+1-1} e^{-x} dx = \\\\\n", - "& = \\frac{\\Gamma(a+1)}{\\Gamma(a)} = \\frac{a\\Gamma(a)}{\\Gamma(a)} = a\n", - "\\end{aligned}\n", - "$$" - ] - }, - { - "cell_type": "markdown", - "id": "762889b8", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "## Gamma distribution\n", - "\n", - "LOTUS gives us the second moment:\n", - "$$\n", - "\\begin{aligned}\n", - "\\mathbb{E}[X^2] & = \\int_0^\\infty x^2 \\frac{1}{\\Gamma(a)} x^{a-1} e^{-x} dx = \\frac{1}{\\Gamma(a)} \\int_0^\\infty x^{a+2-1} e^{-x} dx = \\\\\n", - "& = \\frac{\\Gamma(a+2)}{\\Gamma(a)} = \\frac{a(a+1)\\Gamma(a)}{\\Gamma(a)} = a (a+1)\n", - "\\end{aligned}\n", - "$$\n", - "\n", - "Then, the variance is $\\mathbb{V}\\text{ar}(X) = \\mathbb{E}\\left[X^2\\right] - \\left(\\mathbb{E}[X] \\right)^2 = a (a+1) - a^2 = a$. 
" - ] - }, - { - "cell_type": "markdown", - "id": "edd60693", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "## Gamma distribution\n", - "\n", - "We can now transform to $Y = X/\\lambda \\sim Gamma(a, \\lambda)$, to obtain:\n", - "- $\\mathbb{E}[Y] = \\frac{a}{\\lambda}$\n", - "- $\\mathbb{V}\\text{ar}(Y) = \\frac{a}{\\lambda^2}$\n", - "\n", - "The Gamma distribution is conjugate with Poisson distribution." - ] - }, - { - "cell_type": "markdown", - "id": "6797e982", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "## Beta-Gamma connection\n", - "\n", - "$$\n", - "\\beta(a, b) = \\frac{\\Gamma(a) \\Gamma(b)}{\\Gamma(a+b)}\n", - "$$" - ] - }, - { - "cell_type": "markdown", - "id": "449c4970", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "# Homework problems" - ] - }, - { - "cell_type": "markdown", - "id": "443277f6", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "## Homework problems\n", - "\n", - "1. Find the MGF function of $\\Gamma(n, \\lambda)$.\n", - "2. While running errands, you need to go to the bank, then to the post office. Let $X \\sim Gamma(a, \\lambda)$ be your waiting time in line at the bank, and let $Y \\sim Gamma(b, \\lambda)$ be your waiting time in line at the post office (with the same $\\lambda$ for both). Assume $X$ and $Y$ are independent. What is the joint distribution of $T = X + Y$ (your total wait at the bank and post office) and $W = X/(X+Y)$ (the fraction of your waiting time spent at the bank)? In case of trouble, refer to the textbook, this problem is solved there.\n", - "3. Use the result of previous problem to find the expectation and variance of Beta distribution. In case of trouble, refer to the textbook, this problem is solved there." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "b41d73fb", - "metadata": {}, - "outputs": [], - "source": [] - } - ], - "metadata": { - "celltoolbar": "Слайд-шоу", - "kernelspec": { - "display_name": "Python 3 (ipykernel)", - "language": "python", - "name": "python3" - }, - "language_info": { - "codemirror_mode": { - "name": "ipython", - "version": 3 - }, - "file_extension": ".py", - "mimetype": "text/x-python", - "name": "python", - "nbconvert_exporter": "python", - "pygments_lexer": "ipython3", - "version": "3.8.9" - } - }, - "nbformat": 4, - "nbformat_minor": 5 -} diff --git a/Seminar_materials/Seminar11/Seminar 11 (Beta and Gamma).pdf b/Seminar_materials/Seminar11/Seminar 11 (Beta and Gamma).pdf deleted file mode 100644 index 0833ad8..0000000 Binary files a/Seminar_materials/Seminar11/Seminar 11 (Beta and Gamma).pdf and /dev/null differ diff --git a/Seminar_materials/Seminar12/Seminar 12 (Conditional Expectation).ipynb b/Seminar_materials/Seminar12/Seminar 12 (Conditional Expectation).ipynb deleted file mode 100644 index 00bd598..0000000 --- a/Seminar_materials/Seminar12/Seminar 12 (Conditional Expectation).ipynb +++ /dev/null @@ -1,563 +0,0 @@ -{ - "cells": [ - { - "cell_type": "markdown", - "id": "19d9a552", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "# Homework problems" - ] - }, - { - "cell_type": "markdown", - "id": "32b13142", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "## Problem 1\n", - "\n", - "Find the MGF function of $X \\sim \\Gamma(n, \\lambda)$." 
- ] - }, - { - "cell_type": "markdown", - "id": "96b637bd", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "## Solution 1\n", - "\n", - "MGF of $X$ (for $t < \\lambda$):\n", - "$$\n", - "\\begin{aligned}\n", - "M_X(t) & = \\mathbb{E}\\left[e^{tX}\\right] = \\int\\limits_{0}^\\infty e^{tx} \\frac{1}{\\Gamma(n)} \\lambda^n x^{n-1} e^{-\\lambda x} dx = \\\\\n", - "& = \\frac{1}{\\Gamma(n)} \\frac{\\lambda^n}{(\\lambda - t)^n} \\int\\limits_{0}^\\infty \\left((\\lambda - t)x\\right)^{n-1} e^{-(\\lambda-t)x} d \\left( (\\lambda - t) x \\right) = \\\\\n", - "& = \\frac{1}{\\Gamma(n)} \\frac{\\lambda^n}{(\\lambda - t)^n} \\int\\limits_{0}^\\infty u^{n-1} e^{-u} d \\left( u \\right) = \\\\\n", - "& = \\frac{\\lambda^n}{(\\lambda - t)^n}\n", - "\\end{aligned}\n", - "$$" - ] - }, - { - "cell_type": "markdown", - "id": "964f5b16", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "## Problem 2\n", - "\n", - "While running errands, you need to go to the bank, then to the post office. Let $X \\sim Gamma(a, \\lambda)$ be your waiting time in line at the bank, and let $Y \\sim Gamma(b, \\lambda)$ be your waiting time in line at the post office (with the same $\\lambda$ for both). Assume $X$ and $Y$ are independent. What is the joint distribution of $T = X + Y$ (your total wait at the bank and post office) and $W = X/(X+Y)$ (the fraction of your waiting time spent at the bank)?" - ] - }, - { - "cell_type": "markdown", - "id": "233ade75", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "## Solution 2\n", - "\n", - "Consider the transform $(X, Y) \\to (T, W)$, where $T = X + Y$ and $W = \\frac{X}{X+Y}$. The inverse transform is $X = TW, Y = T(1 - W)$. 
The absolute value of the Jacobian determinant of the **inverse** transform is:\n", - "$$\n", - "\\left| \\det \\frac{\\partial (x, y)}{\\partial (t, w)} \\right| = \\left| \\det \\begin{pmatrix}w&t\\\\1-w&-t\\end{pmatrix} \\right| = t\n", - "$$\n", - "\n", - "Then,\n", - "$$\n", - "\\begin{aligned}\n", - "f_{T,W}(t, w) & = f_{X,Y}(tw, t(1-w)) \\left| \\det \\frac{\\partial (x, y)}{\\partial (t, w)} \\right| = \\\\\n", - "& = f_X(tw)f_Y\\left(t(1-w)\\right) t = \\\\\n", - "& = t \\frac{1}{\\Gamma(a)} \\lambda^a (tw)^{a-1} e^{-\\lambda tw} \\frac{1}{\\Gamma(b)} \\lambda^b (t(1-w))^{b-1} e^{-\\lambda t(1-w)} = \\\\\n", - "& = \\frac{1}{\\Gamma(a)\\Gamma(b)} w^{a-1} (1-w)^{b-1} (\\lambda t)^{a+b} e^{-\\lambda t} \\frac1t = \\\\\n", - "& = \\left( \\frac{\\Gamma(a+b)}{\\Gamma(a)\\Gamma(b)} w^{a-1} (1-w)^{b-1} \\right) \\left( \\frac{1}{\\Gamma(a+b)} \\lambda^{a+b} t^{a+b-1} e^{-\\lambda t} \\right)\n", - "\\end{aligned}\n", - "$$" - ] - }, - { - "cell_type": "markdown", - "id": "8906e87d", - "metadata": { - "slideshow": { - "slide_type": "subslide" - } - }, - "source": [ - "$$\n", - "f_{T,W}(t, w) = \\left( \\frac{\\Gamma(a+b)}{\\Gamma(a)\\Gamma(b)} w^{a-1} (1-w)^{b-1} \\right) \\left( \\frac{1}{\\Gamma(a+b)} \\lambda^{a+b} t^{a+b-1} e^{-\\lambda t} \\right)\n", - "$$\n", - "\n", - "First, it tells us that $f_{T,W}(t,w) = f_T(t)f_W(w)$, so they are independent (total time is independent of the fraction at the bank). Second, we have an expression for $f_T(t)$:\n", - "$$\n", - "f_T(t) = \\frac{1}{\\Gamma(a+b)} \\lambda^{a+b} t^{a+b-1} e^{-\\lambda t}\n", - "$$\n", - "\n", - "In which we recognize $T \\sim Gamma(a+b, \\lambda)$. We also have an expression for $f_W(w)$:\n", - "$$\n", - "f_W(w) = \\frac{\\Gamma(a+b)}{\\Gamma(a)\\Gamma(b)} w^{a-1} (1-w)^{b-1}\n", - "$$\n", - "\n", - "In which we recognize $W \\sim Beta(a, b)$. 
This gives us expression for beta function:\n", - "$$\n", - "\\beta(a, b) = \\frac{\\Gamma(a)\\Gamma(b)}{\\Gamma(a + b)}\n", - "$$" - ] - }, - { - "cell_type": "markdown", - "id": "c639e62e", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "## Problem 3\n", - "\n", - "Use the result of previous problem to find the expectation and variance of Beta distribution." - ] - }, - { - "cell_type": "markdown", - "id": "d54229c3", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "## Solution 3\n", - "\n", - "Result of previous problem: for $X \\sim Gamma(a, \\lambda)$ and $Y \\sim Gamma(b, \\lambda)$, we have $T = X + Y \\sim Gamma(a+b, \\lambda)$ and $W = \\frac{X}{X+Y} \\sim Beta(a, b)$ independent.\n", - "\n", - "Now,\n", - "$$\n", - "\\mathbb{E}\\left[TW\\right] = \\mathbb{E}\\left[T\\right]\\mathbb{E}\\left[W\\right]\n", - "$$\n", - "$$\n", - "\\mathbb{E}\\left[(X+Y) \\frac{X}{X+Y}\\right] = \\mathbb{E}\\left[X+Y\\right]\\mathbb{E}\\left[W\\right]\n", - "$$\n", - "$$\n", - "\\mathbb{E}\\left[W\\right] = \\frac{\\mathbb{E}\\left[X\\right]}{\\mathbb{E}\\left[X+Y\\right]} = \\frac{a/\\lambda}{a/\\lambda + b/\\lambda} = \\frac{a}{a+b}\n", - "$$" - ] - }, - { - "cell_type": "markdown", - "id": "24b1ce1d", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "# Conditional expectation" - ] - }, - { - "cell_type": "markdown", - "id": "f979236d", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "## Conditional expectation: definition\n", - "\n", - "We know conditional probability. Consider r.v.s $X$ (discrete) and $Z$ (discrete with alphabet $E$), then for event $X = x$\n", - "$$\n", - "\\mathbb{P}(X = x | Z = z) = \\frac{\\mathbb{P}(X = x, Z = z)}{\\mathbb{P}(Z = z)}\n", - "$$\n", - "\n", - "A function $x \\to \\mathbb{P}(X = x | Z = z)$ is the conditional law of $X$ given $Z = z$. 
Its mean, $h(z) = \\mathbb{E}[X | Z = z] = \\sum_x x \\mathbb{P}(X = x | Z = z)$, is the conditional expectation of $X$ given $Z = z$.\n", - "\n", - "But what if we had not one event $Z = z$, but many events $A_i = \\{Z = z_i\\}$? We can still do it, by plugging a set of events $\\mathcal{A} = \\{A_1, \\ldots, A_n\\}$ into function $h(\\cdot)$.\n", - "\n", - "What if we want to account for all events with their probabilities? We can still use our function $h(\\cdot)$ by plugging the random variable $Z$ into it! $h(Z) = \\mathbb{E}[X|Z]$ then will be a random variable." - ] - }, - { - "cell_type": "markdown", - "id": "558fcd04", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "## Properties\n", - "- $\\mathbb{E}[\\alpha X_1 + \\beta X_2|Z] = \\alpha\\mathbb{E}[X_1|Z] + \\beta \\mathbb{E}[X_2|Z]$\n", - "- If $X \\geqslant 0$ a.s., then $\\mathbb{E}[X|Z] \\geqslant 0$\n", - "- $\\mathbb{E}\\left[\\mathbb{E}[X|Z]\\right] = \\mathbb{E}[X]$\n", - "- For any function $h(\\cdot)$ we have $\\mathbb{E}[Xh(Z)|Z] = h(Z) \\mathbb{E}[X|Z]$ a.s.\n", - "- If $X$ and $Z$ are independent, then $\\mathbb{E}[X|Z] = \\mathbb{E}[X]$" - ] - }, - { - "cell_type": "markdown", - "id": "83aba7e9", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "## Example 1\n", - "\n", - "A stick of length 1 is broken at point $X$ chosen uniformly at random. Given that $X = x$, we choose another\n", - "breakpoint $Y$ uniformly on the interval $[0,x]$. Find $\\mathbb{E}[Y|X]$ and its mean."
- ] - }, - { - "cell_type": "markdown", - "id": "edc23fd3", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "## Solution 1\n", - "\n", - "We have $X \\sim U[0, 1]$ and $Y|X=x \\sim U[0, x]$.\n", - "\n", - "The expected value of $U[0, x]$ distribution is $\\mathbb{E}[Y|X=x] = \\tfrac{x}{2}$.\n", - "\n", - "To get $\\mathbb{E}[Y|X]$ we take this expected value and plug in our random variable, so\n", - "$$\n", - "\\mathbb{E}[Y|X] = \\frac{X}{2}\n", - "$$\n", - "\n", - "To find the expectation $\\mathbb{E}\\left[\\mathbb{E}[Y|X]\\right]$, we use the property:\n", - "$$\n", - "\\mathbb{E}\\left[\\mathbb{E}[Y|X]\\right] = \\mathbb{E}[Y] = \\mathbb{E}\\left[\\frac{X}{2}\\right] = \\frac12 \\mathbb{E}[X] = \\frac14\n", - "$$" - ] - }, - { - "cell_type": "markdown", - "id": "0a46d7d9", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "## Example 2\n", - "\n", - "Let $Z \\sim \\mathcal{N}(0, 1)$ and $Y = Z^2$. Find $\\mathbb{E}[Y|Z]$ and $\\mathbb{E}[Z|Y]$." - ] - }, - { - "cell_type": "markdown", - "id": "1e6c7fc5", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "## Solution 2\n", - "\n", - "We have $Y = h(Z) = Z^2$. Therefore\n", - "$$\n", - "\\mathbb{E}[Y|Z] = \\mathbb{E}[h(Z)|Z] = h(Z) = Z^2\n", - "$$\n", - "\n", - "To get the converse $\\mathbb{E}[Z|Y]$, let's first start with\n", - "$$\n", - "\\mathbb{E}[Z|Y = y] = \\sqrt{y} \\mathbb{P}(Z = \\sqrt{y} | Y = y) + (- \\sqrt{y}) \\mathbb{P}(Z = - \\sqrt{y} | Y = y) = \\sqrt{y} p - \\sqrt{y} p = 0\n", - "$$\n", - "\n", - "By plugging in r.v. $Y$ we still get 0, so $\\mathbb{E}[Z|Y] = 0$." - ] - }, - { - "cell_type": "markdown", - "id": "ed7ebc0e", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "## Example 3\n", - "\n", - "Let $X_1, \\ldots, X_n$ be i.i.d. r.v.s and $S_n = X_1 + \\ldots + X_n$. Find $\\mathbb{E}[X_1|S_n]$." 
- ] - }, - { - "cell_type": "markdown", - "id": "b58c9bfa", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "## Solution 3\n", - "\n", - "By symmetry,\n", - "$$\n", - "\\mathbb{E}[X_1|S_n] = \\mathbb{E}[X_2|S_n] = \\ldots = \\mathbb{E}[X_n|S_n]\n", - "$$\n", - "\n", - "By linearity,\n", - "$$\n", - "\\mathbb{E}[X_1|S_n] + \\mathbb{E}[X_2|S_n] + \\ldots + \\mathbb{E}[X_n|S_n] = \\mathbb{E}[S_n|S_n] = S_n\n", - "$$\n", - "\n", - "So,\n", - "$$\n", - "\\mathbb{E}[X_1|S_n] = \\frac{1}{n} S_n = \\overline{X}\n", - "$$" - ] - }, - { - "cell_type": "markdown", - "id": "a7368f5f", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "## Example 4\n", - "\n", - "Regression uses the following to make a prediction: $\\hat{Y} = \\mathbb{E}[Y|X]$. Linear regression assumes $\\mathbb{E}[Y|X] = a + bX$.\n", - "\n", - "1. Show that an equivalent way to express this is to write $Y = a + bX + \\varepsilon$ with $\\mathbb{E}[\\varepsilon|X] = 0$.\n", - "2. Solve for the constants $a$ and $b$" - ] - }, - { - "cell_type": "markdown", - "id": "9ab5b998", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "## Solution 4.1\n", - "\n", - "1. Let $Y = a + bX + \\varepsilon$, then\n", - "$$\n", - "\\mathbb{E}[Y|X] = \\mathbb{E}[a + bX + \\varepsilon|X] = a + bX + \\mathbb{E}[\\varepsilon|X] = a + bX\n", - "$$\n", - "2. Let $\\mathbb{E}[Y|X] = a + bX$ and define $\\varepsilon = Y - (a + bX)$, then $Y = a + bX + \\varepsilon$ and\n", - "$$\n", - "\\mathbb{E}[\\varepsilon|X] = \\mathbb{E}[Y - (a + bX)|X] = \\mathbb{E}[Y|X] - \\mathbb{E}[a + bX|X] = a + bX - (a + bX) = 0\n", - "$$" - ] - }, - { - "cell_type": "markdown", - "id": "3de32786", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "## Solution 4.2\n", - "\n", - "1. 
Expectation of $Y$\n", - " $$\n", - " \\mathbb{E}[Y] = \\mathbb{E}\\left[\\mathbb{E}[Y|X]\\right] = \\mathbb{E}[a + bX] = a + b\\mathbb{E}[X]\n", - " $$\n", - "\n", - "2. $\\varepsilon$ has zero mean\n", - " $$\n", - " \\mathbb{E}[\\varepsilon] = \\mathbb{E}\\left[\\mathbb{E}[\\varepsilon|X]\\right] = \\mathbb{E}[0] = 0\n", - " $$\n", - "\n", - "3. $X$ and $\\varepsilon$ uncorrelated\n", - " $$\n", - " \\mathbb{E}[\\varepsilon X] = \\mathbb{E}\\left[\\mathbb{E}[\\varepsilon X|X]\\right] = \\mathbb{E}\\left[X \\mathbb{E}[\\varepsilon|X]\\right] = \\mathbb{E}[X \\cdot 0] = 0\n", - " $$\n", - "\n", - "Now, take $Y = a + bX + \\varepsilon$ and take the covariance with $X$ on both sides: \n", - "$$\n", - "\\operatorname{cov}(X,Y) = \\operatorname{cov}(X,a) + \\operatorname{cov}(X,bX) + \\operatorname{cov}(X,\\varepsilon) = b \\mathbb{V}\\text{ar}(X)\n", - "$$\n", - "\n", - "Therefore,\n", - "$$\n", - "b = \\frac{\\operatorname{cov}(X,Y)}{\\mathbb{V}\\text{ar}(X)}\n", - "$$\n", - "$$\n", - "a = \\mathbb{E}[Y] - \\frac{\\operatorname{cov}(X,Y)}{\\mathbb{V}\\text{ar}(X)} \\mathbb{E}[X]\n", - "$$" - ] - }, - { - "cell_type": "markdown", - "id": "d8f112d4", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "## Problem 5\n", - "\n", - "One of two identical-looking coins is picked from a hat randomly, where one coin has probability $p_1$ of success and the other has probability $p_2$ of success. Let $X$ be the number of successes after flipping the chosen coin $n$ times. Find the mean of $X$." - ] - }, - { - "cell_type": "markdown", - "id": "36c321c1", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "## Solution 5\n", - "\n", - "Denote by $Y \\sim Be(1/2)$ the indicator that we chose coin 1, and by $X_1 \\sim Bi(n, p_1)$ and $X_2 \\sim Bi(n, p_2)$ the numbers of successes for trials with the two coins. 
Then\n", - "$$\n", - "X = Y X_1 + (1 - Y) X_2\n", - "$$\n", - "\n", - "Use the tower rule:\n", - "$$\n", - "\\mathbb{E}[X] = \\mathbb{E}\\left[\\mathbb{E}[X|Y]\\right]\n", - "$$\n", - "\n", - "Now we need to find\n", - "$$\n", - "\\mathbb{E}[X|Y = y] = y np_1 + (1 - y) np_2\n", - "$$\n", - "\n", - "Then,\n", - "$$\n", - "\\mathbb{E}[X|Y] = Y np_1 + (1 - Y) np_2\n", - "$$\n", - "\n", - "Finally,\n", - "$$\n", - "\\mathbb{E}[X] = \\mathbb{E}\\left[Y np_1 + (1 - Y) np_2\\right] = \\frac{n}{2} (p_1 + p_2)\n", - "$$" - ] - }, - { - "cell_type": "markdown", - "id": "812beebf", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "## Problem 6\n", - "\n", - "Let $N \\sim Pois(\\lambda_1)$ be the number of movies released in 2022. Suppose that for every movie independently the number of tickets sold in Dolgoprudnyy is $X_i \\sim Pois(\\lambda_2)$. Find the mean of the total number of movie tickets sold in Dolgoprudnyy in 2022." - ] - }, - { - "cell_type": "markdown", - "id": "e6fd2258", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "## Solution 6\n", - "\n", - "Denote $S_N = X_1 + \\ldots + X_N$. We need to find $\\mathbb{E}[S_N]$. 
Use the tower rule:\n", - "$$\n", - "\\mathbb{E}[S_N] = \\mathbb{E}\\left[\\mathbb{E}[S_N|N]\\right]\n", - "$$\n", - "\n", - "$$\n", - "\\mathbb{E}[S_N|N = n] = \\mathbb{E}[X_1 + \\ldots + X_N|N = n] = \\mathbb{E}[X_1 + \\ldots + X_n] = n \\mathbb{E}[X_1] = n \\lambda_2\n", - "$$\n", - "\n", - "$$\n", - "\\mathbb{E}[S_N|N] = N \\lambda_2\n", - "$$\n", - "\n", - "$$\n", - "\\mathbb{E}[S_N] = \\mathbb{E}\\left[N \\lambda_2\\right] = \\lambda_1 \\lambda_2\n", - "$$" - ] - }, - { - "cell_type": "markdown", - "id": "dbfabb16", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "# Homework problems" - ] - }, - { - "cell_type": "markdown", - "id": "498aed6c", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "## Homework problems\n", - "\n", - "1. Emails arrive one at a time in an inbox. Let $T_n$ be the time at which the $n$-th email arrives (measured on a continuous scale from some starting point in time). Suppose that the waiting times between emails are i.i.d. $Exp(\\lambda)$, i.e., $T_1, T_2 - T_1, T_3 - T_2, \\ldots \\sim Exp(\\lambda)$. Each email is non-spam with probability $p$, and spam with probability $q = 1 - p$ (independently of the other emails and of the waiting times). Let $X$ be the time at which the first non-spam email arrives (so $X$ is a continuous r.v.).\n", - " - Find the mean of $X$.\n", - " - Find the MGF of $X$. What famous distribution does this imply that $X$ has?\n", - " \n", - " Hint for both parts: Let $N$ be the number of emails until the first non-spam (including that one), and write $X$ as a sum of $N$ terms, then condition on $N$.\n", - " \n", - "2. Customers arrive at a store according to a Poisson process of rate $\\lambda$ customers per hour. Each makes a purchase with probability $p$, independently. 
Given that a customer makes a purchase, the amount spent has mean $\\mu$ (in dollars) and variance $\\sigma^2$.\n", - "\n", - " - Find the mean of how much a random customer spends (note that the customer may not make a purchase).\n", - " - Find the mean of the revenue the store obtains in an 8-hour time interval, using previous subproblem." - ] - } - ], - "metadata": { - "celltoolbar": "Slideshow", - "kernelspec": { - "display_name": "Python 3 (ipykernel)", - "language": "python", - "name": "python3" - }, - "language_info": { - "codemirror_mode": { - "name": "ipython", - "version": 3 - }, - "file_extension": ".py", - "mimetype": "text/x-python", - "name": "python", - "nbconvert_exporter": "python", - "pygments_lexer": "ipython3", - "version": "3.8.9" - } - }, - "nbformat": 4, - "nbformat_minor": 5 -} diff --git a/Seminar_materials/Seminar12/Seminar 12 (Conditional Expectation).pdf b/Seminar_materials/Seminar12/Seminar 12 (Conditional Expectation).pdf deleted file mode 100644 index 95eef69..0000000 Binary files a/Seminar_materials/Seminar12/Seminar 12 (Conditional Expectation).pdf and /dev/null differ diff --git a/Seminar_materials/Seminar13/Seminar 13 (Inequalities).html b/Seminar_materials/Seminar13/Seminar 13 (Inequalities).html deleted file mode 100644 index 5487395..0000000 --- a/Seminar_materials/Seminar13/Seminar 13 (Inequalities).html +++ /dev/null @@ -1,14831 +0,0 @@ - - - - - -Seminar 13 (Inequalities) - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - diff --git a/Seminar_materials/Seminar13/Seminar 13 (Inequalities).ipynb b/Seminar_materials/Seminar13/Seminar 13 (Inequalities).ipynb deleted file mode 100644 index ccc969d..0000000 --- a/Seminar_materials/Seminar13/Seminar 13 (Inequalities).ipynb +++ /dev/null @@ -1,353 +0,0 @@ -{ - "cells": [ - { - "cell_type": "markdown", - "id": "dbfabb16", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "# Homework problems" - ] - }, - { - 
"cell_type": "markdown", - "id": "498aed6c", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "## Problem 1\n", - "\n", - "Emails arrive one at a time in an inbox. Let $T_n$ be the time at which the $n$-th email arrives (measured on a continuous scale from some starting point in time). Suppose that the waiting times between emails are i.i.d. $Exp(\\lambda)$, i.e., $T_1, T_2 - T_1, T_3 − T_2, \\ldots \\sim Exp(\\lambda)$. Each email is non-spam with probability $p$, and spam with probability $q = 1 − p$ (independently of the other emails and of the waiting times). Let $X$ be the time at which the first non-spam email arrives (so $X$ is a continuous r.v.)\n", - "- Find the mean of $X$.\n", - "- Find the MGF of $X$. What famous distribution does this imply that X has?\n", - "\n", - "Hint for both parts: Let $N$ be the number of emails until the first non-spam (including that one), and write $X$ as a sum of $N$ terms, then condition on $N$." - ] - }, - { - "cell_type": "markdown", - "id": "2480b4b4", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "## Solution 1.1\n", - "\n", - "Let $N$ be the number of emails until the first non-spam, including it. Then, $N-1 \\sim Geom(p)$.\n", - "\n", - "Let's introduce random variables $X_k = T_k - T_{k-1}$. Then, $X = X_1 + X_2 + \\ldots + X_N$. 
Next,\n", - "$$\n", - "\\mathbb{E}[X] = \\mathbb{E}\\left[\\mathbb{E}[X|N]\\right]\n", - "$$\n", - "\n", - "$$\n", - "\\mathbb{E}[X|N=n] = \\mathbb{E}[\\sum_{k=1}^n X_k|N=n] = \\sum_{k=1}^n\\mathbb{E}[X_k] = \\sum_{k=1}^n \\frac{1}{\\lambda} = \\frac{n}{\\lambda}\n", - "$$\n", - "\n", - "$$\n", - "\\mathbb{E}[X] = \\mathbb{E}\\left[\\mathbb{E}[X|N]\\right] = \\mathbb{E}[N \\frac{1}{\\lambda}] = \\frac{1}{p \\lambda}\n", - "$$" - ] - }, - { - "cell_type": "markdown", - "id": "a7cc5a29", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "## Solution 1.2\n", - "\n", - "Let $N$ be the number of emails until the first non-spam, including it. Then, $N-1 \\sim Geom(p)$.\n", - "\n", - "Let's introduce random variables $X_k = T_k - T_{k-1}$. Then, $X = X_1 + X_2 + \\ldots + X_N$. Next,\n", - "$$\n", - "M_X(t) = \\mathbb{E}\\left[e^{tX}\\right] = \\mathbb{E}\\left[\\mathbb{E}\\left[e^{tX}|N\\right]\\right]\n", - "$$\n", - "\n", - "$$\n", - "\\mathbb{E}\\left[e^{tX}|N=n\\right] = \\mathbb{E}\\left[e^{t(X_1+X_2+\\ldots+X_n)}|N=n\\right] = \\prod_{k=1}^n\\mathbb{E}\\left[e^{tX_k}\\right] = M_1(t)^n = \\left( \\frac{\\lambda}{\\lambda - t} \\right)^n\n", - "$$\n", - "\n", - "$$\n", - "\\begin{aligned}\n", - "M_X(t) & = \\mathbb{E}\\left[\\mathbb{E}\\left[e^{tX}|N\\right]\\right] = \\mathbb{E}\\left[M_1(t)^N\\right] = \\\\\n", - "& = \\sum_{n=1}^\\infty pq^{n-1} M_1(t)^n = \\frac{p}{q} \\sum_{n=1}^\\infty \\left( q M_1(t)\\right)^n = \\frac{p}{q} \\frac{q M_1(t)}{1 - q M_1(t)} = \\frac{p\\lambda}{p\\lambda - t}\n", - "\\end{aligned}\n", - "$$\n", - "\n", - "This is the MGF of $Exp(p\\lambda)$, so $X \\sim Exp(p\\lambda)$." - ] - }, - { - "cell_type": "markdown", - "id": "7c959848", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "## Problem 2\n", - "\n", - "Customers arrive at a store according to a Poisson process of rate $\\lambda$ customers per hour. Each makes a purchase with probability $p$, independently. 
Given that a customer makes a purchase, the amount spent has mean $\\mu$ (in dollars) and variance $\\sigma^2$.\n", - "\n", - "- Find the mean of how much a random customer spends (note that the customer may not make a purchase).\n", - "- Find the mean of the revenue the store obtains in an 8-hour time interval, using the previous subproblem." - ] - }, - { - "cell_type": "markdown", - "id": "6d59b4b9", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "## Solution 2.1\n", - "\n", - "Denote by $P \\sim Be(p)$ the indicator of a purchase, by $S$ the amount spent given a purchase, and by $Z = PS$ the total spending of a customer. Then,\n", - "$$\n", - "\\mathbb{E}[Z] = \\mathbb{E}\\left[\\mathbb{E}[Z|P]\\right]\n", - "$$\n", - "\n", - "$$\n", - "\\mathbb{E}[Z|P=k] = k\\mu, \\quad k \\in \\{0, 1\\}\n", - "$$\n", - "\n", - "$$\n", - "\\mathbb{E}[Z] = \\mathbb{E}\\left[\\mathbb{E}[Z|P]\\right] = \\mathbb{E}[P\\mu] = \\mathbb{E}[P]\\mu = p\\mu\n", - "$$" - ] - }, - { - "cell_type": "markdown", - "id": "66696e3f", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "## Solution 2.2\n", - "\n", - "Denote by $Y \\sim Pois(8 \\lambda)$ the number of customers that arrive at the store in an 8-hour time interval and by $R = \\sum_{i=1}^{Y} Z_i$ the revenue of the store, where $Z_i$ are i.i.d. copies of $Z$ independent of $Y$. 
Then,\n", - "$$\n", - "\\mathbb{E}[R] = \\mathbb{E}\\left[\\mathbb{E}[R|Y]\\right]\n", - "$$\n", - "\n", - "$$\n", - "\\mathbb{E}[R|Y = y] = yp\\mu\n", - "$$\n", - "\n", - "$$\n", - "\\mathbb{E}[R] = \\mathbb{E}\\left[\\mathbb{E}[R|Y]\\right] = \\mathbb{E}[Y p \\mu] = 8 \\lambda p \\mu\n", - "$$" - ] - }, - { - "cell_type": "markdown", - "id": "5cfc1589", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "# Inequalities" - ] - }, - { - "cell_type": "markdown", - "id": "4a6eb617", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "## Inequalities\n", - "\n", - "|Name|Conditions|Formula|Uses|\n", - "|:---:|:---:|:---:|:---:|\n", - "|Cauchy-Schwarz|\\begin{eqnarray}\\mathbb{E}[X^2]<\\infty, \\mathbb{E}[Y^2]<\\infty\\end{eqnarray}|\\begin{eqnarray}|\\mathbb{E}[XY]|\\leqslant\\sqrt{\\mathbb{E}[X^2]\\mathbb{E}[Y^2]}\\end{eqnarray}|Covariance|\n", - "|Jensen|\\begin{eqnarray}\\mathbb{E}[|X|]<\\infty, g - \\text{convex}\\end{eqnarray}|\\begin{eqnarray}g(\\mathbb{E}[X])\\leqslant\\mathbb{E}[g(X)]\\end{eqnarray}|Proofs|\n", - "|Markov|\\begin{eqnarray}\\mathbb{E}[|X|^p]<\\infty, p>0, x>0\\end{eqnarray}|\\begin{eqnarray}\\mathbb{P}(|X| \\geqslant x) \\leqslant \\frac{\\mathbb{E}[|X|^p]}{x^p}\\end{eqnarray}|Tails|\n", - "|Chebyshev|\\begin{eqnarray}\\mathbb{V}\\text{ar}(X)<\\infty, x>0\\end{eqnarray}|\\begin{eqnarray}\\mathbb{P}(|X - \\mathbb{E}[X]| \\geqslant x) \\leqslant \\frac{\\mathbb{V}\\text{ar}(X)}{x^2}\\end{eqnarray}|Tails|\n", - "|Chernoff|\\begin{eqnarray}\\mathbb{E}[e^{tX}]<\\infty,t>0,x>0\\end{eqnarray}|\\begin{eqnarray}\\mathbb{P}(X \\geqslant x) \\leqslant \\frac{\\mathbb{E}[e^{tX}]}{e^{tx}}\\end{eqnarray}|Tails|" - ] - }, - { - "cell_type": "markdown", - "id": "554886b3", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "## Example 1\n", - "\n", - "Let $Z \\sim \\mathcal{N}(0, 1)$. Find $\\mathbb{P}(|Z| > 3)$." 
- ] - }, - { - "cell_type": "markdown", - "id": "98dae5da", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "## Solution 1\n", - "\n", - "Three-sigma rule:\n", - "$$\n", - "\\mathbb{P}(|Z| > 3) \\approx 0.003\n", - "$$" - ] - }, - { - "cell_type": "markdown", - "id": "043ffae9", - "metadata": { - "slideshow": { - "slide_type": "fragment" - } - }, - "source": [ - "Markov:\n", - "$$\n", - "\\mathbb{P}(|Z| > 3) \\leqslant \\frac{\\mathbb{E}[|Z|]}{3} = \\frac{1}{3}\\sqrt{\\frac{2}{\\pi}} \\approx 0.27\n", - "$$" - ] - }, - { - "cell_type": "markdown", - "id": "04d48ef8", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "Chebyshev:\n", - "$$\n", - "\\mathbb{P}(|Z| > 3) \\leqslant \\frac{\\mathbb{V}\\text{ar}(Z)}{3^2} = \\frac19 \\approx 0.11\n", - "$$" - ] - }, - { - "cell_type": "markdown", - "id": "fad57c3f", - "metadata": { - "slideshow": { - "slide_type": "fragment" - } - }, - "source": [ - "Chernoff (the bound holds for every $t > 0$ and is tightest at $t = 3$):\n", - "$$\n", - "\\mathbb{P}(|Z| > 3) = 2 \\mathbb{P}(Z > 3) \\leqslant 2 e^{-3t} \\mathbb{E}[e^{tZ}] = 2 e^{-3t} e^{t^2/2} \\overset{t = 3}{=} 2 e^{-9/2} \\approx 0.022\n", - "$$" - ] - }, - { - "cell_type": "markdown", - "id": "e80c99a2", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "## Problem 2\n", - "\n", - "For i.i.d. r.v.s $X_1,\\ldots,X_n$ with mean $\\mu$ and variance $\\sigma^2$, give a value of $n$ that will ensure that there is at least a 99\\% chance that the sample mean will be within 2 standard deviations of the true mean $\\mu$." 
- ] - }, - { - "cell_type": "markdown", - "id": "9a539ef2", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "## Solution 2\n", - "\n", - "We need $n$ such that\n", - "$$\n", - "\\mathbb{P}(|\\overline{X_n} - \\mu| > 2\\sigma) \\leqslant 0.01\n", - "$$\n", - "\n", - "By Chebyshev's inequality,\n", - "$$\n", - "\\mathbb{P}(|\\overline{X_n} - \\mu| > 2\\sigma) \\leqslant \\frac{\\mathbb{V}\\text{ar}(\\overline{X_n})}{(2\\sigma)^2} = \\frac{\\sigma^2/n}{4\\sigma^2} = \\frac{1}{4n} \\leqslant 0.01 \\iff n \\geqslant 25\n", - "$$" - ] - }, - { - "cell_type": "markdown", - "id": "8b5f954d", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "## Problem 3\n", - "\n", - "In a national survey, a random sample of people are chosen and asked whether they support a certain policy. Assume that everyone in the population is equally likely to be surveyed at each step, and that the sampling is with replacement (sampling without replacement is typically more realistic, but with replacement will be a good approximation if the sample size is small compared to the population size). Let $n$ be the sample size, and let $\\hat{p}$ and $p$ be the proportion of people who support the policy in the sample and in the entire population, respectively. 
Show that for every $c > 0$,\n", - "$$\n", - "\\mathbb{P}(|\\hat{p} - p| > c) \\leqslant \\frac{1}{4nc^2}\n", - "$$" - ] - }, - { - "cell_type": "markdown", - "id": "8fa5b773", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "## Solution 3\n", - "\n", - "Write $\\hat{p}$ as the mean of $n$ i.i.d. Bernoulli($p$) indicators, then apply Chebyshev's inequality." - ] - } - ], - "metadata": { - "celltoolbar": "Slideshow", - "kernelspec": { - "display_name": "Python 3 (ipykernel)", - "language": "python", - "name": "python3" - }, - "language_info": { - "codemirror_mode": { - "name": "ipython", - "version": 3 - }, - "file_extension": ".py", - "mimetype": "text/x-python", - "name": "python", - "nbconvert_exporter": "python", - "pygments_lexer": "ipython3", - "version": "3.8.12" - } - }, - "nbformat": 4, - "nbformat_minor": 5 -} diff --git a/Seminar_materials/Seminar13/Seminar 13 (Inequalities).pdf b/Seminar_materials/Seminar13/Seminar 13 (Inequalities).pdf deleted file mode 100644 index 6eb62af..0000000 Binary files a/Seminar_materials/Seminar13/Seminar 13 (Inequalities).pdf and /dev/null differ diff --git a/Seminar_materials/Seminar14/Seminar 14 (Limit Theorems).ipynb b/Seminar_materials/Seminar14/Seminar 14 (Limit Theorems).ipynb deleted file mode 100644 index eb1d8d0..0000000 --- a/Seminar_materials/Seminar14/Seminar 14 (Limit Theorems).ipynb +++ /dev/null @@ -1,505 +0,0 @@ -{ - "cells": [ - { - "cell_type": "markdown", - "id": "2255baed", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "# Inequalities recap" - ] - }, - { - "cell_type": "markdown", - "id": "b19842ed", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "## Inequalities recap\n", - "\n", - "|Name|Conditions|Formula|Uses|\n", - "|:---:|:---:|:---:|:---:|\n", - "|Cauchy-Schwarz|\\begin{eqnarray}\\mathbb{E}[X^2]<\\infty, 
\\mathbb{E}[Y^2]<\\infty\\end{eqnarray}|\\begin{eqnarray}|\\mathbb{E}[XY]|\\leqslant\\sqrt{\\mathbb{E}[X^2]\\mathbb{E}[Y^2]}\\end{eqnarray}|Covariance|\n", - 
"|Jensen|\\begin{eqnarray}\\mathbb{E}[|X|]<\\infty, x>0, g - \\text{convex}\\end{eqnarray}|\\begin{eqnarray}g(\\mathbb{E}[X])\\leqslant\\mathbb{E}[g(X)]\\end{eqnarray}|Proofs|\n", - "|Markov|\\begin{eqnarray}\\mathbb{E}[|X|^p]<\\infty, p>0, x>0\\end{eqnarray}|\\begin{eqnarray}\\mathbb{P}(|X| \\geqslant x) \\leqslant \\frac{\\mathbb{E}[|X|^p]}{x^p}\\end{eqnarray}|Tails|\n", - "|Chebyshev|\\begin{eqnarray}\\mathbb{V}\\text{ar}(X)<\\infty, x>0\\end{eqnarray}|\\begin{eqnarray}\\mathbb{P}(|X - \\mathbb{E}[X]| \\geqslant x) \\leqslant \\frac{\\mathbb{V}\\text{ar}(X)}{x^2}\\end{eqnarray}|Tails|\n", - "|Chernoff|\\begin{eqnarray}\\mathbb{E}[e^{tX}]<\\infty,t>0,x>0\\end{eqnarray}|\\begin{eqnarray}\\mathbb{P}(X \\geqslant x) \\leqslant \\frac{\\mathbb{E}[e^{tX}]}{e^{tx}}\\end{eqnarray}|Tails|" - ] - }, - { - "cell_type": "markdown", - "id": "5e171c83", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "# Homework problems" - ] - }, - { - "cell_type": "markdown", - "id": "8b5f954d", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "## Problem 1\n", - "\n", - "In a national survey, a random sample of people are chosen and asked whether they support a certain policy. Assume that everyone in the population is equally likely to be surveyed at each step, and that the sampling is with replacement (sampling without replacement is typically more realistic, but with replacement will be a good approximation if the sample size is small compared to the population size). Let $n$ be the sample size, and let $\\hat{p}$ and $p$ be the proportion of people who support the policy in the sample and in the entire population, respectively. 
Show that for every $c > 0$,\n", - "$$\n", - "\\mathbb{P}(|\\hat{p} - p| > c) \\leqslant \\frac{1}{4nc^2}\n", - "$$" - ] - }, - { - "cell_type": "markdown", - "id": "8fa5b773", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "## Solution 1\n", - "\n", - "Let $X \\sim Bin(n, p)$ be the number of sampled people who support the policy. Then,\n", - "$$\n", - "\\hat{p} = \\frac{X}{n}\n", - "$$\n", - "\n", - "Then,\n", - "$$\n", - "\\mathbb{E}[\\hat{p}] = \\frac{1}{n}\\mathbb{E}[X] = \\frac{1}{n}np = p\n", - "$$\n", - "\n", - "$$\n", - "\\mathbb{V}\\text{ar}(\\hat{p}) = \\frac{1}{n^2}\\mathbb{V}\\text{ar}(X) = \\frac{p(1-p)}{n}\n", - "$$\n", - "\n", - "Then we use Chebyshev's inequality\n", - "$$\n", - "\\mathbb{P}(|\\hat{p} - \\mathbb{E}[\\hat{p}]| > c) \\leqslant \\frac{\\mathbb{V}\\text{ar}(\\hat{p})}{c^2}\n", - "$$\n", - "\n", - "Since $p(1-p) \\leqslant \\frac14$ for every $p \\in [0, 1]$,\n", - "$$\n", - "\\mathbb{P}(|\\hat{p} - p| > c) \\leqslant \\frac{p(1-p)}{nc^2} \\leqslant \\frac{1}{4nc^2}\n", - "$$" - ] - }, - { - "cell_type": "markdown", - "id": "fe77da7f", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "# Convergence of random variables" - ] - }, - { - "cell_type": "markdown", - "id": "57401c7b", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "## Convergence of random variables\n", - "\n", - "Consider an infinite sequence of r.v.s $X_1, X_2, \\ldots$\n", - "\n", - "1. $X_n$ converges to $X$ **in probability** ($X_n \\xrightarrow{P} X$) if $\\forall \\varepsilon > 0$\n", - " $$\n", - " \\mathbb{P}(|X_n - X| > \\varepsilon) \\to 0\n", - " $$\n", - "2. $X_n$ converges to $X$ **almost surely** ($X_n \\xrightarrow{\\text{a.s.}} X$) if\n", - " $$\n", - " \\mathbb{P}(X_n \\to X) = 1\n", - " $$\n", - "3. 
$X_n$ converges to $X$ **in mean square** ($X_n \\xrightarrow{\\text{m.s.}} X$) if\n", - " $$\n", - " \\mathbb{E}\\left[(X_n - X)^2\\right] \\to 0\n", - " $$\n", - "4. 
$X_n$ converges to $X$ **in distribution** ($X_n \\xrightarrow{d} X$) if for any bounded continuous function $\\varphi$\n", - " $$\n", - " \\mathbb{E}[\\varphi(X_n)] \\to \\mathbb{E}[\\varphi(X)]\n", - " $$" - ] - }, - { - "cell_type": "markdown", - "id": "958619c9", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "## Convergence in distribution, detailed\n", - "\n", - "- $X_n \\xrightarrow{d} X$ is equivalent to $F_{X_n}(x) \\to F_X(x)$ at every point $x$ where $F_X$ is continuous\n", - "- If the r.v.s take only integer values, then $X_n \\xrightarrow{d} X$ is equivalent to $\\mathbb{P}(X_n = k) \\to \\mathbb{P}(X = k)$ for every $k$\n", - "- $X_n \\xrightarrow{d} X$ is equivalent to convergence of characteristic functions (and of PGFs or MGFs, when they exist)" - ] - }, - { - "cell_type": "markdown", - "id": "b8b18466", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "## Example 1\n", - "\n", - "Let $X_n \\sim Bin(n, p_n)$ and $np_n \\to \\lambda$ as $n \\to \\infty$. Show that $X_n \\xrightarrow{d} X \\sim Pois(\\lambda)$." 
- ] - }, - { - "cell_type": "markdown", - "id": "af91ee58", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "## Solution 1\n", - "\n", - "We need to show $\\mathbb{P}(X_n = k) \\to \\mathbb{P}(X = k)$. For simplicity, take $p_n = \\lambda / n$.\n", - "\n", - "$$\n", - "\\begin{aligned}\n", - "\\mathbb{P}(X_n = k) & = \\begin{pmatrix}n\\\\k\\end{pmatrix} p_n^k (1 - p_n)^{n-k} = \\frac{n(n-1)\\ldots (n-k+1)}{k!} \\left( \\frac{\\lambda}{n} \\right)^k \\left( 1 - \\frac{\\lambda}{n} \\right)^n \\left( 1- \\frac{\\lambda}{n} \\right)^{-k} = \\\\\n", - "& = \\left[ \\frac{\\lambda^k}{k!} \\underbrace{\\left(1- \\frac{\\lambda}{n} \\right)^n}_{\\to e^{-\\lambda}} \\right] \\underbrace{\\left( 1 - \\frac{\\lambda}{n} \\right)^{-k}}_{\\to 1} \\underbrace{\\frac{n(n-1)\\ldots (n-k+1)}{n^k}}_{\\to 1} \\to \\frac{\\lambda^k}{k!} e^{-\\lambda} = \\\\\n", - "& = \\mathbb{P}(X = k)\n", - "\\end{aligned}\n", - "$$" - ] - }, - { - "cell_type": "markdown", - "id": "ca3e36dc", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "## Relations between types of convergence\n", - "\n", - "1. $X_n \\xrightarrow{P} X \\Rightarrow X_n \\xrightarrow{d} X$\n", - "2. $X_n \\xrightarrow{\\text{a.s.}} X \\Rightarrow X_n \\xrightarrow{P} X$\n", - "3. $X_n \\xrightarrow{\\text{m.s.}} X \\Rightarrow X_n \\xrightarrow{P} X$\n", - "4. If $\\{X_n\\}$ is monotonic a.s. ($X_{n+1} \\geqslant X_n$ a.s.), then $X_n \\xrightarrow{P} X \\Rightarrow X_n \\xrightarrow{\\text{a.s.}} X$\n", - "5. If $X_n$ is uniformly bounded ($|X_n| < a$ a.s.), then $X_n \\xrightarrow{P} X \\Rightarrow X_n \\xrightarrow{\\text{m.s.}} X$\n", - "6. 
$X_n \\xrightarrow{d} C \\Rightarrow X_n \\xrightarrow{P} C$\n", - "\n", - "" - ] - }, - { - "cell_type": "markdown", - "id": "3899d533", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "# Limit theorems" - ] - }, - { - "cell_type": "markdown", - "id": "05ed877f", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "## Limit theorems\n", - "\n", - "We will discuss the two most famous theorems in probability:\n", - "- the law of large numbers (LLN)\n", - "- the central limit theorem (CLT)\n", - "\n", - "Both tell us what happens to the **sample mean** $\\overline{X_n} = \\frac1n \\sum_{i=1}^n X_i$ as we obtain more and more data $n \\to \\infty$.\n", - "\n", - "Limit theorems let us make approximations which are likely to work well when we have a large number of data points." - ] - }, - { - "cell_type": "markdown", - "id": "c837a36f", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "## Law of large numbers\n", - "\n", - "Let $X_1, X_2, \\ldots, X_n$ be i.i.d. r.v.s with $\\mathbb{E}[X_i] = m$ and $\\mathbb{V}\\text{ar}(X) = \\sigma^2$. Then,\n", - "$$\n", - "\\mathbb{E}\\left[ \\left( \\frac1n \\sum_{i=1}^n X_i - m \\right)^2 \\right] \\to 0\n", - "$$\n", - "\n", - "So, LLN says that\n", - "$$\n", - "\\frac1n \\sum_{i=1}^n X_i \\xrightarrow{\\text{m.s.}} m\n", - "$$\n", - "\n", - "We know that m.s. convergence implies $P$-convergence, so\n", - "$$\n", - "\\frac1n \\sum_{i=1}^n X_i \\xrightarrow{P} m\n", - "$$" - ] - }, - { - "cell_type": "markdown", - "id": "f88bd73b", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "## Law of large numbers\n", - "\n", - "1. (Chebyshev form) The i.i.d. part can be relaxed to $\\operatorname{cov}(X_i, X_j) \\leqslant 0$\n", - "2. (Khinchin form) The existence of variance can be relaxed, but then we only have convergence in probability\n", - "3. Both parts can not be relaxed at the same time\n", - "4. 
Existence of expectation is essential" - ] - }, - { - "cell_type": "markdown", - "id": "5b3078a3", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "## Strong law of large numbers\n", - "\n", - "Let $X_1, X_2, \\ldots, X_n$ be i.i.d. r.v.s with $\\mathbb{E}[X_i] = m$. Then,\n", - "$$\n", - "\\frac1n \\sum_{i=1}^n X_i \\xrightarrow{\\text{a.s.}} m\n", - "$$" - ] - }, - { - "cell_type": "markdown", - "id": "239593c2", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "## Example 2\n", - "\n", - "Consider you have $K_1$ money, that you want to multiply. You have two possibilities to do that:\n", - "- to put the money into the bank that guarantees $b \\cdot 100\\%$ yearly interest\n", - "- to invest into stocks that will earn you $X \\cdot 100\\%$ yearly interest, where $X$ is an r.v. with expected value $m_X > b > 0$\n", - "\n", - "Denote the fraction of money put into the bank as $u$ and the fraction of money invested into stocks $v$, such that $0 \\leqslant u + v \\leqslant 1$. This way, in a year, you will have\n", - "$$\n", - "K_2 = K_1 (1 + bu + Xv)\n", - "$$\n", - "\n", - "Obviously, the strategy maximizing the expected income is to set $u = 0, v = 1$. 
Show that under this strategy\n", - "- $\\mathbb{E}[K_t] \\to \\infty$\n", - "- If $\\mathbb{E}[\\log (1 + X)] < 0$, then $K_t \\xrightarrow{\\text{a.s.}} 0$" - ] - }, - { - "cell_type": "markdown", - "id": "125d6736", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "## Solution 2.1\n", - "\n", - "$$\n", - "\\mathbb{E}[K_{t+1}] = \\mathbb{E}[K_t (1 + X_t)] = \\mathbb{E}[K_1 \\prod_{i=1}^t (1 + X_i)] = K_1 (1 + m_X)^t \\to \\infty\n", - "$$" - ] - }, - { - "cell_type": "markdown", - "id": "d6e959c1", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "## Solution 2.2\n", - "\n", - "Consider change of variables\n", - "$$\n", - "K_{t+1} = K_1 e^{tY_t}\n", - "$$\n", - "where\n", - "$$\n", - "Y_t = \\frac1t \\sum_{i=1}^t \\log(1+ X_i)\n", - "$$\n", - "\n", - "Apply the SLLN to $\\log(1+ X_i)$, then\n", - "$$\n", - "\\frac1t \\sum_{i=1}^t \\log(1+ X_i) \\xrightarrow{\\text{a.s.}} \\mathbb{E}[\\log(1+ X)] = - a < 0\n", - "$$\n", - "\n", - "Then $tY_t \\to -\\infty$ a.s., so\n", - "$$\n", - "K_{t+1} = K_1 e^{tY_t} \\xrightarrow{\\text{a.s.}} 0\n", - "$$" - ] - }, - { - "cell_type": "markdown", - "id": "f79080c9", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "## Central limit theorem\n", - "\n", - "Let $X_1, X_2, \\ldots, X_n$ be i.i.d. r.v.s, $\\mathbb{E}[X] = \\mu$, $\\mathbb{V}\\text{ar}(X) = \\sigma^2$. Then,\n", - "$$\n", - "Z_n = \\frac{\\sum_{i=1}^n X_i - n\\mu}{\\sqrt{n \\sigma^2}} \\xrightarrow{d} \\mathcal{N}(0, 1)\n", - "$$\n", - "\n", - "If additionally $X_i$ are continuous, then\n", - "$$\n", - "f_{Z_n}(x) \\to \\frac{1}{\\sqrt{2\\pi}} e^{-x^2/2}\n", - "$$\n", - "\n", - "It works for random vectors as well. Let $\\mathbf{X_1}, \\mathbf{X_2}, \\ldots, \\mathbf{X_n}$ be i.i.d. with $\\mathbb{E}[\\mathbf{X}] = \\mathbf{m}$ and covariance matrix $\\Sigma$. 
Then,\n", - "$$\n", - "\\frac{\\sum_{i=1}^n (\\mathbf{X_i} - \\mathbf{m})}{\\sqrt{n}} \\xrightarrow{d} \\mathcal{N}(0, \\Sigma)\n", - "$$" - ] - }, - { - "cell_type": "markdown", - "id": "79a0f686", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "## Example 3\n", - "\n", - "Two parties $L$ and $D$ participate in an election. $L$ gets $N_L = 520000$ votes, $D$ gets $N_D = 480000$ votes. It turns out that the ballot-counting machine was misconfigured and randomly flipped each vote to the opposite one with probability $p = 0.45$. Find the probability that the counted number of votes for $L$ is at least $N_L$ if the actual votes were split equally: $N_{0L} = N_{0D} = 500000$." - ] - }, - { - "cell_type": "markdown", - "id": "1a4cb1e7", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "## Solution 3\n", - "\n", - "Consider r.v. $X_i \\sim Be(p)$ - indicator of error $D \\to L$, and $Y_i \\sim Be(p)$ - indicator of error $L \\to D$. 
Then, the votes for $L$ are\n", - "$$\n", - "Z = \\sum_{i=1}^{N_{0D}} X_i + \\sum_{j=1}^{N_{0L}} (1 - Y_j)\n", - "$$\n", - "\n", - "By CLT,\n", - "$$\n", - "\\frac{Z - \\mathbb{E}[Z]}{\\sqrt{2 N_{0D}\\mathbb{V}\\text{ar}(X_i)}} \\sim \\mathcal{N}(0, 1)\n", - "$$\n", - "\n", - "$$\n", - "\\mathbb{E}[Z] = N_{0D} p + N_{0L} (1-p) = N_{0D}\n", - "$$\n", - "\n", - "$$\n", - "\\mathbb{V}\\text{ar}(X_i) = p (1 - p)\n", - "$$\n", - "\n", - "Then,\n", - "$$\n", - "\\mathbb{P}(Z \\geqslant N_L) = \\mathbb{P}\\left(\\frac{Z - N_{0D}}{\\sqrt{2 N_{0D} p (1 - p)}} \\geqslant \\frac{N_L - N_{0D}}{\\sqrt{2 N_{0D} p (1 - p)}}\\right) \\approx \\mathbb{P}(V \\geqslant a)\n", - "$$" - ] - }, - { - "cell_type": "markdown", - "id": "8be46233", - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "source": [ - "## Solution 3\n", - "\n", - "$$\n", - "\\mathbb{P}(Z \\geqslant N_L) = \\mathbb{P}\\left(\\frac{Z - N_{0D}}{\\sqrt{2 N_{0D} p (1 - p)}} \\geqslant \\frac{N_L - N_{0D}}{\\sqrt{2 N_{0D} p (1 - p)}}\\right) \\approx \\mathbb{P}(V \\geqslant a)\n", - "$$\n", - "where $V \\sim \\mathcal{N}(0, 1)$ and\n", - "$$\n", - "a = \\frac{N_L - N_{0D}}{\\sqrt{2 N_{0D} p (1 - p)}} \\approx 40\n", - "$$\n", - "\n", - "$$\n", - "\\mathbb{P}(Z \\geqslant N_L) \\approx 10^{-350}\n", - "$$" - ] - } - ], - "metadata": { - "celltoolbar": "Slideshow", - "kernelspec": { - "display_name": "Python 3 (ipykernel)", - "language": "python", - "name": "python3" - }, - "language_info": { - "codemirror_mode": { - "name": "ipython", - "version": 3 - }, - "file_extension": ".py", - "mimetype": "text/x-python", - "name": "python", - "nbconvert_exporter": "python", - "pygments_lexer": "ipython3", - "version": "3.9.13" - } - }, - "nbformat": 4, - "nbformat_minor": 5 -} diff --git a/Seminar_materials/Seminar14/Seminar 14 (Limit Theorems).pdf b/Seminar_materials/Seminar14/Seminar 14 (Limit Theorems).pdf deleted file mode 100644 index 209ce5b..0000000 Binary files a/Seminar_materials/Seminar14/Seminar 14 
(Limit Theorems).pdf and 
/dev/null differ diff --git a/Seminar_materials/Seminar14/convergence_relations.png b/Seminar_materials/Seminar14/convergence_relations.png deleted file mode 100644 index 9afdb50..0000000 Binary files a/Seminar_materials/Seminar14/convergence_relations.png and /dev/null differ diff --git a/Seminar_materials/seminar_01/Seminar 1 (Introduction).html b/Seminar_materials/seminar_01/Seminar 1 (Introduction).html new file mode 100644 index 0000000..9718aa9 --- /dev/null +++ b/Seminar_materials/seminar_01/Seminar 1 (Introduction).html @@ -0,0 +1,7982 @@ + + + + + +Seminar 1 (Introduction) + + + + + + + + + + + + +
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ + diff --git a/Seminar_materials/seminar_01/Seminar 1 (Introduction).ipynb b/Seminar_materials/seminar_01/Seminar 1 (Introduction).ipynb new file mode 100755 index 0000000..e745412 --- /dev/null +++ b/Seminar_materials/seminar_01/Seminar 1 (Introduction).ipynb @@ -0,0 +1,625 @@ +{ + "cells": [ + { + "cell_type": "code", + "execution_count": 1, + "id": "c622c935-0b83-4b18-a608-25e27ab27c7d", + "metadata": { + "editable": true, + "slideshow": { + "slide_type": "skip" + }, + "tags": [] + }, + "outputs": [], + "source": [ + "import numpy as np\n", + "import pandas as pd\n", + "\n", + "import matplotlib.pyplot as plt\n", + "\n", + "%matplotlib inline" + ] + }, + { + "cell_type": "code", + "execution_count": 2, + "id": "2a8aaab5-a8da-466d-9fce-9df3e0879985", + "metadata": { + "editable": true, + "slideshow": { + "slide_type": "skip" + }, + "tags": [] + }, + "outputs": [], + "source": [ + "from scipy.special import factorial" + ] + }, + { + "cell_type": "markdown", + "id": "ed2cd65f", + "metadata": { + "editable": true, + "slideshow": { + "slide_type": "slide" + }, + "tags": [] + }, + "source": [ + "# Seminar 1" + ] + }, + { + "cell_type": "markdown", + "id": "3fba3091-9c11-4435-bb22-a755fd5f982d", + "metadata": { + "editable": true, + "slideshow": { + "slide_type": "slide" + }, + "tags": [] + }, + "source": [ + "## Seminar outlook\n", + "\n", + "- I introduce myself\n", + "- You introduce yourselves\n", + "- Administrative announcements\n", + "- Counting rules and naive definition of probability" + ] + }, + { + "cell_type": "markdown", + "id": "632282a7", + "metadata": { + "editable": true, + "slideshow": { + "slide_type": "slide" + }, + "tags": [] + }, + "source": [ + "## Let's get to know each other\n", + "\n", + "- My name is Nikolai Stulov\n", + "- I myself got my BSc from MIPT\n", + "- Got my MSc from HSE and Skoltech\n", + "- 6 years of experience as Data Scientist and ML researcher\n", + "- 4th time teaching Probability at MSAI\n", + "\n", + "I will be happy to 
hear short introductions from you! (but it's OK if you don't want to)" + ] + }, + { + "cell_type": "markdown", + "id": "34821b99-a06f-4c27-869c-8ab26fe622a9", + "metadata": { + "editable": true, + "slideshow": { + "slide_type": "slide" + }, + "tags": [] + }, + "source": [ + "## Course outlook\n", + "\n", + "- Lectures will be distributed several days in advance (perhaps Monday), it's best to watch the lecture before the webinar\n", + "- The webinars are mostly delivered in Jupyter notebooks, notebooks will be distributed via GitHub\n", + "- You can get a maximum of 10 points for the course. Graded activities: homework, project, exam.\n", + "- Homework will be weekly with 2 weeks time to solve. Each homework includes regular and bonus problems. Homeworks are distributed in Telegram, collected with Google Forms.\n", + "- Fully solved regular problems in all homeworks will give you 7 points for the course. Bonus problems will give you additional 2 points for the course. If AI assistants are used, it must be explicitly stated and prompt must be provided.\n", + "- A project is a take-home literature review + coding assignment. Project will give you additional 2 points for the course, but is not compulsory for anyone.\n", + "- Exam is not compulsory if you get more than 3 points with homeworks. If you get 3 or less, you must take exam. Exam can give you a maximum of 3 points." + ] + }, + { + "cell_type": "markdown", + "id": "5756efe3", + "metadata": { + "editable": true, + "slideshow": { + "slide_type": "slide" + }, + "tags": [] + }, + "source": [ + "## Counting rules" + ] + }, + { + "cell_type": "markdown", + "id": "19acb677", + "metadata": { + "editable": true, + "slideshow": { + "slide_type": "slide" + }, + "tags": [] + }, + "source": [ + "### Problem 1 (entrance exam)\n", + "\n", + "Russian car plate consists of three letters and three digits. Any digits are permitted, but the only permitted letters are the ones that have English-lookalikes. 
How many car plates are possible in one region?" + ] + }, + { + "cell_type": "markdown", + "id": "6320db58", + "metadata": { + "editable": true, + "slideshow": { + "slide_type": "slide" + }, + "tags": [] + }, + "source": [ + "### Solution 1\n", + "\n", + "How many letters are there?" + ] + }, + { + "cell_type": "markdown", + "id": "74e8cc12-3ed6-4b9c-a089-8dcadd36de71", + "metadata": { + "editable": true, + "slideshow": { + "slide_type": "fragment" + }, + "tags": [] + }, + "source": [ + "A, B, C, E, H, K, M, O, P, T, X, Y - total 12 letters. We choose 3 letters from them. Do we sample with or without replacement?" + ] + }, + { + "cell_type": "markdown", + "id": "7dc90a61-e216-4dd5-8d8d-cc13625cc2c2", + "metadata": { + "editable": true, + "slideshow": { + "slide_type": "fragment" + }, + "tags": [] + }, + "source": [ + "Using sampling with replacement, because there are no restrictions on repetitions of letters or digits:\n", + "- We choose three of ten digits: $10^3$\n", + "- We choose three of twelve letters: $12^3$" + ] + }, + { + "cell_type": "markdown", + "id": "2ac7b04d-7978-4789-a32c-1cea83bf2de8", + "metadata": { + "editable": true, + "slideshow": { + "slide_type": "fragment" + }, + "tags": [] + }, + "source": [ + "Since the choice of the digits and the letters is independent, the total number of plates is therefore $10^3 \\cdot 12^3 = 1728000$." + ] + }, + { + "cell_type": "markdown", + "id": "8913f59a", + "metadata": { + "editable": true, + "slideshow": { + "slide_type": "slide" + }, + "tags": [] + }, + "source": [ + "### Problem 2 (entrance exam)\n", + "How many 7-digit phone numbers are possible, assuming that the first digit can’t be a 0 or a 1?" 
+ ] + }, + { + "cell_type": "markdown", + "id": "993204d7-65d3-41c5-b4a9-1a6080bfa6d8", + "metadata": { + "editable": true, + "slideshow": { + "slide_type": "slide" + }, + "tags": [] + }, + "source": [ + "### Solution 2\n", + "\n", + "We independently choose each digit. Do we sample with or without replacement?" 
+ ] + }, + { + "cell_type": "markdown", + "id": "2ec2c737-871d-47eb-aacf-de76a7b657d2", + "metadata": { + "editable": true, + "slideshow": { + "slide_type": "fragment" + }, + "tags": [] + }, + "source": [ + "Using sampling with replacement, because there are no restrictions on repetitions of numbers:\n", + "- We choose the first digit from reduced set of 8 digits: $8$\n", + "- We choose the rest 6 digits: $10^6$" + ] + }, + { + "cell_type": "markdown", + "id": "4ac289e8-0676-4b2c-a179-b978d4a46c2b", + "metadata": { + "editable": true, + "slideshow": { + "slide_type": "fragment" + }, + "tags": [] + }, + "source": [ + "The total number of phone numbers is therefore $8 \\cdot 10^6$." + ] + }, + { + "cell_type": "markdown", + "id": "a5d89ade-8a67-4e29-8c3f-1ec458012277", + "metadata": {}, + "source": [ + "$$\n", + "n \\cdot (n - 1) \\cdot \\ldots \\cdot (n - k + 1)\n", + "$$" + ] + }, + { + "cell_type": "markdown", + "id": "124fbdd2-9a29-4287-8dcc-059a5e4d48d7", + "metadata": {}, + "source": [ + "$$\n", + "n = k \\Rightarrow n \\cdot (n - 1) \\cdot \\ldots \\cdot (n - n + 1) = n!\n", + "$$" + ] + }, + { + "cell_type": "markdown", + "id": "182aa1e5", + "metadata": { + "editable": true, + "slideshow": { + "slide_type": "slide" + }, + "tags": [] + }, + "source": [ + "### Problem 3\n", + "How many paths are there from the point (0,0) to the point (110,111) in the plane such that each step either consists of going one unit up or one unit to the right?" + ] + }, + { + "cell_type": "markdown", + "id": "dd2bccb8-3184-4e4f-9e5a-4a3a9133b386", + "metadata": { + "editable": true, + "slideshow": { + "slide_type": "slide" + }, + "tags": [] + }, + "source": [ + "### Solution 3\n", + "\n", + "We will encode a path as a sequence of letters $U$ (for up step) and $R$ (for right step), like $URURURU\\ldots UURUR$. How many $R$s and $U$s will be in the complete sequence?" 
+ ] + }, + { + "cell_type": "markdown", + "id": "02d1f71d-41de-4dd3-8c31-0d6f19ca4618", + "metadata": { + "editable": true, + "slideshow": { + "slide_type": "fragment" + }, + "tags": [] + }, + "source": [ + "The sequence must consist of 110 $R$s and 111 $U$s, because we need to get from 0 to 110 horizontally by only moving right and from 0 to 111 vertically by only moving up." + ] + }, + { + "cell_type": "markdown", + "id": "c5bbb80d", + "metadata": { + "editable": true, + "slideshow": { + "slide_type": "fragment" + }, + "tags": [] + }, + "source": [ + "Let's use the factorial rule: the number of shuffles of this $UR$ sequence is $(110+111)! = 221!$. Is it correct?" + ] + }, + { + "cell_type": "markdown", + "id": "09d469d4", + "metadata": { + "editable": true, + "slideshow": { + "slide_type": "fragment" + }, + "tags": [] + }, + "source": [ + "It is not correct: swapping two $R$s (or two $U$s) with each other does not change the path, but $221!$ counts such rearrangements as different sequences. We need to **adjust for overcounting**.\n", + "\n", + "Each path was counted multiple times. 
To fix that, we divide by the number of ways to reorder the $R$s among themselves ($110!$) and the $U$s among themselves ($111!$), and this gives the correct answer:\n", + "\n", + "$$\\frac{221!}{110!111!}$$" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "e41ddb89-18e4-425b-844e-23387545ab73", + "metadata": { + "editable": true, + "slideshow": { + "slide_type": "fragment" + }, + "tags": [] + }, + "outputs": [], + "source": [ + "from math import comb # exact integers; the float-based factorial(221) overflows to nan\n", + "comb(221, 110)" + ] + }, + { + "cell_type": "markdown", + "id": "08b20e9f-8dee-4851-82e6-64c9b096aaca", + "metadata": { + "editable": true, + "slideshow": { + "slide_type": "subslide" + }, + "tags": [] + }, + "source": [ + "Why didn't we overcount previously?" + ] + }, + { + "cell_type": "markdown", + "id": "8cbac97c-655d-4734-8d12-ce15c34c96ee", + "metadata": { + "editable": true, + "slideshow": { + "slide_type": "fragment" + }, + "tags": [] + }, + "source": [ + "Because we didn't use the number-of-shuffles formula $n!$, which assumes that the objects are distinguishable." 
+ ] + }, + { + "cell_type": "markdown", + "id": "bfae1a78", + "metadata": { + "editable": true, + "slideshow": { + "slide_type": "slide" + }, + "tags": [] + }, + "source": [ + "## Naive definition" + ] + }, + { + "cell_type": "markdown", + "id": "16b79717-51aa-43e6-801f-5e0ac1386c57", + "metadata": { + "editable": true, + "slideshow": { + "slide_type": "slide" + }, + "tags": [] + }, + "source": [ + "### Problem 4 (entrance exam)\n", + "\n", + "A child is playing with cubes with letters A, A, C, E, H, I, K, M, M, S, T, T. What is the probability that a random ordering of the cubes in one line will form the word MATHEMATICS?" + ] + }, + { + "cell_type": "markdown", + "id": "b814bb58-6a28-4c5d-9460-93899980d443", + "metadata": { + "editable": true, + "slideshow": { + "slide_type": "slide" + }, + "tags": [] + }, + "source": [ + "### Solution 4\n", + "\n", + "A, A, C, E, H, I, K, M, M, S, T, T - total 12 letters.\n", + "\n", + "Let's count the number of favorable cases: 2 ways to get an M, 2 ways to get an A, 2 ways to get a T, one way each to get H and E, one remaining way to get the second M, and one way for every letter after that. Multiplying, we get $2 \\cdot 2 \\cdot 2 \\cdot 1 \\cdot 1 \\cdot 1 \\cdot \\ldots = 2^3$.\n", + "\n", + "Let's count the total number of cases: it is $12!$. Does this mean our answer is\n", + "$$\n", + "\\frac{2^3}{12!}\n", + "$$" + ] + }, + { + "cell_type": "markdown", + "id": "db2fcd4c-2cd5-447c-9ccb-2ccab725c23e", + "metadata": { + "editable": true, + "slideshow": { + "slide_type": "fragment" + }, + "tags": [] + }, + "source": [ + "No, it does not. We overcounted the total number of cases, because we counted as different the orderings that differ only in the positions of identical letters. 
We must get rid of the overcounting:\n", + "$$\n", + "\\frac{2^3}{\\frac{12!}{2!2!2!}} = \\frac{2^6}{12!}\n", + "$$" + ] + }, + { + "cell_type": "code", + "execution_count": 4, + "id": "e069ccd9-03f9-44d4-b561-5e4d04cfdf58", + "metadata": { + "editable": true, + "slideshow": { + "slide_type": "fragment" + }, + "tags": [] + }, + "outputs": [ + { + "data": { + "text/plain": [ + "1.3361124472235584e-07" + ] + }, + "execution_count": 4, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "2 ** 6 / factorial(12)" + ] + }, + { + "cell_type": "markdown", + "id": "fc86e8db", + "metadata": { + "editable": true, + "slideshow": { + "slide_type": "slide" + }, + "tags": [] + }, + "source": [ + "### Problem 5 (entrance exam)\n", + "A city with 6 districts has 6 robberies in a particular week. Assume the robberies are located randomly, with all possibilities for which robbery occurred where equally likely. What is the probability that some district had more than 1 robbery?" + ] + }, + { + "cell_type": "markdown", + "id": "c0d028d6-e602-4346-8e41-f185bdf75d62", + "metadata": {}, + "source": [ + "$$\n", + "\\frac{neg}{all} = \\frac{all - fav}{all} = 1 - \\frac{fav}{all}\n", + "$$" + ] + }, + { + "cell_type": "markdown", + "id": "cf105e4e", + "metadata": { + "editable": true, + "slideshow": { + "slide_type": "slide" + }, + "tags": [] + }, + "source": [ + "### Solution 5\n", + "\n", + "We will compute the probability of the complement.\n", + "\n", + "- All cases: There are $6^6$ possible configurations for which robbery occurred where.\n", + "- Favorable cases: There are $6!$ configurations where each district had exactly 1 of the 6.\n", + "\n", + "So the probability of the complement of the desired event is $6!/6^6$.\n", + "\n", + "Finally, the probability of some district having more than 1 robbery is $1 - 6!/6^6$." 
+ ] + }, + { + "cell_type": "code", + "execution_count": 5, + "id": "5dc1948a", + "metadata": { + "editable": true, + "slideshow": { + "slide_type": "fragment" + }, + "tags": [] + }, + "outputs": [ + { + "data": { + "text/plain": [ + "0.9845679012345679" + ] + }, + "execution_count": 5, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "1 - factorial(6) / (6 ** 6)" + ] + } + ], + "metadata": { + "celltoolbar": "Slideshow", + "kernelspec": { + "display_name": "Python 3 (ipykernel)", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.12.0" + } + }, + "nbformat": 4, + "nbformat_minor": 5 +} diff --git a/Seminar_materials/seminar_01/Seminar 1 (Introduction).pdf b/Seminar_materials/seminar_01/Seminar 1 (Introduction).pdf new file mode 100644 index 0000000..2321531 Binary files /dev/null and b/Seminar_materials/seminar_01/Seminar 1 (Introduction).pdf differ diff --git a/Seminar_materials/seminar_01/Seminar 1 (Introduction).slides.html b/Seminar_materials/seminar_01/Seminar 1 (Introduction).slides.html new file mode 100644 index 0000000..44b6ec6 --- /dev/null +++ b/Seminar_materials/seminar_01/Seminar 1 (Introduction).slides.html @@ -0,0 +1,7959 @@ + + + + + + + +Seminar 1 (Introduction) slides + + + + + + + + + + + + + + + + + +
+ + + diff --git a/Seminar_materials/seminar_02/Seminar 2 (Definition of probability).html b/Seminar_materials/seminar_02/Seminar 2 (Definition of probability).html new file mode 100644 index 0000000..fb00cba --- /dev/null +++ b/Seminar_materials/seminar_02/Seminar 2 (Definition of probability).html @@ -0,0 +1,8124 @@ + + + + + +Seminar 2 (Definition of probability) + + + + + + + + + + + + +
+ + diff --git a/Seminar_materials/seminar_02/Seminar 2 (Definition of probability).ipynb b/Seminar_materials/seminar_02/Seminar 2 (Definition of probability).ipynb new file mode 100755 index 0000000..ffeb7f5 --- /dev/null +++ b/Seminar_materials/seminar_02/Seminar 2 (Definition of probability).ipynb @@ -0,0 +1,564 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "id": "1cac9fc4", + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "source": [ + "# Seminar 2" + ] + }, + { + "cell_type": "code", + "execution_count": 1, + "id": "605f0b9f", + "metadata": { + "editable": true, + "slideshow": { + "slide_type": "skip" + }, + "tags": [] + }, + "outputs": [ + { + "name": "stderr", + "output_type": "stream", + "text": [ + "/var/folders/33/j0cl7y453td68qb96j7bqcj4cf41kc/T/ipykernel_40555/3212562443.py:8: DeprecationWarning: `set_matplotlib_formats` is deprecated since IPython 7.23, directly use `matplotlib_inline.backend_inline.set_matplotlib_formats()`\n", + " dp.set_matplotlib_formats(\"retina\")\n" + ] + } + ], + "source": [ + "import numpy as np\n", + "import pandas as pd\n", + "\n", + "import IPython.display as dp\n", + "import matplotlib.pyplot as plt\n", + "import seaborn as sns\n", + "\n", + "dp.set_matplotlib_formats(\"retina\")\n", + "sns.set(style=\"whitegrid\", font_scale=1.5)\n", + "sns.despine()\n", + "\n", + "%matplotlib inline" + ] + }, + { + "cell_type": "markdown", + "id": "41f5d8ec-69f9-44ef-8e70-8b1b1d0766b3", + "metadata": { + "editable": true, + "slideshow": { + "slide_type": "slide" + }, + "tags": [] + }, + "source": [ + "## Last time\n", + "\n", + "- Product rule for sampling with replacement when order matters\n", + "- Factorial rule for sampling without replacement when order matters\n", + "- Naive definition of probability\n", + "- Finding the probability of complement" + ] + }, + { + "cell_type": "markdown", + "id": "3b6cbe26-1a99-4df7-b756-20e0691b352c", + "metadata": { + "editable": true, + "slideshow": { + "slide_type": 
"slide" + }, + "tags": [] + }, + "source": [ + "## This time\n", + "\n", + "- Binomial coefficient for sampling without replacement when order doesn't matter\n", + "- \"Stars and bars\" for sampling with replacement where the order doesn't matter\n", + "- Axiomatic definition of probability" + ] + }, + { + "cell_type": "markdown", + "id": "f525ffce-cfaa-4907-830a-64a1c1003bca", + "metadata": { + "editable": true, + "slideshow": { + "slide_type": "slide" + }, + "tags": [] + }, + "source": [ + "## Binomial coefficient\n", + "\n", + "A binomial coefficient counts the number of subsets of a certain size for a set, such as the number of ways to choose a committee of size $k$ from a set of $n$ people. Sets and subsets are by definition unordered, e.g., $\\{3, 1, 4\\} = \\{4, 1, 3\\}$, so we are counting the number of ways to choose $k$ objects out of $n$, without replacement and without distinguishing between the different orders in which they could be chosen.\n", + "\n", + "For any nonnegative integers $k$ and $n$, the binomial coefficient $\\begin{pmatrix}n\\\\k\\end{pmatrix}$, read as \"$n$ choose $k$\", is the number of subsets of size $k$ for a set of size $n$. For $ k \\leqslant n$,\n", + "\n", + "$$\n", + "\\begin{pmatrix}n\\\\k\\end{pmatrix}=\\frac{n!}{k!(n-k)!}\n", + "$$" + ] + }, + { + "cell_type": "markdown", + "id": "3d2bbfbb-4ef7-4b69-a252-e2dc50da8d25", + "metadata": { + "editable": true, + "slideshow": { + "slide_type": "slide" + }, + "tags": [] + }, + "source": [ + "### Problem 1\n", + "\n", + "How many paths are there from the point (0,0) to the point (110,111) in the plane such that each step either consists of going one unit up or one unit to the right?" 
+ ] + }, + { + "cell_type": "markdown", + "id": "f1477908-73e1-4159-96da-c25df7a1d14f", + "metadata": { + "editable": true, + "slideshow": { + "slide_type": "slide" + }, + "tags": [] + }, + "source": [ + "### Solution 1\n", + "\n", + "Reminder: We will encode a path as a sequence of letters $U$ (for up step) and $R$ (for right step), like $URURURU\\ldots UURUR$. The sequence must consist of 110 $R$s and 111 $U$s.\n", + "\n", + "Note that to fully describe the sequence we actually only need to specify where the $R$s are located. This means choosing 110 of the 221 positions, which is exactly what the binomial coefficient counts. So there are $\\begin{pmatrix}110+111\\\\110\\end{pmatrix}$ possible paths." + ] + }, + { + "cell_type": "markdown", + "id": "83c515e1-5aa2-4481-9a96-01584d1d8a6b", + "metadata": { + "editable": true, + "slideshow": { + "slide_type": "slide" + }, + "tags": [] + }, + "source": [ + "### Problem 2\n", + "\n", + "How many ways are there to split a dozen people into 3 teams, where each team has 4 people?" + ] + }, + { + "cell_type": "markdown", + "id": "5906936b-1bd5-42d0-b649-883bc1e524d2", + "metadata": { + "editable": true, + "slideshow": { + "slide_type": "slide" + }, + "tags": [] + }, + "source": [ + "### Solution 2\n", + "\n", + "Let's pick the first team, then pick the second, and declare the remaining people the third team.\n", + "\n", + "This gives us $\\begin{pmatrix}12\\\\4\\end{pmatrix}\\cdot\\begin{pmatrix}8\\\\4\\end{pmatrix}$ possibilities. Is it correct?" + ] + }, + { + "cell_type": "markdown", + "id": "5bd8b72a-1a0b-42c1-beb6-b22a31483786", + "metadata": { + "editable": true, + "slideshow": { + "slide_type": "fragment" + }, + "tags": [] + }, + "source": [ + "It is not correct, because we overcounted due to the fact that we do not actually care which team is the first, second or third. So we need to divide the expression by $3!$. 
The final answer is:\n", + "\n", + "$$\n", + "\\frac{1}{3!} \\cdot \\begin{pmatrix}12\\\\4\\end{pmatrix}\\cdot\\begin{pmatrix}8\\\\4\\end{pmatrix} = \\frac{1}{3!} \\cdot \\frac{12!}{4!8!} \\cdot \\frac{8!}{4!4!} = \\frac{12!}{4! 4! 4! 3!}\n", + "$$" + ] + }, + { + "cell_type": "markdown", + "id": "15802ff5-f369-46a9-9f9c-bde643c64509", + "metadata": { + "editable": true, + "slideshow": { + "slide_type": "fragment" + }, + "tags": [] + }, + "source": [ + "If we cared which team is which, we would obtain $\\frac{12!}{4! 4! 4!}$, which is called a **multinomial coefficient**. The only difference is that we choose more than one subset from one total." + ] + }, + { + "cell_type": "markdown", + "id": "d81bceef-797d-4df7-9921-73ef99d818ca", + "metadata": { + "editable": true, + "slideshow": { + "slide_type": "slide" + }, + "tags": [] + }, + "source": [ + "### Problem 3\n", + "\n", + "How many ways are there to put 10 balls into 3 boxes?" + ] + }, + { + "cell_type": "markdown", + "id": "cc8d44d4-6de0-44bf-8a6c-85457b6f53b3", + "metadata": { + "editable": true, + "slideshow": { + "slide_type": "slide" + }, + "tags": [] + }, + "source": [ + "### Solution 3\n", + "\n", + "Let's visualize the balls as o and borders of boxes as |:\n", + "$$\n", + "|oooo|oooo|oo|\n", + "$$\n", + "\n", + "How many balls do we have? How many borders do we have? How many total objects do we have?" + ] + }, + { + "cell_type": "markdown", + "id": "46451a2b-d3b0-4352-82b0-78a0b38e7888", + "metadata": { + "editable": true, + "slideshow": { + "slide_type": "fragment" + }, + "tags": [] + }, + "source": [ + "We have:\n", + "- 10 balls\n", + "- 3 + 1 borders. Note however, that 2 borders are fixed to be leftmost and rightmost. 
So the free borders are actually 3 - 1.\n", + "- 10 + 3 - 1 objects total\n", + "\n", + "Then, we need to choose the positions of the free borders. With $k = 10$ balls and $n = 3$ boxes:\n", + "$$\n", + "\\begin{pmatrix}10 + 3 - 1\\\\3 - 1\\end{pmatrix} = \\begin{pmatrix}n + k - 1\\\\n - 1\\end{pmatrix} = \\begin{pmatrix}n + k - 1\\\\k\\end{pmatrix}\n", + "$$" + ] + }, + { + "cell_type": "markdown", + "id": "3b8f1f70", + "metadata": { + "editable": true, + "slideshow": { + "slide_type": "slide" + }, + "tags": [] + }, + "source": [ + "## Recap of counting\n", + "\n", + "Sampling $k$ objects from $n$ choices:\n", + "\n", + "|With replacement|Order matters|Formula|\n", + "|:-:|:-:|:-:|\n", + "|Yes|Yes|\\begin{eqnarray}n^k\\end{eqnarray}|\n", + "|Yes|No|\\begin{eqnarray}\\begin{pmatrix}n+k-1\\\\k\\end{pmatrix}\\end{eqnarray}|\n", + "|No|Yes|\\begin{eqnarray}\\frac{n!}{(n-k)!}\\end{eqnarray}|\n", + "|No|No|\\begin{eqnarray}\\begin{pmatrix}n\\\\k\\end{pmatrix}\\end{eqnarray}|" + ] + }, + { + "cell_type": "markdown", + "id": "63428a7f", + "metadata": { + "editable": true, + "slideshow": { + "slide_type": "slide" + }, + "tags": [] + }, + "source": [ + "### Problem 4\n", + "\n", + "There are 100 passengers lined up to board an airplane with 100 seats (with each seat assigned to one of the passengers). The first passenger in line crazily decides to sit in a randomly chosen seat (with all seats equally likely). Each subsequent passenger takes their assigned seat if available, and otherwise sits in a random available seat. What is the probability that the last passenger in line gets to sit in their assigned seat?" 
+ ] + }, + { + "cell_type": "markdown", + "id": "3af73656-548a-470b-90c0-31cadf930ad8", + "metadata": { + "editable": true, + "slideshow": { + "slide_type": "slide" + }, + "tags": [] + }, + "source": [ + "### Solution 4\n", + "\n", + "Denote the assigned seat of the $i$-th passenger as seat $i$, regardless of its position in the plane.\n", + "\n", + "Notice that there is always only one seat occupied incorrectly. Then consider a passenger $j$. 
He could sit in any other available seat, but he decided to sit in seat $1$. If he sits in seat $1$, it means that his place $j$ is taken.\n", + "\n", + "|pax\\seat|1|2|3|4|5|\n", + "|:-:|:-:|:-:|:-:|:-:|:-:|\n", + "||||1|||\n", + "|||2|1|||\n", + "|||2|1|3||\n", + "|||2|1|3|4|\n", + "||5|2|1|3|4|" + ] + }, + { + "cell_type": "markdown", + "id": "60e030d1-652a-484b-9064-b59e21c2d7db", + "metadata": { + "editable": true, + "slideshow": { + "slide_type": "fragment" + }, + "tags": [] + }, + "source": [ + "|pax\\seat|1|2|3|4|5|\n", + "|:-:|:-:|:-:|:-:|:-:|:-:|\n", + "||||1|||\n", + "|||2|1|||\n", + "|||2|1|3||\n", + "||4|2|1|3||\n", + "||4|2|1|3|5|" + ] + }, + { + "cell_type": "markdown", + "id": "0897805d-5aea-49f1-b0dd-30b9e32fe163", + "metadata": {}, + "source": [ + "|pax\\seat|1|2|3|4|5|\n", + "|:-:|:-:|:-:|:-:|:-:|:-:|\n", + "||||1|||\n", + "|||2|1|||\n", + "|||2|1||3|\n", + "|||2|1|4|3|\n", + "||5|2|1|4|3|\n", + "\n", + "|pax\\seat|1|2|3|4|5|\n", + "|:-:|:-:|:-:|:-:|:-:|:-:|\n", + "||||1|||\n", + "|||2|1|||\n", + "||3|2|1|||\n", + "||3|2|1|4||\n", + "||3|2|1|4|5|" + ] + }, + { + "cell_type": "markdown", + "id": "e4fb44a0-9e40-4a05-a02d-51c94b20eb2f", + "metadata": { + "editable": true, + "slideshow": { + "slide_type": "subslide" + }, + "tags": [] + }, + "source": [ + "By sitting in seat $1$, he removes the source of the permutation, because now all seats are occupied correctly. After that, all the passengers that enter the plane will be able to sit in their true seats. It is important that it always happens and can happen with any passenger $j$.\n", + "\n", + "Generally, the last $100$-th passenger may observe two cases:\n", + "- The permutation was removed, then he has the option to sit in his true $100$-th seat\n", + "- The permutation was not removed, then he is the one to remove the permutation and take seat $1$.\n", + "\n", + "We can now reduce the problem to just two seats: $1$-st and $100$-th. 
One of the passengers sitting in these seats is the last $100$-th passenger, the other is some other passenger $j$." + ] + }, + { + "cell_type": "markdown", + "id": "0113aef1", + "metadata": { + "editable": true, + "slideshow": { + "slide_type": "subslide" + }, + "tags": [] + }, + "source": [ + "Since $j<100$, it means that both the $1$-st and the $100$-th seats were empty when he or she boarded the plane! And he or she was equally likely to sit in either of them.\n", + "\n", + "If $j$ sat in $1$ then the last passenger ended up sitting in $100$ and the resulting configuration of passengers sitting in the 100 seats is the same as if $j$ had sat in $100$ except for the fact that the passengers in $1$ and $100$ are swapped. Therefore these two configurations occur with the same probability and exactly one of them has the last passenger in his or her seat $100$.\n", + "\n", + "This implies that all the final configurations of passengers can be paired such that the two configurations in any pair occur with the same probability and exactly one has the last passenger in his or her seat.\n", + "\n", + "This implies that the probability that the last passenger is in his or her seat is $0.5$." + ] + }, + { + "cell_type": "markdown", + "id": "ef85e62a", + "metadata": { + "editable": true, + "slideshow": { + "slide_type": "slide" + }, + "tags": [] + }, + "source": [ + "## Non-naive definition" + ] + }, + { + "cell_type": "markdown", + "id": "74356ca3", + "metadata": { + "editable": true, + "slideshow": { + "slide_type": "slide" + }, + "tags": [] + }, + "source": [ + "### Definition\n", + "\n", + "A probability space consists of a sample space $S$ and a probability function $P$ which takes an event $A \\subseteq S$ as input and returns $P(A)$, a real number between $0$ and $1$, as output. 
The function $P$ must satisfy the following axioms:\n", + "- $P(\\varnothing) = 0, P(S) = 1$\n", + "- If $A_1, A_2, \\ldots$ are disjoint ($A_i \\cap A_j = \\varnothing, i \\neq j$) events, then\n", + " $$\n", + " P\\left(\\bigcup\\limits_{j=1}^\\infty A_j\\right) = \\sum\\limits_{j=1}^\\infty P(A_j)\n", + " $$" + ] + }, + { + "cell_type": "markdown", + "id": "7c482edb", + "metadata": { + "editable": true, + "slideshow": { + "slide_type": "slide" + }, + "tags": [] + }, + "source": [ + "### Properties\n", + "\n", + "1. $P(A^c) = 1 - P(A)$\n", + "2. If $A \\subseteq B$, then $P(A) \\leqslant P(B)$\n", + "3. $P (A \\cup B) = P (A) + P (B) - P (A \\cap B)$" + ] + }, + { + "cell_type": "markdown", + "id": "764dbfee", + "metadata": { + "editable": true, + "slideshow": { + "slide_type": "slide" + }, + "tags": [] + }, + "source": [ + "### Problem 5\n", + "\n", + "Consider the set $S$ of all (how many?) subsets of the set $M = \\{1, \\ldots, N\\}$. We choose two sets $A, B \\in S$ uniformly at random and independently. Find the probability that $A \\cap B = \\varnothing$." 
+ ] + }, + { + "cell_type": "markdown", + "id": "eb8e0e1b", + "metadata": { + "editable": true, + "slideshow": { + "slide_type": "slide" + }, + "tags": [] + }, + "source": [ + "### Solution 5\n", + "\n", + "Take any element $e \\in M$ from the original set.\n", + "\n", + "- $\\mathbb{P}(e \\in A) = p_1 = \\tfrac12$, by construction of $A$\n", + "- $\\mathbb{P}(e \\in B) = p_2 = \\tfrac12$, by construction of $B$\n", + "- $\\mathbb{P}(e \\in A \\cap B) = p_{12} = p_1 \\cdot p_2 = \\tfrac14$\n", + "- $\\mathbb{P}(e \\notin A \\cap B) = 1 - \\mathbb{P}(e \\in A \\cap B) = 1 - p_{12} = \\tfrac34$\n", + "\n", + "Different elements fall into $A \\cap B$ independently of each other, so multiplying over all $N$ elements $e \\in M$ gives:\n", + "$$\n", + "\\mathbb{P}(A \\cap B = \\varnothing) = \\left( \\frac34 \\right)^N\n", + "$$" + ] + }, + { + "cell_type": "markdown", + "id": "acd1a62f", + "metadata": { + "editable": true, + "slideshow": { + "slide_type": "slide" + }, + "tags": [] + }, + "source": [ + "## Inclusion-exclusion formula\n", + "\n", + "$$P (A \\cup B) = P (A) + P (B) - P (A \\cap B)$$" + ] + }, + { + "cell_type": "markdown", + "id": "7624cc8b", + "metadata": { + "editable": true, + "slideshow": { + "slide_type": "fragment" + }, + "tags": [] + }, + "source": [ + "$$P\\left(\\bigcup\\limits_{i=1}^n A_i\\right) = \\sum_i P(A_i) - \\sum_{i < j} P(A_i \\cap A_j) + \\sum_{i < j < k}P(A_i \\cap A_j \\cap A_k) - \\ldots + (-1)^{n+1} P(A_1 \\cap\\ldots \\cap A_n)$$" + ] + } + ], + "metadata": { + "celltoolbar": "Slideshow", + "kernelspec": { + "display_name": "Python 3 (ipykernel)", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.12.0" + } + }, + "nbformat": 4, + "nbformat_minor": 5 +} diff --git 
a/Seminar_materials/seminar_02/Seminar 2 (Definition of probability).pdf b/Seminar_materials/seminar_02/Seminar 2 
(Definition of probability).pdf new file mode 100644 index 0000000..622473d Binary files /dev/null and b/Seminar_materials/seminar_02/Seminar 2 (Definition of probability).pdf differ diff --git a/Seminar_materials/seminar_02/Seminar 2 (Definition of probability).slides.html b/Seminar_materials/seminar_02/Seminar 2 (Definition of probability).slides.html new file mode 100644 index 0000000..9eef529 --- /dev/null +++ b/Seminar_materials/seminar_02/Seminar 2 (Definition of probability).slides.html @@ -0,0 +1,8095 @@ + + + + + + + +Seminar 2 (Definition of probability) slides + + + + + + + + + + + + + + + + + +
+ + + diff --git a/Seminar_materials/seminar_03/Seminar 3 (Conditional probability).html b/Seminar_materials/seminar_03/Seminar 3 (Conditional probability).html new file mode 100644 index 0000000..dcf7ff3 --- /dev/null +++ b/Seminar_materials/seminar_03/Seminar 3 (Conditional probability).html @@ -0,0 +1,7992 @@ + + + + + +Seminar 3 (Conditional probability) + + + + + + + + + + + + +
+ + diff --git a/Seminar_materials/Seminar03/Seminar 3 (Conditional probability).ipynb b/Seminar_materials/seminar_03/Seminar 3 (Conditional probability).ipynb old mode 100644 new mode 100755 similarity index 77% rename from Seminar_materials/Seminar03/Seminar 3 (Conditional probability).ipynb rename to Seminar_materials/seminar_03/Seminar 3 (Conditional probability).ipynb index f151b9c..42497ac --- a/Seminar_materials/Seminar03/Seminar 3 (Conditional probability).ipynb +++ b/Seminar_materials/seminar_03/Seminar 3 (Conditional probability).ipynb @@ -4,9 +4,11 @@ "cell_type": "markdown", "id": "a8f7b639", "metadata": { + "editable": true, "slideshow": { "slide_type": "slide" - } + }, + "tags": [] }, "source": [ "# Seminar 3" @@ -16,9 +18,11 @@ "cell_type": "markdown", "id": "adopted-electricity", "metadata": { + "editable": true, "slideshow": { "slide_type": "slide" - } + }, + "tags": [] }, "source": [ "## Conditional probability in classic probability\n", @@ -38,9 +42,11 @@ "cell_type": "markdown", "id": "deadly-wildlife", "metadata": { + "editable": true, "slideshow": { "slide_type": "slide" - } + }, + "tags": [] }, "source": [ "## Problem 1\n", @@ -54,9 +60,11 @@ "cell_type": "markdown", "id": "incorrect-margin", "metadata": { + "editable": true, "slideshow": { "slide_type": "fragment" - } + }, + "tags": [] }, "source": [ "- Sum of results is more than 6 is event $B$ (we condition on it). \n", @@ -67,9 +75,11 @@ "cell_type": "markdown", "id": "operating-rochester", "metadata": { + "editable": true, "slideshow": { "slide_type": "fragment" - } + }, + "tags": [] }, "source": [ "$$\n", @@ -81,84 +91,89 @@ "cell_type": "markdown", "id": "7bb7a2e9", "metadata": { + "editable": true, "slideshow": { "slide_type": "slide" - } + }, + "tags": [] }, "source": [ - "## Recap of axiomatic definition of probability\n", + "## Recap of non-naive definition\n", "\n", - "A probability space is the following tuple: $(\\Omega, \\cal{F}, \\mathbb{P})$." 
+ "A probability space consists of ..." ] }, { "cell_type": "markdown", "id": "needed-speaker", "metadata": { + "editable": true, "slideshow": { "slide_type": "fragment" - } + }, + "tags": [] }, "source": [ "- **Sample space** $\\Omega$\n", - "- **Set of events** $\\cal{F}$\n", - "- **Probability measure** $\\mathbb{P}$" + "- **Probability function** $\\mathbb{P}$" ] }, { "cell_type": "markdown", "id": "innovative-german", "metadata": { + "editable": true, "slideshow": { - "slide_type": "subslide" - } + "slide_type": "slide" + }, + "tags": [] }, "source": [ - "Set of events is $\\cal{F} \\subset 2^\\Omega$, such that\n", - "1. $\\Omega \\in \\cal{F}$\n", - "2. If $A \\in \\cal{F}$, then $\\overline{A} \\in \\cal{F}$ (closed under complement operation)\n", - "3. If $A_1, A_2, \\ldots \\in \\cal{F}$, then $\\bigcup_{k=1}^\\infty A_k \\in \\cal{F}$ (closed under countable union operation)\n", + "Probability function $P$ is such that it takes an event $A \\subseteq S$ as input and returns $P(A)$, a real number between $0$ and $1$, as output.\n", "\n", - "Probability measure is $\\mathbb{P}: \\cal{F} \\to \\mathbb{R}_+$, such that\n", - "1. $\\mathbb{P}(\\Omega) = 1$\n", - "2. 
If $A_1, A_2, \\ldots \\in \\cal{F}$ and $A_i \\cap A_j = \\varnothing$ for $i\\neq j$, then $\\mathbb{P}\\left(\\bigcup_{k=0}^\\infty A_k \\right) = \\sum_{k=1}^\\infty \\mathbb{P}(A_k)$ ($\\sigma$-additivity)" + "The function $P$ must satisfy the following axioms:\n", + "- $P(\\varnothing) = 0, P(S) = 1$\n", + "- If $A_1, A_2, \\ldots$ are disjoint ($A_i \\cap A_j = \\varnothing, i \\neq j$) events, then\n", + " $$\n", + " P\\left(\\bigcup\\limits_{j=1}^\\infty A_j\\right) = \\sum\\limits_{j=1}^\\infty P(A_j)\n", + " $$" ] }, { "cell_type": "markdown", "id": "directed-taiwan", "metadata": { + "editable": true, "slideshow": { "slide_type": "slide" - } + }, + "tags": [] }, "source": [ - "## Conditional probability in axiomatic probability\n", + "## Conditional probability\n", "\n", - "In axiomatic probability, we will use the same formula as before, and say it's a definition.\n", + "In non-naive probability, we will use the same formula as before, and say it's a definition.\n", "\n", - "We are working with probability space $(\\Omega, \\cal{F}, \\mathbb{P})$, and we are given event $B \\in \\cal{F}$ such that $\\mathbb{P}(B) > 0$. Then the probability of any event $A \\in \\cal{F}$ conditioned on event $B$ is by definition:\n", + "We are working with probability space $(\\Omega, \\mathbb{P})$, and we are given event $B$ such that $\\mathbb{P}(B) > 0$. Then the probability of any event $A$ conditioned on event $B$ is by definition:\n", "$$\n", "\\mathbb{P}(A|B) = \\frac{\\mathbb{P}(A \\cap B)}{\\mathbb{P}(B)}\n", "$$\n", "\n", - "Let's say our sample space is now $B$. We can prove that:\n", - "- $\\cal{F}_B = \\{A \\cap B : A \\in \\cal{F}\\}$ is a $\\sigma$-algebra for $B$\n", - "- $\\mathbb{P}_B = \\mathbb{P}(A|B)$ is a porbability measure\n", - "\n", - "This means that $(B, \\cal{F}_B, \\mathbb{P}_B)$ is a porbability space and it is called **conditional probability space** given $B$." + "Let's say our sample space is now $B$. 
We can prove that $\\mathbb{P}_B(A) = \\mathbb{P}(A|B)$ is a proper probability function, i.e. it satisfies the axioms. This means that conditional probability is a probability, $(B, \\mathbb{P}_B)$ is a probability space and it is called the **conditional probability space** given $B$." ] }, { "cell_type": "markdown", "id": "preceding-dictionary", "metadata": { + "editable": true, "slideshow": { - "slide_type": "subslide" - } + "slide_type": "slide" + }, + "tags": [] }, "source": [ - "## Conditional probability in axiomatic probability\n", + "## Conditional probability\n", "\n", "- If $A \\subset B$, then\n", " $$\n", @@ -170,13 +185,58 @@ " $$" ] }, + { + "cell_type": "markdown", + "id": "a4e02c43-e18a-45c9-8ac8-5d4efa811df0", + "metadata": { + "editable": true, + "slideshow": { + "slide_type": "slide" + }, + "tags": [] + }, + "source": [ + "## Problem\n", + "\n", + "We throw a die 10 times, and it is known that at least one result was 6. What is the probability that there was more than one 6?" 
+ ] + }, + { + "cell_type": "markdown", + "id": "3b10d007-43fe-41ca-8529-1b83d3115c6d", + "metadata": { + "editable": true, + "slideshow": { + "slide_type": "slide" + }, + "tags": [] + }, + "source": [ + "## Solution\n", + "\n", + "Denote by $A$ the event that there was at least one 6, and by $B$ the event that there was more than one 6. Since $B \\subseteq A$, we have $B \\cap A = B$:\n", + "$$\n", + "\\mathbb{P}(B|A) = \\frac{\\mathbb{P}(B \\cap A)}{\\mathbb{P}(A)} = \\frac{\\mathbb{P}(B)}{\\mathbb{P}(A)} = \\frac{1 - \\mathbb{P}(\\overline{B})}{1 - \\mathbb{P}(\\overline{A})}\n", + "$$\n", + "\n", + "The event $\\overline{B}$ means that there was at most one 6: either no 6 at all, or exactly one 6 in one of the 10 throws:\n", + "$$\n", + "\\mathbb{P}(\\overline{B}) = \\left(1 - \\frac16\\right)^{10} + \\begin{pmatrix}10\\\\1\\end{pmatrix} \\cdot \\frac16 \\cdot \\left(1 - \\frac16\\right)^{9}\n", + "$$\n", + "\n", + "$$\n", + "\\mathbb{P}(\\overline{A}) = \\left(1 - \\frac16\\right)^{10}\n", + "$$" + ] + }, { "cell_type": "markdown", "id": "radio-marsh", "metadata": { + "editable": true, "slideshow": { "slide_type": "slide" - } + }, + "tags": [] }, "source": [ "## Probability of intersection of events\n", @@ -196,9 +256,11 @@ "cell_type": "markdown", "id": "fluid-albert", "metadata": { + "editable": true, "slideshow": { "slide_type": "slide" - } + }, + "tags": [] }, "source": [ "## Law of total probability\n", @@ -217,9 +279,11 @@ "cell_type": "markdown", "id": "successful-biodiversity", "metadata": { + "editable": true, "slideshow": { "slide_type": "slide" - } + }, + "tags": [] }, "source": [ "## Problem 2\n", @@ -231,9 +295,11 @@ "cell_type": "markdown", "id": "reported-pastor", "metadata": { + "editable": true, "slideshow": { "slide_type": "slide" - } + }, + "tags": [] }, "source": [ "## Solution\n", @@ -250,9 +316,11 @@ "cell_type": "markdown", "id": "exceptional-jungle", "metadata": { + "editable": true, "slideshow": { "slide_type": "fragment" - } + }, + 
"tags": [] }, "source": [ "Then, $\\mathbb{P}(B_1) = 0.99$ and $\\mathbb{P}(B_2) = 0.01$. Next, let's compute the probabilities $\\mathbb{P}(A|B_1)$ and $\\mathbb{P}(A|B_2)$." 
@@ -262,9 +330,11 @@ "cell_type": "markdown", "id": "built-budapest", "metadata": { + "editable": true, "slideshow": { "slide_type": "fragment" - } + }, + "tags": [] }, "source": [ "For a fair coin $\\mathbb{P}(A|B_1) = 0.5^7$, and for double-headed coin $\\mathbb{P}(A|B_2) = 1$.\n", @@ -279,9 +349,11 @@ "cell_type": "markdown", "id": "capital-celebration", "metadata": { + "editable": true, "slideshow": { "slide_type": "slide" - } + }, + "tags": [] }, "source": [ "## Bayes rule\n", @@ -302,9 +374,11 @@ "cell_type": "markdown", "id": "biological-tanzania", "metadata": { + "editable": true, "slideshow": { "slide_type": "slide" - } + }, + "tags": [] }, "source": [ "## Problem 3\n", @@ -316,9 +390,11 @@ "cell_type": "markdown", "id": "treated-honor", "metadata": { + "editable": true, "slideshow": { "slide_type": "slide" - } + }, + "tags": [] }, "source": [ "## Solution\n", @@ -333,9 +409,11 @@ "cell_type": "markdown", "id": "opposed-deficit", "metadata": { + "editable": true, "slideshow": { "slide_type": "slide" - } + }, + "tags": [] }, "source": [ "## Problem 4\n", @@ -347,9 +425,11 @@ "cell_type": "markdown", "id": "romantic-county", "metadata": { + "editable": true, "slideshow": { "slide_type": "slide" - } + }, + "tags": [] }, "source": [ "## Solution\n", @@ -375,9 +455,11 @@ "cell_type": "markdown", "id": "leading-waste", "metadata": { + "editable": true, "slideshow": { "slide_type": "slide" - } + }, + "tags": [] }, "source": [ "## Problem 5\n", @@ -389,36 +471,37 @@ "cell_type": "markdown", "id": "south-relay", "metadata": { + "editable": true, "slideshow": { "slide_type": "slide" - } + }, + "tags": [] }, "source": [ "## Solution\n", "\n", - "Let's find the probability of finding a car if the contestant does not switch his choice.\n", - "\n", - "Let's label\n", - "- the door that the contestant originally picked as door 1\n", - "- the door that Monty opens as door 2\n", - "- the probabilities of a car being behind a certain door $k$ as $C_k$\n", - "- the 
probability of Monty opening a certain door $k$ as $M_k$\n", + "Denote by $A$ the event that the prize is behind the initially chosen door and by $B$ the event that the choice was changed. Let's find the probability of winning when the choice is changed:\n", + "$$\n", + "\\mathbb{P}(\\text{win} | B) = \\mathbb{P}(\\text{win} | B, A) \\cdot \\mathbb{P}(A) + \\mathbb{P}(\\text{win} | B, \\overline{A}) \\cdot \\mathbb{P}(\\overline{A}) = 0 \\cdot \\frac13 + 1 \\cdot \\frac23 = \\frac23\n", + "$$\n", "\n", - "Then,\n", + "Let's find the probability of winning when the choice is not changed:\n", "$$\n", - "\\mathbb{P}(C_1|M_2) = \\frac{\\mathbb{P}(M_2|C_1)\\mathbb{P}(C_1)}{\\mathbb{P}(M_2)} = \\frac{\\tfrac12 \\times \\tfrac13}{\\frac12} = \\frac13\n", + "\\mathbb{P}(\\text{win} | \\overline{B}) = \\mathbb{P}(\\text{win} | \\overline{B}, A) \\cdot \\mathbb{P}(A) + \\mathbb{P}(\\text{win} | \\overline{B}, \\overline{A}) \\cdot \\mathbb{P}(\\overline{A}) = 1 \\cdot \\frac13 + 0 \\cdot \\frac23 = \\frac13\n", + "$$\n", "\n", - "This means that the contestant should switch his choice." + "The contestant should thus switch."
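The two win probabilities can also be checked empirically. The following is a minimal simulation sketch, assuming the standard rules of the game (three doors, one car, and a host who always opens a door that hides no car and was not picked). The function names `play` and `win_rate` are hypothetical and do not come from the seminar notebook.

```python
import random


def play(switch, rng):
    """Play one round of the three-door game; return True if the car is won."""
    doors = [0, 1, 2]
    car = rng.choice(doors)
    first_pick = rng.choice(doors)
    # The host opens a door that is neither the contestant's pick nor the car.
    opened = rng.choice([d for d in doors if d != first_pick and d != car])
    if switch:
        # Exactly one door is left that is neither picked nor opened.
        final_pick = next(d for d in doors if d != first_pick and d != opened)
    else:
        final_pick = first_pick
    return final_pick == car


def win_rate(switch, trials=100_000, seed=0):
    """Fraction of rounds won under a fixed strategy."""
    rng = random.Random(seed)
    return sum(play(switch, rng) for _ in range(trials)) / trials


print("switch:", win_rate(True))
print("stay:  ", win_rate(False))
```

With enough rounds the estimates should settle near 2/3 for switching and 1/3 for staying, in line with the law of total probability computation.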
] }, { "cell_type": "markdown", "id": "composed-footwear", "metadata": { + "editable": true, "slideshow": { "slide_type": "slide" - } + }, + "tags": [] }, "source": [ "## Independence of events\n", @@ -443,9 +526,11 @@ "cell_type": "markdown", "id": "forty-aviation", "metadata": { + "editable": true, "slideshow": { "slide_type": "slide" - } + }, + "tags": [] }, "source": [ "## Properties of independence\n", @@ -463,9 +548,11 @@ "cell_type": "markdown", "id": "authentic-nirvana", "metadata": { + "editable": true, "slideshow": { "slide_type": "slide" - } + }, + "tags": [] }, "source": [ "## Pairwise and mutual independence\n", @@ -487,9 +574,11 @@ "cell_type": "markdown", "id": "equipped-conservation", "metadata": { + "editable": true, "slideshow": { "slide_type": "slide" - } + }, + "tags": [] }, "source": [ "## Problem 6\n", @@ -506,9 +595,11 @@ "cell_type": "markdown", "id": "departmental-lunch", "metadata": { + "editable": true, "slideshow": { "slide_type": "slide" - } + }, + "tags": [] }, "source": [ "## Solution\n", @@ -523,9 +614,11 @@ "cell_type": "markdown", "id": "compressed-store", "metadata": { + "editable": true, "slideshow": { "slide_type": "slide" - } + }, + "tags": [] }, "source": [ "## Problem 7\n", @@ -537,9 +630,11 @@ "cell_type": "markdown", "id": "imposed-tourism", "metadata": { + "editable": true, "slideshow": { "slide_type": "slide" - } + }, + "tags": [] }, "source": [ "## Solution\n", @@ -571,7 +666,7 @@ "metadata": { "celltoolbar": "Slideshow", "kernelspec": { - "display_name": "Python 3", + "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, @@ -585,7 +680,7 @@ "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", - "version": "3.8.2" + "version": "3.12.0" } }, "nbformat": 4, diff --git a/Seminar_materials/seminar_03/Seminar 3 (Conditional probability).pdf b/Seminar_materials/seminar_03/Seminar 3 (Conditional probability).pdf new file mode 100644 index 0000000..70d96f1 Binary files 
/dev/null and b/Seminar_materials/seminar_03/Seminar 3 (Conditional probability).pdf differ diff --git a/Seminar_materials/seminar_03/Seminar 3 (Conditional probability).slides.html b/Seminar_materials/seminar_03/Seminar 3 (Conditional probability).slides.html new file mode 100644 index 0000000..edbc5a3 --- /dev/null +++ b/Seminar_materials/seminar_03/Seminar 3 (Conditional probability).slides.html @@ -0,0 +1,8002 @@ + + + + + + + +Seminar 3 (Conditional probability) slides + + + + + + + + + + + + + + + + + +
+ + + diff --git a/Seminar_materials/seminar_04/Seminar 4 (Random variables).html b/Seminar_materials/seminar_04/Seminar 4 (Random variables).html new file mode 100644 index 0000000..d13d96a --- /dev/null +++ b/Seminar_materials/seminar_04/Seminar 4 (Random variables).html @@ -0,0 +1,7881 @@ + + + + + +Seminar 4 (Random variables) + + + + + + + + + + + + +
+ + diff --git a/Seminar_materials/seminar_04/Seminar 4 (Random variables).ipynb b/Seminar_materials/seminar_04/Seminar 4 (Random variables).ipynb new file mode 100755 index 0000000..d4e4b5f --- /dev/null +++ b/Seminar_materials/seminar_04/Seminar 4 (Random variables).ipynb @@ -0,0 +1,538 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "id": "a8f7b639", + "metadata": { + "editable": true, + "slideshow": { + "slide_type": "slide" + }, + "tags": [] + }, + "source": [ + "# Seminar 4" + ] + }, + { + "cell_type": "code", + "execution_count": 1, + "id": "bb285517-7029-4885-b5bc-bba2b530f86d", + "metadata": { + "editable": true, + "slideshow": { + "slide_type": "skip" + }, + "tags": [] + }, + "outputs": [ + { + "name": "stderr", + "output_type": "stream", + "text": [ + "/var/folders/33/j0cl7y453td68qb96j7bqcj4cf41kc/T/ipykernel_31083/3109700056.py:10: DeprecationWarning: `set_matplotlib_formats` is deprecated since IPython 7.23, directly use `matplotlib_inline.backend_inline.set_matplotlib_formats()`\n", + " dp.set_matplotlib_formats(\"retina\")\n" + ] + }, + { + "data": { + "text/plain": [ + "
" + ] + }, + "metadata": {}, + "output_type": "display_data" + } + ], + "source": [ + "import numpy as np\n", + "import pandas as pd\n", + "\n", + "import scipy.stats as sts\n", + "\n", + "import IPython.display as dp\n", + "import matplotlib.pyplot as plt\n", + "import seaborn as sns\n", + "\n", + "dp.set_matplotlib_formats(\"retina\")\n", + "sns.set(style=\"whitegrid\", font_scale=1.5)\n", + "sns.despine()\n", + "\n", + "%matplotlib inline" + ] + }, + { + "cell_type": "markdown", + "id": "present-example", + "metadata": { + "editable": true, + "slideshow": { + "slide_type": "slide" + }, + "tags": [] + }, + "source": [ + "## Random variables\n", + "\n", + "A **random variable** is a function from the sample space to the real numbers, $X: S \\to \\mathbb{R}$.\n", + "\n", + "It means that every outcome $\\omega \\in S$ is assigned a real number $X(\\omega)$.\n", + "\n", + "The function needs to be measurable, but this topic is slightly beyond the scope of our course. The usual functions that you encounter in maths are all measurable." + ] + }, + { + "cell_type": "markdown", + "id": "built-newton", + "metadata": { + "editable": true, + "slideshow": { + "slide_type": "slide" + }, + "tags": [] + }, + "source": [ + "## Distribution of a random variable\n", + "\n", + "Consider a random variable $X: S \\to \\mathbb{R}$. The values it can attain are therefore real numbers. We introduce **the distribution** (or distribution law) $\\mathcal{L}$ of the random variable $X$. The distribution takes values of the random variable and outputs their probabilities. It is not the same as the probability function $P$, because the probability function works with events from the sample space, while the distribution works with values of the random variable. We will write $X \\sim \\mathcal{L}$."
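To make the difference between the probability function and the distribution concrete, here is a small sketch on a finite sample space. The choice of experiment (two fair dice, with $X$ equal to the sum of the faces) is an assumption made purely for illustration, and the names `S`, `P`, `X` and `distribution` are hypothetical.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Sample space S: ordered pairs of faces, all 36 outcomes equally likely.
S = list(product(range(1, 7), repeat=2))
# Probability function on outcomes of the sample space.
P = {omega: Fraction(1, len(S)) for omega in S}


def X(omega):
    """A random variable: a function from outcomes to real numbers."""
    return omega[0] + omega[1]


# The distribution of X assigns a probability to each value of X
# by adding up the probabilities of all outcomes mapped to that value.
distribution = Counter()
for omega, prob in P.items():
    distribution[X(omega)] += prob

for value in sorted(distribution):
    print(value, distribution[value])
```

Here `P` is defined on outcomes of the sample space, while `distribution` is defined on the values 2, ..., 12 that the random variable can take.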
+ ] + }, + { + "cell_type": "markdown", + "id": "elementary-hospital", + "metadata": { + "editable": true, + "slideshow": { + "slide_type": "slide" + }, + "tags": [] + }, + "source": [ + "## Two types of distributions\n", + "\n", + "A probability distribution can be **discrete** or **continuous**.\n", + "\n", + "Discrete random variables can only take countably many values (like integers), continuous random variables can take uncountably many values (like reals).\n", + "\n", + "There is also a third type of distributions, which you never encounter in practice; it's possible for a distribution to be a mix of several types, which you also do not normally encounter." + ] + }, + { + "cell_type": "markdown", + "id": "fabulous-hamilton", + "metadata": { + "editable": true, + "slideshow": { + "slide_type": "slide" + }, + "tags": [] + }, + "source": [ + "## Example 1\n", + "\n", + "Consider event $A$ and a random variable $X = \\mathbb{I}\\text{nd}_A$, an indicator:\n", + "$$\n", + "\\mathbb{I}\\text{nd}_A(x) = \\begin{cases}\n", + "1, x \\in A, \\\\\n", + "0, \\text{else}\n", + "\\end{cases}\n", + "$$\n", + "\n", + "$$\n", + "\\mathbb{P}(X = 1) = \\mathbb{P}(A) = p\n", + "$$\n", + "\n", + "$$\n", + "\\mathbb{P}(X = 0) = 1 - \\mathbb{P}(A) = 1 - p\n", + "$$\n", + "\n", + "We say that $X$ follows **Bernoulli distribution** with parameter $p$ and write $X \\sim Be(p)$.\n", + "\n", + "We will call $p_X(x) = \\mathbb{P}_X(X=x)$ a **probability mass function** (PMF)." + ] + }, + { + "cell_type": "code", + "execution_count": 2, + "id": "f21c428d-30d6-42d7-bc1c-9f2ab0fc9f28", + "metadata": { + "editable": true, + "slideshow": { + "slide_type": "slide" + }, + "tags": [] + }, + "outputs": [ + { + "data": { + "image/png": "", + "text/plain": [ + "
" + ] + }, + "metadata": { + "image/png": { + "height": 451, + "width": 561 + } + }, + "output_type": "display_data" + } + ], + "source": [ + "fig, ax = plt.subplots()\n", + "ax.stem([0, 1], [0.3, 0.7])\n", + "ax.set_title(\"PMF of Be(0.7)\");" + ] + }, + { + "cell_type": "markdown", + "id": "ignored-connection", + "metadata": { + "editable": true, + "slideshow": { + "slide_type": "skip" + }, + "tags": [] + }, + "source": [ + "## Bernoulli trial scheme\n", + "\n", + "Previously we have worked with independent events that were happening in one probability space. But sometimes we want to have multiple trials, where for every trial the probability space is known, but we are interested in the probability space covering all the trials at once. We can achieve it via direct product of probability spaces.\n", + "\n", + "If all probability spaces are the same and equal to:\n", + "- $S = \\{0, 1\\}$\n", + "- $\\mathbb{P}(1) = p$ and $\\mathbb{P}(0) = 1 - p$\n", + "\n", + "Then we call such experiment a **Bernoulli trial scheme**, and the probability space of it is:\n", + "- $S = \\{(i_1, \\ldots, i_n), i_j \\in \\{0, 1\\}\\}$\n", + "- $\\mathbb{P}(i_1, \\ldots, i_n) = p^{\\text{num} j \\text{ such that } i_j = 1} (1 - p)^{\\text{num} j \\text{ such that } i_j = 0}$" + ] + }, + { + "cell_type": "markdown", + "id": "31b15678-39ce-4730-8fa5-7a412f836aab", + "metadata": { + "editable": true, + "slideshow": { + "slide_type": "slide" + }, + "tags": [] + }, + "source": [ + "## Bernoulli trial scheme\n", + "\n", + "" + ] + }, + { + "cell_type": "markdown", + "id": "statutory-league", + "metadata": { + "editable": true, + "slideshow": { + "slide_type": "slide" + }, + "tags": [] + }, + "source": [ + "## Example 2\n", + "\n", + "Consider $X_1, \\ldots, X_n \\sim Be(p)$ independent random variables. Then $Y = \\sum_{k=1}^n X_k$ follows **Binomial distribution** with parameters $n$ and $p$, $Y \\sim Bi(n, p)$. 
$\\mathbb{P}(Y = k) = ?$" + ] + }, + { + "cell_type": "markdown", + "id": "adaptive-clinton", + "metadata": { + "editable": true, + "slideshow": { + "slide_type": "slide" + }, + "tags": [] + }, + "source": [ + "## Solution 2\n", + "\n", + "If $Y \\sim Bi(n, p)$, then\n", + "$$\n", + "\\mathbb{P}(Y = k) = \\begin{pmatrix}n\\\\k\\end{pmatrix} p^k (1-p)^{n-k}\n", + "$$" + ] + }, + { + "cell_type": "code", + "execution_count": 3, + "id": "2e76a96f-4543-418d-bca8-2c70c15a2ca2", + "metadata": { + "editable": true, + "slideshow": { + "slide_type": "slide" + }, + "tags": [] + }, + "outputs": [ + { + "data": { + "image/png": "", + "text/plain": [ + "
" + ] + }, + "metadata": { + "image/png": { + "height": 466, + "width": 1317 + } + }, + "output_type": "display_data" + } + ], + "source": [ + "fig, ax = plt.subplots(1,3,figsize=(16,5))\n", + "ax[0].stem(np.arange(-1, 7), sts.binom.pmf(np.arange(-1, 7), 5, 0.3))\n", + "ax[0].set_title(\"PMF of Bi(5, 0.3)\")\n", + "ax[1].stem(np.arange(-1, 7), sts.binom.pmf(np.arange(-1, 7), 5, 0.5))\n", + "ax[1].set_title(\"PMF of Bi(5, 0.5)\")\n", + "ax[2].stem(np.arange(-1, 7), sts.binom.pmf(np.arange(-1, 7), 5, 0.7))\n", + "ax[2].set_title(\"PMF of Bi(5, 0.7)\");" + ] + }, + { + "cell_type": "markdown", + "id": "394172b1", + "metadata": { + "editable": true, + "slideshow": { + "slide_type": "slide" + }, + "tags": [] + }, + "source": [ + "## Example 3\n", + "\n", + "We say $X$ follows discrete uniform distribution $DU([1, n])$ and we write $X \\sim DU([1, n])$ if\n", + "$$\n", + "P(X = k) = \\frac1n\n", + "$$" + ] + }, + { + "cell_type": "markdown", + "id": "9a845a49", + "metadata": { + "editable": true, + "slideshow": { + "slide_type": "slide" + }, + "tags": [] + }, + "source": [ + "## Example 4\n", + "\n", + "Consider an urn with $w$ white balls and $b$ black balls. We draw $n$ balls out of the urn at random without replacement. Let $X$ be the number of white balls in the sample. What is the distribution of $X$? What is its PMF?" 
+ ] + }, + { + "cell_type": "markdown", + "id": "dad5a285", + "metadata": { + "editable": true, + "slideshow": { + "slide_type": "slide" + }, + "tags": [] + }, + "source": [ + "## Solution 4\n", + "\n", + "If $X \\sim HGeom(w, b, n)$, then\n", + "$$\n", + "\\mathbb{P}(X = k) = \\frac{\\begin{pmatrix}w\\\\k\\end{pmatrix}\\begin{pmatrix}b\\\\n-k\\end{pmatrix}}{\\begin{pmatrix}w+b\\\\n\\end{pmatrix}}\n", + "$$" + ] + }, + { + "cell_type": "code", + "execution_count": 4, + "id": "b3cb010c-9b8d-476f-9494-433e5cd06b77", + "metadata": { + "editable": true, + "slideshow": { + "slide_type": "slide" + }, + "tags": [] + }, + "outputs": [ + { + "data": { + "image/png": "", + "text/plain": [ + "
" + ] + }, + "metadata": { + "image/png": { + "height": 466, + "width": 1305 + } + }, + "output_type": "display_data" + } + ], + "source": [ + "fig, ax = plt.subplots(1,3,figsize=(16,5))\n", + "ax[0].stem(np.arange(-1, 7), sts.hypergeom.pmf(np.arange(-1, 7), 10, 3, 5))\n", + "ax[0].set_title(\"PMF of HGeom(3, 7, 5)\")\n", + "ax[1].stem(np.arange(-1, 7), sts.hypergeom.pmf(np.arange(-1, 7), 10, 5, 5))\n", + "ax[1].set_title(\"PMF of HGeom(5, 5, 5)\")\n", + "ax[2].stem(np.arange(-1, 7), sts.hypergeom.pmf(np.arange(-1, 7), 10, 7, 5))\n", + "ax[2].set_title(\"PMF of HGeom(7, 3, 5)\");" + ] + }, + { + "cell_type": "markdown", + "id": "70216d6f", + "metadata": { + "editable": true, + "slideshow": { + "slide_type": "slide" + }, + "tags": [] + }, + "source": [ + "## Example 5\n", + "\n", + "What is the difference between the hypergeometric and binomial distributions?" + ] + }, + { + "cell_type": "markdown", + "id": "f0414680", + "metadata": { + "editable": true, + "slideshow": { + "slide_type": "fragment" + }, + "tags": [] + }, + "source": [ + "Reminder:\n", + "- Binomial story: Consider an urn with $w$ white balls and $b$ black balls. We draw $n$ balls from the urn with replacement. Let $X$ be the number of white balls in the sample.\n", + "- Hypergeometric story: Consider an urn with $w$ white balls and $b$ black balls. We draw $n$ balls out of the urn at random without replacement. Let $X$ be the number of white balls in the sample." + ] + }, + { + "cell_type": "markdown", + "id": "d9680151", + "metadata": { + "editable": true, + "slideshow": { + "slide_type": "fragment" + }, + "tags": [] + }, + "source": [ + "The Bernoulli trials in the Binomial story are independent. The Bernoulli trials in the Hypergeometric story are dependent, since the sampling is done without replacement."
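The effect of sampling with or without replacement can be seen by evaluating both PMFs for the same urn. The sketch below uses only the standard library and exact fractions; the helper names `binom_pmf` and `hgeom_pmf` are hypothetical, and the urn with $w = 3$, $b = 7$, $n = 5$ is chosen to match the first panel of the hypergeometric plot.

```python
from fractions import Fraction
from math import comb


def binom_pmf(k, n, p):
    """P(X = k) for X ~ Bi(n, p): n independent draws with replacement."""
    if k < 0 or k > n:
        return Fraction(0)
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def hgeom_pmf(k, w, b, n):
    """P(X = k) for X ~ HGeom(w, b, n): n draws without replacement."""
    if k < 0 or k > w or n - k < 0 or n - k > b:
        return Fraction(0)
    return Fraction(comb(w, k) * comb(b, n - k), comb(w + b, n))


w, b, n = 3, 7, 5
p = Fraction(w, w + b)  # chance of a white ball on a single draw
for k in range(n + 1):
    print(k, float(binom_pmf(k, n, p)), float(hgeom_pmf(k, w, b, n)))
```

With only three white balls in the urn, the hypergeometric PMF is zero for $k \ge 4$, whereas the binomial PMF stays positive there, because a ball drawn with replacement can be drawn again.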
+ ] + }, + { + "cell_type": "markdown", + "id": "joint-donor", + "metadata": { + "editable": true, + "slideshow": { + "slide_type": "slide" + }, + "tags": [] + }, + "source": [ + "## Example 5\n", + "\n", + "Consider $X$ and $Y$ independent $\\mathbb{Z}$-valued random variables. $\\mathbb{P}(X + Y = k) = ?$" + ] + }, + { + "cell_type": "markdown", + "id": "alive-chancellor", + "metadata": { + "editable": true, + "slideshow": { + "slide_type": "slide" + }, + "tags": [] + }, + "source": [ + "## Solution 5\n", + "\n", + "$$\n", + "\\mathbb{P}(X + Y = k) = \\sum_{m} \\mathbb{P}(X = m) \\mathbb{P}(Y = k - m)\n", + "$$" + ] + }, + { + "cell_type": "markdown", + "id": "intense-college", + "metadata": { + "editable": true, + "slideshow": { + "slide_type": "slide" + }, + "tags": [] + }, + "source": [ + "## Example 6\n", + "\n", + "Let $X \\sim Bi(n, p)$ and $Y \\sim Bi(m, p)$ be independent. What is the distribution of $Z = X + Y$?" + ] + }, + { + "cell_type": "markdown", + "id": "temporal-member", + "metadata": { + "editable": true, + "slideshow": { + "slide_type": "slide" + }, + "tags": [] + }, + "source": [ + "## Solution 6\n", + "\n", + "$$\n", + "\\begin{aligned}\n", + "\\mathbb{P}(X + Y = k) & = \\sum_j \\begin{pmatrix}n\\\\j\\end{pmatrix} p^j (1-p)^{n-j} \\begin{pmatrix}m\\\\k-j\\end{pmatrix} p^{k-j} (1-p)^{m-k+j} = \\\\\n", + "& = p^{k} (1-p)^{n+m-k} \\sum_j \\begin{pmatrix}n\\\\j\\end{pmatrix} \\begin{pmatrix}m\\\\k-j\\end{pmatrix} = \\\\\n", + "& =\\begin{pmatrix}n+m\\\\k\\end{pmatrix} p^{k} (1-p)^{n+m-k}\n", + "\\end{aligned}\n", + "$$\n", + "\n", + "$$\n", + "Z \\sim Bi(n+m, p)\n", + "$$" + ] + } + ], + "metadata": { + "celltoolbar": "Slideshow", + "kernelspec": { + "display_name": "Python 3 (ipykernel)", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + 
"pygments_lexer": "ipython3", + "version": "3.12.0" + } + }, + "nbformat": 4, + "nbformat_minor": 5 +} diff --git a/Seminar_materials/seminar_04/Seminar 4 (Random variables).pdf b/Seminar_materials/seminar_04/Seminar 4 (Random variables).pdf new file mode 100644 index 0000000..fff0d65 Binary files /dev/null and b/Seminar_materials/seminar_04/Seminar 4 (Random variables).pdf differ diff --git a/Seminar_materials/seminar_04/Seminar 4 (Random variables).slides.html b/Seminar_materials/seminar_04/Seminar 4 (Random variables).slides.html new file mode 100644 index 0000000..c8dd498 --- /dev/null +++ b/Seminar_materials/seminar_04/Seminar 4 (Random variables).slides.html @@ -0,0 +1,7823 @@ + + + + + + + +Seminar 4 (Random variables) slides + + + + + + + + + + + + + + + + + +
+ + + diff --git a/home_assignments/MSAI_Prob_HW1.pdf b/home_assignments/MSAI_Prob_HW1.pdf deleted file mode 100644 index 867d6a5..0000000 Binary files a/home_assignments/MSAI_Prob_HW1.pdf and /dev/null differ diff --git a/home_assignments/MSAI_Prob_HW2.pdf b/home_assignments/MSAI_Prob_HW2.pdf deleted file mode 100644 index 002aab9..0000000 Binary files a/home_assignments/MSAI_Prob_HW2.pdf and /dev/null differ diff --git a/practice_problems/MSAI_Prob_Seminar_10_practice_problems.pdf b/practice_problems/MSAI_Prob_Seminar_10_practice_problems.pdf deleted file mode 100644 index 4413105..0000000 Binary files a/practice_problems/MSAI_Prob_Seminar_10_practice_problems.pdf and /dev/null differ diff --git a/practice_problems/MSAI_Prob_Seminar_11_practice_problems.pdf b/practice_problems/MSAI_Prob_Seminar_11_practice_problems.pdf deleted file mode 100644 index 2bfe82a..0000000 Binary files a/practice_problems/MSAI_Prob_Seminar_11_practice_problems.pdf and /dev/null differ