Measure theory and probability

Posted on April 2, 2019
Tags: probability

Probability playground

1 Intro

\[ X = \{a,b\} \qquad \mathcal{A} \subseteq P(X) \] \[ X = \mathbb{N} = \{1,2,3,4,5..\} \qquad P(X) = \{\emptyset,\{1,2\},\{1,3\},..\}\tag{example}\] \[\mathcal{A}= \{\emptyset , \{1\},\{1,3,5\},\{1,3,5,7\}..\}\]

A trivial smallest σ-algebra is \(\mathcal{A} = \{\emptyset , X\}\)
A trivial largest σ-algebra is \(\mathcal{A} = P(X)\)

\(\mathcal{A}\) is a σ-algebra aka A :: σ-algebra(X)

For a given set we may have multiple perspectives of what can be measured aka we can have multiple σ-algebras \(\mathcal{A}\) on \(X\).

1.1 Borel-Sigma algebra and closure

\[X=\{1,2,3\} \qquad \mathcal{M}=\{\{2\}\}\] \[ \mathcal{M} \cup \{\emptyset, X\} \tag{add empty and base set}\] \[ X \setminus \{2\} = \{1,3\} \quad \sigma (\mathcal{M}) = \mathcal{M}\cup \{\emptyset, X\} \cup \{\{1,3\}\} \tag{add complement of measurable sets}\] Given a subset \(\mathcal{M}\) of \(P(X)\) which is not a σ-algebra, what is the minimal collection of sets we need to add to turn it into a σ-algebra? The result is \(\sigma (\mathcal{M})\), the σ-algebra generated by \(\mathcal{M}\). When \(\mathcal{M}\) is the collection of open sets of a topology, \(\sigma (\mathcal{M})\) is called the Borel-Sigma algebra.

  • Let \((X,\mathcal{T})\) be a topological space or a metric space (more importantly, \(\mathcal{T}\) is its collection of open sets)
    • then \(B(X) := \sigma (\mathcal{T})\) is the Borel-Sigma algebra generated by the open sets
    • \(\sigma (\mathcal{M})\) is the σ-algebra closure of \(\mathcal{M}\)
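On a finite set the closure can be computed by brute force: keep adding complements and unions until nothing new appears. The following is only a minimal sketch of that idea; the function name `generate_sigma_algebra` is made up for illustration, and the approach only works for finite \(X\), where closure under finite unions is enough.

```python
from itertools import combinations

def generate_sigma_algebra(X, M):
    """Smallest family containing M that is closed under complement and union.

    Illustrative helper (not from any library); assumes X is a finite set.
    """
    X = frozenset(X)
    family = {frozenset(), X} | {frozenset(s) for s in M}
    changed = True
    while changed:
        changed = False
        current = list(family)
        # Close under complement.
        for s in current:
            complement = X - s
            if complement not in family:
                family.add(complement)
                changed = True
        # Close under pairwise union (sufficient on a finite set).
        for a, b in combinations(current, 2):
            union = a | b
            if union not in family:
                family.add(union)
                changed = True
    return family

sigma = generate_sigma_algebra({1, 2, 3}, [{2}])
print(sorted(sorted(s) for s in sigma))  # [[], [1, 2, 3], [1, 3], [2]]
```

For \(X=\{1,2,3\}\) and \(\mathcal{M}=\{\{2\}\}\) the only set that has to be added besides \(\emptyset\) and \(X\) is the complement \(\{1,3\}\).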

2 What is a measure

3 Terms

Stats

A statistical test is all about asking: do these 2 samples come from the same population?

3.1 Notation

  • X~N(0,1) means the random variable \(X\) has the normal distribution with mean 0 and variance 1
    • Informally, “\(X\) is a random variable” means we draw a sample \(X\) from a population.
      • The draw can take on many different RANDOMly VARying values, hence “random variable”.
    • Wiki: a Random Variable is a (measurable) function that maps from the Sample Space to the real numbers.
      • The Sample Space is the set of all possible outcomes, here the possible samples from the population.

Population Mean

  • \(\mu = E(Y_i)\)

Sample Mean

  • \(\bar{x} \sim N(\mu,\frac{\sigma^2}{n})\)
  • How does a sample mean have a distribution?
    • The sample mean is a RANDOM VARIABLE, not a constant, since its value will differ depending on the subset of the population sampled. This variability allows the sample mean to have a distribution.
      • The meaning of a normally distributed sample mean is
        “the sample mean has some probability of falling within any given interval, and that probability follows a normal distribution”
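A quick simulation makes this concrete: draw many samples, take the mean of each, and look at how those means spread out. This is only a sketch; the population parameters `MU`, `SIGMA`, the sample size `N`, and the number of trials are arbitrary values chosen for illustration.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

# Illustrative (assumed) population parameters and sample size.
MU, SIGMA, N = 10.0, 2.0, 25
TRIALS = 20_000

# Each trial draws one sample of size N and records its mean.
sample_means = [
    statistics.fmean([random.gauss(MU, SIGMA) for _ in range(N)])
    for _ in range(TRIALS)
]

mean_of_means = statistics.fmean(sample_means)
var_of_means = statistics.pvariance(sample_means)

# The spread of the sample means should be close to sigma^2 / n.
print(mean_of_means, var_of_means, SIGMA**2 / N)
```

The mean of the sample means should land near \(\mu\) and their variance near \(\frac{\sigma^2}{n}\), which is what \(N(\mu,\frac{\sigma^2}{n})\) predicts.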

Parameters

  • \(\mu\) mu is the mean
  • σ sigma is the standard deviation (a z-score counts how many sigmas a value lies from the mean: \(z = \frac{x-\mu}{\sigma}\))
  • \(\sigma^2\) sigma squared is variance

3.2 Z-test T-test ANOVA

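  • z-test assumes the test statistic follows a normal dist. (population variance known, or a large sample).
  • t-test is similar to z-test but the variance is estimated from the sample, so it uses the t-distribution, which depends on the degrees of freedom.
  • ANOVA (analysis of variance) is basically a t-test generalised to more than 2 groups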

Tails

  • 2-tail test when the Alt Hypothesis is an inequality (≠)
  • 1-tail test when the Alt Hypothesis is gt or lt (> or <)

Multiple Regression vs Multivariate regression

  • Multiple regression means more than one independent variable
    • Age, Weight, Height as predictors for one dependent variable GPA
  • Multivariate means more than one dependent variable

independent random variable = Subset
Note these Subsets can come from the same or different populations.

4 P-value

4.1 Example p-value of fair coin

p-value for 2 heads in 2 flips: Probability of Event + Probability of Equally Rare Events + Probability of Rarer Events = Prob(HH) + Prob(TT) + 0 = 0.5

Notice even though Probability of 2 heads is only Prob(HH)=0.25, the p-value is 0.5
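The same sum can be done by enumerating every flip sequence. The sketch below assumes outcomes are grouped by number of heads (so HT and TH count as one event with probability 0.5, which is why it is not "rarer" than HH); the function name `p_value_fair_coin` is made up for illustration.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

def p_value_fair_coin(observed_heads, n_flips):
    """Two-sided p-value for a head count under the null hypothesis of a fair coin."""
    # Count how many of the 2**n equally likely sequences give each head count.
    counts = Counter(seq.count("H") for seq in product("HT", repeat=n_flips))
    total = 2 ** n_flips
    prob = {heads: Fraction(c, total) for heads, c in counts.items()}
    # Add up every head count that is as rare as, or rarer than, the observed one.
    return sum(p for p in prob.values() if p <= prob[observed_heads])

print(p_value_fair_coin(2, 2))  # 1/2
```

Using `Fraction` keeps the arithmetic exact, so the result is exactly 1/2 rather than a floating-point approximation.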

p-value behaves loosely like an inverse of surprisal (Shannon information). A high p-value means the observation is not surprising if the coin is “fair”, i.e. if the Null hypothesis of equality holds.

5 z-score

z-score is used when you can normalize your dataset to a 0 mean and 1 std

     ____
   /      \ 
 /+|      |+\
/++|      |++\
  -1  0   1     z-score  

\[P(X \lt -1)+P (X \gt 1) = \text{two-tailed p-value for a zscore of 1}\]

p-value = 0.3173 aka 31.73% probability or area under the z-distribution curve

Notice how the complement 1-0.3173 = 0.6827 is around 68% which aligns with the 68-95-99.7 rule

     ____
   /      \ 
 /+|        \
/++|         \
  -1  0   1     z-score

\[P(X \lt -1) = \text{one-tailed p-value for a zscore of -1}\]

p-value = 0.1587 aka 15.87% probability or area under the z-distribution curve

     ____
   /++++++\ 
 /++++++++| \
/+++++++++|  \
  -1  0   1     z-score

\[P(X \lt 1) = \text{area below a zscore of 1}\]

Area = 0.8413, i.e. 84.13% of the area under the z-distribution curve lies below a z-score of 1. The one-tailed p-value \(P(X \gt 1)\) is the complement, 1-0.8413 = 0.1587
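These areas can be checked numerically with the standard library's `statistics.NormalDist`. This is a small sketch; the variable names are just labels for the three shaded regions drawn above.

```python
from statistics import NormalDist

Z = NormalDist(mu=0, sigma=1)  # standard normal, i.e. the z-distribution

left_tail = Z.cdf(-1)          # P(X < -1)
right_tail = 1 - Z.cdf(1)      # P(X > 1)
two_tailed = left_tail + right_tail
below_one = Z.cdf(1)           # P(X < 1)

print(round(two_tailed, 4), round(left_tail, 4), round(below_one, 4))
# 0.3173 0.1587 0.8413
```

By symmetry the left and right tails are equal, so the two-tailed value is just twice the one-tailed value.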

6 CLT

7 Distributions