Fundamental statistics learning note (20)

Bayes’ rule revisited

How is $p(H_0\text{ is true})$ related to $p(\text{reject } H_0 | H_0\text{ is true})$?
Ans: According to Bayes’ rule, $$p(\text{reject } H_0 | H_0\text{ is true}) = \frac{p(\text{reject } H_0 \cap H_0\text{ is true})}{p(H_0\text{ is true})} = \frac{p(H_0\text{ is true} | \text{reject } H_0)\,p(\text{reject } H_0)}{p(H_0\text{ is true})}$$
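As a quick sanity check on this identity, here is a minimal numeric sketch; the three probabilities below are made-up illustrative values, not quantities derived in this note:

```python
# Minimal numeric check of Bayes' rule (all values are illustrative assumptions).
p_h0_true = 0.5                # assumed p(H_0 is true)
p_reject = 0.1                 # assumed p(reject H_0)
p_h0_given_reject = 0.25       # assumed p(H_0 is true | reject H_0)

# Bayes' rule: p(reject H_0 | H_0 is true)
#   = p(H_0 is true | reject H_0) * p(reject H_0) / p(H_0 is true)
p_reject_given_h0 = p_h0_given_reject * p_reject / p_h0_true
print(p_reject_given_h0)       # 0.05
```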

Limitations of classical (frequentist) statistics
For illustration, suppose the hypothesis is $H_0: \theta \leq 1$. Then $p(H_0\text{ is true}) = p(\theta \leq 1)$. But this does not make sense in the classical framework, because $\theta$ is not a random variable; it is an unknown constant. This is why classical inference does not work with $p(H_0\text{ is true})$ directly and uses the p-value instead.

Bayesian statistics

The Bayesian approach to statistical inference is to treat all unknowns as random variables. This includes both the usual random variables $X_1,X_2,\dots,X_n$ representing the data and the parameter(s) $\theta$.
Probabilities like $p(\theta \leq 1)$ then make sense, so $p(H_0\text{ is true})$ can be calculated in Bayesian statistical inference, which makes hypothesis testing more interpretable.
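For example, here is a minimal sketch of such a calculation, assuming (purely hypothetically) that the posterior of $\theta$ given the data is $N(1.2, 0.5^2)$:

```python
from scipy import stats

# Hypothetical posterior for theta given the data: Normal(1.2, 0.5^2).
posterior = stats.norm(loc=1.2, scale=0.5)

# In the Bayesian framework p(H_0 is true) = p(theta <= 1) is a
# well-defined probability: just evaluate the posterior CDF at 1.
p_h0_true = posterior.cdf(1.0)
print(f"p(theta <= 1 | data) = {p_h0_true:.3f}")  # ~0.345
```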

Bayesian inference

All unknowns, $X_1,X_2,\dots,X_n$ (data) and $\theta$ (parameter), are random variables, so they have a joint distribution.
$p(\theta, X_1,X_2,\dots,X_n)$ is the joint distribution of the data and the parameter.
$$p(\theta, X_1,X_2,\dots,X_n) = \underbrace{f(X_1,X_2,\dots,X_n|\theta)}_{\text{likelihood function}}\,\underbrace{\pi(\theta)}_{\underbrace{\text{prior}}_{\text{before data}}} = \underbrace{\pi(\theta|X_1,X_2,\dots,X_n)}_{\underbrace{\text{posterior}}_{\text{after data}}}\,\underbrace{m(X_1,X_2,\dots,X_n)}_{\underbrace{\text{marginal distribution of data}}_{\text{does not depend on }\theta\text{, ignore for estimating }\theta}}$$
Here $\pi$ is the notation conventionally used for the distribution of the parameter (prior and posterior), while $f$ and $m$ denote the assumed probability (density) functions of the data.
$$\Rightarrow \pi(\theta|X_1,X_2,\dots,X_n) = \frac{f(X_1,X_2,\dots,X_n|\theta)\,\pi(\theta)}{m(X_1,X_2,\dots,X_n)} \propto f(X_1,X_2,\dots,X_n|\theta)\,\pi(\theta)$$
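As a minimal sketch of this proportionality, assume (hypothetically) Bernoulli$(\theta)$ data with a Beta(2, 2) prior; normalizing the product $f(X_1,\dots,X_n|\theta)\,\pi(\theta)$ numerically on a grid recovers the exact conjugate posterior without ever computing $m(X_1,\dots,X_n)$ in closed form:

```python
import numpy as np
from scipy import stats

# Hypothetical model: X_i ~ Bernoulli(theta) i.i.d., prior theta ~ Beta(2, 2).
x = np.array([1, 0, 1, 1, 0, 1, 1, 1])        # made-up data, n = 8
theta = np.linspace(0.001, 0.999, 999)        # grid over the parameter space
d = theta[1] - theta[0]

prior = stats.beta.pdf(theta, 2, 2)                              # pi(theta)
likelihood = theta**x.sum() * (1 - theta)**(len(x) - x.sum())    # f(x|theta)

# Posterior up to the constant m(x): pi(theta|x) proportional to f(x|theta)*pi(theta).
unnormalized = likelihood * prior
posterior = unnormalized / (unnormalized.sum() * d)              # normalize numerically

# Conjugacy check: the exact posterior is Beta(2 + sum(x), 2 + n - sum(x)).
exact = stats.beta.pdf(theta, 2 + x.sum(), 2 + len(x) - x.sum())
print(np.allclose(posterior, exact, atol=1e-3))                  # True
```

Because $m(X_1,\dots,X_n)$ does not depend on $\theta$, this normalization can always be done after the fact, which is exactly why the proportional form is enough for estimating $\theta$.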