6.1 Introduction
- Statistical inference, or “learning” as it is called in computer science, is the process of using data to infer the distribution that generated the data.
6.2 Parametric and Nonparametric Models
A statistical model $\Im$ is a set of distributions (or densities or regression functions).
Parametric model : a set $\Im$ that can be parameterized by a finite number of parameters.
If we assume that the data come from a Normal distribution, then the model is a two-parameter model:
$$ \Im = \left\{ f(x; \mu, \sigma) = \frac{1}{\sigma \sqrt{2 \pi}} \exp \left\{ -\frac{1}{2 \sigma^2}(x-\mu)^2 \right\}, \ \mu \in \mathbb{R}, \ \sigma > 0 \right\} $$
- In general, a parametric model takes the form
$$ \Im = \left\{ f(x; \theta) : \theta \in \Theta \right\} $$
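To make the idea of a family indexed by $\theta$ concrete, here is a minimal Python sketch of the Normal model, where each choice of $\theta = (\mu, \sigma)$ picks out one density in $\Im$. The function names `normal_density` and `make_member` are illustrative choices, not part of the notes.

```python
import math

def normal_density(x, mu, sigma):
    """Density f(x; mu, sigma) of the Normal model, defined for sigma > 0."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def make_member(theta):
    """Select one member of the family by fixing theta = (mu, sigma)."""
    mu, sigma = theta
    return lambda x: normal_density(x, mu, sigma)

# Two different points in the parameter space give two different densities.
standard = make_member((0.0, 1.0))
shifted = make_member((2.0, 0.5))

print(standard(0.0))  # height of the N(0, 1) density at its mean
print(shifted(2.0))   # height of the N(2, 0.5^2) density at its mean
```

The whole model is described by the two numbers in `theta`, which is what makes it parametric.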
Nonparametric model : a set $\Im$ that cannot be parameterized by a finite number of parameters.
Frequentists and Bayesians : The two dominant approaches to statistical inference are called frequentist inference and Bayesian inference.
6.3 Fundamental Concepts in Inference
Many inferential problems can be identified as being one of three types : estimation, confidence sets, or hypothesis testing.
Point Estimation : refers to providing a single “best guess” of some quantity of interest, such as a parameter of a parametric model.
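As a small illustration of point estimation, the sketch below simulates data from a Normal distribution and uses the sample mean and sample standard deviation as single best guesses of $\mu$ and $\sigma$. The true values, sample size, and seed are arbitrary assumptions chosen for the example.

```python
import random
import statistics

rng = random.Random(0)             # fixed seed so the run is repeatable
true_mu, true_sigma = 5.0, 2.0     # assumed "unknown" parameters for the simulation

# Simulated observations X_1, ..., X_n from N(true_mu, true_sigma^2).
data = [rng.gauss(true_mu, true_sigma) for _ in range(10_000)]

mu_hat = statistics.fmean(data)    # point estimate of mu
sigma_hat = statistics.stdev(data) # point estimate of sigma

print(mu_hat, sigma_hat)
```

Each estimate is a function of the data alone, so a different sample would give a slightly different guess.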
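Confidence Sets : A $1-\alpha$ confidence interval for a parameter $\theta$ is an interval $C_n = (a,b)$, where $a = a(X_1, \dots, X_n)$ and $b = b(X_1, \dots, X_n)$ are functions of the data such that $\mathbb{P}_\theta(\theta \in C_n) \geq 1-\alpha$ for all $\theta \in \Theta$. In words, $(a,b)$ traps $\theta$ with probability at least $1-\alpha$.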
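The endpoints $a$ and $b$ are computed from the data, so the interval is random while $\theta$ is fixed. The sketch below uses a Normal-based approximate interval for a mean (sample mean plus or minus $z_{\alpha/2}$ standard errors), which is one common choice and an assumption of this example, and then checks by simulation how often the interval traps the true mean. The function names and simulation settings are illustrative.

```python
import math
import random
import statistics

def normal_mean_interval(sample, alpha=0.05):
    """Approximate 1 - alpha interval (a, b) for the mean, based on the Normal quantile."""
    n = len(sample)
    center = statistics.fmean(sample)
    se = statistics.stdev(sample) / math.sqrt(n)   # estimated standard error
    z = statistics.NormalDist().inv_cdf(1.0 - alpha / 2.0)
    return center - z * se, center + z * se

def estimate_coverage(mu=1.0, sigma=3.0, n=50, reps=2000, alpha=0.05, seed=1):
    """Fraction of simulated samples whose interval contains the true mean mu."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        a, b = normal_mean_interval(sample, alpha)
        hits += a < mu < b
    return hits / reps

coverage = estimate_coverage()
print(coverage)  # should be close to 0.95; the interval is only approximate
```

Because the interval relies on a large-sample approximation, the simulated coverage is expected to sit near, and possibly slightly below, the nominal $1-\alpha$.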
Hypothesis Testing : In hypothesis testing, we start with some default theory, called the null hypothesis, and we ask if the data provide sufficient evidence to reject the theory. If not, we retain the null hypothesis.
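As one hedged illustration, suppose the null hypothesis is that a coin is fair, $H_0 : p = 1/2$. The sketch below applies a large-sample z-test to the observed number of heads. The choice of test, the function name `fair_coin_test`, and the example counts are assumptions made here, not something the notes specify.

```python
import math
import statistics

def fair_coin_test(heads, n, alpha=0.05):
    """Large-sample z-test of H0: p = 1/2 from `heads` successes in `n` tosses."""
    p_hat = heads / n
    se_null = math.sqrt(0.5 * 0.5 / n)       # standard error of p_hat under H0
    z = (p_hat - 0.5) / se_null
    std_normal = statistics.NormalDist()
    p_value = 2.0 * (1.0 - std_normal.cdf(abs(z)))                # two-sided
    reject = abs(z) > std_normal.inv_cdf(1.0 - alpha / 2.0)
    return z, p_value, reject

# 60 heads in 100 tosses gives z close to 2, just past the 5% cutoff of about 1.96.
print(fair_coin_test(60, 100))
# 55 heads in 100 tosses gives z close to 1, so H0 is retained.
print(fair_coin_test(55, 100))
```

Retaining the null hypothesis in the second case does not show that the coin is fair, only that the data do not provide sufficient evidence against it.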