
Assessing predictive error

Test error, train error

A dataset was gathered and split into two groups of samples: $\mathscr{D}_{1}$ with $n_{1}$ points and $\mathscr{D}_{2}$ with $n_{2}$ points. A regression function estimate $\hat{f}_A$ was fit using $\mathscr{D}_1$. Another regression function estimate $\hat{f}_B$ was also fit using $\mathscr{D}_1$. The following quantities were computed:

$$
\begin{aligned}
\frac{1}{n_{1}}\sum_{(x,y)\in\mathscr{D}_{1}}\left(\hat{f}_A(x)-y\right)^{2} &= 10\\
\frac{1}{n_{2}}\sum_{(x,y)\in\mathscr{D}_{2}}\left(\hat{f}_A(x)-y\right)^{2} &= 40\\
\frac{1}{n_{1}}\sum_{(x,y)\in\mathscr{D}_{1}}\left(\hat{f}_B(x)-y\right)^{2} &= 20\\
\frac{1}{n_{2}}\sum_{(x,y)\in\mathscr{D}_{2}}\left(\hat{f}_B(x)-y\right)^{2} &= 30
\end{aligned}
$$

Given this information, which estimate would generally be preferred, $\hat f_A$ or $\hat f_B$?
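
The question gives the four averages directly, but as a concrete illustration, here is a minimal sketch of how such quantities are computed. The data-generating process and the two polynomial fits are assumptions for illustration only; the question does not specify the estimators.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_data(n):
    # Hypothetical data-generating process (not specified by the question).
    x = rng.uniform(-2, 2, size=n)
    y = np.sin(x) + rng.normal(scale=0.3, size=n)
    return x, y

x1, y1 = make_data(100)  # plays the role of D1 (n1 points)
x2, y2 = make_data(100)  # plays the role of D2 (n2 points)

# Two estimates, both fit using D1 only.
f_A = np.poly1d(np.polyfit(x1, y1, deg=1))
f_B = np.poly1d(np.polyfit(x1, y1, deg=9))

def avg_sq_error(f, x, y):
    return np.mean((f(x) - y) ** 2)

print("A on D1:", avg_sq_error(f_A, x1, y1), " A on D2:", avg_sq_error(f_A, x2, y2))
print("B on D1:", avg_sq_error(f_B, x1, y1), " B on D2:", avg_sq_error(f_B, x2, y2))
```

Since neither fit used $\mathscr{D}_2$, only the $\mathscr{D}_2$ averages estimate out-of-sample error.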

Test error, train error, II

A dataset was gathered and then split into a train set $\mathscr{D}_{1}$ with $n_{1}$ points and a test set $\mathscr{D}_{2}$ with $n_{2}$ points. A regression function estimate $\hat{f}$ was trained using the training dataset. Then two forms of error were computed, as follows:

$$
\begin{aligned}
\epsilon_{1} &= \frac{1}{n_{1}}\sum_{(x,y)\in\mathscr{D}_{1}}\left(\hat{f}(x)-y\right)^{2}\\
\epsilon_{2} &= \frac{1}{n_{2}}\sum_{(x,y)\in\mathscr{D}_{2}}\left(\hat{f}(x)-y\right)^{2}
\end{aligned}
$$

True or false: in typical cases, we expect $\epsilon_{1}$ to be greater than $\epsilon_{2}$.
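
A small simulation, under an assumed linear-plus-noise model (not part of the question), shows how $\epsilon_1$ and $\epsilon_2$ are computed and how they compare on average across repetitions:

```python
import numpy as np

rng = np.random.default_rng(1)
n1, n2, reps = 30, 30, 500
eps1_vals, eps2_vals = [], []

for _ in range(reps):
    # Draw a train set D1 and a test set D2 from the same model.
    x1 = rng.uniform(0, 1, n1); y1 = 2 * x1 + rng.normal(0, 1, n1)
    x2 = rng.uniform(0, 1, n2); y2 = 2 * x2 + rng.normal(0, 1, n2)

    # Fit f_hat on the training data only.
    slope, intercept = np.polyfit(x1, y1, deg=1)

    eps1_vals.append(np.mean((slope * x1 + intercept - y1) ** 2))  # train error
    eps2_vals.append(np.mean((slope * x2 + intercept - y2) ** 2))  # test error

print(np.mean(eps1_vals), np.mean(eps2_vals))
```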

Estimator bias

Imagine you are given an estimator and a dataset. Using test/train splits and/or the bootstrap, you can usually get an accurate assessment of the estimator’s bias on this dataset. True or false?
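
As a reminder of the mechanics involved, here is a minimal bootstrap bias estimate for a hypothetical estimator (the plug-in variance, which is known to be biased downward):

```python
import numpy as np

rng = np.random.default_rng(2)
data = rng.normal(size=50)

def estimator(x):
    return np.var(x)  # plug-in variance estimator (ddof=0)

theta_hat = estimator(data)

# Bootstrap: resample the data with replacement, re-apply the estimator,
# and compare the average replicate to the original estimate.
B = 2000
replicates = np.array([
    estimator(rng.choice(data, size=data.size, replace=True))
    for _ in range(B)
])
bias_estimate = replicates.mean() - theta_hat
print(theta_hat, bias_estimate)
```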

Mean squared error

Consider a prediction problem with one categorical predictor $X \in \{1,2,\ldots,M\}$ and one continuous response $Y \in \mathbb{R}$. You obtain an estimate $\hat f$ for the regression function $\mathbb{E}[Y|X=x]$. True or false:

$$
\mathbb{E}\left[(Y-\hat f(X))^2\right] = \frac{1}{M}\sum_{x=1}^M \mathbb{E}\left[(Y-\hat f(X))^2 \,\middle|\, X=x\right].
$$
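
For reference, the law of total expectation writes the unconditional mean squared error as a weighted average of the conditional ones:

$$
\mathbb{E}\left[(Y-\hat f(X))^2\right] = \sum_{x=1}^{M} \mathbb{P}(X=x)\,\mathbb{E}\left[(Y-\hat f(X))^2 \,\middle|\, X=x\right].
$$

The displayed claim replaces the weights $\mathbb{P}(X=x)$ with $1/M$.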

Mean squared error and heteroskedasticity

Consider a prediction problem with one continuous response $Y \in \mathbb{R}$. You obtain an estimate $\hat f$ for the regression function $\mathbb{E}[Y|X=x]$. Let $x_1,x_2$ denote two possible values for the input. True or false: if there is no heteroskedasticity in the true relationship between $X$ and $Y$, then we know that

$$
\mathbb{E}\left[(Y-\hat f(X))^2 \,\middle|\, X=x_1\right] = \mathbb{E}\left[(Y-\hat f(X))^2 \,\middle|\, X=x_2\right]
$$
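
For reference, treating $\hat f$ as fixed and writing $f(x) = \mathbb{E}[Y|X=x]$ for the true regression function, the conditional mean squared error decomposes as

$$
\mathbb{E}\left[(Y-\hat f(X))^2 \,\middle|\, X=x\right] = \operatorname{Var}(Y \mid X=x) + \left(f(x)-\hat f(x)\right)^2,
$$

where homoskedasticity concerns only the first term.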

Quality of an estimate for the conditional distribution

Let $\hat p(y|x)$ denote an estimate of the conditional distribution of a binary response $Y$ given an input $X$. Which of the following are traditional tools for measuring the quality of this estimate? (A computational sketch follows the list.)

  1. Log likelihood

  2. Misclassification rate (of the hard classifier $\hat y(x) = \arg \max_y \hat p(y|x)$)

  3. Squared error

  4. Mean absolute error

  5. AUROC
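
As a sketch of how these quantities could be computed for a binary response, using scikit-learn’s metric helpers and hypothetical predicted probabilities $\hat p(1|x)$:

```python
import numpy as np
from sklearn.metrics import log_loss, brier_score_loss, roc_auc_score

rng = np.random.default_rng(3)
y = rng.integers(0, 2, size=200)  # binary responses
# Hypothetical predicted probabilities p_hat(1 | x), mildly informative about y.
p1 = np.clip(0.3 + 0.4 * y + rng.normal(0, 0.15, size=200), 0.01, 0.99)

print("log likelihood:        ", -log_loss(y, p1, normalize=False))
print("misclassification rate:", np.mean((p1 > 0.5) != y))  # hard classifier argmax_y p_hat(y|x)
print("squared error (Brier): ", brier_score_loss(y, p1))
print("mean absolute error:   ", np.mean(np.abs(y - p1)))
print("AUROC:                 ", roc_auc_score(y, p1))
```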

Low train error, high test error

If you observe low training error and high validation error for your model, you might want to adjust your estimator in ways that increase its bias. True or false?
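
For instance, with ridge regression (a hypothetical choice; any move toward a more regularized or simpler estimator has the same flavor), increasing the regularization strength raises bias and typically lowers variance:

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(4)
X = rng.normal(size=(100, 20))
y = X[:, 0] + rng.normal(scale=0.5, size=100)
X_tr, X_val, y_tr, y_val = train_test_split(X, y, random_state=0)

for alpha in [0.01, 1.0, 100.0]:  # larger alpha -> more bias, less variance
    model = Ridge(alpha=alpha).fit(X_tr, y_tr)
    train_err = np.mean((model.predict(X_tr) - y_tr) ** 2)
    val_err = np.mean((model.predict(X_val) - y_val) ** 2)
    print(alpha, train_err, val_err)
```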