6 `validate()`

The glance() function from broom had a vague label for the F statistic (simply “statistic”) and lacked any kind of pseudo R-squared for logistic regressions.

Furthermore, while glance() output works well inside data frames, its wide form is cumbersome for quickly judging model validity. validate() therefore returns similar output as a column vector and, for logistic regressions, adds McFadden’s pseudo R-squared and the apparent error rate, defined as the share of incorrect predictions among all predictions (i.e. number incorrect / number of observations). Those who prefer the broom layout can always transpose the vector. Alternatively, setting dataframe = TRUE in the function call returns the output as a data frame.
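To make the two logistic-regression additions concrete, here is a minimal base R sketch of how McFadden’s pseudo R-squared and an apparent error rate can be computed by hand. It is not taken from the validate() source. The 0.5 classification cutoff and the object names (`null_model`, `mcfadden`, `aer`) are assumptions made for illustration.

```r
# Logit model on the built-in mtcars data (same specification as Case 2).
model <- glm(am ~ mpg + wt, data = mtcars, family = binomial(link = "logit"))

# Intercept-only model, used as the baseline for McFadden's pseudo R-squared.
null_model <- update(model, . ~ 1)

# McFadden's pseudo R-squared: 1 - logLik(fitted) / logLik(null).
mcfadden <- 1 - as.numeric(logLik(model)) / as.numeric(logLik(null_model))

# Apparent error rate: share of training observations that are misclassified.
# ASSUMPTION: predicted class is 1 when the fitted probability exceeds 0.5.
predicted <- as.integer(fitted(model) > 0.5)
aer <- mean(predicted != mtcars$am)

round(c(pseudo.rsq.mcfad = mcfadden, aer = aer), 6)
```

Because the error rate is measured on the same data used to fit the model, it tends to be optimistic compared with an out-of-sample estimate.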

Output definitions are in the help file associated with this function.

6.1 Case 1: OLS

model.lm <- lm(data = mtcars, formula = mpg ~ wt + gear)

validate(model.lm)

##                   model.lm
## n                32.000000
## rsq               0.753842
## adj.rsq           0.736866
## F.stat           44.405361
## df.num            3.000000
## df.den           29.000000
## p.value           0.000000
## residual.median  -0.293202
## residual.mean     0.000000
## residual.sd       2.990226
## residual.se       0.528602
## rmse              2.943133
## mad               1.943778
## mae               2.353567
## medianpe         -0.016107
## mpe              -0.015267
## sdpe              0.161915
## sepe              0.028623
## AIC             167.898446
## BIC             173.761389
## loglik          -79.949223

model.lm <- lm(data = mtcars, formula = mpg ~ wt + gear)

validate(model.lm, dataframe = TRUE) # return a data frame instead of a column vector

##          statistic   model.lm
## 1                n  32.000000
## 2              rsq   0.753842
## 3          adj.rsq   0.736866
## 4           F.stat  44.405361
## 5           df.num   3.000000
## 6           df.den  29.000000
## 7          p.value   0.000000
## 8  residual.median  -0.293202
## 9    residual.mean   0.000000
## 10     residual.sd   2.990226
## 11     residual.se   0.528602
## 12            rmse   2.943133
## 13             mad   1.943778
## 14             mae   2.353567
## 15        medianpe  -0.016107
## 16             mpe  -0.015267
## 17            sdpe   0.161915
## 18            sepe   0.028623
## 19             AIC 167.898446
## 20             BIC 173.761389
## 21          loglik -79.949223
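The residual-based rows can be approximated directly from the fitted model. The sketch below uses conventional textbook formulas, not the package’s documented definitions (those are in the help file). The standard deviation, standard error, and RMSE formulas appear consistent with the values printed above. The MAE and percent-error definitions are assumptions and may differ from what validate() computes.

```r
# Same OLS model as in Case 1.
model <- lm(mpg ~ wt + gear, data = mtcars)
res <- residuals(model)
n <- length(res)

res_sd <- sd(res)            # sample standard deviation (n - 1 denominator)
res_se <- res_sd / sqrt(n)   # standard error of the mean residual
rmse <- sqrt(mean(res^2))    # root mean squared error (n denominator)
mae <- mean(abs(res))        # ASSUMPTION: mean absolute error

# ASSUMPTION: percent error is the residual divided by the observed value.
pe <- res / mtcars$mpg
mpe <- mean(pe)

round(c(residual.sd = res_sd, residual.se = res_se,
        rmse = rmse, mae = mae, mpe = mpe), 6)
```

Note that `rmse` divides by n while `residual.sd` divides by n - 1, which is why the two values differ slightly even though the mean residual is zero.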

6.2 Case 2: GLM (logit)

model.glm <- glm(formula = am ~ mpg + wt, mtcars, 
                 family  = binomial(link = 'logit'))

validate(model.glm)

##                   model.glm
## n                 32.000000
## pseudo.rsq.mcfad   0.602490
## aer                0.062500
## null.deviance     43.229733
## residual.deviance 17.184255
## df.null           31.000000
## df.residual       29.000000
## residual.median   -0.046842
## residual.mean     -0.044152
## residual.sd        0.743181
## residual.se        0.131377
## rmse               0.732808
## mad                0.384793
## mae                0.508942
## medianpe           1.024399
## mpe                0.881613
## sdpe               0.482775
## sepe               0.085343
## AIC               23.184255
## BIC               27.581463
## loglik            -8.592128

# Note that the percent error (pe) statistics are not meaningful for a binary response.

6.3 Case 3: NLS

model.nls <- nls(Ozone ~ theta0 + Temp^theta1, airquality)

validate(model.nls)

##                         model.nls
## n                      116.000000
## iterations               4.000000
## convergence_tolerance    0.000001
## sigma                   23.624178
## df.sigma               114.000000
## residual.median         -0.684547
## residual.mean           -0.000002
## residual.sd             23.521240
## residual.se              2.183892
## rmse                    23.419636
## mad                     15.047691
## mae                     17.120045
## medianpe                -0.011579
## mpe                     -0.287614
## sdpe                     1.161944
## sepe                     0.107884
## AIC                   1066.823097
## BIC                   1075.083868
## loglik                -530.411549
