We have the results from our survey.
Our sample has 30 records but with basic data cleansing we will be working with 28.
ggplot(results)+ggopts+
aes(x = Age, y=..count..)+
geom_histogram(bins = 10, fill="white", colour="white")
We could take some measure of central tendency to predict the age of attendees.
averages<-training[,.(Mean=floor(mean(Age))
,Median=floor(median(Age))
,Mode=Mode(Age)
)]
knitr::kable(averages)
Mean | Median | Mode |
---|---|---|
38 | 37 | 45 |
holdout.lse<-melt(holdout, measure.vars = colnames(averages))
holdout.lse[,Error:=(Age-value)^2]
knitr::kable(holdout.lse[,.(LSE=sum(Error)), variable][order(LSE)])
variable | LSE |
---|---|
Median | 279 |
Mean | 348 |
Mode | 1223 |
training[,expLMres:=expLM$fitted]
ggplot(training, aes(x=Experience, y=Age))+
geom_point()+
geom_line(aes(y=expLMres),colour="blue")+
theme_minimal()
holdout.lse<-melt(holdout, measure.vars = c("expLMres",colnames(averages)))
holdout.lse[,Error:=(Age-value)^2]
knitr::kable(holdout.lse[,.(LSE=sum(Error)), variable][order(LSE)])
variable | LSE |
---|---|
expLMres | 225.4772 |
Median | 279.0000 |
Mean | 348.0000 |
Mode | 1223.0000 |
fieldLM<-lm(Age~Experience + Field, training)
summary(fieldLM)
##
## Call:
## lm(formula = Age ~ Experience + Field, data = training)
##
## Residuals:
## Min 1Q Median 3Q Max
## -7.685 -4.402 -1.991 3.977 14.315
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 26.6427 4.1755 6.381 9.11e-06 ***
## Experience 0.6209 0.2580 2.407 0.0285 *
## FieldBI 4.7124 5.6953 0.827 0.4202
## FieldDBA 5.4458 5.3848 1.011 0.3269
## FieldOther 3.6241 5.5618 0.652 0.5239
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 6.968 on 16 degrees of freedom
## Multiple R-squared: 0.4548, Adjusted R-squared: 0.3185
## F-statistic: 3.337 on 4 and 16 DF, p-value: 0.03621
training[,fieldLMres:=fieldLM$fitted]
ggplot(training, aes(x=Experience, y=Age, group=Field, colour=Field))+
geom_point()+
geom_line(aes(y=fieldLMres, group=Field),colour="blue")+
facet_wrap(~Field)+
theme_minimal()
holdout.lse<-melt(holdout, measure.vars = c("fieldLMres","expLMres",colnames(averages)))
holdout.lse[,Error:=(Age-value)^2]
knitr::kable(holdout.lse[,.(LSE=sum(Error)), variable][order(LSE)])
variable | LSE |
---|---|
fieldLMres | 134.0936 |
expLMres | 225.4772 |
Median | 279.0000 |
Mean | 348.0000 |
Mode | 1223.0000 |