optiRum – gini like a wizard
optiRum, the R package on CRAN that I built and maintain for Optimum, has gained some extra functions recently. Some of them use data.table functionality that is currently experimental, so I’m eagerly awaiting that release to CRAN before I can deliver the updated optiRum.
In the interim, I thought I’d give some brief overviews of existing functionality contained in the package.
I do a lot of regression models, and one of the common tools for assessing a regression’s ability to accurately model an event is to produce a Gini chart and a Gini coefficient. The higher the Gini coefficient, the better your model discriminates between outcomes.
I simplify the process of producing Gini charts (giniChart) and coefficients (giniCoef) so that I get a chart in one simple step.
Under the hood this uses the AUC package to get the coefficient, scales to format it and ggplot2 to produce the chart. Using ggplot2 leads to a better-looking chart that can also be tweaked to suit your needs, since the function returns a ggplot object.
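As a rough illustration of the one-step workflow, here is a minimal sketch. The sample data is made up, and I'm assuming both functions take the predictions first and the actual outcomes second; check ?giniChart and ?giniCoef if your version differs. The requireNamespace guard simply skips the example when optiRum isn't installed.

```r
# Hypothetical data: a random score and a 0/1 outcome (illustration only)
set.seed(42)
sampledata <- data.frame(
  val     = rnorm(100),            # stand-in for model scores
  outcome = rbinom(100, 1, 0.8)    # stand-in for observed outcomes
)

if (requireNamespace("optiRum", quietly = TRUE)) {
  library(optiRum)

  # Assumed argument order: predictions, then actuals
  coef  <- giniCoef(sampledata$val, sampledata$outcome)
  chart <- giniChart(sampledata$val, sampledata$outcome)

  print(coef)

  # The chart is a ggplot object, so ordinary ggplot2 layers can be added
  print(chart + ggplot2::ggtitle("Gini chart for a random score"))
}
```

Because the score here is pure noise, the coefficient should land somewhere near zero; a real model would hopefully do better.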
Two gini’s but only one lamp
There are two typical ways of calculating the gini coefficient.
The first utilises a Lorenz curve which plots the cumulative percentage of goods vs cumulative percentage of bads.
The second is to do a Receiver Operating Characteristic (ROC) curve and calculate the Area Under Curve (AUC) for this.
Notionally there is little practical difference in the result (so long as you use one or the other consistently); however, I’ve utilised the ROC method as it emphasises the ability to correctly predict good behaviour. In my industry, it is much more costly to write a bad loan than it is to not write a good loan.
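To make the ROC route concrete, here is a small base R sketch of the usual relationship Gini = 2 × AUC − 1. This is not the package's code (optiRum leans on the AUC package); the function name gini_from_auc is made up for illustration, and the AUC is computed with the rank-based (Mann-Whitney) shortcut rather than by tracing the curve.

```r
# Hypothetical helper: Gini coefficient via the ROC AUC, base R only.
# pred: scores or probabilities; act: outcomes coded 0/1.
gini_from_auc <- function(pred, act) {
  act   <- as.integer(act)
  n_pos <- sum(act == 1)
  n_neg <- sum(act == 0)
  stopifnot(n_pos > 0, n_neg > 0)   # need both classes present

  # Rank-based AUC: probability that a random positive outscores
  # a random negative (ties get average ranks, counting as half)
  r   <- rank(pred)
  auc <- (sum(r[act == 1]) - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)

  2 * auc - 1
}

# A score that separates the classes perfectly
perfect <- gini_from_auc(c(0.1, 0.2, 0.8, 0.9), c(0, 0, 1, 1))

# A constant score carries no information
useless <- gini_from_auc(c(0.5, 0.5, 0.5, 0.5), c(0, 0, 1, 1))
```

A perfectly separating score gives a Gini of 1, an uninformative one gives 0, and a score that ranks everything backwards gives -1.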
Get the package
- Install from CRAN:
install.packages("optiRum")
- Install the dev version from the github repository using devtools:
devtools::install_github("stephlocke/optiRum")