Function Reference

ROCKS.BCDiag
ROCKS.accuracyplot
ROCKS.bcdiag
ROCKS.biasplot
ROCKS.concordance
ROCKS.cumliftable
ROCKS.cumliftcurve
ROCKS.ksplot
ROCKS.kstest
ROCKS.liftable
ROCKS.liftcurve
ROCKS.ranks
ROCKS.rocplot

ROCKS.BCDiag — Type

BCDiag

A structure holding the diagnostic properties of a binary classifier. It facilitates summary plots and tables.

ROCKS.accuracyplot — Method

accuracyplot(x::BCDiag; util=[1, 0, 0, 1])

Using util values for [TP, FN, FP, TN], produce the accuracy plot and its [max, argmax, argdep].
The default util values of [1, 0, 0, 1] give the standard accuracy value of (TP+TN)/N.
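
As a rough illustration of how the util weights could enter the calculation, the sketch below computes a utility-weighted accuracy from confusion-matrix counts. The helper name weighted_accuracy is hypothetical and not part of ROCKS, and dividing the weighted sum by N is an assumption extrapolated from the default case described above.

```julia
# Hypothetical helper, not part of ROCKS: utility-weighted accuracy from
# confusion-matrix counts, with util ordered as [TP, FN, FP, TN].
# Assumption: the weighted sum is normalized by the total count N.
function weighted_accuracy(tp, fn, fp, tn; util = [1, 0, 0, 1])
    counts = [tp, fn, fp, tn]
    return sum(util .* counts) / sum(counts)
end

# With the default util this reduces to (TP + TN) / N.
acc_default = weighted_accuracy(40, 10, 5, 45)                        # (40 + 45) / 100

# Charge each false positive a cost of 2 while rewarding correct calls with 1.
acc_costly = weighted_accuracy(40, 10, 5, 45; util = [1, 0, -2, 1])   # (40 - 10 + 45) / 100
```

Evaluating such a quantity at every candidate cutoff and taking the maximum would give the kind of [max, argmax, argdep] summary mentioned above.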

ROCKS.bcdiag — Method

bcdiag(target, pred; groups = 100, rev = true, tie = 1e-6)

Perform diagnostics of a binary classifier.
target is a 2-level categorical variable; pred is the predicted probability of class 1.
groups is the number of bins to use for plotting/printing.
rev = true orders pred from high to low.
tie is the tolerance within which values of pred are considered tied.
Returns a BCDiag struct, which can be used for plotting or printing:

biasplot is the calibration plot of target response rate vs. predicted response rate
ksplot produces the KS plot of cumulative distributions
rocplot plots the Receiver Operating Characteristic curve
accuracyplot plots the accuracy curve with adjustable utility
liftcurve is the lift curve
cumliftcurve is the cumulative lift curve
liftable is the lift table as a DataFrame
cumliftable is the cumulative lift table as a DataFrame

ROCKS.biasplot — Method

biasplot(x::BCDiag)

Return the bias calibration plot of x: actual response vs. predicted response.

ROCKS.concordance — Function

concordance(class, var, tie)

Computes concordant, tied and discordant pairs.
class can be either a BitVector or a 2-level categorical target variable; in the latter case, true is defined by the last value in sorted order.
var is a Vector of predictor values, the same length as class.
tie (optional) can be a number (default is 1e-6) that defines a tied region, or a function that, when called with a scalar value, returns a tuple of the lower and upper bounds of a tied region. The function form is useful when you want, for instance, a percentage tied region.

Pairwise comparisons between class 1 and class 0 values are made as follows:

class 1 value > class 0 value is Concordant
class 1 value ≈ class 0 value (within tie) is Tied
class 1 value < class 0 value is Discordant

Returns:

concordant, number of concordant comparisons
tied, number of tied comparisons
discordant, number of discordant comparisons
auroc, or C, is (Concordant + 0.5 × Tied) / Total comparisons; this equals the numeric integration of the ROC curve
gini, 2C-1, also known as Somers' D, is (Concordant - Discordant) / Total comparisons

The concordance calculation gives the same result as numeric integration of the ROC curve, but it also allows fuzzy tied regions, so that predictions within a tolerance of each other count as equal.

Note:

Goodman-Kruskal Gamma is (Concordant - Discordant) / (Concordant + Discordant)
Kendall's Tau is (Concordant - Discordant) / (0.5 × Total count × (Total count - 1))
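
The pairwise rules and return values described above can be sketched in plain Julia. This is an illustrative brute-force version, not the ROCKS implementation: the names concordance_sketch and tiebounds are hypothetical, class is restricted to Bool vectors, and centering the tied region on the class 1 value is an assumption.

```julia
# Illustrative O(n1 * n0) sketch, not the ROCKS implementation.
# tie is either a number (half-width of the tied region) or a function
# mapping a value to a (lower, upper) tuple.
# Assumption: the tied region is built around the class 1 value.
tiebounds(tie::Number, v) = (v - tie, v + tie)
tiebounds(tie::Function, v) = tie(v)

function concordance_sketch(class::AbstractVector{Bool},
                            var::AbstractVector{<:Real}, tie = 1e-6)
    pos = var[class]       # predictor values for class 1
    neg = var[.!class]     # predictor values for class 0
    conc = tied = disc = 0
    for v1 in pos
        lo, hi = tiebounds(tie, v1)
        for v0 in neg
            if lo <= v0 <= hi
                tied += 1
            elseif v1 > v0
                conc += 1
            else
                disc += 1
            end
        end
    end
    total = conc + tied + disc
    auroc = (conc + 0.5 * tied) / total
    gini = (conc - disc) / total          # equals 2 * auroc - 1
    return (concordant = conc, tied = tied, discordant = disc,
            auroc = auroc, gini = gini)
end

class = [true, true, false, false, true]
score = [0.9, 0.4, 0.4, 0.2, 0.7]

# Absolute tolerance: 5 concordant, 1 tied, 0 discordant pairs.
res = concordance_sketch(class, score)

# Percentage tied region: values within 10% of the class 1 value are tied.
res_pct = concordance_sketch(class, score, v -> (0.9 * v, 1.1 * v))
```

For this small data set both calls give auroc = 5.5 / 6 and gini = 5 / 6, which agrees with the identity gini = 2C - 1.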

ROCKS.cumliftable — Method

cumliftable(x::BCDiag)

Return the cumulative lift table of x as a DataFrame.

ROCKS.cumliftcurve — Method

cumliftcurve(x::BCDiag)

Return the cumulative lift curve plot of x: cumulative actual and predicted vs. depth.

ROCKS.ksplot — Method

ksplot(x::BCDiag)

Return the KS plot of x: CDF1 (True Positive) and CDF0 (False Positive) versus depth.

ROCKS.kstest — Method

kstest(class, var; rev = true)

Calculate the empirical two-sample Kolmogorov-Smirnov statistic and its location.
class is a 2-level categorical variable; var is the distribution to analyze.

Returns:

n, total number of observations
n1, number of observations of class 1
n0, number of observations of class 0
baserate, incidence rate of class 1
ks, the maximum separation between the two cumulative distributions
ksarg, the value of var at which maximum separation is achieved
ksdep, depth of ksarg in the sorted values of var

rev = true counts depth from high value towards low value
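
To make the returned quantities concrete, here is a minimal base-Julia sketch of a two-sample KS computation. It is not the ROCKS code: ks_sketch is a hypothetical name, class is restricted to Bool vectors, and reporting ksdep as a fraction of n (rather than a count) is an assumption.

```julia
# Illustrative sketch of the two-sample KS statistic, not the ROCKS code.
# Assumption: ksdep is reported as a fraction of the sorted observations.
function ks_sketch(class::AbstractVector{Bool},
                   var::AbstractVector{<:Real}; rev = true)
    n = length(var)
    n1 = count(class)
    n0 = n - n1
    order = sortperm(var; rev = rev)
    cum1 = cum0 = 0
    ks, ksarg, ksdep = 0.0, var[order[1]], 0.0
    for (i, idx) in enumerate(order)
        class[idx] ? (cum1 += 1) : (cum0 += 1)
        # Compare the two empirical CDFs only at the end of a run of equal values.
        if i == n || var[order[i + 1]] != var[idx]
            gap = abs(cum1 / n1 - cum0 / n0)
            if gap > ks
                ks, ksarg, ksdep = gap, var[idx], i / n
            end
        end
    end
    return (n = n, n1 = n1, n0 = n0, baserate = n1 / n,
            ks = ks, ksarg = ksarg, ksdep = ksdep)
end

class = [true, true, false, false, true]
score = [0.9, 0.4, 0.4, 0.2, 0.7]
k = ks_sketch(class, score)
# k.ks ≈ 2/3, reached at k.ksarg == 0.7, which is 2 of 5 observations deep.
```

With rev = true the scan starts from the highest score, so after the two highest values the class 1 CDF is at 2/3 while the class 0 CDF is still at 0.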

ROCKS.liftable — Method

liftable(x::BCDiag)

Return the lift table of x as a DataFrame.

ROCKS.liftcurve — Method

liftcurve(x::BCDiag)

Return the lift curve plot of x: actual and predicted versus depth.

ROCKS.ranks — Method

ranks(x; groups = 10, rank = tiedrank, rev = false)

Return a variable that assigns each element of x to one of groups bins.
The rank keyword selects the ranking method;
use rev = true to reverse sort so that small bin number is large value of x.
Missing values are assigned to group missing.

The default values rank = tiedrank and rev = false result in a grouping similar to SAS PROC RANK with groups=n ties=mean.
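
For readers unfamiliar with the SAS behaviour referred to above, the following sketch bins values using mean ranks for ties and the grouping rule documented for PROC RANK, floor(rank * groups / (n + 1)). Whether ROCKS uses exactly this rule or numbers its bins from 0 is an assumption; meanrank and bins_sketch are hypothetical names, and missing values are not handled here.

```julia
# Mean rank for ties, written in base Julia so the example is self-contained.
function meanrank(x::AbstractVector{<:Real})
    n = length(x)
    order = sortperm(x)
    r = zeros(Float64, n)
    i = 1
    while i <= n
        j = i
        # Extend j to the end of the run of values equal to x[order[i]].
        while j < n && x[order[j + 1]] == x[order[i]]
            j += 1
        end
        r[order[i:j]] .= (i + j) / 2      # average of sorted positions i..j
        i = j + 1
    end
    return r
end

# Assumed grouping rule (SAS PROC RANK): floor(rank * groups / (n + 1)),
# which yields bin numbers 0 .. groups - 1.
function bins_sketch(x::AbstractVector{<:Real}; groups = 10)
    n = length(x)
    return [floor(Int, r * groups / (n + 1)) for r in meanrank(x)]
end

x = [10, 20, 20, 30, 40, 50, 60, 70, 80]
b = bins_sketch(x; groups = 3)
# The two tied values of 20 share the mean rank 2.5 and land in the same bin.
```

Because tied values receive the same rank, they always fall into the same bin, which is the property the tied=mean style of ranking is meant to guarantee.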

ROCKS.rocplot — Method

rocplot(x::BCDiag)

Return the ROC plot of x: CDF1 (True Positive) vs. CDF0 (False Positive).
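
The points on such a curve can be sketched without any plotting package. The code below is an illustration rather than the ROCKS implementation: roc_sketch is a hypothetical name, class is restricted to Bool vectors, and the area is obtained by the trapezoidal rule.

```julia
# Illustrative sketch: ROC points as (CDF0, CDF1) pairs, scanning pred from
# high to low, followed by trapezoidal integration. Not the ROCKS code.
function roc_sketch(class::AbstractVector{Bool}, pred::AbstractVector{<:Real})
    n1 = count(class)
    n0 = length(class) - n1
    order = sortperm(pred; rev = true)
    fpr, tpr = [0.0], [0.0]
    tp = fp = 0
    for (i, idx) in enumerate(order)
        class[idx] ? (tp += 1) : (fp += 1)
        # Emit one point per distinct value of pred so that ties form a single step.
        if i == length(order) || pred[order[i + 1]] != pred[idx]
            push!(fpr, fp / n0)
            push!(tpr, tp / n1)
        end
    end
    auc = sum((fpr[k + 1] - fpr[k]) * (tpr[k + 1] + tpr[k]) / 2
              for k in 1:length(fpr) - 1)
    return (fpr = fpr, tpr = tpr, auc = auc)
end

class = [true, true, false, false, true]
score = [0.9, 0.4, 0.4, 0.2, 0.7]
roc = roc_sketch(class, score)
# roc.auc ≈ 11/12 for this data set.
```

The tied pair at 0.4 produces a diagonal segment, and the trapezoid under it contributes half credit for the tie, which is why the area matches (Concordant + 0.5 × Tied) / Total comparisons on the same data.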
