Applying Complementary Credit Scores to Calculate Aggregate Ranking
Researchers have been improving credit scoring models for decades, because even a small increase in the predictive ability of scoring can allow financial institutions to avoid significant losses. Many researchers believe that ensembles of classifiers or aggregated scorings are the most effective. On unbalanced samples, however, ensembles outperform base classifiers by only thousandths of a percent.
This article proposes an aggregated scoring model. In contrast to previous models, its base classifiers are focused on identifying different types of borrowers. We illustrate the effectiveness of such scoring aggregation on real unbalanced data.
We measure effectiveness by the area under the ROC curve (AUC). The DeLong, DeLong and Clarke-Pearson test is used to assess whether the difference between two or more areas is statistically significant. To obtain a scoring aimed at default borrowers, we apply a logistic model of defaults (logistic regression) to company financial statement data; this model is usually used to identify default borrowers. To obtain a scoring aimed at non-default borrowers, we employ a modified Kemeny median, which was initially developed to rank companies with credit ratings. Both scores are aggregated by logistic regression.
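The aggregation and evaluation steps can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the six observations are invented toy values, `score_a` stands in for a financial-statement score and `score_b` for a ratings-based score, and the helper names `auc`, `fit_logistic` and `predict` are hypothetical. It fits a logistic regression by plain gradient descent and compares areas under the ROC curve in-sample; it implements neither the modified Kemeny median nor the DeLong, DeLong and Clarke-Pearson test.

```python
import math

def auc(scores, labels):
    """Area under the ROC curve via the Mann-Whitney statistic (ties count 1/2)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one default and one non-default")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

def fit_logistic(X, y, lr=0.5, epochs=2000):
    """Logistic regression by batch gradient descent; returns [intercept, w1, ...]."""
    w = [0.0] * (len(X[0]) + 1)
    n = len(X)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for xi, yi in zip(X, y):
            z = w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi))
            err = 1.0 / (1.0 + math.exp(-z)) - yi  # predicted probability minus label
            grad[0] += err
            for j, xj in enumerate(xi):
                grad[j + 1] += err * xj
        w = [wj - lr * gj / n for wj, gj in zip(w, grad)]
    return w

def predict(w, X):
    """Predicted default probabilities for the rows of X."""
    return [
        1.0 / (1.0 + math.exp(-(w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi)))))
        for xi in X
    ]

# Invented toy data (NOT from the article): label 1 = default, 0 = non-default.
labels = [1, 1, 1, 0, 0, 0]
score_a = [0.9, 0.2, 0.8, 0.1, 0.3, 0.2]  # hypothetical financial-statement score
score_b = [0.3, 0.8, 0.9, 0.2, 0.1, 0.3]  # hypothetical ratings-based score

# Aggregate the two base scores with a second-stage logistic regression.
X = [[a, b] for a, b in zip(score_a, score_b)]
weights = fit_logistic(X, labels)
aggregated = predict(weights, X)

auc_a = auc(score_a, labels)
auc_b = auc(score_b, labels)
auc_agg = auc(aggregated, labels)  # in-sample, for illustration only
print(f"AUC a: {auc_a:.3f}, AUC b: {auc_b:.3f}, AUC aggregated: {auc_agg:.3f}")
```

In this toy setup each base score misranks a different borrower, so combining them can rank the observations better than either score alone. A real evaluation would use held-out data and a significance test on the difference in areas.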
Our data cover Russian banks that existed or defaulted between July 1, 2010, and July 1, 2015. This sample of banks is highly unbalanced, with defaults accounting for about 5% of observations. The aggregation was carried out for banks with several ratings.
We show that aggregated classifiers based on different types of information significantly improve the discriminatory power of scoring even on an unbalanced sample. Moreover, the absolute value of this improvement surpasses all the values previously obtained from unbalanced samples.
Financial institutions can apply the aggregated scoring and the approach to its construction to credit risk assessment. Thanks to the relatively high interpretability of the scores, the scoring can also serve as an auxiliary tool in the decision-making process.