Default Prediction Model for Emerging Capital Market Service Companies

The author tested the hypothesis that default prediction based on financial data may be inapplicable to Russian service sector organizations by analyzing the differences in the accuracy of models based solely on financial data for service providers from Russia and from developed European countries. Logistic Regression, Random Forest and K-Nearest Neighbors machine learning methods were used as modeling tools on a sample of 404 Russian firms and 304 firms from developed European countries. The results suggest that the prediction error is significantly higher for Russian firms than for firms from the control group (European service firms). Thus, the use of financial ratios alone for default prediction for service firms in Russia seems insufficient. These findings can be used by organizations that provide credit scoring, and by any other market participants interested in assessing the financial stability of their counterparties.


Introduction
The conventional approach to default prediction uses financial ratios as determinants of default. Since the late 1960s, numerous researchers have demonstrated that financial ratios are good default predictors, starting with the famous paper by Edward Altman [1] and continuing through recent papers by both foreign [2] and Russian researchers [3; 4].
The set of financial ratios used as default predictors has also expanded. Researchers have added non-trivial predictors, such as the growth rate of income [18] or the standard deviation of stock returns [19]. Some researchers also show that non-financial predictors can improve prediction accuracy [20][21][22][23][24][25][26]. However, there are still very few papers that deal with non-financial predictors of default in general, and with Russian firms in particular. One possible explanation is the high predictive power of conventional default prediction models (based on financial ratios).
At the same time, using only financial ratios for default prediction seems to be inefficient in developing economies, and specifically in the Russian service sector. It seems that the financial reporting of service firms in Russia does not always reflect the real condition of the business. First of all, some operations may be undisclosed, or there may be certain falsifications. Secondly, one firm may comprise several legal entities, and the managers are free to distribute revenues, expenditures, debt and capital between legal entities at will. These factors may make financial reporting biased and, hence, irrelevant to default prediction. Thus, the prediction accuracy may turn out to be low.
In this paper I compare the prediction accuracy of the Logistic Regression, K-Nearest Neighbors and Random Forest classification algorithms, trained on a set of Russian service firms as well as on service firms from developed European markets. The algorithms were trained on the financial ratios of defaulted service firms, reported for the year preceding the year of default, and on the financial ratios of non-defaulted firms. The firms from developed European markets were used as the control group. The accuracy of prediction was expected to be lower for Russian service firms, because of the likely bias in financial reporting caused by shadow operations and business disaggregation, than for firms from developed European markets, which do not seem to have these features. Hence, the purpose of this study is to estimate the potential default prediction accuracy for Russian service firms when only financial data is used as predictors, and to compare it with that for firms from developed European markets. Such an analysis makes it possible to judge whether financial ratios can be used for predicting the default of Russian service firms.
In the next section I provide a review of the literature related to default prediction, and subsequently explain why the service sector was selected for this analysis. In the Theoretical framework section I provide a more detailed explanation of why financial ratios do not seem to be reliable default predictors for Russian service firms, and in the sections on data and Machine Learning algorithms I describe the data and the algorithms used. Finally, I present and discuss the results of modelling in the Results section.

Literature review
Default prediction for firms has existed for over 50 years, starting from the first credit risk model developed by W. Beaver [27]. In an attempt to increase prediction accuracy, it has been evolving in two major domains: methods and explanatory variables.
Firstly, following the development of statistical techniques and econometrics, researchers started to use more advanced modelling techniques, starting with Edward Altman [1], who implemented the Multiple Discriminant Analysis approach, and proceeding with James Ohlson [5], who was probably the first to use Logistic Regression to create a default probability assessment model. Logistic Regression (Logit) and a similar algorithm, Probit Regression, were commonly used by 20th-century researchers and are still used today [4; 8; 10; 17], mostly because of their simplicity, given that these are linear algorithms. However, Machine Learning algorithms now seem to be the leading framework in default prediction studies.
Many different Machine Learning algorithms are used for default prediction; based on the analyzed literature, the most popular are Artificial Neural Networks [14; 28] and Support Vector Machines [18; 25].
One of the contributions of this paper is the implementation of the Random Forest algorithm as the underlying default prediction technique. This algorithm seems to be underused in default prediction studies, despite the high performance demonstrated by previous researchers [29; 30].
A separate area of research within default prediction is credit rating modelling [31; 32]. These models are based on corporate financial data and macroeconomic data and are applicable mostly to public firms, because of the significant influence of market capitalization on the credit rating.
The second development vector for default prediction is expanding the set of explanatory variables, that is, going beyond the use of financial data alone. This development vector is relatively new, a "novel trend in this field" [21]. According to Altman [23], there was no research in this field for small and medium enterprises at all before 2010.
There are no restrictions on the use of any data available for analyzed firms to predict defaults, and researchers are starting to utilize these data. Examples of such variables are indicators related to text published in news or disclosures of a firm (e.g. sentiment level or the use of certain words) [25; 26], as well as legal claim-related indicators [21], corporate governance indicators [20], CSR measures [22], and audit report indicators (e.g. sentiment level, number of auditor's comments, etc.) [24].
Based on the analyzed literature, the use of non-financial data does not seem to have replaced the conventional approach (based on financial data), especially since there are few papers related to Russian firms. This fact can potentially be explained by the high accuracy of default prediction based on financial data. However, as demonstrated further in this paper, the approach based on financial data may perform poorly for Russian firms, and in this case the use of non-financial data may prove to be a good solution.

Defaults in the Russian service sector
I chose the Russian service sector for this analysis because the need for accurate default prediction is especially relevant in this sector. First of all, in 2015-2020 the overall number of bankruptcies decreased, while the share of service sector bankruptcies in the overall number of cases increased (see Figures 1 and 2). Hereinafter, the year 2020 is not taken into account because of the bankruptcy moratorium introduced in Russia due to the COVID-19 pandemic. Secondly, the share of debts paid out to creditors during default procedures in the service sector is among the lowest across industries. In 2019 this ratio was only 3.4% (less than the average of 4.7%) (Figure 3). It means that in the case of default the expected debt repayment per 100 RUB borrowed is only 3.4 RUB. Also, firms in the service sector tend to have debts more often than those in any other industry. According to research conducted by the Centre for Strategic Development¹, 55% of service firms have debts, while the market average is 36% (Figure 4). This may be an indicator of the higher credit risk of service sector industries compared with other industries.
The increasing share of defaults and the low rate of debt repayment in case of a default are driven by the specificities of the service sector. The sector consists mostly of B2C businesses, which means a high level of competition and therefore low margins. The average profitability of the service sector is lower than in other industries, such as production, agriculture, or mining, and is sometimes even negative (see Figure 5). This statement is less relevant for the medical services sector, but very relevant for such huge markets as HoReCa services and personal services (which include everyday services, i.e., repairs, hairdressing, etc.).
The last but not least argument for focusing on a specific sector of the economy, such as the service sector, is the gap in the research related to credit risk modelling, which is expressed in the lack of industry focus in such studies, as described in [31]. This study aims to contribute to filling the gap for the service sector.

Theoretical framework
The statement that the financial reporting of Russian service firms does not reflect the real condition of the firms is based on two main reasons:

Business disaggregation (artificial separation) makes the financial ratios biased
If a firm is divided into several legal entities, consolidated financial reports are needed in order to judge the condition of the entire business. On the one hand, it is not always possible to get the reports for a group of legal entities; on the other hand, some parts of a group can be presented as sole proprietors or legal entities that use the simplified taxation system and are not required to provide comprehensive reports. That is why one usually has to use data for a single legal entity to analyze a firm, and it seems that this data may be biased.
The problem of business disaggregation is highly relevant for the Russian market. Small legal entities have an opportunity to reduce their tax burden using the simplified taxation system. That is why owners often split their business into several small entities, hence reducing the tax burden [33]. The relevance of the business disaggregation problem is confirmed by the active prevention measures undertaken by the government. Since 2017, the Federal Tax Service and the Investigative Committee of Russia have been actively pursuing a relevant crime detection policy, which includes continuous development and updates to disaggregation criteria [34].
The business disaggregation problem is relevant for every economic sector in Russia, including the service sector. According to a survey conducted by TaxCoach², 24% of legal claims on business disaggregation in 2020 were related to service firms (Figure 6; source: TaxCoach).

Shadow operations lead to bias in the financial ratios
In the Soviet period, there were no legal private firms in Russia that could provide services for the population. At the same time, government-backed entities did not provide certain everyday services. Thus, the needed services, including repairs, transport, tutoring, etc., were provided by individuals. It was an illegal way to get the needed services, but the only one. The prolonged involvement in the shadow economy affected the concept of business culture in the minds of Russian citizens [35]. Moreover, the effect is still apparent.
According to the survey by The Forum for Research on Eastern Europe and Emerging Economies (FREE Network)³, the volume of the shadow economy in Russia is estimated to be almost 45% of GDP. The two major types of shadow operations are the underreporting of profits and "envelope wages" (according to Tatiana Golikova⁴, Deputy Prime Minister of the Russian Federation, about 15 million Russian citizens receive wages off the books). According to the Russian Longitudinal Monitoring Survey (HSE) 2020⁵, 16% of Russian citizens admit to being paid off the books, and 51% of them receive their entire salary unofficially.
If a firm is involved in some type of shadow operations, the official financial reporting for a legal entity will be biased: the revenues may be underreported, the costs may be exaggerated, etc.
Additional indirect evidence of biased financial reporting by Russian firms is offered by the weak auditing and accounting standards. According to The World Bank Global Competitiveness Index data⁶, the Russian Federation is in 100th position out of 137 countries by the strength of auditing and accounting standards (4 out of 7 points for the question "In your country, how strong are financial auditing and reporting standards? (1 = extremely weak; 7 = extremely strong)").
Thus, these factors lead us to believe that the available financial ratios of Russian service firms may be biased; hence, the use of financial information alone is not sufficient to assess the credit risk of Russian service firms.

Data description
It is necessary to specify the industries I consider to be parts of the service sector, because there is no single definition of it. According to the Great Russian Encyclopedia⁷, the service sector includes cultural, educational and domestic services. The Russian Federal State Statistics Service⁸ identifies postal, telecommunication, housing and utilities, medical and care, tourism, educational, cultural and legal services as parts of the public service sector. In this study, I worked with firms from the following industries, which are definitely elements of the service sector:
• Tourism, Accommodation and Passenger Transportation Services;
• Dining & Catering;
• Education;
• Medical & Social Services;
• Culture, Sport & Entertainment Services;
• Other services (personal services, veterinary services, repair services).
I prepared two datasets. The first dataset contains information on Russian service firms that faced financial failure from 2017 to 2020. The year in which the creditor sent out the notice of intent to file an application for default was used to identify the year of the financial failure. The data was collected from the SPARK-Interfax database⁹, and the dataset consists of 202 failed firms. Each of these firms is paired with a "healthy" one, that is, a firm that has not defaulted. The matching criterion is the value of total firm assets. This matching criterion is commonly used by researchers [8].
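The pairing step can be illustrated with a small sketch. The paper states only that total assets is the matching criterion; the greedy nearest-value matching without replacement, the function name `match_by_assets` and the toy figures below are assumptions made for illustration.

```python
def match_by_assets(failed, healthy):
    """Pair each failed firm with the unused healthy firm closest in total assets.

    `failed` and `healthy` are lists of (firm_id, total_assets) tuples.
    Greedy matching without replacement is an assumption, not the paper's
    documented procedure.
    """
    available = list(healthy)
    pairs = []
    for firm_id, assets in failed:
        if not available:
            break  # no healthy candidates left to pair with
        best = min(available, key=lambda h: abs(h[1] - assets))
        available.remove(best)
        pairs.append((firm_id, best[0]))
    return pairs


# Hypothetical toy data: (firm id, total assets in thousand RUB)
failed_firms = [("F1", 1200), ("F2", 560)]
healthy_firms = [("H1", 500), ("H2", 1300), ("H3", 9000)]
pairs = match_by_assets(failed_firms, healthy_firms)
```

A greedy pass depends on the order of the failed firms; an optimal assignment would need a different algorithm, which this sketch does not attempt.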
The dependent variable is a dummy variable: 1 stands for defaulted firms, 0 for "healthy" ones.The independent variables are the financial ratios of the firms (calculated for the year preceding the financial failure for defaulted firms).
The most popular financial ratios used by researchers to create default prediction models are the following [36]:
• Turnover ratios;
• Profitability ratios;
• Liquidity ratios;
• Assets, equity or debt structure ratios, debt coverage ratios.
It turned out to be impossible to include debt coverage ratios, because the value of interest payments is not available for the majority of the Russian firms in the dataset. The final list of independent variables used is provided in Table 1 (source: prepared by the author).
The second dataset is the control group. It contains the same information, but for service firms from the developed European Union economies (152 defaulted and 152 "healthy" firms). The date of the start of insolvency proceedings was used to identify the year of the financial failure. The data was collected from the Amadeus database¹⁰.
I chose firms from the developed European Union countries as a control group, because the problems of shadow operations and business disaggregation are far less relevant for them. While the shadow market volume in the Emerging & Developing European countries is estimated to be around 27%, the same ratio for the European Union is two times lower (only about 14%)¹¹. The countries with the lowest shadow economy ratios are Austria, Luxembourg, Great Britain, the Netherlands, France, Ireland, Iceland, Germany, Denmark, Sweden, Slovakia, Finland, Spain and Norway¹².
Firms from these countries are used to form the control dataset.
As for business disaggregation, it seems that there are no statistics for the European Union, but one can still state that this problem is less relevant for the European market. Given that business disaggregation is a tool for reducing the tax burden, the attitude of the business community to tax rates can be a proxy for the level of disaggregation. According to the World Bank data¹³, 22.6% of Russian firms consider tax rates the biggest obstacle for their business. The same indicator is only 20.6% for Austria, 6.4% for Denmark, 5.7% for Luxembourg, 7.4% for the Netherlands, 13.6% for Ireland, 13.4% for Sweden, 17.7% for Slovakia and 9.5% for Finland. There is no data for the other European countries on the list, but presumably they are less concerned with the business disaggregation problem, being at a higher "development level." GDP per capita is used as a proxy for the countries' "development level." GDP per capita in the rest of the countries, for which there is no data on the attitude to taxes, is much higher than in Russia¹⁴.
The variables' descriptive statistics for the two datasets are provided in Table 2. One may notice that Russian financial reporting data has some specificities, e.g. extremely low profitability ratios or extremely high collection and credit periods for defaulted firms. These specificities may also be an indicator of biased financial reporting. A decision was made not to treat the firms with extreme values as outliers, because these extreme values are taken from real financial reporting (the reporting for these firms was checked manually).

Machine Learning algorithms
I used three Machine Learning algorithms to train the models: Logistic Regression, K-Nearest Neighbors (KNN) and Random Forest. Logistic Regression is a linear classification algorithm that is often used for default prediction [5][6][7][8]. One of its advantages is the ability to interpret the contribution of every independent variable to the prediction. KNN was chosen as probably the simplest machine learning algorithm that is frequently used in studies related to default prediction [37]. The Random Forest classifier was chosen as one of the most powerful algorithms used for default prediction and scoring, as shown in previous studies [29; 30].
Logistic Regression is an algorithm similar to ordinary linear regression. The difference is that the predicted dependent variable can vary only from 0 to 1, while in ordinary linear regression it can assume any value. Predictions are made with the logistic function (logistic curve):
$$p(x) = \frac{1}{1 + e^{-(B_0 + B_1 x_1 + \dots + B_n x_n)}},$$
where $p(x)$ in the case of this study is the estimated probability of default and $B_0, \dots, B_n$ are the linear coefficients for the independent variables $x_1, \dots, x_n$ (financial ratios). To transform the regression into a classification algorithm, I set a cutoff probability value (50% in this case). Observations are classified into the default group if the estimated probability is higher than 50%.
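Logistic Regression is fitted using the maximum likelihood method. The optimal coefficients are chosen to maximize the likelihood function:
$$L(B_0, \dots, B_n) = \prod_{i:\, y_i = 1} p\bigl(x^{(i)}\bigr) \prod_{j:\, y_j = 0} \Bigl(1 - p\bigl(x^{(j)}\bigr)\Bigr),$$
where $p(x^{(i)})$ is the estimated default probability of firm $i$ and $y_i$ is its class. This is the product of the default probabilities estimated for defaulted firms, multiplied by the product of the non-default probabilities estimated for non-defaulted firms [38].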
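The logistic function, the 50% cutoff and the likelihood can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: the function names, coefficients and ratio values are hypothetical, and the log of the likelihood is used because it has the same maximizer and is numerically more convenient.

```python
import math


def default_probability(coefs, intercept, ratios):
    """Logistic function: 1 / (1 + exp(-(B0 + B1*x1 + ... + Bn*xn)))."""
    z = intercept + sum(b * x for b, x in zip(coefs, ratios))
    return 1.0 / (1.0 + math.exp(-z))


def classify(probability, cutoff=0.5):
    """Assign class 1 (default) when the probability exceeds the cutoff."""
    return 1 if probability > cutoff else 0


def log_likelihood(coefs, intercept, X, y):
    """Sum of log p for defaulted firms and log (1 - p) for healthy ones."""
    total = 0.0
    for ratios, label in zip(X, y):
        p = default_probability(coefs, intercept, ratios)
        total += math.log(p) if label == 1 else math.log(1.0 - p)
    return total


# Hypothetical coefficients and two firms with two financial ratios each
coefs, intercept = [-2.0, 1.5], 0.0
X = [[0.5, 1.0], [1.0, 0.2]]
y = [1, 0]
predicted = [classify(default_probability(coefs, intercept, row)) for row in X]
fit_quality = log_likelihood(coefs, intercept, X, y)
```

Fitting would search for the coefficients that maximize `log_likelihood`; the sketch only evaluates it for given values and does not guard against overflow for extreme inputs.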
The L1 type of regularization is used to limit the number of variables. It means that the sum of the absolute values of the coefficients is added as a penalty to the minimized function.
The K-Nearest Neighbors classifier is one of the simplest classification algorithms. The classification is based on the classes of the several (k) most similar firms from the training set, and the observation is classified by majority vote. The classification procedure consists of three steps:
• Choosing the number of "neighbors".
The number of "neighbors" should be neither very small (which may lead to low accuracy) nor very large (most of the observations in the test set will then be assigned to the class that has more representatives in the training set). I used the square root of the number of observations as k, following the approach recognized by researchers [39].
• Assessing distances between training and test data and identifying the "neighbors".
I use Euclidean distance to choose the nearest "neighbors", calculated as follows:
$$d = \sqrt{\sum_{i=1}^{n} \left(x_i^{\text{test}} - x_i^{\text{train}}\right)^2},$$
where $x_i^{\text{test}}$ is the value of variable $i$ for the observation in the test set and $x_i^{\text{train}}$ is the value of variable $i$ for the observation in the training set.
• Classifying the test observation on a majority vote basis, in other words, assigning a class based on the most popular class among the "neighbors"¹⁵.
Because Euclidean distance is used, the data needs to be normalized before modelling.
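The three KNN steps can be sketched as follows. Several details are assumptions, since the paper does not specify them: min-max scaling as the normalization, k taken from the size of the training set, and ties in the vote going to the class of the nearest tied neighbor. The data is a hypothetical toy set.

```python
import math
from collections import Counter


def min_max_normalize(train, test):
    """Scale every variable to [0, 1] using the training-set minimum and maximum."""
    n_vars = len(train[0])
    lows = [min(row[j] for row in train) for j in range(n_vars)]
    highs = [max(row[j] for row in train) for j in range(n_vars)]

    def scale(row):
        return [
            (row[j] - lows[j]) / (highs[j] - lows[j]) if highs[j] > lows[j] else 0.0
            for j in range(n_vars)
        ]

    return [scale(r) for r in train], [scale(r) for r in test]


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_predict(train_X, train_y, test_row, k=None):
    # Step 1: choose k (square root of the number of training observations)
    if k is None:
        k = max(1, round(math.sqrt(len(train_X))))
    # Step 2: find the k nearest training observations
    nearest = sorted(zip(train_X, train_y), key=lambda p: euclidean(p[0], test_row))[:k]
    # Step 3: majority vote among the neighbors
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical toy data: two ratios per firm, 1 = defaulted, 0 = "healthy"
train_X = [[1, 10], [2, 12], [1.5, 11], [2, 9], [8, 40],
           [9, 42], [8.5, 38], [9.5, 45], [7, 41]]
train_y = [0, 0, 0, 0, 1, 1, 1, 1, 1]
test_X = [[1.8, 11], [8.8, 41]]
train_scaled, test_scaled = min_max_normalize(train_X, test_X)
predictions = [knn_predict(train_scaled, train_y, row) for row in test_scaled]
```

Scaling the test set with the training minimum and maximum keeps information from the test observations out of the fitted model.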
The Random Forest classifier is an ensemble Machine Learning algorithm: an ensemble of Classification and Regression Trees (CART). An illustration of a simple CART is shown in Figure 7.
While the tree is trained, the training data is split into 2 subsamples at every node. The split is made based on a particular variable's value. The Gini index is used to choose the variables (Variable 1, Variable 2 in Figure 7) and the thresholds for splitting (T1, T2 in Figure 7); the core idea is to minimize this index. The Gini index reflects the inverse accuracy of splitting; in its definition, L and R refer to subsample 1 and subsample 2 (left and right) and i refers to the class (1 for defaulted, 0 for "healthy") [40].
"Forest" stands for a combination of simple decision trees, and "Random" for the fact that each tree is trained on a randomly chosen subsample of the training sample and that the "splitting" variables are chosen randomly. The subsamples are formed using bootstrap, that is, repeated samples are drawn with replacement from the initial training sample. For every tree, the variables (Variable 1 and Variable 2 in Figure 7) are chosen from a random list of k variables, taken from the whole list of determinants. Thanks to this, the trees are not similar to each other¹⁶.
It is necessary to limit the number of trees and internal nodes in every tree. It was decided to train 100 trees for each training set and to set the maximum number of split layers at 2.
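The split selection inside a single tree can be sketched as below. The paper's own Gini formula did not survive in the text, so the size-weighted Gini impurity used here (the common CART form) is an assumption, as are the function names and the toy data.

```python
def gini_impurity(labels):
    """Gini impurity of one subsample: sum over classes i of p_i * (1 - p_i)."""
    n = len(labels)
    if n == 0:
        return 0.0
    p_default = sum(labels) / n  # share of class 1 (defaulted)
    return 2.0 * p_default * (1.0 - p_default)  # two-class case


def split_gini(values, labels, threshold):
    """Size-weighted impurity of the left (L) and right (R) subsamples."""
    left = [y for x, y in zip(values, labels) if x <= threshold]
    right = [y for x, y in zip(values, labels) if x > threshold]
    n = len(labels)
    return len(left) / n * gini_impurity(left) + len(right) / n * gini_impurity(right)


def best_threshold(values, labels):
    """Try midpoints between sorted distinct values and keep the lowest index."""
    distinct = sorted(set(values))
    midpoints = [(a + b) / 2 for a, b in zip(distinct, distinct[1:])]
    return min(midpoints, key=lambda t: split_gini(values, labels, t))


# Hypothetical toy variable (e.g. a debt ratio in %) and default labels
ratio = [10, 20, 30, 70, 80, 90]
defaulted = [0, 0, 0, 1, 1, 1]
threshold = best_threshold(ratio, defaulted)
```

A forest repeats this search on bootstrap samples with random subsets of variables. If scikit-learn were used (the paper does not name its software), the stated settings would correspond to `RandomForestClassifier(n_estimators=100, max_depth=2, criterion="gini")`.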

Data preparation and modelling
The datasets contained some missing values. To handle them, I replaced the missing values with the mean values of the corresponding variables. Table 3 shows the fractions of missing values for every variable in the two datasets. There are some differences, but it seems that the quality of the collected data is similar for Russian and European firms.
The main hypothesis is that the mean accuracy for Russian service firms is lower than for European service firms. This hypothesis was tested using the Mann-Whitney test.
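Mean imputation can be sketched as follows, assuming missing values are represented as `None` and every variable has at least one observed value; the function name and the toy rows are hypothetical.

```python
def impute_with_means(rows):
    """Replace None in each column with the mean of that column's observed values."""
    n_vars = len(rows[0])
    means = []
    for j in range(n_vars):
        observed = [row[j] for row in rows if row[j] is not None]
        means.append(sum(observed) / len(observed))  # assumes a non-empty column
    return [
        [means[j] if row[j] is None else row[j] for j in range(n_vars)]
        for row in rows
    ]


# Hypothetical toy dataset: two financial ratios per firm, with gaps
raw = [[1.0, None], [3.0, 4.0], [None, 8.0]]
completed = impute_with_means(raw)
```

When the data is later split into training and test sets, computing the means on the training part only would keep test information out of the model; the paper does not say which variant was used.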

Results
The results demonstrate that prediction accuracy is much lower for Russian firms. The results for the three classification algorithms are provided in Figure 8.
KNN algorithm accuracy is lower in both cases: 54.8% for Russian firms and 71.7% for European firms. Classification accuracy can be considered insufficient for European firms, but it is still significantly higher than the mean accuracy for Russian firms. Figure 10 shows the distribution of KNN algorithm accuracy, calculated on randomly formed test sets for Russian and European service firms. The accuracy distribution is normal in the case of Russian firms, but not in the case of European firms (Shapiro-Wilk test p-values are 0.389 and 0.008 respectively), which is why the Mann-Whitney test was used for estimating the significance of the difference in mean accuracies (Figure 10).
It can also be useful to consider Type I and II errors along with overall accuracy. Table 4 provides the means of Type I and Type II errors for the Russian and European datasets according to the algorithm used. The outcomes are consistent with the overall accuracy analysis: both Type I and Type II errors are larger for Russian service firms than for European service firms.
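The quantities reported alongside accuracy can be computed from the four cells of a confusion matrix, as in the sketch below. The paper does not state which misclassification it calls Type I and which Type II, so the sketch reports the two error rates under neutral names; the labels and predictions are hypothetical.

```python
def classification_rates(y_true, y_pred):
    """Accuracy, sensitivity, specificity and the two error rates (1 = default)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn),       # defaulted firms correctly flagged
        "specificity": tn / (tn + fp),       # healthy firms correctly passed
        "missed_defaults": fn / (tp + fn),   # 1 - sensitivity
        "false_alarms": fp / (tn + fp),      # 1 - specificity
    }


# Hypothetical test-set labels and predictions
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 1, 1]
rates = classification_rates(y_true, y_pred)
```

Which of `missed_defaults` and `false_alarms` maps to Type I depends on the convention adopted, so Table 4 should be read with its own definition in mind.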

Conclusions
Given these results, we can state that default prediction based on financial data is less effective for Russian service firms than for service firms from developed European markets. The accuracy for Russian firms is 55-73%, depending on the algorithm, compared to 72-81% for firms from developed European markets.
The results for the European dataset in terms of overall accuracy are consistent with the results of previous research [23], while the results for the Russian dataset are far behind.
Thus, for Russian firms one should expect a higher probability of error when predicting default based on financial indicators. In other words, the results suggest that financial ratios are worse indicators of future financial failure for Russian firms than for firms from developed markets.
The financial reporting of Russian legal entities may fail to reflect the real condition of firms for the two possible reasons discussed in this paper: business disaggregation and undisclosed operations. Thus, it may be beneficial to use non-financial factors, which can act as proxies for financial ratios, to improve the accuracy of classification. This can be a starting point for further research related to default prediction in Russia.
Moreover, I believe that the findings of this paper can be generalized in the sense that the conventional approach to default prediction may be inapplicable not only to Russian service firms, but also to firms in other developing economies that face the problem of biased financial reporting.

Figure 1. Bankruptcies in the Russian service sector, 2015-2020 (number of cases)

Figure 3. Share of debt paid out in case of default in TOP-10 industries by number of default cases, 2019 (% of total debt)

Figure 7. An example of a simple decision tree (CART). Source: Prepared by the author.
I divided each of the samples (Russian and European firms) into training and test sets. Subsequently, I trained the classification algorithms on the training sets, then applied the trained algorithms to the test sets and calculated the prediction accuracy. To make sure that the result is not an outlier caused by a specific train-test split, I made 100 random train-test splits for every dataset, trained the algorithms on every training set and calculated the accuracy on every corresponding test set.
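The repeated-split procedure can be sketched as below. The share of observations held out for testing is not given in the paper, so `test_share=0.3` is an assumption, and `midpoint_rule` is a hypothetical stand-in classifier used only to make the sketch runnable.

```python
import random
import statistics


def repeated_split_accuracy(X, y, fit_predict, n_splits=100, test_share=0.3, seed=0):
    """Accuracy on the test part of each of n_splits random train-test splits."""
    rng = random.Random(seed)
    indices = list(range(len(X)))
    n_test = max(1, round(test_share * len(X)))
    accuracies = []
    for _ in range(n_splits):
        rng.shuffle(indices)
        test_idx, train_idx = indices[:n_test], indices[n_test:]
        predictions = fit_predict(
            [X[i] for i in train_idx],
            [y[i] for i in train_idx],
            [X[i] for i in test_idx],
        )
        hits = sum(1 for i, p in zip(test_idx, predictions) if y[i] == p)
        accuracies.append(hits / n_test)
    return accuracies


def midpoint_rule(train_X, train_y, test_X):
    """Stand-in classifier: predict default when the first variable exceeds
    the midpoint of the two class means (assumes defaults have higher values)."""
    mean_1 = statistics.mean(x[0] for x, label in zip(train_X, train_y) if label == 1)
    mean_0 = statistics.mean(x[0] for x, label in zip(train_X, train_y) if label == 0)
    cutoff = (mean_0 + mean_1) / 2
    return [1 if x[0] > cutoff else 0 for x in test_X]


# Hypothetical toy data: one variable, 10 healthy and 10 defaulted firms
X = [[float(v)] for v in range(20)]
y = [0] * 10 + [1] * 10
accuracies = repeated_split_accuracy(X, y, midpoint_rule)
mean_accuracy = statistics.mean(accuracies)
```

The list of 100 accuracies per dataset is what the normality and Mann-Whitney tests in the Results section operate on.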

Figure 8. Classification results for Logit, KNN and Random Forest algorithms

Table 1. List of independent variables

Table 2. Descriptive statistics of variables for the two datasets

Table 3. Fractions of missing values in the datasets (%)
Firstly, I applied Logistic Regression to the datasets. The mean accuracy of classification is 64.4% for Russian service firms and 80.7% for the firms from the European dataset. Figure 9 shows the distribution of Logit algorithm accuracy calculated on the randomly formed test sets for Russian and European service firms. The distribution is visually close to normal for both Russian and European data, but the Shapiro-Wilk normality test suggests that the accuracies for European firms are not distributed normally (the p-values for the Russian and European sets are 0.386 and 0.04 respectively). Therefore, the Mann-Whitney non-parametric test was used instead of the conventional Student's t-test to test whether the mean accuracies differ. The Mann-Whitney test p-value is close to zero ($1.35 \times 10^{-33}$), which means that there is a very low probability of getting such a test statistic if the mean accuracy is the same for Russian and European firms.
For the KNN algorithm, the p-value of the Mann-Whitney test is close to zero ($4.40 \times 10^{-28}$), which means that there is a very low probability of getting such a value if the mean accuracy is the same for Russian and European firms.
The Random Forest algorithm turned out to be the most accurate classifier for both Russian and European firms (Figure 11). The mean accuracy of classification is 72.7% and 80.6% for Russian and European service firms, respectively. Figure 11 shows the distribution of Random Forest algorithm accuracy, calculated on randomly formed test sets for Russian and European service firms. The Shapiro-Wilk test results suggest that the accuracy distribution is not normal for Russian firms (p-values are 0.019 for Russian firms and 0.18 for European firms), hence I used the Mann-Whitney test to assess the significance of the difference in mean accuracies. The p-value of the test is close to zero ($6.14 \times 10^{-23}$), which means that there is a very low probability of getting such a value if the mean accuracy is the same for Russian and European firms.
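The Mann-Whitney comparison of two accuracy samples can be sketched with the standard library. This is a simplified version under stated assumptions: it uses the normal approximation for the two-sided p-value and omits the tie and continuity corrections, so its p-values will differ somewhat from those of a statistical package. The accuracy values are hypothetical, not the paper's results.

```python
import math


def mann_whitney_u(sample_a, sample_b):
    """Return (U statistic of sample_a, two-sided p-value, normal approximation)."""
    pooled = sorted([(v, 0) for v in sample_a] + [(v, 1) for v in sample_b])
    ranks = [0.0] * len(pooled)
    i = 0
    while i < len(pooled):
        j = i
        while j + 1 < len(pooled) and pooled[j + 1][0] == pooled[i][0]:
            j += 1
        midrank = (i + j) / 2 + 1  # average of 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[k] = midrank
        i = j + 1
    n_a, n_b = len(sample_a), len(sample_b)
    rank_sum_a = sum(r for r, (_, group) in zip(ranks, pooled) if group == 0)
    u_a = rank_sum_a - n_a * (n_a + 1) / 2
    mean_u = n_a * n_b / 2
    sd_u = math.sqrt(n_a * n_b * (n_a + n_b + 1) / 12)
    z = (u_a - mean_u) / sd_u
    p_value = math.erfc(abs(z) / math.sqrt(2))  # 2 * (1 - Phi(|z|))
    return u_a, p_value


# Hypothetical accuracies from repeated splits (not the paper's data)
acc_russia = [0.62, 0.64, 0.66, 0.63, 0.65]
acc_europe = [0.79, 0.81, 0.80, 0.82, 0.78]
u_stat, p_value = mann_whitney_u(acc_russia, acc_europe)
```

In practice a library routine such as `scipy.stats.mannwhitneyu`, with `scipy.stats.shapiro` for the normality check, would be the usual choice; the paper does not state which software produced its p-values.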

Table 4. Sensitivity, specificity, and Type I & II errors of classification (%). Source: Prepared by the author.