Approaches to Building Default Probability Models for Financial Instruments of Project Financing at Long Time Horizons

Project financing is one of the priority tools for stimulating economic growth around the world. It allows large-scale, capital-intensive projects to be implemented by providing favorable credit conditions even when the creditworthiness of the project beneficiaries is insufficient. As a rule, project financing instruments are long-term (10–30 years, depending on the type of transaction), which makes this asset class a natural subject for building long-term credit risk models, a task that arose with the introduction in 2018 of the new international financial reporting standard IFRS 9 “Financial Instruments”. The new standard requires financial institutions to calculate expected credit loss (ECL) from the moment loans and other banking products exposed to credit risk are granted, taking into account different time horizons, which significantly changes commercial banks' traditional approaches to credit risk assessment. In this work, a model was built to assess the long-term probability of default for the project finance portfolio of a Russian commercial bank in accordance with the requirements of IFRS 9. At present, the topic of this work is extremely relevant and may be of interest both for commercial banks that are faced with the problem of improving credit risk assessment models


Introduction
Project financing (PF) is a method of long-term borrowing for large projects by means of "financial engineering", in which the loan is repaid solely from the cash flow generated by the project, without recourse to the borrower. The fundamental feature of project financing is that a special-purpose project company (SPV, SPE) is established to implement a given project. It attracts resources (not only funds), implements the project and settles accounts with creditors and project investors using the cash flows generated by the project itself [1]. For decades, project financing has been the preferred way of financing large-scale infrastructure projects all over the world. A series of studies has emphasized its importance, especially for countries with emerging market economies, and has highlighted the correlation between investment in infrastructure and economic growth.
A long tenor is a distinctive feature of project financing transactions: implementation of some projects takes 30 years. Large-scale capital-intensive projects usually require significant initial investments and generate revenue sufficient to cover expenses only in the long term. Accordingly, some authors point out that loans aimed at financing projects have, on average, a longer maturity than other syndicated loans [5][6][7]. This feature makes the asset class an interesting research object for the task of constructing long-term credit risk models that meet the requirements of IFRS 9.
Over the last 15-20 years, investors' interest in project financing transactions has been growing across the globe. This is primarily due to the financing of infrastructure construction under public-private partnerships (PPP): from 1999 through 2019 the volume and number of completed transactions more than doubled (Figure 1). Projects are increasingly financed by issuing bonds backed by cash flows from PPP-based infrastructure projects. According to international practice, this instrument defaults less frequently, even in periods of economic turbulence [8].
As of September 2020, over 3,000 PPP projects had been implemented in Russia. Their total value exceeds 4.5 trillion roubles, of which private investment amounts to 3.1 trillion roubles (69%). The total value of PPP projects amounts to 44% of the infrastructure expenses planned for the implementation of national projects in 2019 [8].
Use of external data on defaults
This method implies a PD assessment based on rating migration information provided by external rating agencies (S&P, Moody's, Fitch Ratings, ACRA).
If the Bank has no statistics to build a migration matrix from internal data, a migration matrix built on external data is used. Depending on the purpose of modelling, statistics of one or several rating agencies may be applied.
If the external matrices contain inversions, the matrix is adjusted (by expert judgement or by mathematical methods that reduce the function to a monotone one).
PD assessment on the basis of migration matrices
The migration matrix is a square matrix whose elements contain the probabilities of change (transition probabilities) of the rating category of a Borrower:
$$M = \begin{pmatrix} p_{11} & \cdots & p_{1n} \\ \vdots & \ddots & \vdots \\ p_{n1} & \cdots & p_{nn} \end{pmatrix}, \qquad \sum_{j=1}^{n} p_{ij} = 1,$$
where $p_{ij}$ is the probability of transition to rating category $j$ over a certain period of time, given that the borrower belongs to rating category $i$ at the start of that period.
The Bank uses the rating scale of internal credit ratings to build the migration matrix (Appendix 1).
The Bank does not set upper or lower bounds on the values of the probability of default. Under IFRS 9, the assessment of the probability of default must be unbiased. Consequently, the conservatism concept built into the probability of default models under the Basel II IRB approach [10] cannot be used to calculate PD in accordance with IFRS 9, and when the IRB PD models are upgraded to meet the requirements of IFRS 9 such material adjustments are excluded (the adjustment "PD not less than 0.03%" established in accordance with 483-P is also excluded) [4]. The only exception is the RF rating adjustment (the borrower's rating cannot be higher than the RF rating), which is preserved.
Depending on data availability, either the initial or enlarged (grouped) rating categories may be used when constructing a migration matrix (for example, ratings 7-, 7 and 7+ may be combined into one category).
Transition probabilities are estimated by cohort analysis:
$$p_{ij}(t) = \frac{N_{ij}(t)}{N_i(t-1)},$$
where $N_{ij}(t)$ is the number of migrations from state $i$ to state $j$ in period $t$, and $N_i(t-1)$ is the number of transactions in state $i$ in period $t-1$.
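The cohort estimate can be illustrated with a minimal Python sketch. It assumes each observation is a pair (rating at the start of the period, rating at the end of the period) and takes each transition probability as the number of migrations from i to j divided by the number of transactions in state i at the start. The function name, the rating labels "A", "B", "D" and the sample observations are hypothetical and do not come from the Bank's data.

```python
from collections import Counter

def cohort_transition_matrix(pairs, states, default_state):
    """Estimate p_ij = N_ij(t) / N_i(t-1) from (start, end) rating pairs."""
    migrations = Counter(pairs)                        # N_ij(t)
    at_start = Counter(start for start, _ in pairs)    # N_i(t-1)
    matrix = []
    for i in states:
        if i == default_state:
            # Default is treated as absorbing, so its row is a unit vector.
            matrix.append([1.0 if j == default_state else 0.0 for j in states])
        elif at_start[i] == 0:
            # No observations for this rating: the row cannot be estimated.
            matrix.append([0.0 for _ in states])
        else:
            matrix.append([migrations[(i, j)] / at_start[i] for j in states])
    return matrix

# Hypothetical observations: (rating at start of period, rating at end of period)
observations = [("A", "A"), ("A", "A"), ("A", "B"), ("A", "D"),
                ("B", "A"), ("B", "B"), ("B", "B"), ("B", "D")]
states = ["A", "B", "D"]
matrix = cohort_transition_matrix(observations, states, default_state="D")
```

Each estimated row sums to one, and the last column of `matrix` holds the one-period default frequency for each starting rating.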
The probability of default at a 1-year horizon.
A one-year migration matrix $M_0$ is constructed from observation statistics for 12 calendar months. Shorter periods may be used in order to take the most relevant information into consideration.
An average one-year migration matrix is calculated as the arithmetic mean of the one-year migration matrices obtained every quarter (month).
The one-year probability of default ($PD_t$) for each rating category is defined as the probability of transition into the state "10-default". In the one-year transition matrix, $PD_t$ appears in the last column.
If the statistical frequency of defaults does not correspond to the probability of default of each rating grade in the Bank's master scale, scaling is performed.
The adjustments performed are recorded in the Report on the Model Development.
When assessing PD on the basis of migration matrices, the following main assumptions are made:
• further transitions between rating grades depend only on the current rating and not on previous ratings (the Markov property);
• migration probabilities do not depend on the point in time, i.e. the transition rates are constant over time (homogeneity) [9].
The probability of default over the lifetime of a financial instrument is calculated from the formula
$$M_T = M_0^{T},$$
where $T$ is the lifetime of the financial instrument.
The column of the multiyear matrix that shows the probability of transition into default is the cumulative probability of default for the corresponding period (cPD). Using the migration matrix makes it possible to take complete information on rating migration into account when calculating the lifetime probability of default.
Profiles of cumulative PDs are built by estimating the parameters of the distribution of cumulative default rates (DR).
On the basis of the Weibull distribution: the parameters $k$ and $\lambda$ of the Weibull distribution are estimated using a linear regression of the double logarithm of the survival function. The survival function is defined by the following formula:
$$S(t) = 1 - F(t; k, \lambda) = \exp\left(-\left(\frac{t}{\lambda}\right)^{k}\right),$$
where $F(t; k, \lambda)$ is the two-parameter Weibull distribution function, so that $\ln(-\ln S(t)) = k \ln t - k \ln \lambda$ is linear in $\ln t$. Here $k > 0$ defines the shape of the distribution function: $k < 1$ indicates a default rate that decreases over time, $k = 1$ a default rate that is constant over time, and $k > 1$ a default rate that increases over time; $\lambda > 0$ is a scale parameter that governs survival time [11].
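The double-logarithm regression can be sketched as follows. The sketch assumes the standard two-parameter Weibull survival function S(t) = exp(-(t/λ)^k) and takes S(t) as one minus the cumulative default rate, so that ln(-ln S(t)) = k ln t - k ln λ. The function name and the synthetic input data (generated from k = 1.2, λ = 20) are illustrative assumptions, not figures from the study.

```python
import math

def fit_weibull(times, cum_default_rates):
    """Estimate (k, lam) by OLS of ln(-ln S(t)) on ln t, with S(t) = 1 - cDR(t)."""
    xs = [math.log(t) for t in times]
    ys = [math.log(-math.log(1.0 - c)) for c in cum_default_rates]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    k = slope                            # slope of the regression is the shape k
    lam = math.exp(-intercept / k)       # intercept equals -k * ln(lam)
    return k, lam

# Synthetic cumulative default rates generated from known parameters
true_k, true_lam = 1.2, 20.0
times = [1, 2, 3, 4, 5]
cdr = [1.0 - math.exp(-((t / true_lam) ** true_k)) for t in times]
k, lam = fit_weibull(times, cdr)
```

On noise-free synthetic data the regression recovers the generating parameters; on empirical default rates the fit is approximate and its quality should be checked.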
On the basis of the modified Weibull distribution: the cumulative PD is modeled by selecting the distribution parameters that most accurately describe the behaviour of the cumulative default rates.
A two-parameter modified Weibull distribution function is as follows: where and 0 α β < are parameters of the modified Weibull distribution; ( ) cDR t, , α β is the cumulative default rate per year [11].
Construction of the model for the project financing segment
The target segment of this model comprises customers that fall within the scope of the Project Financing models in accordance with the bank's methodology.
In developing the present model, the default definition stated in the section Terms and Definitions was used. Assignment of the "10-default" rating to a borrower was treated as the default indicator. In view of the above, for the Project Financing and Project Financing (Developers) segments we considered an approach to modeling lifetime PD (Lt PD) on the basis of migration matrices constructed from consolidated internal data on rating changes for both segments.
After rating groups were combined, the number of observations in a rating group ranged from 292 to 890 (the largest number of observations was in the "default" rating group and the smallest in the "good" rating groups). Since segments whose borrowers carry above-average credit risk need a sufficient number of observations in "bad" ratings for rating migration analysis, we concluded that modeling Lt PD by constructing the rating migration matrix from internal data is applicable.
The approach to obtaining multiyear PD using rating migration matrices
Computation of an average one-year migration matrix
The migration matrix shows the probability that a borrower with a certain rating at the beginning of the year will have:
• the same rating (shown on the principal diagonal);
• a rating with a higher probability of default (shown above the principal diagonal);
• a rating with a lower probability of default (shown below the principal diagonal);
• the default rating (shown in column D, default).
When a one-year rating migration matrix is computed from data covering more than one year, the migration probabilities of the final one-year matrix are obtained by averaging the migration probabilities of several matrices. Averaging is performed by calculating the arithmetic mean.
The one-year migration matrix with averaged migration probabilities is used as the basic matrix for computing multiyear matrices.
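The averaging step can be sketched in a few lines of Python. The element-wise arithmetic mean follows the description above; the function name and the two small matrices are hypothetical.

```python
def average_matrices(matrices):
    """Element-wise arithmetic mean of equally sized square matrices."""
    n = len(matrices)
    size = len(matrices[0])
    return [[sum(m[i][j] for m in matrices) / n for j in range(size)]
            for i in range(size)]

# Two hypothetical one-year matrices (states: performing, default)
q1 = [[0.90, 0.10], [0.0, 1.0]]
q2 = [[0.80, 0.20], [0.0, 1.0]]
base = average_matrices([q1, q2])
```

Because every input row sums to one, every row of the averaged matrix also sums to one, so no renormalization is needed at this step.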
When empirical default rates deviate from the model ones (PD of the master scale), the PD of the basic migration matrix (the last column, i.e. the average one-year default rate, DR) is adjusted to align it with the PD of the Bank's master scale.
PD adjustment of the basic migration matrix is also necessary in case of inversions (the PD of "bad" ratings is lower than the PD of "good" ratings).
The adjustment may be performed either by calculating a coefficient by which the observed DR is multiplied or divided, or by substituting the PD of the corresponding master scale grade into the last column of the basic migration matrix. If the basic migration matrix is constructed on rating groups, the last column takes the master scale PD averaged over the ratings in the group, weighted by the number of observations.
After the values of the last column are adjusted, the remaining transition probabilities of the basic matrix are adjusted so that the transition probabilities in each row sum to 100% (by proportionally changing the rating transition probabilities in each row).

Adjustment example
Sum of transition probabilities (except for default) before adjustment = 100%; sum of transition probabilities (except for default) after adjustment = 98.71%.
Other probabilities in the row are adjusted in the same way.
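A minimal Python sketch of the proportional adjustment is given below. It assumes the last column is replaced by a target PD and the remaining probabilities of the row are rescaled by a common factor so that the row again sums to 100%. The function name and the example row are hypothetical; the target PD of 1.29% is chosen only so that the non-default probabilities sum to 98.71% after adjustment, as in the example.

```python
def replace_default_column(matrix, target_pd):
    """Set the last column to target_pd and rescale the rest of each row
    proportionally so that every row sums to 1."""
    adjusted = []
    for row, pd_new in zip(matrix, target_pd):
        non_default = row[:-1]
        total = sum(non_default)
        # Common scaling factor for all non-default transitions of the row
        scale = (1.0 - pd_new) / total if total > 0 else 0.0
        adjusted.append([p * scale for p in non_default] + [pd_new])
    return adjusted

# Hypothetical row with no observed defaults, followed by the absorbing default row
matrix = [[0.70, 0.30, 0.0],
          [0.0, 0.0, 1.0]]
adjusted = replace_default_column(matrix, target_pd=[0.0129, 1.0])
```

In this illustration each non-default probability of the first row is multiplied by 0.9871, which preserves the relative proportions between the non-default transitions.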

Calculation of cumulative PD estimates
Estimates of cumulative PD are obtained from migration matrices by raising the one-year migration matrix to the corresponding power. For example, to obtain the cumulative PD for N years, the matrix is raised to the power N in accordance with Formula (1):
$$M_N = M_0^{N}, \qquad (1)$$
where $M_N$ is the migration matrix for N years and $M_0$ is the one-year migration matrix.
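The matrix-power calculation can be sketched in plain Python. The sketch assumes that default is the last state and is absorbing, so the last column of the N-year matrix is the cumulative PD. The function names and the three-state matrix are hypothetical.

```python
def mat_mul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def cumulative_pd(m0, horizon):
    """Return {year: [cPD per starting rating]}, read from the last column of M0^year."""
    profile, power = {}, m0
    for year in range(1, horizon + 1):
        profile[year] = [row[-1] for row in power]
        power = mat_mul(power, m0)
    return profile

# Hypothetical one-year matrix: two rating groups plus an absorbing default state
m0 = [[0.90, 0.08, 0.02],
      [0.10, 0.80, 0.10],
      [0.00, 0.00, 1.00]]
profile = cumulative_pd(m0, horizon=5)
```

With an absorbing default state the cumulative PD of every rating is non-decreasing in the horizon, which gives a simple sanity check on the result.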

Adjustment of Probabilities of Transition of Ratings of the Basic Migration Matrix
The probability of a rating transition should decrease monotonically when moving from the principal diagonal towards the outermost columns of the migration matrix. This means that the probability of a transition to a neighboring rating group is higher than the probability of a transition that "skips" 2 or 3 ratings.
Transition probabilities are adjusted using mathematical methods (for example, approximation of a non-monotonic data series by a monotone function).
In practice, the parameters of the function used for adjustment (exponential, logarithmic) may be selected in Excel (the Solver add-in, adding a trendline).
After the cumulative probability of default is obtained for the consolidated rating groups, it is converted into the conditional probability of default, and the conditional PD for each rating inside the rating groups is calculated by logarithmic interpolation.
On the basis of the conditional PD obtained at the previous stage, the final marginal PD is calculated (excluding forward-looking information).
The probability of default profiles are converted into one another using the following formulae.
The cumulative PD is determined as:
$$cPD_t = 1 - \prod_{s=1}^{t} \left(1 - PD_s\right),$$
where $PD_s$ is the conditional probability of default in year $s$ (default in year $s$ given survival to the start of that year).
The marginal PD is determined as:
$$mPD_t = PD_t \prod_{s=1}^{t-1} \left(1 - PD_s\right) = cPD_t - cPD_{t-1}.$$
Because rating PD changes non-linearly along the rating scale, linear interpolation is not recommended. The interpolation approach described below takes the non-linear character of PD into account.
Interpolation consists of several main stages.
Each rating is assigned a numerical value (Table 3). For each rating group, the weighted-average rating and the corresponding PD, expressed in numerical values, are calculated.
The weights are the number of observations in each rating (Table 4).
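The conditional PD of an individual rating is then obtained by logarithmic interpolation between neighboring rating groups:
$$\ln PD_{j,t} = \ln PD_{i,t} + \frac{b_j - a_i}{a_{i+1} - a_i}\left(\ln PD_{i+1,t} - \ln PD_{i,t}\right),$$
where $b_j$ is the numerical value of rating $j$, which lies between the numerical values of rating groups $i$ and $i+1$; $a_i$, $a_{i+1}$ are the numerical values of rating groups $i$ and $i+1$ respectively; $PD_{i,t}$, $PD_{i+1,t}$ are the conditional probabilities of default calculated from the migration matrices for period $t$ for rating groups $i$ and $i+1$ respectively (Table 5).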
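A minimal Python sketch of logarithmic interpolation is shown below. It assumes the interpolation is linear in ln PD between the numerical values of two neighboring rating groups, which is one common reading of "logarithmic interpolation". The function name and all numbers are hypothetical.

```python
import math

def log_interpolate_pd(b_j, a_i, a_next, pd_i, pd_next):
    """Interpolate PD for numerical rating b_j between groups at a_i and a_next,
    linearly in ln(PD)."""
    weight = (b_j - a_i) / (a_next - a_i)
    log_pd = math.log(pd_i) + weight * (math.log(pd_next) - math.log(pd_i))
    return math.exp(log_pd)

# Hypothetical rating halfway between two groups with PDs of 1% and 4%
pd_mid = log_interpolate_pd(b_j=15.0, a_i=14.0, a_next=16.0, pd_i=0.01, pd_next=0.04)
```

At the midpoint the result is the geometric mean of the two group PDs (about 2% here), whereas linear interpolation would give 2.5%.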
At the next stage an adjustment is performed: the PD of the first year is set equal to the PD of the master scale.
To take forward-looking macroeconomic information into consideration, the TTC rating PD estimates obtained from the model must be adjusted to the forecast default rate.
PIT calibration is performed on the basis of Bayes' formula, in which the one-year rating PD is scaled according to the forecast default rate and the CDT:
$$PD_i^{New} = \frac{PD_i \cdot \dfrac{DR^{New}}{CDT}}{PD_i \cdot \dfrac{DR^{New}}{CDT} + (1 - PD_i) \cdot \dfrac{1 - DR^{New}}{1 - CDT}},$$
where $PD_i^{New}$ is the PIT PD of rating $i$ corresponding to the new forecast default rate $DR^{New}$; $PD_i$ is the conditional PD of rating grade $i$; $DR^{New}$ is the forecast default rate; CDT is the average default rate over the economic cycle.
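The Bayes scaling can be sketched in Python as follows. The sketch assumes the commonly used form of the Bayesian calibration formula, in which the TTC PD is reweighted by the ratio of the forecast default rate to the CDT and the survival probability by the ratio of the complementary rates. The function name and the numeric inputs are hypothetical.

```python
def pit_pd(pd_ttc, dr_new, cdt):
    """Scale a TTC PD to a forecast default rate using Bayes' formula."""
    default_part = pd_ttc * dr_new / cdt
    survival_part = (1.0 - pd_ttc) * (1.0 - dr_new) / (1.0 - cdt)
    return default_part / (default_part + survival_part)

# Hypothetical inputs: 3% TTC PD, 4% long-run default rate, 6% forecast default rate
stressed = pit_pd(pd_ttc=0.03, dr_new=0.06, cdt=0.04)
unchanged = pit_pd(pd_ttc=0.03, dr_new=0.04, cdt=0.04)
```

Two properties follow from the formula: the PD is unchanged when the forecast default rate equals the CDT, and a forecast above the CDT raises the PD while keeping it strictly between 0 and 1.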
The data source for developing the model is the corporate data warehouse (the area that stores CRM data), which exposes a set of views presenting the data from different perspectives (loan portfolio, agreement information, etc.).
Data from the Project Financing and Project Financing (Developers) segments was used for the analysis. Because the number of observations for individual ratings was insufficient, ratings 1 to 9 were combined into rating groups in order to construct migration matrices.
When forming rating groups we took the following into consideration:
• rating groups comprise ratings that are close in terms of risk level;
• the number of observations in a rating group should be sufficient to model rating transition probabilities (Table 7).

Analysis of data in terms of assigning to a stage of the project
As part of model development we studied whether the default rate depends on the project stage. When the Bank assesses borrowers' projects assigned to the Project Financing and Project Financing (Developers) segments, the basic rating module includes the Project Stage factor. Since the wording of the factors changed when the models were redeveloped, we decided to combine the observations into three groups: 1) A (initial financing); 2) B (work performance); 3) C (completion) (Table 8).
The analysis showed that over 65% of observations pertain to stage C (completion) and 18% to stage B (work performance). In addition, of the 312 observations for 2015-2017, 84.3% of projects were at the completion (C) stage (Table 9). On the basis of these results we decided not to divide the initial sample by project stage and not to build individual models for the different stages, for the following reasons:
1) The major part of observations pertains to the project completion stage (65%).
2) The rating calculated by the one-year default probability model already takes the project stage into account.
3) A single sample makes it possible to develop a stable lifetime PD model.

Basic Prerequisites
When constructing the one-year migration matrix we adopted the following prerequisites:
• default is an absorbing state, i.e. exit from the default state is not considered;
• if several ratings were calculated on the basis of the same reports, the rating with the latest calculation date was used;
• within the period (one year) migrations into the state "no rating (no re-rating)" were eliminated, i.e. if a customer had a rating at the beginning of the period and no calculated rating was available at the end of the year, the customer was treated as remaining in the same rating. This prerequisite serves the modeling purpose: the event of "no re-rating" is not modeled; what is modeled is the change of rating while the borrower is in the Bank's portfolio;
• assignment of rating "10" to the borrower was considered an event of default (provided the rating was not 10 at the previous date).

Calculation of the basic migration matrix
One-year matrices were calculated as follows:
• the one-year probability that a borrower with a certain rating at the beginning of the year will have the same or a different rating one year later was calculated quarterly over a one-year interval (i.e. one-year matrices were calculated each quarter);
• data from 01.04.2009 to 01.07.2017 was analyzed (30 matrices in total);
• the matrix obtained by averaging the 30 matrices was taken as the basic one-year matrix;
• for the PF portfolio the data was combined into rating groups 345 (3, 4+, 4, 4-, 5+, 5, 5-), 6 (6+, 6, 6-), 7 (7+, 7, 7-) and 89 (8+, 8, 8- and 9). Ratings had to be consolidated into groups because of the insufficient number of observations in individual rating grades (Table 10).
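The count of 30 matrices can be checked with a short Python sketch. It assumes one reading of the procedure: cohorts start on the first day of each quarter, and a cohort is usable only if its one-year end date still falls within the data window. The function name is hypothetical.

```python
from datetime import date

def quarterly_starts(first, last):
    """First-of-quarter dates from `first` to `last` inclusive, stepping 3 months."""
    starts, year, month = [], first.year, first.month
    while date(year, month, 1) <= last:
        starts.append(date(year, month, 1))
        month += 3
        if month > 12:
            year, month = year + 1, month - 12
    return starts

data_start, data_end = date(2009, 4, 1), date(2017, 7, 1)
# Keep only cohorts whose one-year observation window ends inside the data
cohorts = [s for s in quarterly_starts(data_start, data_end)
           if date(s.year + 1, s.month, 1) <= data_end]
```

Under this reading the first cohort starts on 01.04.2009 and the last on 01.07.2016, which yields the 30 overlapping one-year windows stated above.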

Adjustment of the Last Column of the Basic Migration Matrix
Because the number of default observations was insufficient, we adjusted the probabilities of default on the basis of the weighted PD of the corresponding ratings in the Bank's master scale. The PDs are weighted by the number of observations in each rating (Tables 11 and 12).
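The weighting can be sketched in Python as below. The sketch takes the group PD as the average of the master scale PDs of the ratings in the group, weighted by the number of observations in each rating. The function name, the master scale PDs and the observation counts are hypothetical and are not the Bank's values.

```python
def weighted_group_pd(master_pd, n_obs, group):
    """Observation-weighted average of master scale PDs over the ratings in a group."""
    total = sum(n_obs[r] for r in group)
    return sum(master_pd[r] * n_obs[r] for r in group) / total

# Hypothetical master scale PDs and observation counts for rating group 6
master_pd = {"6+": 0.02, "6": 0.03, "6-": 0.04}
n_obs = {"6+": 100, "6": 200, "6-": 100}
group_pd = weighted_group_pd(master_pd, n_obs, ["6+", "6", "6-"])
```

The resulting value is what would be placed in the last column of the basic matrix for that rating group before the rest of the row is rescaled.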

Reducing the Basic Migration Matrix to the Monotone Type
The basic matrix is reduced to a form that is monotone relative to the principal diagonal by using smoothing functions. Transition probabilities are adjusted row by row, except for the values in the last column and on the principal diagonal.
To eliminate zero transition probabilities and non-monotonic values above the principal diagonal, we used the decreasing function y = a•exp(-b•t). Its parameters were selected using the Solver add-in in Excel (Table 13).
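The smoothing step can be illustrated with a Python sketch. It is not the Excel procedure used in the study: here the parameters of y = a·exp(-b·t) are estimated by ordinary least squares on ln y, using only the strictly positive observations, which is a simpler substitute for a numerical solver and may give slightly different parameters. The function name, the meaning of t as the distance from the principal diagonal, and the sample probabilities are assumptions.

```python
import math

def fit_exponential_decay(distances, probs):
    """Fit y = a * exp(-b * t) by least squares on ln(y), skipping zero observations."""
    points = [(t, math.log(p)) for t, p in zip(distances, probs) if p > 0]
    n = len(points)
    mean_t = sum(t for t, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    slope = (sum((t - mean_t) * (y - mean_y) for t, y in points)
             / sum((t - mean_t) ** 2 for t, _ in points))
    a = math.exp(mean_y - slope * mean_t)
    return a, -slope

# Hypothetical off-diagonal probabilities of one row: non-monotone, with a zero
distances = [1, 2, 3, 4]
observed = [0.08, 0.01, 0.02, 0.0]
a, b = fit_exponential_decay(distances, observed)
smoothed = [a * math.exp(-b * t) for t in distances]
```

The smoothed values are strictly positive and strictly decreasing in t whenever the fitted b is positive. After they replace the observed values, the row no longer sums to 100% in general and would need to be renormalized.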

Results of calculation of cumulative PD
The cumulative PDs were calculated by raising the adjusted basic one-year migration matrix to a power (Table 14 and Figure 2). Multiyear matrices were calculated for horizons not exceeding 5 years.

Reducing to the Master Scale, Results' Interpolation
Cumulative probabilities of default were transformed into conditional probabilities of default in accordance with the relationship in [12], in order to then align the first-year TTC PD with the Bank's master scale and to calculate the conditional PD for each rating inside the rating groups by logarithmic interpolation.
The conditional PDs for rating categories 345, 6, 7, 89 were calculated by means of logarithmic interpolation.
The conditional PDs for ratings 1+, 1, 1-, 2+, 2, 2-, 3+, 3, 3- were fixed at the Bank's master scale level. Marginal PDs were calculated on the basis of the conditional PDs (Table 15). The final values of mPD (excluding forward-looking information) are presented in Table 16 and Figure 3.
A comparison of the obtained mPD estimates for project financing (excluding forward-looking information) with the estimates for the Construction and Rental Business segments showed that for "good" rating grades (3+, 3, 3-, 4+, 4, 4-, 5+, 5, 5-) the obtained estimates are better (the probability of default is lower) than for the Construction and Rental Business segments. This may be because the PF portfolio contains a third fewer observations in "good" rating grades (Table 17).
For "average" and "bad" rating grades the obtained estimates are slightly worse (the probability of default is higher) than for the Construction and Rental Business segments. It should be noted that the PF portfolio contains 2.5 times as many observations in "bad" rating grades (8+, 8, 8-, 9) as the Construction and Rental Business segments. It is important to take this feature into consideration when assessing the final mPD value used to calculate ECL. Thus, the computed estimates do not overstate the ECL amount and best reflect the specific character of the PF portfolio. Therefore, they should be used in the Bank's ECL calculation.
Table 18 presents the final one-year conditional PDs, which reflect the probability of default taking into account the influence of macroeconomic information.
Table 19 presents the final one-year marginal PDs, which reflect the probability of default taking into account the influence of macroeconomic information and are used in the ECL estimate.