
Archive for the ‘Analytics’ Category

I wrote my layman’s introduction to scoring a while ago now and never delivered the more in-depth articles I promised. This is the first in a series of articles correcting that oversight. The team at Scorto has very kindly provided me with a white paper on scorecard building, which I will break into sections and reproduce here. In this first article, I’ll look into reject inference, a topic that has been asked about before.

One of the inherent problems with a scorecard is that while you can easily test whether you made the right decision in accepting an application, it is far harder to know whether you made the right decision in rejecting one. In the day-to-day running of a business this might not seem like much of a problem, but it is dangerous in two ways:
· it can limit the highly profitable growth opportunities around the cut-off point by hiding any segmenting behaviour a characteristic might have; and
· it can lead to a point where the data available for creating new scorecards represents only a portion of the population likely to apply. As this portion is disproportionately ‘good’, it can cause future scorecards to under-estimate the risk present in a population.
Each application provides a lender with a great deal of characteristic data: age, income, bureau score, etc. That application data is expensive to acquire, but of limited value until it is connected with behavioural data. When an application is approved, that value-adding behavioural data follows as a matter of course and comes cheaply: did the customer of age x, with income y and bureau score z, go “bad” or not? Every application that is rejected generates no such data, unless we go out of our way to get it; and that’s where reject inference comes into play.

The general population in a market will have an average probability of bad that is influenced by various national and economic characteristics, but is generally stable. A smaller sub-population will make up the total population of applicants for any given loan product – the average probability of bad in this total population will rise and fall more easily depending on marketing and product design. It is the risk of that total population of applicants that a scorecard should aim to understand. However, the data from existing customers is not a full reflection of that population: it has been filtered through the approval process and stripped of a lot of its bads. Very often, the key data problem in a scorecard build is the lack of information on “bad” outcomes, since that is what we are trying to model – the probability that an application with a given set of characteristics will end up “bad”. The more conservative the scoring strategy in question, the more the data will become concentrated in the better score quadrants and the weaker it will become for future scorecard builds. Clearly we need a way to bring back that information. Just because the rejected applications were too risky to approve doesn’t mean they’re too risky to add value in this exercise. We do this by combining the application data of the rejected applicants with external data sources or proxies. The main difficulty with this approach is the unavailability and/or inconsistency of the data, which may make it difficult to classify an outcome as “good” or “bad”. A number of methods can be used to infer the performance of rejected applicants.

Simple Augmentation
Not all rejected applications would have gone bad. We knew this at the time we rejected them; we just knew that too few would stay good to compensate for those that did go bad. So while a segment of applications with a 15% probability of bad might be deemed too risky, 85% of them would still have become good accounts. Using that knowledge we can reconsider the rejected applications in the data exercise.

· A base scoring model is built using data from the borrowers whose behavior is known – the previously approved book.
· Using the developed model, the rejected applications are scored and the percentage of “bad” borrowers among them is estimated; that performance is then assigned at random, but in proportion, across the rejected applications.
· The cut-off point should be set in accordance with the rules of the current lending policy that define the permissible level of bad borrowers.
· Information on the rejected and approved requests is merged and the resulting set is used to build the final scoring model.
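As a rough sketch, the steps above might look something like this in Python (the base model, field names and figures are hypothetical, not from the white paper):

```python
import random

def simple_augmentation(approved, rejected, p_bad):
    """Infer outcomes for rejected applications and merge them with the approved book.

    approved : list of (features, outcome) pairs with known outcomes (1 = bad)
    rejected : list of feature dicts with no outcome
    p_bad    : base model built on the approved book; maps features -> estimated P(bad)
    """
    inferred = []
    for features in rejected:
        # Assign 'bad' at random, in proportion to the modelled probability, so the
        # rejected segment as a whole reflects its estimated bad rate.
        outcome = 1 if random.random() < p_bad(features) else 0
        inferred.append((features, outcome))
    # The merged set is then used to build the final scoring model.
    return approved + inferred

# Hypothetical base model: modelled risk falls with age (illustrative only).
model = lambda f: max(0.05, 0.40 - 0.005 * f["age"])

random.seed(1)
known = [({"age": 45}, 0), ({"age": 22}, 1)]
rejects = [{"age": 21}, {"age": 23}, {"age": 25}]
merged = simple_augmentation(known, rejects, model)
print(len(merged))  # 5 records: 2 known + 3 inferred
```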

Accept/ Reject Augmentation
This method corrects the weights of the base scoring model by taking into consideration the likelihood of a request’s approval.
· The first step is to build a model that evaluates the likelihood of a request’s approval or rejection.
· The weights of the characteristics are adjusted taking into consideration the likelihood of the request’s approval or rejection, determined during the previous step. This is done so that the resulting scores are inversely proportional to the likelihood of the request’s approval. So, for example, if the original approval rate was 50% in a certain cluster, then each approved record is replicated to stand in for itself and the one that was rejected.
· This method is preferable to the Simple Augmentation method, but it is not without its own drawbacks. Two key problems can be created by augmentation: the impact of small and unusual groups can be exaggerated (such as low-side overrides for VIP clients); and, because the accept/reject model is built only on past decisions, it can contain nodes in which the approval rate was effectively 0% or 100%, making the inverse weighting unstable.
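A minimal sketch of the re-weighting step, assuming a hypothetical accept/reject model (the field name and probabilities are invented for illustration):

```python
def augment_weights(approved, p_approve):
    """Re-weight approved records by the inverse of their approval likelihood.

    approved  : list of feature dicts for the approved book
    p_approve : accept/reject model; maps features -> P(approval)
    Each approved record stands in for 1 / P(approval) applicants: a record from
    a cluster with a 50% approval rate gets weight 2.0, representing itself plus
    the similar applicant that was rejected.
    """
    return [(record, 1.0 / p_approve(record)) for record in approved]

# Hypothetical accept/reject model (illustrative only).
p_approve = lambda rec: 0.5 if rec["bureau_score"] < 600 else 1.0

book = [{"bureau_score": 580}, {"bureau_score": 700}]
weighted = augment_weights(book, p_approve)
print([w for _, w in weighted])  # [2.0, 1.0]
```

Note how a node with an approval rate near 0% would blow the weight up towards infinity, which is the instability mentioned above.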

Fuzzy Augmentation
The distinguishing feature of this method is that each rejected request is split and used twice, to reflect the likelihoods of both the good and the bad outcomes. In other words, if a rejected application has a 15% probability of going bad, it is split so that 15% of the record is assumed to go bad and 85% is assumed to stay good.
· Classification
– Evaluation of the rejected requests is performed using a base scoring model that was built on requests with a known status;
– The likelihood of default p(bad) and that of the “good” outcome p(good) are determined based on the set cut-off point, which defines the required percentage of “bad” requests (p(bad) + p(good) = 1);
– Two records, corresponding to the likelihoods of the “good” and “bad” outcomes, are formed for each rejected request;
– Evaluation of the rejected requests is performed taking into consideration the likelihoods of the two outcomes: the record for the “good” outcome is assigned the weight p(good), and the record for the “bad” outcome is assigned the weight p(bad).
· Clarification
– The data on the approved requests is merged with the data on the rejected requests and the rating of each request is adjusted taking into consideration the likelihood of the request’s further approval. For example, the frequency of the “good” outcome for a rejected request is evaluated as the result of the “good” outcome multiplied by the weight coefficient.
– The final scoring model is built based on the combined data set.
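The splitting step can be sketched as follows (field names and the 15% figure are illustrative only):

```python
def fuzzy_augmentation(approved, rejected, p_bad):
    """Split each rejected request into a weighted 'good' and 'bad' record.

    approved : list of (features, outcome, weight) triples; known outcomes
               carry full weight 1.0
    rejected : list of feature dicts with no outcome
    p_bad    : base model built on the known book; maps features -> P(bad)
    """
    augmented = list(approved)
    for features in rejected:
        bad = p_bad(features)
        good = 1.0 - bad                       # p(bad) + p(good) = 1
        augmented.append((features, 1, bad))   # partial 'bad' record, weight p(bad)
        augmented.append((features, 0, good))  # partial 'good' record, weight p(good)
    return augmented

p_bad = lambda f: 0.15  # hypothetical: a 15% probability of going bad
known = [({"age": 45}, 0, 1.0)]
data = fuzzy_augmentation(known, [{"age": 22}], p_bad)
print(data[1:])  # the two weighted records for the single rejected request
```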

Reject inference is not a silver bullet. Used inexpertly, it can lead to less accurate rather than more accurate results. Wherever possible, it is better to augment the exercise with a test-and-learn experiment to understand the true performance of small portions of key rejected segments. A new scorecard can then be built based on the data from this test segment alone, and the true bad rates from that model can be compared with, and averaged against, those from the reject inference model to get a more reliable bad rate for the rejected population.

Read Full Post »

We usually assume that in a given situation, the more conservative of two strategies will better protect the bank’s interest. So, in the sort of uncertain times that we are facing now, it is common to migrate towards more conservative approaches, but this isn’t always the best approach.
In fact, a more conservative approach can sometimes encourage the sort of behaviour that it aims to prevent. Provisions are a case in point.

Typically provisions are calculated based on a bank’s experience of risk over the last 6 months – as reflected in the net roll-rates. This period is long enough to smooth out any once-off anomalies and short enough to react quickly to changing conditions.
However, we were recently asked whether it wouldn’t be more conservative to use the worst net roll-rates over the last 10 years. While this is technically more conservative (since the worst roll-rates in 120 months are almost certainly worse than the worst roll-rates in 6 months), it could actually help to create a higher risk portfolio. Yes, the bank would immediately be more secure, but over time two factors are likely to push risk in the wrong direction:

1) The provision rate is an important source of feedback. It tells the originations team a lot about the risk that is coming into the portfolio from internal and external forces. The sooner the provisions react to new risks, the sooner the originations strategies can be adjusted. So, because a 10-year worst case scenario is an almost static measure, unaffected by changes in risk, new risk could be entering the portfolio without triggering any warnings. A slow and unintentional slide in credit quality would result.
2) Admittedly, other metrics can alert a lender to increases in risk, but there is another incentive at work, because provisions are the cost of carrying risk; by setting the cost of risk at a static and artificially high level you change the risk-reward dynamic in a portfolio.
A low risk customer segment should have a low cost of risk, allowing you to grow a portfolio by lending to low risk/ low margin customers. However, if all customers were to carry a high cost of risk regardless, only high margin customers would be profitable; and since high margin customers are usually also higher risk, there would be an incentive to grow the portfolio in the most risky segments.

In cases where the future is expected to be significantly worse than the recent past, it is therefore better to apply a flat provision overlay: a once-off increase in provisions that raises coverage but still allows provisions to rise and fall with changing risk.
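The overlay approach can be illustrated with a small sketch (the roll-rates and overlay figure are invented for illustration; real provisioning models are considerably more involved):

```python
def provision_rate(roll_rates, overlay=0.0):
    """Provision rate from recent net roll-rates, plus an optional flat overlay.

    roll_rates : net roll-rates for the last 6 months (fraction of balances
                 rolling through to loss)
    overlay    : once-off flat increase applied when the future is expected
                 to be worse than the recent past
    The base rate still rises and falls with observed risk; the overlay only
    shifts the level of coverage.
    """
    base = sum(roll_rates) / len(roll_rates)
    return base + overlay

recent = [0.020, 0.022, 0.021, 0.025, 0.024, 0.026]
print(round(provision_rate(recent), 4))                # responsive base rate
print(round(provision_rate(recent, overlay=0.01), 4))  # same responsiveness, higher coverage
```

Contrast this with a 10-year worst-case figure, which would sit at a fixed level regardless of how the recent roll-rates move.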

Read Full Post »

You will almost certainly have heard the phrase, ‘you can’t manage what you don’t measure’. This is true, but there is a corollary to that phrase which is often not considered: ‘you have to manage what you do measure’.

To manage a business you need to understand it, but more reports do not necessarily mean a deeper understanding. More reports do, however, mean more work, often exponentially more work. So while regular reporting is obviously important for the day-to-day functioning of a business, its extent should be carefully planned.
Since I started this article with one piece of trite wisdom, I’ll continue. I’m trying to write my first novel – man can not live on tales of credit risk strategy alone – and in a writing seminar I attended the instructor made reference to this piece of wisdom which he picked-up in an otherwise forgettable book on script writing, ‘if nothing has changed, nothing has happened’.
It is important to look at the regular reports generated in an organization with this philosophy in mind – do the embedded metrics enable the audience to change the business? If the audience is not going to – or is not able to – change anything based on a metric, then nothing is actually happening; and if nothing is happening, why are we spending money on it?
Don’t get me wrong, I am an ardent believer in the value of data and data analytics; I just question the value of regular reporting. Those two subjects are definitely related, but they are not the same; at times I believe they are fundamentally opposed.

An over-reliance on reporting can damage a business in four ways:

Restricting Innovation and Creativity
Raw data – stored in a well-organized and accessible database – encourages creative and insightful problem solving, it begs for innovative relationships to be found, provides opportunities for surprising connections to be made, and encourages ‘what if’ scenario planning.
Reports are tools for managing an operation. Reports come with ingrained expectations and encourage more constrained and retrospective analysis. They ask questions like ‘did what we expect to happen, actually happen’.
The more an organization relies on reports, the more, I believe, it will tend to become operational in nature and backward-focused in its analytics, asking and explaining what happened last month and how that differed from the plan and from the month before. Yes, it is important to know how many new accounts were opened and whether that was more or fewer than planned for in the annual budget, but no one ever changed the status quo by knowing how many accounts they had opened.
The easiest way to look good as the analytics department in an organization with a heavy focus on reports, is to get those reports to show stable numbers in-line with the annual plan, thus raising as few questions as possible; and the easiest way to do that is by implementing the same strategy year after year. To look good in an organization that understands the real value of data though, an analytics department has to add business value, has to delve into the data and has to come up with insightful stories about relationships that weren’t known last year, designing and implementing innovative strategies that are by their nature hard to plan accurately in an annual budgeting process, but which have the potential to change an industry.

Creating a False Sense of Control
Reports also create an often false sense of accuracy. A report, nicely formatted and with numbers showing month-on-month and year-to-date changes to the second decimal place, carries a sense of authority. If the numbers today look like the numbers did a year ago, they feel like they must be right; but that similarity also reduces the incentive to test the underlying assumptions, and the numbers can only ever be as accurate as those assumptions: how is profit estimated, how is long-term risk accounted for, how are marketing costs accounted for, how much growth is assumed, and are those assumptions still valid?
Further, in a similar way to how too many credit policies can end up reducing the accountability of business leaders rather than increasing it, when too much importance is placed on reporting managers become accountable for knowing their numbers, rather than knowing their businesses. If you can say how much your numbers changed month-on-month but not why, then you’re focusing on the wrong things.

Raising Costs
Every report includes multiple individual metrics and goes to multiple stakeholders, and each of those metrics has the potential to raise a question with each of those stakeholders. This is good if the question being raised influences the actions of the business, but the work involved in answering a question is not related to the value of answering it, and so as more metrics of lesser importance are added to a business’ vocabulary, the odds of a question generating non-value-adding work increase exponentially.
Once it has been asked, it is hard to ignore a question pertaining to a report without looking like you don’t understand your business, but sometimes the opposite is true. If you really understand your business you’ll know which metrics are indicative of its overall state and which are not. While your own understanding of your business should encompass the multiple and detailed metrics impacting your business, you should only be reporting the most important of those to broader audiences.
And it is not just what you’re reporting, but to whom. Often a question asked out of interest by an uninvolved party can trigger a significant amount of work without providing any extra control or oversight. Better reports and better audiences should therefore replace old ones: metrics that are not value-adding in a given context should not be displayed in that context, or the audience should change until the context is right.

Compounding Errors
The biggest problem, though, that I have with a report-based approach is the potential for compounding errors. When one report is compiled based off another report there is always the risk that an error in the first will be included in the second. This actually costs the organization in two ways: firstly the obvious risk of incorrectly informed decisions and secondly in the extra work needed to stay vigilant to this risk.
Numbers need to be checked and rechecked, formats need to be aligned or changed in synchronization, and reconciliations need to be carried out where constant differences exist – month-end data versus cycle end data, monthly average exchange rates versus month-end exchange rates, etc.
Time should never be spent getting the numbers to match; that changes nothing. Time should rather be spent creating a single source of data that can be accessed by multiple teams and left in its raw state; any customization of the data done by one team then remains isolated from all other teams.

Reports are important and will remain so, but their role should be understood. A few key metrics should be reported widely and these should each add a significant and unique piece of information about an organization’s health, at one level down a similar report should break down the team’s performance, but beyond that time and resources should be invested in the creative analysis of raw data, encouraging the creation of analytics-driven business stories.
Getting this right will involve a culture change more than anything, a move away from trusting the person who knows their numbers to trusting the person who provides the most genuine insight.
I know of a loan origination operation that charges sales people a token fee for any declined application which they ask to have manually referred, forcing them to consider the merits of the case carefully before adding to the costs. A similar approach might be helpful here: charging audiences for access to monthly reports on a per-metric basis. This could be an actual monetary fine which is saved up for an end-of-year event, or a virtual currency awarded on a quota basis.

Read Full Post »

First things first, I am by no means a scorecard technician. I do not know how to build a scorecard myself, though I have a fair idea of how they are built; if that makes sense. As the title suggests, this article takes a simplistic view of the subject. I will delve into the underlying mathematics at only the highest of levels and only where necessary to explain another point. This article treats scorecards as just another tool in the credit risk process, albeit an important one that enables most of the other strategies discussed on this blog. I have asked a colleague to write a more specialised article covering the technical aspects and will post that as soon as it is available.

 

Scorecards aim to replace subjective human judgement with objective and statistically valid measures; replacing inconsistent anecdote-based decisions with consistent evidence-based ones. What they do is essentially no different from what a credit assessor would do, they just do it in a more objective and repeatable way. Although this difference may seem small, it enables a large array of new and profitable strategies.

So what is a scorecard?

A scorecard is a means of assigning importance to pieces of data so that a final decision can be made regarding the underlying account’s suitability for a particular strategy. It does this by separating the data into its individual characteristics and then assigning a score to each characteristic based on its value and the average risk represented by that value.

For example, an application for a new loan might be separated into age, income, length of relationship with the bank, credit bureau score, etc. Then each possible value of those characteristics will be assigned a score based on the degree to which it impacts risk. In this example, ages between 19 and 24 might be given a score of -100, ages between 25 and 30 a score of -75, and so on until ages 50 and upwards are given a score of +10. In this scenario young applicants are ‘punished’ while older customers benefit marginally from their age. This implies that risk has been shown to be inversely related to age. The diagram below shows an extract of a possible scorecard:

The score for each of these characteristics is then added to reach a final score. The final score produced by the scorecard is attached to a risk measure; usually something like the probability of an account going 90 days into arrears within the next 12 months. Reviewing this score-to-risk relationship allows a risk manager to set the point at which they will decline applications (the cut-off) and to understand the relative risk of each customer segment on the book. The diagram below shows how this score-to-risk relationship can be used to set a cut-off.
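As an illustration of the scoring mechanics described above (all point values, field names and the cut-off are hypothetical, and the middle age band is an invented interpolation between the example’s end-points):

```python
# Hypothetical scorecard extract: the age characteristic maps value ranges to
# points; a full scorecard would have a table like this per characteristic.
AGE_SCORES = [(19, 24, -100), (25, 30, -75), (31, 49, -30), (50, 200, 10)]

def score_age(age):
    for low, high, points in AGE_SCORES:
        if low <= age <= high:
            return points
    return 0

def score_application(app):
    # In a full scorecard every characteristic (income, bureau score, length
    # of relationship, ...) would contribute a component like score_age.
    return score_age(app["age"]) + app["bureau_points"]

CUT_OFF = -50  # chosen by the risk manager from the score-to-risk relationship

app = {"age": 27, "bureau_points": 40}
total = score_application(app)
print(total, "accept" if total >= CUT_OFF else "decline")  # -35 accept
```

Everyone above the cut-off is accepted, and the score itself still ranks the accepted accounts by risk, which is what enables the segmented strategies discussed later.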

How is a scorecard built?

Basically what the scorecard builder wants to do is identify which characteristics at one point in time are predictive of a given outcome before or at some future point in time. To do this, historic data must be structured so that one period can represent the ‘present state’ and the subsequent periods can represent the ‘future state’. In other words, if two years of data is available for analysis (the current month can be called Month 0 and the oldest month Month -24), then the most distant six months (from Month -24 to Month -18) are used to represent the ‘current state’ or, more correctly, the observation period, while the subsequent months (Months -17 to 0) represent the known future of those first six months and are called the ‘outcome period’. The type of data used in each of these periods will vary to reflect these differences, so that application data (applicant age, applicant income, applicant bureau score, loan size requested, etc.) is important in the observation period and performance data (current balance, current days in arrears, etc.) is important in the outcome period.

With this simple step completed the accounts in the observation period must be defined and sorted based on their performance during the outcome period. To start this process a ‘bad definition’ and ‘good definition’ must first be agreed upon. This is usually something like: ‘to be considered bad, an account must have gone past 90 days in delinquency at least once during the 18 month outcome period’ and ‘to be considered good an account must never have gone past 30 days in delinquency during the same period’. Accounts that meet neither definition are classified as ‘indeterminate’.
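A sketch of such a good/bad definition (the 90- and 30-day thresholds are taken from the example above; the field name is hypothetical):

```python
def classify(account):
    """Label an account from its outcome-period performance.

    account: dict with 'max_delinquency_days', the worst delinquency the
    account reached during the 18-month outcome period.
    """
    worst = account["max_delinquency_days"]
    if worst > 90:
        return "bad"            # went past 90 days at least once
    if worst <= 30:
        return "good"           # never went past 30 days
    return "indeterminate"      # meets neither definition

print([classify({"max_delinquency_days": d}) for d in (120, 10, 60)])
# ['bad', 'good', 'indeterminate']
```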

Thus separated, the unique characteristics of each group can be identified. The data that was available at the time of application for every ‘good’ and ‘bad’ account is statistically tested and those characteristics with largely similar values within one group but largely varying values across groups are valuable indicators of risk and should be considered for the scorecard. For example if younger customers were shown to have a higher tendency to go ‘bad’ than older customers, then age can be said to be predictive of risk. If on average 5% of all accounts go bad but a full 20% of customers aged between 19 and 25 go bad while only 2% of customers aged over 50 go bad then age can be said to be a strong predictor of risk. There are a number of statistical tools that will identify these key characteristics and the degree to which they influence risk more accurately than this but they won’t be covered here.
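A simple way to compare bad rates across bands of a characteristic, as a rough stand-in for the formal statistical tests mentioned above (the sample data is invented):

```python
def bad_rate_by_band(accounts, band_of):
    """Compare bad rates across bands of a single characteristic.

    accounts: list of (value, is_bad) pairs observed at application time
    band_of : function mapping a characteristic value to a band label
    A band whose bad rate differs sharply from the overall rate marks the
    characteristic as a candidate predictor for the scorecard.
    """
    totals, bads = {}, {}
    for value, is_bad in accounts:
        band = band_of(value)
        totals[band] = totals.get(band, 0) + 1
        bads[band] = bads.get(band, 0) + int(is_bad)
    return {band: bads[band] / totals[band] for band in totals}

band = lambda age: "19-25" if age <= 25 else ("over 50" if age >= 50 else "26-49")
sample = [(22, True), (22, False), (23, False), (24, False), (55, False),
          (60, False), (58, False), (61, False), (62, False), (57, True)]
print(bad_rate_by_band(sample, band))
```

In practice measures such as information value or a chi-squared test would quantify how strongly each characteristic separates the groups, but the intuition is exactly this comparison of bad rates.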

Once each characteristic that is predictive of risk has been identified, along with its relative importance, some cleaning-up of the model is needed to ensure that no characteristics are overly correlated – that is, that no two characteristics are in effect showing the same thing. If this is the case, only the best of the related characteristics will be kept while the others are discarded to prevent, for want of a better term, double-counting. Many characteristics are correlated in some way – for example, the older you are the more likely you are to be married – but this is fine so long as both characteristics add some new information in their own right, as is usually the case with age and marital status: an older, married applicant is less risky than a younger, married applicant, just as a married, older applicant is less risky than a single, older applicant. However, there are cases where two characteristics move so closely together that the one does not add any new information and should therefore not be included.

So, once the final characteristics and their relative weightings have been selected the basic scorecard is effectively in place. The final step is to make the outputs of the scorecard useable in the context of the business. This usually involves summarising the scores into a few score bands and may also include the addition of a constant – or some other means of manipulating the scores – so that the new scores match with other existing or previous models.

 

How do scorecards benefit an organisation?

Scorecards benefit organisations in two major ways: by describing risk in very fine detail they allow lenders to move beyond simple yes/ no decisions and to implement a wide range of segmented strategies; and by formalising the lending decision they provide lenders with consistency and measurability.

One of the major weaknesses of a manual decisioning system is that it seldom does more than identify the applications which should be declined leaving those that remain to be accepted and thereafter treated as being the same. This makes it very difficult to implement risk-segmented strategies. A scorecard, however, prioritises all accounts in order of risk and then declines those deemed too risky. This means that all accepted accounts can still be segmented by risk and this can be used as a basis for risk-based pricing, risk-based limit setting, etc.

The second major benefit comes from the standardisation of decisions. In a manual system the credit policy may well be centrally conceived but the quality of its implementation will be dependent on the branch or staff member actually processing the application. By implementing a scorecard this is no longer the case and the roll-out of a scorecard is almost always accompanied by the reduction in bad rates.

Over-and-above these risk benefits, the roll-out of a scorecard is also almost always accompanied by an increase in acceptance rates. This is because manual reviewers tend to be more conservative than they need to be in cases that vary in some way from the standard. The nature of a single credit policy is such that to qualify for a loan a customer must exceed the minimum requirements of every policy rule. For example, to get a loan the customer must be above the minimum age (say 28), must have been with the bank for more than the minimum period (say 6 months) and must have no adverse remarks on the credit bureau. A client of 26 with a five-year history with the bank and a clean credit report would be declined. With a scorecard in place, though, the relative importance of exceeding one criterion can be weighed against the relative importance of missing another, and a more accurate decision can be made; almost always allowing more customers in.

 

Implementing scorecards

There are three levels of scorecard sophistication and, as with everything else in business, the best choice for any situation will likely involve a compromise between accuracy and cost.

The first option is to create an expert model. This is a manual approximation of a scorecard based on the experience of several experts. Ideally this exercise would be supported by some form of scenario planning tool, where the results of various adjustments could be seen for a series of dummy applications – or genuine historic applications if these exist – until the results meet the expectations of the ‘experts’. This method is better than manual decisioning since it leads to a system that looks at each customer in their entirety and enforces a standardised outcome. That said, since it is built upon relatively subjective judgements, it should be replaced with a statistically built scorecard as soon as enough data is available to do so.

An alternative to the expert model is a generic scorecard. These are scorecards which have been built statistically, but using a pool of similar though not customer-specific data. These scorecards are more accurate than expert models, so long as the data on which they were built reasonably resembles the situation in which they are to be employed. A bureau-level scorecard is probably the purest example of such a scorecard, though generic scorecards exist for a range of different products and for each stage of the credit life-cycle.

Ideally, they should first be fine-tuned prior to their roll-out to compensate for any customer-specific quirks that may exist. During a fine-tuning, actual data is run through the scorecard and the results are used to make small adjustments to the weightings given to each characteristic, while the structure of the scorecard itself is left unchanged. For example, assume the original scorecard assigned the following weightings: -100 for the age group 19 to 24; -75 for the age group 25 to 30; -50 for the age group 31 to 40; and 0 for the age group 41 upwards. This could be implemented as it is, but if there is enough data to do a fine-tune, it might reveal that in this particular case the weightings should actually be as follows: -120 for the age group 19 to 24; -100 for the age group 25 to 30; -50 for the age group 31 to 40; and 10 for the age group 41 upwards. The scorecard structure, though, as you can see, does not change.

In a situation where there is no client-specific data and no industry-level data exists, an expert model may be best. However, where there is no client-specific data but where there is industry-level data it is better to use a generic scorecard. In a case where there is both some client-specific data and some industry-level data a fine-tuned generic scorecard will produce the best results.

The most accurate results will always come, however, from a bespoke scorecard. That is a scorecard built from scratch using the client’s own data. This process requires significant levels of good quality data and access to advanced analytical skills and tools but the benefits of a good scorecard will be felt throughout the organisation.


Read Full Post »

You’ve got to know when to hold ‘em, know when to fold ‘em

Know when to walk away and know when to run

I’ve always wanted to use the lines from Kenny Rogers’ famous song, The Gambler, in an article. But that is only part of the reason I decided to use the game of Texas Holdem poker as a metaphor for the credit risk strategy environment.

The basic profit model for a game of poker is very similar to that of a simple lending business. To participate in a game of Texas Holdem there is a fixed cost (the buy-in) in exchange for which there is the potential to make a profit, but also the risk of making a loss. As each card is dealt, new information is revealed and the player should adjust their strategy accordingly. Not every hand will deliver a profit, and some will even incur a fairly substantial loss; however, over time and by following a good strategy, the total profit accumulated from the winning hands can be sufficient to cover both the losses from the losing hands and the fixed costs of participating, and a profit can thus be made.

Similarly in a lending business there is a fixed cost to process each potential customer, only some of whom will be accepted as actual customers who have the potential to be profitable or to result in a loss.  The lender will make an overall profit only if the accumulated profit from each profitable customer is sufficient to cover the losses from those that weren’t and the fixed processing costs.

In both scenarios, the profit can be maximised by increasing exposure to risk when the odds of a profit are good and reducing exposure, on the other hand, when the odds of a loss are higher. A good card player therefore performs a similar role to a credit analyst: continuously calculating the odds of a win from each hand, designing strategies to maximise profit based on those odds and then adjusting those strategies as more information becomes available.

Originations

To join a game of Texas Holdem each player needs to buy into that game by placing a ‘blind’ bet before they have seen any of the cards.  As this cost is incurred before any of the cards are seen, the odds of victory cannot be estimated. The blind bet is, in fact, the price to see the odds.

Thereafter, each player is dealt two private cards; cards that only they can see. Once these cards have been dealt each player must decide whether to play the game or not.

To play on, each player must enter a further bet. This decision must be based on the size of the bet and an estimate of the probability of victory given the two known cards. If the player instead chooses not to play, they will forfeit their initial bet.

A conservative player, one who will play only when the odds are strongly in their favour, may lose fewer hands but will instead incur a relatively higher cost of lost buy-ins. Depending on the cost of the buy-in and the average odds of winning, the most profitable strategy will change, but it is unlikely to be the most conservative one.

In a lending organisation the equivalent role is played by the originations team. Every loan application that is processed incurs a cost, and so when an application is declined that cost is lost. A conservative scorecard policy will decline a large number of marginal applications, choosing, effectively, to lose a small but known processing cost rather than risk a larger but unknown credit loss.  In so doing though, it also gives up the profit potential on those accounts. As with poker betting strategies, the ideal cut-off will change based on the level of processing costs and the average probability of default but will seldom be overly conservative.

A card player calculates their odds of victory from the known combinations of cards possible from a standard 52-card deck.  The player can create any five-card combination made up from their two known cards and the five community cards yet to be dealt, while each other player can create a five-card combination from any of the remaining cards except the two the player himself holds.  With this knowledge, the player can estimate the odds that the two private cards will result in a winning hand and, based on that estimate, decide whether to enter a bet and, if so, of what size; or whether to fold and lose the buy-in.
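The scale of that calculation can be illustrated with a little combinatorics. A minimal sketch (the deck and hand structure are as described above; the variable names are my own):

```python
from math import comb

# After a player has seen their own two cards, 50 of the 52 cards
# remain unseen, so the five community cards can fall in C(50, 5) ways.
possible_boards = comb(50, 5)
print(possible_boards)  # 2118760
```

Two million possible boards per starting hand is why players work from estimated odds rather than enumerating outcomes.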

The methods used to calculate odds may vary, as do the sources of potential profits, but at a conceptual level the theory on which originations is based is similar to the theory which under-pins poker betting.

As each account is processed through a scorecard the odds of it eventually rolling into default are estimated. These odds are then used to make the decision whether to offer credit and, if so, to what extent.  Where the odds of a default are very low the lender will likely offer more credit – the equivalent of placing a larger starting bet – and vice versa.

Customer Management

The reason that card games like Texas Holdem are games of skill rather than just games of chance is that the odds of a victory change during the course of a game, so the player is required to adapt their betting strategy as new information is revealed: increasing their exposure to risk as the odds grow better and retreating as the odds worsen.  The same is true of a lending organisation, where customer management strategies seek to maximise organisational profit by changing exposure as new information is received.

Once the first round of betting has been completed and each player’s starting position has been determined, the dealer turns over three ‘community cards’.  These are cards that all players can see and can use, along with their two private cards, to create their best possible poker hand. A significant amount of new information is revealed when those three community cards are dealt. In time two further community cards will be revealed and it will be from any combination of those seven cards that a winning hand will be constructed. So, at this point, each player knows five of the seven cards they will have access to and three of the cards their opponents can use. The number of possible hands becomes smaller and so the odds that the player’s hand will be a winner can be calculated more accurately. That is not to say the odds of a win will go up, just that the odds can be stated with more certainty.

At this stage of the game, therefore, the betting activity usually heats up as players with good hands increase their exposure through bigger bets. Players with weaker hands will try to limit their exposure by checking – that is not betting at all – or by placing the minimum bet possible. This strategy limits their potential loss but also limits their potential gain as the total size of the ‘pot’ is also kept down.

As each of the next two community cards is revealed this process repeats itself with players typically willing to place ever larger bets as the new information received allows them to calculate the odds with more certainty. Only once the final round of betting is complete are the cards revealed and a winner determined. Those players that bet until the final round but still lose will have lost significantly in this instance. However, if they continue to play the odds well they will expect to recuperate that loss – and more – over time.

The customer management team within a lending organisation works on similar principles. As an account begins to operate, new information is received which allows the lender to determine with ever more certainty the probability that an account will eventually default: with every payment that is received on time, the odds of an eventual default decrease; with every broken promise-to-pay, those odds increase; etc.

So the role of the customer management team is to design strategies that optimise the lender’s exposure to each customer based on the latest information received. Where risk appears to be dropping, exposure should be increased through limit increases, cross-selling of new products, reduced pricing, etc. while when the opposite occurs the exposure should be kept constant or even decreased through limit decreases, pre-delinquency strategies, foreclosure, etc.

Collections

As the betting activity heats up around them a player may decide that the odds no longer justify the cost required to stay in the game and, in these cases, the player will decide to fold – and accept a known small loss rather than continue betting and risk an even bigger eventual loss chasing an unlikely victory.

Collections has too many operational components to fit neatly into the poker metaphor but it can be most closely likened to this decision of whether or not to fold. Not every hand can be a winner and even hands that initially appeared to be strong can be shown to be weak when the later community cards are revealed. A player who was dealt two hearts and who then saw two further hearts dealt in the first three community cards would have been in a strong position, with the odds of catching the fifth heart needed to complete a strong ‘flush’ hand sitting at roughly thirty-five percent over the final two cards. However, if neither of the next two cards dealt is a heart, the probability of a winning hand will drop to close to zero.
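The flush-draw arithmetic can be checked directly. A quick sketch: nine of the forty-seven unseen cards complete the flush, so the draw misses only if both the turn and the river are non-hearts.

```python
# Four hearts after the flop: 9 of the 47 unseen cards are hearts.
# The draw misses only if both remaining cards are among the 38 non-hearts.
p_miss = (38 / 47) * (37 / 46)
p_flush = 1 - p_miss
print(round(p_flush, 2))  # 0.35
```

The same outs-over-unseen-cards logic is what a lender does when it re-estimates default odds as each new piece of account behaviour arrives.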

In this situation the player needs to make a difficult decision: they have invested in a hand that has turned out to be a ‘bad’ one and they can either accept the loss or invest further in an attempt to salvage something. If there is little betting pressure from the other players, they might choose to stay in the game by matching any final bets; figuring that because the total pot was large and the extra cost of participating small it was worth investing further in an unlikely win. Money already bet, after all, is a sunk cost. If the bets in the latest round are high however, they might choose to fold instead and keep what money they have left available for investment in a future, hopefully better hand.

As I said, the scope of collections goes well beyond this but certain key decisions a collections strategy manager must make relate closely to the question of whether or not to fold. Once an account has missed a payment and entered the collections processes the lender has two options: to invest further time and money in an attempt to collect some or all of the outstanding balance or to cut their losses and sell or even to write-off the debt.

In cases where there is strong long-term evidence that the account is a good one, the lender might decide – as a card player might when a strong hand is not helped by the fourth community card – to maintain or even increase their exposure by granting the customer some leeway in the form of a payment holiday, a re-aging of debt or even a temporary limit increase. On the other hand, in cases where the new information has forced a negative re-appraisal of the customer’s risk but the value owed by that customer is significant, it might still be preferable for the lender to invest a bit more in an attempt to make a recovery, even though they know that the odds are against them. This sort of an investment would come in the form of an intensive collections campaign or the paid involvement of specialist third party debt collectors.

As with a game of cards, the lender will not always get it exactly right and will over-invest in some risky customers and under-invest in others; the goal is to get the investment right often enough in the long-term to ensure a profit overall.

It is also true that a lender who consistently shies away from investing in the collection of marginal debt – one that chooses too easily to write-off debt rather than to risk an investment in its recovery – may start to create a reputation for themselves that is punitive in the long-run. A lender that is seen as a ‘soft touch’ by the market will attract higher risk customers and will see a shift in portfolio risk towards the high-end as more and more customers decide to let their debt fall delinquent in the hopes of a painless write-off. Similarly a card player that folds in all situations except those where the odds are completely optimal will soon be found out by their fellow players. Whenever they receive the perfect hand and bet accordingly, the rest of the table will likely fold and in so doing reduce the size of the ensuing pot which, although won, will be much smaller than it might otherwise have been. In extreme cases, this limiting of the wins gained from good hands may be so severe that the player is unable to cover the losses they have had to take in the games in which they folded.

Summary

The goal of credit risk strategy, like that of a poker betting strategy, is to end with the most money possible. To do this, calculated bets must be taken at various stages and with varying levels of data; risk must be re-evaluated continuously and at times it may become necessary to take a known loss rather than to risk ending up with an even greater, albeit uncertain, loss in the future.

So, in both scenarios, risk should not be avoided but should rather be converted into a series of numerical odds which can be used to inform investment strategies that seek to leverage off good odds and hedge against bad odds. In time, if accurate models are used consistently to inform logical strategies it is entirely possible to make a long-term profit.

Of course in their unique nuances both fields also vary quite extensively from each other, not least in the way money is earned and, most importantly, in the fact that financial services is not a zero sum game. However, I hope that where similarities do exist these have been helpful in understanding how the profit levers in a lending business fit together. For a more technical look at the same issue, you can read my articles on profit modelling in general and for credit cards and banks in particular.

Read Full Post »

There are certainly analytical tools in the market that are more sophisticated than Excel, and there are certainly situations where these are needed to deliver enhanced accuracy or advanced features. However, this article concentrates on building models to aid the decision-making process of a business leader rather than a specialist statistician; the need is for a model that is flexible and easy to use.  Since Excel is so widely available and understood, it is usually the best tool for this purpose.

In this article I will assume a basic understanding of Excel and its in-built mathematical functions.  Instead, I’ll discuss how some of the more advanced functions can be used to build decision-aiding models and, in particular, how to create flexible matrices.

Spreadsheets facilitate flexibility by allowing calculations to be parameterised so that a model can be built with the logic fixed but the input values flexible.  For example, the management of a bank may agree that the size of a credit limit granted to a customer should be based on that customer’s risk and income and that VIP customers should be entitled to an extra limit extension, though they may disagree over one or more of the inputs.  The limit-setting logic can be programmed into Excel as an equation that remains constant while the definition of what constitutes each risk group, each income band, the size of each limit and the size of the VIP bonus extension can each be changed at will.

When building a model to assist with business decision-making, the key is to make sure that each profit lever is represented by a parameter that can be quickly and easily altered by decision-makers without altering the logical and established links between each of those profit levers.  Making use of Excel’s advanced functions and some simple logic, it is possible to do this in almost all situations without the resulting model becoming too complex for practical business use.

*     *     *     *     *

If I were to guess, I would say that over 80% of the functionality needed to build a flexible decision-making model can be created using Excel’s basic mathematical functions, ‘IF clauses’ and ‘LOOKUPs’.

If Clauses

IF clauses, once understood, have a multitude of uses.  When building a model to aid decision-making they are usually one of the most important tools at an analyst’s disposal.  Simply put, an IF clause provides a binary command: if a given event happens do this, if not do that.  If a customer number has been labelled as VIP, add €5 000 extra to the proposed limit, if not do not add anything, etc. 

IF( CustStatus = “VIP”, 5000, 0 )

Using this simple logic, it is possible to replicate a decision tree connecting a large number of decisions to create a single strategy.  IF clauses are very useful for categorising data, for identifying or selecting specific events, etc.
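The same binary logic ports directly to any language. A minimal Python sketch of the VIP rule from the formula above (the €5 000 figure comes from the text; the base limit and field names are hypothetical):

```python
def vip_bonus(cust_status: str) -> int:
    # Equivalent of IF( CustStatus = "VIP", 5000, 0 )
    return 5000 if cust_status == "VIP" else 0

def proposed_limit(base_limit: int, cust_status: str) -> int:
    # Chaining simple IFs like this is how a decision tree is built up
    return base_limit + vip_bonus(cust_status)

print(proposed_limit(20000, "VIP"))       # 25000
print(proposed_limit(20000, "Standard"))  # 20000
```

Each IF adds one branch; connecting many such branches reproduces the decision tree described above.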

There are two important variations of the basic IF clause: SUMIF and COUNTIF.  These two functions allow you to determine how often, or to what degree, a certain event has occurred.  Both functions share the same underlying logic, though the COUNTIF function is simpler: what is the total sum of balances on all VIP accounts, or simply how many VIP accounts are there?

SUMIF( Sheet1!$A$1:$A$200, “VIP”, Sheet1!$B$1:$B$200 ) or

COUNTIF( Sheet1!$A$1:$A$200, “VIP” )
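The same aggregations translate into one-line comprehensions. A sketch with a hypothetical account list standing in for columns A (status) and B (balance) of Sheet1:

```python
# Hypothetical stand-ins for Sheet1!$A$1:$A$200 and Sheet1!$B$1:$B$200
accounts = [("VIP", 12000), ("Standard", 3000), ("VIP", 8000), ("Standard", 1500)]

# SUMIF equivalent: total balance held on VIP accounts
vip_balance = sum(balance for status, balance in accounts if status == "VIP")

# COUNTIF equivalent: number of VIP accounts
vip_count = sum(1 for status, _ in accounts if status == "VIP")

print(vip_balance, vip_count)  # 20000 2
```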

 

Lookups

Look-ups, on the other hand, are used to retrieve related data; replicating some of the basic functionality of a database. 

A ‘lookup’ will take one value and retrieve a corresponding alternate value from a specified table.  Perhaps this is easier to understand through an example: assume there is a list showing which branch each of a bank’s customers belongs to. Given a random selection of customer numbers, a lookup would take each of those customer numbers and search the larger list until it found the matching number, and would then retrieve the associated branch name next to that customer number in the table.

 

Ending a VLOOKUP statement with ‘FALSE’ means that only exact matches are permitted.  If the function had instead ended with ‘TRUE’, it would have looked for the nearest possible match to the given customer number from within the list and returned the value corresponding to that.  This is not particularly useful in an example like this one, but it is a useful way to group values into logical batches, among other things.  For example, if I had a list of salaries and wanted to summarise them into salary bands, I could create a table with the lowest and highest value in each band and then use a lookup ending with TRUE to find the band into which each unique salary falls.
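The salary-band idea maps neatly onto a sorted table of lower bounds, which is exactly how a range lookup works. A sketch with hypothetical band edges:

```python
from bisect import bisect_right

# Sorted lower bounds, as in a VLOOKUP ending in TRUE, which returns the
# last row whose first column is <= the lookup value. Edges are made up.
lower_bounds = [0, 2000, 5000, 10000]
band_names = ["Low", "Moderate", "High", "Very High"]

def salary_band(salary: float) -> str:
    # bisect_right counts how many lower bounds are <= the salary
    return band_names[bisect_right(lower_bounds, salary) - 1]

print(salary_band(3500))   # Moderate
print(salary_band(12000))  # Very High
```

As with the Excel version, the table must be sorted ascending for the nearest-match logic to hold.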

 

There are actually two types of lookups in Excel: vertical lookups (VLOOKUP) and horizontal lookups (HLOOKUP).  The former looks down a list until it finds the matching value (and then moves across to find the pertinent field) while the latter looks across a list (and then moves down to find the pertinent field); other than that the logic remains the same.

In the above example, the lookup will take a given customer number within a table on the sheet and, once it has been found, will return the value in the second column from the left of that table.  If it had instead been an HLOOKUP function, the value returned would have been the one in the second row from the top.

 

Embedded Functions

The real value of IF clauses and LOOKUPs comes when they are used together, either with each other or with other Excel functions.  For example, if the account is labelled “VIP” then look for the associated relationship manager in a list of all the relationship managers, if not then look for the associated branch name – in both cases using the customer number to do the matching.

IF( CustStatus = “VIP”, VLOOKUP( CustNum, RelMans!$A$1:$B$20, 2, FALSE ), VLOOKUP( CustNum, Branches!$A$1:$B$50, 2, FALSE ))

In these cases, the results of the embedded function are used by the main function to deliver a result.
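The embedded IF-plus-lookup can be sketched in Python, with two hypothetical dictionaries standing in for the RelMans and Branches sheets:

```python
# Hypothetical lookup tables keyed on customer number
rel_managers = {1001: "A. Jones", 1002: "B. Smith"}
branches = {1001: "Main Street", 1003: "Harbour View"}

def contact_for(cust_num: int, cust_status: str) -> str:
    # IF( CustStatus = "VIP", VLOOKUP(..RelMans..), VLOOKUP(..Branches..) )
    table = rel_managers if cust_status == "VIP" else branches
    return table[cust_num]

print(contact_for(1001, "VIP"))       # A. Jones
print(contact_for(1003, "Standard"))  # Harbour View
```

The outer condition picks the table; the inner lookup does the matching, just as the embedded VLOOKUPs do inside the IF clause.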

Matrices

In most cases, however, businesses need to make decisions on more factors than can be represented simply by lists; in our example, credit limits cannot be set with reference to risk alone – income, as a proxy for spend, also needs to be borne in mind.  When building a business model, a useful tool is a two-dimensional matrix where results can be retrieved using embedded VLOOKUPs and HLOOKUPs.  Creating matrices in Excel is a three-step process – at least, I only know how to do it in three steps.

I will walk through the example of a limit setting matrix.  In this example I want to set a limit for each customer based on a combination of the customer’s risk and income while also keeping product restrictions in mind.  I want this model to be flexible enough so that I can easily change the definition of the risk and income bands as well as the prices assigned to each segment of the matrix.

The first step is to create the desired matrix, choosing the axis labels and number of segments.  Within this matrix, each segment should be populated with the desired limit.  The labels of the matrix will remain fixed though the definition of each label can be changed as needed.  The limit in each segment can be hard-coded in – €5 000 for example – or can relate to a reference cell – product minimum plus a certain amount for example.

In this example I have decided to use a 12-segment matrix that will cater for 4 income bands (Low, Moderate, High and Very High) and 3 risk bands (Low, Moderate and High).  I’ve then populated the matrix with the limits we will use as a starting point for our discussions.  Managers will not want to know just how the proposed model impacts limits at a customer level, they will also want to see how it impacts limits at a portfolio level, so I have used COUNTIF and SUMIF to provide a summary of the limit distribution across the portfolio – all shown below:

 

The second step is to summarise the values of the two key variables into the respective bands; using VLOOKUPs as discussed above.  In this example we want to summarise the risk grades of customers into LOW RISK, MODERATE RISK and HIGH RISK and the income into similar LOW INCOME, MODERATE INCOME, HIGH INCOME and VERY HIGH INCOME.

As a starting point, I have decided to make the splits as shown in the tables below.  These tables were used to label each account in the dataset using two new columns I have created, also shown below:

 

 

 

Then each account can be matched to a matrix segment using VLOOKUPs and HLOOKUPs, embedded to create a matrix lookup function.  What we want to do is use the VLOOKUP functionality to find the right row corresponding to the risk of the customer and then move across the right number of columns to find the one corresponding to the income of the customer.  The first part of the equation is relatively simple to construct so long as we ignore the column number:

VLOOKUP( Risk Band, $A$3:$E$5, ?, FALSE )

Provided we’re a little creative with it, an HLOOKUP will allow us to fill in the missing part.  What we need to do is find a way to convert the ‘Salary Band’ field into a number representing the column.  You might have noticed that there was a row of numbers under each of the Income Bands in the matrix shown above.  This was done to allow an HLOOKUP to return that number so that it can be placed into the missing part of the VLOOKUP.  An HLOOKUP will search for the Salary Band and then return the number from the row directly below it, which in this case has been specifically set to equal its column number – remembering to add one to take into account the field used to house the name of the Risk Grade that is needed for the VLOOKUP.

HLOOKUP ( Salary Band, $B$1:$E$2, 2, FALSE )

In this case it will always be the second row so we can hardcode in the ‘2’.  This entire function is then substituted into the VLOOKUP to create a function that will look-up both the Risk Band and the Salary Band of any given customer and return the relative limit from the matrix.

VLOOKUP( Risk Band, $A$1:$E$5, HLOOKUP( Salary Band, $B$1:$E$2, 2, FALSE ), FALSE )
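The combined row-and-column lookup can be sketched in Python as a small two-dimensional table. The matrix values below are illustrative, not the ones from the screenshots:

```python
# Hypothetical 3 x 4 limit matrix: rows are risk bands, columns income bands
risk_bands = ["Low", "Moderate", "High"]
income_bands = ["Low", "Moderate", "High", "Very High"]
limits = [
    [10000, 20000, 35000, 50000],  # Low risk
    [5000, 10000, 20000, 35000],   # Moderate risk
    [1000, 3000, 5000, 10000],     # High risk
]

def matrix_limit(risk: str, income: str) -> int:
    # The row index plays the role of the VLOOKUP, the column index
    # the role of the embedded HLOOKUP
    return limits[risk_bands.index(risk)][income_bands.index(income)]

print(matrix_limit("Low", "Very High"))  # 50000
print(matrix_limit("High", "Low"))       # 1000
```

Changing a limit means editing one cell of the table, just as in the parameterised spreadsheet.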

All that is now needed is to add two further fields to take into account the potential VIP bonus limit and the model is complete – and the results are shown below:

 

This version of the model can be distributed or taken into a workshop and, as each component is adjusted so too are the individual limits granted as well as the tables summarising the distribution of limits across the portfolio.  For example, the marketing team may wish to increase the limits of Low Risk, Very High Income customers to 60,000 and, at the same time, the risk team may wish to re-categorise those with a risk score of 4 as ‘High Risk’ and increase the qualifying income for ‘High Income’ to 7,000.  The first change requires a simple change in the Limit Matrix while the second requires two simple changes to the references tables, giving the new matrix limit using the tables shown below.

 

*     *     *     *     *

It is also possible to show the distribution by matrix segment. The method is based on the same logic discussed up to now, although the implementation is a bit clumsy. 

The first step is to create a dummy matrix with the same labels but populated with a segment number rather than a limit.  Then you need to create a new field in the dataset called something like ‘Segment Number’ and to populate this field using the same equation from above.  Once this field has been populated you can create another dummy version of the matrix and, in this case, use the SUMIF or COUNTIF function to calculate the value of limits or the number of customers in each segment.  With that populated it is easy to turn those numbers into a percentage of the total either in the same step or using one final new matrix:
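The per-segment counting step is a straightforward tally. A sketch with a hypothetical list of accounts already labelled with their segment numbers:

```python
from collections import Counter

# Hypothetical 'Segment Number' field values for eight accounts
segment_numbers = [1, 4, 4, 7, 4, 12, 7, 1]

counts = Counter(segment_numbers)  # COUNTIF per segment
shares = {seg: n / len(segment_numbers) for seg, n in counts.items()}

print(counts[4], shares[4])  # 3 0.375
```

Replacing the count with a sum of limits per segment gives the SUMIF version of the same distribution.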

 






Read Full Post »

Probably the most common credit card business model is for customers to be charged a small annual fee in return for which they are able to make purchases using their card and to only pay for those purchases after some interest-free period – often up to 55 days.  At the end of this period, the customer can choose to pay the full amount outstanding (transactors) in which case no interest accrues or to pay down only a portion of the amount outstanding (revolvers) in which case interest charges do accrue.  Rather than charging its customers a usage fee, the card issuer earns a secondary revenue stream by charging merchants a small commission on all purchases made in their stores by the issuer’s customers.

So, although credit cards are similar to other unsecured lending products in many ways, there are enough important differences that are not catered for in the generic profit model for banks (described here and drawn here) to warrant an article specifically focusing on the credit card profit model. Note: in this article I will only look at the profit model from an issuer’s point of view, not from an acquirer’s.

* * * 

We started the banking profit model by saying that profit was equal to total revenue less bad debts, less capital holding costs and less fixed costs.  This remains largely true.  What changes is the way in which we arrive at the total revenue, the way in which we calculate the cost of interest and the addition of two new costs – loyalty programmes and fraud.  Although in reality there may also be some small changes to the calculation of bad debts and to fixed costs, for the sake of simplicity, I am going to assume that these are calculated in the same way as in the previous models.

 

Revenue

Unlike a traditional lender, a card issuer has the potential to earn revenue from two sources: interest from customers and commission from merchants.  The profit model must therefore be adjusted to cater for each of these revenue streams as well as annual fees. 

Total Revenue  = Fees + Interest Revenue + Commission Revenue

                                = Fees + (Revolving Balances x Interest Margin x Repayment Rate) + (Total Spend x Commission)

                                = (AF x CH) + (T x ATV) x ((RR x PR x i) + CR)

Where              AF = Annual Fee                                               CH = Number of Card Holders  

                           T = Number of Transactions                          PR = Repayment Rate

                           ATV = Average Transaction Value              i = Interest Rate

                           RR = Revolve Rate                                              CR = Commission Rate
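The revenue equation translates directly into a small function. A sketch using the symbols defined above; the input values are illustrative only:

```python
def total_revenue(af, ch, t, atv, rr, pr, i, cr):
    # Total Revenue = (AF x CH) + (T x ATV) x ((RR x PR x i) + CR)
    return (af * ch) + (t * atv) * ((rr * pr * i) + cr)

# Illustrative inputs: 100,000 cardholders at a 20 annual fee; 2m
# transactions averaging 50 each; 40% revolve rate, 90% repayment rate,
# 20% interest margin, 1.5% merchant commission
revenue = total_revenue(20, 100_000, 2_000_000, 50, 0.40, 0.90, 0.20, 0.015)
print(revenue)
```

With these made-up inputs, interest and commission each contribute a material share, which is the point of modelling the two streams separately.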

Customers usually fall into one of two groups and so revenue strategies tend to conform to these same splits.  Revolvers are usually the more profitable of the two groups as they can generate revenue in both streams.  However, as balances increase and approach the limit the capacity to continue spending decreases.  Transactors, on the other hand, seldom carry a balance on which an issuer can earn interest but they have more freedom to spend.

Strategies aimed at each group should be carefully considered.  Balance transfers – or campaigns which encourage large, once-off purchases – create revolving balances and sometimes a large, once-off commission while generating little on-going commission income.  Strategies that encourage frequent usage don’t usually lead to increased revolving balances but do have a more consistent – and often growing – long-term impact on commission revenue.

Variable Costs

There is also a significant difference between how card issuers and other lenders accrue variable costs. 

Firstly, unlike other loans, most credit cards have an interest-free period during which the card issuer must cover the cost of carrying the debt.

The high interest margin charged by card issuers aims to compensate them for this cost but it is important to model it separately as not all customers end up revolving and hence, not all customers pay that interest at a later stage.  In these cases, it is important for an issuer to understand whether the commission earnings alone are sufficient to compensate for these interest costs.

Secondly, most card issuers accrue costs for a customer loyalty programme.  It is common for card issuers to provide their customers with rewards for each Euro of spend they put on their cards.  The rate at which these rewards accrue varies by card issuer but is commonly related in some way to the commission that the issuer earns.  It is therefore possible to account for this by simply using a net commission rate.  However, since loyalty programmes are an important tool in many markets I prefer to keep it out as a specific profit lever.

Finally, credit card issuers also run the risk of incurring transactional fraud –  lost, stolen or counterfeited cards.  There are many cases in which the card issuer will need to carry the cost of fraudulent spend that has occurred on their cards.  This is not a cost common to other lenders, at least not after the application stage.

Variable Costs = (T x ATV) x ((CoC x IFP) + L + FR)

Where            T = Number of Transactions                         IFP = Interest Free Period Adjustment

                         ATV = Average Transaction Value             CoC = Cost of Capital

                         FR = Fraud Rate                                          L = Loyalty Programme Costs

Shorter interest free periods and cheaper loyalty programmes will result in lower costs but will also likely result in lower response rates to marketing efforts, lower card usage and higher attrition among existing customers.

 

The Credit Card Profit Model                   

Profit is simply what is left of revenue once all costs have been paid; in this case after variable costs, bad debt costs, capital holding costs and fixed costs have been paid.

I have decided to model revenue and variable costs as functions of total spend while modelling bad debt and capital costs as a function of total balances and total limits. 

The difference between the two arises from the interaction of the interest free period and the revolve rate over time.  When a customer first uses their card their spend increases and so does the commission earned and loyalty fees and interest costs accrued by the card issuer.  Once the interest free period ends and the payment falls due, some customers (transactors) will pay their full balance outstanding and thus have a zero balance while others will pay the minimum due (revolve) and thus create a balance equal to 100% less the minimum repayment percentage of that spend. 

Over time, total spend increases in both customer groups but balances only increase among the group of customers that are revolving.  It is these longer-term balances on which capital costs accrue and which are ultimately at risk of being written-off.  In reality, the interaction between spend and risk is not this ‘clean’ but this captures the essence of the situation.

Profit = Revenue – Variable Costs – Bad Debt – Capital Holding Costs – Fixed Costs

= (AF x CH) + (T x ATV) x ((RR x PR x i) + CR) – (T x ATV) x (L + (CoC x IFP) + FR) – (TL x U x BR) – ((TL x U x CoC) + (TL x (1 – U) x BHR x CoC)) – FC

= (AF x CH) + (T x ATV) x ((RR x PR x i) + CR – L – (CoC x IFP) – FR) – (TL x U x BR) – ((TL x U x CoC) + (TL x (1 – U) x BHR x CoC)) – FC

Where:

AF = Annual Fee                                   CH = Number of Card Holders

T = Number of Transactions                        i = Interest Rate

ATV = Average Transaction Value                   TL = Total Limits

RR = Revolve Rate                                 U = Average Utilisation

PR = Repayment Rate                               BR = Bad Rate

CR = Commission Rate                              CoC = Cost of Capital

L = Loyalty Programme Costs                       BHR = Basel Holding Rate

IFP = Interest Free Period Adjustment             FC = Fixed Costs

FR = Fraud Rate
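To make the formula concrete, here is a minimal transcription of it into a function.  It assumes the form above, with fraud and loyalty costs treated as per-unit-of-spend variable costs; all input values in the example are placeholders, not calibrated to any real portfolio:

```python
def card_profit(AF, CH, T, ATV, RR, PR, i, CR, L, CoC, IFP,
                TL, U, BR, BHR, FR, FC):
    """Profit = Revenue - Variable Costs - Bad Debt - Capital Costs - Fixed Costs."""
    revenue = (AF * CH) + (T * ATV) * ((RR * PR * i) + CR)
    variable_costs = (T * ATV) * (L + (CoC * IFP) + FR)
    bad_debt = TL * U * BR
    capital_costs = (TL * U * CoC) + (TL * (1 - U) * BHR * CoC)
    return revenue - variable_costs - bad_debt - capital_costs - FC

# Illustrative inputs only.
profit = card_profit(AF=10, CH=100, T=1000, ATV=50,
                     RR=0.5, PR=1.0, i=0.02, CR=0.02,
                     L=0.005, CoC=0.01, IFP=1, FR=0.001,
                     TL=200_000, U=0.3, BR=0.02, BHR=0.1, FC=500)
print(profit)  # -740.0
```

A negative result like this one simply reflects the placeholder inputs; the point of the function is to expose how each lever – revolve rate, utilisation, bad rate and so on – flows through to profit.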

 

Visualising the Credit Card Profit Model  

As with the banking profit model, it is possible to create a visual version of the credit card profit model.  This model communicates the links between key ratios and teams in a user-friendly manner, but at the cost of some accuracy.

The key marketing and originations ratios remain unchanged but the model starts to diverge from the banking one when spend and balances are considered in the account management and fraud management stages.   

The first new ratio is the ‘usage rate’, which is similar to a ‘utilisation rate’ except that it looks at monthly spend rather than at carried balances.  This is done to capture information on transactors, who may have a zero balance – and thus a zero utilisation – at each month end, but who may nonetheless have been constrained by their limit at some stage during the month.
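A toy example of the distinction, with invented figures: a transactor who cycles 9,000 of spend through a 10,000 limit but settles before month end shows zero utilisation yet a 90% usage rate:

```python
# Invented figures illustrating usage rate vs utilisation rate.
limit = 10_000
monthly_spend = 9_000    # spend cycled through the card during the month
closing_balance = 0      # paid in full before month end

utilisation_rate = closing_balance / limit  # 0.0 - the account looks inactive
usage_rate = monthly_spend / limit          # 0.9 - in fact, near the limit

print(utilisation_rate, usage_rate)
```

Looking at utilisation alone, this customer would appear to need no limit at all; the usage rate reveals that the limit is actively constraining their spend.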

The next new ratio is the ‘fraud rate’.  The structure and work of a fraud function is often similar in design to that of a debt management team, with analytical, strategic and operational roles.  I have simplified it here to a simple ratio of fraud losses to good spend, as this is the most important measure from a business point of view; however, if you are interested in more detail about the fraud function you can read this article or search in this category for others.

The third new ratio is the ‘commission rate’.  The commission rate earned by an issuer will vary by merchant type and, even within a merchant type, often on a case-by-case basis depending on the relative bargaining power of each merchant.  Different card brands will also attract different commission rates, usually reflecting their strategies: American Express and Diners Club, which aim to attract wealthier transactors, charge higher commission rates to compensate for their lower revolve rates, while Visa and MasterCard charge lower rates but appeal to a broader target market more likely to revolve.

The final new ratio is the revolve rate, which I have mentioned above.  This refers to the percentage of customers who pay less than their full balance – often only the minimum due – every month.  On these customers an issuer can earn both commission and interest, but must also carry higher risk.  The ideal revolve rate will vary by market and by the issuer’s business objectives: it should be higher when the issuer is aiming to build balances and lower when the issuer is looking to reduce risk.
