## Creating and Using Matrices in Excel

There are certainly analytical tools on the market that are more sophisticated than Excel, and there are certainly situations where they are needed to deliver enhanced accuracy or advanced features.  However, this article concentrates on building models to aid the decision-making of a business leader rather than a specialist statistician, so the need is for a model that is flexible and easy to use.  Since Excel is so widely available and understood, it is usually the best tool for this purpose.

In this article I will assume a basic understanding of Excel and its built-in mathematical functions.  Rather than cover those basics, I’ll discuss how some of the more advanced functions can be used to build decision-aiding models and, in particular, how to create flexible matrices.

Spreadsheets facilitate flexibility by allowing calculations to be parameterised so that a model can be built with the logic fixed but the input values flexible.  For example, the management of a bank may agree that the size of a credit limit granted to a customer should be based on that customer’s risk and income and that VIP customers should be entitled to an extra limit extension, though they may disagree over one or more of the inputs.  The limit-setting logic can be programmed into Excel as an equation that remains constant while the definition of what constitutes each risk group, each income band, the size of each limit and the size of the VIP bonus extension can each be changed at will.

When building a model to assist with business decision-making, the key is to make sure that each profit lever is represented by a parameter that can be quickly and easily altered by decision-makers without altering the logical and established links between each of those profit levers.  Making use of Excel’s advanced functions and some simple logic, it is possible to do this in almost all situations without the resulting model becoming too complex for practical business use.

*     *     *     *     *

If I were to guess, I would say that over 80% of the functionality needed to build a flexible decision-making model can be created using Excel’s basic mathematical functions together with ‘IF clauses’ and ‘LOOKUPs’.

If Clauses

IF clauses, once understood, have a multitude of uses.  When building a model to aid decision-making they are usually one of the most important tools at an analyst’s disposal.  Simply put, an IF clause provides a binary command: if a given event happens do this, if not do that.  If a customer number has been labelled as VIP, add €5 000 extra to the proposed limit, if not do not add anything, etc.

IF( CustStatus = "VIP", 5000, 0 )

Using this simple logic, it is possible to replicate a decision tree connecting a large number of decisions to create a single strategy.  IF clauses are very useful for categorising data, for identifying or selecting specific events, etc.

There are two important variations of the basic IF clause: SUMIF and COUNTIF.  These two functions allow you to determine how often, or to what degree, a certain event has occurred.  Both functions have the same underlying logic, though the COUNTIF function is simpler.  For example, SUMIF can answer the question of what the total balance is across all VIP accounts, while COUNTIF simply answers how many VIP accounts there are.

SUMIF( Sheet1!\$A\$1:\$A\$200, "VIP", Sheet1!\$B\$1:\$B\$200 ) or

COUNTIF( Sheet1!\$A\$1:\$A\$200, "VIP" )
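For readers who want to check this logic outside Excel, the three functions above can be mirrored in a few lines of Python.  This is only a sketch: the sample accounts and the function names `vip_bonus`, `sum_if` and `count_if` are made up for illustration, and the comparison is case-sensitive whereas Excel compares text without regard to case.

```python
# Hypothetical sample data: (status, balance) pairs standing in for
# columns A and B of Sheet1.
accounts = [
    ("VIP", 12000),
    ("Standard", 3500),
    ("VIP", 8000),
    ("Standard", 1200),
]


def vip_bonus(status):
    """Mirror IF( CustStatus = "VIP", 5000, 0 )."""
    return 5000 if status == "VIP" else 0


def sum_if(rows, criterion):
    """Mirror SUMIF: add up the balances on rows whose status matches."""
    return sum(balance for status, balance in rows if status == criterion)


def count_if(rows, criterion):
    """Mirror COUNTIF: count the rows whose status matches."""
    return sum(1 for status, _ in rows if status == criterion)


vip_total = sum_if(accounts, "VIP")    # total balance on VIP accounts
vip_count = count_if(accounts, "VIP")  # number of VIP accounts
```

Both helper functions walk the same list and apply the same test; the only difference is whether a balance or a 1 is added for each match, which is why COUNTIF is the simpler of the two.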

Lookups

Lookups, on the other hand, are used to retrieve related data, replicating some of the basic functionality of a database.

A ‘lookup’ will take one value and retrieve a corresponding value from a specific table.  This is perhaps easier to understand through an example.  Assume there is a list showing which branch each of a bank’s customers belongs to.  Given a random selection of customer numbers, a lookup would take each of those customer numbers, search the larger list until it found the matching number and then retrieve the branch name listed next to that customer number in the table.

Ending a lookup statement with ‘FALSE’ means that only exact matches are permitted.  Had I ended the function with ‘TRUE’, it would have looked for the closest match from within the list, that is, the largest value that is less than or equal to the given value, and returned the value corresponding to that; for this to work the list must be sorted in ascending order.  This is not particularly useful in an example like this one but it is a useful way to group values into logical batches, among other things.  For example, if I had a list of salaries and wanted to summarise them into salary bands, I could create a table with the lowest and highest value in each band, sorted by the lowest value, and then use a lookup ending with TRUE to find the band into which each unique salary falls.
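The difference between the two match types can be sketched in Python.  The salary thresholds below are invented purely for illustration, and `exact_lookup` and `approximate_lookup` are hypothetical helper names; the approximate version assumes Excel’s behaviour of picking the largest table value that does not exceed the lookup value, which is why the table must be sorted in ascending order.

```python
from bisect import bisect_right

# Hypothetical band table: (lowest salary in the band, band label),
# sorted in ascending order of the lowest salary.
salary_bands = [
    (0, "Low"),
    (2000, "Moderate"),
    (5000, "High"),
    (10000, "Very High"),
]


def exact_lookup(key, table):
    """Like a lookup ending in FALSE: only an exact match is returned."""
    for table_key, value in table:
        if table_key == key:
            return value
    return None  # Excel would show #N/A here


def approximate_lookup(value, table):
    """Like a lookup ending in TRUE: use the largest key <= value."""
    keys = [table_key for table_key, _ in table]
    position = bisect_right(keys, value)
    if position == 0:
        return None  # value is below the first band; Excel would show #N/A
    return table[position - 1][1]
```

With this table, a salary of 4 999 falls into the band that starts at 2 000, while a salary of exactly 5 000 moves up to the next band.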

There are actually two types of lookups in Excel, vertical lookups and horizontal lookups.  The former looks down a list until it finds the matching number (and then moves across to find the pertinent field) while the latter looks across a list (and then moves down to find the pertinent field); other than that the logic remains the same.

In the above example, the lookup will take a given customer number, search for it within a table on the sheet and then, once it has been found, return the value in the second column from the left of that table.  If it had instead been an HLOOKUP function, the value returned would have been the one in the second row from the top.

Embedded Functions

The real value of IF clauses and LOOKUPs comes when they are combined, either with each other or with other Excel functions.  For example, if the account is labelled "VIP" then look for the associated relationship manager in a list of all the relationship managers; if not, then look for the associated branch name.  In both cases the customer number is used to do the matching.

IF( CustStatus = "VIP", VLOOKUP( CustNum, RelMans!\$A\$1:\$B\$20, 2, FALSE ), VLOOKUP( CustNum, Branches!\$A\$1:\$B\$50, 2, FALSE ) )

In these cases, the results of the embedded function are used by the main function to deliver a result.
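The same embedding can be sketched in Python, where the inner lookups become dictionary reads and the outer IF becomes an ordinary branch.  The customer numbers, manager names and branch names here are invented for illustration, and `contact_for` is a hypothetical helper name.

```python
# Hypothetical lookup tables keyed by customer number, standing in for
# the RelMans and Branches sheets.
relationship_managers = {1001: "A. Manager", 1003: "B. Manager"}
branches = {1001: "Central", 1002: "North", 1003: "South"}


def contact_for(cust_num, cust_status):
    """Return the relationship manager for VIPs, the branch otherwise."""
    if cust_status == "VIP":
        # Inner lookup 1: exact match on the customer number.
        return relationship_managers.get(cust_num)
    # Inner lookup 2: exact match on the customer number.
    return branches.get(cust_num)
```

A missing customer number returns `None` in this sketch, where the Excel formula would show #N/A.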

Matrices

In most cases, however, businesses need to make decisions on more factors than can be represented simply by lists.  In our example, credit limits cannot be set with reference to risk alone; income, as a proxy for spend, also needs to be borne in mind.  When building a business model, a useful tool then is a two-dimensional matrix from which results can be retrieved using embedded VLOOKUPs and HLOOKUPs.  Creating matrices in Excel is a three-step process, or at least I only know how to do it using three steps.

I will walk through the example of a limit-setting matrix.  In this example I want to set a limit for each customer based on a combination of the customer’s risk and income while also keeping product restrictions in mind.  I want this model to be flexible enough that I can easily change the definition of the risk and income bands as well as the limits assigned to each segment of the matrix.

The first step is to create the desired matrix, choosing the axis labels and number of segments.  Within this matrix, each segment should be populated with the desired limit.  The labels of the matrix will remain fixed though the definition of each label can be changed as needed.  The limit in each segment can be hard-coded in – €5 000 for example – or can relate to a reference cell – product minimum plus a certain amount for example.

In this example I have decided to use a 12-segment matrix that will cater for 4 income bands (Low, Moderate, High and Very High) and 3 risk bands (Low, Moderate and High).  I’ve then populated the matrix with the limits we will use as a starting point for our discussions.  Managers will not want to know just how the proposed model impacts limits at a customer level; they will also want to see how it impacts limits at a portfolio level, so I have used COUNTIF and SUMIF to provide a summary of the limit distribution across the portfolio, all shown below.

The second step is to summarise the values of the two key variables into the respective bands, using VLOOKUPs as discussed above.  In this example we want to summarise the risk grades of customers into LOW RISK, MODERATE RISK and HIGH RISK and the income into the similar LOW INCOME, MODERATE INCOME, HIGH INCOME and VERY HIGH INCOME.

As a starting point, I have decided to make the splits as shown in the tables below.  These tables were used to label each account in the dataset using two new columns I have created, also shown below:

The third step is to match each account to a matrix segment using VLOOKUPs and HLOOKUPs, embedded to create a matrix lookup function.  What we want to do is to use the VLOOKUP functionality to find the row corresponding to the risk of the customer and then to move across the right number of columns to find the column corresponding to the income of the customer.  The first part of the equation is relatively simple to construct so long as we ignore the column number:

VLOOKUP( Risk Band, \$A\$3:\$E\$5, ?, False )

Provided we’re a little creative with it, an HLOOKUP will allow us to fill in the missing part.  What we need to do is find a way to convert the ‘Salary Band’ field into a number representing the column.  You might have noticed that there was a row of numbers under each of the Income Bands in the matrix shown above.  This was done to allow an HLOOKUP to return that number so that it can be placed into the missing part of the VLOOKUP.  An HLOOKUP will search for the Salary Band and then return the number from the row directly below it, which in this case has been specifically set to be equal to its column number, remembering to add one to take into account the column that houses the name of the Risk Grade needed for the VLOOKUP.

HLOOKUP ( Salary Band, \$B\$1:\$E\$2, 2, FALSE )

In this case it will always be the second row so we can hardcode in the ‘2’.  This entire function is then substituted into the VLOOKUP to create a function that will look up both the Risk Band and the Salary Band of any given customer and return the relevant limit from the matrix.

VLOOKUP( Risk Band, \$A\$1:\$E\$5, HLOOKUP ( Salary Band, \$B\$1:\$E\$2, 2, FALSE ) , False )
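To make the two-stage lookup easier to follow, here is a sketch of the same idea in Python.  The limits in the matrix are invented for illustration, and `column_numbers`, `limit_rows` and `matrix_lookup` are hypothetical names; the layout assumes, as in the sheet described above, that the first column holds the risk label and that the helper row numbers the income columns from 2 upwards.

```python
income_bands = ["Low", "Moderate", "High", "Very High"]

# The helper row under the income labels: each band maps to its column
# number, starting at 2 because column 1 holds the risk label.
column_numbers = {band: index + 2 for index, band in enumerate(income_bands)}

# The matrix rows as the VLOOKUP sees them: risk label first, then one
# (hypothetical) limit per income band.
limit_rows = [
    ["Low", 10000, 20000, 35000, 50000],
    ["Moderate", 5000, 10000, 20000, 30000],
    ["High", 0, 2500, 5000, 10000],
]


def matrix_lookup(risk_band, salary_band):
    """Find the limit for a given risk band and salary band."""
    # HLOOKUP step: turn the salary band into a column number.
    # An unknown band raises KeyError, much as Excel would show #N/A.
    column = column_numbers[salary_band]
    # VLOOKUP step: find the row for the risk band, then move across.
    for row in limit_rows:
        if row[0] == risk_band:
            return row[column - 1]  # Excel columns count from 1
    return None  # risk band not found
```

Because the limits live only in `limit_rows` and the band labels only in the two lookup structures, either can be changed without touching the lookup logic, which is the flexibility the matrix is meant to provide.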

All that is now needed is to add two further fields to take into account the potential VIP bonus limit and the model is complete – and the results are shown below:

This version of the model can be distributed or taken into a workshop and, as each component is adjusted, so too are the individual limits granted as well as the tables summarising the distribution of limits across the portfolio.  For example, the marketing team may wish to increase the limits of Low Risk, Very High Income customers to 60,000 and, at the same time, the risk team may wish to re-categorise those with a risk score of 4 as ‘High Risk’ and increase the qualifying income for ‘High Income’ to 7,000.  The first change requires a simple change in the Limit Matrix while the second requires two simple changes to the reference tables, giving the new matrix limit using the tables shown below.

*     *     *     *     *

It is also possible to show the distribution by matrix segment. The method is based on the same logic discussed up to now, although the implementation is a bit clumsy.

The first step is to create a dummy matrix with the same labels but populated with a segment number rather than a limit.  Then you need to create a new field in the dataset called something like ‘Segment Number’ and to populate this field using the same equation from above.  Once this field has been populated you can create another dummy version of the matrix and, in this case, use the SUMIF or COUNTIF function to calculate the value of limits or the number of customers in each segment.  With that populated it is easy to turn those numbers into a percentage of the total, either in the same step or using one final new matrix.
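The segment-distribution steps can also be sketched in Python.  The four customers and their limits are invented for illustration, and the numbering of segments from 1 to 12, row by row, is my own assumption rather than something the method requires.

```python
from collections import Counter

risk_bands = ["Low", "Moderate", "High"]
income_bands = ["Low", "Moderate", "High", "Very High"]

# Dummy matrix: each (risk, income) pair gets a segment number, 1 to 12.
segment_numbers = {}
for risk_index, risk in enumerate(risk_bands):
    for income_index, income in enumerate(income_bands):
        number = risk_index * len(income_bands) + income_index + 1
        segment_numbers[(risk, income)] = number

# Hypothetical dataset: (risk band, income band, limit granted).
customers = [
    ("Low", "High", 35000),
    ("Low", "High", 35000),
    ("High", "Low", 0),
    ("Moderate", "Moderate", 10000),
]

counts = Counter()        # COUNTIF per segment
limit_totals = Counter()  # SUMIF per segment
for risk, income, limit in customers:
    segment = segment_numbers[(risk, income)]  # the 'Segment Number' field
    counts[segment] += 1
    limit_totals[segment] += limit

# Final matrix: share of customers in each populated segment.
total_customers = len(customers)
shares = {segment: n / total_customers for segment, n in counts.items()}
```

As in the spreadsheet, the segment number is just a convenient single key on which the counting and summing can be done.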