*Please enable cookies in your browser to watch the videos*
It’s the penultimate lesson of the series. In part six we’ll learn some of Excel’s most essential formulas – lookups and logical statements.
Now we’re getting serious.
Look up formulas
A common action in Excel is extracting an item from a table of data. This could be anything from finding out which album a song is on to working out the exchange rate for a particular currency on a particular day. Excel has a range of functions that you can use to achieve this including VLOOKUP(), HLOOKUP() and the more flexible, but slightly more complicated, combination of INDEX() and MATCH().
The lookup functions can seem simple, but if you don’t fully understand how they work it’s easy to return errors.
The V and H in the names of these two functions refer to Vertical and Horizontal respectively, so VLOOKUP() is for columns and HLOOKUP() for rows.
The VLOOKUP() function assumes that your data is arranged as a table with information in columns. Let’s see how a VLOOKUP works:
Even this simple example flags an issue of the lookup functions: they only return the first match. As we can see, there is another match further down that VLOOKUP() has ignored. If we need to return multiple results from a table then the lookup functions are unsuitable. You would either need to filter the table (Part 7) or use a PivotTable (part 3).
In our formula we have omitted the fourth argument. Although there are several exact matches for our value in our table, the function now returns a completely different album title – a Rolling Stones album that certainly doesn’t include Led Zeppelin’s Whole Lotta Love. In fact, if we changed our column index to 1, we would find that Excel had matched Whole Lotta Love with Honky Tonk Women.
This is because the approximate match doesn’t find the ‘closest’ value to our lookup value in the table. It’s much more specific: it finds the value that’s next largest, or equal to, our lookup value. In fact, it’s even more specific than that. If there is no exact match, it will find the first item in our table larger than the lookup value and match with the cell immediately above. For this reason, whenever you don’t specify an exact match by using FALSE as the fourth argument in VLOOKUP(), you must ensure that your table of data is sorted in ascending order, using the leftmost column.
Here, we have re-sorted our table and used two approximate matches, and two exact matches, all referring to columns 1 and 2. For two of our formulae we have entered an inexact track name (Whole lot of love). Where there is a match, we have positioned the VLOOKUP() statement next to the row it matches:
You can see that:
- Where there is no exact match, and exact is specified by entering FALSE as the fourth argument, the formula returns a #N/A error.
- Where there is no exact match and the fourth argument is omitted or TRUE is used, VLOOKUP() finds the first item ‘larger’ than our lookup value and uses the cell immediately above.
- Where there is an exact match and exact is specified, the first match from the top is found.
- Where there is an exact match and exact is not specified, the first match from the bottom is found.
In case you were wondering why Whole lotta love is ‘larger’ than Whole lot of love: they are the same up until the first t, then for the next character, the letter t has a higher value than a space.
You may be wondering what the point of this form of approximate match is. In fact, it can be very useful. Imagine we have a price list and we need to check the value of a particular product on a particular date:
Our price list just shows the date from which each new price came into use. Obviously, our invoice dates would not necessarily be just on those dates, so we would need to find the latest date (largest) that is smaller than or equal to our invoice date, which is what the approximate match will do as long as our price list table is sorted in ascending order of date.
These are definitely things you need to practice. Download the worksheet to give them a go:
VLOOKUP() and HLOOKUP() can be very useful, but sometimes you need to be able to lookup in two dimensions or to work with values sorted in descending, rather than ascending, order. The combination of the MATCH() and INDEX() functions can help you do so.
There are many similarities in the way that VLOOKUP() and HLOOKUP() work. However, with MATCH() and INDEX() there are some important differences. VLOOKUP() and HLOOKUP() return the value in a cell directly; MATCH() uses a similar approach, but returns the position of the cell in the list of cells.
As we will see, this is why MATCH() is often used with INDEX(). Also, the lookups can use a whole table of data and allow you to specify which column or row to use, MATCH() just works with a simple list – cells in a single column or in a single row, for example:
Match() and Index() together
Usually, in order to make use of MATCH() we need to use the value to retrieve the contents of the cell. This is why we need to combine MATCH() with INDEX(). INDEX() has two forms – we will look at the ‘array’ form here. In this context, array just means a block of cells rather than requiring the use of Control+Shift+Enter to create an actual array formula. At its simplest, INDEX() and MATCH() can be used to replace VLOOKUP() or HLOOKUP().
INDEX() takes three arguments: the block of cells that contains our table of values, a row number and an optional column number that define which cell within our block to return the value from.
This takes the value that our MATCH() function found and uses it as the row number, with the column number specifying that we should return the value from the second column of our table of data. We could have included the MATCH() function within the INDEX() function to do it all in one cell:
Using INDEX MATCH flexibly is not simple, but if you can master this then there won’t be much that can stand in your way. Practice using this file:
A logical formula tests whether a condition is true or false. These formulae can be used to answer questions such as:
- Are this month’s sales figures higher or lower than last month’s?
- Does the sum of country populations in Asia equal the total Asian population in the statistical data I have been given?
- Has this product both been earmarked for discontinuation and sold less than 1,000 units last year?
True, false and logical functions
Excel can test whether or not values are equal or not. Formulas that include one of the operators <, >, or = will either be true or false. For example, if you type:
=4<6 into a cell, the value returned will be TRUE. Conversely, if you type:
=4>6. It will be FALSE.
These comparisons can be applied to real data where cell references are used instead of numbers.
Here is the full list of comparison operators that will either return TRUE or FALSE:
= equal to
> greater than
< less than
>= greater than or equal to
<= less than or equal to
<> not equal to
A useful application of the ‘equal to’ (=) comparator tests equality from one row to the next to see if a cell is repeated in an ordered table. The example below sorted by film release year flags those years that appear in the list twice by comparing vertically consecutive cells in column D using the formula:
When the year differs in one cell compared to the cell below it, FALSE displays; when the year is the same, it’s TRUE (highlighted below):
Such a technique can be used to spot duplicates for removal from data, or as a way to flag changes in category in big data sets.
AND, OR, and the comparison operators are limited to outputs of TRUE or FALSE. The IF function expands on this by allowing us to specify the values a cell will take depending on whether a condition is true or false:
Whereas IF() is used when the outcome includes only two possibilities—one value appearing if the condition tested is true, the other if false—nested IFs are used to choose between more than two results. Continuing with the film table example, suppose you want to label all films released prior to 1999 as ‘Pre-99’; those released in 1999 as ‘99’; and those that came after as ‘Post-99’. Then the formula in E2 would look like this:
A word of caution: nesting to more than three levels can be confusing, and there might be a better way to achieve what you want by possibly using a LOOKUP function instead (covered earlier in this lesson).
Practice all this here [insert ex from https://learn.excelwithbusiness.com/learner/#/module/4M9tx6WkXb]
Had enough of formulas yet? I didn’t think so. We have one more to show you that combines some of the maths skills we learnt the other day with a logical function. It’s called a SUMIF
Sometimes you might want to sum only cells meeting certain criteria. For example if you had a column of years and a column of corresponding data, you may wish to only sum the data from a certain range of years. This could be done using the SUM function and manually choosing the cells to sum, but with large amounts of data this would take a long time, so SUMIF allows a more efficient way of summing only certain cells.
The function SUMIF will total only certain cells based on specific criteria. Its syntax is:
SUMIF(range, criteria, [sum_range])
Range is the range of cells that will be evaluated by the criteria; sum_range refers to the corresponding cells that are added if the criteria are matched. Sum_range is optional because if left out, the range is both evaluated and added.
AAT students and professional members can access a wide range of Excel resources and training.
Filtered provides algorithmically personalised Excel courses written by subject matter expert.