How to make Ugly data Sexy again

Rennovating your data to find Impossible Matches

By Jeff Robson

cover-copy

It’s a common problem: you have data which is an obvious match to a human, but not to Excel.

While you can do some approximate matching using VLOOKUP or INDEX/MATCH (see related articles below), you often end up doing all kinds of Excel gynmastics to get the data into a format that Excel can actually make use of.

Surely there must be an easier way to make ugly data sexy again?

Yes!  You need to get Fuzzy!

Fuzzy Lookup Add-in

Microsoft have provided this free Excel addin to help with precisely this situation.

You can give it your ugly data and tell it to find a match in some clean data … and it will!

Even if the words don’t match, even if the spelling is wrong, and even if the words are in the wrong order … it’ll attempt to find a match for what you’re looking for then tell you how confident it is in its results.

Pretty clever stuff!

Example

Imagine that you’re very rich, but your spelling and typing is atrocious!  You want to find out what your share portfolio is worth, but you can’t be bothered getting all the stock codes.  All you have is your badly-typed list of shares:

source_table

Now, you also have another list which you’ve somehow managed to download from the stock exchange and this contains all the correctly-spelled company names, with stock codes and the latest prices.

master_table

How can you match these up?  With just Excel, you wouldn’t be able to join these two tables together since nothing matches closely enough … there are just too many differences.

Here’s where the magic of the Fuzzy Lookup add-in comes.

You just run the add-in, choose the columns you think match.

fuzzy_lookup01

… then choose which columns you want in your output and how fuzzy your lookup should be.

fuzzy_lookup02

… and voila! You get a table that combines both and shows you the data you need to calculate your immense wealth!

fuzzy_lookup03

The “Similarity” column tells you how confident Excel is in the match produced.

So now, you don’t have to be stuck with ugly, messy data any more.

You can turn it into beautiful, sexy results that are actually useful!