Matching/Merging
Matching and merging is the process of linking and
combining records from different datasets. For example, a
company may keep its business customer records organized by account
number and want to merge them with secondary
information from a general list of businesses. The
general business list is typically organized by business name,
with standardized name and address formats. The customer records,
although organized by account number, also contain
business name, address, and location information. In the matching
and merging process, the two lists are joined on these shared identifiers.
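As a rough illustration of the join described above, the following Python sketch merges a customer list with a general business list on business name. The records, field names, and the normalize helper are all hypothetical, and the normalization rule is only one plausible way to bring customer names closer to a standardized format.

```python
# Hypothetical records; field names and values are illustrative only.
customers = [
    {"account": "A-1001", "name": "Acme Supply Co.", "city": "Philadelphia"},
    {"account": "A-1002", "name": "Birch & Sons", "city": "Camden"},
]
business_list = [
    {"name": "ACME SUPPLY CO", "employees": 40},
    {"name": "BIRCH & SONS", "employees": 12},
]

def normalize(name):
    """Uppercase, drop punctuation other than '&', and collapse whitespace."""
    kept = "".join(ch for ch in name.upper() if ch.isalnum() or ch in " &")
    return " ".join(kept.split())

def exact_merge(customers, business_list, key=lambda r: r["name"]):
    """Join the two lists where the key is identical in both."""
    index = {key(b): b for b in business_list}
    merged = []
    for c in customers:
        match = index.get(key(c))
        if match is not None:
            # Customer fields take precedence when both records share a field.
            merged.append({**match, **c})
    return merged

# Joining on the raw name versus the normalized name.
raw = exact_merge(customers, business_list)
clean = exact_merge(customers, business_list,
                    key=lambda r: normalize(r["name"]))
```

In this toy data the raw names differ in case and punctuation, so only the normalized join links the records.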
Matching typically employs one of two approaches. In a simple
exact match, records in two different datasets are deemed
to match if the linking or “decision” variable is identical.
Simple exact matching is trivial to implement but often undercounts
the true links between datasets because name and address
identifiers are collected in many different formats. Probabilistic
matching is used to identify and link records from one
dataset to another on the basis of a calculated statistical
probability. Probabilistic matching relaxes the rigidity
of exact matching, replacing the exact match with a fuzzy
probabilistic match whose statistical properties are controlled
by the matching analyst.
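One common way to compute such a probability-based score is a Fellegi-Sunter style agreement weight, sketched below in Python. This is not necessarily the method National Analysts uses. The m and u probabilities, the 0.85 similarity cutoff, and the sample records are assumptions chosen purely for illustration.

```python
import math
from difflib import SequenceMatcher

# Assumed per-field probabilities (illustrative, not estimated from data):
#   m = P(field agrees | records are a true match)
#   u = P(field agrees | records are not a match)
FIELDS = {
    "name": (0.95, 0.01),
    "address": (0.90, 0.05),
    "zip": (0.98, 0.10),
}

def agrees(a, b, threshold=0.85):
    """Treat two field values as agreeing if they are similar enough."""
    ratio = SequenceMatcher(None, a.upper(), b.upper()).ratio()
    return ratio >= threshold

def match_weight(rec_a, rec_b):
    """Sum log2 likelihood ratios over fields; higher means more likely a match."""
    total = 0.0
    for field, (m, u) in FIELDS.items():
        if agrees(rec_a[field], rec_b[field]):
            total += math.log2(m / u)
        else:
            total += math.log2((1 - m) / (1 - u))
    return total

# Hypothetical records: a and b describe the same business in different formats.
a = {"name": "Acme Supply Co", "address": "12 Market St", "zip": "19106"}
b = {"name": "ACME SUPPLY CO.", "address": "12 Market Street", "zip": "19106"}
c = {"name": "Birch & Sons", "address": "400 Pine Ave", "zip": "08102"}

same_business = match_weight(a, b)
different_business = match_weight(a, c)
```

Adjusting the m and u values or the similarity cutoff is how an analyst would control the statistical behavior of a match under this scheme.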
In National Analysts’ experience, all probabilistic matching
requires that questionable cases be reviewed manually to increase
match rates and reduce errors. National Analysts has experience
matching very large datasets (12 million+ records) and fine-tuning
matching algorithms to obtain a very high proportion of matches
and very few errors.
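The manual review step can be pictured as a three-way split of scored record pairs, as in the Python sketch below. It assumes each candidate pair already has a numeric match score. The threshold values, pair identifiers, and function names are hypothetical.

```python
def classify(score, lower=0.0, upper=8.0):
    """Assign a scored pair to one of three outcomes (thresholds are assumed)."""
    if score >= upper:
        return "match"
    if score <= lower:
        return "non-match"
    return "review"  # questionable case: route to a person

def triage(scored_pairs, lower=0.0, upper=8.0):
    """Group pair identifiers by outcome."""
    buckets = {"match": [], "review": [], "non-match": []}
    for pair_id, score in scored_pairs:
        buckets[classify(score, lower, upper)].append(pair_id)
    return buckets

# Hypothetical scored pairs (customer account / business list id).
scored = [
    ("A-1001/B-17", 12.4),
    ("A-1002/B-03", 3.1),
    ("A-1003/B-88", -6.5),
]
queues = triage(scored)
```

Moving the two thresholds closer together shrinks the review queue at the cost of more automatic decisions, so tuning them is one way to trade reviewer effort against error rates.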
For more information on National Analysts, feel free to e-mail
us today or call (215) 496-6800.