Data Matching

Data matching identifies records, within a dataset or across datasets, that refer to the same entity. It is the foundation of deduplication, identity resolution, and cross-system stitching.

What is data matching?

Data matching is the algorithmic process of deciding whether two records describe the same person, account, or thing. Exact matching is easy (same email = same person); fuzzy and probabilistic matching handle the messy real-world cases where the same company shows up as "Acme Inc.," "Acme Incorporated," and "ACME."
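As a rough illustration of the company-name case, the sketch below reduces each variant to a comparison key and then checks whether the keys agree. The suffix list and the `normalize_company` helper are assumptions made for this example, not a standard or a TexAu feature.

```python
import re

# Hypothetical list of legal suffixes to ignore; extend it for real data.
LEGAL_SUFFIXES = {"inc", "incorporated", "llc", "ltd", "corp", "corporation", "co"}

def normalize_company(name: str) -> str:
    """Lowercase, replace punctuation with spaces, and drop trailing legal suffixes."""
    tokens = re.sub(r"[^\w\s]", " ", name.lower()).split()
    # Keep at least one token so a name like "Co" does not collapse to nothing.
    while len(tokens) > 1 and tokens[-1] in LEGAL_SUFFIXES:
        tokens.pop()
    return " ".join(tokens)

variants = ["Acme Inc.", "Acme Incorporated", "ACME"]

# Raw strings are all different, so plain exact matching finds no match.
raw_keys = set(variants)

# After normalization the three variants share a single key.
normalized_keys = {normalize_company(v) for v in variants}
```

Here `raw_keys` holds three distinct strings while `normalized_keys` holds one, which is why the three records can be treated as the same company once they are normalized.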

Why it matters

  • Underpins deduplication — you can't merge records you didn't match
  • Required for identity resolution across systems with different keys
  • Drives account-based reporting where a single account spans many touchpoints

Matching strategies

  • Exact: fastest, highest precision, lowest recall
  • Normalized exact: lowercase and strip punctuation, then compare exactly; catches most easy variants
  • Fuzzy: edit distance for typos and spelling variants, phonetic encoding (Soundex, Metaphone) for names
  • Probabilistic: combine many weak signals into a single confidence score
  • Embedding-based: vector similarity for semantic matching
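The fuzzy and probabilistic strategies can be sketched together: score each field separately, then combine the scores into one confidence value. This is a minimal sketch under stated assumptions. The weights, the threshold, and the record fields are hypothetical, `difflib.SequenceMatcher` stands in for a true edit-distance measure, and a simple weighted sum stands in for a full probabilistic model such as Fellegi-Sunter.

```python
from difflib import SequenceMatcher

# Hypothetical weights and threshold; real systems tune these on labeled pairs.
WEIGHTS = {"email": 0.5, "domain": 0.2, "name": 0.3}
MATCH_THRESHOLD = 0.4

def similarity(a: str, b: str) -> float:
    """Fuzzy string similarity in [0, 1]; 0.0 when either value is missing."""
    if not a or not b:
        return 0.0
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def same(a, b) -> float:
    """1.0 when both values are present and equal ignoring case, else 0.0."""
    return 1.0 if a and b and a.lower() == b.lower() else 0.0

def match_score(r1: dict, r2: dict) -> float:
    """Weighted sum of per-field signals, used as a confidence score."""
    return (
        WEIGHTS["email"] * same(r1.get("email"), r2.get("email"))
        + WEIGHTS["domain"] * same(r1.get("domain"), r2.get("domain"))
        + WEIGHTS["name"] * similarity(r1.get("name"), r2.get("name"))
    )

a = {"name": "Jon Smith", "email": "jon@acme.com", "domain": "acme.com"}
b = {"name": "John Smith", "email": None, "domain": "acme.com"}
c = {"name": "Maria Lopez", "email": "maria@globex.com", "domain": "globex.com"}

likely_same = match_score(a, b) >= MATCH_THRESHOLD       # shared domain plus a close name
likely_different = match_score(a, c) < MATCH_THRESHOLD   # no shared signal
```

With these weights, name similarity alone can never reach the threshold, so two records need at least one shared identifier (email or domain) before they count as a match. That is a design choice for this sketch, not a general rule.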

How TexAu helps

Built-in matching rules let you stitch records inside a TexAu table — match contacts to accounts by domain, dedupe accounts by normalized name, link inbound leads to existing CRM records before they ever get duplicated.
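To make the contact-to-account rule concrete, here is the idea in plain Python. This is not TexAu's API or configuration format; the record shapes, field names, and the `email_domain` helper are all assumptions made for illustration.

```python
def email_domain(email: str) -> str:
    """Return the lowercased part of an email address after the last '@'."""
    return email.rsplit("@", 1)[-1].strip().lower()

# Hypothetical account and contact records.
accounts = [
    {"id": "acc_1", "domain": "acme.com"},
    {"id": "acc_2", "domain": "globex.com"},
]
contacts = [
    {"email": "Jon@Acme.com"},
    {"email": "maria@globex.com"},
    {"email": "sam@unknown.io"},
]

# Index accounts by domain once so each contact lookup is a dictionary access.
accounts_by_domain = {acc["domain"].lower(): acc["id"] for acc in accounts}

# Attach the matching account id to each contact, or None when nothing matches.
linked = [
    {**contact, "account_id": accounts_by_domain.get(email_domain(contact["email"]))}
    for contact in contacts
]
```

Contacts whose domain has no matching account keep `account_id` as `None`, so they can be reviewed or routed to account creation instead of being linked to the wrong account.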
