Data, Databases, and Dashboards — Lesson 3

Data Quality and Source of Truth


Learning Objectives

  • 1. Define source of truth and explain why every data type needs one.
  • 2. Identify common data quality problems and their business impact.
  • 3. Establish practical rules for maintaining data integrity.

What source of truth means

A source of truth is the single authoritative location for a specific type of information. If customer contact information lives in the CRM, the email marketing tool, and a spreadsheet, and all three disagree, there is no source of truth. Someone must decide which system is authoritative and ensure other systems get their data from that source.

Without a defined source of truth, teams waste time reconciling conflicting information, reports become unreliable, and decisions based on bad data lead to bad outcomes. The cost of poor data quality is not just technical — it shows up as missed follow-ups, embarrassing duplicate outreach, and strategic decisions based on inaccurate metrics.

Common data quality problems

Duplicate records are the most common data quality problem. The same customer appears multiple times with slightly different names, email addresses, or phone numbers. Duplicates corrupt reports, waste marketing spend, and create confusing customer experiences.
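One way to surface likely duplicates is to group records by a normalized key. The sketch below assumes each record has an email field and that lowercasing and trimming the address is enough to match variants; the field names and sample records are hypothetical, and real deduplication usually also needs fuzzy matching on names and phone numbers.

```python
from collections import defaultdict


def normalize_email(email: str) -> str:
    """Lowercase and trim an email so trivial variations compare equal."""
    return email.strip().lower()


def find_duplicates(records: list[dict]) -> dict[str, list[dict]]:
    """Group records by normalized email and keep only groups with 2+ records."""
    groups = defaultdict(list)
    for record in records:
        groups[normalize_email(record["email"])].append(record)
    return {key: group for key, group in groups.items() if len(group) > 1}


# Hypothetical sample data, for illustration only.
customers = [
    {"id": 1, "name": "Dana Lee", "email": "dana.lee@example.com"},
    {"id": 2, "name": "D. Lee", "email": " Dana.Lee@Example.com "},
    {"id": 3, "name": "Sam Ortiz", "email": "sam@example.com"},
]

duplicates = find_duplicates(customers)
for email, group in duplicates.items():
    print(email, [r["id"] for r in group])
```

Records 1 and 2 land in the same group because their emails differ only in case and surrounding whitespace. A person should still review each group before merging.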

Incomplete records — required information that was never filled in — make data unreliable for analysis and automation. If half your customer records are missing industry information, you cannot segment by industry. If lead source is rarely captured, you cannot measure which channels produce results.
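A quick way to see how incomplete a dataset is: measure the share of records that have a value for each field. This is a minimal sketch; the field names and sample leads are hypothetical, and it treats None, empty strings, and whitespace-only strings as missing.

```python
def completeness(records: list[dict], fields: list[str]) -> dict[str, float]:
    """Return the share of records with a non-empty value for each field."""
    total = len(records)
    report = {}
    for field in fields:
        filled = sum(1 for r in records if str(r.get(field) or "").strip())
        report[field] = filled / total if total else 0.0
    return report


# Hypothetical sample data, for illustration only.
leads = [
    {"name": "Acme Co", "industry": "Manufacturing", "lead_source": "Webinar"},
    {"name": "Birch LLC", "industry": "", "lead_source": None},
    {"name": "Cobalt Inc", "industry": "Retail", "lead_source": ""},
    {"name": "Delta Group", "industry": None, "lead_source": "  "},
]

report = completeness(leads, ["industry", "lead_source"])
for field, share in report.items():
    print(f"{field}: {share:.0%} complete")
```

A field with a low completeness share is a candidate for becoming a required field at the point of entry.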

Inconsistent formatting turns the same information into different values. "United States," "US," "USA," and "U.S.A." are the same country, but a database treats them as four different values. Inconsistency makes filtering, grouping, and reporting unreliable.
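The country example can be sketched as a lookup table that maps known variants to one canonical value. The alias table below is an assumption built from the four variants mentioned above; a real cleanup would need a fuller list and a review step for values it does not recognize.

```python
from collections import Counter

# Assumed alias table: lowercase variant -> canonical value.
COUNTRY_ALIASES = {
    "united states": "United States",
    "us": "United States",
    "usa": "United States",
    "u.s.a.": "United States",
}


def normalize_country(raw: str) -> str:
    """Map a known variant to its canonical name; leave unknown values trimmed."""
    cleaned = raw.strip()
    return COUNTRY_ALIASES.get(cleaned.lower(), cleaned)


raw_values = ["United States", "US", "USA", "U.S.A.", " usa "]

before = Counter(raw_values)
after = Counter(normalize_country(v) for v in raw_values)

print(len(before), "distinct values before normalization")
print(len(after), "distinct value(s) after normalization")
```

Grouping or filtering on the normalized value now treats all five entries as the same country.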

Stale data — information that was accurate when entered but is no longer current — accumulates silently. Email addresses change, companies merge, people change roles, phone numbers disconnect. Without a process for reviewing and updating records, the database slowly becomes fiction.
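Stale records are easier to catch if each record stores the date it was last verified. The sketch below assumes a hypothetical last_verified field and a one-year threshold; both are illustrative choices, not a standard.

```python
from datetime import date, timedelta


def stale_records(records: list[dict], today: date, max_age_days: int = 365) -> list[dict]:
    """Return records whose last verification is older than max_age_days."""
    cutoff = today - timedelta(days=max_age_days)
    return [r for r in records if r["last_verified"] < cutoff]


# Hypothetical sample data, for illustration only.
contacts = [
    {"id": 1, "email": "old@example.com", "last_verified": date(2023, 1, 15)},
    {"id": 2, "email": "recent@example.com", "last_verified": date(2024, 11, 1)},
]

stale = stale_records(contacts, today=date(2025, 1, 1))
print([r["id"] for r in stale])
```

Passing today as a parameter rather than calling date.today() inside the function keeps the check repeatable when it runs as part of a scheduled review.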

Practical data quality rules

Define required fields and enforce them at the point of entry. If lead source is important, make it a required field in the form — do not rely on salespeople to remember to add it later.
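Enforcing required fields at the point of entry can be sketched as a check that rejects a record before it is saved. The field list and the in-memory store below are hypothetical stand-ins for a real form and database.

```python
# Assumed list of required fields, for illustration only.
REQUIRED_FIELDS = ("name", "email", "lead_source")


def missing_required(record: dict) -> list[str]:
    """Return the required fields that are absent or blank in the record."""
    return [f for f in REQUIRED_FIELDS if not str(record.get(f) or "").strip()]


def create_lead(record: dict, store: list[dict]) -> None:
    """Save the record only if every required field is filled in."""
    missing = missing_required(record)
    if missing:
        raise ValueError(f"Missing required fields: {', '.join(missing)}")
    store.append(record)


store: list[dict] = []
create_lead({"name": "Acme Co", "email": "hello@acme.example", "lead_source": "Referral"}, store)

try:
    create_lead({"name": "Birch LLC", "email": "info@birch.example"}, store)
except ValueError as error:
    print(error)
```

The second lead is rejected at entry, so nobody has to remember to fill in the lead source later.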

Establish naming conventions and use dropdown menus or controlled vocabularies instead of free text for fields that need consistency. A dropdown with "United States" is more reliable than a text field where people type whatever variation they prefer.
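In code, a controlled vocabulary behaves like a dropdown: only listed values are accepted. This sketch uses Python's standard enum module; the list of countries is a hypothetical example.

```python
from enum import Enum


class Country(Enum):
    """Assumed list of allowed values, playing the role of a dropdown."""
    UNITED_STATES = "United States"
    CANADA = "Canada"
    MEXICO = "Mexico"


def parse_country(value: str) -> Country:
    """Accept only values in the controlled vocabulary."""
    try:
        return Country(value)
    except ValueError:
        allowed = ", ".join(c.value for c in Country)
        raise ValueError(f"{value!r} is not allowed; choose one of: {allowed}") from None


print(parse_country("United States").value)

try:
    parse_country("USA")
except ValueError as error:
    print(error)
```

Rejecting "USA" outright is the strict option. A gentler design would first map known variants to a canonical value and reject only what remains unrecognized.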

Schedule regular data reviews. Quarterly duplicate checks, annual contact verification, and periodic audits of key metrics against known benchmarks catch problems before they compound.

Assign data owners. Every important metric and data category should have a named person responsible for its accuracy. If nobody owns customer data quality, nobody will maintain it.

Case Study

Three systems, three truths

Situation

A professional services firm tracked client information in its CRM, its project management tool, and its accounting system. The three systems held different phone numbers, contact names, and company names for the same clients. When a partner asked for total revenue by client, the answer depended on which system was queried.

Analysis

The CRM was designated as the source of truth for client contact information. A monthly sync process was established to update the other systems. Required fields were enforced at client onboarding. Within two quarters, revenue reporting became reliable for the first time.
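A one-way sync of this kind can be sketched as copying a fixed set of fields from the source of truth into each downstream system. The sketch assumes every system shares a client ID, which the case study does not state; the field names and sample records are hypothetical.

```python
# Assumed set of fields owned by the CRM, for illustration only.
SYNCED_FIELDS = ("company_name", "contact_name", "phone")


def sync_from_source(source: dict[str, dict], target: dict[str, dict]) -> list[tuple]:
    """Overwrite synced fields in target with source values; return what changed."""
    changes = []
    for client_id, src in source.items():
        dst = target.setdefault(client_id, {})
        for field in SYNCED_FIELDS:
            if dst.get(field) != src.get(field):
                changes.append((client_id, field, dst.get(field), src.get(field)))
                dst[field] = src.get(field)
    return changes


# Hypothetical sample data, keyed by an assumed shared client ID.
crm = {
    "C-100": {"company_name": "Northwind Ltd.", "contact_name": "Ava Chen", "phone": "555-0101"},
}
accounting = {
    "C-100": {"company_name": "Northwind", "contact_name": "Ava Chen", "phone": "555-0199", "balance": 1200},
}

for change in sync_from_source(crm, accounting):
    print(change)
```

Data flows in one direction only: the sync overwrites contact fields in the accounting record and leaves fields the CRM does not own, such as the balance, untouched. Logging each change gives the data owner a record of what the sync corrected.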

Takeaway

Choose one source of truth per data type. Sync other systems from that source. Enforce consistency at the point of entry.

Reflection Questions

  • 1. For each major data category at your organization (customers, leads, projects, financials), which system is the source of truth?
  • 2. How many duplicate records do you estimate exist in your most important database?

Key Takeaways

  • Every data type needs a single source of truth — one authoritative location.
  • Duplicate records, incomplete fields, inconsistent formatting, and stale data are the most common data quality problems.
  • Maintaining data quality takes required fields, naming conventions, regular reviews, and assigned owners.
  • Bad data quality costs money through wasted effort, unreliable reports, and poor decisions.