[MINI] Single Source of Truth

Data Skeptic

Episode | Podcast

Date: Fri, 09 Nov 2018 15:00:00 +0000

<p class="p1"><span class="s1">In mathematics, truth is universal. In data, truth lies in the where clause of the query.</span></p> <p class="p1"><span class="s1">As large organizations have come to rely more heavily on their data for decision making, a common problem is not being able to agree on what the data is.</span></p> <p class="p1"><span class="s1">As the volume and velocity of data grow, challenges emerge in answering questions with precision. A simple question like "what was the revenue yesterday" can become mired in details. Did your query account for transactions that haven't been finalized? If I query again later, should I exclude orders that have been returned since the last query? What time zone should I use? The list goes on and on.</span></p> <p class="p1"><span class="s1">In any large enough organization, you are also likely to find multiple copies of the same data. Independent systems might record the same information with slight variations. Sometimes systems import data from other systems, a process that can fall out of sync for several reasons.</span></p> <p class="p1"><span class="s1">For any sufficiently large system, answering analytical questions with precision can become a non-trivial challenge. The business intelligence community aspires to provide a "single source of truth" - one canonical place where data consumers can go to get precise, reliable, and trusted answers to their analytical questions.</span></p>
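<p class="p1"><span class="s1">To make the "revenue yesterday" question concrete, here is a minimal Python sketch. Everything in it is an assumption made up for illustration: the order records, the field names, the <code>revenue</code> helper, and the fixed UTC-8 offset standing in for a business time zone. It only shows how the same four orders can produce four different answers depending on the filters applied.</span></p>

```python
from datetime import date, datetime, timedelta, timezone
from decimal import Decimal

# Hypothetical order records: every field name and value is invented for illustration.
ORDERS = [
    {"id": 1, "amount": Decimal("100.00"), "status": "finalized", "returned": False,
     "created_utc": datetime(2018, 11, 8, 2, 30, tzinfo=timezone.utc)},
    {"id": 2, "amount": Decimal("50.00"), "status": "pending", "returned": False,
     "created_utc": datetime(2018, 11, 8, 20, 0, tzinfo=timezone.utc)},
    {"id": 3, "amount": Decimal("75.00"), "status": "finalized", "returned": True,
     "created_utc": datetime(2018, 11, 8, 21, 0, tzinfo=timezone.utc)},
    {"id": 4, "amount": Decimal("40.00"), "status": "finalized", "returned": False,
     "created_utc": datetime(2018, 11, 9, 3, 0, tzinfo=timezone.utc)},
]

# Assumed fixed offset (UTC-8) standing in for a business time zone.
PACIFIC = timezone(timedelta(hours=-8))
YESTERDAY = date(2018, 11, 8)


def revenue(orders, day, tz, include_pending, exclude_returned):
    """Sum order amounts for one calendar day under an explicit set of rules."""
    total = Decimal("0")
    for order in orders:
        # Which calendar day an order belongs to depends on the time zone.
        if order["created_utc"].astimezone(tz).date() != day:
            continue
        # Optionally skip transactions that have not been finalized.
        if not include_pending and order["status"] != "finalized":
            continue
        # Optionally skip orders that were returned after purchase.
        if exclude_returned and order["returned"]:
            continue
        total += order["amount"]
    return total


# Same data, same question, four different "truths".
utc_loose = revenue(ORDERS, YESTERDAY, timezone.utc, True, False)
utc_strict = revenue(ORDERS, YESTERDAY, timezone.utc, False, True)
pacific_loose = revenue(ORDERS, YESTERDAY, PACIFIC, True, False)
pacific_strict = revenue(ORDERS, YESTERDAY, PACIFIC, False, True)

print(utc_loose, utc_strict, pacific_loose, pacific_strict)
```

<p class="p1"><span class="s1">With this made-up data the four calls return 225.00, 100.00, 165.00, and 40.00. None of them is wrong; each simply answers a slightly different question, which is why a shared definition matters before anyone compares numbers.</span></p>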