QUALITY DATA - Part 6

Christopher Wagner • January 13, 2022

Let's wrap this up!

Storing the results of Data Quality tests can be very helpful in determining the overall quality of the Data over time and in identifying poorly performing sources/processes. This quality data is also instrumental for In-Line quality tests, which keep substandard data from reaching Production, and for Off-Line tests, which automatically roll back issues in Production.
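As a rough illustration of storing test results over time, here is a minimal Python sketch using the standard library's sqlite3 module. The table name dq_results, its columns, and the helper functions are all hypothetical choices made for this example, not a prescribed schema; any durable store would work just as well.

```python
import sqlite3
from datetime import date


def open_results_store(path=":memory:"):
    """Open (or create) a store for Data Quality test results.

    The dq_results table and its columns are hypothetical names for this sketch.
    """
    conn = sqlite3.connect(path)
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS dq_results (
            run_date     TEXT    NOT NULL,
            source       TEXT    NOT NULL,
            test_name    TEXT    NOT NULL,
            metric_value REAL    NOT NULL,
            passed       INTEGER NOT NULL
        )
        """
    )
    return conn


def record_result(conn, run_date, source, test_name, metric_value, passed):
    """Append one test outcome so quality can be tracked over time."""
    conn.execute(
        "INSERT INTO dq_results VALUES (?, ?, ?, ?, ?)",
        (run_date.isoformat(), source, test_name, metric_value, int(passed)),
    )
    conn.commit()


def pass_rate_by_source(conn):
    """Return (source, pass_rate) pairs, worst-performing sources first."""
    return conn.execute(
        """
        SELECT source, AVG(passed) AS pass_rate
        FROM dq_results
        GROUP BY source
        ORDER BY pass_rate, source
        """
    ).fetchall()
```

Sorting by pass rate puts the poorly performing sources at the top of the list, which is usually where the review should start.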

Example:

Customer ABC had more than 12,000 addresses for the last month. Today customer ABC has only 10,500 addresses (more than a 10% drop in addresses). All files have been processed successfully, and all of the other QA tests have passed.

This variance could trigger a Rollback with a notification to Data Engineering and the Business to review. Maybe customer ABC closed over 10% of their locations, or maybe a change to the source system introduced a special character in the business key that is causing addresses to be dropped incorrectly.
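A variance check like this can be sketched in a few lines of Python. The function names, the 10% default threshold, and the counts in the usage lines are illustrative assumptions; in practice the baseline would come from the stored quality results.

```python
def percent_drop(baseline, current):
    """Return how far current has fallen below baseline, as a percentage."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (baseline - current) / baseline * 100.0


def should_roll_back(baseline, current, max_drop_pct=10.0):
    """Flag the load for Rollback when the drop exceeds the allowed variance."""
    return percent_drop(baseline, current) > max_drop_pct


# Illustrative counts: a fall from 12,000 to 10,500 addresses is a 12.5% drop.
print(percent_drop(12_000, 10_500))      # 12.5
print(should_roll_back(12_000, 10_500))  # True
```

The check only flags the variance. Whether the drop is real or a defect is still a question for Data Engineering and the Business.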

The Rollback keeps the data for the customer in its last known good state until this can be reviewed and confirmed by engineering and the business.
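One way to picture the last known good state is the small Python sketch below. The SnapshotStore class and its method names are hypothetical, and it simplifies the Rollback by holding a suspect load aside for review rather than undoing a write that has already reached Production.

```python
from dataclasses import dataclass, field


@dataclass
class SnapshotStore:
    """Hypothetical per-customer store that only serves approved data."""

    last_known_good: dict = field(default_factory=dict)  # customer -> snapshot
    pending_review: dict = field(default_factory=dict)   # customer -> snapshot

    def load(self, customer, snapshot, passed_checks):
        """Publish the snapshot if it passed, otherwise hold it for review."""
        if passed_checks:
            self.last_known_good[customer] = snapshot
        else:
            self.pending_review[customer] = snapshot

    def serve(self, customer):
        """Consumers always read the last known good snapshot."""
        return self.last_known_good[customer]

    def approve(self, customer):
        """Promote the held snapshot once the review confirms it."""
        self.last_known_good[customer] = self.pending_review.pop(customer)
```

Until approve is called, consumers keep seeing yesterday's data for that customer, while every other customer continues to load normally.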

DATA QUALITY SUMMARY

Data Quality is a highly complex topic that is critical to the success of any data platform. How you handle data quality can make or break your data platform. We know issues with data quality will arise.

How you deal with them is the challenge.

DATA QUALITY BLOG SERIES

A new Data Quality blog post will be released each day at 8:45 AM.

DATA QUALITY - Part 1 January 6th

DATA QUALITY CONCEPTS - Part 2 January 7th

DATA QUALITY FOR EVERYONE - Part 3 January 10th

DATA QUALITY FRAMEWORK - Part 4 January 11th

DATA QUALITY DEVELOPMENT - Part 5 January 12th

QUALITY DATA - Part 6 January 13th


CHRIS WAGNER, MBA MVP

Analytics Architect, Mentor, Leader, and Visionary

Chris has been working in the Data and Analytics space for nearly 20 years. Chris has dedicated his professional career to making data and information accessible to the masses. A significant component in making data available is continually learning new things and teaching others from these experiences. To help people keep up with this ever-changing landscape, Chris frequently posts on LinkedIn and to this blog.
