August 11, 2020

AI Series, Part 1: Enabling Your Data for AI – Start With Data Quality and Avoid the Bad Data Trap

Aditya Sriram

Topic:   Data

Before You Get to KPIs, Clean up Your Data Act

A common pitfall when planning an artificial intelligence (AI) initiative is overlooking the assessment of data quality and completeness. As AI delivers more value and forward visibility to organizations across industry verticals, data quality becomes increasingly important to producing credible insights. Poor-quality or incomplete data will often derail or complicate AI initiatives, producing implausible insights that hinder return on investment (ROI).

But what does it mean to have “bad” or “dirty” data in the context of AI? It can mean missing fields or records, duplicate records, outdated data, or non-standardized data points. As organizations move towards data-driven decisions, it becomes essential to invest in tools that can assist in cleansing and harmonizing data.
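To make the four kinds of bad data concrete, here is a minimal sketch in Python that counts each one in a small set of records. The function name `profile_records`, the field names (`id`, `email`, `country`, `updated`), the one-year freshness threshold, and the sample records are all assumptions made for illustration; a real project would more likely rely on a dedicated data quality tool.

```python
from datetime import date, timedelta


def profile_records(records, required_fields, key_field, date_field,
                    max_age_days, today, standard_checks):
    """Count records affected by each kind of 'bad' data (illustrative only)."""
    report = {"missing": 0, "duplicate": 0, "outdated": 0, "non_standard": 0}
    seen_keys = set()
    cutoff = today - timedelta(days=max_age_days)

    for rec in records:
        # Missing: a required field is absent, None, or an empty string.
        if any(rec.get(f) in (None, "") for f in required_fields):
            report["missing"] += 1

        # Duplicate: the same key value has already been seen.
        key = rec.get(key_field)
        if key in seen_keys:
            report["duplicate"] += 1
        else:
            seen_keys.add(key)

        # Outdated: the last-updated date is older than the cutoff.
        updated = rec.get(date_field)
        if isinstance(updated, date) and updated < cutoff:
            report["outdated"] += 1

        # Non-standardized: a present value fails its format check.
        for field, is_standard in standard_checks.items():
            value = rec.get(field)
            if value not in (None, "") and not is_standard(value):
                report["non_standard"] += 1
                break  # count each record at most once

    return report


# Hypothetical sample data, invented for this sketch.
records = [
    {"id": 1, "email": "a@example.com", "country": "CA", "updated": date(2020, 7, 1)},
    {"id": 1, "email": "a@example.com", "country": "CA", "updated": date(2020, 7, 1)},
    {"id": 2, "email": "", "country": "US", "updated": date(2020, 6, 1)},
    {"id": 3, "email": "c@example.com", "country": "canada", "updated": date(2017, 1, 1)},
]

report = profile_records(
    records,
    required_fields=["id", "email"],
    key_field="id",
    date_field="updated",
    max_age_days=365,
    today=date(2020, 8, 11),
    # Assumed standard: two-letter upper-case country codes.
    standard_checks={"country": lambda v: isinstance(v, str) and len(v) == 2 and v.isupper()},
)
print(report)
```

A report like this gives a rough baseline of how much cleansing a data source needs before it feeds an AI use case.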

To keep data healthy on an ongoing basis, organizations often adopt data standardization tools that monitor data at the point of entry and centralize control over incoming data. Validating the credibility of data before an AI engagement can substantially improve the accuracy of an AI initiative and its adoption across the organization.
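As a rough illustration of checking data at the point of entry, the sketch below standardizes each incoming record and then accepts or rejects it. The function `standardize_at_entry`, the field names, and the normalization rules (trimmed strings, lower-case emails, upper-case country codes) are hypothetical choices for this example, not a description of any particular standardization product.

```python
def standardize_at_entry(record, required_fields):
    """Return (clean_record, errors) for one incoming record (illustrative only)."""
    clean = {}
    for field, value in record.items():
        # Standardize: trim surrounding whitespace on every string value.
        clean[field] = value.strip() if isinstance(value, str) else value

    # Assumed conventions: lower-case emails, upper-case country codes.
    if isinstance(clean.get("email"), str):
        clean["email"] = clean["email"].lower()
    if isinstance(clean.get("country"), str):
        clean["country"] = clean["country"].upper()

    # Validate: every required field must be present and non-empty.
    errors = [f"missing {f}" for f in required_fields
              if clean.get(f) in (None, "")]
    return clean, errors


accepted, rejected = [], []
# Hypothetical incoming records, invented for this sketch.
for incoming in [
    {"id": 10, "email": "  Pat@Example.COM ", "country": "ca"},
    {"id": 11, "email": "   ", "country": "us"},
]:
    clean, errors = standardize_at_entry(incoming, required_fields=["id", "email"])
    (rejected if errors else accepted).append((clean, errors))

print("accepted:", accepted)
print("rejected:", rejected)
```

Routing every source through a single entry check like this is one way to centralize control: rejected records can be sent back for correction instead of silently reaching downstream models.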

It is also important to have enough data available to support the underlying AI use case. Organizations often prioritize an AI use case only to find that the data is not readily available or is not in the correct format. In that situation, one should either invest in an integration tool to unify the required data sources and ensure data quality, or revisit and re-prioritize the underlying use case. Alternatively, it may be beneficial to acquire external data or to use a simple AI algorithm.